Genomic File Formats, 1000 Genomes Project

UZH BIO392 HS22 - Day 04

Izaskun Mallona (email: izaskun.mallona at sib.swiss).

Morning

We will have a lecture and run a set of exercises (on site).

  • Overview of the standard genomics data formats
      • FASTA
      • FASTQ
      • SAM
      • BED
      • GFF
      • VCF
  • Basic file processing for bioinformatics
      • wc, grep, awk
  • Exercises
  • Project
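
As a small taste of the file processing part, here is a minimal sketch of how wc, grep and awk can summarise a FASTA file. The file name `toy.fa` and its two sequences are made up for illustration and are not part of the course material.

```shell
# Write a tiny two-record FASTA file (hypothetical sequences, illustration only)
printf '>seq1\nACGTACGT\n>seq2\nGGCC\n' > toy.fa

# wc: count all lines (headers plus sequence lines);
# tr strips the padding some wc implementations add
n_lines=$(wc -l < toy.fa | tr -d ' ')

# grep: count records, i.e. lines that start with '>'
n_records=$(grep -c '^>' toy.fa)

# awk: sum the lengths of all non-header lines (total number of bases)
n_bases=$(awk '!/^>/ { total += length($0) } END { print total }' toy.fa)

echo "lines: $n_lines  records: $n_records  bases: $n_bases"
```

The same pattern (select lines with a regular expression, then count or sum) carries over to the other line-oriented formats listed above, although the header conventions differ for each format.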

Afternoon

Exercises and project.

Slides

Exercises

Exercises 1-28 cover FASTA, FASTQ, SAM, BED, GTF and VCF. They include a small project.

  • Exercises. Exercises 5-14 cover FASTA and FASTQ.

Resources

Cheatsheets:

File formats at a glance