Genomic File Formats, 1000 Genomes Project
UZH BIO392 HS22 - Day 04
Izaskun Mallona (email: izaskun.mallona at sib.swiss).
Morning
We will have a lecture and run a set of exercises (on site).
- Overview of the standard genomics data formats
- FASTA
- FASTQ
- SAM
- BED
- GFF
- VCF
- Basic file processing for bioinformatics
- wc, grep, awk
- Exercises
- Project
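As a rough illustration of the basic file processing listed above, the sketch below counts records in FASTA and FASTQ text. It is a Python counterpart of `grep -c '^>'` and of dividing the `wc -l` result by four. The function names and the toy records are hypothetical, and the FASTQ count assumes the common layout of exactly four lines per record (no wrapped sequences).

```python
def count_fasta_records(text):
    """Count FASTA records: each record starts with a '>' header line."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))


def count_fastq_records(text):
    """Count FASTQ records, assuming exactly four lines per record."""
    # Ignore blank lines (for example a trailing newline at end of file).
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("line count is not a multiple of four")
    return len(lines) // 4


# Toy inputs made up for illustration; they are not course data.
fasta_text = ">seq1\nACGT\nACGT\n>seq2\nGGCC\n"
fastq_text = "@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n"

print(count_fasta_records(fasta_text))
print(count_fastq_records(fastq_text))
```

Counting lines that start with `@` is not a reliable way to count FASTQ records, because a quality line may also begin with `@`; that is why the sketch divides the line count by four instead.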
Afternoon
Exercises and project.
Readings
- SAM v1 format specification
- BEDtools paper
- 0-start, 1-start, open, closed: how do we count
- GFF3 format
- VCF format
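The counting conventions that these readings discuss can be sketched in a few lines. The example assumes that BED intervals are 0-based and half-open (the end position is excluded) and that GFF3 features are 1-based and closed (the end position is included), as described in the respective format documentation. The function names are hypothetical and chosen only for illustration.

```python
def bed_to_gff(start, end):
    """Convert a 0-based half-open interval to a 1-based closed one."""
    # Only the start shifts: the excluded 0-based end equals the
    # included 1-based end.
    return start + 1, end


def gff_to_bed(start, end):
    """Convert a 1-based closed interval to a 0-based half-open one."""
    return start - 1, end


def bed_length(start, end):
    """Length of a 0-based half-open interval."""
    return end - start


def gff_length(start, end):
    """Length of a 1-based closed interval."""
    return end - start + 1


# The first 100 bases of a sequence in both conventions.
bed_interval = (0, 100)
gff_interval = bed_to_gff(*bed_interval)

print(gff_interval)
print(bed_length(*bed_interval), gff_length(*gff_interval))
```

Both length functions report the same number of bases for the same region, which is a quick sanity check after converting between formats.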
Slides
Exercises
Exercises 1-28 cover FASTA, FASTQ, SAM, BED, GTF and VCF. They include a small project.
- Exercises 5-14 cover FASTA and FASTQ.