BIO390 - Introduction to Bioinformatics
Summary
The handling and analysis of biological data using computational methods has become an essential part of most areas of biology. In this lecture, students will be introduced to the use of bioinformatics tools and methods in different topics, such as molecular resources and databases, standards and ontologies, sequence and high-performance genome analysis, biological networks, molecular dynamics, proteomics, evolutionary biology and gene regulation. Additionally, the use of low-level tools (e.g. programming and scripting languages) and specialized applications will be demonstrated. Another topic will be the visualization of quantitative and qualitative biological data and analysis results.
Practical Information
Requirements
The Introduction to Bioinformatics is a series of lectures aimed at students at a medium to advanced undergraduate level in Life Sciences. Participants are expected to be knowledgeable in the basic concepts of molecular biology and genetics, and also to have some basic understanding of statistics and programming concepts, if not practical experience (e.g. have attended introductory courses, done some data analyses in R or Python etc.). Experience with common platforms used for shared code/document management (e.g. GitLab/GitHub...) is helpful but not strictly required.
Schedule & Notes
- Autumn semesters
- 1 x 2h / week
- Tue 08:00-09:45
- UZH Irchel campus, Y-03G-85
- OLAT - but not much there...
- No lecture recordings - lectures have not been recorded since HS23 (regular attendance is expected), but some 2022 lecture recordings might still be available
- Course language is English
Syllabus
Some learning goals should provide you with additional guidance - but please be aware that those may include details which have not been covered in the current semester; these are still good to know but not necessarily relevant for the exam.
Next: Building a Biological Information Resources
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Qingyao Huang
This lecture will introduce bioinformatics methods, principles and tools for building and maintaining information resources in life sciences, with particular emphasis on 'omics data types.
Upcoming: Clinical Bioinformatics
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Valerie Barbie (Director SIB Clinical Bioinformatics)
Medical practice is undergoing a revolution around personalized health: this major change is driven by the continuous development of cost-effective high-throughput technologies that produce gigantic quantities of data in numerous areas, from imaging to genomics, and of the corresponding tools required to process these data.
Upcoming: Genomic Data Risks & Opportunities
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Michael Baudis
The understanding of the impact of inherited and somatic genome variants on phenotypes and diseases requires a thorough understanding of such variants amongst populations in general and carriers of the phenotypes and diseases in particular. Such information can only be provided through the inclusion of data from a multitude of genome resources in variant evaluation efforts, including such from outside (international) jurisdictions. However, opening such resources carries the inherent risk of breaching privacy, particularly through re-identification of individuals or their relatives and potentially through the exposure of individual genome-related personal information including phenotypic and "performance" prediction and relative disease risk.
Upcoming: BIO390 Exam
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:15-09:45 @ Y-03G-85 and Y-03G-91 UZH Irchel
The exam will be on the last day of the course on site:
- time: 08:15-09:45
- multiple (single + multiple) choice w/ one or two open questions
- no material, phones etc.
- student ID for entrance
- please refer to the learning goals for guidance
- ¡topics may be edited throughout the course!
- these just provide some non-exclusive guidance
Upcoming: BIO390 Repeat Exam
BIO390 UZH HS24 - Introduction to Bioinformatics
The repeat exam has been tentatively planned for the week of January 20-24, 2025:
- Exact date TBD; time: 09:15-10:45
- Planned room: Y13-L-11/13
- multiple (single + multiple) choice w/ one or two open questions
- no material, phones etc.
- student ID for entrance
- please refer to the learning goals for guidance
- ¡topics may be edited throughout the course!
- these just provide some non-exclusive guidance
Components of the Semantic web
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Ahmad Aghaebrahimian (ZHAW)
Biomedical science is rich in structured and unstructured textual data including but not limited to hundreds of ontologies as well as millions of scientific publications. The semantic web and its stack of standards provide an efficient way for organizing knowledge extracted from such a huge volume of data. Modeling data in knowledge graphs makes complex question answering and reasoning over an abundance of information manageable and feasible. In this session, we will find out how.
Biological Networks
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Andreas Wagner
This part of the course BIO390 (Introduction to Bioinformatics) will review examples of biological networks and their basic properties.
Learning goals for exam preparation 2024
After this lecture you should be able to
Text Mining and Search Strategies
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Patrick Ruch (HES-SO/HEG Geneva)
Search engines, stemming, NGRAMs ... and much more.
Proteomics
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Katja Baerenfaller, Swiss Institute of Allergy and Asthma Research (SIAF) and University of Zurich
In proteomics, one of the important bioinformatics tasks is to generate lists of reliably identified peptides and proteins in mass spectrometry-based experiments. For this, amino acid sequences are assigned to measured tandem mass spectra. The quality of the peptide-spectrum assignments is scored, and criteria are applied that allow distinguishing good from bad hits and estimating the quality of the dataset.
Metagenomics
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Shinichi Sunagawa (ETHZ)
Abstract:
Microorganisms are numerically dominant on Earth and drive the cycling of energy, elements and matter. Thanks to advances in high-throughput DNA sequencing technologies and computational power, microbial communities can now be studied without the need to cultivate them in a laboratory setting. Essential tasks in studying microbial communities include the identification and quantification of their member taxa and the pair-wise compositional comparison of different microbial communities.
Regulatory Genomics and Epigenomics
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Izaskun Mallona
We will introduce the epigenomics and regulatory genomics fields, including their aims, techniques, and data analysis approaches.
Machine Learning for Biological Use Cases
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Valentina Boeva (ETHZ)
Brief note: In this lecture V. Boeva will cover the standard machine learning methods used in the analysis of biological data: dimensionality reduction, clustering, classification and regression.
Biological Sequence Informatics
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Christian von Mering
The analysis of biological sequences - primarily DNA, RNA and protein sequences - constitutes one of the earliest and core areas of bioinformatics. This lecture introduces principles and examples of bioinformatic sequence analyses and inter-sequence comparisons.
Statistical Bioinformatics
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Mark Robinson
Today's topic is the use of statistical methods in the analysis of biological datasets, with examples from high-throughput (sequencing and array) technologies and single cell analyses.
What is Bioinformatics? Introduction and Resources
BIO390 UZH HS24 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Michael Baudis
The "What is Bioinformatics? Introduction and Resources" lecture provides a general introduction to the field and a description of the lecture topics, timeline and procedures.
Topics covered in the lecture are e.g.:
- a term definition for bioinformatics
- the relation of hypothesis driven and data driven science, with respect to bioinformatics
- categories of bioinformatics tools and data
- research areas and topics
- the varying emphasis on "bio" and "informatics"
- databases (primary vs. derived) and data curation
- data collection & curation
- file formats, ontologies & APIs as areas/topics (w/o details)
- "not-bioinformatics"
... but also an introduction into the cancer genomics and data sharing topics.
BIO390 Repeat Exam
BIO390 UZH HS23 - Introduction to Bioinformatics
The repeat exam will be on January 24, 2024:
- time: 10:15-11:45
- Changed room: Y13-L-11/13
- multiple (single + multiple) choice w/ one or two open questions
- no material, phones etc.
- student ID for entrance
- please refer to the learning goals for guidance
- ¡topics may be edited throughout the course!
- these just provide some non-exclusive guidance
BIO390 Exam
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:15-09:45 @ Y-03G-85 and Y-03G-91 UZH Irchel
The exam will be on the last day of the course on site:
- time: 08:15-09:45
- multiple (single + multiple) choice w/ one or two open questions
- no material, phones etc.
- student ID for entrance
- please refer to the learning goals for guidance
- ¡topics may be edited throughout the course!
- these just provide some non-exclusive guidance
Genomic Data Risks & Opportunities
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Michael Baudis
The understanding of the impact of inherited and somatic genome variants on phenotypes and diseases requires a thorough understanding of such variants amongst populations in general and carriers of the phenotypes and diseases in particular. Such information can only be provided through the inclusion of data from a multitude of genome resources in variant evaluation efforts, including such from outside (international) jurisdictions. However, opening such resources carries the inherent risk of breaching privacy, particularly through re-identification of individuals or their relatives and potentially through the exposure of individual genome-related personal information including phenotypic and "performance" prediction and relative disease risk.
Clinical Bioinformatics
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Valerie Barbie (Director SIB Clinical Bioinformatics)
Medical practice is undergoing a revolution around personalized health: this major change is driven by the continuous development of cost-effective high-throughput technologies that produce gigantic quantities of data in numerous areas, from imaging to genomics, and of the corresponding tools required to process these data.
Building a Genomics Resource
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Michael Baudis
In this lecture we will use our Progenetix resource - a website providing information about genomic copy number mutations in cancer - to present the different components needed for generating, storing, representing, visualizing and accessing a specific type of genomic data and associated classifications.
Components of the Semantic web
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Ahmad Aghaebrahimian (ZHAW)
Biomedical science is rich in structured and unstructured textual data including but not limited to hundreds of ontologies as well as millions of scientific publications. The semantic web and its stack of standards provide an efficient way for organizing knowledge extracted from such a huge volume of data. Modeling data in knowledge graphs makes complex question answering and reasoning over an abundance of information manageable and feasible. In this session, we will find out how.
Text Mining and Search Strategies
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Patrick Ruch (HES-SO/HEG Geneva)
Search engines, stemming, NGRAMs ... and much more.
Biological Networks
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Pouria Dasmeh
This part of the course BIO390 (Introduction to Bioinformatics) will review examples of biological networks and their basic properties.
Proteomics
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Katja Baerenfaller, Swiss Institute of Allergy and Asthma Research (SIAF) and University of Zurich
In proteomics, one of the important bioinformatics tasks is to generate lists of reliably identified peptides and proteins in mass spectrometry-based experiments. For this, amino acid sequences are assigned to measured tandem mass spectra. The quality of the peptide-spectrum assignments is scored, and criteria are applied that allow distinguishing good from bad hits and estimating the quality of the dataset.
Machine Learning for Biological Use Cases
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Valentina Boeva (ETHZ)
Brief note: In this lecture V. Boeva will cover the standard machine learning methods used in the analysis of biological data: dimensionality reduction, clustering, classification and regression.
Regulatory Genomics and Epigenomics
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Izaskun Mallona
Metagenomics
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Shinichi Sunagawa (ETHZ)
Abstract:
Microorganisms are numerically dominant on Earth and drive the cycling of energy, elements and matter. Thanks to advances in high-throughput DNA sequencing technologies and computational power, microbial communities can now be studied without the need to cultivate them in a laboratory setting. Essential tasks in studying microbial communities include the identification and quantification of their member taxa and the pair-wise compositional comparison of different microbial communities.
Statistical Bioinformatics
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Mark Robinson
Today's topic is the use of statistical methods in the analysis of biological datasets, with examples from high-throughput (sequencing and array) technologies and single cell analyses.
What is Bioinformatics? Introduction and Resources
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Michael Baudis
Taking place on the second lecture day this year, the "What is Bioinformatics? Introduction and Resources" lecture provides a general introduction to the field and a description of the lecture topics, timeline and procedures.
Topics covered in the lecture are e.g.:
- a term definition for bioinformatics
- the relation of hypothesis driven and data driven science, with respect to bioinformatics
- categories of bioinformatics tools and data
- research areas and topics
- the varying emphasis on "bio" and "informatics"
- databases (primary vs. derived) and data curation
- data collection & curation
- file formats, ontologies & APIs as areas/topics (w/o details)
- "not-bioinformatics"
Biological Sequence Informatics
BIO390 UZH HS23 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Christian von Mering
The analysis of biological sequences - primarily DNA, RNA and protein sequences - constitutes one of the earliest and core areas of bioinformatics. This lecture introduces principles and examples of bioinformatic sequence analyses and inter-sequence comparisons.
BIO390 Repeat Exam
BIO390 UZH HS22 - Introduction to Bioinformatics
The repeat exam will be on January 24, 2023:
- time: 10:15-11:45
- Y03-G-85 (normal lecture hall, unless a change is announced)
- multiple (single + multiple) choice w/ one or two open questions
- no material, phones etc.
- student ID for entrance
- please refer to the learning goals for guidance
- ¡topics may be edited throughout the course!
- these just provide some non-exclusive guidance
BIO390 Exam
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:15-09:45 @ UZH Irchel Y03-G-85
The exam will be on the last day of the course on site:
- time: 08:15-09:45
- ¡¡¡ NEW: Room change to Y15-G-20 !!!
- multiple (single + multiple) choice w/ one or two open questions
- no material, phones etc.
- student ID for entrance
- please refer to the learning goals for guidance
- ¡topics may be edited throughout the course!
- these just provide some non-exclusive guidance
Genomic Data Risks & Opportunities
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Michael Baudis
The understanding of the impact of inherited and somatic genome variants on phenotypes and diseases requires a thorough understanding of such variants amongst populations in general and carriers of the phenotypes and diseases in particular. Such information can only be provided through the inclusion of data from a multitude of genome resources in variant evaluation efforts, including such from outside (international) jurisdictions. However, opening such resources carries the inherent risk of breaching privacy, particularly through re-identification of individuals or their relatives and potentially through the exposure of individual genome-related personal information including phenotypic and "performance" prediction and relative disease risk.
Clinical Bioinformatics
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Valerie Barbie (Director SIB Clinical Bioinformatics)
Medical practice is undergoing a revolution around personalized health: this major change is driven by the continuous development of cost-effective high-throughput technologies that produce gigantic quantities of data in numerous areas, from imaging to genomics, and of the corresponding tools required to process these data.
Building a Genomics Resource
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Michael Baudis
In this lecture we will use our Progenetix resource - a website providing information about genomic copy number mutations in cancer - to present the different components needed for generating, storing, representing, visualizing and accessing a specific type of genomic data and associated classifications.
Components of the Semantic web
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Ahmad Aghaebrahimian (ZHAW)
Biomedical science is rich in structured and unstructured textual data including but not limited to hundreds of ontologies as well as millions of scientific publications. The semantic web and its stack of standards provide an efficient way for organizing knowledge extracted from such a huge volume of data. Modeling data in knowledge graphs makes complex question answering and reasoning over an abundance of information manageable and feasible. In this session, we will find out how.
Text Mining and Search Strategies
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Patrick Ruch (HES-SO/HEG Geneva)
Search engines, stemming, NGRAMs ... and much more.
Biological Networks
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Pouria Dasmeh
This part of the course BIO390 (Introduction to Bioinformatics) will review examples of biological networks and their basic properties.
Proteomics
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Katja Baerenfaller, Swiss Institute of Allergy and Asthma Research (SIAF) and University of Zurich
In proteomics, one of the important bioinformatics tasks is to generate lists of reliably identified peptides and proteins in mass spectrometry-based experiments. For this, amino acid sequences are assigned to measured tandem mass spectra. The quality of the peptide-spectrum assignments is scored, and criteria are applied that allow distinguishing good from bad hits and estimating the quality of the dataset.
Metagenomics
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Shinichi Sunagawa (ETHZ)
Abstract:
Microorganisms are numerically dominant on Earth and drive the cycling of energy, elements and matter. Thanks to advances in high-throughput DNA sequencing technologies and computational power, microbial communities can now be studied without the need to cultivate them in a laboratory setting. Essential tasks in studying microbial communities include the identification and quantification of their member taxa and the pair-wise compositional comparison of different microbial communities.
Regulatory Genomics and Epigenomics
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Izaskun Mallona
Machine Learning for Biological Use Cases
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Valentina Boeva (ETHZ)
Brief note: In this lecture V. Boeva will cover the standard machine learning methods used in the analysis of biological data: dimensionality reduction, clustering, classification and regression.
Statistical Bioinformatics
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Mark Robinson
Biological Sequence Informatics
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Christian von Mering
The analysis of biological sequences - primarily DNA, RNA and protein sequences - constitutes one of the earliest and core areas of bioinformatics. This lecture introduces principles and examples of bioinformatic sequence analyses and inter-sequence comparisons.
What is Bioinformatics? Introduction and Resources
BIO390 UZH HS22 - Introduction to Bioinformatics, 08:00-09:45 @ UZH Irchel Y03-G-85
Michael Baudis
The first day of the "Introduction to Bioinformatics" lecture series starts with a general introduction to the field and a description of the lecture topics, timeline and procedures.
Topics covered in the lecture are e.g.:
UZH BIO390 - Learning Goals
This page indicates some of the learning goals, as emphasised by the different lecturers. Some points will have been discussed in different lectures; accordingly, exam questions may not refer to information of one specific presentation.
Learning goals have a general scope
Please be aware that some of the "Learning Goals" may reflect aspects not necessarily captured by the lectures in the current semester. The ones relevant for the current semester's exam are related to the given lectures. Also, updates may occur at any time.
Consider individual pages
Please consider individual course pages and documents linked from there. Those might be updated later on, so it is a good idea to check back before the end of the course.
Bioinformatics: Definition & Concepts
- definition of "Bioinformatics" (cf. Anna Tramontano)
- categories of informatics tools used in bioinformatics
- hypothesis versus data driven science
- areas of bioinformatics/bioinformaticians, (in contrast to "pure" modelling, statistics etc.)
- 3 main categories of biological data, and example resources
- definition of API
- common sequence related file formats
- hierarchies and relationships as 2 main principles of ontologies
- areas of "not-bioinformatics", and why
- bioinformatics tools (programming languages, libraries, online resources) and their specific use cases
Sequence Analysis
- basics of DNA and Protein sequences
- substitution matrices
- BLAST
- parameters
- terms
- scores
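As a study aid for the scoring terms above, here is a minimal sketch of how an alignment score is summed from per-position values. The match, mismatch and gap values are toy assumptions chosen for illustration; real searches use substitution matrices such as BLOSUM or PAM and tool-specific gap penalties, and `alignment_score` is a hypothetical helper, not part of BLAST.

```python
# Toy scoring scheme (an assumption for illustration, not BLOSUM/PAM values).
MATCH, MISMATCH, GAP = 2, -1, -2


def alignment_score(a: str, b: str) -> int:
    """Score two already-aligned, equal-length sequences; '-' marks a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            score += GAP        # penalise an insertion/deletion
        elif x == y:
            score += MATCH      # identical residues
        else:
            score += MISMATCH   # substitution
    return score


print(alignment_score("GATTACA", "GAT-ACA"))  # six matches and one gap: 10
```

A substitution matrix generalises the `MATCH`/`MISMATCH` constants: instead of two fixed values, each residue pair gets its own score.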
Statistical Bioinformatics & Machine Learning
- usage of gene expression profiling
- multiple testing correction
- parameters for hierarchical clustering
- statistical evidence for a change in the mean
- dimensionality reduction
- central limit theorem
- hierarchical clustering
- clustering coefficient
- unsupervised machine learning tasks/components
- ML model types
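For the "multiple testing correction" goal above, the following is a minimal sketch of one widely used procedure, the Benjamini-Hochberg adjustment. It is offered as an illustration under the assumption that this method is among those discussed; the p-values are made up, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values from four hypothetical tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

With these inputs the adjusted values are approximately 0.02, 0.04, 0.04 and 0.02: each raw p-value is scaled by the number of tests divided by its rank, then capped by the adjusted value of the next larger one.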
Bioinformatics tools & resources
- common programming/analysis languages/environments in bioinformatics and their preferred use
- e.g. R, Perl, Python, JavaScript ... but also environments & packages like Mkdocs, ReadTheDocs, Bioconductor ...
- components of bioinformatics online resources
- Databases, middleware, APIs, frontends ...
- database types / concepts
- SQL vs. document databases
- data curation (biocuration)
- importance of classifications, ontologies
- some ontologies and their use (NCIt, UBERON, DO)
- CURIEs as identifiers (see below)
- Null Island
- ISO 8601 for dates and times (and why / why not?)
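The last two items above can be illustrated with a small sketch using only the Python standard library. The timestamps and coordinates are made up, and `to_utc` and `is_null_island` are hypothetical helper names; the sketch only shows why unambiguous timestamps and explicit missing values matter, not how any particular resource handles them.

```python
from datetime import datetime, timezone


def to_utc(iso_string: str) -> str:
    """Parse an ISO 8601 timestamp with a UTC offset and return it in UTC."""
    stamp = datetime.fromisoformat(iso_string)
    if stamp.tzinfo is None:
        raise ValueError("timestamp needs an explicit UTC offset")
    return stamp.astimezone(timezone.utc).isoformat()


def is_null_island(latitude: float, longitude: float) -> bool:
    """Flag 0/0 coordinates, which often mean 'missing' rather than a real place."""
    return latitude == 0.0 and longitude == 0.0


print(to_utc("2024-09-17T08:00:00+02:00"))  # 2024-09-17T06:00:00+00:00

# ISO 8601 dates sort chronologically when sorted as plain strings.
print(sorted(["2024-12-17", "2023-12-19", "2024-09-17"]))

print(is_null_island(0.0, 0.0))   # True: likely a defaulted missing value
print(is_null_island(47.4, 8.5))  # False
```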
Some Q & A (thanks to the providers of these questions)
- Progenetix use case: In comparative genomic hybridization, in the case of a high copy number segment of DNA, will more tumor DNA hybridize to the metaphase chromosomes just because of higher likelihood?
  - In essence, yes; it is a mix of higher likelihood and therefore higher binding probability (it is actually hard for a given fragment to encounter the right place on the chromosome or array) w/ or w/o competition effects (the latter when using normal reference DNA).
- CURIEs: ...hierarchical coding systems where individual codes are represented as CURIEs - aren't they a type of URI rather than a "code"?
  - CURIEs (compact URIs) are universal identifiers consisting of a public and a local part
  - they are universal (like UUIDs) but unlike UUIDs (which can be anonymous) they are resolvable (i.e. the public part can be resolved to a URL where the local part can then be used to retrieve the resource)
  - the "hierarchical coding systems" usually don't use the CURIE internally but only the local part; but using the complete CURIE externally makes it unambiguous
- Progenetix use case: In Progenetix one can either use the GA4GH-Beacon API to query (i.e., do information retrieval) OR use the pgxRpi API to load the data into an analysis environment (i.e., for eventual knowledge extraction)?
  - Not really. The bycon software stack implements database access / middleware / Beacon API instance. This (outward facing) API (compatible with the Beacon specification) can be accessed by various clients for data retrieval (e.g. the beaconplus-web or progenetix-web JavaScript front ends, manual http requests, Beacon aggregator services...).
  - pgxRpi itself is an API for the R environment, i.e. another client accessing the Beacon API. So here one gets: DATA - Progenetix' Beacon API - pgxRpi API - R analysis environment.
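To make the CURIE answer in this Q & A more concrete, here is a minimal sketch of how the public part (prefix) of a CURIE can be resolved and combined with the local part. The prefix map is an assumption for illustration only (real resolvers rely on curated registries), the identifiers are arbitrary examples, and `expand_curie` is a hypothetical helper.

```python
# Hypothetical prefix map for illustration; real resolvers use curated registries.
PREFIX_MAP = {
    "NCIT": "http://purl.obolibrary.org/obo/NCIT_",
    "UBERON": "http://purl.obolibrary.org/obo/UBERON_",
}


def expand_curie(curie: str, prefix_map=PREFIX_MAP) -> str:
    """Expand 'PREFIX:local' into a full IRI using the prefix map."""
    prefix, separator, local = curie.partition(":")
    if not separator or not prefix or not local:
        raise ValueError(f"not a CURIE: {curie!r}")
    if prefix not in prefix_map:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefix_map[prefix] + local


print(expand_curie("NCIT:C3058"))
print(expand_curie("UBERON:0002107"))
```

The local part alone (`C3058`) is only meaningful inside its own coding system; the prefix is what makes the complete identifier unambiguous and resolvable.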
Regulatory Genomics and Epigenomics
- secondary/tertiary human genome structure
- functional genome content
- transcription factors & genome interaction
- chemical genome modifications, their effectors and results
- ChIP-Seq
- read mapping
- peak calling
- sequence compression algorithms
Metagenomics
- concept of taxonomic diversity
- concept of microbial community dissimilarity
- how are sequences used to derive an adopted species concept for prokaryotes
- principal steps for 16S rRNA-based taxonomic composition analysis
- essential steps of short sequencing read assembly into contigs and scaffolds
- basic steps of metagenomic analysis: from raw reads to the reconstruction of genomic scaffolds
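For the "community dissimilarity" goal above, the following sketch computes the Bray-Curtis dissimilarity between two abundance profiles. This is one commonly used measure and is shown here as an assumption; the lecture may emphasise other measures. The taxa and counts are made up.

```python
def bray_curtis(sample_a: dict, sample_b: dict) -> float:
    """Bray-Curtis dissimilarity between two taxon -> abundance mappings."""
    taxa = set(sample_a) | set(sample_b)
    # Abundance shared by both samples, summed over all taxa.
    shared = sum(min(sample_a.get(t, 0), sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.values()) + sum(sample_b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 1 - 2 * shared / total


# Made-up counts for two hypothetical communities.
community_1 = {"taxon_A": 10, "taxon_B": 5}
community_2 = {"taxon_A": 5, "taxon_B": 5, "taxon_D": 10}

print(bray_curtis(community_1, community_2))
print(bray_curtis(community_1, community_1))  # identical communities give 0.0
```

The value ranges from 0 (identical composition) to 1 (no shared taxa); for the two example communities it is 15/35, roughly 0.43.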
Proteomics
- principles of proteome organization in the cell
- key experimental and computational concepts for the collection and analysis of high confidence protein-protein interaction data
- peptide fragmentation
- target-decoy approach
- protein quantification
- MassSpec similarity queries
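The target-decoy approach listed above can be sketched as follows. This assumes the simple estimate "decoy hits divided by target hits above a score threshold", which is one common variant; the scores are made up and `fdr_at_threshold` is a hypothetical helper, not a function of any proteomics package.

```python
def fdr_at_threshold(psms, threshold: float) -> float:
    """Estimate the false discovery rate among peptide-spectrum matches.

    psms is a list of (score, is_decoy) pairs; higher scores are better.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    # Decoy hits approximate the number of false target hits at this threshold.
    return decoys / targets if targets else 0.0


# Made-up search results: (score, matched a decoy sequence?)
PSMS = [
    (9.1, False), (8.7, False), (8.0, False), (7.5, True),
    (7.2, False), (6.9, False), (6.0, True), (5.5, False),
]

print(fdr_at_threshold(PSMS, 7.0))  # 1 decoy / 4 targets = 0.25
print(fdr_at_threshold(PSMS, 8.0))  # 0 decoys / 3 targets = 0.0
```

Raising the threshold lowers the estimated FDR but also shortens the list of accepted identifications, which is the trade-off behind choosing score cut-offs.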
Biological Networks
- Protein interaction and metabolic networks
- databases and online resources for different types of pathway and interaction data
- detection of protein complexes
- graphs, nodes, edges, paths
- geodesics, graph diameter
- common types of degree distributions
- adjacency matrix
- shortest path matrix
- assortative and disassortative graphs
- community (module) detection
- cliques
- motifs, graph representations of metabolic networks
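Several of the graph terms above (adjacency matrix, degree, shortest paths, diameter) can be tried out on a toy example. The sketch below uses only the standard library and a made-up undirected graph; it assumes an unweighted, connected network and is meant as a study aid rather than a reference implementation.

```python
from collections import deque

# Toy undirected network with made-up edges.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D"), ("D", "E")]


def adjacency_matrix(edges):
    """Return sorted node names and a symmetric 0/1 adjacency matrix."""
    nodes = sorted({node for edge in edges for node in edge})
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return nodes, matrix


def shortest_path_lengths(matrix, source):
    """Breadth-first search: geodesic distances from one node index."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, connected in enumerate(matrix[u]):
            if connected and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


nodes, adj = adjacency_matrix(EDGES)
# The degree of a node is the sum of its row in the adjacency matrix.
degrees = {node: sum(row) for node, row in zip(nodes, adj)}
all_dist = [shortest_path_lengths(adj, i) for i in range(len(nodes))]
# The diameter is the longest geodesic (valid here because the graph is connected).
diameter = max(max(d.values()) for d in all_dist)

print(degrees)   # {'A': 1, 'B': 3, 'C': 2, 'D': 3, 'E': 1}
print(diameter)  # 3 (the path A - B - D - E)
```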
Text Mining
- text mining pipelines & (current) common programs/applications
- article/literature repositories (with focus on accessibility)
- processing steps in text mining
- stemming etc.
- common problems in text mining
- search engine precision metrics
- benchmarking
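Two of the items above, n-grams as a processing step and precision metrics for search results, can be sketched in a few lines. The text, document identifiers and relevance judgements are made up, and both function names are hypothetical; real pipelines add tokenisation rules, stemming and ranking-aware metrics.

```python
def word_ngrams(text: str, n: int):
    """Return all word n-grams of a lower-cased, whitespace-tokenised text."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def precision_recall(retrieved, relevant):
    """Set-based precision and recall of a search result."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


print(word_ngrams("Protein kinase inhibitor binding", 2))

# Made-up example: four documents returned, three documents actually relevant.
precision, recall = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(precision, recall)
```

In the example, two of the four returned documents are relevant (precision 0.5) and two of the three relevant documents were found (recall of about 0.67).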
Semantic web, RDF, Ontologies
- semantic web
- elements
- benefits
- components of ontologies
- stack of standards in semantic web and their functions
- RDF for modeling data
- OWL/OBO for modeling a biomedical domain
- querying knowledge graphs for answering biomedical questions
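The idea of modeling data as RDF-style triples and querying the resulting graph can be illustrated without any special library. The sketch below is a toy under stated assumptions: the `ex:` identifiers and the predicate `ex:associatedWith` are invented for this example, and `match` is a hypothetical helper that mimics a single triple pattern; real applications would use an RDF store and SPARQL.

```python
# Toy knowledge graph as (subject, predicate, object) triples.
# All "ex:" names are made up for illustration.
TRIPLES = {
    ("ex:TP53", "rdf:type", "ex:Gene"),
    ("ex:TP53", "ex:associatedWith", "ex:LiFraumeniSyndrome"),
    ("ex:BRCA1", "rdf:type", "ex:Gene"),
    ("ex:BRCA1", "ex:associatedWith", "ex:BreastCancer"),
}


def match(pattern, triples=TRIPLES):
    """Return sorted triples matching a pattern; None acts as a wildcard."""
    return sorted(
        triple for triple in triples
        if all(p is None or p == value for p, value in zip(pattern, triple))
    )


# "Which things are genes?"
print([s for s, _, _ in match((None, "rdf:type", "ex:Gene"))])

# "Which genes are associated with breast cancer?"
print([s for s, _, _ in match((None, "ex:associatedWith", "ex:BreastCancer"))])
```

Combining several such patterns over shared variables is, in essence, what a graph query language does when answering more complex biomedical questions.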
Clinical Bioinformatics & Personalized Medicine
- genomic variants (types, numbers)
- reference genome(s)
- main bottlenecks of molecular diagnostics in the clinical setting
- goals of many personalized health initiatives
- currently favoured clinical NGS technology
- clinical trial participation
Genomic data & privacy
- reasons for needing many genomes
- genomic privacy and re-identification (concepts)
- principle of re-identification attacks over the Beacon protocol
- long range familial searches
- opinions about risk vs. opportunities
- direct-to-consumer genetic testing -> what, how
- technical and regulatory solutions against privacy breaches & data abuse
- genotype-phenotype (G2P) (ab-)use