Bioinformatics, Biomedical Data Science and Computational Biology in the Zurich Area
Multiple research groups at the University of Zurich (UZH), the ETH Zurich (ETHZ) and other larger Zürich area institutions (PSI, ZHAW ...) carry out research in computational biology, biomedical data science and bioinformatics methods, in fields as diverse as statistical and functional genomics, evolutionary biology, biostatistics, medical informatics, and neurobiology. The individual research groups form a loose confederation of scientists with common research interests and collaborations as well as shared teaching activities.
Colorectal Cancer Through the Lens of Whole Transcriptome Imaging
Helena Crowell (CNAG)
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Multi-cellular systems orchestrate function through an interplay of their molecular constituents and structural organization. Recent spatial molecular imaging (SMI) technologies can profile tissues at molecular resolution while retaining target coordinates, albeit with a limited number of (transcript) targets measurable thus far. In an immune-oncology context, these data have the potential to characterize (pre)malignant tissues with molecular precision, thereby laying the groundwork for personalized medical decisions.
Here, we leveraged 1k-plex SMI data on the human tonsil to develop a computational pipeline to process these data Continue reading
Spatially variable genes: methods, benchmarking, and future directions
Lukas Weber
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract N/A
Bias Correction and Differential Motif Activity in ATAC-seq Data
Jiayi Wang
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract The Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) has become widely adopted for assessing chromatin accessibility due to its efficiency in time and input material. However, technical biases in ATAC-seq data can complicate the downstream analysis tasks, and here we focus on identifying differentially-active transcription factors (TFs). Continue reading
Marker identification and cell annotation approaches for (spatial) transcriptomic data to unravel tissue cellular heterogeneity
Jinjin Chen (WEHI)
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Advances in technology have produced an unprecedented volume and variety of multi-omics and spatial datasets crucial for identifying markers in biological and medical research. My PhD focused on developing methods for marker and cell type identification across data types ranging from bulk RNA-seq and single-cell RNA-seq to single-cell spatial transcriptomics.
In this presentation, I’ll first briefly introduce mastR, a Bioconductor package for automatically identifying markers in bulk RNA sequencing data. mastR employs a rank-product based test by leveraging statistical results from edgeR and limma to identify markers with precision akin to expert curation. Next, I’ll describe an ensemble workflow that integrates various cell annotation methods Continue reading
Robust data-driven gene expression state inference for RNA-seq using curated intergenic regions
Alessandro Brandulas Cammarata (UniL)
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Bulk and single-cell RNA-Seq are powerful and widely used techniques that provide quantitative information on gene expression. While the primary focus of many applications is to estimate gene expression levels, a crucial first step in assessing gene activity is to distinguish technical or biological transcriptional noise from actively expressed genes. Typically, this is accomplished by setting an arbitrary abundance threshold (e.g., TPM>2) for calling a gene expressed. However, because technical and biological noise levels vary substantially across RNA-Seq experiments, such a fixed threshold can lead either to a loss of information if it is set too high or to an increase in false positives if it is set too low.
To overcome these limitations, we propose an updated dynamic approach. Continue reading
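The fixed-threshold problem described in this abstract can be illustrated with a minimal Python sketch. The gene names and TPM values are hypothetical, and the function `call_expressed` is an illustrative helper, not part of the method presented in the talk.

```python
def call_expressed(tpm_by_gene, threshold=2.0):
    """Return the set of genes whose TPM exceeds a fixed threshold."""
    return {gene for gene, tpm in tpm_by_gene.items() if tpm > threshold}


# Hypothetical TPM values for the same three genes in two experiments.
# geneA is assumed to be transcriptional noise in both; only the noise level differs.
low_noise = {"geneA": 0.3, "geneB": 2.5, "geneC": 40.0}
high_noise = {"geneA": 2.4, "geneB": 2.5, "geneC": 40.0}

# With TPM > 2, geneA is (wrongly) called expressed in the noisier experiment.
print(sorted(call_expressed(low_noise)))
print(sorted(call_expressed(high_noise)))

# Raising the threshold removes geneA but also drops geneB.
print(sorted(call_expressed(high_noise, threshold=3.0)))
```

No single cutoff separates noise from signal in both experiments, which is the motivation for a per-experiment, data-driven threshold.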
On the identification of differentially-active transcription factors from ATAC-seq data
Emanuel Sonder
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract ATAC-seq has emerged as a rich epigenome profiling technique, and is commonly used to identify Transcription Factors (TFs) underlying given phenomena. A number of methods can be used to identify differentially-active TFs through the accessibility of their DNA-binding motif, however little is known on the best approaches for doing so. We benchmarked several such methods using a combination of curated datasets with various forms of short-term perturbations on known TFs, as well as semi-simulations. Continue reading
Benchmarking current spatial transcriptomics domain identification methods
Jieran Sun
Zurich Seminars in Bioinformatics
- 12:15 ZOOM Call only!
Abstract Spatial transcriptomics (ST) preserves transcriptomic information within a spatial context, enabling an unprecedented understanding of tissue architecture and cellular heterogeneity. Since 2020, over 50 methods have been developed to identify spatial domains in ST datasets. Existing benchmarking efforts are limited by imbalances in dataset and technology inclusion, hindering a comprehensive overview of current methods. This study is a pilot demonstration of SpaceHack, a collaborative and community-driven benchmarking framework initiated by more than 40 researchers from across the world. Continue reading
Copy number variation heterogeneity reveals biological inconsistency in hierarchical cancer classifications
Ziying Yang
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Cancers are heterogeneous diseases with unifying features of abnormal and consuming cell growth, where the deregulation of normal cellular functions is initiated by the accumulation of genomic mutations in cells of - potentially - any organ. At diagnosis malignancies typically present with patterns of somatic genome variants on diverse levels of heterogeneity. Among the different types of genomic alterations, copy number variants (CNV) represent a distinct, near-ubiquitous class of structural variants.
Bioinformatics for beginners course at ZHAW
ZHAW Wädenswil course
This is a hands-on course covering bioinformatics tools and best practices for genomic analyses, including cancer genomics and machine learning approaches.
Deadline for registration is May 20; the course itself will take place on 03.06.2024 and 07.06.2024, at ZHAW Wädenswil.
Systematic comparison of sequencing-based spatial transcriptomic methods with cadasSTre and SpatialBench
Prof Matthew Ritchie, Epigenetics and Development Division (WEHI)
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Sequencing-based Spatial Transcriptomics (sST) allows gene expression to be measured within complex tissue contexts. Although a wide array of sST technologies are currently available to researchers, efforts to comprehensively benchmark different platforms are currently lacking. The inherent variability across technologies and datasets poses challenges in formulating standardized evaluation metrics. Continue reading
Proximal Short Tandem Repeat Variations as Regulators of Gene Expression across Multiple Cancers
Feifei Xia (ZHAW)
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Short tandem repeats (STRs) have been reported as potential regulators of gene expression changes in healthy populations and colorectal cancer (CRC). While STR mutations are also enriched in stomach cancer (STAD) and endometrial cancer (UCEC), especially in tumors with the microsatellite instability (MSI) phenotype, the functional impacts of STRs on gene expression remain unclear across cancer types. In particular, previous studies were mostly limited to association analyses between a single STR locus and gene expression.
Using whole exome sequencing, gene expression, and clinical information from TCGA, we examined the linear relationships between gene expression and STR length variations in the gene region. We identified 714, 359, and 101 expression STRs (eSTRs) in CRC, STAD, and UCEC, respectively, of which 10 genes are shared in all three cancer types. We then extended our analysis by performing conditional analyses to discover the eSTRs showing independent functional effects within the context of the MSI phenotype. Moreover, we found that the lengths of eSTRs proximal to a gene are often correlated with each other and have consistent effects on the expression levels of the gene. Our findings can expand the catalog of genetic variants that affect gene expression, underscoring the importance of STR variants in tumorigenesis and suggesting coordinated potential regulatory mechanisms across diverse cancer types.
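As a rough illustration of what a linear relationship between STR length and gene expression means, the following Python sketch fits an ordinary least-squares line to hypothetical data. The abstract does not specify the model or covariates used, so the single-predictor fit, the helper `fit_line`, and all values here are assumptions made for illustration only.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: STR length (repeat units) and expression of the
# nearby gene (arbitrary normalized units) in five tumor samples.
str_lengths = [10, 11, 12, 13, 14]
expression = [5.0, 5.5, 6.0, 6.5, 7.0]

slope, intercept = fit_line(str_lengths, expression)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A non-zero slope that survives significance testing and multiple-testing correction would mark the locus as a candidate eSTR; those steps are omitted from this sketch.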
Integration of Clinical, Laboratory and Multi-Omics Data to Leverage Machine Learning for Diagnostics
Jan Kruta, FHNW Muttenz
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Early and accurate diagnosis is crucial for preventing disease development and defining therapy strategies. Due to predominantly unspecific symptoms, diagnosis of autoimmune diseases is notoriously challenging. Clinical decision support systems are a promising method with the potential to enhance and expedite precise diagnostics by physicians. However, due to the difficulties of integrating and encoding multi-omics data with clinical values, as well as a lack of standardization, such systems are often limited to certain data types. Accordingly, even sophisticated data models fall short when making accurate disease diagnoses and presenting data analyses in a user-friendly form. Therefore, the integration of various data types is not only an opportunity but also a competitive advantage for research and industry.
We have developed an integration pipeline that enables the use of machine learning for patient classification based on multi-omics data in combination with clinical values and laboratory results, which resulted in 95% prediction accuracy for the autoimmune diseases studied. Our results deliver insights into autoimmune disease research and have the potential to be adapted for applications across disease conditions.
Comparison of Single-cell Long-read and Short-read Transcriptome Sequencing of Patient-derived Organoid Cells of ccRCC: Quality Evaluation of the MAS-ISO-seq Approach
Natalia Zajac, Functional Genomics Center Zurich
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Single-cell RNA sequencing is used in profiling gene expression differences between cells. Short-read sequencing platforms provide high throughput and high-quality information at the gene level, but the technique is hindered by limited read length and fails to provide an understanding of cell heterogeneity at the isoform level. This gap has recently been addressed by long-read sequencing platforms, which preserve full-length transcript information during sequencing.
To objectively evaluate the information obtained from both methods, we sequenced four samples of patient-derived organoid cells of clear cell renal cell carcinoma and one healthy sample of kidney organoid cells on Illumina Novaseq 6000 and PacBio Sequel IIe. For both methods, for each sample, the cDNA was derived from the same 10x Genomics 3' single-cell gene expression cDNA library. Here we present the technical characteristics of both datasets and compare cell metrics and gene-level information.
We show that the two methods largely overlap in the results but we also identify sources of variability which present a set of advantages and disadvantages to both methods.
Characterizing Intestinal Fibroblast Diversity in Health and Inflammatory Bowel Disease Through Single-Cell Analysis
Melissa Ensmenger - Master Thesis Presentation
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Crohn’s disease (CD), a form of inflammatory bowel disease (IBD) characterized by chronic intestinal inflammation, emerges from a complex interplay of genetic, environmental, and cellular factors. Among these, intestinal fibroblasts represent a pivotal yet underexplored component. Leveraging single-cell RNA sequencing and CITE-seq, we conducted a comprehensive analysis of 154,000 cells from the lamina propria, submucosa, and gut-associated lymphoid tissues of both the colon and small intestine derived from a cohort of 4 CD patients and 6 healthy controls. This investigation enabled the identification of four distinct fibroblast populations, each characterized by distinctive transcriptional identities, functional roles, and diverse distribution across intestinal sites. Through differential gene expression analysis and gene set enrichment analysis, we investigated changes in the predominant fibroblast population in the ileal lamina propria of CD patients, uncovering a potential involvement of these cells in disease pathogenesis.
Daniel Schulz - Large-scale cancer data collection and analysis within the IMI2 funded IMMUcan consortium
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Many cancer patients benefit from immune checkpoint inhibitor therapy. However, response rates vary across cancer types and patients and current biomarkers have limited capacity to identify patients who might respond to treatment. To profile the tumor microenvironment (TME) with multiple technologies and potentially identify novel biomarkers the IMMUcan consortium set out in 2019 to profile and analyze the TME from up to 3000 cancer patients until 2026. Tumor samples from every patient undergo bulk RNA-seq, whole exome sequencing, whole-slide multispectral imaging and imaging mass cytometry. We describe our workflow enabling us to reproducibly measure thousands of samples by imaging mass cytometry. We show first results of a retrospective cohort of non-small cell lung cancer where we highlight our approaches for the identification of biomarkers using the imaging technologies and comparisons with the sequencing based data.
Daniel Incicau - Evaluation of cell type annotation methods in multiplexed imaging
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract In the field of biomedical imaging, multiplexed imaging technologies have become indispensable for complex tissue analysis, enabling unprecedented depth of cellular and molecular understanding. This technology allows the spatial profiling of molecular markers within tissues at a single-cell resolution. However, the novelty of the technology requires specialised computational tools to extract biologically meaningful information. In recent years, numerous methods have emerged for each data analysis step in multiplexed imaging. Yet, it is still unclear which method is best and under what circumstances. Continue reading
Open science: improving your research workflow to increase transparency and reproducibility
Two-day workshop for graduate students in Neuchatel
- December 7-8 2023, University of Neuchâtel, Chemistry building, room GE14
- Organizer: Dr Dominique Roche, Social Sciences and Humanities Research Council of Canada
Abstract This two-day workshop will teach graduate students how to adopt open science practices to promote transparency and reproducibility in research. The first day of the workshop will consist of 5 lectures by invited speakers on topics including the replication crisis in science and the benefits of open science practices, publication bias and measures to counter it, statistical misconceptions and best practices, common mistakes in study design and reporting, and open data and code. Continue reading
Siyuan Luo - Benchmarking computational methods for single-cell chromatin data analysis
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Single-cell chromatin accessibility assays, such as scATAC-seq, are increasingly employed in individual and joint multi-omic profiling of single cells. As scATAC-seq and multi-omics datasets continue to accumulate, challenges in analyzing such sparse, noisy, and high-dimensional data become pressing. Specifically, one challenge relates to optimizing the processing of chromatin-level measurements and efficiently extracting information to discern cellular heterogeneity. This is of critical importance, since the identification of cell types is a fundamental step in current single-cell data analysis practices.
We benchmarked 8 feature engineering pipelines derived from 5 recent methods to assess their ability to discover and discriminate cell types. By using 10 metrics calculated at the cell embedding, shared nearest neighbor graph, or partition levels, we evaluated the performance of each method at different data processing stages. This comprehensive approach allowed us to thoroughly understand the strengths and weaknesses of each method and the influence of parameter selection.
Our analysis provides guidelines for choosing analysis methods for different datasets. Overall, feature aggregation, SnapATAC, and SnapATAC2 outperform latent semantic indexing-based methods. For datasets with complex cell-type structures, SnapATAC and SnapATAC2 are preferred. With large datasets, SnapATAC2 and ArchR are most scalable.
Carino Gurjao - Genetic Analyses of Colorectal Cancer across Ancestries and Mutagenic Exposures
ETHZ special presentation hosted by Valentina Boeva
- 11:00 at ETHZ, Sonneggstrasse 3, 8092 Zürich
Abstract Colorectal cancer (CRC) has several established risk factors, including diet and microbiome. However, their mutagenic effect has not been observed directly in patients’ tumors and the individuals or ethnic groups who are most susceptible to diet-induced carcinogenesis are yet to be identified. In particular, CRC disproportionately affects African American (AA) patients who have worse clinical outcomes, but the molecular underpinnings are still poorly understood. We hypothesize that mutational signature analyses in CRC, coupled with epidemiologic, tumor molecular, micro-environmental, and patient germline data, can be linked to pre-diagnosis diet and specific germline alterations, which can further inform cancer prevention efforts. Continue reading
Frederik Philipona - Simulation of spatial transcriptomic data
Zurich Seminars in Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract A parametric and interpretable method to simulate spatial transcriptomic data is proposed. The simulated data is used to compare domain recognition methods.
Navigating the global ocean microbiome through a web-based genome collection
Zurich Seminars in Bioinformatics - Samuel Miravet Verde (Sunagawa Lab ETHZ)
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract High-throughput sequencing has empowered researchers to profile taxonomic, genomic, and functional compositions of ocean microbiomes through the reconstruction of metagenome-assembled genomes (MAGs) from environmental samples. As part of an environment that is marked by vast physicochemical gradients and immense phylogenetic and functional microbial diversity, the ocean microbiome offers ample opportunities to study the ecology and evolution of microbial communities as well as gene-encoded functions in the context of its natural environment. However, this information is currently scattered across the primary literature and sequence databases and/or difficult to access, hindering the systematic analysis of large-scale metagenomic datasets at global scale.
Investigation of the potential of Covariates for Multiphenotype Studies (CMS) to improve genetic risk prediction
Zurich Seminars in Bioinformatics - Anja Estermann
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Polygenic risk scores (PRSs) estimate the genetic risk of an individual to develop a certain disease or trait and find application in disease prevention and personalized medicine. These scores are calculated based on summary statistics results of genome-wide association studies (GWAS). Covariates for multiphenotype studies (CMS) is an algorithm that has been developed to increase the detection of associated genetic variants in GWAS by leveraging covariates measured on behalf of a primary outcome. Continue reading
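The way a polygenic risk score is built from GWAS summary statistics can be sketched as a weighted sum of an individual's allele dosages, with the GWAS effect sizes as weights. This is the textbook form only; the abstract does not say which PRS method was used, and the variant identifiers, effect sizes, dosages, and the helper `polygenic_risk_score` below are hypothetical.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (0, 1 or 2) using GWAS effect sizes.

    Variants missing from `dosages` are treated as dosage 0, which is an
    assumption of this sketch rather than a general recommendation.
    """
    return sum(
        beta * dosages.get(variant, 0)
        for variant, beta in effect_sizes.items()
    )


# Hypothetical per-variant effect sizes taken from GWAS summary statistics.
effect_sizes = {"variant_1": 0.2, "variant_2": -0.1, "variant_3": 0.05}

# Hypothetical genotype of one individual: copies of the effect allele.
dosages = {"variant_1": 2, "variant_2": 1, "variant_3": 0}

print(round(polygenic_risk_score(dosages, effect_sizes), 3))
```

Any method that improves the GWAS effect-size estimates changes the weights in this sum, which is how better variant detection could translate into better risk prediction.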
SuperCellCyto: Enabling efficient analysis of large scale cytometry datasets
Zurich Seminars in Bioinformatics - Givanna Putri (WEHI)
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract The rapid advancements in cytometry technologies have enabled the quantification of up to 50 proteins across millions of cells at the single-cell resolution. The analysis of cytometry data necessitates the use of computational tools for tasks such as data integration, clustering, and dimensionality reduction. While numerous computational methods exist in the cytometry and single-cell RNA sequencing (scRNAseq) fields, many are hindered by extensive run times when processing large cytometry data containing millions of cells. Existing solutions, such as random subsampling, often prove inadequate as they risk excluding small, rare cell subsets.
To address this, we propose a practical strategy that builds on the SuperCell framework from the scRNAseq field. The supercell concept involves grouping single cells with highly similar transcriptomic profiles, and has been shown to be an effective unit of analysis for scRNAseq data.
We show that for cytometry datasets, there is no loss of information by grouping cells into supercells. Further, we demonstrate the effectiveness of our approach by conducting a series of downstream analyses on six publicly available cytometry datasets at the supercell level, and successfully replicating previous findings performed at the single cell level. We present a computationally efficient solution for transferring cell type labels from single-cell multiomics data which combines RNA with protein measurements, to a cytometry dataset, allowing for more precise cell type annotations.
Our SuperCellCyto R package and the associated analysis workflows are available on our GitHub repositories (github.com/phipsonlab/SuperCellCyto and phipsonlab.github.io/SuperCellCyto-analysis/).
splicekit: a comprehensive toolkit for splicing analysis from short-read RNA-seq
Zurich Seminars in Bioinformatics - Gregor Rot
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Splicing of RNA is a fundamental biological process. Dysregulation of splicing has been implicated in many human diseases and successfully exploited as a therapeutic target. Splicing analysis using short-read RNA-sequencing is a powerful technique to triage mechanism of action and safety profiles of drug candidates. There are, however, currently no comprehensive open-source software pipelines for such applications. Continue reading
Modeling the Tumor Microenvironment with Graph Concept Learning
Zurich Seminars in Bioinformatics - Santiago Castro Dau
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Heterogeneity is an emergent property of tumors, linked to cancer resistance and poor treatment outcomes 1. Geometric deep learning using graph representations has emerged as a promising approach to investigate tumor heterogeneity. Still, these approaches suffer from interpretability and transferability limitations 2. In this work, we propose a geometric deep-learning model with an interpretability framework to predict metadata from spatial datasets. Continue reading
Microbiome Signatures in Host - Spatial and Single Cell Transcriptomics
Zurich Seminars in Bioinformatics - Stine Anzböck
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract The tumor-associated microbiome is an integral part of the tumor microenvironment and holds great promise for a better understanding of carcinogenesis, response to therapy and early diagnosis of cancer. However, current methods limit the understanding of the spatial distribution of the microbiome in the tumor and their interaction with host cells. To this end, we have developed SpaceMicrobe, a novel computational framework to detect and profile the microbiome in 10X Visium Spatial Gene Expression data, a widely used commercially-available spatial transcriptomics technology. Continue reading
Variational Autoencoders Supporting Conditioning in Single Cell Transcriptomics and Their Consistency
Zurich Seminars in Bioinformatics - Eljas Röllin
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Prediction of cell response to perturbation is a key goal in computational biology. To this end, different Variational Autoencoder (VAE) based models have been suggested in the past. We are interested in such models and their ability to recognize their own output. We argue that this is a desirable model property, and phrase it in the context of cycle consistency. Continue reading
Analysis of copy number variant heterogeneity in the hierarchical NCIt cancer classification system
Zurich Seminars in Bioinformatics - Ziying Yang
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Cancers are heterogeneous diseases with unifying features of abnormal and consuming cell growth, where the deregulation of normal cellular functions is caused by accumulative mutations. Due to these mutations, malignant tumors present with patterns of somatic genome variants on diverse levels of heterogeneity. Among the different mutation types, genomic copy number aberrations (CNA) have emerged as one of the most distinct classes. Continue reading
Detecting Single Cell Blasts in Acute Myeloid Leukaemia using an Auto-Encoder
Zurich Seminars in Bioinformatics - Alice Driessen
- 12:15 UZH Irchel Y13-K-05 and ZOOM Call
Abstract Acute myeloid leukaemia (AML) is a haematological cancer in the bone marrow, with accumulation and expansion of immature cells of the myeloid lineage. Unfortunately, almost half of paediatric AML patients relapse after standard treatment with chemotherapy or stem cell transplantation. Personalised medicine, including immunotherapies, has the potential to target chemotherapy-resistant cells and achieve long-term remission. However, identifying suitable targets for AML therapy is hampered by high patient heterogeneity, complex disease evolution and challenging discrimination between aberrant and developing cells. Therefore, we aimed to build a single-cell cytometry AML map to identify malignant cells and place them along the developmental trajectory using data from 20 patients and three time points over the course of the disease. Continue reading
Multi-omics studies of cancer signalling and immune infiltration
Zurich Seminars in Bioinformatics - Pedro Beltrao
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Genetic alterations in cancer cells trigger oncogenic transformation, a process largely mediated by the dysregulation of kinase and transcription factor (TF) activities. While the mutational profiles of thousands of tumours have been extensively characterised, the measurements of protein activities have been technically limited until recently. We compiled public data of matched genomics and (phospho)proteomics measurements for 1,110 tumours that we used to estimate activity changes in 218 kinases and 292 TFs. Continue reading
Inflammatory bowel disease at single-cell and sub-cellular spatial resolution
Zurich Seminars in Bioinformatics - Helena L. Crowell
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract CosMx Spatial Molecular Imaging (SMI) (NanoString) is a recently developed technology that enables spatially resolved profiling of tissue at molecule-level resolution at an unprecedented scale (up to 1,000 RNA analytes). We applied scRNA-seq and CosMx SMI to investigate the molecular basis of ulcerative colitis and Crohn’s disease: chronic inflammatory bowel diseases (IBD) that show a perplexing heterogeneity in manifestations and response to treatment. Continue reading
Searching in nucleotide archives at Petabase scale with MetaGraph
Zurich Seminars in Bioinformatics - Mikhail Karasikov
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract High-throughput sequencing data is continuously accumulating in massive archives such as the NCBI Sequence Read Archive (SRA), which currently contains over 50 Petabytes of sequences. However, despite recent advances, even such a basic operation as sequence search effectively remains intractable due to the lack of cost-efficient solutions. To address this problem and enable aggregated data analysis at Petabase scale, we developed MetaGraph, a tool for indexing very large collections of sequences in de Bruijn graphs. Continue reading
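The idea of indexing sequences in a de Bruijn graph can be sketched in a few lines of Python. This is a toy illustration of the data structure only, not MetaGraph's implementation or API; the function names `build_de_bruijn` and `contains`, the value of `k`, and the example sequence are all assumptions.

```python
from collections import defaultdict


def build_de_bruijn(sequences, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers."""
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Each k-mer links its prefix (k-1)-mer to its suffix (k-1)-mer.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def contains(graph, query, k):
    """Check whether every k-mer of the query is an edge in the graph."""
    if len(query) < k:
        return False
    return all(
        query[i + 1:i + k] in graph.get(query[i:i + k - 1], ())
        for i in range(len(query) - k + 1)
    )


graph = build_de_bruijn(["ACGTAC"], k=3)
print(contains(graph, "CGTA", k=3))
print(contains(graph, "ACGA", k=3))
```

Note that a match here only means that all k-mers of the query occur somewhere in the indexed data. A path through the graph need not correspond to a contiguous stretch of any single input sequence, and a petabase-scale tool additionally needs compressed representations and per-sample annotations that this sketch leaves out.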
Nucleosome footprints in the cell-free DNA of cancer patients
Zurich Seminars in Bioinformatics - Zsolt Balázs
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Cell-free DNA (cfDNA) is free floating DNA released from cells into bodily fluids like blood. cfDNA sequencing is being increasingly utilized in cancer diagnosis and monitoring. While many applications focus on detecting cancer-specific mutations, cfDNA can potentially tell us much more about the disease. Nucleosome-protected regions are slow to degrade in the bloodstream, therefore, nucleosome occupancy in the cell of origin can be inferred from cfDNA fragmentation. Continue reading
Genome-wide study of variations in Plasmodium falciparum and their association with different malaria interventions in Tanzania
Zurich Seminars in Bioinformatics - Catherine Bakari Mvaa (Christian Nsanzabana group @ TPH)
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Malaria affects millions of people globally, and it remains one of the major public health problems occurring in most parts of Tanzania, causing thousands of deaths each year, especially among vulnerable persons in rural settings with poor health systems. Even though the burden has decreased due to the deployment of different interventions in recent years, all those gains are threatened by biological threats that largely affect the success of different control strategies. The parasite uses different mechanisms to escape diagnostic tools and drug treatment, challenging one of the key control strategies, based on prompt detection and case management. Sequencing technologies, especially Next Generation Sequencing (NGS) have played a major role in understanding parasite genomic variation including drug resistance, diagnostic resistance, and population structure. Continue reading
A journey into the nucleus: Lnc-ing paraspeckles, lncRNA, RNA processing and phase separation
NCCR RNA and Disease seminar
Archa Fox, The University of Western Australia
Host: Prof. Dr. Magdalini Polymenidou
The seminar will be preceded by a short presentation:
“Unraveling genetic modifiers and cellular pathways of physiological TDP-43 oligomerization.”
Laura de Vos (Polymenidou group, UZH)
Comprehensive network prediction for any fully sequenced genome in the STRING database
Zurich Seminars in Bioinformatics - Damian Szklarczyk
Abstract Every day new genomes are sequenced and existing genomes are re-sequenced and re-annotated. In the new version of STRING the user can submit any fully sequenced genome for complete network and functional annotation. Doing so requires only minimal input from the user: the proteome in FASTA format and, if known, the taxonomic clade of the given genome. Continue reading
Towards a quantitative understanding of long-range transcriptional regulation
ETHZ seminar - Luca Giorgetti, FMI Basel
- recently published Nature paper on “Nonlinear control of transcription through enhancer–promoter interactions”
CompbioZurich Website Redesign
Re-implementing the Site with MkDocs
The compbiozurich.org website has been redesigned based on the MkDocs framework, using the Material template system. Continue reading
Novel Minimal HDV-like Ribozymes
Zurich Seminars in Bioinformatics - Lukas Malfertheiner
Abstract The hepatitis delta virus (HDV) ribozyme catalyzes site-specific self-cleavage and was first discovered in the single-stranded circular RNA virus HDV. HDV-like ribozymes (DRZ) share the conserved nested double-pseudoknot structure motif. Through a bioinformatic search using an adapted minimal active DRZ motif, we discovered hundreds of novel minimal DRZ sequences in bacteriophage genomes associated with the human microbiome. A subclass of these hits was identified to occur in direct conjunction with viral tRNA genes, indicating that we have found a solely RNA-based factor that can site-specifically cleave tRNA 3'-trailers, thus making large protein enzymes redundant for this task.
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Supervised spatial inference of dissociated single-cell data with _SageNet_
Zurich Seminars in Bioinformatics - Elyas Heidari
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Spatially-resolved transcriptomics uncovers patterns of gene expression at supercellular, cellular, or subcellular resolution, providing insights into spatially variable cellular functions, diffusible morphogens, and cell-cell interactions. However, for practical reasons, multiplexed single cell RNA-sequencing remains the most widely used technology for profiling transcriptomes of single cells, especially in the context of large-scale anatomical atlassing. Continue reading
The resemblance is uncanny a.k.a how similar are cancer cell lines really to their origins?
Zurich Seminars in Bioinformatics - Rahel Paloots
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Cancer cell lines are good and inexpensive models to study disease mechanisms and identify possible new drugs. However, cancer cell lines have often been found to be either misidentified or contaminated. Additionally, they represent only a small clonal population of their origin and accumulate additional mutations during in vitro handling. Continue reading
Copy number variation data calibration towards integrative analysis in cancer
Zurich Seminars in Bioinformatics - Hangjia Zhao
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Genomic instability is common in human cancer. As a form of genomic instability, copy number variations (CNV) play an important role in cancer development. Elucidating the relationship between CNV and cancer evolution can improve the understanding of the pathogenetic mechanism of cancer.
Continue reading
Frequent co-regulation of splicing and polyadenylation by RNA-binding proteins inferred with MAPP
Zurich Seminars in Bioinformatics - Maciek Bak | Biozentrum, University of Basel | Swiss Institute of Bioinformatics
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
The processing of nascent pre-mRNA consists of many steps, where splicing and primary transcript polyadenylation play key roles in determining transcriptome and, subsequently, proteome diversity. Several studies indicate that many RNA-binding proteins (RBPs) act on both splicing and 3’ end processing, but the context of this multi-level regulation and the full spectrum of RBPs involved are yet to be discovered. To help answer these questions, we have developed a novel computational method to identify RBPs that could shape pre-mRNA maturation.
Continue reading
An ion channel that allows you to see at night: the cryo-EM structure of the rod CNG channel opens up the hypothesis of mRNA editing on the CNGB1 sequence
Zurich Seminars in Bioinformatics - Jacopo Marino, Laboratory of Biomolecular Research, Paul Scherrer Institut
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract I will present the latest advances in cryo-electron microscopy and how this technique allows protein structures to be solved at a fast pace, even for difficult protein targets, thus revolutionizing the field of structural biology and biomedicine. I will then focus on recent developments in solving the structure of an ion channel located in rod photoreceptors of the retina, a protein complex that was discovered nearly 35 years ago and whose structure has finally been revealed (article).
Continue reading
Taxonomic profiling of metagenomes from diverse environments with mOTUs3
Zurich Seminars in Bioinformatics - Hans-Joachim Ruschewey
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Taxonomic profiling is a crucial task in microbiome research that aims at detecting and quantifying the relative abundance of microbes in biological samples. However, in many environments, varying fractions of microbial species still lack a sequenced genome and remain unaccounted for during taxonomic profiling based on shotgun metagenomic data.
Continue reading
mVIRs: A tool to identify and locate inducible prophages in microbial genomes using NGS data
Zurich Seminars in Bioinformatics - Mirjam Zünd
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract An estimated 40-50% of mammalian gut bacteria carry a prophage in their genome. Prophages can be induced upon stress and produce phage particles to infect and kill bacterial populations and thereby directly modulate the composition of the gut microbiota. Despite advances in studying prophages using whole-genome sequencing, challenges in identifying and locating inducible prophages within their host have limited our understanding of phage-bacterial interactions.
Continue reading
CanIsoNet beta-version: A Database to Study the Functional Impact of Isoform Switching Events in Cancer
Zurich Seminars in Bioinformatics - Tülay Karakulak
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Alternative splicing, as an essential regulatory mechanism in normal mammalian cells, is frequently disturbed in cancer. Switches in the expression of alternative isoforms can alter protein interaction networks of associated genes giving rise to cancer progression and metastases. We have recently analyzed the pathogenic impact of switching events in 1209 cancer samples covering 27 different cancer types.
Continue reading
Structured metadata for genomic correlations in the Progenetix database
Zurich Seminars in Bioinformatics - Ziying Yang
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Abstract Enormous amounts of biomedical data have been and are being produced at an unprecedented rate by researchers all over the world. However, in order to enable reuse, there is an urgent need to understand the structure of datasets, the experimental conditions under which they were produced and the information that other investigators may need to make sense of the data.
Continue reading
Functional implications of short tandem repeat (STR) variation in NGS data
Zurich Seminars in Bioinformatics - Max Verbiest
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
Short tandem repeats (STRs) are genomic elements that consist of consecutive repetitions of a 1-6 nucleotide motif. They are abundant in the human genome and are known to be mutational hotspots. Several cancer types are known to have a microsatellite instability high (MSI-H) subtype wherein the DNA mismatch repair system is defective, leading to hypermutation of STRs.
Continue reading
Identifying clones and quantifying diversity in repertoire sequencing data.
Zurich Seminars in Bioinformatics - Siyuan Luo
- 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call
The adaptive immune system is remarkable for its ability to produce immunoglobulins that can specifically bind to a wide variety of antigens. During adaptive immune responses, activated B cells expand and undergo accelerated mutation of their B cell receptor (BCR), forming a clone of diversified cells that can be related back to a common ancestor.
Continue reading