Copy number variation heterogeneity reveals biological inconsistency in hierarchical cancer classifications
Ziying Yang

Zurich Seminars in Bioinformatics

  • 12:15 UZH Irchel Y55-l-06/08 and ZOOM Call

Abstract: Cancers are heterogeneous diseases that share the unifying feature of abnormal, consuming cell growth, in which the deregulation of normal cellular functions is initiated by the accumulation of genomic mutations in cells of potentially any organ. At diagnosis, malignancies typically present with patterns of somatic genome variants that show varying degrees of heterogeneity. Among the different types of genomic alterations, copy number variants (CNVs) represent a distinct, near-ubiquitous class of structural variants.

Cancer classifications are foundational for patient care and oncology research. Terminologies such as the National Cancer Institute Thesaurus (NCIt) provide large sets of hierarchical cancer classification vocabularies and promote data interoperability and ontology-driven computational analysis. To assess how well categorical classifications reflect the underlying biology, we conducted a meta-analysis of inter-sample genomic heterogeneity at different levels of the classification hierarchies, based on genome-spanning CNV profiles from 97,142 individual samples across 512 cancer entities. The large number of individual samples allows a more thorough exploration of genomic tumor heterogeneity between and within diagnostic concepts: we quantify heterogeneity by measuring the dissimilarity of CNV events among cancer entities and their subsets. Our results highlight specific biological mechanisms across cancer entities, with the potential to improve patient stratification and to inform future enhancements of cancer classification systems.
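
As a rough illustration of what "measuring dissimilarity of CNV events" within and across classification levels could look like, the following minimal sketch compares toy CNV profiles. Everything in it is an assumption made for illustration and is not taken from the talk: the profile encoding (one value per genomic bin, with 1 for gain, -1 for loss, 0 for no change), the distance (mean absolute difference), the entity names, and the toy data.

```python
from itertools import combinations
from statistics import mean


def dissimilarity(a, b):
    """Mean absolute difference between two equal-length CNV profiles.

    Assumed metric for illustration; the study's actual measure may differ.
    """
    if len(a) != len(b):
        raise ValueError("profiles must cover the same genomic bins")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mean_pairwise(profiles):
    """Average dissimilarity over all sample pairs in a group."""
    pairs = list(combinations(profiles, 2))
    if not pairs:
        return 0.0  # a single sample has no pairwise heterogeneity
    return mean(dissimilarity(a, b) for a, b in pairs)


# Hypothetical toy data: two child entities under one parent concept,
# each sample encoded as gain (1), loss (-1), or no change (0) per bin.
samples = {
    "entity_A": [[1, 1, 0, 0], [1, 1, 0, -1]],
    "entity_B": [[-1, 0, 0, 1], [-1, 0, 1, 1]],
}

# Heterogeneity inside each child entity.
within = {name: mean_pairwise(p) for name, p in samples.items()}

# Heterogeneity at the parent level, pooling all child samples.
parent_level = mean_pairwise([p for ps in samples.values() for p in ps])

print(within)
print(round(parent_level, 3))
```

In this toy setup the pooled parent level is more heterogeneous than either child entity, which is the pattern one would expect when a hierarchy separates biologically distinct groups. A parent that is no more heterogeneous than its children, or a child that is as heterogeneous as its parent, would point to the kind of inconsistency between classification and biology that the abstract describes.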