Thematic Month Week 5: Networks and Molecular Biology / Mois thématique Semaine 5 : Réseaux et biologie moléculaire

Organiser(s): Baudot, Anais; Hubert, Florence; Moss, Brigitte; Rémy, Elisabeth; Tichit, Laurent; Vignes, Matthieu
Date(s): 2 March 2020 - 6 March 2020
Related URL: https://conferences.cirm-math.fr/2305.html

Matrix factorisation techniques for data integration

By Kim-Anh Lê Cao

Gene module detection methods aim to group genes with similar expression profiles in order to shed light on functional relationships and co-regulation, and to infer gene regulatory networks. Methods proposed so far use clustering to group genes based on global similarity in their expression profiles (co-expression), bi-clustering to group genes and samples simultaneously, or network inference to model regulatory relationships between genes. In this talk I will focus on multivariate matrix decomposition techniques that enable dimension reduction and the identification of molecular signatures. We will consider two types of assays: bulk and single-cell. Bulk transcriptomics assays use RNA-sequencing techniques to monitor the average expression profile of all the constituent cells, but fail to identify the distinct transcriptional profiles of different cell types. Single-cell assays (scRNA-seq) use RNA-seq techniques similar to those used for bulk cell populations, but provide unprecedented resolution at the cell level to understand cellular heterogeneity and uncover new biology. However, scRNA-seq data present new computational and analytical challenges because of their sheer size (100,000 to 500,000 cells are sequenced) and their zero-inflated distribution due to technical drop-outs. I will illustrate how we can use matrix factorisation techniques to mine these data and identify gene modules that underpin the molecular mechanisms of cell identity in scRNA-seq data. I will also give further perspective on how similar concepts could be extended to integrate different omics data types (e.g. bulk transcriptomics, proteomics, metabolomics) and identify tightly connected multi-omics signatures that holistically describe a biological system.
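As a rough illustration of the idea described in the abstract, the sketch below applies a rank-1 sparse matrix factorisation to a toy zero-inflated count matrix and reads off a candidate gene module from the nonzero loadings. It is not the speaker's method: the simulated data, the function names `simulate_counts` and `sparse_first_component`, and the hard-thresholding rule are all assumptions made for this example, and only NumPy is used.

```python
import numpy as np

def simulate_counts(n_cells=200, n_genes=50, module_size=10, dropout=0.3, seed=0):
    """Simulate a toy zero-inflated count matrix with one planted gene module."""
    rng = np.random.default_rng(seed)
    # Two hypothetical cell groups; the first `module_size` genes are
    # up-regulated in group 1 and form the planted module.
    group = np.repeat([0, 1], n_cells // 2)
    mean = np.full((n_cells, n_genes), 2.0)
    mean[group == 1, :module_size] = 20.0
    counts = rng.poisson(mean)
    # Mimic technical drop-outs by zeroing a random fraction of entries.
    counts[rng.random(counts.shape) < dropout] = 0
    return counts, group

def sparse_first_component(X, keep=10, n_iter=50):
    """Rank-1 factorisation X ~ u v^T with at most `keep` nonzero gene loadings."""
    # Start from the leading right singular vector of X.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    v = vt[0].copy()
    for _ in range(n_iter):
        # Update cell scores, then gene loadings (alternating power steps).
        u = X @ v
        u /= np.linalg.norm(u)
        v = X.T @ u
        # Hard-threshold: zero every loading below the `keep`-th largest magnitude.
        cutoff = np.sort(np.abs(v))[-keep]
        v[np.abs(v) < cutoff] = 0.0
        v /= np.linalg.norm(v)
    return u, v

counts, group = simulate_counts()
X = np.log1p(counts)        # variance-stabilising log transform
X = X - X.mean(axis=0)      # centre each gene
u, v = sparse_first_component(X, keep=10)
module = np.flatnonzero(v)  # indices of genes selected into the module
print("Selected genes:", module)
```

Here `u` holds one score per cell and `v` one loading per gene; the genes with nonzero loadings are the candidate module. Published sparse factorisation methods typically use a soft-thresholding (lasso-type) penalty and extract several components, so this sketch only conveys the general mechanism.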

Video information

Citation data

  • DOI: 10.24350/CIRM.V.19620803
  • Cite this video: Lê Cao, Kim-Anh (05/03/2020). Matrix factorisation techniques for data integration. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19620803
  • URL: https://dx.doi.org/10.24350/CIRM.V.19620803

