Thematic Month Week 5: Networks and Molecular Biology


Organizer(s) Baudot, Anais ; Hubert, Florence ; Moss, Brigitte ; Rémy, Elisabeth ; Tichit, Laurent ; Vignes, Matthieu
Date(s) 02/03/2020 - 06/03/2020
linked URL https://conferences.cirm-math.fr/2305.html

Matrix factorisation techniques for data integration

By Kim-Anh Lê Cao

Gene module detection methods aim to group genes with similar expression profiles in order to shed light on functional relationships and co-regulation, and to infer gene regulatory networks. Methods proposed so far use clustering to group genes based on global similarity in their expression profiles (co-expression), bi-clustering to group genes and samples simultaneously, or network inference to model regulatory relationships between genes. In this talk I will focus on multivariate matrix decomposition techniques that enable dimension reduction and the identification of molecular signatures. We will consider two types of assays: bulk and single-cell. Bulk transcriptomics assays use RNA-sequencing techniques to monitor the average expression profile of all the constituent cells, but fail to identify the distinct transcriptional profiles of different cell types. Single-cell assays (scRNA-seq) use RNA-seq techniques similar to those used for bulk cell populations, but provide unprecedented resolution at the cell level to understand cellular heterogeneity and uncover new biology. However, scRNA-seq data present new computational and analytical challenges because of their sheer size (100K to 500K cells are sequenced) and their zero-inflated distribution due to technical drop-outs. I will illustrate how we can use matrix factorisation techniques to mine these data and identify gene modules that underpin the molecular mechanisms of cell identity in scRNA-seq. I will also give further perspective on how similar concepts could be extended to integrate different omics data types (e.g. bulk transcriptomics, proteomics, metabolomics) and identify tightly connected multi-omics signatures that holistically describe a biological system.
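As a rough illustration of the idea in the abstract, the sketch below applies a plain principal component analysis (computed through a singular value decomposition) to a simulated, zero-inflated count matrix and reads off the genes with the largest loadings on each component as a candidate module. This is not the speaker's method: the function name `top_genes_per_component`, the simulated data, the 30% drop-out rate and the two hypothetical cell types are all assumptions made for the example.

```python
import numpy as np


def top_genes_per_component(expr, n_components=2, n_top=5):
    """Return the genes with the largest absolute loadings on each component.

    expr is a (cells x genes) matrix of non-negative counts.
    Hypothetical helper for illustration only.
    """
    # Log-transform to damp the influence of a few highly expressed genes.
    logged = np.log1p(expr)
    # Centre each gene so that the SVD below is equivalent to PCA.
    centred = logged - logged.mean(axis=0)
    # Thin SVD: centred = U @ diag(s) @ Vt, and the rows of Vt are gene loadings.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    loadings = vt[:n_components]
    modules = [np.argsort(-np.abs(row))[:n_top] for row in loadings]
    # Share of total variance carried by each retained component.
    explained = (s[:n_components] ** 2) / np.sum(s ** 2)
    return modules, explained


# Simulated data (an assumption, not real scRNA-seq measurements).
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 50
cell_type = rng.integers(0, 2, size=n_cells)  # two hypothetical cell types
rates = np.full((n_cells, n_genes), 2.0)
rates[cell_type == 1, :5] = 20.0  # genes 0 to 4 are up-regulated in type 1
counts = rng.poisson(rates)
dropout = rng.random((n_cells, n_genes)) < 0.3  # simulated technical drop-outs
counts[dropout] = 0

modules, explained = top_genes_per_component(counts, n_components=2, n_top=5)
print("Candidate module on component 1:", sorted(modules[0].tolist()))
print("Variance explained:", np.round(explained, 3))
```

On real single-cell data of the size mentioned above, a dense SVD like this would not scale, and sparse or penalised factorisations would likely be preferred.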

Information about the video

Citation data

  • DOI 10.24350/CIRM.V.19620803
  • Cite this video Lê Cao, Kim-Anh (05/03/2020). Matrix factorisation techniques for data integration. CIRM. Audiovisual resource. DOI: 10.24350/CIRM.V.19620803
  • URL https://dx.doi.org/10.24350/CIRM.V.19620803

Bibliography

  • DRIER, Yotam, SHEFFER, Michal, and DOMANY, Eytan. Pathway-based personalized analysis of cancer. Proceedings of the National Academy of Sciences, 2013, vol. 110, no. 16, p. 6388-6393. - https://doi.org/10.1073/pnas.1219651110
  • LIU, Chao, SRIHARI, Sriganesh, LÊ CAO, Kim-Anh, et al. A fine-scale dissection of the DNA double-strand break repair machinery and its implications for breast cancer therapy. Nucleic Acids Research, 2014, vol. 42, no. 10, p. 6106-6127. - https://doi.org/10.1093/nar/gku284
  • LIU, Chao, SRIHARI, Sriganesh, LAL, Samir, et al. Personalised pathway analysis reveals association between DNA repair pathway dysregulation and chromosomal instability in sporadic breast cancer. Molecular Oncology, 2016, vol. 10, no. 1, p. 179-193. - https://doi.org/10.1016/j.molonc.2015.09.007
  • HASTIE, Trevor and STUETZLE, Werner. Principal curves. Journal of the American Statistical Association, 1989, vol. 84, no. 406, p. 502-516. - https://www.tandfonline.com/doi/abs/10.1080/01621459.1989.10478797
  • SAELENS, Wouter, CANNOODT, Robrecht, and SAEYS, Yvan. A comprehensive evaluation of module detection methods for gene expression data. Nature Communications, 2018, vol. 9, no. 1, p. 1-12. - https://doi.org/10.1038/s41467-018-03424-4
  • COMON, Pierre. Independent component analysis, a new concept? Signal Processing, 1994, vol. 36, no. 3, p. 287-314. - https://doi.org/10.1016/0165-1684(94)90029-9
  • YAO, Fangzhou, COQUERY, Jeff, and LÊ CAO, Kim-Anh. Independent principal component analysis for biologically meaningful dimension reduction of large biological data sets. BMC Bioinformatics, 2012, vol. 13, no. 1, p. 24. - http://dx.doi.org/10.1186/1471-2105-13-24
  • SCHAUM, Nicholas, KARKANIAS, Jim, NEFF, Norma F., et al. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris: The Tabula Muris Consortium. Nature, 2018, vol. 562, no. 7727, p. 367. - https://dx.doi.org/10.1038%2Fs41586-018-0590-4
  • LÊ CAO, Kim-Anh, ROSSOUW, Debra, ROBERT-GRANIÉ, Christèle, et al. A sparse PLS for variable selection when integrating omics data. Statistical Applications in Genetics & Molecular Biology, 2008, vol. 7, no. 1, p. 1-29. - https://doi.org/10.2202/1544-6115.1390
  • LÊ CAO, Kim-Anh, BOITARD, Simon, and BESSE, Philippe. Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics, 2011, vol. 12, p. 253. - https://doi.org/10.1186/1471-2105-12-253
  • TENENHAUS, Arthur, PHILIPPE, Cathy, GUILLEMOT, Vincent, et al. Variable selection for generalized canonical correlation analysis. Biostatistics, 2014, vol. 15, no. 3, p. 569-583. - https://doi.org/10.1093/biostatistics/kxu001
  • SINGH, Amrit, SHANNON, Casey P., GAUTIER, Benoît, et al. DIABLO: an integrative approach for identifying key molecular drivers from multi-omics assays. Bioinformatics, 2019, vol. 35, no. 17, p. 3055-3062. - https://doi.org/10.1093/bioinformatics/bty1054
  • ROHART, Florian, GAUTIER, Benoît, SINGH, Amrit, et al. mixOmics: An R package for ‘omics feature selection and multiple data integration. PLoS Computational Biology, 2017, vol. 13, no. 11, p. e1005752.
  • LEE, Amy H., SHANNON, Casey P., AMENYOGBE, Nelly, et al. Dynamic molecular changes during the first week of human life follow a robust developmental trajectory. Nature Communications, 2019, vol. 10, no. 1, p. 1-14. - https://doi.org/10.1038/s41467-019-08794-x
  • LÊ CAO, Kim-Anh, COSTELLO, Mary-Ellen, LAKIS, Vanessa Anne, et al. MixMC: a multivariate statistical framework to gain insight into microbial communities. PLoS ONE, 2016, vol. 11, no. 8. - https://dx.doi.org/10.1371%2Fjournal.pone.0160169
  • WANG, Yiwen and LÊ CAO, Kim-Anh. Managing batch effects in microbiome data. Briefings in Bioinformatics, 2019. - https://doi.org/10.1093/bib/bbz105
  • BODEIN, Antoine, CHAPLEUR, Olivier, DROIT, Arnaud, et al. A generic multivariate framework for the integration of microbiome longitudinal studies with other data types. Frontiers in Genetics, 2019, vol. 10. - https://dx.doi.org/10.3389%2Ffgene.2019.00963
