PhD Students' Seminar
Tuesday 9 February 2016 at 17:00 - Room 9.11
May Taha
Modelling gene expression in cancers
Gene expression is tightly controlled to ensure a wide variety of cell types and functions. The development of diseases, particularly cancers, is invariably related to deregulation of these controls. Our project aims to model the link between RNA expression and DNA features in regulatory regions (typically, the presence or absence of transcription factor motifs in promoters). Several studies have shown that penalized linear regression (LASSO) is suitable for this problem, which requires selecting variables in high-dimensional data. Using a similar approach, we were able to model gene expression in. Further investigations showed that the inferred model does not perform equally well for all genes. More precisely, the model only fits a certain class of genes with specific DNA features. To overcome these limits, we plan to design a clustering approach that identifies co-regulated genes and to train a specific regression model for each cluster. In addition, other regression models, such as Random Forest, were also tested on these data.
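As a purely illustrative aside, the variable-selection idea behind LASSO can be sketched on synthetic data. The sketch below is not the speaker's implementation: the data are randomly generated, the function names (soft_threshold, lasso_coordinate_descent), the penalty value alpha = 0.1 and the choice of a plain coordinate-descent solver are all assumptions made for the example. Each row stands for a hypothetical gene, each column for the presence (1) or absence (0) of a hypothetical motif, and only the first two motifs actually influence the simulated expression value.

```python
import random


def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - b0 - X w||^2 + alpha * ||w||_1.

    The intercept b0 is not penalised. Returns (b0, w).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b0 = sum(y) / n
    resid = [y[i] - b0 for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # motif absent everywhere: nothing to fit
            # Correlation of column j with the residual, adding back its own
            # current contribution.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                w[j] = new_w
        # Re-fit the unpenalised intercept so the residual has zero mean.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, w


# Synthetic data (assumption): 200 "genes", 20 binary "motif" features.
rng = random.Random(0)
n_genes, n_motifs = 200, 20
true_w = [2.0, -1.5] + [0.0] * (n_motifs - 2)  # only motifs 0 and 1 matter
X = [[1.0 if rng.random() < 0.3 else 0.0 for _ in range(n_motifs)]
     for _ in range(n_genes)]
y = [sum(xj * wj for xj, wj in zip(row, true_w)) + rng.gauss(0.0, 0.1)
     for row in X]

intercept, weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("motifs kept by the LASSO fit:", selected)
```

The L1 penalty shrinks every coefficient and sets weakly supported ones exactly to zero, which is what makes the method usable as a variable selector when there are many candidate motifs. In practice an established solver and a cross-validated penalty would normally be preferred over this hand-written loop.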