{"id":252,"date":"2018-01-19T19:12:09","date_gmt":"2018-01-19T18:12:09","guid":{"rendered":"https:\/\/journot-lab.igf.cnrs.fr\/?page_id=252"},"modified":"2022-07-28T22:43:12","modified_gmt":"2022-07-28T20:43:12","slug":"stats-for-genomics","status":"publish","type":"page","link":"https:\/\/journot-lab.igf.cnrs.fr\/index.php\/stats-for-genomics\/","title":{"rendered":"Stats for genomics"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-186 alignleft\" src=\"https:\/\/journot-lab.igf.cnrs.fr\/wp-content\/uploads\/2018\/01\/cell8.png\" alt=\"\" width=\"82\" height=\"64\"><span style=\"color: #000000;\">We developed <strong>ISoLDE<\/strong> (<strong>Integrative Statistics of alleLe Dependent Expression<\/strong>), a novel non-parametric statistical method that directly infers allelic imbalance from RNA-seq data. ISoLDE learns the distribution of a specifically designed test statistic from the data and calls genes allelically imbalanced, bi-allelically expressed, or undetermined.&nbsp;<a style=\"color: #000000;\" href=\"https:\/\/bioconductor.org\/packages\/release\/bioc\/html\/ISoLDE.html\">ISoLDE<\/a> is available as a Bioconductor package.<\/span><\/p>\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/journot-lab.igf.cnrs.fr\/wp-content\/uploads\/2018\/12\/ISoLDE-1-1024x952.png\" alt=\"\" class=\"wp-image-582\" width=\"627\" height=\"583\" srcset=\"https:\/\/journot-lab.igf.cnrs.fr\/wp-content\/uploads\/2018\/12\/ISoLDE-1-1024x952.png 1024w, https:\/\/journot-lab.igf.cnrs.fr\/wp-content\/uploads\/2018\/12\/ISoLDE-1-300x279.png 300w, https:\/\/journot-lab.igf.cnrs.fr\/wp-content\/uploads\/2018\/12\/ISoLDE-1-768x714.png 768w, https:\/\/journot-lab.igf.cnrs.fr\/wp-content\/uploads\/2018\/12\/ISoLDE-1.png 1359w\" sizes=\"auto, (max-width: 627px) 100vw, 627px\" \/><figcaption> <br><strong>Output of the resampling version of 
ISoLDE.<\/strong> For each gene, the variability (denominator value of the <em>S<sub>g<\/sub><\/em> statistic) was plotted against the allelic bias (numerator value of the <em>S<sub>g<\/sub><\/em> statistic). Violet crosses correspond to bi-allelically expressed (\u2018BA\u2019) genes. Red and blue crosses correspond to genes called maternally and paternally imbalanced (\u2018AI mat\u2019 and \u2018AI pat\u2019, respectively). Grey crosses correspond to undetermined (\u2018UN\u2019) genes. Grey circled crosses correspond to flagged genes (consistency or significance flag, \u2018UN_flag\u2019).<br><\/figcaption><\/figure><\/div>\n\n\n\n<p class=\"has-very-dark-gray-color has-text-color\">We also developed <strong>TopoFun<\/strong>, a novel machine learning method to identify functional modules in gene co-expression networks and complement Gene Ontology annotations.<\/p>\n\n\n\n<p class=\"has-very-dark-gray-color has-text-color\">A comprehensive, accurate functional annotation of genes is key to systems-level approaches. Forward and reverse genetics have produced a substantial amount of data on gene functions; yet a large fraction of genes are still poorly annotated, even in model organisms. One possible approach to complement existing annotations is to analyze gene co-expression, as functionally related genes tend to be co-expressed.<\/p>\n\n\n\n<p class=\"has-very-dark-gray-color has-text-color\">Gene co-expression data are represented as high-dimensional graphs in which nodes denote genes and edges denote co-expression. TopoFun is a machine learning method that combines topological and functional information on co-expression modules. We first selected topological descriptors of gene co-expression modules that discriminate between modules made of functionally related genes and modules made of randomly selected genes. 
Using the selected topological descriptors, we constructed a database of functional and random modules and performed Linear Discriminant Analysis to predict the type of a module. Starting from a given Gene Ontology Biological Process (GO-BP), we used a genetic algorithm to find genes whose co-expression with the largest clique of the GO-BP suggests that they may be functionally related.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"791\" height=\"670\" src=\"https:\/\/journot-lab.igf.cnrs.fr\/wp-content\/uploads\/2022\/07\/actu_IGF_journot.png\" alt=\"\" class=\"wp-image-715\" srcset=\"https:\/\/journot-lab.igf.cnrs.fr\/wp-content\/uploads\/2022\/07\/actu_IGF_journot.png 791w, https:\/\/journot-lab.igf.cnrs.fr\/wp-content\/uploads\/2022\/07\/actu_IGF_journot-300x254.png 300w, https:\/\/journot-lab.igf.cnrs.fr\/wp-content\/uploads\/2022\/07\/actu_IGF_journot-768x651.png 768w\" sizes=\"auto, (max-width: 791px) 100vw, 791px\" \/><figcaption><strong>The TopoFun machine learning method.<\/strong> A. Starting from a module of co-expressed genes M<sub>0<\/sub>, TopoFun deleted the genes that were only marginally connected to the module's largest clique and added novel genes that were both highly connected to those of the largest clique and functionally similar, producing the M<sub>f<\/sub> module. B. Distribution of the size ratio, Score<sub>Topo<\/sub> ratio, and Score<sub>Fun<\/sub> ratio. We ran TopoFun on 193 GO-BPs comprising 50-100 genes. For each M<sub>0<\/sub> (=GO-BP) and M<sub>f<\/sub> (=\u2018optimal\u2019 module), we determined the number of genes, the Score<sub>Topo<\/sub>, and the Score<sub>Fun<\/sub>, and plotted the distribution of the ratios of these variables for M<sub>f<\/sub> to M<sub>0<\/sub>. The figures show that the ratios were most often >1, indicating that TopoFun increased module size and improved both topology and functional similarity. 
<\/figcaption><\/figure><\/div>\n\n\n\n<p><br><\/p>\n","protected":false},"excerpt":{"rendered":"<p>We developed ISoLDE (Integrative Statistics of alleLe Dependent Expression), a novel non-parametric statistical method that directly infers allelic imbalance from RNA-seq data. ISoLDE learns the distribution of a specifically designed test statistic from the data and calls genes allelically imbalanced, bi-allelically expressed, or undetermined.&nbsp;ISoLDE is available as a Bioconductor package. We also developed TopoFun, a [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-252","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/journot-lab.igf.cnrs.fr\/index.php\/wp-json\/wp\/v2\/pages\/252","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/journot-lab.igf.cnrs.fr\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/journot-lab.igf.cnrs.fr\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/journot-lab.igf.cnrs.fr\/index.php\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/journot-lab.igf.cnrs.fr\/index.php\/wp-json\/wp\/v2\/comments?post=252"}],"version-history":[{"count":30,"href":"https:\/\/journot-lab.igf.cnrs.fr\/index.php\/wp-json\/wp\/v2\/pages\/252\/revisions"}],"predecessor-version":[{"id":718,"href":"https:\/\/journot-lab.igf.cnrs.fr\/index.php\/wp-json\/wp\/v2\/pages\/252\/revisions\/718"}],"wp:attachment":[{"href":"https:\/\/journot-lab.igf.cnrs.fr\/index.php\/wp-json\/wp\/v2\/media?parent=252"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}