The Networks
Networks are multi-dimensional structures consisting of nodes (in CGN's case, in most instances a gene) and communications between these nodes, called edges. Much like networks created by humans (e.g. for transportation, such as roads, railways and airways, and for social interactions, such as who-knows-who), biological and molecular networks are sparse (i.e. most nodes are connected to only one or a few other nodes) and scale-free (i.e. not randomly connected but interconnected through a small number of highly connected nodes, so-called "hubs"; see Figure on this page). This topology of biological and molecular networks, including gene networks, is essential for the ability to reverse-engineer these networks from genome-wide datasets. The two most frequently used classes of algorithms to infer gene networks from genome-wide DNA and RNA data are based on co-expression and on Bayesian (probabilistic) edges between nodes. We use Bayesian network reconstruction algorithms to integrate genome-wide DNA and RNA data, at both the tissue and the cell level, from the same patients. In this way, using the fact that DNA variation is always causal for changes in RNA levels (e.g. gene expression) and not vice versa, it is possible to distinguish networks that are causal for disease (i.e. disease driving) from those that are reactive (i.e. networks mirroring disease progression). CGN uses these existing algorithms, slightly adjusted so that they can be applied to human datasets for inference of clinical gene networks.
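To make the terms node, edge, hub and sparse concrete, the following is a minimal sketch of the simpler of the two approaches mentioned above, a co-expression network. It is not CGN's pipeline and it is not a Bayesian reconstruction. The gene names, the expression values and the correlation threshold of 0.5 are all invented for illustration. An edge is drawn between two genes when the absolute Pearson correlation of their expression across samples reaches the threshold, and the hub is taken to be the gene with the most edges.

```python
from itertools import combinations
from math import sqrt

# Toy expression levels: one list per gene, one value per sample.
# Gene names and values are hypothetical, chosen so that "HUB" tracks
# the combined signal of A, B and C while D follows none of them.
EXPRESSION = {
    "HUB": [8, 6, 6, 4, 6, 4, 4, 2],
    "A": [6, 6, 6, 6, 4, 4, 4, 4],
    "B": [6, 6, 4, 4, 6, 6, 4, 4],
    "C": [6, 4, 6, 4, 6, 4, 6, 4],
    "D": [6, 4, 4, 6, 4, 6, 6, 4],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.5):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    The threshold is an arbitrary assumption for this toy example.
    """
    edges = []
    for g1, g2 in combinations(expression, 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


def node_degrees(expression, edges):
    """Count how many edges touch each gene (its degree)."""
    degree = {gene: 0 for gene in expression}
    for g1, g2, _ in edges:
        degree[g1] += 1
        degree[g2] += 1
    return degree


edges = coexpression_edges(EXPRESSION)
degree = node_degrees(EXPRESSION, edges)
hub = max(degree, key=degree.get)

# Density: fraction of all possible gene pairs that are actually connected.
n_genes = len(EXPRESSION)
density = len(edges) / (n_genes * (n_genes - 1) / 2)

print("edges:", [(g1, g2, round(r, 3)) for g1, g2, r in edges])
print("degree per gene:", degree)
print("hub:", hub, "density:", density)
```

In this toy network only a minority of the possible gene pairs are connected, and every edge passes through a single gene, which is the pattern that "sparse" and "hub" describe. Co-expression edges of this kind are undirected; distinguishing causal from reactive relationships requires the additional DNA information described above.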
Using Amazon cloud computing, CGN and its partners infer CAD networks from AtheroCode. The top-down network inference approach is essential for CGN's scientific strategy. This refers to the idea that disease networks should first be inferred from genome-wide data generated from the most disease-relevant biopsies, that is, biopsies from patients who suffer from the disease under investigation. The networks inferred from the patients are then further refined (increasing their resolution by adding more nodes and edges) in disease model systems, such as mouse models (to study how the networks evolve to drive disease) and cell culture models (to study the molecular details of the networks in different cell types). The resolution of the CAD networks being inferred, and therefore their statistical capacity to predict disease progression, can also be improved by integrating other types of omics data into the network inference computations. These could be additional in-house datasets, as well as datasets available on the Internet, such as several genome-wide datasets from yeast.