AI- or Correlation-Based Networks
Correlation-based networks remain a fundamental tool in data analysis because of their simplicity, transparency, and efficiency. They offer clear insights into pairwise linear associations and are particularly useful for exploratory analysis, hypothesis generation, and contexts requiring interpretability. However, their scope is limited: they capture only linear, symmetric relationships and cannot represent higher-order or conditional dependencies, so they risk oversimplifying complex systems.

AI-based networks, by contrast, are designed to detect non-linear, multidimensional, and context-dependent interactions. They excel at uncovering hidden structures and improving predictive accuracy through adaptive learning. These strengths make them invaluable in biology, where interactions are dynamic. Yet they come with trade-offs: greater computational demands and reduced interpretability compared with correlation-based approaches.

Comparison between AI and linear correlation-based networks

We compared the Gastrointestinal – AI Ulcerative Colitis network with the Large Intestine – Ulcerative Colitis correlation-based network. T-values generated from both workflows were plotted against each other, a correlation coefficient was calculated, and statistical significance was assessed from the corresponding p-value.

For many gene pairs, normalized t-values were higher when the pair was investigated with the AI workflow rather than the correlation-based workflow. This indicates that AI is able to detect gene-to-gene relations with higher reproducibility across datasets, regardless of differing noise levels. As expected, we also observed many novel, strong, non-linear interactions between genes in the AI network that were not observed in the correlation-based network. The AI network recovered some linear interactions as well. The Spearman correlation coefficient between t-values from both workflows was 0.0823 (p < 0.001).

Strong connection overlap between AI networks and linear correlation-based networks

To validate that the overlap in strong gene connections between the correlation-based networks and the AI networks is greater than expected by chance, a permutation test was applied. For this purpose, we used gene connections involving the 14,033 genes shared between the Gastrointestinal – AI Ulcerative Colitis (UC) network and the Large Intestine – Ulcerative Colitis network. For each gene individually, we counted the number of overlapping strong gene connections (t-value > 0.9) between the networks. A null distribution was constructed by randomly shuffling the gene labels of the Large Intestine – Ulcerative Colitis network 1,000 times and computing the number of overlapping gene connections for each gene in each permutation round. For each gene, an enrichment score was then computed as "observed overlap" / "mean overlap from the null distribution". An empirical p-value was computed as "number of times the permuted overlap >= observed overlap" / 1,000.

Results

Based on the permutation test, approximately 75% of the genes had a positive log2-normalized enrichment score, and about 55% (7,717 of 14,033 genes) were significantly enriched (p <= 0.05). Among the top 100 genes, ranked by significant enrichment score, a clear connection to UC pathogenesis could be seen. For example, these top genes included CEACAM7 and CLCA4. The CEACAM gene family encodes intercellular adhesion molecules that are highly expressed in intestinal epithelial cells, and CEACAM1, 3, 5, 6, and 7 are known to be altered in patients with colon cancer and inflammatory bowel diseases (Saiz-Gonzalo et al., 2021). CLCA4 is a human calcium-activated chloride channel regulator that has been implicated in epithelial-to-mesenchymal transition and discussed as a biomarker for colorectal cancer and UC (Chen et al., 2019). The top 50 genes are presented in Table 1.

Furthermore, GO enrichment analysis of
these top 100 genes pointed to significantly enriched biological processes, such as protein-RNA complex assembly and antimicrobial humoral immune response, both of which are known to be closely related to UC and inflammatory bowel disease. These results imply that AI networks and correlation-based networks are complementary: each has its own unique connections, but both agree on the strongest connections.

Table 1. The top 50 genes ordered by their significant enrichment scores.

| Gene symbol | Enrichment score | p-value |
|-------------|------------------|---------|
| MT-ND6      | 666.67           | <0.001  |
| TSIX        | 333.33           | 0.003   |
| MT-ND2      | 200.00           | <0.001  |
| RPL18A      | 33.15            | <0.001  |
| XIST        | 32.26            | 0.031   |
| PTMA        | 27.78            | 0.036   |
| MUC5B       | 24.39            | 0.040   |
| RPS27A      | 21.28            | <0.001  |
| CLCA4       | 18.45            | <0.001  |
| RPS3A       | 18.30            | <0.001  |
| REG1A       | 18.22            | <0.001  |
| TXNDC5      | 17.32            | <0.001  |
| RPL11       | 17.02            | <0.001  |
| RPS4X       | 16.78            | <0.001  |
| RPL19       | 15.47            | <0.001  |
| RPLP1       | 15.38            | <0.001  |
| AKT2        | 15.04            | 0.004   |
| RPS25       | 14.58            | <0.001  |
| RPL37A      | 14.49            | <0.001  |
| RPL3        | 14.46            | <0.001  |
| RPS11       | 14.20            | <0.001  |
| RPL9        | 14.15            | <0.001  |
| EEF1G       | 13.93            | <0.001  |
| RPL13A      | 13.78            | <0.001  |
| RPS8        | 13.78            | <0.001  |
| RPS18       | 13.61            | <0.001  |
| RPL39       | 13.57            | <0.001  |
| RPS16       | 13.55            | <0.001  |
| RPS12       | 13.03            | <0.001  |
| EEF1A1      | 12.77            | 0.003   |
| PLA2G2A     | 12.56            | 0.001   |
| NACA        | 12.49            | <0.001  |
| RPL38       | 12.47            | <0.001  |
| RPS27       | 12.47            | <0.001  |
| ACTG1       | 12.30            | 0.002   |
| DEFA5       | 12.20            | 0.002   |
| RPS9        | 12.08            | <0.001  |
| CEACAM5     | 12.05            | <0.001  |
| MUC12       | 11.92            | <0.001  |
| CEACAM7     | 11.76            | <0.001  |
| LCN2        | 11.54            | <0.001  |
| CEACAM1     | 11.32            | <0.001  |
| RPL18       | 11.22            | <0.001  |
| SFPQ        | 11.11            | 0.017   |
| RPS14       | 11.09            | <0.001  |
| RPL24       | 10.93            | <0.001  |
| RPL7        | 10.74            | <0.001  |
| RPS6        | 10.64            | <0.001  |
| ARPC5       | 10.64            | <0.001  |
| RPL10A      | 10.59            | <0.001  |

Network validation based on subnetwork similarity

To compare the performance of the correlation-based and AI-based networks, network similarity and dissimilarity analyses were performed, as described in the docid\ uhclpee8 fosx5lplvkqt section. In the similarity sets, we expected to observe core tissue- and disease-related terms. In the dissimilarity sets, we examined terms that showed strong enrichment in the AI network's top results but were not enriched in the correlation-based network. We expected the dissimilarity analysis to reveal non-linear functions, such as regulation of transcription factors and epigenetic functions.

Results

AI breast cancer vs standard breast cancer network

Similarity and dissimilarity analyses were first performed between the Breast – AI Breast Cancer/Tissue network and the Breast – Breast Cancer/Tissue network. In the similarity sets, the top enriched terms (p < 0.05) included chromosome segregation, adaptive immune response, mitotic nuclear division, and regulation of cell cycle process, all of which are important cancer-related biological processes. In the dissimilarity sets, we observed terms such as protein-RNA complex assembly, protein-RNA complex organization, and RNA splicing, representing novel biological insights captured by the AI network. Protein-RNA complex assembly is important in breast cancer because it involves RNA-binding proteins (RBPs) that regulate gene expression post-transcriptionally by influencing the RNA stability, splicing, and translation of cancer-related genes (Wang et al., 2024).

The subnetwork similarity analysis demonstrated that fundamental biological processes were preserved in the similarity enrichment results, while the dissimilarity analysis revealed additional biological processes that have not been identified by standard network construction methods. These findings support the view that the AE approach both recapitulates known disease biology and uncovers novel, relevant pathways.

AI blood vs standard blood network

We performed a similar comparison between the Blood – AI General and Blood – General networks, which include diverse disease conditions affecting blood tissue. In the similarity sets, the top enriched terms included eukaryotic translation elongation, viral mRNA translation, and eukaryotic translation initiation, all of which are fundamental processes relevant to both viral infection response and cancer biology in blood diseases. In the dissimilarity sets, we observed terms such as viral infection pathways, chromatin organization, and miRNA
mediated post-transcriptional gene silencing, representing additional biological mechanisms captured by the AI network.

The subnetwork similarity analysis confirmed that the blood AI network preserved core biological processes related to blood diseases in the similarity enrichment results. Notably, the dissimilarity analysis revealed novel pathways, particularly those related to viral infection mechanisms and epigenetic regulation, which were not prominently identified by standard network construction methods. These findings suggest that the AE approach captures shared disease biology and also uncovers disease-specific regulatory mechanisms relevant to the heterogeneous conditions represented in the blood datasets.

AI UC vs standard UC network

Lastly, comparisons were made between the Gastrointestinal – AI Ulcerative Colitis and Large Intestine – Ulcerative Colitis networks, which represent the tissue-specific disease network for UC. In the similarity sets, the top enriched terms included eukaryotic translation elongation, viral mRNA translation, Toll-like receptor signaling pathway, cell adhesion molecules, and hemostasis. These processes represent fundamental mechanisms underlying inflammatory responses and the maintenance of tissue integrity, which are critical features of UC pathophysiology in the large intestine. The presence of Toll-like receptor signaling and cell adhesion molecules in particular highlights the immune-epithelial interactions central to UC disease biology.

In the dissimilarity sets, we observed terms such as cell cycle, viral infection pathways, interleukin-10 signaling, and RNA polymerase III transcription initiation from type 2 promoter. These represent additional regulatory mechanisms captured by the AI network. IL-10 signaling is particularly relevant because it represents anti-inflammatory pathways that may be dysregulated in UC.

The subnetwork similarity analysis confirmed that the AI UC network preserved core
biological processes related to intestinal inflammation and barrier dysfunction characteristic of UC. Notably, the dissimilarity analysis revealed novel pathways, particularly those related to cell cycle regulation and IL-10 anti-inflammatory signaling, which were not prominently identified by standard tissue-specific network construction methods.

References

G. Saiz-Gonzalo, N. Hanrahan, V. Rossini, R. Singh, M. Ahern, M. Kelleher, S. Hill, R. O'Sullivan, A. Fanning, P. T. Walsh, S. Hussey, F. Shanahan, K. Nally, C. M. O'Driscoll, S. Melgar, Regulation of CEACAM family members by IBD-associated triggers in intestinal epithelial cells, their correlation to inflammation and relevance to IBD pathogenesis. Front. Immunol. 12 (2021), doi: 10.3389/fimmu.2021.655960/pdf. https://pubmed.ncbi.nlm.nih.gov/34394073

H. Chen, Y. Liu, C. J. Jiang, Y. M. Chen, H. Li, Q. A. Liu, Calcium-activated chloride channel A4 (CLCA4) plays inhibitory roles in invasion and migration through suppressing epithelial-mesenchymal transition via PI3K/AKT signaling in colorectal cancer. Med. Sci. Monit. 25, 4176 (2019). https://pmc.ncbi.nlm.nih.gov/articles/PMC6563650

S. Wang, H. Sun, G. Chen, C. Wu, B. Sun, J. Lin, D. Lin, D. Zeng, B. Lin, G. Huang, X. Lu, H. Lin, Y. Liang, RNA-binding proteins in breast cancer: Biological implications and therapeutic opportunities. Crit. Rev. Oncol. Hematol. 195, 104271 (2024). https://doi.org/10.1016/j.critrevonc.2024.104271
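
Supplementary sketch: per-gene permutation test

The permutation test described in the strong connection overlap section can be illustrated with a minimal Python sketch. It is not the implementation used to produce the results on this page. The function names (`strong_neighbors`, `overlap_per_gene`, `permutation_enrichment`), the edge format (a dictionary mapping gene pairs to t-values), and the choice to report no enrichment score when the null mean is zero are all assumptions made for illustration. Only the threshold (t-value > 0.9), the 1,000 label shuffles, and the two formulas come from the text.

```python
import random


def strong_neighbors(edges, threshold=0.9):
    """Map each gene to the set of partners connected with t-value > threshold.

    `edges` is a hypothetical format: {(gene_a, gene_b): t_value}.
    """
    neighbors = {}
    for (gene_a, gene_b), t_value in edges.items():
        if t_value > threshold:
            neighbors.setdefault(gene_a, set()).add(gene_b)
            neighbors.setdefault(gene_b, set()).add(gene_a)
    return neighbors


def overlap_per_gene(neighbors_a, neighbors_b, genes):
    """Count, for each gene, the strong partners it has in both networks."""
    empty = set()
    return {
        g: len(neighbors_a.get(g, empty) & neighbors_b.get(g, empty))
        for g in genes
    }


def permutation_enrichment(ai_edges, corr_edges, genes,
                           n_perm=1000, threshold=0.9, seed=0):
    """Per-gene enrichment score and empirical p-value.

    Assumes every gene appearing in the edge dictionaries is listed in
    `genes` (the shared genes of both networks).
    """
    genes = list(genes)
    rng = random.Random(seed)
    ai = strong_neighbors(ai_edges, threshold)
    corr = strong_neighbors(corr_edges, threshold)
    observed = overlap_per_gene(ai, corr, genes)

    null_total = dict.fromkeys(genes, 0)
    at_least_observed = dict.fromkeys(genes, 0)
    for _ in range(n_perm):
        # Shuffle the gene labels of the correlation-based network only.
        shuffled = genes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(genes, shuffled))
        permuted = {
            relabel[g]: {relabel[p] for p in partners}
            for g, partners in corr.items()
        }
        null = overlap_per_gene(ai, permuted, genes)
        for g in genes:
            null_total[g] += null[g]
            if null[g] >= observed[g]:
                at_least_observed[g] += 1

    results = {}
    for g in genes:
        null_mean = null_total[g] / n_perm
        results[g] = {
            "observed": observed[g],
            # Assumption: the score is left undefined when the null mean is 0.
            "enrichment": observed[g] / null_mean if null_mean > 0 else None,
            "p_value": at_least_observed[g] / n_perm,
        }
    return results
```

Calling `permutation_enrichment(ai_edges, corr_edges, shared_genes)` returns one record per gene with the observed overlap, the enrichment score, and the empirical p-value. With these definitions, a gene with no observed overlap always receives a p-value of 1, because every permuted overlap is at least zero.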