What is Mavatar Discovery?
Mavatar Discovery is a data-driven research platform for systems-level analyses of disease biology. It consists of a broad collection of tissue- and disease-specific deep integrated gene networks. Each network is built from thousands of samples from dozens of independent studies and is further connected by metadata curated by the Mavatar R&D team. The platform is developed to help researchers, pharmaceutical companies, and academic teams analyze complex biological networks, accelerating research in disease and drug discovery and reducing work that traditionally takes months to minutes or hours.

A deep integrated gene network is a graph in which nodes represent genes and edges connect genes that are consistently co-expressed across samples. Unlike single-dataset correlations, robust co-expression networks are built by aggregating relationships across many independent studies, retaining only the connections that are reproducible across biological contexts.

Within Mavatar Discovery, you can explore large-scale biomedical transcriptomic data to uncover disease mechanisms, pathways, and relationships between genes and diseases. This can aid you in discovering potential biomarkers and therapeutic targets, supporting drug discovery, repurposing, and translational research. Mavatar Discovery provides you with ready-to-use interactive gene networks and analyses; no coding or custom bioinformatics pipeline is required.

Data availability

Public repositories such as Gene Expression Omnibus (GEO) and ArrayExpress are constantly growing, now covering tens of thousands of datasets, representing millions of samples and billions of data points, and more data is being added every day. For many biological questions, the limitation lies not in a lack of data but in how to access and make the most of what is already available. If public data sources were used to their full potential, researchers would be able to avoid unnecessary sequencing costs and focus on the missing links rather than on
repetitive studies. Mavatar Discovery tackles this issue by compiling data from many different sources into tissue- and disease-specific deep integrated gene networks, supporting comparability and interpretability of the data available today. Thus, the Mavatar Discovery platform is not only cost-effective but also holds the potential to speed up research and enable scientists to focus their time-consuming laboratory work on novel fields and validation studies.

Strengths of deep integrated network analysis

The strength of studying deep integrated gene networks rests on the observation that genes that are involved in the same biological process tend to be expressed together. Furthermore, genes do not act in isolation; their relationships are important for fully understanding the biology behind specific dysregulations. Another important factor is that genes that are truly connected should be found consistently co-expressed across different datasets, patients, conditions, sources of noise, etc., and not just in one experiment. Thus, the more data from varied sources that can be integrated into the same network, the more confidence can be placed in its ability to find the strongest connections within its biological context.

In other words, by studying deep integrated gene networks based on a broad collection of data, you can learn how the genes work together within the biological context that the network is based on, which biological processes are active, and how the genes are organized as a system.

The Mavatar Discovery approach

Mavatar Discovery consists of a broad collection of tissue- and disease-specific deep integrated gene networks. To ensure consistent findings, each network is constructed from thousands of samples from many independent studies with variable sources of noise. The weights on the edges reflect the number of datasets that support the correlation, as well as their respective correlation values. By including our k-nearest-neighbor approach, focusing on
the strongest co-expression weights within the network, we ensure that the edges represent real gene connections and are not an artefact of random noise. Since our approach is fully data-driven, we also reduce the risk of study bias.

The importance of properly curated metadata

A limiting factor when connecting many data sources into one network is the variation in their associated metadata. Curation of the metadata is crucial to properly assign the different data sources to their respective networks, and to gain as much detailed and comparable information as possible from them. The R&D team at Mavatar has made an extensive effort to curate metadata, such as biopsy sites and disease connections, for each dataset and sample included in the networks. This is a continuous effort, both to gain detail in new networks and network versions and to add more detailed information for further patient stratification analyses.
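
To make the idea behind a deep integrated gene network more concrete, the sketch below shows one way that co-expression evidence from several datasets could be aggregated and then pruned with a k-nearest-neighbor step. It is a simplified illustration and not the Mavatar Discovery implementation: the function names, the correlation threshold (`min_corr`), the minimum number of supporting datasets (`min_support`), and the edge weight (the sum of supporting correlations, so that it grows with both the number of datasets and the strength of the correlation) are all assumptions chosen for the example, as is the toy data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def integrate_networks(datasets, min_corr=0.7, min_support=2):
    """Aggregate positive co-expression across datasets.

    datasets: list of dicts mapping gene name -> list of expression values.
    Returns {(gene_a, gene_b): weight} for gene pairs whose correlation
    reaches min_corr in at least min_support datasets.
    ASSUMPTION: the weight is the sum of the supporting correlations.
    """
    support = {}
    for data in datasets:
        for a, b in combinations(sorted(data), 2):
            r = pearson(data[a], data[b])
            if r >= min_corr:
                support.setdefault((a, b), []).append(r)
    return {
        pair: sum(rs)
        for pair, rs in support.items()
        if len(rs) >= min_support
    }


def knn_prune(edges, k=1):
    """Keep an edge if it is among the k strongest edges of either endpoint."""
    neighbors = {}
    for (a, b), w in edges.items():
        neighbors.setdefault(a, []).append((w, b))
        neighbors.setdefault(b, []).append((w, a))
    kept = {}
    for gene, nbrs in neighbors.items():
        for w, other in sorted(nbrs, reverse=True)[:k]:
            kept[tuple(sorted((gene, other)))] = w
    return kept


# Hypothetical toy data: three small "studies" measuring genes A, B, and C.
toy_datasets = [
    {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]},
    {"A": [1, 3, 2, 5], "B": [2, 6, 4, 10], "C": [1, 1, 2, 1]},
    {"A": [5, 1, 4, 2], "B": [1, 5, 2, 4], "C": [10, 2, 8, 4]},
]

edges = integrate_networks(toy_datasets)
network = knn_prune(edges, k=1)
print(network)
```

In this toy example, genes A and B are strongly co-expressed in two of the three studies, so their edge is retained. Genes A and C are strongly co-expressed in only one study, so that connection is treated as not reproducible and is dropped.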