Custom-made scientific tools help solve complex protein challenges
Making sense of the enormous amounts of data produced by, for example, proteomics studies is no easy task. At CPR, one group of researchers creates scientific tools that help researchers all over the world analyze and visualize protein networks.
Lars Juhl Jensen leads the research group at CPR that spends its time developing creative solutions to complex problems. The group specializes in combining knowledge from text mining of research publications with network biology in order to analyze and visualize known and potential networks of proteins.
“We primarily develop tools that other researchers at CPR and externally base their work on,” says professor and group leader Lars Juhl Jensen.
STRING shows interacting proteins
The group's most widely known and used tool is the STRING database (Search Tool for the Retrieval of Interacting Genes/Proteins). The database integrates all types of protein-protein interactions across more than 5,000 organisms. Researchers can use the tool to see how proteins work together in networks.
STRING is a collaboration with the University of Zurich and the European Molecular Biology Laboratory. Lars Juhl Jensen’s group contributes text mining of research publications. It mines the full text of all open-access publications and the abstracts available in PubMed to extract any knowledge about proteins.
“The text mining enriches the network analysis with both interactions and knowledge about pathways or diseases where a protein might be playing a role. If a protein is mentioned in relation to a disease in a publication or a protein has been found to bind to another, that relation will be made available in STRING. That makes STRING a powerful tool that can help other researchers get the next great idea for uncovering a new pathway or drug target,” says Lars Juhl Jensen.
Highly cited tool
As intended, STRING is a popular tool among scientists, with more than 80,000 monthly users. On top of that comes the number of bioinformatics researchers who simply download the entire database and work with it locally.
Many researchers use STRING as a reference work in their daily research, while others use it for large-scale analysis and visualization of data. With so many users, it is no surprise that the tool is also highly cited. The version released in 2015 has received almost 6,000 citations, and version 11, released in 2019, already has 1,700 citations, making STRING a high-impact tool in the scientific community.
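For readers curious about what working with a local copy can look like, here is a minimal Python sketch. It assumes a links file with three space-separated columns: two protein identifiers and a combined confidence score on a scale from 0 to 1,000. The sample rows, the identifiers, the cutoff of 400 and the function names are all invented for illustration and are not taken from STRING itself.

```python
import io

# Hypothetical excerpt in the assumed layout: protein1 protein2 combined_score.
# The identifiers and scores below are made up for illustration.
SAMPLE_LINKS = """\
protein1 protein2 combined_score
9606.ENSP00000000233 9606.ENSP00000272298 490
9606.ENSP00000000233 9606.ENSP00000253401 198
9606.ENSP00000000412 9606.ENSP00000272298 905
"""


def load_confident_links(handle, min_score=400):
    """Return (protein1, protein2, score) tuples with score >= min_score."""
    links = []
    next(handle, None)  # skip the header line
    for line in handle:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        protein_a, protein_b, score = parts[0], parts[1], int(parts[2])
        if score >= min_score:
            links.append((protein_a, protein_b, score))
    return links


def build_network(links):
    """Build an undirected adjacency map from the filtered links."""
    adjacency = {}
    for protein_a, protein_b, _ in links:
        adjacency.setdefault(protein_a, set()).add(protein_b)
        adjacency.setdefault(protein_b, set()).add(protein_a)
    return adjacency


# In practice the handle would be an opened local file instead of a string.
links = load_confident_links(io.StringIO(SAMPLE_LINKS))
network = build_network(links)
```

Filtering on the score first keeps only the better-supported interactions, and the resulting adjacency map is a simple starting point for asking which proteins are connected to a protein of interest.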
Need-based development
Lars Juhl Jensen and the group usually develop a tool in response to a need from a colleague at CPR or an external partner. For instance, the group had many collaborations with proteomics researchers who wanted to visualize their data for hundreds of proteins on protein networks. That was not possible without programming skills, so together with the Cytoscape team, the group developed stringApp, which allows omics data and all the information from STRING to be visualized in large networks.
“Since we made that tool, our collaborations in that area have gone down. Instead of having to ask us, omics researchers can now simply perform their own network analysis,” says Lars Juhl Jensen. He is very satisfied with that development.
“The driving force for me is to make the tools we develop as widely used as possible so they create value for scientists all over the world. That’s why we focus on making them freely available and user friendly,” he says.
Predicting is the future
Lars Juhl Jensen’s approach of democratizing the tools also ensures that he can keep evolving and developing new, sought-after resources. Looking ahead, he wants to explore deep learning further.
“In the future we will be looking into using deep learning to predict, for instance, disease genes. With deep learning, if you have knowledge about disease genes and networks, you will be able to predict which other genes could be relevant to study for discovering new diagnostic biomarkers or therapeutic targets,” Lars Juhl Jensen concludes.
Get access to all the great tools developed at CPR here.