A research group at the IIMCB institute in Poland has developed computer software for RNA structure prediction and for modelling RNA molecules and RNA-protein complexes.
The International Institute of Molecular and Cell Biology (IIMCB) is a research institute based in Warsaw, Poland. IIMCB comprises nine research groups, ranging from the Laboratory of Structural Biology to the Laboratory of Zebrafish Developmental Genomics. One of the nine is the Laboratory of Bioinformatics and Protein Engineering, which conducts theoretical and experimental research on sequence-structure-function relationships in proteins, nucleic acids and macromolecular complexes. SEQ spoke to group leader Professor Janusz Bujnicki about developing software for RNA structure prediction and for modelling RNA molecules and RNA-protein complexes.
What is the main work and role of IIMCB?
IIMCB was established in 1997, based on an agreement between Poland and UNESCO. Its modus operandi is based on a dedicated parliamentary bill. The main goal of IIMCB is to perform high-quality research in molecular biosciences and to provide training opportunities for junior scientists. Currently, the institute hosts nine research groups, and their research topics cover a wide area of molecular and cell biology, macromolecular structural biology, biochemistry, genomics, and bioinformatics.
I am the leader of one of the research groups. My team comprises experimentalists and theoreticians with a wide range of expertise. We develop computer software, use bioinformatics to make structural and functional predictions, and test hypotheses experimentally. Examples of our work include the development of software for the modelling of protein and RNA structures, bioinformatics-guided identification and characterisation of new enzymes, and computer-aided engineering of macromolecules with new structures and functions.
You and your team are currently working on developing software for the structural prediction and modelling of RNA and RNA-protein complexes. Why is this important and what do you hope the biggest impacts will be?
The cellular and molecular activities of RNAs often require the molecule to fold into a specific structure in order to perform its function. As with proteins, in which the amino acid sequence determines the structure, the RNA sequence directly determines the structure of the ribonucleotide chain and its potential to interact with other molecules. In particular, ribonucleotide residues form canonical A-U and C-G pairs as well as many non-canonical pairs and other interactions. Consecutive canonical base pairs form helices, which are typically linked by loops that form various hairpins, bulges and junctions, together forming the secondary structure. The global shape (3D structure) is formed by the interaction of different parts of the RNA chain, stabilised by a variety of non-canonical pairs. Consequently, understanding the molecular basis of RNA functions, beyond protein coding, requires knowledge of RNA tertiary structure.
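To make the notion of secondary structure concrete, here is a minimal sketch in Python. It assumes the structure is written in the widely used dot-bracket notation, where matching parentheses mark paired residues and dots mark unpaired ones. The function names and the example hairpin are illustrative assumptions, not part of the group's software, and only the A-U and C-G pairs named above are treated as canonical.

```python
# Illustrative sketch only: names and example data are hypothetical.
CANONICAL = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def base_pairs(dot_bracket):
    """Return sorted (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)  # remember the opening partner
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def classify_pairs(sequence, dot_bracket):
    """Split the pairs of a structure into canonical and non-canonical."""
    if len(sequence) != len(dot_bracket):
        raise ValueError("sequence and structure must have equal length")
    canonical, other = [], []
    for i, j in base_pairs(dot_bracket):
        if (sequence[i], sequence[j]) in CANONICAL:
            canonical.append((i, j))
        else:
            other.append((i, j))
    return canonical, other


if __name__ == "__main__":
    # A three-pair helix closed by a four-residue hairpin loop.
    print(classify_pairs("GGCAAAAGCC", "(((....)))"))
```

The consecutive pairs in the example form a helix and the unpaired stretch between them is the hairpin loop. Dot-bracket notation in this simple form cannot express the tertiary contacts that shape the 3D structure.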
Unfortunately, experimental determination of RNA tertiary structure at high resolution is both laborious and difficult, and as a result, the majority of known RNAs remain structurally uncharacterised. To address this problem, various computational tools have been developed that predict RNA secondary structure from sequence. These methods are based on the accumulated knowledge of RNA structures determined so far, including the physical basis of RNA folding and evolutionary considerations (such as the conservation of functionally important motifs).
One of the fundamental challenges of biology is to prove that we sufficiently understand how biological molecules function, by designing molecules that carry out the functions we desire. In the case of biological macromolecules, this requires us not only to predict the structure of molecules that exist, but also to be able to design new molecules that form desired spatial structures. Designing RNA sequences with specific structures has already proven useful in many biomedical and biotechnological applications, such as modifying HIV-1 replication mechanisms, reprogramming cellular behaviour, developing novel logic circuits and constructing RNA nano-objects. I believe that the computational design of new molecules, including RNA, will have a huge impact in the near future.
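As a rough illustration of what predicting secondary structure from sequence can mean, the sketch below implements the classic base-pair maximisation idea (Nussinov-style dynamic programming) in Python. This is a textbook simplification chosen for brevity, not the method used by the group's software. Practical tools rely on thermodynamic energy models and evolutionary information. The function name, the `min_loop` parameter and the restriction to A-U and C-G pairs are assumptions made for this example.

```python
# Illustrative sketch only: a simplified Nussinov-style recursion.
CANONICAL = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def max_pairs(sequence, min_loop=3):
    """Return the maximum number of nested canonical pairs in `sequence`.

    `min_loop` is the minimum number of unpaired residues required
    between two paired positions (a hairpin cannot be arbitrarily tight).
    """
    n = len(sequence)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence i..j (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: residue j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (sequence[k], sequence[j]) in CANONICAL:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_pairs("GGCAAAAGCC"))
```

Counting pairs ignores the stacking energies, loop penalties and non-canonical interactions that real predictors account for, which is one reason such a simple scheme is only a starting point.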
Where do you feel research priorities surrounding RNA diversity should lie and would you agree that these areas are not currently receiving enough attention?
According to the ‘central dogma of molecular biology’ proposed in the 1950s, the role of RNA was primarily to convert the information stored in DNA into proteins. However, since then, RNA molecules have been found to be involved in a variety of other biological processes and they are no longer considered passive messengers of genetic information. Some RNAs, termed ribozymes, have been found to carry out catalytic reactions just like protein enzymes do. Many RNA molecules, or their specific elements, sense environmental or metabolic cues and respond by interacting with other molecules and regulating diverse cellular processes. Various types of RNA molecules have key regulatory roles in human cells, as well as in bacteria and in viruses, so they are very important both in health and in disease. New forms of RNAs with novel functions continue to be discovered and defined, and the contribution of RNA to the life of the cell, while already known to be immense, may still be underestimated.
However, as I mentioned earlier, the world of RNA is not limited to molecules that exist. We can attempt to design and then synthesize an RNA molecule with essentially any sequence. In my personal opinion, considerations of the diversity of RNA should include molecules that have not yet been found in nature but can be created.
In terms of the challenges surrounding RNA research, what would you say are the main themes and how can they be overcome?
Existing methods for RNA structure prediction and design have many severe limitations. Our knowledge of RNA 2D and 3D structures is primarily based on various experimental observations. On the one hand, experiments can be interpreted in terms of specific structural knowledge about a particular RNA molecule; on the other hand, large sets of experimental data collected systematically for various RNAs can be used to infer principles that apply to RNA structure in general.
There is a clear need for new technologies, in particular ones that consider low-resolution experimental data, which can be generated quickly with relatively inexpensive biochemical experiments. A huge limitation is that most methods work on the RNA sequence alone and are unable to consider other molecules, such as proteins or small organic molecules, that often regulate RNA function.
Looking to the future, what do you think it holds for RNA research, and what role will institutions such as IIMCB play?
Methods for RNA 3D structure modelling that enable the use of experimental data as restraints could be significantly improved. This applies both to natural RNA sequences and to artificial ones, designed with computational methods and synthesized in the lab. There are a number of experimental techniques that are incapable of unambiguous RNA structure determination, but that generate data that can potentially be used as restraints in computational modelling.
I believe that significant advances in modelling methods (for RNA, for its complexes with other molecules, and indeed for molecular modelling in general) could be made by adapting the existing methods to use experimental data at the earliest possible stage of processing. This may involve the use of artificial intelligence methods. Computational approaches that deal with the original data may better capture details of biological processes studied experimentally and could potentially lead to discoveries of new biological phenomena; such a direction of development may also prompt theoreticians to collaborate more closely with experimentalists.
Prof. Janusz Bujnicki
Group leader
Laboratory of Bioinformatics and Protein Engineering
International Institute of Molecular and Cell Biology
+48-22 597 0750
iamb@genesilico.pl
http://iimcb.genesilico.pl