Experts from the AIRI Institute for Artificial Intelligence are training the Russian GENA neural network to find viruses and bacteria in bat droppings, Olga Kardymon, a bioinformatician, researcher and head of the AIRI Bioinformatics group, told socialbites.ca.
“We are developing new approaches to applying GENA. For example, we want to apply it to metagenomic sequencing data from the microbial communities in bat droppings. We are testing whether GENA can identify the different viral and bacterial genomes sequenced in these samples.”
GENA is the first language model for DNA trained on the most complete version of the human genome, which was published at the end of March 2022. The range of possible applications is wide: with its help, geneticists can determine how mutations affect gene function, search for different regions of the genome, classify living organisms from sequencing data, and more.
“GENA draws attention to regions of biological significance. So even if we can’t find any statement in the scientific literature that a particular region of the genome affects something, that doesn’t mean the neural network is wrong. Maybe it has found something new, previously unknown to biology. Such regions need to be studied: they will gradually help unravel the mysteries of our genome,” Kardymon said.
To learn more about how neural networks look for mutations in the human genome, create proteins that have never existed in nature, and predict the effectiveness of vaccines and drugs, see the socialbites.ca article.