“B.1.1.7.” “B.1.351.” Before the COVID-19 pandemic, these were just arbitrary strings of letters and numbers. Now we know them as “variants of concern,” versions of the SARS-CoV-2 virus with unique genetic codes.
These variants represent just two branches on an enormous tree of the virus’s genetic diversity. As the virus passes from person to person, it evolves and picks up genetic mutations, creating a sprawling canopy of different virus lineages and variants. Representing that diversity visually, in a way that is accessible to researchers, public health experts and members of the community, is the goal of CoVizu, a Western-developed open-source bioinformatic pipeline and web application.
Genomic surveillance programs around the world are collecting data on these mutations daily and have so far identified more than 300,000 unique genetic versions of the COVID-19 virus, which are grouped into categories called “lineages” or “variants.”
Unlike B.1.1.7 and B.1.351, whose mutations make the virus potentially more infectious, many variants do not measurably change how the virus behaves. They do, however, help to paint an informative epidemiological picture. Closely examining this breadth of genetic information can help track how the virus is moving through the population, identify outbreaks and potentially pinpoint the next variant of concern.
“What we are trying to do is provide some estimate of how the virus is moving around,” said Art Poon, PhD, associate professor at the Schulich School of Medicine & Dentistry and the main developer of CoVizu. The web app tracks these variations and shows visually how the virus is changing genetically as it spreads around the world.
“We also immediately realized that there is so much data that we needed to have some way of enabling users to search through it in a meaningful way,” Poon said.
By creating CoVizu, he and his research team have distilled this massive amount of genomic surveillance information – collected from more than half a million virus samples from around the world – into an easy-to-understand visual representation. A tree graph shows how different variants are related to one another, and a “bead plot” shows when and where in the world each variant has been sampled, as well as how those samples are related.
All of the data for the software is provided by GISAID – a global science initiative that provides open access to real-time genomic data of viruses. To date, CoVizu is one of 15 authorized software collaborations in the world enabled by GISAID and the only one developed in Canada.
The Western-developed software allows users to filter the data by number of samples, sampling date or country of origin. It also shows visually how many mutations each lineage of the virus has accumulated since diverging from the common ancestor.
“A major part of this project is making this data accessible and informative so the average person can look at it and understand what the numbers mean to them,” said Emmanuel Wong, a master’s student supervised by Poon who is part of the CoVizu development team. “What it does is allow people to look at these variants and understand how they got here and where they came from.”
The team, including Roux-Cil Ferreira, Molly Liu, Kaitlyn Wade, Laura Muñoz-Baena, Gopi Gugan and Abayomi Olabode, hopes the software will help epidemiologists and public health experts track how the virus is moving through the population and identify variants of concern.
“CoVizu is an effort to provide another tool in our toolkit for understanding the genetic diversity of these viruses,” said Poon. “We know that genomic epidemiology is important and having this data is important, because the fact that we can see and identify variants of concern is made possible by sharing data.”