Revolutionizing RNA Research
New Tool Accurately Uncovers the Hidden World of Long-Read Sequencing
RNA molecules are key to understanding how genes work in the body. They can be cut and joined in different ways, a process called alternative splicing, which allows a single gene to make several different proteins. This matters in many biological processes, such as when stem cells turn into specific tissue cells. However, when this process goes wrong, it can lead to disease. To understand the root cause of a condition, it's important to look at all the RNA molecules that come from genes, collectively called the transcriptome.
Traditionally, it has been difficult to read RNA molecules from end to end because they are so long. Instead, researchers have used short-read RNA sequencing, which breaks the RNA molecules into shorter pieces of around 200 to 600 bases and then uses computer programs to reconstruct the full RNA sequences. This method is accurate, but the information it can provide about full-length molecules is limited.
Recently, long-read platforms that can sequence RNA molecules in one piece have become available. These platforms don't have to break the RNA molecules before sequencing, but they have a higher error rate, typically between 5% and 20%. This high error rate has made it difficult to tell whether novel RNA molecules discovered in a particular condition or disease are real.
Researchers at the Children's Hospital of Philadelphia have developed a new tool called ESPRESSO that can more accurately discover and quantify RNA molecules from these error-prone long-read RNA sequencing data. ESPRESSO compares all long RNA sequencing reads of a given gene to its corresponding genomic DNA and then uses the error patterns of individual long reads to confidently identify splice junctions, where the RNA molecule has been cut and joined, as well as their corresponding full-length RNA molecules.
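To make the idea of pooling evidence across noisy reads concrete, here is a toy Python sketch. It is not ESPRESSO's actual algorithm: it assumes each read has already been reduced to a list of approximate splice junction positions, and it simply snaps each position to the most commonly seen nearby position. The function name, the `window` and `min_support` parameters, and the example coordinates are all hypothetical.

```python
from collections import Counter


def consensus_junctions(read_junctions, window=5, min_support=2):
    """Toy majority vote over noisy junction positions (illustration only).

    read_junctions: one list of approximate junction coordinates per read.
    window: how far (in bases) a position may be moved to a better-supported one.
    min_support: minimum number of reads needed to keep a junction.
    """
    # Count how often each exact coordinate is observed across all reads.
    counts = Counter(pos for read in read_junctions for pos in read)

    corrected = Counter()
    for read in read_junctions:
        for pos in read:
            # Candidate coordinates within the window (always includes pos itself).
            nearby = [p for p in counts if abs(p - pos) <= window]
            # Prefer the most frequently seen candidate, then the closest one.
            best = max(nearby, key=lambda p: (counts[p], -abs(p - pos), -p))
            corrected[best] += 1

    # Keep only junctions supported by enough reads.
    return {p: n for p, n in corrected.items() if n >= min_support}


# Five hypothetical reads from one gene; some positions are off by one base.
reads = [[100, 250], [100, 251], [101, 250], [100, 250], [400]]
result = consensus_junctions(reads)
print(result)
```

In this made-up example, the off-by-one positions 101 and 251 are pulled to 100 and 250, while the junction at 400, seen in only one read, is dropped. A real tool works from full alignments to the genome and the error patterns of each read rather than from bare coordinates.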
ESPRESSO can accurately discover and quantify different RNA molecules from the same gene, even when the sequencing reads have a high error rate. This will help researchers understand rare genetic diseases and other conditions like cancer. This new tool is an exciting step in the transition from short-read to long-read RNA sequencing, a technological transformation that will allow us to uncover RNA variation in diseases.