RNA mutations

Unlocking the Power of ncRNA: Classifying Pathogenic SNPs for Next-Generation Treatments

Our solution addresses the challenge of predicting treatment response and personalizing cancer treatment by focusing on non-coding RNA (ncRNA) variants. Despite the critical role of ncRNA in gene regulation and cancer pathology, ncRNA variants remain underexplored compared with protein and coding RNA mutations. Advancing innovation in this area is crucial for enhancing precision medicine and improving patient outcomes, particularly in breast cancer.

We leveraged deep learning to evaluate how mutations disrupt the structural integrity of ncRNA and to assess their potential pathogenicity in cancer. Drawing an analogy to a “banana,” we hypothesized that mutations in specific structural regions, such as the surface or tips of the RNA, may have a disproportionate impact on RNA function, just as a small change at the tip of a banana may cause a larger structural shift. Our approach uses the pre-trained RNA-FM model to encode ncRNA sequences into a high-dimensional feature space, capturing both sequential and evolutionary information. These features serve as inputs to a predictive model trained on SNPs associated with cancer survival (from the ncRNA-eQTL database) and neutral SNPs (from the dbSNP database).

Overall, our approach identifies druggable regions within ncRNA, pinpointing structural vulnerabilities that could serve as novel therapeutic targets. In breast cancer, certain ncRNA SNPs have been identified that disrupt the secondary structure of the RNA, impairing tumor suppressor genes. These mutations may contribute to tumor progression, highlighting the potential importance of ncRNA variants in cancer treatment. This approach has the potential to transform personalized oncology treatment by identifying key biomarkers and guiding targeted therapeutic strategies across multiple disease types.

Banana analogy image created with https://huggingface.co/spaces/stabilityai/stable-diffusion

Workflow for Annotating Cancerous and Common SNPs on ncRNA Using RNA Secondary Structure Prediction

Data Collection

Cancerous SNPs: Extracted from the survival-eQTL dataset of the ncRNA-eQTL Database.
Neutral SNPs: Obtained from the snp151common dataset of the dbSNP Database.
ncRNA transcript sequences: Homo sapiens, genome assembly GRCh38.

RNA Secondary Structure Prediction

Model: Utilize the pretrained deep learning model RNA-FM to predict RNA secondary structures. The model was trained on 23 million ncRNA sequences from 800 species.
Input: ncRNA sequences with annotated SNPs.
Output: Pair-wise binding probability.
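The pairwise probabilities above have to be turned into a discrete structure before structural elements can be labeled. The following is a minimal sketch of one way to do that, assuming the output is available as a symmetric L x L matrix of probabilities. The function name `probs_to_dot_bracket`, the 0.5 threshold, the minimum loop size, and the greedy strategy are illustrative assumptions, not the post-processing used by RNA-FM or by this repository.

```python
def probs_to_dot_bracket(probs, threshold=0.5, min_loop=3):
    """Greedily convert a symmetric pairing-probability matrix to dot-bracket.

    Assumptions (not taken from the workflow): pairs below `threshold` are
    ignored, each base pairs at most once, hairpin loops need at least
    `min_loop` unpaired bases, and crossing pairs (pseudoknots) are dropped.
    """
    n = len(probs)
    # Collect candidate pairs (i < j) that are far enough apart.
    candidates = [
        (probs[i][j], i, j)
        for i in range(n)
        for j in range(i + min_loop + 1, n)
        if probs[i][j] >= threshold
    ]
    candidates.sort(reverse=True)  # most confident pairs first

    paired = [False] * n
    accepted = []
    for _, i, j in candidates:
        if paired[i] or paired[j]:
            continue
        # Skip pairs that would cross an already accepted pair.
        if any(a < i < b < j or i < a < j < b for a, b in accepted):
            continue
        paired[i] = paired[j] = True
        accepted.append((i, j))

    chars = ["."] * n
    for i, j in accepted:
        chars[i], chars[j] = "(", ")"
    return "".join(chars)


# Toy matrix for illustration only (not real model output).
n = 10
toy = [[0.0] * n for _ in range(n)]
for i, j, p in [(0, 9, 0.9), (1, 8, 0.8), (2, 7, 0.7), (3, 6, 0.2)]:
    toy[i][j] = toy[j][i] = p

print(probs_to_dot_bracket(toy))
```

The greedy rule keeps the result a valid, pseudoknot-free dot-bracket string, which is the input format most structure annotation scripts expect.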

Feature Extraction

Annotate each base with a DSSR-style secondary structure label using Python-based scripts:
- Stem (S) → Paired bases in a helix (( )).
- Hairpin loop (H) → Unpaired bases enclosed by a single closing pair.
- Bulge (B) → Unpaired bases interrupting a stem on one side.
- Internal Loop (I) → Unpaired bases interrupting a stem on both sides.
- Multiloop (M) → Unpaired bases with multiple branching stems.
- Unstructured (U) → Unpaired bases not in any specific loop.
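The label definitions above can be sketched as a small per-base annotator. This is an illustrative sketch, assuming the structure is given as a pseudoknot-free dot-bracket string; the function name `annotate_structure` is hypothetical and this is not the script used in the repository.

```python
def annotate_structure(dot_bracket):
    """Return one label per base: S, H, B, I, M or U."""
    n = len(dot_bracket)
    partner = [-1] * n
    stack = []
    for k, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(k)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            i = stack.pop()
            partner[i], partner[k] = k, i
    if stack:
        raise ValueError("unbalanced structure")

    # Paired bases are stems; unpaired bases default to unstructured (U)
    # until an enclosing base pair assigns them to a loop.
    labels = ["S" if partner[k] != -1 else "U" for k in range(n)]

    for i in range(n):
        j = partner[i]
        if j <= i:  # unpaired base or closing base of a pair
            continue
        # Walk the loop closed by (i, j), jumping over enclosed stems.
        unpaired, branches, first_child = [], 0, None
        k = i + 1
        while k < j:
            if partner[k] == -1:
                unpaired.append(k)
                k += 1
            else:
                branches += 1
                if first_child is None:
                    first_child = k
                k = partner[k] + 1
        if branches == 0:
            label = "H"  # hairpin loop
        elif branches >= 2:
            label = "M"  # multiloop
        else:
            left = first_child - i - 1
            right = j - partner[first_child] - 1
            # Unpaired bases on both sides: internal loop; one side: bulge.
            label = "I" if left > 0 and right > 0 else "B"
        for u in unpaired:
            labels[u] = label
    return "".join(labels)


print(annotate_structure("((.((...))))"))    # hairpin with a bulge
print(annotate_structure("((..((...)).))"))  # hairpin with an internal loop
```

Each unpaired base is labeled by the closest base pair that encloses it, so bases outside every pair keep the U label.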

SNP Selection Processing

Inclusion Criteria:
- Cancerous SNPs: 1,001
- Neutral SNPs: 244,044
Exclusion Criteria:

Excluded SNPs not within exons of ncRNA:
- Remaining cancerous SNPs: 79
- Remaining neutral SNPs: 77
Excluded SNPs that failed to be processed by RNA-FM due to long transcript lengths:
- Remaining cancerous SNPs: 39
- Remaining neutral SNPs: 29
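The two exclusion steps above can be sketched as simple filters. This is a minimal sketch under stated assumptions: the `Exon` and `Snp` record types, the identifiers, the coordinates, and the `max_len` value are all hypothetical placeholders, and the actual length limit applied in this workflow is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    transcript_id: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


@dataclass(frozen=True)
class Snp:
    snp_id: str
    chrom: str
    pos: int  # 1-based genomic position


def snps_in_exons(snps, exons):
    """Step 1: keep (snp, transcript_id) pairs where the SNP lies in an exon."""
    hits = []
    for snp in snps:
        for exon in exons:
            if exon.chrom == snp.chrom and exon.start <= snp.pos <= exon.end:
                hits.append((snp, exon.transcript_id))
    return hits


def drop_long_transcripts(hits, transcript_lengths, max_len):
    """Step 2: drop hits whose transcript is too long for the model."""
    return [(s, t) for s, t in hits if transcript_lengths[t] <= max_len]


# Hypothetical toy records, for illustration only.
exons = [Exon("TX_A", "chr1", 100, 200), Exon("TX_B", "chr1", 500, 900)]
snps = [
    Snp("snp_1", "chr1", 150),
    Snp("snp_2", "chr1", 300),
    Snp("snp_3", "chr1", 650),
    Snp("snp_4", "chr2", 150),
]
lengths = {"TX_A": 101, "TX_B": 5000}

hits = snps_in_exons(snps, exons)
kept = drop_long_transcripts(hits, lengths, max_len=1000)  # placeholder limit
print([(s.snp_id, t) for s, t in kept])
```

The nested loop is fine for a sketch; with hundreds of thousands of SNPs, an interval index or a dedicated genomics tool would be a more practical choice.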

Analysis and Interpretation

Compare structural changes between cancerous and neutral SNPs.
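One simple way to make this comparison is to tabulate which structural label each SNP falls on and compare the label distributions of the two groups. The sketch below assumes each SNP has already been mapped to a single label (S, H, B, I, M or U); the function name `label_distribution` and the toy label lists are hypothetical and are not results from this project.

```python
from collections import Counter

LABELS = "SHBIMU"


def label_distribution(snp_labels):
    """Return the fraction of SNPs that fall on each structural label."""
    counts = Counter(snp_labels)
    total = sum(counts.values())
    if total == 0:
        return {lab: 0.0 for lab in LABELS}
    return {lab: counts.get(lab, 0) / total for lab in LABELS}


# Toy inputs for illustration only, not data from the study.
toy_cancerous = ["S", "S", "H", "U"]
toy_neutral = ["U", "U", "S", "H"]

cancer_dist = label_distribution(toy_cancerous)
neutral_dist = label_distribution(toy_neutral)

for lab in LABELS:
    print(f"{lab}: cancerous={cancer_dist[lab]:.2f} neutral={neutral_dist[lab]:.2f}")
```

With groups as small as those remaining after filtering, differences in these fractions should be checked with an appropriate statistical test before drawing conclusions.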

Fig. Cancerous SNPs were found more frequently in structural features
