CRISPR gRNA design for wolf TMPRSS2

CRISPR / Cas9 · Canis lupus baileyi · Accession: XM_072807049.1 · 649 candidates scanned · Status: active

Canine distemper virus (CDV) is a leading killer of reintroduced wolf populations. This analysis identifies CRISPR-Cas9 guide RNA candidates targeting the TMPRSS2 gene in Mexican gray wolves, a potential pathway to CDV resistance that could meaningfully improve reintroduction survival rates.

Why CDV resistance matters for wolf reintroduction

Canine distemper virus (CDV) is a paramyxovirus closely related to measles. In wolves, it causes respiratory, gastrointestinal, and neurological disease with case fatality rates that can reach 50–90% in naive populations. Reintroduced wolf packs are already stressed by translocation, unfamiliar territory, and small founding numbers, so a CDV outbreak can be catastrophic.

In Yellowstone, CDV contributed to significant pup mortality between 1999 and 2005, slowing recovery of the Northern Rocky Mountain population during its most critical early years. The Mexican gray wolf (Canis lupus baileyi) is the most endangered wolf subspecies in the world, with fewer than 200 individuals in the wild, and is the focus of a dedicated recovery program. A single outbreak there could set back decades of recovery work.

The TMPRSS2 connection

CDV, like many enveloped viruses, requires host cell proteases to activate its fusion protein for cell entry. TMPRSS2 (Transmembrane Serine Protease 2) is one of the key proteases CDV exploits. Disrupting or modifying TMPRSS2 expression could reduce cellular susceptibility to CDV. This is the same mechanism studied extensively in SARS-CoV-2 research, where TMPRSS2 inhibition significantly reduced viral entry.

What is CRISPR-Cas9?

A molecular tool that uses a guide RNA (gRNA) to direct the Cas9 enzyme to a precise location in the genome, where it makes a targeted cut. That cut can disrupt a gene, correct a mutation, or insert new sequence.

What is a gRNA?

A 20-nucleotide RNA sequence that acts as a GPS coordinate for Cas9. It matches the target DNA sequence exactly. This analysis identifies which 20-base windows in the TMPRSS2 gene are the best targets for efficient, safe editing.

What does "off-target risk" mean?

Cas9 can sometimes cut at unintended sites in the genome that partially resemble the target. We use NCBI BLAST to search available Canis lupus sequences for near-matches (3 or fewer mismatches), flagging any gRNA with too many close hits as higher risk.

What would this mean in practice?

This is computational prediction only. Experimental validation in cell culture and animal models would be required before any application. The immediate value is as a framework for deciding which targets are worth pursuing in the lab.

Analysis pipeline

The pipeline is fully open-source and reproducible. Each stage is implemented as a standalone Python module using Biopython and pandas, with NCBI remote BLAST for off-target validation. All code is available on GitHub.

Fetch: NCBI GenBank retrieval
Scan: NGG PAM site detection
Score: Doench efficiency model
BLAST: Off-target NCBI search
Report: Final ranked output

Step 1 — Fetch
Downloads wolf TMPRSS2 mRNA from NCBI GenBank (accession XM_072807049.1) via Biopython Entrez. Cached locally as wolf_tmprss2.gb to avoid repeated API calls.
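
The fetch-and-cache step might look roughly like the sketch below. It is not the project's actual module: the function name fetch_genbank and the email argument are assumptions made for illustration, and only the accession, the cache filename, and the use of Biopython Entrez come from the description above.

```python
from pathlib import Path


def fetch_genbank(accession: str, cache_path: str, email: str) -> Path:
    """Return a local GenBank file for `accession`, downloading only if absent."""
    path = Path(cache_path)
    if path.exists():
        return path  # cache hit: no network call is made

    # Imported lazily so a cache hit works even where Biopython is not installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks API users to identify themselves
    handle = Entrez.efetch(
        db="nucleotide", id=accession, rettype="gb", retmode="text"
    )
    try:
        path.write_text(handle.read())
    finally:
        handle.close()
    return path
```

A call such as fetch_genbank("XM_072807049.1", "wolf_tmprss2.gb", email="you@example.org") would hit NCBI once and read from disk afterwards; the email shown is a placeholder.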
Step 2 — Scan
Slides a 20-nt window along both strands, detecting NGG PAM sites. Found 649 candidate gRNAs total (330 forward, 319 reverse). Filtered to 292 overlapping the coding sequence (CDS: 330–1809).
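
As a rough illustration of the scan, the sketch below finds every 20-nt guide that is immediately followed by an NGG PAM on either strand and reports forward-strand coordinates. The function names are hypothetical. The overlap filter assumes the CDS bounds are 0-based and half-open in the Biopython style; under that reading 330 to 1809 spans 1,479 nt, a multiple of three, but this is an assumption rather than something the record has been checked against.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (case-insensitive input)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_guides(seq: str, guide_len: int = 20):
    """List (guide, strand, start, end) for each guide followed by an NGG PAM.

    start and end are 0-based, half-open protospacer coordinates on the
    forward strand, whichever strand the guide was found on.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # The last valid window leaves room for the 3-nt PAM after the guide.
        for i in range(n - guide_len - 2):
            guide = s[i:i + guide_len]
            pam = s[i + guide_len:i + guide_len + 3]
            if pam[1:] != "GG" or set(guide) - set("ACGT"):
                continue  # no NGG PAM, or ambiguous bases in the guide
            if strand == "+":
                start, end = i, i + guide_len
            else:  # map reverse-strand offsets back to forward coordinates
                start, end = n - (i + guide_len), n - i
            hits.append((guide, strand, start, end))
    return hits


def overlaps_cds(start: int, end: int, cds_start: int, cds_end: int) -> bool:
    """True if [start, end) shares at least one base with [cds_start, cds_end)."""
    return start < cds_end and end > cds_start
```

Filtering is then a one-liner, for example [h for h in find_guides(seq) if overlaps_cds(h[2], h[3], 330, 1809)].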
Step 3 — Score
Applies a position-weight model based on Doench 2016 (PMID: 26780180), incorporating GC content, poly-T avoidance, and nucleotide context. Ranked top 20 candidates by composite score.
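
The sketch below shows only the two simplest features named above, GC content and poly-T avoidance, combined into a toy score. It does not reproduce the Doench 2016 model or this pipeline's composite score: the function names, the 50% GC optimum, the run length of four, and the 0.5 penalty are all hypothetical values chosen for illustration.

```python
def gc_fraction(guide: str) -> float:
    """Fraction of G and C bases in the guide."""
    guide = guide.upper()
    return (guide.count("G") + guide.count("C")) / len(guide)


def has_poly_t(guide: str, run: int = 4) -> bool:
    """True if the guide contains `run` or more consecutive T bases."""
    return "T" * run in guide.upper()


def heuristic_score(guide: str) -> float:
    """Toy score in [0, 1]; NOT the Doench model (illustrative weights only)."""
    # GC term: 1.0 at 50% GC, falling linearly to 0.0 at 0% or 100% GC.
    gc_term = 1.0 - abs(gc_fraction(guide) - 0.5) / 0.5
    # Poly-T runs can terminate Pol III transcription, so penalise them.
    penalty = 0.5 if has_poly_t(guide) else 1.0
    return gc_term * penalty
```

For the top-ranked guide AGTCCTGCTGGATTTCCGGG, gc_fraction gives 12 of 20 bases (60%), and has_poly_t is false because its longest T run is three bases.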
Step 4 — BLAST
Remote NCBI blastn against the nt database filtered to Canis lupus. word_size=7, hitlist_size=50. Off-target threshold: ≤3 mismatches. BLASTed top 5 candidates; 4 returned zero off-target hits.
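
A minimal sketch of this step, under stated assumptions, is shown below. The qblast parameters word_size and hitlist_size match the values given above, but the exact entrez_query string is a guess at how the Canis lupus filter is expressed. The hit format (dicts with "subject" and "identities" keys), the function names, and the rule for excluding the on-target transcript are hypothetical.

```python
def blast_guide(guide: str):
    """Submit one guide to remote NCBI blastn; returns a handle to XML results.

    Requires Biopython and network access. The organism filter string is an
    assumption about how the Canis lupus restriction is written.
    """
    from Bio.Blast import NCBIWWW  # lazy import: only needed for remote calls

    return NCBIWWW.qblast(
        "blastn",
        "nt",
        guide,
        entrez_query="Canis lupus[Organism]",
        word_size=7,
        hitlist_size=50,
    )


def count_mismatches(guide_len: int, identities: int) -> int:
    """Guide bases not matched in the hit.

    Bases left outside the local alignment are counted as mismatches,
    which is a conservative choice.
    """
    return guide_len - identities


def count_off_targets(hits, guide_len=20, max_mismatches=3, on_target_ids=()):
    """Count hits within the mismatch threshold, skipping the intended target."""
    total = 0
    for hit in hits:
        if hit["subject"] in on_target_ids:
            continue  # the gene we mean to cut is not an off-target
        if count_mismatches(guide_len, hit["identities"]) <= max_mismatches:
            total += 1
    return total
```

In practice the XML returned by blast_guide would need to be parsed into per-hit identity counts before being passed to count_off_targets.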
Step 5 — Report
Final score = composite score × risk penalty (LOW: 1.0×, MED: 0.9×, HIGH: 0.6×). Rank #1 candidate AGTCCTGCTGGATTTCCGGG scores 0.930 with zero off-target hits.
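
The penalty arithmetic can be sketched as follows. The multipliers come from the formula above; the function names, the dict layout for candidates, and the rounding to three decimals are assumptions for illustration.

```python
RISK_PENALTY = {"LOW": 1.0, "MED": 0.9, "HIGH": 0.6}


def final_score(composite: float, risk: str) -> float:
    """Composite score scaled by the off-target risk penalty."""
    return round(composite * RISK_PENALTY[risk], 3)


def rank_candidates(candidates):
    """Sort candidate dicts (keys: guide, composite, risk) by final score."""
    scored = [
        {**c, "final": final_score(c["composite"], c["risk"])}
        for c in candidates
    ]
    return sorted(scored, key=lambda c: c["final"], reverse=True)
```

With a LOW risk tier the multiplier is 1.0, so the top candidate's final score equals its composite score of 0.930. A HIGH tier would have reduced the same composite to 0.558.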


[Image: wolf pack moving through an alpine wildflower meadow. Alpine meadow · Uinta ecosystem]

Candidate gRNA rankings

Of the 649 candidates scanned, the 20 highest-scoring were ranked, and the top 5 were checked for off-targets with NCBI BLAST against Canis lupus sequences. Four of the five returned zero off-target hits at ≤3 mismatches. The top-ranked candidate, AGTCCTGCTGGATTTCCGGG, has a Doench efficiency score of 0.860 and a final score of 0.930, with no off-target hits detected.

Columns: # · gRNA sequence · Strand · Position · GC% · Doench score · Final score · Off-target risk · Note

Computational predictions only

This analysis is a computational starting point. Every candidate identified here would require experimental validation: first in cell culture to test editing efficiency and specificity in wolf-derived cells, then in appropriate animal models before any consideration of in vivo application. CRISPR-based interventions in wild animal populations also face significant regulatory, ethical, and ecological considerations that extend well beyond this computational work.

The value of this pipeline right now is as a published, citable, open-source framework: demonstrating which computational approaches are feasible for non-model canid species and providing a benchmark dataset for the field.

Citation

Hansen, A. (2025). Wolf TMPRSS2 CRISPR gRNA design pipeline (v1.0). Rewild Genomics LLC. GitHub: rewild-genomics/wolf-crispr-grna. Preprint in preparation.