A researcher's guide to every feature in LOOM CRISPR Search.
New to CRISPR or pathogen genomics? Start with Learn for a visual, step-by-step introduction before diving into this feature reference.
1. What is this tool?
LOOM CRISPR Search is an open-science platform for discovering CRISPR-based diagnostic targets across pathogen genomes. It uses a 195 KB WebAssembly binary (built on a Burrows-Wheeler Transform FM-index) to perform sub-millisecond exact-match searches across millions of genome sequences — entirely in your browser. No data leaves your machine.
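To give a feel for how an FM-index answers exact-match queries, here is a minimal Python sketch of backward search over a Burrows-Wheeler Transform. It is a toy illustration, not the LOOM WebAssembly implementation: the function names are hypothetical, the suffix array is built naively, and the occurrence table is stored uncompressed.

```python
from collections import Counter

def build_fm_index(text: str):
    """Build a toy FM-index (C table and occurrence table) for `text`."""
    text += "$"  # sentinel; sorts before A, C, G, T
    # Naive suffix array: fine for a demo, far too slow for real genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT = character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    # c_table[ch] = number of characters in text strictly smaller than ch.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[ch][i] = occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for key in occ:
            occ[key][i + 1] = occ[key][i] + (1 if key == ch else 0)
    return c_table, occ, len(bwt)

def count_matches(index, pattern: str) -> int:
    """Count exact occurrences of `pattern` using backward search."""
    c_table, occ, n = index
    lo, hi = 0, n  # half-open interval of suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo

index = build_fm_index("GATTACAGATTACA")
print(count_matches(index, "GATTACA"))
```

The loop runs once per query character regardless of genome size, which is why a search over a loaded index can stay fast even for large indexes.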
The database contains 140,000 pre-computed CRISPR targets for 13 human pathogens, extracted from over 3 million genomes downloaded from NCBI.
2. Pathogen Cards & Diagnostic Priority
The landing page shows a card for each pathogen. Each card displays the number of targets and genomes indexed.
Diagnostic Priority Badge
Many cards have a colored badge in the top-right corner:
Dx High — Score ≥ 70. Greatest unmet diagnostic need.
Tip: Hover over any badge to see the exact numeric score and the factors behind it.
Top Diagnostic Opportunities
The three featured cards at the top highlight the pathogens with the highest priority scores. Each includes a brief summary (e.g., "No CRISPR dx published · 21,000–143,000 deaths/yr") so you can immediately see why it ranks high.
3. Targets Table — Columns Explained
After selecting a pathogen, the table shows one row per CRISPR target site (23-mer). Here's what each column means:
Sequence (23-mer)
The 23-nucleotide CRISPR target sequence, shown with color-coded bases (A=green, T=red, G=gold, C=blue). The 23-mer consists of the 20-nt guide plus the 3-nt PAM motif. Badges may appear next to the sequence (see the Cross-Reactivity Badges and Drug-Resistance Overlay sections below).
Gene
The gene the target falls within, or intergenic if it's between genes. Click on a gene tag to jump to the Disease Context tab and highlight that gene in the gene map.
Position
The start position (0-based offset) within the indexed genome.
PAM
The protospacer adjacent motif. NGG = SpCas9-compatible (most widely used). TTTN = Cas12a-compatible. – = no canonical PAM detected (still a valid target site for PAM-less Cas variants).
Coverage
How many of the indexed genomes contain this exact 23-mer sequence, shown as both a count and a percentage. Example: "1,832 67.6%" means that 1,832 of the 2,711 indexed Zika genomes contain this sequence.
Higher coverage = more conserved target = works across more strains = better diagnostic candidate.
Score
Guide RNA quality score (see next section).
Actions
BLAST link (see BLAST section below).
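As a rough illustration of how the PAM and Coverage columns could be derived for a single row, here is a short Python sketch. The function names are hypothetical. It assumes the NGG motif is read from the last 3 nt of the 23-mer and the TTTN motif from the first 4 nt; this guide does not state where LOOM looks for each motif, so treat both positions as assumptions.

```python
def classify_pam(site: str) -> str:
    """Return 'NGG', 'TTTN', or '-' for a 23-nt target site.

    The table displays an en dash for "no canonical PAM"; a plain
    hyphen is used here.
    """
    site = site.upper()
    if len(site) != 23:
        raise ValueError("expected a 23-nt target site")
    # Assumption: the SpCas9 PAM occupies the last 3 nt (3' end).
    if site[-2:] == "GG":
        return "NGG"
    # Assumption: the Cas12a PAM is read from the first 4 nt (5' end).
    if site[:3] == "TTT":
        return "TTTN"
    return "-"

def coverage(hit_genomes: int, total_genomes: int) -> str:
    """Format coverage as a count plus a percentage of indexed genomes."""
    pct = 100.0 * hit_genomes / total_genomes
    return f"{hit_genomes:,} ({pct:.1f}%)"

print(classify_pam("ACGTACGTACGTACGTACGTAGG"))  # NGG
print(coverage(1832, 2711))                     # 1,832 (67.6%)
```

A site that satisfies both checks is reported as NGG in this sketch because that test runs first.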
4. Guide RNA Quality Score
Each guide gets a composite quality score from 0 to 100, displayed as a colored circle:
86 — High quality (≥ 70). Likely to work well in the lab.
52 — Medium quality (40–69). Usable but may need optimization.
28 — Low quality (< 40). Potential issues with efficiency or specificity.
The score is computed from six factors:
GC content (25%) — Optimal range is 40–60%. Too high or too low reduces binding efficiency.
Poly-T terminator (20%) — Runs of 4+ T's can cause premature transcription termination. Penalized.
Seed-region GC (15%) — The 8–12 nt seed region (PAM-proximal) needs moderate GC for specificity.
Homopolymer runs (15%) — Long stretches of any single base reduce efficiency. Penalized.
PAM type (15%) — NGG (SpCas9) gets full marks; TTTN (Cas12a) gets partial; no PAM gets zero.
Self-complementarity (10%) — Sequences that can fold back on themselves are penalized.
Tip: Hover over any score badge to see the individual GC%, seed-region GC%, and max homopolymer run length.
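The weighting above can be sketched as a weighted sum. The weights and the color bands come from this guide; everything else is an assumption. In particular, the guide does not say how each factor is turned into a number, so the sketch takes six sub-scores in the range 0 to 1 as input and only shows how they would combine. The two helper functions compute the raw GC% and maximum homopolymer run shown in the hover tooltip.

```python
# Weights (in points) as listed in the guide; they sum to 100.
WEIGHTS = {
    "gc_content": 25,
    "poly_t": 20,
    "seed_gc": 15,
    "homopolymer": 15,
    "pam_type": 15,
    "self_complementarity": 10,
}

def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def max_homopolymer_run(seq: str) -> int:
    """Length of the longest run of a single repeated base."""
    if not seq:
        return 0
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

def composite_score(sub_scores: dict) -> int:
    """Combine six hypothetical sub-scores (each 0..1) into a 0-100 score."""
    if set(sub_scores) != set(WEIGHTS):
        raise ValueError("expected exactly the six factors in WEIGHTS")
    return round(sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS))

def quality_band(score: int) -> str:
    """Map a score to the bands used for the colored circle."""
    if score >= 70:
        return "high"
    if score >= 40:
        return "medium"
    return "low"

# A guide that is ideal on every factor except that it has no canonical PAM.
example = {name: 1.0 for name in WEIGHTS}
example["pam_type"] = 0.0
print(composite_score(example), quality_band(composite_score(example)))
```

Because PAM type carries 15 points, a guide with no canonical PAM can score at most 85 under this scheme.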
5. Cross-Reactivity Badges
For the top 100 NGG guides per pathogen, we ran exact-match searches against 7 host genomes (human GRCh38, pig, bat, chicken, cow, camel, mouse) to check whether the guide sequence also appears in a host genome. A host match could cause off-target effects in a diagnostic assay.
specific — No hits in any host genome. This guide is pathogen-specific.
3 host hits — Found in 3 host genomes. Hover to see which hosts.
99.7% of the tested guides are specific. Guides without a badge were not among the top 100 NGG guides for that pathogen and have not been tested.
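Conceptually, the host check reduces to an exact-match lookup per host genome. The Python sketch below uses plain substring search on short made-up strings in place of the FM-index, and it checks both strands. Whether LOOM searches the reverse complement is not stated in this guide, so that part is an assumption, as are the function names.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def host_hits(guide: str, host_genomes: dict) -> list:
    """Names of host genomes containing the guide on either strand."""
    guide = guide.upper()
    rc = reverse_complement(guide)
    return [
        name
        for name, genome in host_genomes.items()
        if guide in genome or rc in genome
    ]

def badge(hits: list) -> str:
    """Badge text: 'specific' when no host genome contains the guide."""
    if not hits:
        return "specific"
    return f"{len(hits)} host hit{'s' if len(hits) != 1 else ''}"

# Toy stand-ins for host genomes; real genomes are billions of bases long.
hosts = {
    "human": "CCCGATTACACCC",
    "pig": "GGGTGTAATCGGG",   # contains the reverse complement of GATTACA
    "mouse": "ACACACACAC",
}
hits = host_hits("GATTACA", hosts)
print(hits, badge(hits))
```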
6. BLAST Links
Every row has a BLAST link in the Actions column. Clicking it opens NCBI BLAST (blastn) in a new tab with the guide sequence pre-filled and the database set to nt (all NCBI nucleotide sequences).
You need to click the blue "BLAST" button on the NCBI page to run the search — NCBI does not allow automated submission via URL because each search uses compute resources on their servers.
What BLAST tells you: Which organisms and genomic regions contain this exact (or similar) sequence. Use it to verify specificity beyond our pre-computed host checks, or to explore evolutionary conservation.
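If you want to build similar pre-filled links for your own sequences, a sketch in Python follows. The query parameters shown (PAGE_TYPE, PROGRAM, DATABASE, QUERY) are an assumption about the NCBI BLAST web form, not something this guide specifies, and LOOM may construct its links differently. Check the NCBI BLAST documentation before relying on them.

```python
from urllib.parse import urlencode

BLAST_BASE = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"

def blast_link(sequence: str, database: str = "nt") -> str:
    """Build a URL that opens the blastn form with the query pre-filled.

    Parameter names are assumed; the search still has to be started
    manually on the NCBI page.
    """
    params = {
        "PAGE_TYPE": "BlastSearch",
        "PROGRAM": "blastn",
        "DATABASE": database,
        "QUERY": sequence.upper(),
    }
    return f"{BLAST_BASE}?{urlencode(params)}"

print(blast_link("ACGTACGTACGTACGTACGTAGG"))
```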
7. Drug-Resistance Overlay
Guides that overlap known drug-resistance mutation regions show a ☢ geneSymbol badge next to the sequence. This means the target site sits within a genomic region where drug-resistance mutations are known to occur.
Why it matters: Targeting resistance regions can be a double-edged sword — it could detect resistant strains specifically, but mutations in that region may also cause the guide to lose binding in resistant variants.
Hover over the badge to see the specific drug and mutation region name.
Coverage by pathogen:
HIV-1 — Protease and reverse transcriptase inhibitor resistance regions (58 overlapping guides)
Influenza A — Neuraminidase inhibitor resistance (754 overlapping guides)
Hepatitis B — Polymerase inhibitor resistance (9,298 overlapping guides)
M. tuberculosis — Currently 0 overlaps (coordinate-system mismatch with multi-strain concatenated genome; known limitation)
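The overlay amounts to an interval-overlap test between each 23-nt target site and a list of annotated resistance regions. The Python sketch below shows one way to express it. The Region fields, the half-open coordinate convention, and the example coordinates are all hypothetical placeholders, not real annotations. The test is only meaningful when guides and regions share one coordinate system, which is the limitation noted for M. tuberculosis.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    gene: str    # gene symbol shown on the badge
    drug: str    # drug or drug class shown on hover
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive

def overlapping_regions(position: int, regions: list, site_len: int = 23) -> list:
    """Regions that share at least one base with the site at `position`."""
    site_start, site_end = position, position + site_len
    return [r for r in regions if site_start < r.end and r.start < site_end]

# Placeholder annotation; coordinates are invented for illustration.
regions = [Region(gene="geneA", drug="example inhibitor", start=100, end=400)]

print([r.gene for r in overlapping_regions(90, regions)])   # site 90..112 overlaps
print([r.gene for r in overlapping_regions(77, regions)])   # site 77..99 does not
```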
8. Research Novelty Filter
The dropdown filter lets you view:
All targets — No filtering.
Novel (unstudied genes) — Targets in genes that have NOT been studied in published CRISPR diagnostic papers (based on our PubMed scan). These represent potential new research directions.
Published (studied genes) — Targets in genes already covered by published CRISPR diagnostic work.
Genes are identified using NCBI gene annotations and matched against PubMed-indexed publications via ontology-bridged symbol resolution.
research gap — This gene has no published CRISPR diagnostic studies. Potential novelty.
studied — Published CRISPR diagnostic work exists targeting this gene.
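The three filter options can be pictured as a set-membership test on each target's gene. In the Python sketch below, the set of studied genes stands in for the result of the PubMed scan, and the gene names are placeholders. How LOOM treats intergenic targets under this filter is not stated in this guide; the sketch treats "intergenic" like any other label, which is an assumption.

```python
def filter_targets(targets: list, studied_genes: set, mode: str = "all") -> list:
    """Filter target rows by whether their gene has published CRISPR dx work."""
    if mode == "all":
        return list(targets)
    if mode == "novel":
        return [t for t in targets if t["gene"] not in studied_genes]
    if mode == "published":
        return [t for t in targets if t["gene"] in studied_genes]
    raise ValueError(f"unknown filter mode: {mode!r}")

# Placeholder rows and gene names, for illustration only.
targets = [
    {"seq": "ACGTACGTACGTACGTACGTAGG", "gene": "geneA"},
    {"seq": "TTGCATTGCATTGCATTGCATGG", "gene": "geneB"},
]
studied = {"geneA"}

print([t["gene"] for t in filter_targets(targets, studied, "novel")])
```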
9. Live Search
The Live Search tab lets you search for any arbitrary DNA sequence against a pathogen's FM-index — loaded directly into your browser as a WASM binary.
Select a pathogen from the cards.
Go to the Live Search tab.
Click "Load Index" (downloads the FM-index, typically 5–200 MB).
Type or paste any DNA sequence. Results appear instantly (< 1 ms).
This searches the full indexed genome, not just pre-computed targets. You can search for primers, probes, or any sequence of interest.
Note: Very large indexes (e.g., Human GRCh38 at 1.5 GB) may take longer to download but still search in under 1 ms once loaded.
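If you are preparing sequences to paste into Live Search, it can help to normalize them first. The Python sketch below strips whitespace, uppercases the input, and rejects anything other than A, C, G, T. Whether Live Search itself accepts ambiguity codes such as N is not stated here, so the strict alphabet is an assumption. The find_all helper is a naive substring scan standing in for the FM-index, shown only to illustrate what a list of 0-based hit positions looks like.

```python
def normalize_query(raw: str) -> str:
    """Remove whitespace, uppercase, and validate a pasted DNA sequence."""
    seq = "".join(raw.split()).upper()
    invalid = set(seq) - set("ACGT")
    if not seq or invalid:
        raise ValueError(f"not a plain DNA sequence: {sorted(invalid)}")
    return seq

def find_all(genome: str, query: str) -> list:
    """All 0-based start positions of `query` in `genome`, overlaps included."""
    positions = []
    start = genome.find(query)
    while start != -1:
        positions.append(start)
        start = genome.find(query, start + 1)
    return positions

query = normalize_query("gat tac\na")
print(query, find_all("GATTACAGATTACA", query))
```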
10. Panel Designer
The Panel Designer tab helps you design a minimal multiplex diagnostic panel — a small set of CRISPR guides that can distinguish between multiple pathogens in a single assay.
Select which pathogens to include (or use a preset: Respiratory, Hemorrhagic, STI, All).
Set minimum conservation threshold (default: 70%) and minimum quality score (default: 50).
Click Design Panel.
The algorithm uses a greedy set-cover approach: for each pathogen, it picks the highest-scoring guide that is unique to that pathogen (not found as a top candidate in others). The result is a table showing one distinguishing guide per pathogen.
Click Export Panel CSV to download the panel as a spreadsheet for ordering oligos or sharing with your lab.
If a pathogen shows "no qualifying guides": Relax the conservation or score thresholds. Pathogens with very high genome diversity (e.g., Dengue with 55,000 genomes) may not have any guide that reaches 70% conservation.
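The selection logic described in this section can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the Panel Designer's actual code: the field names are hypothetical, and "top candidate in others" is interpreted as any guide of another selected pathogen that passes the same thresholds.

```python
def design_panel(candidates: dict, min_conservation: float = 70.0, min_score: int = 50) -> dict:
    """Pick one distinguishing guide per pathogen, or None if none qualifies.

    candidates maps pathogen name -> list of dicts with the hypothetical
    keys "seq", "coverage_pct", and "score".
    """
    # Step 1: keep only guides that meet both thresholds.
    qualifying = {
        pathogen: [
            g for g in guides
            if g["coverage_pct"] >= min_conservation and g["score"] >= min_score
        ]
        for pathogen, guides in candidates.items()
    }
    panel = {}
    for pathogen, guides in qualifying.items():
        # Step 2: drop guides that also qualify for another pathogen.
        others = {
            g["seq"]
            for other, other_guides in qualifying.items()
            if other != pathogen
            for g in other_guides
        }
        unique = [g for g in guides if g["seq"] not in others]
        # Step 3: take the highest-scoring remaining guide.
        panel[pathogen] = max(unique, key=lambda g: g["score"]) if unique else None
    return panel

# Invented toy data; sequences are shortened for readability.
candidates = {
    "pathogen A": [
        {"seq": "AAA", "coverage_pct": 90, "score": 80},
        {"seq": "CCC", "coverage_pct": 95, "score": 90},
    ],
    "pathogen B": [
        {"seq": "CCC", "coverage_pct": 80, "score": 70},
        {"seq": "GGG", "coverage_pct": 60, "score": 99},
    ],
}
print(design_panel(candidates))
print(design_panel(candidates, min_conservation=50))
```

With the default thresholds, pathogen B has no qualifying guide in this toy data because its only unique guide falls below 70% conservation. Lowering the threshold to 50 lets that guide through, which mirrors the advice above.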
11. Disease Context Tab
After selecting a pathogen, the Disease Context tab shows biomedical ontology data: