Pathogens

Select a pathogen to browse pre-computed CRISPR targets or load its FM-index for live search

Select a pathogen to view CRISPR targets
# | Sequence (23-mer) | Gene | Position | PAM | Coverage | Score | Actions

No index loaded

Select a pathogen, load its FM-index, then search any DNA sequence for exact matches across all genomes

PubMed Literature Scanner

Search NCBI PubMed for published research on any CRISPR target sequence. Identify which targets are novel (no prior publications) and which have existing literature — helping prioritize targets for new papers.

Enter a sequence or gene name to search published CRISPR literature, or use "Quick scan" to check all targets for a selected pathogen

Disease & Epidemiological Context

Biomedical ontology data linking each pathogen to standardized disease classifications, WHO surveillance context, gene annotations, and diagnostic landscape. All data is bundled — works fully offline.

Select a pathogen to view disease context and ontology data

NCBI RefSeq Viral

Complete reference sequences for all known viral genomes. The primary source for our viral pathogen indexes.

Dataset: RefSeq Viral Complete Genomes
Accessed: March 2026
Sequences: 703,000+
License: Public domain (US Government work)
URL: ncbi.nlm.nih.gov/datasets
Citation: Sayers EW, et al. "Database resources of the National Center for Biotechnology Information." Nucleic Acids Research, 2024, 52(D1):D33-D43.

NCBI RefSeq Bacterial

Reference genomes for bacterial pathogens including M. tuberculosis and V. cholerae.

Dataset: RefSeq Bacterial Genomes (selected pathogens)
Accessed: March 2026
Pathogens: Cholera, Tuberculosis
License: Public domain (US Government work)
URL: ncbi.nlm.nih.gov/datasets
Citation: O'Leary NA, et al. "Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation." Nucleic Acids Research, 2016, 44(D1):D733-D745.

Human Reference Genome

GRCh38.p14 (Genome Reference Consortium Human Build 38) used for off-target analysis — ensuring CRISPR guides don't match human sequences.

Assembly: GRCh38.p14 (GCF_000001405.40)
Accessed: March 2026
Size: ~3.1 Gbp
License: Public domain
URL: GRCh38.p14 at NCBI
Citation: Schneider VA, et al. "Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly." Genome Research, 2017, 27(5):849-864.

LOOM FM-Index Engine

The search engine powering this tool. A Burrows-Wheeler Transform (BWT) based FM-index compiled to 195 KB of WebAssembly, enabling sub-millisecond exact-match search in the browser.

Library: brenda (Rust crate)
Binary: 195 KB WASM
Method: FM-index with suffix array sampling
License: Open source
Method: Ferragina P, Manzini G. "Opportunistic data structures with applications." FOCS 2000. IEEE, 2000, pp. 390-398.
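
To make the FM-index idea concrete, here is a minimal Python sketch of backward search over a Burrows-Wheeler Transform. It is an illustration only, not the brenda crate's code: the function names are hypothetical, and it keeps a full suffix array and uncompressed occurrence tables, whereas the engine described above uses suffix array sampling.

```python
def build_fm_index(text):
    """Build a toy FM-index: suffix array, C table, and occurrence table."""
    text += "$"  # sentinel that sorts before A, C, G, T
    # Naive suffix array construction; fine for short demo strings only.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c] = number of characters in text that are strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, C, occ


def find(pattern, sa, C, occ):
    """Return sorted start positions of exact matches of pattern."""
    lo, hi = 0, len(sa)  # half-open range of matching suffix array rows
    for ch in reversed(pattern):  # backward search: last character first
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


sa, C, occ = build_fm_index("ACGTACGTTTACG")
print(find("ACG", sa, C, occ))
```

The search loop runs once per pattern character regardless of how long the indexed text is, which is the property that makes exact-match lookups fast on large genomes.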

PubMed / NCBI E-utilities

Literature searches use NCBI's public E-utilities API to query PubMed for published CRISPR research related to target sequences.

API: NCBI E-utilities (esearch + esummary)
Rate limit: 3 requests/sec (unauthenticated)
Data: PubMed article metadata
License: Public access (NLM Terms of Service)
URL: E-utilities documentation
Citation: Sayers E. "E-utilities Quick Start." Entrez Programming Utilities Help, NCBI, 2024.
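
As a rough illustration of the esearch + esummary flow, the Python sketch below only builds the request URLs and makes no network calls. The helper names and the shape of the query term are assumptions for illustration; the tool's actual query construction is not documented here.

```python
from urllib.parse import urlencode

# Public E-utilities base URL.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Unauthenticated clients should stay at or below 3 requests per second,
# so callers would wait at least this long between requests.
MIN_SECONDS_BETWEEN_REQUESTS = 1 / 3


def build_query(sequence, organism):
    """Hypothetical query term: exact sequence plus CRISPR plus organism name."""
    return f'"{sequence}" AND CRISPR AND "{organism}"'


def esearch_url(term, retmax=20):
    """URL for an esearch call that returns matching PubMed IDs as JSON."""
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def esummary_url(pmids):
    """URL for an esummary call that returns article metadata for the IDs."""
    params = {"db": "pubmed", "id": ",".join(pmids), "retmode": "json"}
    return f"{EUTILS_BASE}/esummary.fcgi?{urlencode(params)}"


term = build_query("ATCGATCGATCGATCGATCGAGG", "Vibrio cholerae")
print(esearch_url(term))
```

An empty ID list from esearch would correspond to a target with no matching publications under that particular query, which depends heavily on how the term is phrased.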

NCBI Taxonomy

Standardized taxonomic classification for every pathogen — species name, lineage, genome type, and transmission mode. Powers the Disease Context taxonomy chips.

Dataset: NCBI Taxonomy Database
Accessed: March 2026
Pathogens: 12 (all indexed species)
License: Public domain (US Government work)
URL: ncbi.nlm.nih.gov/taxonomy
Citation: Schoch CL, et al. "NCBI Taxonomy: a comprehensive update on curation, resources and tools." Database, 2020, baaa062.

Disease Ontology

Standardized disease definitions, synonyms, and cross-references for each pathogen's primary disease. Provides the "What is it?" descriptions and alternative names in the Disease Context tab.

Dataset: Disease Ontology (DO)
Accessed: March 2026
Terms: 12 disease terms (DOID mapped)
License: CC0 1.0 (Public Domain)
URL: disease-ontology.org
Citation: Schriml LM, et al. "The Human Disease Ontology 2022 update." Nucleic Acids Research, 2022, 50(D1):D1255-D1261.

MONDO Disease Ontology

Cross-ontology disease mappings linking Disease Ontology, OMIM, Orphanet, and other vocabularies. Provides additional cross-references for each pathogen's disease.

Dataset: Monarch Disease Ontology (MONDO)
Accessed: March 2026
Terms: 12 disease terms (MONDO mapped)
License: CC BY 4.0
URL: mondo.monarchinitiative.org
Citation: Vasilevsky NA, et al. "Mondo: Unifying diseases for the world, by the world." medRxiv, 2022.

WHO Disease Surveillance

Epidemiological context from the World Health Organization — case fatality rates, annual case/death estimates, geographic spread, diagnostic landscape, and CRISPR diagnostic status.

Dataset: WHO Disease Outbreak News & fact sheets
Accessed: March 2026
Data: Epi stats for 12 pathogens
License: CC BY-NC-SA 3.0 IGO
URL: who.int/disease-outbreak-news
Citation: World Health Organization. "Disease Outbreak News." WHO, 2024-2026.

NCBI Gene / Datasets V2

Complete gene annotations for each pathogen's reference genome — gene symbols, names, positions, and types. Powers the Gene Map section with 8,283 annotated genes across all 12 pathogens.

API: NCBI Datasets V2 (annotation_report)
Accessed: March 2026
Genes: 8,283 across 12 pathogens
License: Public domain (US Government work)
URL: NCBI Datasets V2 API
Citation: Sayers EW, et al. "Database resources of the National Center for Biotechnology Information." Nucleic Acids Research, 2024, 52(D1):D33-D43.

Methods (Copy for Your Paper)

Copy this methods paragraph into your manuscript's Materials & Methods section:

CRISPR diagnostic target candidates were identified using LOOM CRISPR Search (https://calm-mushroom-0185d800f.4.azurestaticapps.net), a BWT/FM-index based pangenomic scanning tool. For each pathogen, all available genome assemblies were downloaded from NCBI RefSeq (accessed March 2026) and concatenated into a single corpus. A 23-mer sliding window (20 bp guide + 3 bp PAM context) was applied to extract all candidate target sequences. PAM classification identified NGG (SpCas9) and TTTN (Cas12a/Cpf1) compatible sites. Targets were ranked by genome conservation (occurrence count across all assemblies). Guide quality scores were computed based on GC content (optimal 40-70%), seed region GC (last 12 nt), poly-T terminator absence, homopolymer run length, and self-complementarity. Off-target specificity was assessed by searching each candidate against 7 host reference genomes (human GRCh38, pig, bat, chicken, cow, camel, mouse) using exact-match FM-index queries. Literature coverage was assessed by automated PubMed scanning with ontology-enhanced synonym expansion (NCBI Taxonomy, Disease Ontology, MONDO). Drug-resistance region overlap was annotated using coordinates from WHO mutation catalogs and Stanford HIVDB.
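
The window extraction, conservation ranking, and some of the quality checks in the paragraph above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the tool's scoring code: the function names are hypothetical, windows are taken per assembly so none spans a boundary, "poly-T" is assumed to mean a run of four or more T's, and self-complementarity and the final score weighting are left out because the paragraph does not specify them.

```python
from collections import Counter

K = 23  # 20 bp guide + 3 bp PAM context, as in the methods paragraph


def windows(seq, k=K):
    """Yield every k-mer of one assembly, skipping windows with non-ACGT bases."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        w = seq[i:i + k]
        if set(w) <= set("ACGT"):
            yield w


def rank_by_conservation(assemblies, k=K):
    """Count occurrences of each k-mer across assemblies, most frequent first."""
    counts = Counter()
    for seq in assemblies:
        counts.update(windows(seq, k))
    return counts.most_common()


def gc_fraction(s):
    """Fraction of G and C bases in a non-empty sequence."""
    return (s.count("G") + s.count("C")) / len(s)


def longest_homopolymer(s):
    """Length of the longest run of one repeated base in a non-empty sequence."""
    best = run = 1
    for a, b in zip(s, s[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best


def quality_flags(guide):
    """Per-guide checks named in the methods text (no combined score here)."""
    return {
        "gc_in_range": 0.40 <= gc_fraction(guide) <= 0.70,
        "seed_gc": gc_fraction(guide[-12:]),  # seed region: last 12 nt
        "poly_t": "TTTT" in guide,            # assumed threshold: 4+ T's
        "longest_run": longest_homopolymer(guide),
    }


print(quality_flags("ATCGATCGATCGATCGATCG"))
```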

How To Cite This Tool

If you use LOOM CRISPR Search or data from this tool in your research, please cite:

LOOM CRISPR Search: Open-science CRISPR target discovery tool. https://calm-mushroom-0185d800f.4.azurestaticapps.net (2026). Genome data: NCBI RefSeq Viral & Bacterial, GRCh38.p14. Disease context: Disease Ontology (CC0), MONDO (CC BY 4.0), WHO Disease Surveillance, NCBI Taxonomy, NCBI Gene. Search engine: brenda FM-index (195 KB WASM).

CRISPR Glossary

Key terms used throughout this tool

PAM (Protospacer Adjacent Motif)
A short DNA sequence (2–6 bp) immediately adjacent to the target site that the CRISPR-Cas protein must recognize before it can bind and cut. Without the correct PAM, the enzyme ignores the target — even if the guide RNA matches perfectly. Different Cas enzymes require different PAMs.
NGG (SpCas9 PAM)
The PAM sequence required by SpCas9 (from S. pyogenes), the most widely used CRISPR nuclease. "N" = any nucleotide, "GG" = two guanines. The NGG must appear on the 3′ side of the 20-bp target. Example: ...ATCGATCGATCGATCGATCGAGG
TTTN (Cas12a / Cpf1 PAM)
The PAM sequence required by Cas12a (also called Cpf1), often used in DETECTR diagnostic assays. "TTT" = three thymines, "N" = any nucleotide. Unlike SpCas9, this PAM sits on the 5′ side of the target. Example: TTTGATCGATCGATCGATCGATCG...
Guide RNA (gRNA / sgRNA)
An RNA molecule whose ~20 nt spacer directs the Cas enzyme to a specific DNA target via Watson-Crick base pairing. In this tool, each 23-mer target represents a potential guide: 20 bp of targeting sequence + 3 bp PAM.
23-mer
A 23-nucleotide DNA sequence. In CRISPR target design, a 23-mer typically means the 20 bp guide sequence plus a 3 bp PAM (e.g., 20 bp + NGG). This is the standard targeting unit for SpCas9.
Off-target
An unintended genomic site where a CRISPR guide could bind and cut due to partial sequence similarity. Good diagnostic guides should have zero off-targets in the human genome — which is why this tool can cross-check against GRCh38.
FM-index
A compressed full-text index based on the Burrows-Wheeler Transform (BWT). Exact substring lookups take time proportional to the query length rather than the genome size, which is what makes sub-millisecond search across gigabytes of genome data feasible. LOOM compiles its FM-index engine to 195 KB of WASM and searches pathogen genomes entirely in your browser.
SpCas9
Streptococcus pyogenes Cas9, the original and most commonly used CRISPR nuclease. Recognizes NGG PAM. Widely validated in genome editing and therapeutics.
Cas12a (Cpf1)
An alternative CRISPR nuclease that recognizes TTTN PAMs and creates staggered (sticky-end) cuts. Used in the DETECTR diagnostic platform. Offers different targeting range compared to SpCas9.
SHERLOCK / DETECTR
CRISPR-based diagnostic platforms. SHERLOCK (Cas13) and DETECTR (Cas12a) detect specific nucleic acid sequences with high sensitivity, used for rapid pathogen detection (e.g., SARS-CoV-2, Zika, Dengue).
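
The PAM geometry described in the glossary (NGG on the 3′ side for SpCas9, TTTN on the 5′ side for Cas12a) can be illustrated with a short Python sketch. The function name is hypothetical, only the forward strand is checked, and for Cas12a a 23-mer leaves 19 bases after the 4-base PAM, so the returned protospacer is partial in that case.

```python
def classify_pam(kmer):
    """Return (enzyme, protospacer, pam) tuples for a forward-strand 23-mer."""
    kmer = kmer.upper()
    if len(kmer) != 23:
        raise ValueError("expected a 23-mer")
    hits = []
    # SpCas9: 20 bp target followed by NGG on the 3' side.
    if kmer[21:] == "GG":
        hits.append(("SpCas9", kmer[:20], kmer[20:]))
    # Cas12a: TTTN on the 5' side, target follows the PAM.
    if kmer.startswith("TTT"):
        hits.append(("Cas12a", kmer[4:], kmer[:4]))
    return hits


print(classify_pam("ATCGATCGATCGATCGATCGAGG"))
print(classify_pam("TTTGATCGATCGATCGATCGATC"))
```

A 23-mer can satisfy both patterns at once (TTT at the start and GG at the end), which is why the function returns a list rather than a single label.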

Multiplexed Diagnostic Panel Designer

Select pathogens to build a syndromic diagnostic panel. The algorithm finds the minimum set of non-cross-reactive NGG guides that uniquely identifies each pathogen.
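
One simple way to approach this selection is sketched below in Python. It is not the tool's algorithm, which is not described here: the function name, guide identifiers, and input shape are hypothetical, and cross-reactivity is evaluated only among the pathogens selected for the panel. Each pathogen gets the first guide (in sorted order) that hits it and no other selected pathogen.

```python
def design_panel(guide_hits, selected):
    """Pick one panel-specific guide per selected pathogen.

    guide_hits maps a guide ID to the set of pathogens whose genomes contain it.
    Returns (panel, missing): chosen guide per pathogen, and pathogens for
    which no non-cross-reactive guide was found.
    """
    selected_set = set(selected)
    panel = {}
    for guide in sorted(guide_hits):  # sorted for deterministic output
        hits = guide_hits[guide] & selected_set
        if len(hits) == 1:  # guide matches exactly one pathogen in the panel
            (pathogen,) = hits
            panel.setdefault(pathogen, guide)  # keep the first guide found
    missing = [p for p in selected if p not in panel]
    return panel, missing


# Hypothetical hit table for two of the indexed pathogens.
guide_hits = {
    "guide_a": {"Cholera"},
    "guide_b": {"Cholera", "Tuberculosis"},
}
print(design_panel(guide_hits, ["Cholera", "Tuberculosis"]))
```

With one guide per pathogen the panel size equals the number of covered pathogens; a real designer would likely also weigh guide scores and conservation when several specific guides are available.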

Select Pathogens for Panel

Methodology & Verification

How targets are scored, how literature gaps are verified, and what was retracted