Query CADD Scores
CADD (Combined Annotation Dependent Depletion) provides genome-wide predictions of variant deleteriousness. Use CADD scores to prioritize potentially harmful variants in your analysis.
About CADD Scores
CADD integrates multiple genomic annotations to predict the deleteriousness of genetic variants. Higher scores indicate variants more likely to have deleterious effects.
Learn more in the original CADD publication.
CADD Query Syntax
Type the field name
Start with cadd to specify the CADD score field you want to search.
Add a mathematical operator
Use operators like =, >, <, >=, <=, or a colon (:).
- • cadd: (with colon) is equivalent to cadd =
- • This applies to any numerical field, such as pos, phyloP, and phastCons
Specify the score threshold
Add the numerical value you want to filter by. CADD scores typically range from 0 to 40+, with higher scores indicating more deleterious variants.
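The three steps above (field name, operator, value) can be sketched as a small string builder. This is only an illustration: build_numeric_query and ALLOWED_OPERATORS are hypothetical names, not part of Bystro, and the exact spacing around operators is an assumption.

```python
# Hypothetical helper (not a Bystro API): assembles a query string in the
# field / operator / value form described above.
ALLOWED_OPERATORS = {"=", ">", "<", ">=", "<=", ":"}


def build_numeric_query(field: str, operator: str, value: float) -> str:
    """Return a query such as 'cadd >= 20' for a numerical field."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    if operator == ":":
        # Assumption: the colon form attaches directly to the field name.
        return f"{field}:{value:g}"
    return f"{field} {operator} {value:g}"


print(build_numeric_query("cadd", ">=", 20))    # cadd >= 20
print(build_numeric_query("phyloP", ">", 2.5))  # phyloP > 2.5
```

The resulting string is what you would type into the search bar; the helper only guards against operators that the syntax above does not list.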
Common CADD Score Thresholds
Moderate Impact
CADD ≥ 10
Top 10% most deleterious variants. Good starting point for variant prioritization.
High Impact
CADD ≥ 20
Top 1% most deleterious variants. Commonly used threshold for pathogenic variant screening.
Very High Impact
CADD ≥ 30
Top 0.1% most deleterious variants. Used for identifying likely pathogenic mutations.
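The percentages above follow from PHRED-style scaling, in which a score of s corresponds to the top 10^(-s/10) fraction of ranked variants. The sketch below assumes the scores are CADD's PHRED-scaled scores, which the 10 / 20 / 30 thresholds suggest; the function names are illustrative.

```python
import math


def top_fraction(phred_score: float) -> float:
    """Fraction of ranked variants at or above a PHRED-scaled score."""
    return 10 ** (-phred_score / 10)


def phred_from_fraction(fraction: float) -> float:
    """Inverse mapping: the PHRED-scaled score for a given top fraction."""
    return -10 * math.log10(fraction)


for threshold in (10, 20, 30):
    # Expected to show roughly 10%, 1%, and 0.1% respectively.
    print(f"CADD >= {threshold}: top {top_fraction(threshold):.1%}")
```

Because the scale is logarithmic, each additional 10 points narrows the selection by another factor of ten.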
CADD Query Examples
Example 1: High-Impact Variants
Search for variants with CADD scores greater than 20 using cadd > 20:
Finding high-impact variants with CADD scores above 20 (top 1% most deleterious)
Example 2: Inclusive Threshold
Include variants with a score of exactly 20 using cadd >= 20:
Using >= operator to include variants with CADD scores of exactly 20
Example 3: Score Range Query
Search for variants within a specific CADD range using cadd:[15 TO 20]:
Finding variants with moderate CADD scores between 15 and 20
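To make the difference between the three example queries concrete, the sketch below evaluates each query form against a single score locally. It is not how Bystro executes searches; the matches function is hypothetical, and treating both endpoints of [15 TO 20] as inclusive is an assumption.

```python
import operator
import re

# Comparison operators from the syntax section; ":" is treated like "=".
_COMPARATORS = {
    "=": operator.eq,
    ":": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}

_RANGE = re.compile(r"^\s*(\w+)\s*:\s*\[\s*(\S+)\s+TO\s+(\S+)\s*\]\s*$")
_COMPARE = re.compile(r"^\s*(\w+)\s*(>=|<=|=|>|<|:)\s*(\S+)\s*$")


def matches(query: str, score: float) -> bool:
    """Check one score against a query; the field name itself is ignored."""
    range_match = _RANGE.match(query)
    if range_match:
        low = float(range_match.group(2))
        high = float(range_match.group(3))
        # Assumption: both endpoints of the range are inclusive.
        return low <= score <= high
    compare_match = _COMPARE.match(query)
    if compare_match is None:
        raise ValueError(f"unrecognized query: {query!r}")
    compare = _COMPARATORS[compare_match.group(2)]
    return compare(score, float(compare_match.group(3)))


print(matches("cadd > 20", 20))          # False: 20 is not greater than 20
print(matches("cadd >= 20", 20))         # True: the threshold itself is included
print(matches("cadd:[15 TO 20]", 17.5))  # True: inside the range
```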
CADD Scores and Indels
Important Considerations
CADD scores were originally defined for SNPs only. For indels, Bystro reports the CADD scores of all affected reference positions, so each indel carries the deleteriousness predictions of every position it touches.
Single Base Indels
Bystro provides all 3 possible CADD scores for the affected position (one per alternate base), on the assumption that the indel could be as significant as the most deleterious SNP at that site.
Longer Indels
- • Deletions: First 32 covered bases annotated (separated by "|")
- • Insertions: Both flanking reference positions annotated
Indel Query Behavior
When querying cadd > 20, an indel with CADD scores of 0, 10, and 25 will match because one of its scores (25) exceeds the threshold.
This behavior applies to any field containing multiple values separated by "|".
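The matching rule above amounts to an "any value passes" check over the "|"-separated scores. A minimal sketch, with hypothetical function names and the raw field represented as a plain string:

```python
def parse_scores(raw: str) -> list[float]:
    """Split a '|'-separated field such as '0|10|25' into numbers."""
    return [float(part) for part in raw.split("|") if part.strip()]


def any_score_exceeds(raw: str, threshold: float) -> bool:
    """True if at least one of the values is above the threshold."""
    return any(score > threshold for score in parse_scores(raw))


print(any_score_exceeds("0|10|25", 20))  # True: 25 exceeds 20
print(any_score_exceeds("0|10|15", 20))  # False: no value exceeds 20
```

One consequence of this rule is that a long deletion with many annotated positions has more chances to match a threshold than a single SNP does.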
Practical Applications
Clinical Variant Prioritization
- • Use cadd >= 20 for initial pathogenic variant screening
- • Combine with allele frequency filters for rare, high-impact variants
- • Focus on coding regions with high CADD scores
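As a rough illustration of combining a CADD threshold with an allele frequency filter, the sketch below filters exported variant records in plain Python. Everything here is assumed for the example: the record keys (id, cadd, allele_frequency), the 0.01 frequency cutoff, and the made-up values are not taken from Bystro or from any real dataset.

```python
# Made-up records for illustration only; keys and values are hypothetical.
variants = [
    {"id": "var1", "cadd": 25.3, "allele_frequency": 0.0004},
    {"id": "var2", "cadd": 31.0, "allele_frequency": 0.12},
    {"id": "var3", "cadd": 8.7, "allele_frequency": 0.0001},
    {"id": "var4", "cadd": 22.1, "allele_frequency": 0.003},
]


def prioritize(records, min_cadd=20.0, max_af=0.01):
    """Keep rare, high-scoring variants and sort them by descending CADD."""
    selected = [
        record
        for record in records
        if record["cadd"] >= min_cadd and record["allele_frequency"] < max_af
    ]
    return sorted(selected, key=lambda record: record["cadd"], reverse=True)


for record in prioritize(variants):
    print(record["id"], record["cadd"], record["allele_frequency"])
```

With these example records, var2 is dropped for being common and var3 for its low score, leaving var1 and var4 in that order.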
Research Applications
- • Compare CADD distributions between case and control groups
- • Identify variants likely to disrupt protein function
- • Prioritize variants for functional validation studies
CADD Score Best Practices
• Combine with other filters: Use CADD scores alongside allele frequency and functional annotations
• Consider context: High CADD scores in non-coding regions may indicate regulatory disruption
• Validate findings: CADD scores are predictions, so validate high-scoring variants experimentally when possible
• Population differences: Consider population-specific deleteriousness patterns in diverse cohorts
Performance Note
Dataset used in examples: 1000 Genomes Project (73,452,337 variants in 27,192 genes, queries typically complete in ~0.5 seconds)