Result Files Description
When your annotation completes, Bystro generates a comprehensive set of output files containing annotated variants, quality control metrics, ancestry analysis, polygenic risk scores, and machine learning-ready datasets. Here's what each file contains and how to use it.
Download Package
Your results are delivered as a compressed tarball containing multiple files. Extract the archive to access individual components for downstream analysis.
Core Annotation Files
sample_vcf.annotation.tsv.gz
Main annotation output - Tab-separated file with comprehensive variant annotations, one row per variant with extensive genomic information.
Use for: Detailed variant analysis, filtering by annotation criteria, identifying functional variants, generating custom reports
Format: Block gzipped TSV (decompress with bgzip, gzip, or pigz)
Large File Warning
This file can be enormous (billions of variants for thousands of samples). For large cohorts, opening directly in Excel is not recommended. Use the Bystro web interface to filter and subset data, then download smaller filtered results for any spreadsheet analysis.
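For cohorts where UI filtering is not an option, the annotation file can also be streamed line by line so that it never has to fit in memory. The following is a minimal sketch using only the Python standard library. It assumes the file starts with a single header row of column names; the helper name `filter_annotation` and the `chrom` column used in the usage note are illustrative assumptions, not part of Bystro.

```python
import gzip
from typing import Callable


def filter_annotation(src: str, dst: str, column: str,
                      predicate: Callable[[str], bool]) -> int:
    """Copy rows whose `column` value satisfies `predicate` from src to dst.

    Both files are gzip-compressed TSV. Block-gzipped files are valid gzip,
    so the standard gzip module can read them. Returns the number of rows kept.
    """
    kept = 0
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        header = fin.readline()
        # Assumption: the first line is a header row naming every column.
        columns = header.rstrip("\n").split("\t")
        idx = columns.index(column)  # raises ValueError if the column is absent
        fout.write(header)
        for line in fin:
            fields = line.rstrip("\n").split("\t")
            if predicate(fields[idx]):
                fout.write(line)
                kept += 1
    return kept
```

For example, `filter_annotation("sample_vcf.annotation.tsv.gz", "chr1_only.tsv.gz", "chrom", lambda v: v == "chr1")` would write a smaller file containing only the rows whose (assumed) `chrom` column equals `chr1`.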
sample_vcf.dosage.feather
Genotype dosage matrix - Machine learning-ready format with variant dosages (0, 1, 2) for each sample, optimized for polygenic risk score calculations.
Use for: Polygenic risk scores, GWAS, machine learning, statistical genetics analyses
Format: Arrow Feather V2 (supported by Python pandas, polars, R, Julia)
Structure: First column = chr:pos:ref:alt, remaining columns = sample dosages
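The structure described above can be worked with using two small helpers, sketched here in plain Python. The sketch assumes diploid dosages and assumes that missing genotypes appear as `None` or NaN; how Bystro actually encodes missing dosages is not stated here. The names `VariantKey`, `parse_variant_id`, and `alt_allele_frequency` are illustrative.

```python
import math
from typing import Iterable, NamedTuple, Optional


class VariantKey(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant_id(variant_id: str) -> VariantKey:
    """Split a 'chr:pos:ref:alt' identifier into typed fields."""
    chrom, pos, ref, alt = variant_id.split(":", 3)
    return VariantKey(chrom, int(pos), ref, alt)


def alt_allele_frequency(dosages: Iterable[Optional[float]]) -> Optional[float]:
    """Alternate allele frequency for one variant across samples.

    Assumes diploid dosages (0, 1, 2). Values that are None or NaN are
    treated as missing and skipped (an assumption about the encoding).
    Returns None when no sample has a called dosage.
    """
    called = [d for d in dosages if d is not None and not math.isnan(d)]
    if not called:
        return None
    # Each diploid sample contributes two alleles.
    return sum(called) / (2 * len(called))
```

With the matrix loaded into a pandas DataFrame, the identifiers could be parsed with `df.iloc[:, 0].map(parse_variant_id)`, and a single variant's frequency computed by passing that row's sample columns to `alt_allele_frequency`.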
Quality Control & Metadata
Sample Information & Statistics
Configuration & Documentation
Performance & Cache Files
Advanced Analysis Results
Polygenic Risk Scores (PRS)
Risk predictions based on GWAS summary statistics. Each GWAS produces a separate PRS file.
Example files:
• sample_vcf.AD.30617256.prs.tsv (Alzheimer's Disease GWAS)
• sample_vcf.AD.35379992.prs.tsv (Different AD GWAS)
• sample_vcf.IBD.PMIDLIU.prs.tsv (Inflammatory Bowel Disease)
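When several PRS files sit in one directory, it can help to recover the trait and GWAS identifier from each file name. The sketch below assumes the pattern suggested by the three examples above, `<prefix>.<trait>.<study id>.prs.tsv`. That pattern is inferred from the examples rather than from a documented specification, and the names `PrsFileInfo` and `parse_prs_filename` are illustrative.

```python
from pathlib import Path
from typing import NamedTuple, Optional


class PrsFileInfo(NamedTuple):
    prefix: str    # e.g. "sample_vcf"
    trait: str     # e.g. "AD"
    study_id: str  # e.g. "30617256"


def parse_prs_filename(path: str) -> Optional[PrsFileInfo]:
    """Parse '<prefix>.<trait>.<study id>.prs.tsv' (assumed pattern).

    Returns None when the name does not match that shape.
    """
    name = Path(path).name
    suffix = ".prs.tsv"
    if not name.endswith(suffix):
        return None
    stem = name[: -len(suffix)]
    # Split from the right so that a prefix containing dots stays intact.
    parts = stem.rsplit(".", 2)
    if len(parts) != 3:
        return None
    return PrsFileInfo(*parts)
```

For example, `parse_prs_filename("sample_vcf.AD.30617256.prs.tsv")` yields the prefix `sample_vcf`, the trait `AD`, and the study identifier `30617256`.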
Individual Scores
Per-sample risk predictions for each trait
UI Download
Download CSV files directly from web interface
Ancestry Analysis
Genetic ancestry inference using principal component analysis and reference populations.
File: ancestry_results.json
CLI Access
Use Bystro CLI to convert JSON results to CSV format
UI Download
Download CSV directly from the web interface
Variant Reports
Curated reports highlighting variants of clinical or research significance.
Access: Available for download through the Dashboard UI
Working with Your Results
Step 1: Extract the Archive
Your results come as a compressed file (ending in .tar.gz). You'll need to extract it to access the individual files.
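In a scripted pipeline, the same extraction can be done from Python with the standard `tarfile` module. This is a minimal sketch; the helper name `extract_results` and the destination directory are illustrative.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_results(archive_path: str, dest_dir: str) -> List[str]:
    """Extract a .tar.gz results archive into dest_dir.

    Returns the member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters; "data"
            # rejects unsafe members such as absolute paths.
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
    return names
```

Calling `extract_results("sample_vcf_results.tar.gz", "results")` would unpack the archive into a `results` directory and return the list of files it contained.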
💻 Windows
Built-in (Windows 10/11): Right-click the file → "Extract All"
Alternative: Download 7-Zip (free) if built-in extraction doesn't work
🍎 Mac
Built-in: Double-click the file - it will extract automatically
🐧 Linux / Command Line
tar -xzf sample_vcf_results.tar.gz
Step 2: Choose Your Analysis Path
Detailed Variant Analysis
Use .tsv.gz file for comprehensive variant annotation analysis
Machine Learning
Use .feather file for ML workflows and statistical genetics
Step 3: Quality Control Review
Check statistics files to identify samples that may need exclusion from downstream analysis.
Step 4: Access Advanced Results
Download ancestry and PRS results directly from the web interface or convert using CLI tools.
File Format Details
Compression Formats
- ▶Block gzip (.tsv.gz): Decompress with bgzip, gzip, or pigz
- ▶Arrow Feather (.feather): Native binary format, no decompression needed
Data Loading Examples
Python (Pandas)
import pandas as pd
# pandas reads Feather files through pyarrow, so pyarrow must be installed
df = pd.read_feather('sample_vcf.dosage.feather')
R
# read_feather() is provided by the arrow package
library(arrow)
df <- read_feather('sample_vcf.dosage.feather')
Pro Tips
- ▶Filter in the UI first before downloading for Excel exploration - annotation files can be too large for spreadsheets
- ▶Check QC files to identify potential sample quality issues
- ▶Keep the config file for reproducibility and method documentation
- ▶Use CLI tools or Dashboard download for converting ancestry and PRS results to CSV format
Next Steps
After understanding your result files:
- ▶Explore field descriptions to understand annotation columns
- ▶Learn filtering techniques for variant analysis