# Usage

Tracy uses subcommands for basecalling, alignment, deconvolution, variant calling and trace assembly. These subcommands are explained below.

# Basecalling a Chromatogram Trace File

Tracy can basecall a trace file and output the primary sequence (highest peak) in FASTA or FASTQ format.

tracy basecall -f fasta -o out.fasta input.ab1
tracy basecall -f fastq -o out.fastq input.ab1

You can also export the full trace information, including primary and secondary basecalls for heterozygous variants, in JSON format or as a tab-delimited text file.

tracy basecall -f tsv -o out.tsv input.ab1
tracy basecall -f json -o out.json input.ab1
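When a sequencing run yields many trace files, the basecalling command above can be wrapped in a loop. The following is a minimal sketch, not part of Tracy itself: the function name `basecall_dir` and the directory layout are hypothetical, and only the `tracy basecall -f fasta -o` invocation shown above is used.

```shell
# Basecall every .ab1 file in a directory to FASTA (hypothetical helper).
# Usage: basecall_dir <trace_dir> <out_dir>
basecall_dir() {
    trace_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for trace in "$trace_dir"/*.ab1; do
        # Skip the unexpanded pattern when no .ab1 file matches.
        [ -e "$trace" ] || continue
        sample=$(basename "$trace" .ab1)
        # One FASTA file per trace, named after the trace file.
        tracy basecall -f fasta -o "$out_dir/$sample.fasta" "$trace" || return 1
    done
}
```

For example, `basecall_dir traces fasta` would write `fasta/sample1.fasta` for `traces/sample1.ab1`.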

# Trace alignment

Tracy supports a profile-to-sequence dynamic programming alignment of a trace file to a small reference nucleotide sequence in FASTA format.

tracy align -o outprefix -r reference.fa input.ab1

Similarly, tracy can align a trace file to a wildtype chromatogram using profile-to-profile dynamic programming.

tracy align -o outprefix -r wildtype.ab1 input.ab1

Aligning a trace file to a large reference genome requires a pre-built index on a bgzip-compressed genome. For instance, to align a trace file to the human reference genome hg38, you first need to index the reference FASTA file.

tracy index -o hg38.fa.fm9 hg38.fa.gz
samtools faidx hg38.fa.gz

Once the index has been created, you can align a chromatogram trace file to the indexed genome.

tracy align -r hg38.fa.gz input.ab1

The index needs to be built only once per reference genome.
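In a pipeline, the two indexing commands can be guarded so that they run only when the index files are missing. This is a sketch under stated assumptions: the helper name `ensure_index` is hypothetical, and the index file name is derived by mirroring the `hg38.fa.gz` to `hg38.fa.fm9` example above rather than from any documented naming rule.

```shell
# Build the tracy index and the FASTA index only if they do not exist yet.
# Usage: ensure_index <genome.fa.gz>
ensure_index() {
    genome=$1
    # Assumed naming, mirroring the example: hg38.fa.gz -> hg38.fa.fm9
    fm_index="${genome%.gz}.fm9"
    if [ ! -e "$fm_index" ]; then
        tracy index -o "$fm_index" "$genome" || return 1
    fi
    # samtools faidx writes <genome>.fai next to the FASTA file.
    if [ ! -e "$genome.fai" ]; then
        samtools faidx "$genome" || return 1
    fi
}
```

Calling `ensure_index hg38.fa.gz` a second time then returns without rebuilding anything.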

# Genome index

Pre-built genome indices for commonly used reference genomes are available for download here.

# Deconvolution of heterozygous mutations

Double peaks in the chromatogram trace can cause alignment issues. Tracy supports deconvolution of heterozygous variants into two separate alleles. Each allele is then aligned separately to the reference sequence.

tracy decompose -r hg38.fa.gz -o outprefix input.ab1
cat outprefix.align1 outprefix.align2

Tracy can also use a wildtype chromatogram or a small FASTA sequence as the reference for decomposition.

tracy decompose -r wildtype.ab1 -o outprefix mutated.ab1
tracy decompose -r sequence.fa -o outprefix mutated.ab1

# Variant Calling

Tracy can call and annotate single-nucleotide variants (SNVs) and insertions and deletions (InDels) with respect to a reference genome. For instance, to call variants in a chromatogram trace file against the hg38 human reference genome:

tracy decompose -v -a homo_sapiens -r hg38.fa.gz -o outprefix input.ab1

This command also annotates rs identifiers of known polymorphisms and produces a variant call file in binary BCF format. The BCF file can be converted to VCF using bcftools.

bcftools view outprefix.bcf
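The command above prints the VCF to standard output. As a small sketch, the two hypothetical helpers below write the VCF to a file and print a compact tab-delimited table of the calls. They assume the `outprefix.bcf` naming used in this section and rely only on the standard `bcftools view` and `bcftools query` options.

```shell
# Write <prefix>.vcf from <prefix>.bcf.
bcf_to_vcf() {
    prefix=$1
    # -O v selects uncompressed VCF output; -o names the output file.
    bcftools view -O v -o "$prefix.vcf" "$prefix.bcf"
}

# Print chromosome, position, ID (e.g. rs identifier), REF and ALT per variant.
list_variants() {
    prefix=$1
    bcftools query -f '%CHROM\t%POS\t%ID\t%REF\t%ALT\n' "$prefix.bcf"
}
```

For example, `bcf_to_vcf outprefix` creates `outprefix.vcf`, and `list_variants outprefix` prints one line per called variant.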

# Using forward and reverse ab1 files to improve variant calling

If you have forward and reverse trace files for the same expected genomic variant, you can merge the variant files and check the consistency of calls and genotypes.

Forward trace decomposition:

tracy decompose -o forward -a homo_sapiens -r hg38.fa.gz forward.ab1

Reverse trace decomposition:

tracy decompose -o reverse -a homo_sapiens -r hg38.fa.gz reverse.ab1

Left-alignment of InDels:

bcftools norm -O b -o forward.norm.bcf -f hg38.fa.gz forward.bcf
bcftools norm -O b -o reverse.norm.bcf -f hg38.fa.gz reverse.bcf

Merging of normalized variant files:

bcftools merge --force-samples forward.norm.bcf reverse.norm.bcf
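`bcftools merge` generally expects indexed input files, so depending on your bcftools version you may need to index the normalized BCF files before merging. The sketch below assumes that requirement applies; the helper names `merge_calls` and `show_genotypes` and the output file name are hypothetical.

```shell
# Index both normalized BCF files, then merge them into one BCF file.
# Usage: merge_calls <forward.norm.bcf> <reverse.norm.bcf> <merged.bcf>
merge_calls() {
    fwd=$1
    rev=$2
    out=$3
    # -f overwrites an existing index so that stale indices are not reused.
    bcftools index -f "$fwd" || return 1
    bcftools index -f "$rev" || return 1
    bcftools merge --force-samples -O b -o "$out" "$fwd" "$rev"
}

# Print one line per variant with the genotype of each sample side by side.
show_genotypes() {
    bcftools query -f '%CHROM\t%POS\t%REF\t%ALT[\t%GT]\n' "$1"
}
```

After `merge_calls forward.norm.bcf reverse.norm.bcf merged.bcf`, running `show_genotypes merged.bcf` lists the forward and reverse genotypes in adjacent columns, which makes discordant calls easy to spot.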

# Trace assembly

If you have tiled a short genomic region with multiple overlapping Sanger chromatogram trace files, you can use tracy to assemble them.

tracy assemble -r reference.fa file1.ab1 file2.ab1 fileN.ab1

Besides reference-guided assembly with the '-r' option, tracy also supports de novo assembly of chromatogram trace files, provided they overlap sufficiently.

tracy assemble file1.ab1 file2.ab1 fileN.ab1