.. autoDCR documentation master file, created by
   sphinx-quickstart on Mon Apr 21 10:42:35 2025.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

autoDCR
=======

A modified version of Decombinator for flexible TCR annotation
--------------------------------------------------------------

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   installation
   referencedata
   annotate
   tcrtranscripts
   cli
   tcrproteins
   importing

``autoDCR`` (short for **auto**\ matic **D**\ e\ **c**\ ombinato\ **r**\ ) is a Python script that performs T cell receptor (TCR) gene annotation of DNA sequencing reads. It is inspired by, and in part built upon, the core functionality of Decombinator, the TCR analysis software developed by the Chain lab at UCL (in part by me). It shares the same conceptual framework: fast Aho-Corasick tries are used to search DNA reads for the presence of 'tag' sequences, and those tags are then used to identify V and J TCR genes.

The first big difference between the scripts is which tags are looked for. Standard ``Decombinator`` (V1 to V4) aims to find single CDR3-proximal tags (i.e. in the 3' of V genes or 5' of J genes) which identify individual TCR genes - ideally covering all of the alleles of that gene - producing an extremely concise trie which can very rapidly search large datasets. Instead, ``autoDCR`` generates overlapping tags across the *entirety* of **all alleles** of all genes for both TRA and TRB, producing a much larger trie. A single tag may therefore cover multiple genes, but genes are called based on the intersection of the genes/alleles covered by all of the tags that matched a read. This sacrifices the speed of ``Decombinator``, but allows for greater resolution of TCR genes and alleles, while using and retaining sequence information from a greater portion of the input TCR read.

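
As a rough illustration of the tag-intersection idea described above, here is a minimal Python sketch. It is not ``autoDCR``'s actual code: a plain dictionary of fixed-length tags stands in for the Aho-Corasick trie, the allele sequences are made-up toy strings rather than real TCR reference data, and the names ``build_tag_index`` and ``call_alleles``, as well as the tag length, are hypothetical choices made purely for the example.

```python
from collections import defaultdict

# Toy, made-up allele sequences (NOT real TCR reference data).
# The two "TRBV1" alleles differ only at their final base.
TOY_ALLELES = {
    "TRBV1*01": "ACGTACGTTTGACCA",
    "TRBV1*02": "ACGTACGTTTGACCG",
    "TRBV2*01": "TTGGCCAAGGTTCCA",
}


def build_tag_index(alleles, tag_len=8):
    """Map every overlapping tag to the set of alleles that contain it."""
    index = defaultdict(set)
    for name, seq in alleles.items():
        # Slide a window one base at a time across the whole allele.
        for start in range(len(seq) - tag_len + 1):
            index[seq[start:start + tag_len]].add(name)
    return index


def call_alleles(read, index, tag_len=8):
    """Intersect the allele sets of every tag found in the read."""
    candidates = None
    for start in range(len(read) - tag_len + 1):
        hit = index.get(read[start:start + tag_len])
        if hit is None:
            continue  # this window is not a known tag
        candidates = set(hit) if candidates is None else candidates & hit
    # No tags found (None) and an empty intersection both give an empty set.
    return candidates or set()


index = build_tag_index(TOY_ALLELES)

# A read spanning the allele-distinguishing base resolves to one allele.
print(call_alleles("GGACGTACGTTTGACCGAA", index))  # {'TRBV1*02'}

# A shorter read covering only shared sequence stays ambiguous.
print(sorted(call_alleles("ACGTACGTTT", index)))  # ['TRBV1*01', 'TRBV1*02']
```

The sketch shows why tiling tags across whole alleles helps: most tags are shared between the two toy alleles, and only the tag overlapping the differing base narrows the call down to a single allele. A strict intersection like this would return nothing if a single spurious tag from another gene matched, so a real implementation would presumably need to be more tolerant than this toy version.
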
This tag approach also makes the automation of tag generation much simpler, making it much easier either to update ``autoDCR`` when new TCR alleles are discovered, or to apply it to situations such as non-natural repertoires, or to novel species for which TCR information has recently become available.

* `Original Decombinator paper DOI: 10.1093/bioinformatics/btt004 <https://doi.org/10.1093/bioinformatics/btt004>`_
* `V4 update paper DOI: 10.1093/bioinformatics/btaa758 <https://doi.org/10.1093/bioinformatics/btaa758>`_

**Note** that ``autoDCR`` is in active development and should be considered highly experimental. It does not offer as full a suite of functions as ``Decombinator`` does (nor does it seek to). Most notably, it currently lacks the UMI-based error-correction functionality of ``Decombinator``. Instead, it is intended to supply a flexible framework of adaptable TCR annotation-related functions that can be tweaked for dataset-specific applications, including a few which I don't feel are necessarily well catered for by existing tools.