Parsers

Tentacle contains two kinds of parsers: parsers that read input files to produce the data structures required to hold the results (coverage, counts), and parsers for the output files from mappers (e.g. GEM, RazerS3, or blast tabular formats).

If additional mappers are added to the program, suitable parsers may also be required. Have a look at how the supplied parsers are implemented and write something similar for the format you require. Make sure to add the parser to the __init__.py of the tentacle.parsers package.
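As a rough illustration of what such a parser could look like, the sketch below mirrors the exception classes and the (mappings, contig_data, options, logger) signature documented on this page. Everything else is an assumption made for the example: the function name parse_myformat, the three-column tab-separated input format, the idea that mappings is a file path, the per-base coverage update, and the return value. It is written for Python 3, whereas the documented modules derive from the Python 2 exceptions.Exception.

```python
class Error(Exception):
    """Base class for exceptions in this module."""

    def __init__(self, msg):
        super().__init__(msg)
        self.msg = msg


class FileFormatError(Error):
    """Raised when file is not in expected format."""


class ParseError(Error):
    """Raised for parsing errors."""


def parse_myformat(mappings, contig_data, options, logger):
    """Hypothetical parser for lines of 'contig<TAB>start<TAB>stop'.

    Assumptions (not taken from the Tentacle source): `mappings` is a
    path to the mapper output, positions are 1-based and inclusive,
    `options` is unused, and the updated contig_data is returned.
    """
    with open(mappings) as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and comments
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 3:
                raise FileFormatError(
                    f"{mappings}: line {line_number}: expected 3 fields, "
                    f"found {len(fields)}")
            contig, start, stop = fields
            try:
                start, stop = int(start), int(stop)
            except ValueError:
                raise ParseError(
                    f"{mappings}: line {line_number}: non-integer position")
            if contig not in contig_data:
                logger.warning("Unknown contig %s on line %d", contig, line_number)
                continue
            coverage = contig_data[contig]["__coverage__"]
            if not 1 <= start <= stop <= len(coverage):
                raise ParseError(
                    f"{mappings}: line {line_number}: interval out of range")
            # Simplification: add one to every covered base directly.
            for position in range(start - 1, stop):
                coverage[position] += 1
    return contig_data
```

How the new function is exposed from the package __init__.py is not described here; following the pattern used for the supplied parsers is the safest route.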

Parsers module

Mapper output parsers

blast8 tabular format

Tentacle parser for blast tabular output

author:: Fredrik Boulund date:: 2014-04-30

exception tentacle.parsers.blast8.Error(msg)[source]

Bases: exceptions.Exception

Base class for exceptions in this module.

Attributes:
msg: error message

exception tentacle.parsers.blast8.FileFormatError(msg)[source]

Bases: tentacle.parsers.blast8.Error

Raised when file is not in expected format.

exception tentacle.parsers.blast8.ParseError(msg)[source]

Bases: tentacle.parsers.blast8.Error

Raised for parsing errors.

tentacle.parsers.blast8.parse_blast8(mappings, contig_data, options, logger)[source]

Parses mapped data in blast8 format (e.g. for usearch, pblat, blast).
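For orientation, the standard blast tabular layout has twelve tab-separated columns per hit. The sketch below splits one such line into named fields and normalises the subject interval. It is not the code behind parse_blast8; the names Blast8Hit, parse_blast8_line, and subject_interval are hypothetical and chosen only for this example.

```python
from collections import namedtuple

# The twelve standard columns of blast tabular output.
Blast8Hit = namedtuple("Blast8Hit", [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
])


def parse_blast8_line(line):
    """Convert one tab-separated line into a Blast8Hit (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        raise ValueError(f"expected 12 columns, found {len(fields)}")
    return Blast8Hit(
        qseqid=fields[0],
        sseqid=fields[1],
        pident=float(fields[2]),
        length=int(fields[3]),
        mismatch=int(fields[4]),
        gapopen=int(fields[5]),
        qstart=int(fields[6]),
        qend=int(fields[7]),
        sstart=int(fields[8]),
        send=int(fields[9]),
        evalue=float(fields[10]),
        bitscore=float(fields[11]),
    )


def subject_interval(hit):
    """Return (start, stop) on the subject with start <= stop.

    Hits on the reverse strand are reported with sstart > send.
    """
    return min(hit.sstart, hit.send), max(hit.sstart, hit.send)
```

The normalised interval is what a coverage update would typically need, since the reported coordinates are swapped for reverse-strand hits.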

GEM format

Tentacle parsers

author:: Fredrik Boulund date:: 2014-04-30

exception tentacle.parsers.gem.Error(msg)[source]

Bases: exceptions.Exception

Base class for exceptions in this module.

Attributes:
msg: error message

exception tentacle.parsers.gem.FileFormatError(msg)[source]

Bases: tentacle.parsers.gem.Error

Raised when file is not in expected format.

exception tentacle.parsers.gem.ParseError(msg)[source]

Bases: tentacle.parsers.gem.Error

Raised for parsing errors.

tentacle.parsers.gem.parse_gem(mappings, contig_data, options, logger)[source]

Parses GEM alignment format.

RazerS3 format

Tentacle parsers

author:: Fredrik Boulund date:: 2014-04-30

exception tentacle.parsers.razers3.Error(msg)[source]

Bases: exceptions.Exception

Base class for exceptions in this module.

Attributes:
msg: error message

exception tentacle.parsers.razers3.FileFormatError(msg)[source]

Bases: tentacle.parsers.razers3.Error

Raised when file is not in expected format.

exception tentacle.parsers.razers3.ParseError(msg)[source]

Bases: tentacle.parsers.razers3.Error

Raised for parsing errors.

tentacle.parsers.razers3.parse_razers3(mappings, contig_data, options, logger)[source]

Parses razers3 output.

SAM format

Tentacle parsers

author:: Fredrik Boulund date:: 2014-04-30

exception tentacle.parsers.sam.Error(msg)[source]

Bases: exceptions.Exception

Base class for exceptions in this module.

Attributes:
msg: error message

exception tentacle.parsers.sam.FileFormatError(msg)[source]

Bases: tentacle.parsers.sam.Error

Raised when file is not in expected format.

exception tentacle.parsers.sam.ParseError(msg)[source]

Bases: tentacle.parsers.sam.Error

Raised for parsing errors.

tentacle.parsers.sam.parse_sam(mappings, contig_data, options, logger)[source]

Parses standard SAM alignments. Useful for bowtie2 and other aligners that output SAM format.
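To show what a coverage computation needs from a SAM record, the sketch below extracts the reference name and the 1-based inclusive interval an alignment spans, using the FLAG, RNAME, POS, and CIGAR columns defined by the SAM specification. It does not reproduce parse_sam; the function name reference_interval is hypothetical, and skipping header lines and unmapped reads is a choice made for this example.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume positions on the reference sequence.
REFERENCE_CONSUMING = set("MDN=X")


def reference_interval(sam_line):
    """Return (rname, start, stop) for a mapped read, else None.

    Coordinates are 1-based and inclusive, following the POS column.
    Header lines (starting with '@') and unmapped reads yield None.
    """
    if sam_line.startswith("@"):
        return None
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError(f"expected at least 11 columns, found {len(fields)}")
    flag = int(fields[1])
    rname = fields[2]
    pos = int(fields[3])
    cigar = fields[5]
    if flag & 0x4 or cigar == "*":
        return None  # read is unmapped or has no alignment description
    span = sum(int(length) for length, op in CIGAR_OP.findall(cigar)
               if op in REFERENCE_CONSUMING)
    return rname, pos, pos + span - 1
```

Soft clips (S) and insertions (I) do not advance the reference position, which is why only M, D, N, = and X contribute to the span.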

Parser to create coverage data structure

Tentacle initialize contig data structure

author:: Fredrik Boulund date:: 2014-05-06 purpose:: Creates an empty data structure for contig coverage and annotated region count information.

exception tentacle.parsers.initialize_contig_data.Error(msg)[source]

Bases: exceptions.Exception

Base class for exceptions in this module.

Attributes:
msg: error message

exception tentacle.parsers.initialize_contig_data.FileFormatError(msg)[source]

Bases: tentacle.parsers.initialize_contig_data.Error

Raised when file is not in expected format.

exception tentacle.parsers.initialize_contig_data.ParseError(msg)[source]

Bases: tentacle.parsers.initialize_contig_data.Error

Raised for parsing errors.

tentacle.parsers.initialize_contig_data.initialize_annotation_counts(contig_data, annotations_filename, options, logger)[source]

Initializes the second level of keys in the contig_data structure (annotations).

tentacle.parsers.initialize_contig_data.initialize_contig_data(files, options, logger)[source]

Reads annotation and reference (FASTA) files to create an empty data structure.

Data structure is a nested dictionary:

    contig_data                    MAIN DICTIONARY
      ["CONTIG0101"]               CONTIG KEY
        ["TIGRFAM0101"]            ANNOTATION KEY
          [int, int, int, "+"]     LIST WITH: [ANNOTATION COUNT, ANNOTATION START, ANNOTATION STOP, STRAND]
        ["__coverage__"]           COVERAGE SPECIAL KEY (if a reference sequence has this name, an error ensues!)
          np.array                 COVERAGE DATA STRUCTURE (NumPy array)

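A small hand-built instance may make the nesting easier to picture. The sketch assumes that the coverage key sits beside the annotation keys under each contig, and the contig length, annotation coordinates, and integer dtype are invented for the example.

```python
import numpy as np

contig_length = 10  # made-up length for the example

contig_data = {
    "CONTIG0101": {
        # [annotation count, annotation start, annotation stop, strand]
        "TIGRFAM0101": [0, 3, 7, "+"],
        # One element more than the contig length (dtype is an assumption).
        "__coverage__": np.zeros(contig_length + 1, dtype=np.int32),
    }
}

# Counting a read against an annotation increments the first list element.
contig_data["CONTIG0101"]["TIGRFAM0101"][0] += 1

# Iterating over annotations means skipping the special coverage key.
annotations = {
    name: value
    for name, value in contig_data["CONTIG0101"].items()
    if name != "__coverage__"
}
```

Because annotations and coverage share one dictionary level, any code that loops over annotations has to filter out the special key, as the last statement does.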
tentacle.parsers.initialize_contig_data.initialize_contig_keys(contig_data, contigs_file, options, logger)[source]

Creates the first level of keys in the contig_data dictionary from a FASTA file.

Keys in the dictionary are based on the first space-separated string in each FASTA header.

Each sequence is represented by a continuous array of length equal to contig length + 1 for later use in coverage computations.
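The key extraction and the extra array element can be sketched as follows. The helper name contig_lengths is hypothetical, plain Python lists stand in for the NumPy arrays to keep the example free of dependencies, and the difference-array scheme shown is only one plausible reason for the "+ 1" (an assumption, not something this page confirms).

```python
from itertools import accumulate


def contig_lengths(fasta_lines):
    """Map the first whitespace-separated header token to sequence length."""
    lengths = {}
    key = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("FASTA header without a sequence name")
            key = tokens[0]
            lengths[key] = 0
        elif key is not None:
            lengths[key] += len(line)
    return lengths


fasta = [">CONTIG0101 length=8 sample=A", "ACGT", "ACGT", ">CONTIG0102", "GG"]
lengths = contig_lengths(fasta)

# Assumed scheme: mark +1 where a read starts and -1 just past where it
# stops, then take a running sum. The slot past the last base is what
# makes the array one element longer than the contig.
diff = [0] * (lengths["CONTIG0101"] + 1)
for start, stop in [(1, 4), (3, 8)]:  # 1-based, inclusive read intervals
    diff[start - 1] += 1
    diff[stop] -= 1
coverage = list(accumulate(diff))[:-1]
```

With this scheme each mapped read costs two array updates regardless of its length, and the per-base coverage is recovered in a single pass at the end.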
