
est2genome

est2genome aligns a set of spliced nucleotide sequences (ESTs, cDNAs, or mRNAs) to an unspliced genomic DNA sequence, inserting introns of arbitrary length where needed.

Here is a sample session with est2genome:

% est2genome
Align EST and genomic DNA sequences
EST sequence(s): embl:hs989235
Genomic sequence: embl:hsnfg9
Output file [hs989235.est2genome]:
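The three prompts in this session correspond to the mandatory qualifiers listed in this section (-est, -genome, -outfile). As a minimal sketch, the following Python snippet assembles an argument list from those qualifier names; it only builds the list and does not run the program. The helper name build_est2genome_args is hypothetical, the -minscore value is purely illustrative, and boolean qualifiers such as -[no]splice are deliberately not handled.

```python
def build_est2genome_args(est, genome, outfile, **optional):
    """Build an argument list for est2genome from qualifier values.

    Hypothetical helper: qualifier names come from this section;
    boolean qualifiers (e.g. -[no]splice) are not handled here.
    """
    args = ["est2genome", "-est", est, "-genome", genome, "-outfile", outfile]
    # Append each optional qualifier as "-name value".
    for name, value in optional.items():
        args += ["-" + name, str(value)]
    return args


# Same inputs as the interactive session; minscore=30 is an illustrative value.
args = build_est2genome_args(
    "embl:hs989235", "embl:hsnfg9", "hs989235.est2genome", minscore=30
)
print(" ".join(args))
```

The list form can be handed to a process launcher without further quoting, which is why the sketch returns a list rather than a single string.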

Mandatory qualifiers:

[-est] (seqall)

EST sequence(s).

[-genome] (sequence)

Genomic sequence.

[-outfile] (outfile)

Output filename.

Optional qualifiers:

-match (integer)

Score for matching two bases.

-mismatch (integer)

Cost for mismatching two bases.

-gappenalty (integer)

Cost for deleting a single base in either sequence, excluding introns.

-intronpenalty (integer)

Cost for an intron, independent of length.

-splicepenalty (integer)

Cost for an intron that starts and ends on donor-acceptor sites, independent of its length.

-minscore (integer)

Exclude alignments with scores below this threshold score.
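To see how the scoring qualifiers above could interact, here is a rough Python sketch that adds up a score from counts of matches, mismatches, gaps, and introns. It assumes a simple additive model; the program itself searches for the best alignment and may combine these terms differently. The names Scoring, alignment_score, and is_reported are hypothetical, and the numeric values are illustrative rather than the program's documented defaults.

```python
from dataclasses import dataclass


@dataclass
class Scoring:
    # Illustrative values only; not necessarily the program's defaults.
    match: int = 1
    mismatch: int = 1
    gappenalty: int = 2
    intronpenalty: int = 40
    splicepenalty: int = 20
    minscore: int = 30


def alignment_score(s, matches, mismatches, gaps, plain_introns, spliced_introns):
    """Additive score: matches add to the score, everything else costs."""
    return (
        s.match * matches
        - s.mismatch * mismatches
        - s.gappenalty * gaps
        - s.intronpenalty * plain_introns      # introns not on donor-acceptor sites
        - s.splicepenalty * spliced_introns    # introns on donor-acceptor sites
    )


def is_reported(s, score):
    """Alignments scoring below minscore are excluded."""
    return score >= s.minscore


scoring = Scoring()
score = alignment_score(scoring, matches=100, mismatches=2, gaps=1,
                        plain_introns=0, spliced_introns=1)
print(score, is_reported(scoring, score))
```

With these illustrative values, an intron on donor-acceptor sites costs less than one elsewhere, so the aligner would tend to place introns at recognizable splice sites.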

Advanced qualifiers:

-reverse (boolean)

Reverse the orientation of the EST sequence.

-[no]splice (boolean)

Use donor and acceptor splice sites. If you want to ignore donor-acceptor sites, set this to false.

-mode (string)

This determines the comparison mode. The default value is both: both strands of the EST are compared assuming a forward gene direction (i.e., GT/AG splice sites), and the best comparison is then redone assuming a reversed (CT/AC) gene splicing direction. The other allowed modes are forward (only the forward strand is searched) and reverse (only the reverse strand is searched).
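The GT/AG and CT/AC pairs mentioned here are reverse complements of each other: an intron that reads GT...AG on one strand reads CT...AC on the other. The Python sketch below classifies an intron sequence by its first and last two bases. The function name splice_direction is hypothetical and the check is a simplification for illustration, not the program's own logic.

```python
def splice_direction(intron):
    """Classify an intron by its boundary dinucleotides.

    Returns "forward" for GT...AG, "reverse" for CT...AC,
    and "unknown" for anything else.
    """
    seq = intron.upper()
    if len(seq) < 4:
        return "unknown"
    if seq.startswith("GT") and seq.endswith("AG"):
        return "forward"
    if seq.startswith("CT") and seq.endswith("AC"):
        return "reverse"
    return "unknown"


for example in ("GTAAGTCCTTTCAG", "CTGAAAGGACTTAC", "AATTCCGG"):
    print(example, splice_direction(example))
```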

-[no]best (boolean)

Print only the best comparison. To print all comparisons, not just the best one, set this to false.

-space (float)

For linear-space recursion. If the product of the sequence lengths divided by 4 exceeds this value, a divide-and-conquer strategy is used to control the memory requirements. Very long sequences can be aligned in this manner. If you have a machine with plenty of memory, you may raise this parameter (but do not exceed the machine's physical RAM).
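The threshold test described here is simple arithmetic, sketched below in Python. The function name uses_divide_and_conquer is hypothetical, and the sketch assumes the -space value is compared directly with the quotient; whether the program first scales the value (for example, from megabytes) is not stated in this section.

```python
def uses_divide_and_conquer(est_len, genome_len, space):
    """Return True when (product of lengths) / 4 exceeds the space threshold.

    Assumption: `space` is in the same units as the quotient.
    """
    return (est_len * genome_len) / 4 > space


# A 500-base EST against 20,000 and 200,000 bases of genomic sequence,
# with an illustrative threshold of 10 million.
print(uses_divide_and_conquer(500, 20_000, 10_000_000))
print(uses_divide_and_conquer(500, 200_000, 10_000_000))
```

Raising the threshold keeps more alignments on the direct path at the cost of memory, which is why the text advises staying below physical RAM.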

-shuffle (integer)

Shuffle.

-seed (integer)

Random number seed.

-align (boolean)

Show the alignment. The alignment includes the first and last 5 bases of each intron, together with the intron width. The direction of splicing is indicated by angle brackets (forward or reverse) or ???? (unknown).

-width (integer)

Alignment width.
