est2genome will align a set of spliced nucleotide sequences (ESTs,
cDNAs, or mRNAs) to an unspliced genomic DNA sequence and insert
introns of arbitrary length when needed.
Here is a sample session with est2genome:
% est2genome
Align EST and genomic DNA sequences
EST sequence(s): embl:hs989235
Genomic sequence: embl:hsnfg9
Output file [hs989235.est2genome]:
Mandatory qualifiers:
- [-est] (seqall)
  EST sequence(s).
- [-genome] (sequence)
  Genomic sequence.
- [-outfile] (outfile)
  Output filename.
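
The three mandatory qualifiers above could also be supplied up front rather than at the prompts. The following is a minimal sketch that only assembles such a command line. It assumes the qualifiers may be passed as name/value pairs, which the sample session does not show, and the helper name build_est2genome_command is hypothetical. It does not run the program.

```python
import shlex


def build_est2genome_command(est, genome, outfile):
    """Return an argv list that supplies the three mandatory qualifiers.

    Assumption: each qualifier is passed as "-name value". The helper
    itself is illustrative and not part of est2genome.
    """
    return [
        "est2genome",
        "-est", est,          # EST sequence(s)
        "-genome", genome,    # genomic sequence
        "-outfile", outfile,  # output filename
    ]


# Same inputs as the sample session.
argv = build_est2genome_command(
    "embl:hs989235", "embl:hsnfg9", "hs989235.est2genome"
)

# Render the list as a single shell-safe string for inspection.
print(shlex.join(argv))
```

Building the command as a list and quoting it only for display keeps sequence names and filenames intact if the list is later handed to a process launcher.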
Optional qualifiers:
- -match (integer)
  Score for matching two bases.
- -mismatch (integer)
  Cost for mismatching two bases.
- -gappenalty (integer)
  Cost for deleting a single base in either sequence, excluding introns.
- -intronpenalty (integer)
  Cost for an intron, independent of its length.
- -splicepenalty (integer)
  Cost for an intron that starts and ends on donor-acceptor splice
  sites, independent of its length.
- -minscore (integer)
  Exclude alignments with scores below this threshold score.
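
To make the interplay of these qualifiers concrete, here is a minimal sketch of how the scores and costs might combine. It assumes the total is simply matches times the match score minus each cost times its count, which the text above implies but does not spell out. The numeric values are illustrative and are not claimed to be the program's defaults. The names Scoring, alignment_score, and passes_minscore are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scoring:
    # Illustrative values only; not claimed to be est2genome's defaults.
    match: int = 1
    mismatch: int = 1
    gappenalty: int = 2
    intronpenalty: int = 40
    splicepenalty: int = 20
    minscore: int = 30


def alignment_score(s, matches, mismatches, gap_bases,
                    plain_introns, spliced_introns):
    """Assumed additive score: reward matches, subtract every cost.

    plain_introns   -- introns not on donor-acceptor sites (-intronpenalty)
    spliced_introns -- introns on donor-acceptor sites (-splicepenalty)
    """
    return (matches * s.match
            - mismatches * s.mismatch
            - gap_bases * s.gappenalty
            - plain_introns * s.intronpenalty
            - spliced_introns * s.splicepenalty)


def passes_minscore(s, score):
    """Alignments scoring below the threshold are excluded."""
    return score >= s.minscore


scoring = Scoring()
# 200 matches, 3 mismatches, 1 deleted base, two introns on splice sites.
score = alignment_score(scoring, 200, 3, 1, 0, 2)
print(score, passes_minscore(scoring, score))
```

With these illustrative values, an intron on donor-acceptor sites costs less than one that is not, so the alignment is steered toward recognised splice sites.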
Advanced qualifiers:
- -reverse (boolean)
  Reverse the orientation of the EST sequence.
- -[no]splice (boolean)
  Use donor and acceptor splice sites. If you want to ignore
  donor-acceptor sites, set this to false.
- -mode (string)
  This determines the comparison mode. The default value is
  "both". In this case, both strands of the EST are
  compared assuming a forward gene direction (i.e. GT/AG splice sites),
  and the best comparison is then redone assuming a reversed (CT/AC)
  gene splicing direction. The other allowed modes are
  "forward" (when just the forward strand is
  searched), and "reverse" (when the reverse strand is
  searched).
- -[no]best (boolean)
  You can print out all comparisons (not just the best one) by setting
  this to false.
- -space (float)
  For linear-space recursion. If the product of the sequence lengths
  divided by 4 exceeds this value, a divide-and-conquer strategy is
  used to control the memory requirements. Very long sequences can be
  aligned in this manner. If you have a machine with plenty of memory,
  you may raise this parameter (but do not exceed the machine's
  physical RAM).
- -shuffle (integer)
  Shuffle.
- -seed (integer)
  Random number seed.
- -align (boolean)
  Show the alignment. The alignment includes the first and last 5 bases
  of each intron, together with the intron width. The direction of
  splicing is indicated by angle brackets (forward or reverse) or ????
  (unknown).
- -width (integer)
  Alignment width.
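
The splice-site conventions mentioned under -mode and -align can be illustrated with a short sketch. It classifies an intron by its first and last two bases: GT...AG as forward, CT...AC (the reverse complement of GT...AG) as reverse, anything else as unknown. It also collects the first and last 5 bases plus the intron width. The function names, the returned dictionary, and the mapping of ">" to forward and "<" to reverse are assumptions for illustration, not the program's actual output format.

```python
def splice_direction(intron):
    """Classify an intron by its terminal dinucleotides.

    Returns ">" for GT...AG (forward), "<" for CT...AC (reverse),
    and "?" when neither pattern applies. Which bracket denotes which
    direction is an assumption made for this sketch.
    """
    seq = intron.upper()
    if len(seq) < 4:
        return "?"
    donor, acceptor = seq[:2], seq[-2:]
    if donor == "GT" and acceptor == "AG":
        return ">"
    if donor == "CT" and acceptor == "AC":
        return "<"
    return "?"


def intron_summary(intron):
    """Collect the pieces the alignment display is described as showing."""
    return {
        "first5": intron[:5],
        "last5": intron[-5:],
        "width": len(intron),
        "direction": splice_direction(intron),
    }


summary = intron_summary("GTAAGTCCCCTTTTCAG")
print(summary)
```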