trimseq
is used to tidy up the ends of sequences,
removing all the bits that you would really rather were not
published.
Tidy up the sequence ends, stopping at the first wanted code:
% trimseq xyz.seq xyz_clean.seq -window 1 -percent 100
Tidy up the sequence ends, removing poor bits at the ends:
% trimseq xyz.seq xyz_clean.seq -window 5 -percent 40
Tidy up the sequence ends, removing very poor bits at the ends:
% trimseq xyz.seq xyz_clean.seq -window 20 -percent 80
Tidy up the sequence ends, removing even marginally poor bits at the ends:
% trimseq xyz.seq xyz_clean.seq -window 20 -percent 10
Tidy up the sequence ends, removing poor bits including ambiguity codes:
% trimseq xyz.seq xyz_clean.seq -window 20 -percent 50 -strict
Tidy up the sequence ends, removing asterisks from a protein end:
% trimseq xyz.seq xyz_clean.seq -window 1 -percent 100 -star
Tidy up the sequence ends, removing poor bits at only the left end:
% trimseq xyz.seq xyz_clean.seq -window 20 -percent 50 -noright
Mandatory qualifiers:
- [-sequence] (seqall)
  Sequence database USA (Uniform Sequence Address).
- [-outseq] (seqoutall)
  Output sequence(s) USA (Uniform Sequence Address).
Optional qualifiers:
- -window (integer)
  The size of the region that is considered when deciding whether the
  percentage of ambiguity reaches the threshold. A value of 5 means
  that a region of 5 letters is shifted along the sequence from the
  ends, and trimming is done only if the percentage of ambiguity in
  that region is greater than or equal to the threshold percentage.
- -percent (float)
  The threshold percentage of ambiguity in the window that is required
  in order to trim a sequence.
- -strict (boolean)
  In nucleic sequences, trim off not only Ns and Xs, but also the
  nucleotide IUPAC ambiguity codes M, R, W, S, Y, K, V, H, D and B. In
  protein sequences, trim off not only Xs but also B and Z.
- -star (boolean)
  In protein sequences, trim off not only Xs, but the asterisks as well.
Advanced qualifiers:
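- -[no]left (boolean)
  Trim at the start of the sequence. Use -noleft to leave the start
  untouched.
- -[no]right (boolean)
  Trim at the end of the sequence. Use -noright to leave the end
  untouched.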
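The way the qualifiers above interact can be illustrated with a small Python sketch. This is not the trimseq source code. The function names (ambiguity_codes, trim_point, trim) are hypothetical, gap characters are ignored, and the choice to cut just after the last ambiguous letter of a qualifying window is an assumption made for illustration.

```python
def ambiguity_codes(is_nucleotide, strict=False, star=False):
    """Letters counted as ambiguous, following the -strict and -star text."""
    codes = set("NX") if is_nucleotide else set("X")
    if strict:
        codes |= set("MRWSYKVHDB") if is_nucleotide else set("BZ")
    if star and not is_nucleotide:
        codes.add("*")
    return codes


def trim_point(seq, codes, window, percent):
    """Number of leading letters to remove (assumed rule, see lead-in)."""
    window = min(window, len(seq))
    if window == 0:
        return 0
    cut = 0
    # Shift the window along the sequence one letter at a time.
    for start in range(len(seq) - window + 1):
        region = seq[start:start + window]
        bad = [i for i, ch in enumerate(region) if ch.upper() in codes]
        # Stop at the first window whose ambiguity is below the threshold.
        if not bad or 100.0 * len(bad) / window < percent:
            break
        # Assumption: cut just after the last ambiguous letter in the window.
        cut = start + bad[-1] + 1
    return cut


def trim(seq, is_nucleotide=True, window=1, percent=100.0,
         strict=False, star=False, left=True, right=True):
    """Trim both ends unless left or right is switched off."""
    codes = ambiguity_codes(is_nucleotide, strict, star)
    if right:
        # Reuse the left-end logic on the reversed sequence.
        cut = trim_point(seq[::-1], codes, window, percent)
        seq = seq[:len(seq) - cut]
    if left:
        seq = seq[trim_point(seq, codes, window, percent):]
    return seq


if __name__ == "__main__":
    print(trim("NNNNACGTNN", window=1, percent=100))    # ACGT
    print(trim("NANACGTACGT", window=5, percent=40))    # ACGTACGT
    print(trim("NNACGTNN", window=1, percent=100, right=False))  # ACGTNN
```

With a window of 5 and a threshold of 40 percent, a window qualifies for trimming as soon as 2 of its 5 letters are ambiguous, which is why the second call removes the leading "NAN".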