oddcomp searches a set of protein sequences and reports the
identifier of each sequence that exceeds a specified amino acid
composition threshold within some portion (window) of the sequence.
The following example uses oddcomp to search SWISS-PROT for
entries containing at least 1 SR and at least 2 RS:
% oddcomp
Finds protein sequence regions with a biased composition
Input sequence(s): sw:*
Output file [5h1d_fugru.oddcomp]: out.odd
Input file: test.comp
Window size to consider (e.g. 30 aa) [30]:
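
To make the search criterion concrete, here is a minimal Python sketch of a windowed check. It assumes that words are counted with overlap and that a sequence qualifies as soon as any single window meets every minimum. The names `count_words` and `has_biased_window` and the `minimums` dictionary are hypothetical and are not part of oddcomp, whose actual counting rules may differ.

```python
from collections import Counter


def count_words(segment: str, word_size: int) -> Counter:
    """Count overlapping words of length word_size in segment."""
    return Counter(
        segment[i:i + word_size]
        for i in range(len(segment) - word_size + 1)
    )


def has_biased_window(seq: str, minimums: dict, window: int = 30) -> bool:
    """Return True if some window of seq meets every minimum word count.

    Assumption: all words in minimums have the same length.
    """
    word_size = len(next(iter(minimums)))
    seq = seq.upper()
    # Assumption: a sequence shorter than the window is checked as one window.
    last_start = max(len(seq) - window, 0)
    for start in range(last_start + 1):
        counts = count_words(seq[start:start + window], word_size)
        if all(counts[word] >= n for word, n in minimums.items()):
            return True
    return False


# At least 1 SR and at least 2 RS, as in the example above.
minimums = {"SR": 1, "RS": 2}
print(has_biased_window("MKRSRSPAAG", minimums, window=6))
print(has_biased_window("RSRAAAAAAAARS", minimums, window=6))
print(has_biased_window("RSRAAAAAAAARS", minimums, window=13))
```

The second and third calls use the same sequence: its two RS words are too far apart to fall inside one 6-residue window, but a 13-residue window covers both.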
Mandatory qualifiers:
- [-sequence] (seqall)
Sequence database USA (Uniform Sequence Address).
- [-compdata] (infile)
A file in the same format as the output produced by
compseq. It sets the minimum frequencies of
words for this analysis.
- [-window] (integer)
The size of the window in which to count words. For example, to count
frequencies in a 40 aa stretch, enter 40 here.
- [-outfile] (outfile)
This is the results file.
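
As an illustration of how minimum counts might be read from a compseq-style file, the sketch below parses a small example. The file layout shown (comment lines starting with `#`, `Word size` and `Total count` header lines, then one word per line followed by its count, and a final `Other` row) is an assumption made for this example and should be checked against real compseq output. The function name `read_minimums` is hypothetical.

```python
def read_minimums(lines):
    """Parse compseq-style lines into a {word: minimum count} mapping.

    Assumed layout (not confirmed here): '#' starts a comment,
    'Word size' and 'Total count' are header lines, and each data
    line starts with a word followed by its observed count.
    """
    minimums = {}
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        if stripped.startswith(("Word size", "Total count")):
            continue  # header lines
        fields = stripped.split()
        if fields[0] == "Other":
            continue  # catch-all row, not a real word
        minimums[fields[0]] = int(fields[1])
    return minimums


# Hypothetical file content asking for at least 2 RS and at least 1 SR.
example = """\
# hypothetical compseq-style input
Word size 2
Total count 0
#
# Word Obs Count Obs Frequency Exp Frequency Obs/Exp Frequency
#
RS 2 0 0 0
SR 1 0 0 0

Other 0 0 0 0
"""
print(read_minimums(example.splitlines()))
```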
Advanced qualifiers:
- -[no]ignorebz (boolean)
The amino acid code B represents Asparagine or Aspartic acid, and the
code Z represents Glutamine or Glutamic acid. These codes are not
commonly used, and you may not want to count words containing them.
When this option is set, words containing B or Z are not counted
individually; they are only noted in the count of "Other" words.
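
The effect of this option can be sketched in Python as follows. This is an illustration only, assuming overlapping word counts; the function name `count_words_bz` is hypothetical, and the handling of other ambiguity codes by oddcomp is not modelled here.

```python
from collections import Counter


def count_words_bz(seq, word_size=2, ignorebz=True):
    """Count overlapping words, optionally folding B/Z words into 'Other'."""
    counts = Counter()
    seq = seq.upper()
    for i in range(len(seq) - word_size + 1):
        word = seq[i:i + word_size]
        if ignorebz and ("B" in word or "Z" in word):
            # Words containing B or Z are only noted under 'Other'.
            counts["Other"] += 1
        else:
            counts[word] += 1
    return counts


# The words in SRBS are SR, RB and BS; two of them contain B.
print(count_words_bz("SRBS", ignorebz=True))
print(count_words_bz("SRBS", ignorebz=False))
```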