
oddcomp

oddcomp searches a set of protein sequences and reports the identifier of each sequence that meets or exceeds a specified amino acid composition threshold within some window of the sequence.

The following example uses oddcomp to search SWISS-PROT for entries containing at least one SR and at least two RS:

% oddcomp
Finds protein sequence regions with a biased composition
Input sequence(s): sw:*
Output file [5h1d_fugru.oddcomp]: out.odd
Input file: test.comp
Window size to consider (e.g. 30 aa) [30]:
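
The same run can be scripted rather than answered prompt by prompt. The following is a minimal sketch, assuming oddcomp accepts the mandatory qualifiers listed below on the command line, as EMBOSS programs generally do. The helper name build_oddcomp_command is hypothetical and exists only for this illustration; it builds the argument list and does not run the program.

```python
import shlex


def build_oddcomp_command(sequence, compdata, window, outfile):
    """Return an argument list for a non-interactive oddcomp run.

    Hypothetical helper: it only assembles the qualifiers named in this
    section (-sequence, -compdata, -window, -outfile).
    """
    if window < 1:
        raise ValueError("window must be a positive integer")
    return [
        "oddcomp",
        "-sequence", sequence,
        "-compdata", compdata,
        "-window", str(window),
        "-outfile", outfile,
    ]


# Same values as the interactive session above.
cmd = build_oddcomp_command("sw:*", "test.comp", 30, "out.odd")

# shlex.join quotes the wildcard so a shell would not expand it.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would launch the search, provided EMBOSS is installed and the sw database is configured.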

Mandatory qualifiers:

[-sequence] (seqall)

Sequence database USA (Uniform Sequence Address).

[-compdata] (infile)

This is a file in the format of the output produced by compseq. It sets the minimum frequencies of words for this analysis.

[-window] (integer)

This is the size of window in which to count. If you want to count frequencies in a 40 aa stretch, enter 40 here.

[-outfile] (outfile)

This is the results file.
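
To illustrate how the window size and the minimum word counts interact, here is a minimal sketch of a sliding-window check. It is not oddcomp's actual implementation. The function names are hypothetical, the thresholds are passed as a plain dictionary rather than read from a compseq-format file, overlapping words are counted, and a sequence shorter than the window is treated as a single window; all of these are assumptions made for the illustration.

```python
def count_word(segment, word):
    """Count occurrences of word in segment, allowing overlaps."""
    size = len(word)
    return sum(
        1
        for i in range(len(segment) - size + 1)
        if segment[i:i + size] == word
    )


def meets_thresholds(sequence, minimums, window):
    """Return True if at least one window meets every minimum count.

    minimums maps a word (e.g. "SR") to the least number of times it
    must appear inside a single window of the given size.
    """
    sequence = sequence.upper()
    # Assumption: a sequence shorter than the window is one window.
    last_start = max(len(sequence) - window, 0)
    for start in range(last_start + 1):
        segment = sequence[start:start + window]
        if all(count_word(segment, w) >= n for w, n in minimums.items()):
            return True
    return False


# At least 1 SR and at least 2 RS, as in the example search.
minimums = {"SR": 1, "RS": 2}

clustered = meets_thresholds("MKRSRSAAAA", minimums, 6)
scattered = meets_thresholds("MKRSAAAAAARSAAAASR", minimums, 6)
print(clustered, scattered)
```

In the first sequence the words fall inside one 6-residue window, so it qualifies. In the second the same words are spread too far apart for any single window to contain them all.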

Advanced qualifiers:

-[no]ignorebz (boolean)

The amino acid code B represents Asparagine or Aspartic acid, and the code Z represents Glutamine or Glutamic acid. These codes are not commonly used, and you may not want to count words containing them. With this qualifier set, words containing B or Z are simply noted in the count of "Other" words.
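
The effect of this qualifier can be sketched as follows. This is only an illustration of the idea, not the program's code: the function name count_words and its ignorebz parameter are hypothetical, and the default of True is an assumption.

```python
from collections import Counter

# Ambiguity codes described above.
AMBIGUOUS = set("BZ")


def count_words(sequence, word_size, ignorebz=True):
    """Count overlapping words, pooling B/Z words under "Other"."""
    counts = Counter()
    sequence = sequence.upper()
    for i in range(len(sequence) - word_size + 1):
        word = sequence[i:i + word_size]
        if ignorebz and AMBIGUOUS & set(word):
            # The word contains B or Z: note it only as "Other".
            counts["Other"] += 1
        else:
            counts[word] += 1
    return counts


pooled = count_words("SRBSZR", 2)
separate = count_words("SRBSZR", 2, ignorebz=False)
print(pooled)
print(separate)
```

With pooling on, only SR is counted under its own name and the four words containing B or Z are added to "Other". With pooling off, each of the five words is counted separately.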
