
remap

remap uses REBASE data to find the recognition sites and/or cut sites of restriction enzymes in a nucleic acid sequence. By default, it displays the cut sites on both strands. It can also display the translation of the sequence.

Here is a sample session with remap. We only look at a small section of the sequence to save space:

% remap -notran -sbeg 1 -send 60
Display a sequence with restriction cut sites, translation etc..
Input sequence(s): embl:eclac
Output file [eclac.remap]:
Comma separated enzyme list [all]: taqi,bsu6i,acii,bsski
Minimum recognition site length [4]:

Here is an example where all enzymes in the REBASE database are used:

% remap -notran -sbeg 1 -send 60
Display a sequence with restriction cut sites, translation etc..
Input sequence(s): embl:eclac
Output file [eclac.remap]:
Comma separated enzyme list [all]:
Minimum recognition site length [4]:
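
The prompted values in the sessions above correspond to the mandatory qualifiers listed below, so the same run can in principle be described entirely on the command line. The following Python sketch only assembles such a command line; build_remap_command is a hypothetical helper, not part of EMBOSS, and it assumes that remap stops prompting once every prompted value has been supplied as a qualifier.

```python
def build_remap_command(sequence, enzymes="all", sitelen=4, outfile=None,
                        begin=None, end=None, translate=True):
    """Assemble an argument list for remap from the qualifiers in this section.

    Hypothetical helper for illustration: it builds the list but does not
    run the program.
    """
    # Accept either a ready-made string ("all", "taqi,acii") or a list of names.
    if not isinstance(enzymes, str):
        enzymes = ",".join(enzymes)
    argv = ["remap", "-sequence", sequence,
            "-enzymes", enzymes,
            "-sitelen", str(sitelen)]
    if outfile is not None:
        argv += ["-outfile", outfile]
    if begin is not None:
        argv += ["-sbeg", str(begin)]
    if end is not None:
        argv += ["-send", str(end)]
    if not translate:
        argv.append("-notran")
    return argv


cmd = build_remap_command("embl:eclac",
                          ["taqi", "bsu6i", "acii", "bsski"],
                          outfile="eclac.remap", begin=1, end=60,
                          translate=False)
print(" ".join(cmd))
```

If EMBOSS is installed, the resulting list could be passed to subprocess.run(cmd, check=True) from the Python standard library.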

Mandatory qualifiers:

[-sequence] (seqall)

Sequence database USA.

-enzymes (string)

The argument all reads all enzyme names from the REBASE database. You can specify enzymes by giving their names separated by commas, such as: HincII,hinfI,ppiI,hindiii. Enzyme names are not case-sensitive. You may also read the enzyme names from a file by prefixing the filename with an @ character; for example, @enz.list. Blank lines and lines starting with a comment character (# or !) within the file are ignored; all other lines are joined together with commas and treated as the list of enzymes to search for. A file containing enzyme names might look like this:

! my enzymes
HincII, ppiII
! other enzymes
hindiii
HinfI
PpiI

-sitelen (integer)

Minimum recognition site length.

[-outfile] (outfile)

If you enter the name of a file here, this program will write the sequence details into that file.
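
To make the file form of -enzymes (the @enz.list style described above) more concrete, here is a minimal Python sketch of the stated rules: blank lines and lines starting with # or ! are skipped, and the remaining lines are joined with commas. The function name read_enzyme_list is hypothetical, and trimming the spaces around names is an assumption of this sketch rather than documented remap behavior.

```python
def read_enzyme_list(text):
    """Turn the contents of an enzyme file into one comma-separated list."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (# or !).
        if not line or line.startswith(("#", "!")):
            continue
        # A single line may already hold several comma-separated names.
        # Trimming spaces around each name is an assumption of this sketch.
        names.extend(part.strip() for part in line.split(",") if part.strip())
    return ",".join(names)


example = """! my enzymes
HincII, ppiII
! other enzymes
hindiii
HinfI
PpiI
"""
print(read_enzyme_list(example))  # HincII,ppiII,hindiii,HinfI,PpiI
```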

Optional qualifiers:

-mincuts (integer)

Minimum cuts per restriction enzyme.

-maxcuts (integer)

Maximum cuts per restriction enzyme.
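
As a rough illustration of how -mincuts and -maxcuts narrow a result, the Python sketch below keeps only the enzymes whose number of cuts lies between the two limits. The function name, the enzyme names, and the cut positions are all made up for the example, and treating both limits as inclusive is an assumption.

```python
def filter_by_cut_count(cuts_by_enzyme, mincuts, maxcuts):
    """Keep enzymes that cut at least mincuts and at most maxcuts times."""
    kept = {}
    for enzyme, positions in cuts_by_enzyme.items():
        if mincuts <= len(positions) <= maxcuts:
            kept[enzyme] = positions
    return kept


# Hypothetical cut positions, invented purely for illustration.
hits = {"EnzA": [12], "EnzB": [5, 40, 77], "EnzC": []}
print(filter_by_cut_count(hits, mincuts=1, maxcuts=2))  # {'EnzA': [12]}
```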

-single (boolean)

Force single-site-only cuts.

-[no]blunt (boolean)

Allow blunt end cutters.

-[no]sticky (boolean)

Allow sticky end cutters.

-[no]ambiguity (boolean)

Allow ambiguous matches.

-plasmid (boolean)

Allow circular DNA.
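
Treating the sequence as circular matters because a recognition site can span the point where the end of the sequence joins its start. The Python sketch below shows the general idea only: it matches one exact site on one strand, with no ambiguity codes, and is not a description of how remap itself searches. The function name find_sites is hypothetical; GAATTC is the EcoRI recognition sequence.

```python
def find_sites(sequence, site, circular=False):
    """Return 1-based start positions of an exact site in the sequence."""
    seq = sequence.upper()
    site = site.upper()
    # For circular DNA, append the first len(site) - 1 bases so that
    # sites spanning the end/start junction become visible.
    search = seq + seq[:len(site) - 1] if circular else seq
    hits = []
    start = search.find(site)
    while start != -1:
        hits.append(start + 1)
        start = search.find(site, start + 1)
    return hits


plasmid = "TTCAAAAAAGAA"   # toy sequence; GAATTC wraps around the origin
print(find_sites(plasmid, "GAATTC"))                 # []
print(find_sites(plasmid, "GAATTC", circular=True))  # [10]
```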

-[no]commercial (boolean)

Use only enzymes that have suppliers.

-table (menu)

Genetic code to use. See the fuzztran description for codes.

-[no]cutlist (boolean)

List the enzymes that cut.

-flatreformat (boolean)

Display restriction enzyme sites in flat format.

-[no]limit (boolean)

Limits reports to one isoschizomer.

-preferred (boolean)

Report preferred isoschizomers.

Advanced qualifiers:

-[no]translation (boolean)

Display translation.

-[no]reverse (boolean)

Display cut sites and translation of reverse sense.

-orfminsize (integer)

Minimum size of Open Reading Frames (ORFs) to display in the translations.

-uppercase (range)

Regions to put in uppercase. If no regions are specified, the sequence case is left alone. A set of regions is specified by a set of pairs of positions. The positions are integers. They are separated by any non-digit, non-alpha character. Examples of region specification: 24-45, 56-78, 1:45, 67=99;765..888, 1,5,8,10,23,45,57,99.
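
The region syntax above can be read as "take the integers in order and pair them up". A minimal Python sketch of that reading follows; parse_ranges and uppercase_regions are hypothetical names, and the sketch assumes positions are 1-based and inclusive. For simplicity it treats every non-digit character as a separator, which is slightly more permissive than the rule stated above.

```python
import re


def parse_ranges(spec):
    """Pair up the integers in a region specification, in order."""
    numbers = [int(n) for n in re.findall(r"\d+", spec)]
    if len(numbers) % 2:
        raise ValueError("positions must come in pairs")
    return list(zip(numbers[0::2], numbers[1::2]))


def uppercase_regions(sequence, spec):
    """Uppercase the given regions, leaving the rest of the sequence alone."""
    chars = list(sequence)
    for start, end in parse_ranges(spec):
        # Assumed 1-based, inclusive positions; clamp to the sequence.
        for i in range(max(start - 1, 0), min(end, len(chars))):
            chars[i] = chars[i].upper()
    return "".join(chars)


print(parse_ranges("1:45, 67=99;765..888"))       # [(1, 45), (67, 99), (765, 888)]
print(uppercase_regions("acgtacgtac", "2-4 8..9"))  # aCGTacgTAc
```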

-highlight (range)

Regions to color if formatting in HTML. If no regions are specified, the sequence is left alone. A set of regions is specified by a set of pairs of positions. The positions are integers. Each pair is followed by any valid HTML font color. Examples of region specifications:

24-45 blue 56-78 orange
1-100 green 120-156 red

A file of ranges to color (one range per line) can be specified as @filename.
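
For illustration, a -highlight specification can be split into (start, end, color) triples with a single regular expression. This Python sketch reflects one reading of the format described above; parse_highlight is a hypothetical name, and the pattern accepts only simple color tokens such as blue or #ff0000.

```python
import re

# Two integers separated by punctuation, then whitespace and a color token.
_REGION = re.compile(r"(\d+)[^0-9A-Za-z]+(\d+)\s+(#?[0-9A-Za-z]+)")


def parse_highlight(spec):
    """Return a list of (start, end, color) triples from a highlight spec."""
    return [(int(start), int(end), color)
            for start, end, color in _REGION.findall(spec)]


print(parse_highlight("24-45 blue 56-78 orange"))
# [(24, 45, 'blue'), (56, 78, 'orange')]

# The same function handles file contents with one range per line.
print(parse_highlight("1-100 green\n120-156 red\n"))
# [(1, 100, 'green'), (120, 156, 'red')]
```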

-threeletter (boolean)

Display protein sequences in three-letter code.

-number (boolean)

Number the sequences.

-width (integer)

Width of sequence to display.

-length (integer)

Line length of page (0 for indefinite length).

-margin (integer)

Margin around sequence for numbering.

-[no]name (boolean)

Set this to false if you do not want to display the ID name of the sequence.

-[no]description (boolean)

Set this to false if you do not want to display the description of the sequence.

-offset (integer)

Offset to start numbering the sequence from.

-html (boolean)

Use HTML formatting.
