JSTRING

Introduction

JSTRING is a Java program that searches for Tandem Repeats (TRs) in a DNA sequence. It uses the same algorithm as STRING, with the added advantage of presenting the results in a clear graphical format that gives an immediate overview of the results and allows flexible interactive handling of them.
For complete documentation of the STRING algorithm, see STRING, where both the C source code and the paper are available.

Choice of the functioning mode

JSTRING can be run as an applet in any browser or as a stand-alone Java application (without a browser).
To use JSTRING as an applet, the user just needs to click on the JSTRING link.
To use JSTRING locally as an application, the user needs to download the bytecode (JSTRING.class) and run it with a Java interpreter.
Note that in applet mode the sandbox restrictions prevent disk I/O, printer output, and input from the Web, so these operations are not available when the program runs as an applet.

User interface frames

The user interface consists of a frame.
The current mode (applet or application) and the current state are reported in the title bar of the frame.
Each frame has a Help button that lists the active commands.

The possible frames are the following:

Basic Frame

The Basic Frame can appear in three different graphical formats, depending on the state of the program, as shown in the figure below.
 
 




 

Input Frame


The Input Frame contains four radio buttons (only the first one is enabled in applet mode) for choosing the source of the input sequence, which must be in FASTA or GenBank text format. Some choices require typing into an input window:
 

1.  Input from window: when selected, the user should put the input sequence in the adjacent window (either by typing it or by copy-and-paste) and confirm by pressing the "Ok W" pushbutton.
 
 




2.  Input by dialog: when selected, the user should press the "Ok D" pushbutton to find (and open) the file containing the desired input sequence.
 
 



3.  Input by filename: when selected, the user should put in the adjacent window (either by typing it or by copy-and-paste) the name of the file containing the desired input sequence, and confirm by pressing the "Ok N" pushbutton.
 
 




4.  Input from web: when selected, the user should put in the adjacent window (either by typing it or by copy-and-paste) the URL of the file containing the desired input sequence, and confirm by pressing the "Ok U" pushbutton.
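
As an illustration of the FASTA text format accepted by the input window, the following minimal sketch reduces a pasted FASTA record to a plain string of bases. The class and method names are hypothetical and are not taken from JSTRING; the sketch assumes that only the first record is wanted and does not handle the GenBank format.

```java
import java.util.Locale;

/** Hypothetical helper, not part of JSTRING: extracts the bases from FASTA text. */
final class FastaSketch {

    /** Returns the first record's sequence in upper case, without header or whitespace. */
    static String parseFirstRecord(String fastaText) {
        StringBuilder sequence = new StringBuilder();
        boolean headerSeen = false;
        for (String line : fastaText.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;                        // skip blank lines
            }
            if (trimmed.startsWith(">")) {
                if (headerSeen || sequence.length() > 0) {
                    break;                       // a further header starts the next record
                }
                headerSeen = true;               // header line: description only, no bases
                continue;
            }
            sequence.append(trimmed.replaceAll("\\s+", ""));
        }
        return sequence.toString().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String text = ">example sequence\nacgtac\nGTACGT\n";
        System.out.println(parseFirstRecord(text));
    }
}
```

Stopping at the second header keeps the sketch short; a real reader would probably also validate the alphabet (T, C, A, G, N).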
 
 



Run Frame

 




Graphic Frame

In the Graphic Frame the results are displayed in a mouse-sensitive graphical form that allows them to be explored interactively.

The Graphic Frame allows the user to select a tract of the sequence being examined. It is made of three horizontal bands; from the bottom up, these are the Sequence Band, the TR Band, and the Consensus Band.
Whenever possible, the Sequence Band displays the sequence tract being observed, while each result, i.e. a TR with its associated consensus, is displayed in the two upper bands, as explained below.
The TR Band is a Cartesian plane in which each result is represented by a horizontal red segment (with a central black dot) whose ordinate is the length of the consensus and whose abscissas are the positions it spans along the sequence. The ordinate scale is an approximately logarithmic piecewise-linear scale, while the abscissa scale is linear and is indicated simply by its extreme values along the sequence and by a representation of the scale unit length.
The Consensus Band is a rectangular area in which only the ordinate matters, since it represents the score of the result. In this area the consensus of a given result is drawn as a vertical rectangle whose height is the score of the result; its horizontal position is arbitrary (i.e. chosen only for graphical convenience), but a straight line connects the consensus to the corresponding red segment in the TR Band.
Within the rectangle the nucleotide bases of the consensus are represented by coloured slices (according to the selected colour option, see below), ordered from the top down; a similar slicing is used to represent the nucleotide bases of the tract in the Sequence Band.
Whenever possible, the TR initial position is displayed above the rectangle, while the consensus length and the score of the result are displayed below it (these three numerical values are also shown in analog form in the graph).
When many consensus sequences are shown in the Consensus Band, rectangles with smaller scores are drawn narrower.
While all results for the considered tract are shown (possibly superposed and indistinguishable) in the TR Band, only a limited number of consensus sequences can be displayed in the Consensus Band.
More generally, elements are displayed only if they are not too small and if there is enough space.
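
The ordinate scale of the TR Band is described only as approximately logarithmic and piecewise linear. One way such a scale can be built is sketched below. The breakpoints at powers of ten, the class name and the method name are assumptions made for illustration; the actual breakpoints used by JSTRING are not stated here.

```java
/** Hypothetical sketch, not JSTRING code: an approximately logarithmic piecewise-linear scale. */
final class LogLikeScaleSketch {

    /**
     * Maps a consensus length onto [0, 1]. Breakpoints sit at 1, 10, 100, ...
     * (an assumption); each decade gets an equal share of the axis and values
     * inside a decade are interpolated linearly.
     */
    static double position(double value, int decades) {
        if (decades < 1) {
            throw new IllegalArgumentException("decades must be at least 1");
        }
        double lower = 1.0;
        for (int k = 0; k < decades; k++) {
            double upper = lower * 10.0;
            if (value < upper) {
                // Linear interpolation inside the current decade; values below 1 map to 0.
                double within = Math.max(0.0, (value - lower) / (upper - lower));
                return (k + within) / decades;
            }
            lower = upper;
        }
        return 1.0; // values at or beyond the top breakpoint are clamped
    }

    public static void main(String[] args) {
        for (double length : new double[] {1, 5.5, 10, 55, 100, 1000}) {
            System.out.printf("%6.1f -> %.3f%n", length, position(length, 3));
        }
    }
}
```

With three decades, a length of 55 lands exactly halfway up the axis, whereas a true logarithmic scale would place it slightly higher; the piecewise-linear form trades that small distortion for simple tick arithmetic.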

The sensitive feature of the Graphic Frame

A very useful feature of the Graphic Frame is that it is sensitive: many choices for displaying the results can be made simply by clicking in various areas of the frame, as explained in detail both below and in the help frame opened by clicking the "Clic" pushbutton.
The clickable areas and the corresponding actions (centering, zooming by a factor of 2, describing results) are shown in the following figure.

Zoom and translation actions can also be performed by pressing the appropriate keyboard keys or virtual pushbuttons. As explained in detail in the help obtained by pressing the Help pushbutton, eight actions are available: left or right translation, go to the sequence start or end, and zoom in or zoom out by a factor of 2 or 10.
The detailed description of a single result can be obtained by clicking within the corresponding rectangle in the Consensus Band, while the detailed description of the results shown in the frame can be obtained by clicking the Show pushbutton.
The details of all the results can be obtained by clicking the Full pushbutton in the Basic Frame.
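
The eight navigation actions can be pictured as operations on a window over the sequence. The sketch below is only a model under stated assumptions: zooming keeps the window centre fixed and the translation step is supplied by the caller, neither of which is specified in this manual. All names are hypothetical.

```java
/** Hypothetical model of the displayed tract, not JSTRING code. */
final class ViewWindowSketch {
    private final long sequenceLength;
    private long start;   // first displayed position, 0-based
    private long width;   // number of displayed positions

    ViewWindowSketch(long sequenceLength) {
        if (sequenceLength < 1) {
            throw new IllegalArgumentException("empty sequence");
        }
        this.sequenceLength = sequenceLength;
        this.start = 0;
        this.width = sequenceLength;   // initially the whole sequence is shown
    }

    void zoomIn(int factor)  { resize(Math.max(1, width / factor)); }             // factor 2 or 10
    void zoomOut(int factor) { resize(Math.min(sequenceLength, width * factor)); }
    void shift(long delta)   { start = clamp(start + delta); }                    // negative = left
    void home()              { start = 0; }
    void end()               { start = sequenceLength - width; }

    long start() { return start; }
    long width() { return width; }

    /** Changes the width while keeping the window centre fixed (an assumption). */
    private void resize(long newWidth) {
        long center = start + width / 2;
        width = newWidth;
        start = clamp(center - width / 2);
    }

    /** Keeps the window inside the sequence. */
    private long clamp(long candidate) {
        return Math.max(0, Math.min(candidate, sequenceLength - width));
    }

    public static void main(String[] args) {
        ViewWindowSketch view = new ViewWindowSketch(1000);
        view.zoomIn(2);
        System.out.println(view.start() + " .. " + (view.start() + view.width() - 1));
    }
}
```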

Options Frame

The Options Frame allows the user to change three kinds of default settings: search parameters, working array dimensions, and colours.
Search dimensions and parameters have the same names and functions as those in the source of the C-language version, string.c, and are explained in its comment lines or in the paper. The new graphical presentation of the results also makes it possible to assign colours to the DNA bases.

1. parameters: the search parameters that influence the amount of output results are:


When the penalty becomes stronger, the cumulative length of the DNA tracts that are considered TRs decreases, while it can easily be shown that their number can either increase or decrease, depending on the sequence.
Similar considerations apply to a growing score and to a decreasing max_gamma.
A larger max_gamma, a smaller score, or a milder penalty markedly increases the execution time, can require larger dimensions, and might increase the amount of (possibly uninteresting) output.
On the other hand, the main drawback of a smaller max_gamma, a larger score, or a stronger penalty is the risk of missing some possibly interesting TRs.

2. dimensions: the working array dimension parameters that can be interactively changed are six:


The need for long work arrays depends on the degree of repetitiveness of the sequence, not on its length.
In general, a lower value for a dimension parameter reduces the memory requirement and often increases the execution speed, but may be insufficient; in that case a message is issued suggesting that the relevant value be increased and the search repeated.
Too large a value may cause the Java interpreter to run out of memory. JSTRING tries to catch this error and, when it does, issues a warning message suggesting that the user quit and re-run JSTRING (since after an OutOfMemoryError the Java Virtual Machine may become unable to manage garbage collection). The new run must use smaller dimension parameters or, in application mode, more memory allocated to the process, i.e. a larger maximum heap size specified for the Java interpreter.
Default values used by JSTRING are generally sufficient.
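
The out-of-memory handling described above can be sketched as follows. This is not JSTRING's code: the class, the method and the message text are hypothetical. The only assumption about the runtime is the standard `-Xmx` option of the `java` launcher, which sets the maximum heap size (for example `java -Xmx512m JSTRING`).

```java
/** Hypothetical sketch, not JSTRING code: guarded allocation of a work array. */
final class WorkArraySketch {

    /** Tries to allocate a work array; returns null if the JVM runs out of memory. */
    static int[] tryAllocate(int size) {
        try {
            return new int[size];
        } catch (OutOfMemoryError e) {
            // After this error the JVM may be in a fragile state, so the caller
            // should only report the problem and let the user quit and restart.
            return null;
        }
    }

    public static void main(String[] args) {
        int requested = 1_000_000;   // illustrative dimension value
        int[] work = tryAllocate(requested);
        if (work == null) {
            System.out.println("Out of memory: quit, then re-run with smaller dimensions"
                    + " or a larger maximum heap size.");
        } else {
            long maxHeapMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
            System.out.println("Allocated " + work.length + " ints; max heap is "
                    + maxHeapMiB + " MiB.");
        }
    }
}
```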

3. colours: to personalise the graphical presentation of the results, the user can change the default identification colours for T, C, A, G, N, either by selecting a shades-of-gray mode or by choosing among the 13 standard Java colours.
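
For reference, the 13 standard Java colours are presumably the 13 constants predefined in `java.awt.Color`. The sketch below lists them and builds one possible shades-of-gray palette for the five base symbols. The evenly spaced gray levels and all class and method names are assumptions for illustration, not JSTRING's defaults.

```java
import java.awt.Color;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch, not JSTRING code: colour choices for the base symbols. */
final class BaseColourSketch {

    /** The 13 colour constants predefined in java.awt.Color. */
    static final Color[] STANDARD = {
        Color.BLACK, Color.BLUE, Color.CYAN, Color.DARK_GRAY, Color.GRAY,
        Color.GREEN, Color.LIGHT_GRAY, Color.MAGENTA, Color.ORANGE,
        Color.PINK, Color.RED, Color.WHITE, Color.YELLOW
    };

    /** Assigns evenly spaced gray levels to T, C, A, G, N (an illustrative choice). */
    static Map<Character, Color> grayPalette() {
        String bases = "TCAGN";
        Map<Character, Color> palette = new LinkedHashMap<>();
        for (int i = 0; i < bases.length(); i++) {
            int level = 255 * i / (bases.length() - 1);   // 0, 63, 127, 191, 255
            palette.put(bases.charAt(i), new Color(level, level, level));
        }
        return palette;
    }

    public static void main(String[] args) {
        grayPalette().forEach((base, colour) ->
                System.out.println(base + " -> gray level " + colour.getRed()));
    }
}
```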




Message Frame

The Message Frame displays a context-dependent message in a window, as in the example below.
 
 



Help Frame

The Help Frame displays and explains the commands that are active in the frame from which the Help command was called.
 
 



Error Frame

The Error Frame is displayed when the user tries to perform a forbidden operation, i.e. disk I/O, printer output, or input from the Web in applet mode.
 
 




 

The pushbuttons

There are 31 pushbuttons; which of them appear in a given frame depends on the context. A complete list is given in the following table.
 
 
Label of     Keystroke
pushbutton   shortcuts     Explanation
----------   -----------   -------------------------------
Quit         [ESC], q      quit
Save         s             save on disc
Cont         [ESC], c      exit from Message
Exit         [ESC], e      exit from Help
Ok W                       accept Input
Ok D                       input sequence by dialog window
Ok F                       input sequence by file name
Ok U                       input sequence by URL name
Esc          [ESC]         reject input
Help         ?, h          help about buttons
Stop         [ESC], s      stop current operation
Opt          o             change options
Grph         g             results graphical
Shrt         s             tabular results
Full         f             full results
Run          r             start TR search
Back         [ESC], b      exit from graph
Clic         c             help on sensitive graph
Prnt         p             graphical print
<            [LEFT], <     left movement
Home         [HOME], a     sequence begin
>            [RIGHT], >    right movement
End          [END], z      sequence end
-            [DOWN], -     zoom out * 2
---          [PGDN], 1     zoom out * 10
+            [UP], +       zoom in * 2
+++          [PGUP], 9     zoom in * 10
Show         s             show page results
Set          s             set option
Load         l             load sequence
Beep         b             check during waiting
