Computational Radiation Cytogenetics (CRC)

Chromosome Aberration Analyzer (CAA)

Chromosome Aberration Analyzer (CAA), currently under development, will provide tools for analyzing observed or simulated chromosome aberration patterns. To demonstrate the functionality currently available and to obtain feedback, a test applet is available. Instructions on accessing the applet are near the middle of this web page.

Test Applet User Manual

The applet finds statistics on the possible cycle structures [Sachs et al. 1999; Cornforth 2001; Levy et al. 2003] of a complete mFISH or whole-chromosome painting FISH aberration pattern. If the input pattern is apparently incomplete [Cornforth 2001], the applet can find all minimal completions.
User Input for Applet. The input string for representing a final, directly observable aberration pattern such as that in the figure describes chromosome segments, telomeres, centromeres, and misrejoinings in a generalization of PAINT, mPAINT, and detailed ISCN as follows.
1. Each rearranged chromosome is specified by parentheses; there are no spaces between adjacent parentheses.
2. Misrejoinings are represented by double colons ::
3. Centromeres are denoted by primes '
4. Rings are denoted by a colon inside each parenthesis, e.g. (:4:). This corresponds to a misrejoining within the ring.
5. All misrejoinings required by the final pattern are included in the input. That is, no segment has more than one color or more than one centromere and every rearranged chromosome contains at least one misrejoining.
For instance, the pattern shown here has either of the following as input string representations: (r::r')(g::b'::g')(b::b)(:r::g:) or (red::red')(green::blue'::green')(blue::blue)(:red::green:). Here a misrejoining in a red chromosome is required by the fact that a red segment appears as part of the ring. In an mFISH experiment, where the colors correspond uniquely to chromosome numbers, the input string would be more explicit, e.g. (1::1')(2::3'::2')(3::3)(:1::2:).
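
The input rules above can be illustrated with a small parser. This is only a sketch in Python, not the applet's Java code: the names Segment, Chromosome, parse_pattern, and count_misrejoinings are hypothetical, and the parser assumes that whitespace between parentheses can be ignored.

```python
import re
from typing import List, NamedTuple

class Segment(NamedTuple):
    label: str        # color or chromosome number, e.g. "r", "red", "1"
    centromere: bool  # True when the segment is written with a prime

class Chromosome(NamedTuple):
    segments: List[Segment]  # left-to-right order as written
    ring: bool               # True for ring notation such as (:4:)

def parse_pattern(text: str) -> List[Chromosome]:
    """Split an input string into rearranged chromosomes and their segments."""
    chromosomes = []
    for body in re.findall(r"\(([^()]*)\)", text):
        body = body.strip()
        ring = len(body) > 2 and body.startswith(":") and body.endswith(":")
        if ring:
            body = body[1:-1]  # drop the colon just inside each parenthesis
        segments = []
        for token in body.split("::"):
            token = token.strip()
            label = token.rstrip("'")
            if not label or ":" in label:
                raise ValueError(f"malformed segment {token!r}")
            segments.append(Segment(label, token.endswith("'")))
        chromosomes.append(Chromosome(segments, ring))
    return chromosomes

def count_misrejoinings(chromosomes: List[Chromosome]) -> int:
    # A linear chromosome with k segments has k - 1 junctions; a ring has k,
    # because the ring closure is itself a misrejoining (rule 4).
    return sum(len(c.segments) - (0 if c.ring else 1) for c in chromosomes)

pattern = parse_pattern("(r::r')(g::b'::g')(b::b)(:r::g:)")
print(len(pattern), count_misrejoinings(pattern))  # 4 6
```

Applied to the default pattern of the Pattern Analysis Window, the same count gives 12 misrejoinings, which agrees with the 12-cycle reported by the Cycle Record as the largest possible cycle for that pattern.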

The Pattern Analysis Window. Upon startup, the test applet launches a Pattern Analysis Window, which demonstrates the functions and the internal representations of the Pattern class. The window is initialized with the aberration pattern (5'::5) (1'::3::2') (:1:) (:2::5:) (4::1) (2::3') (3::1'::5::4') (1::1), which we will use to explain the program. The features most often used are the Display Pattern button, the Cycle Record button, and the Completion Window.

The Display Pattern Button. This button provides a graphical depiction of the final pattern. For instance, for the default pattern, pressing Display Pattern opens the window shown. Colors are not assigned consistently to the labels, but rather are chosen to be equally spaced about the color wheel.

The Cycle Record Button. A complete aberration formation process has a cycle structure that quantifies the complexity of the process [Cornforth 2001; Sachs et al. 2002; Levy et al. 2003]. Cycle structures specify the number of DSBs (DNA double-strand breaks) involved in irreducible exchanges, where each irreducible exchange presumably involves DSBs close in space and time, but different irreducible exchanges can occur at different locations in the cell nucleus and/or at different times. For example, an ordinary simple exchange involving 2 DSBs, and thus pairwise misrejoining of 4 DSB free ends, of the kind that makes a simple dicentric or translocation, is a 2-cycle c2. Examples of 3-cycles c3, involving 3 DSBs in an irreducible exchange, include insertions of a part of one chromosome into another and 3-chromosome "musical chairs" exchanges. A sequential exchange complex (SEC) may involve two different 2-cycles, i.e. have cycle structure c2+c2. And so on. Given a final mFISH pattern for a complete aberration, various initial configurations and cycle structures are possible. The applet gives statistics on all the cycle structures compatible with a given pattern, as illustrated in the next figure. The figure results from pressing the Cycle Record button for the default pattern and uses the Java class called the Cycle Record.

Here the cycle structures are ordered so that the most complicated is at the top and the obligate cycle structure is at the bottom. (12) denotes a 12-cycle c12, which is seen to be the most likely cycle structure for this final pattern. (9-3) denotes a 9-cycle and a 3-cycle, c9+c3. And so on.
When there are too many initial configurations to enumerate in a reasonable time (test at your leisure, but my machine shows a marked slowdown when over ten million initial configurations are involved), there is an option to randomly sample the space of cycle structures by using the Monte Carlo Cycle Record button and specifying the number of iterations to perform. In the case shown in the figure, the Monte Carlo percentages are close to those obtained from the complete list.
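
The idea behind random sampling can be sketched as follows. The page does not describe how the applet samples, so this Python sketch is an assumption: it supposes that a whole-pattern configuration consists of one valid choice per label, and it draws each choice uniformly and independently. The function monte_carlo_record and its parameters are hypothetical names.

```python
import random
from collections import Counter

def monte_carlo_record(per_label_configs, classify, iterations, seed=None):
    """Estimate the percentage of initial configurations in each class.

    per_label_configs: one list of candidate configurations per label
                       (hypothetical input; the applet's internals may differ).
    classify:          function mapping one choice per label to a hashable
                       class, e.g. a cycle structure such as (9, 3).
    """
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(iterations):
        # Independent uniform picks per label give a uniform draw from the
        # product space, without ever listing that space.
        choice = tuple(rng.choice(configs) for configs in per_label_configs)
        tally[classify(choice)] += 1
    return {key: 100.0 * n / iterations for key, n in tally.items()}

# Toy usage: three labels with two choices each, classified by their sum.
estimate = monte_carlo_record([[0, 1]] * 3, sum, iterations=10000, seed=1)
print(sorted(estimate.items()))
```

The cost grows with the number of iterations rather than with the number of initial configurations, which is why sampling stays usable when complete enumeration does not.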

Apparently Incomplete Patterns. It can happen that even if none of the individual rearranged chromosomes in a pattern appear to have unrejoined DSB free ends and no pan-telomeric probe is used, the overall pattern indicates some failure to rejoin or misrejoin and/or some cryptic chromatin segments too small to be seen. For example, if for a particular color there is one centromere and three apparent telomeres, something is wrong. Patterns are called apparently incomplete unless each color involved has at least one centromere and exactly two apparent telomeres for each centromere. A systematic theory of apparent incompleteness has recently been developed and is partially incorporated into the test applet. If the input pattern is apparently incomplete, pressing any of the buttons will launch a dialog box which offers to open a Completion Window. Choosing "Yes" will open a new Completion Window with the mFISH pattern as the input string. To find out more about the Completion Window, click here.
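
The counting rule for apparent completeness can be checked mechanically. The Python sketch below is illustrative only: is_apparently_complete is a hypothetical name, the pattern is passed as already-split tokens, and an apparent telomere is assumed to be each outer end of a non-ring chromosome.

```python
from collections import Counter

def is_apparently_complete(chromosomes):
    """chromosomes: list of (tokens, ring) pairs, e.g. (["g", "b'", "g'"], False)."""
    centromeres, telomeres, labels = Counter(), Counter(), set()
    for tokens, ring in chromosomes:
        for token in tokens:
            label = token.rstrip("'")
            labels.add(label)
            if token.endswith("'"):
                centromeres[label] += 1
        if not ring:
            # Assumption: the two outer ends of a linear chromosome are its
            # apparent telomeres; rings contribute none.
            telomeres[tokens[0].rstrip("'")] += 1
            telomeres[tokens[-1].rstrip("'")] += 1
    return all(
        centromeres[c] >= 1 and telomeres[c] == 2 * centromeres[c]
        for c in labels
    )

# (r::r')(g::b'::g')(b::b)(:r::g:)
example = [(["r", "r'"], False), (["g", "b'", "g'"], False),
           (["b", "b"], False), (["r", "g"], True)]
print(is_apparently_complete(example))  # True
```

A pattern such as (1'::2)(2'::1)(1::2) fails the check, because each color then has one centromere but three apparent telomeres.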

Test Applet
The above discussion of the test applet for computing cycle structures should suffice for most users. To start the applet, SELECT HERE.

The applet may take some time to load in the browser. After finishing, you can close the Pattern Analysis window and use the Back button on your browser to return to this page. The applet requires a fairly recent version of the JRE (Java Runtime Environment 2). To get JRE in the Java J2SE platform from SUN select here (SDK, a large download, is not needed).


The Pattern Analysis Window of the test applet has some additional buttons. These are mainly concerned with technical details of representing aberrations via multigraphs [Sachs et al. 2002]. The image shows the result of pressing the Pattern button, which displays the internal representation of the Pattern object created from the input mFISH pattern. An aberration is stored by labeling each chromosome segment sequentially. To each chromosome segment is associated a label and a Boolean flag indicating whether it has a centromere. This is stored in the label map. The pattern string stores the arrangement of the chromosome segments in the final pattern.

We note that for a given final pattern, all the multigraph vertices and all chromosome and final edges are uniquely determined. Each vertex is the right or left endpoint of a chromosome segment. The label map provides a convenient method for labeling vertices as -i and +i for the left and right endpoints, respectively, of the chromosome edge numbered i in the label map. Then each chromosome edge is (-i, +i) and final edges are determined from the pattern string. To find the final edges, click the Final Edge button (figure). Note: the Pattern class does not generate final or initial edges between telomere vertices.
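
The construction just described can be sketched in a few lines of Python. This is not the applet's Pattern class: build_edges is a hypothetical name, and the sketch assumes segments are numbered 1, 2, 3, ... in the order they are written.

```python
def build_edges(chromosomes):
    """chromosomes: list of (tokens, ring) pairs, e.g. (["g", "b'", "g'"], False).

    Returns (label_map, chromosome_edges, final_edges). Vertices are signed
    integers: -i is the left end and +i the right end of segment i.
    """
    label_map = {}
    chromosome_edges, final_edges = [], []
    next_id = 1
    for tokens, ring in chromosomes:
        ids = []
        for token in tokens:
            # Label map entry: (label, has_centromere).
            label_map[next_id] = (token.rstrip("'"), token.endswith("'"))
            chromosome_edges.append((-next_id, next_id))
            ids.append(next_id)
            next_id += 1
        # Each "::" joins the right end of one segment to the left end of the next.
        for left, right in zip(ids, ids[1:]):
            final_edges.append((left, -right))
        if ring:
            # Ring closure: right end of the last segment to left end of the first.
            final_edges.append((ids[-1], -ids[0]))
    return label_map, chromosome_edges, final_edges

# (r::r')(g::b'::g')(b::b)(:r::g:)
example = [(["r", "r'"], False), (["g", "b'", "g'"], False),
           (["b", "b"], False), (["r", "g"], True)]
label_map, chromosome_edges, final_edges = build_edges(example)
print(final_edges)  # [(1, -2), (3, -4), (4, -5), (6, -7), (8, -9), (9, -8)]
```

No edge is created at the outer ends of linear chromosomes, in line with the note that telomere vertices carry no final or initial edges.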

The Number of Initial Configurations button displays the number of distinct, valid initial edge configurations consistent with the final pattern. The initial configurations are generated by enumerating all possible initial edges among the vertices of each Label. For each Label, we generate an Edge Tree, which contains an edge at every node (save the root); every path from root to leaf determines a unique valid initial edge configuration. The Edge Tree is generated from a Segment Decomposition, which is a classification of chromosome segments by whether or not they have a telomere and whether or not they have a centromere. We have implemented an algorithm that produces an Edge Tree by taking a segment decomposition and finding the set of all initial edges incident on a single vertex. Each such edge is a child node on the Edge Tree. Its children are found recursively from a reduced segment decomposition, that is, a segment decomposition in which two chromosome edges are joined along the initial edge associated with that node.
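
A recursive enumeration in the spirit of the Edge Tree can be sketched as follows. This is a Python illustration, not the applet's algorithm: the name initial_configurations and the piece representation are hypothetical, and the sketch assumes that a configuration is valid when every reassembled chromosome runs from telomere to telomere and carries exactly one centromere.

```python
def initial_configurations(pieces):
    """Yield each valid list of initial edges for the pieces of one label.

    A piece is (end_a, end_b, centromeres). An end is a vertex id when it is
    a DSB free end, or None when it is a telomere.
    """
    free = sorted(e for a, b, _ in pieces for e in (a, b) if e is not None)
    if not free:
        # Leaf: every piece must be a telomere-to-telomere chromosome
        # carrying exactly one centromere.
        if all(c == 1 for _, _, c in pieces):
            yield []
        return
    v = free[0]  # branch on all initial edges incident on this vertex
    i = next(k for k, p in enumerate(pieces) if v in p[:2])
    far_i = pieces[i][1] if pieces[i][0] == v else pieces[i][0]
    for j, (a, b, cen) in enumerate(pieces):
        if j == i:
            continue  # joining a piece to itself would make a ring
        for w in (a, b):
            if w is None:
                continue  # telomeres never take part in initial edges
            far_j = b if w == a else a
            merged = (far_i, far_j, pieces[i][2] + cen)
            if merged[2] > 1:
                continue  # would reassemble a dicentric chromosome
            if far_i is None and far_j is None and merged[2] != 1:
                continue  # would close off an acentric chromosome
            rest = [p for k, p in enumerate(pieces) if k not in (i, j)]
            for tail in initial_configurations(rest + [merged]):
                yield [(v, w)] + tail

# Red segments of (r::r')(g::b'::g')(b::b)(:r::g:), numbered 1, 2 and 8:
red = [(None, 1, 0), (-2, None, 1), (-8, 8, 0)]
for config in initial_configurations(red):
    print(config)
```

For the red label this yields two configurations, one for each orientation in which the ring segment can sit between the two telomere-bearing segments. Each recursive call works on a reduced list of pieces in which two pieces have been joined along the chosen edge.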

By taking all possible unions of valid initial edge configurations over all labels, we obtain all valid initial edge configurations for the whole pattern. Taking each of these in union with the chromosome and final edges obtained previously gives all possible multigraphs consistent with the final mFISH pattern. The All Initial Edges button displays all valid initial edge configurations. The All Cycles button gives the exchange graph cycle structure for each initial edge configuration. In cases where there are more than 100 initial configurations, these buttons will display a message telling you it would take too long to display them all. (This is why we have modified the sample pattern to demonstrate these buttons.)

The cycle structures are obtained by computing, for each initial configuration, the number and size of the connected components for the exchange process submultigraph, consisting of the initial edges in union with the final edges and corresponding vertices. For instance, for an exchange structure (5-3-2), i.e. c5+c3+c2, the exchange process submultigraph has three connected components with ten, six, and four vertices respectively.
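
The component count can be illustrated with a short union-find sketch in Python. The function name cycle_structure is hypothetical, and the edge lists in the usage example were worked out by hand for the pattern (r::r')(g::b'::g')(b::b)(:r::g:) under the assumption that segments are numbered in the order written.

```python
from collections import Counter

def cycle_structure(initial_edges, final_edges):
    """Return cycle sizes, largest first, e.g. (5, 3, 2) for c5+c3+c2."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the endpoints of every initial and final edge.
    for u, v in list(initial_edges) + list(final_edges):
        parent[find(u)] = find(v)

    # A connected component with 2k vertices is a k-cycle (k DSBs).
    sizes = Counter(find(x) for x in list(parent))
    return tuple(sorted((n // 2 for n in sizes.values()), reverse=True))

final = [(1, -2), (3, -4), (4, -5), (6, -7), (8, -9), (9, -8)]
initial = [(-8, 1), (-2, 8), (-9, 3), (-5, 9), (-7, -4), (4, 6)]
print(cycle_structure(initial, final))  # (6,)
```

Chromosome edges are deliberately left out, since only the initial and final edges belong to the exchange process submultigraph.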

This calculation is also used to display the number and percent of initial edge configurations that exhibit a particular exchange cycle structure, as discussed above in connection with the Cycle Record button.


CAA is funded in part by Grant NNJ04HF42G, 03-OBPR-07, NNJ06HA27G, a subcontract for Computational Modeling of Chromosome Aberrations Produced by HZE Particles. Lynn Hlatky PI. We thank Drs. Cornforth, Loucas, and Savage for discussions that led to improving these algorithms.

"We should take comfort in two conjoined features of nature: first, that our world is incredibly strange and therefore supremely fascinating; second, that however bizarre and arcane our world might be, nature remains comprehensible to the human mind." - Stephen Jay Gould