Projects & Data Formats¶
Everything you do with Flapjack is stored within a project file; imported data, sort orders, trait information, colour schemes, etc. A Flapjack project is active at all times when using the application - even at startup, when a default new project is already created and waiting for data to be imported into it.
A Flapjack project can store zero or more data sets
.
Data sets, maps, and genotypes¶
A data set usually contains information from an imported map file
and genotype file
.
Note
Both the map file and the genotype file must be in plain-text, tab-delimited format.
The map file is used to provide details on the chromosomes (name and length; see warning below) and the markers (name, chromosome, and position). Order does not matter as Flapjack will group and sort them by chromosome and distance once they are loaded. A short example is shown below.
# fjFile = MAP
1H 125.0 # Only valid for version 1.16.10.x or above
Marker1 1H 32.5
Marker2 1H 45.4
Marker3 2H 23.8
The genotype file contains a list of variety lines, with allele data per marker for that line. It also requires a header line specifying the marker information for each column.
# fjFile = GENOTYPE
Marker1 Marker2 Marker3
Line1 A G G
Line2 A - G/T
Line3 T A C
Note
You can include additional headers which let Flapjack know the URLs for trying to access additional information about lines and markers held in external databases. You can also include headers for Pedigree Information and Favourable Alleles/Alt Marker Names.
Flapjack views¶
Flapjack stores the lines and markers internally in a structure and form that can never be modified. A default view
upon this data is created whenever an import is successful, and any subsequent operations upon the lines or markers will happen to the view, not to the data set.
Each view (and you can create as many as you like) will hold the set of chromosomes
for that data set. Each chromosome is displayed independently, but the lines are obviously common to all chromosomes and any modification to the order or display of lines on one chromosome will be reflected across all the others too.
Colour scheme information is generally specific to a view although some settings will be chromosome-specific, such as colouring by marker.
Phenotypes/Traits¶
A data set can optionally also store information on one or more traits
that are associated with the lines. Trait information is imported from a file with the following tab-delimited format:
# fjFile = PHENOTYPE
Trait1 Trait2 Trait3 Trait4 _#CAT
Experiment1 Experiment2 Experiment1 Experiment2
Line1 50 High Short 23
Line2 2.3 High Medium Average
Line3 99.3 Low Long Average
Trait data for a single trait can be either numerical or categorical. The line containing experiment information for each trait is optional.
Flapjack will determine whether a column contains numerical or categorical data by looking at the first line’s data. However, if for example, you have primarily categorical data but the first line contains a number, then you can override this by including _#CAT
in the column’s title (see example Trait4 above). Similarly, you can enforce numerical data by using _#NUM
.
QTLs¶
A data set can also optionally store information on one or more QTLs
that are associated with the map. QTL information is imported from a file with the following tab-delimited format:
# fjFile = QTL
Name Chromosome Position Pos-Min Pos-Max Trait Experiment [optional_1] .. [optional_n]
QTL1 1H 10 8 12 Height Exp1 25.5 high
QTL2 1H 20 19 26 Height Exp1 34.8 low
QTL3 2H 10 8 13.5 Temp Exp1 99.2 low
The Name to Experiment columns are required and must be included and listed in the order shown. After that, each QTL may have zero or more optional columns of numerical or textual data that can be included too.
Also see the GOBii QTL Format.
Graphs¶
A data set can also optionally store information on one or more graphs
that are associated with the map. Graph information is imported from a file with the following tab-delimited format:
# fjFile = GRAPH
SIGNIFICANCE_THRESHOLD Graph1 5.1
SIGNIFICANCE_THRESHOLD Graph2 7.5
Marker1 Graph1 1.3
Marker1 Graph2 4.3
...
Marker2 Graph1 1.8
Marker2 Graph2 3.9
Any number of graphs can be stored in a single file with data points per marker. The SIGNIFICANCE_THRESHOLD entry is optional (per graph) but defines the significance threshold for that graph if included which will be drawn on Flapjack’s display.