
Allele frequency tables hold population-dependent information about how often each allele occurs at a given DNA marker. This information can be read from plain-text (ASCII) files in which the allele frequencies for each DNA marker are spread over two rows: the upper row contains the allele names and the lower row the corresponding frequency values. Such files may contain invalid allele names as a result of copying and pasting to and from Excel, which often converts allele names into dates, e.g. 07.07.2002 (originally 7/7.2, i.e. alleles 7 and 7.2) or 15.Feb (originally 15.2). The information cannot be saved to the database in this form and has to be preprocessed first. SampleCheck replaces month names (Jan, Feb, Mrz) and dates with the corresponding numbers. In some files several alleles are combined under a single common frequency value. SampleCheck splits names like >20 (alleles greater than 20) into two alleles, and combined alleles like 7/7.2 are likewise split into their individual alleles. When combined alleles are split, their frequency value is split as well. Alleles with a frequency value of 0.0 are not saved.
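
Part of this preprocessing could look roughly like the following Python sketch. It is not SampleCheck's actual code: the names `restore_name` and `clean_marker` are hypothetical, and dividing a combined allele's frequency equally among its parts is an assumption (the text only says that the value is split). Full dates such as 07.07.2002 and names like >20 are deliberately left out, because their exact conversion rules are not spelled out here.

```python
import re

# Month abbreviations named in the text (Mrz is the German abbreviation for March).
MONTHS = {"Jan": 1, "Feb": 2, "Mrz": 3}

# Matches Excel-mangled names such as "15.Feb".
DAY_MONTH = re.compile(r"^(\d{1,2})\.(Jan|Feb|Mrz)$")


def restore_name(name):
    """Turn a day.month string such as '15.Feb' back into '15.2'."""
    name = name.strip()
    match = DAY_MONTH.match(name)
    if match:
        return f"{int(match.group(1))}.{MONTHS[match.group(2)]}"
    return name


def clean_marker(names, freqs):
    """Return {allele: frequency} for one marker (hypothetical helper)."""
    result = {}
    for name, freq in zip(names, freqs):
        if freq == 0.0:
            continue  # alleles with a value of 0.0 are not saved
        parts = restore_name(name).split("/")  # "7/7.2" -> ["7", "7.2"]
        share = freq / len(parts)  # assumption: equal share per allele
        for part in parts:
            result[part] = result.get(part, 0.0) + share
    return result
```

For instance, `clean_marker(["15.Feb", "7/7.2", "9"], [0.1, 0.2, 0.0])` restores the first name to 15.2, splits the second entry into alleles 7 and 7.2, and drops allele 9 because its value is 0.0.
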
There is also a method for normalizing the allele frequency values: the sum of the values for one marker must not be greater than 1, and the floating-point values are limited to four decimal places. The valid, normalized data are then saved to the database.
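
The normalization step might be sketched as follows. The text does not say how SampleCheck enforces the limit, so scaling every value by the marker's sum is an assumption, as is leaving markers whose sum is at most 1 unscaled; `normalize_marker` is a hypothetical name.

```python
def normalize_marker(freqs):
    """Scale one marker's frequencies so their sum does not exceed 1.

    freqs maps allele names to frequency values. Values are rounded to
    four decimal places, as described in the text.
    """
    total = sum(freqs.values())
    # Assumption: rescale only when the sum is greater than 1.
    factor = 1.0 / total if total > 1.0 else 1.0
    return {allele: round(value * factor, 4) for allele, value in freqs.items()}
```

Because each value is rounded separately, the rounded values may add up to a number that differs from the unrounded sum in the fourth decimal place.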

Mutation rates describe how often a mutation occurs at a given DNA marker. Files containing this information should have three columns: the first for the marker name, the second for the paternal mutation rate and the third for the maternal mutation rate.
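
A minimal Python sketch of reading such a file is shown below. The column separator is not specified in the text, so whitespace-separated columns are assumed; `parse_mutation_rates` is a hypothetical name, and the marker names and rates in the usage note are purely illustrative.

```python
def parse_mutation_rates(lines):
    """Parse lines of 'marker paternal_rate maternal_rate'.

    Returns {marker: (paternal_rate, maternal_rate)}. Blank lines are
    skipped; a line with a different number of columns raises ValueError.
    """
    rates = {}
    for number, line in enumerate(lines, start=1):
        fields = line.split()  # assumption: whitespace-separated columns
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 columns, got {len(fields)}")
        marker, paternal, maternal = fields
        rates[marker] = (float(paternal), float(maternal))
    return rates
```

Calling `parse_mutation_rates(["D3S1358 0.002 0.0005", "VWA 0.003 0.001"])` yields one (paternal, maternal) pair per marker; an open file object can be passed in place of the list.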