#.......# 
 S.......S 
  T.....E 
    R.Q 
     U 
    E.C 
  N.....T 
 C.......U 
 #.......R
  E.....E 
    S.S 

Structures and Sequences

Under construction

Context dependent mutation trends in the human genome

Supplementary materials

All data concern SNP regions of the human genome


SNP regions in the dataset are 21 bp fragments of the genome with the SNP in the middle position.
Details are described on the "Legend" sheet of each file.
  • Word frequencies in SNP regions of the human genome (zip file, 66 MB)
    The file contains all filtered SNPs from the human genome with +/-10 bp flanks, 3,405,096 SNPs in total.
    The filters are described in the files listed below; see the "Legend" sheet of each.
    Columns are as follows:
    Chromosome|21bp_region_start|21bp_region_end|SNP_name|human_variant_1|human_variant_2|21bp_region_sequence|ancestral_state
  • Word frequencies in SNP regions of the human genome (xls file, 1.2 MB)
    The file contains the frequency of each word of length 1, 2, 3, 4 or 5 in the flanks of SNP regions, for several SNP selection criteria. These frequencies are compared with the frequencies of the same words in the complete human genome.
  • Context dependent mutation rate (xls file, 1.3 MB)
    The file contains the probabilities of mutations in the context of words of length 1, 2, 3, 4 or 5.
  • Equilibrium word frequencies (xls file, 145 KB)
    Computed equilibrium frequencies of 1-4 letter words are compared with the actual word frequencies in the human genome. From this comparison, the evolutionary trend in word frequencies can be hypothesized.
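As an illustration of the record layout listed above, here is a minimal Python sketch that reads one record, splits the 21 bp region into its two 10 bp flanks, and counts words of a given length in the flanks. It is not the procedure used to produce the files. The "|" field separator is an assumption taken from the column list (the real file may use another delimiter, so it is a parameter), and the names SnpRegion, parse_record, flanks and count_words, as well as the example record, are hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class SnpRegion(NamedTuple):
    """One record, with fields in the order given in the column list."""
    chromosome: str
    start: int
    end: int
    snp_name: str
    variant_1: str
    variant_2: str
    sequence: str
    ancestral_state: str


def parse_record(line: str, sep: str = "|") -> SnpRegion:
    """Parse one line; `sep` is an assumption, adjust it to the real file."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    chrom, start, end, name, var1, var2, seq, ancestral = fields
    if len(seq) != 21:
        raise ValueError(f"expected a 21 bp region, got {len(seq)} bp")
    return SnpRegion(chrom, int(start), int(end), name,
                     var1, var2, seq.upper(), ancestral)


def flanks(region: SnpRegion) -> Tuple[str, str]:
    """Return the 10 bp left and right flanks; the SNP sits at index 10."""
    return region.sequence[:10], region.sequence[11:]


def count_words(regions: Iterable[SnpRegion], k: int) -> Counter:
    """Count overlapping words of length k inside each flank.

    Words never span the SNP position, because each flank is scanned
    separately.
    """
    counts: Counter = Counter()
    for region in regions:
        for flank in flanks(region):
            for i in range(len(flank) - k + 1):
                counts[flank[i:i + k]] += 1
    return counts


if __name__ == "__main__":
    # Made-up record for illustration only; not taken from the dataset.
    example = parse_record("chr1|1000|1020|rs0000000|A|G|ACGTACGTACATGCATGCATG|A")
    print(flanks(example))
    print(count_words([example], 2).most_common(3))
```

Dividing each count by the total number of words of that length would give frequencies comparable in form to those in the xls files, although the selection criteria described on the "Legend" sheets are not reproduced here.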