RSAT - oligo-analysis

Detect over- or under-represented oligomers (k-mers) in nucleotide or peptide sequences.
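
As an illustration of what "over- or under-represented" can mean in practice, here is a minimal sketch that scores an observed oligomer count against an expected probability under a binomial model. The binomial model is an assumption made for this example, and the function names (`binomial_upper_tail`, `binomial_lower_tail`) are hypothetical; they are not taken from the tool.

```python
from math import exp, lgamma, log


def _log_binomial_pmf(x, trials, p):
    """Log of P(X = x) for X ~ Binomial(trials, p), with 0 < p < 1."""
    log_choose = lgamma(trials + 1) - lgamma(x + 1) - lgamma(trials - x + 1)
    return log_choose + x * log(p) + (trials - x) * log(1.0 - p)


def binomial_upper_tail(observed, trials, p):
    """P(X >= observed): a small value suggests over-representation."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return sum(exp(_log_binomial_pmf(x, trials, p))
               for x in range(observed, trials + 1))


def binomial_lower_tail(observed, trials, p):
    """P(X <= observed): a small value suggests under-representation."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return sum(exp(_log_binomial_pmf(x, trials, p))
               for x in range(0, observed + 1))
```

Here `trials` would be the number of oligomer positions scanned and `p` the expected probability of the oligomer under the chosen background model. Because many oligomers are tested at once, a raw tail probability needs a multiple-testing correction before it is interpreted.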

Reference: van Helden, J., André, B. and Collado-Vides, J. (1998). . J Mol Biol 281, 827-42.

Warning: for vertebrate genomes, analyses of complete promoters from co-expressed gene groups return many false positives (i.e. if you submit a random set of genes, you always get plenty of highly 'significant' motifs). This is likely due to the heterogeneity of human sequences (mixtures of GC-rich and GC-poor promoters).
However, analyses of ChIP-seq peaks return very good results. See the program peak-motifs.

Sequence       Format Paste your sequence in the box below

Or select a file to upload (.gz compressed files supported)

Sequence type   
 purge sequences (highly recommended)

Oligomer counting mode

Oligomer lengths                  
Note: motifs can be larger than oligo sizes (oligos are used as seeds for building matrices)
 prevent overlapping matches
Count on   return reverse complements together in the output
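
The counting options above (overlapping matches, grouping reverse complements) can be sketched as follows. This is a simplified illustration under stated assumptions, not the tool's implementation: the function names and the `both_strands` and `no_overlap` parameters are hypothetical, only the alphabet A, C, G, T is handled, and overlap is checked only between occurrences that map to the same key.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(oligo):
    """Return the reverse complement of a DNA oligomer."""
    return oligo.translate(_COMPLEMENT)[::-1]


def count_oligos(sequences, k, both_strands=True, no_overlap=False):
    """Count k-mers over a collection of DNA sequences.

    both_strands: group each oligo with its reverse complement under
                  the alphabetically smaller of the two strings.
    no_overlap:   skip an occurrence that overlaps the previously
                  counted occurrence with the same key.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        last_end = {}  # key -> end position of the last counted match
        for i in range(len(seq) - k + 1):
            oligo = seq[i:i + k]
            if set(oligo) - set("ACGT"):
                continue  # ignore words containing N or other symbols
            if both_strands:
                key = min(oligo, reverse_complement(oligo))
            else:
                key = oligo
            if no_overlap and i < last_end.get(key, 0):
                continue
            counts[key] += 1
            last_end[key] = i + k
    return counts
```

For example, in the sequence `AAAAAA` the dinucleotide `AA` occurs at five overlapping positions, but only three of them are counted when overlapping matches are prevented. Self-overlapping words are the ones affected by this option.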

Background model 
Genome subset  Sequence type     
    [List of organisms] Not seeing your favorite organism in the list? Contact us to have it installed.

    Taxon  [List of taxonomy]

Estimate from input sequence
Markov model (higher order dependencies)  order  
Equiprobable residues (usually NOT recommended)
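
To make the difference between these background choices concrete, the sketch below estimates a Markov model from the input sequences and uses it to compute the expected probability of an oligomer, alongside the equiprobable alternative. It is a minimal illustration with hypothetical function names; it uses plain maximum-likelihood estimates without pseudo-counts, which is an assumption of this example and not a statement about how the tool estimates its models.

```python
from collections import Counter


def train_markov(sequences, order):
    """Count contexts of length `order` and the residues that follow them."""
    prefix_counts = Counter()
    transition_counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - order):
            word = seq[i:i + order + 1]
            transition_counts[word] += 1
            prefix_counts[word[:-1]] += 1
    return prefix_counts, transition_counts


def oligo_probability(oligo, order, prefix_counts, transition_counts):
    """Expected probability of `oligo` under the trained Markov model."""
    total = sum(prefix_counts.values())
    if total == 0:
        return 0.0
    # Probability of the first `order` residues (equals 1 when order is 0).
    prob = prefix_counts[oligo[:order]] / total
    # Multiply by P(residue | preceding `order` residues) for the rest.
    for i in range(order, len(oligo)):
        context = oligo[i - order:i]
        if prefix_counts[context] == 0:
            return 0.0
        prob *= transition_counts[context + oligo[i]] / prefix_counts[context]
    return prob


def equiprobable_probability(oligo, alphabet_size=4):
    """Expected probability when every residue is equally likely."""
    return (1.0 / alphabet_size) ** len(oligo)
```

With order 0 the model reduces to independent residue frequencies. Higher orders capture dependencies between adjacent residues, and the oligomer must be longer than the model order for the transition terms to contribute. The equiprobable model ignores composition entirely, which is one reason it can be a poor fit for compositionally biased sequences.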

Custom background model

  Create your own background model
Input your own background file  
File Upload        
URL of file available on a Web server.



One row per pattern
Return fields Lower
 Convert assembled patterns to matrices.
            Max pattern assemblies    Min site weight    Flanking residues    Matrix clustering 
Occurrence table: one row per sequence, one column per oligo (occurrence counts only, email output recommended)

Pattern count distributions, one row per pattern (occurrence counts only, email output recommended)