Detection of exceptional genomic words: a comparison between species

EasyChair Preprint 63

12 pages. Date: April 15, 2018

Abstract

In this study we explore the potential of inter-word distances to detect exceptional genomic words (oligonucleotides) in several species, using whole-genome analysis. We compare the empirical results obtained from the complete genomes with the corresponding results obtained from a random background. We develop a procedure, based on statistical properties of the global distance distributions in DNA sequences, to discriminate words with an exceptional inter-word distance distribution and to identify distances with an exceptional frequency of occurrence. We identify the statistically exceptional words in whole genomes, i.e., words with unexpected inter-word distance distributions, and we suggest species signatures based on exceptional word profiles.
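
As an illustration of the quantity the abstract refers to, the following minimal Python sketch computes an empirical inter-word distance distribution for one word in one sequence. It assumes that an inter-word distance is the difference between the start positions of consecutive occurrences of the word, with overlapping occurrences counted; this page does not give the paper's own definition, so treat that as an assumption. The function names `inter_word_distances` and `distance_distribution` are hypothetical, and the sketch does not implement the paper's statistical procedure or random background model.

```python
from collections import Counter


def inter_word_distances(sequence, word):
    """Distances between start positions of consecutive occurrences of `word`.

    Assumption: overlapping occurrences count, and the distance is measured
    start-to-start. The paper may define it differently.
    """
    positions = []
    start = sequence.find(word)
    while start != -1:
        positions.append(start)
        # Advance by one character so overlapping occurrences are found.
        start = sequence.find(word, start + 1)
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_distribution(sequence, word):
    """Relative frequency of each observed inter-word distance."""
    distances = inter_word_distances(sequence, word)
    if not distances:
        return {}
    counts = Counter(distances)
    total = len(distances)
    return {d: counts[d] / total for d in sorted(counts)}


if __name__ == "__main__":
    toy_sequence = "ACGACGTTACG"  # toy example, not real genomic data
    print(inter_word_distances(toy_sequence, "ACG"))
    print(distance_distribution(toy_sequence, "ACG"))
```

A distribution computed this way for a real genome could then be compared with the one expected under a chosen random background, for example with a goodness-of-fit measure, which is the kind of comparison the abstract describes.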

Keyphrases: DNA sequence, Inter-oligonucleotide distances, exceptional genomic word, goodness-of-fit, stochastic model

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:63,
  author    = {Ana Tavares and João Rodrigues and Carlos Bastos and Armando Pinho and Paulo Ferreira and Paula Brito and Vera Afreixo},
  title     = {Detection of exceptional genomic words: a comparison between species},
  doi       = {10.29007/jvg4},
  howpublished = {EasyChair Preprint 63},
  year      = {EasyChair, 2018}}