A new method for identification of pre-microRNAs based on hybrid features

EasyChair Preprint no. 201

14 pages. Date: May 31, 2018



The identification of pre-microRNAs (precursor microRNAs) helps us
to understand the regulatory mechanism of biological processes.
Currently, machine learning is the most popular method for
pre-microRNA identification. However, most methods mainly focus on
secondary structure information of pre-microRNA, while ignoring
sequence-order information and sequence evolution information.

In this work, we use three different methods to
extract features of the pre-microRNAs at different levels. We first
extract features from PSI-BLAST profiles and Hilbert-Huang
transform, which contain rich sequence evolution information and
sequence-order information respectively. We then get properties of
small molecular networks of pre-microRNAs, which contain refined
secondary structure information. We extract 591 features in total.
After extraction, we use support vector machine (SVM) as our
classifier, and use the maximum relevance and minimum redundancy
(mRMR) method for feature selection. Finally, we construct a new
predictor MicroRNA-NHPred by using the optimal feature set.
The performance of MicroRNA-NHPred is quite promising
compared to other popular miRNA predictors. It achieves an accuracy
of up to 94.83%.
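
As a rough illustration of the feature-selection step, the sketch below implements a greedy mRMR ranking on small, already discretized toy features. It is not the authors' implementation: the function names (`mutual_information`, `mrmr_select`), the difference criterion (relevance minus mean redundancy), and the toy data are all assumptions made for this example, and the SVM training step is left out.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(columns, labels, k):
    """Greedy mRMR: repeatedly pick the feature with the highest
    relevance to the labels minus mean redundancy with chosen features."""
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = list(range(len(columns)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: feature 1 duplicates feature 0, feature 2 is weaker
# but carries mostly different information.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 1, 0, 1],
]
print(mrmr_select(columns, labels, 2))  # [0, 2]
```

On this toy input the duplicate feature is skipped in favor of the less relevant but less redundant one, which is the behavior mRMR is meant to provide before the selected subset is passed to a classifier such as an SVM.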

The higher prediction accuracy achieved by our proposed method is
attributed to the design of a comprehensive feature set covering both
sequence and secondary structure, which is capable of
characterizing sequence evolution information, sequence-order
information, and the global and local information of pre-microRNA
secondary structure. It is therefore a valuable method for
pre-microRNA identification.

Keyphrases: Hilbert-Huang Transform, mRMR, network, pre-microRNA, PSI-BLAST profiles, SVM

