
Near Real-Time Automatic Speaker Recognition for Voice-Based Interfaces

EasyChair Preprint no. 14093

14 pages. Date: July 23, 2024

Abstract

Demand for efficient and secure voice-based interfaces has surged in recent years, driven by the proliferation of smart devices and the need for hands-free interaction. This paper presents a novel approach to near real-time automatic speaker recognition aimed at improving the security and usability of such interfaces. The system combines machine learning algorithms with robust feature extraction to achieve high accuracy in both speaker identification and speaker verification. A lightweight deep neural network (DNN) architecture processes voice input with minimal latency, making it suitable for real-time applications. The proposed method combines mel-frequency cepstral coefficients (MFCCs), voice activity detection (VAD), and speaker embeddings to build a distinctive speaker profile. Experimental results indicate that the system is effective in diverse acoustic environments and resilient to common challenges such as background noise and voice mimicry. Evaluated on a publicly available dataset, the implementation achieves an average identification accuracy of 98.2% and a verification equal error rate (EER) of 1.5%. These results underscore the potential of near real-time speaker recognition for user authentication and personalization in voice-activated applications, paving the way for more secure and intuitive human-computer interaction.
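For readers unfamiliar with the verification metric quoted above, the following is a minimal sketch of how an equal error rate can be estimated from trial scores. It is not the paper's evaluation code: the function name `equal_error_rate`, the threshold sweep over observed scores, and the example scores are all assumptions made for illustration. It assumes similarity scores where a higher value means the two utterances are more likely from the same speaker.

```python
from typing import List, Tuple


def equal_error_rate(genuine: List[float], impostor: List[float]) -> Tuple[float, float]:
    """Estimate (eer, threshold) from similarity scores (higher = same speaker).

    Hypothetical helper for illustration; not taken from the paper.
    """
    best_gap = float("inf")
    eer = 1.0
    best_threshold = 0.0
    # Treat every observed score as a candidate decision threshold.
    for threshold in sorted(set(genuine) | set(impostor)):
        # False rejection: a genuine trial scores below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        # False acceptance: an impostor trial scores at or above the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2.0
            best_threshold = threshold
    return eer, best_threshold


# Made-up scores for illustration only; these are not the paper's data.
genuine_scores = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor_scores = [0.12, 0.35, 0.70, 0.41, 0.22]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.3f} at threshold {threshold:.2f}")
```

On this toy data the false acceptance and false rejection rates meet at 20%, because one impostor score (0.70) exceeds one genuine score (0.66). An EER of 1.5%, as reported in the abstract, would correspond to far fewer such overlaps across a much larger set of trials.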

Keyphrases: automatic speaker recognition, feature extraction, mel-frequency cepstral coefficients, near real-time processing, speaker identification, speaker verification

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:14093,
  author = {Kayode Sheriffdeen},
  title = {Near Real-Time Automatic Speaker Recognition for Voice-Based Interfaces},
  howpublished = {EasyChair Preprint no. 14093},
  year = {EasyChair, 2024}}