Identification
Volume: 28 | Article ID: art00009
Attacks on Speaker Identification Systems Constrained to Speech-to-Text Decoding
DOI: 10.2352/ISSN.2470-1173.2016.8.MWSF-073 | Published Online: February 2016
Abstract

Speech processing is used both to transcribe human speech to text and to identify speakers in biometric systems. Speaker verification requires robust algorithms that prevent an adversary from impersonating another speaker. Previous research has demonstrated that specially crafted additive noise can cause a speaker to be misclassified as a specific target. In this paper, we study whether targeted additive noise can thwart speaker verification without affecting speech-to-text decoding. Mel-frequency cepstral coefficients (MFCCs) and Gaussian mixture models (GMMs) are commonly used in both applications, as the feature encoding and the statistical model respectively. We attempt to induce a desired change in the probability of one speaker model used for speaker classification, while preserving the likelihood under another speech model used for speech decoding.
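The trade-off described in the abstract can be illustrated with a small numerical sketch. This is not the authors' method, which the abstract does not specify; it assumes diagonal-covariance GMMs, a perturbation applied directly to a single feature vector (standing in for an MFCC frame) rather than to the waveform, and a quadratic penalty that discourages any change in the speech-model log-likelihood. The names `gmm_loglik_and_grad` and `craft_perturbation`, the step size, the penalty weight, and the toy one-component models are all hypothetical.

```python
import numpy as np

def gmm_loglik_and_grad(x, weights, means, variances):
    """Return log p(x) under a diagonal-covariance GMM and its gradient in x."""
    diff = x - means                                    # shape (K, D)
    log_comp = (np.log(weights)
                - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
                - 0.5 * np.sum(diff ** 2 / variances, axis=1))
    peak = log_comp.max()                               # log-sum-exp for stability
    loglik = peak + np.log(np.sum(np.exp(log_comp - peak)))
    resp = np.exp(log_comp - loglik)                    # component responsibilities
    grad = -np.sum(resp[:, None] * diff / variances, axis=0)
    return loglik, grad

def craft_perturbation(x, speaker_gmm, speech_gmm,
                       step=0.05, penalty=1.0, iters=200):
    """Gradient ascent on: speaker log-lik - penalty * (speech log-lik change)^2."""
    base_speech, _ = gmm_loglik_and_grad(x, *speech_gmm)
    delta = np.zeros_like(x)
    for _ in range(iters):
        _, g_spk = gmm_loglik_and_grad(x + delta, *speaker_gmm)
        ll_sp, g_sp = gmm_loglik_and_grad(x + delta, *speech_gmm)
        # Raise the target-speaker likelihood; pull the speech likelihood
        # back toward its unperturbed value.
        delta += step * (g_spk - 2.0 * penalty * (ll_sp - base_speech) * g_sp)
    return delta

# Toy models (assumed values): each tuple is (weights, means, variances).
speaker_gmm = (np.array([1.0]), np.array([[2.0, 0.0]]), np.array([[1.0, 1.0]]))
speech_gmm = (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[1.0, 1.0]]))
x = np.array([-1.0, 1.0])                               # one "feature frame"
delta = craft_perturbation(x, speaker_gmm, speech_gmm)
```

In this toy setting the perturbed vector slides along a contour of roughly constant speech-model likelihood toward the target speaker's mean. Increasing `penalty` tightens the constraint on the speech model at the cost of a smaller gain for the speaker model.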

  Cite this article 

Alireza Farrokh Baroughi, Scott Craver, Daniel Douglas, "Attacks on Speaker Identification Systems Constrained to Speech-to-Text Decoding," in Proc. IS&T Int’l. Symp. on Electronic Imaging: Media Watermarking, Security, and Forensics, 2016, https://doi.org/10.2352/ISSN.2470-1173.2016.8.MWSF-073

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2016
Electronic Imaging
2470-1173
Society for Imaging Science and Technology