Methods for detecting fake voice signals

Dashinimaeva V. T., Borshevnikov A. E.

Received: 04.05.2025

The article reviews approaches to generating and detecting audio deepfakes, with particular attention to acoustic signal preprocessing, extraction of voice signal parameters, and data classification. Three groups of classifiers are examined: support vector machines (SVM), k-nearest neighbors (KNN), and neural networks. For each group, the most effective methods are identified and compared. Two approaches stand out for accuracy and reliability: a detector based on temporal convolutional networks that analyzes mel-frequency cepstral coefficient (MFCC) cepstrograms achieved an equal error rate (EER) of 0.07%, while a support vector machine with a radial basis function kernel reached an EER of 0.5%. On the ASVspoof 2021 dataset, the SVM-based method additionally showed Accuracy = 99.6%, F1-score = 0.997, Precision = 0.998, and Recall = 0.994.

Keywords: audio deepfakes, preprocessing of acoustic signals, support vector machine, k-nearest neighbors, neural networks, temporal convolutional networks, deepfake detection
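
For readers unfamiliar with the metrics quoted in the abstract, the following minimal sketch shows how EER, Accuracy, Precision, Recall, and F1-score can be computed from detector outputs. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `equal_error_rate` and `classification_metrics` are hypothetical, label 1 is assumed to mean bona fide speech and label 0 spoofed speech, a higher score is assumed to mean "more likely bona fide", and the EER is approximated by a simple threshold sweep (evaluation toolkits commonly interpolate between operating points instead, so values may differ slightly).

```python
def equal_error_rate(scores, labels):
    """Approximate the equal error rate by sweeping a decision threshold.

    Assumptions (not taken from the article): labels use 1 for bona fide
    and 0 for spoofed speech; a higher score means "more likely bona fide".
    """
    genuine = [s for s, y in zip(scores, labels) if y == 1]
    spoofed = [s for s, y in zip(scores, labels) if y == 0]
    if not genuine or not spoofed:
        raise ValueError("need at least one sample of each class")

    # Candidate thresholds: every observed score, plus one above the maximum.
    thresholds = sorted(set(scores)) + [max(scores) + 1.0]

    best_gap, eer = None, None
    for t in thresholds:
        # False acceptance rate: spoofed samples accepted as bona fide.
        far = sum(s >= t for s in spoofed) / len(spoofed)
        # False rejection rate: bona fide samples rejected as spoofed.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of FAR and FRR where they are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy data for illustration only; not results from the article.
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    labels = [1, 1, 1, 0, 1, 0, 0, 0]
    print("EER:", equal_error_rate(scores, labels))
    predictions = [1 if s >= 0.5 else 0 for s in scores]
    print(classification_metrics(labels, predictions))
```

EER is threshold-independent, whereas Accuracy, Precision, Recall, and F1-score all depend on the chosen decision threshold, so the two kinds of figures in the abstract describe different aspects of detector quality.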