Emotion Recognition Using MFCC, Spectrograms and Deep Neural Networks

Almotaz Bellah Fadel; Dima Al-Chawafea

Authors

Almotaz Bellah Fadel, Master's student, Computer Engineering, Faculty of Electrical and Electronic Engineering, Aleppo University, Aleppo, Syria
Dima Al-Chawafea, Assistant Professor, Computer Engineering, Faculty of Electrical and Electronic Engineering, Aleppo University, Aleppo, Syria

Keywords:

Deep Learning, Emotion Recognition, Feature Extraction, Spectrograms, Neural Networks.

Abstract

Speech recognition technologies are essential modern tools, and existing systems differ mainly in how they extract features and how they classify them. This research examines several feature extraction approaches for recognizing emotion from speech and highlights the potential benefit of combining them to improve accuracy.

The proposed method was developed in three stages, each using a distinct feature representation. The first stage used MFCC (Mel-Frequency Cepstral Coefficients) and achieved 78.62% accuracy. However, because MFCC extraction segments the signal into short frames, MFCC alone may lose important temporal and visual cues, which limits emotion detection.
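As a rough illustration of the frame-based processing described above, the following sketch computes MFCCs with NumPy only: framing, windowing, power spectrum, mel filterbank, log compression, and a DCT. The function names and all parameter values (`n_fft=512`, `hop=256`, 26 filters, 13 coefficients, a 16 kHz synthetic tone) are assumptions chosen for illustration; the abstract does not state the settings or tooling used in the study.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    hz_pts = mel_to_hz(mel_pts)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mfcc(signal, sr, n_fft=512, hop=256, n_filters=26, n_coeffs=13):
    """Return an array of shape (n_frames, n_coeffs)."""
    # Split the signal into short overlapping frames (assumes len(signal) >= n_fft).
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    # Power spectrum of each frame, then mel-band energies.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sr).T
    log_mel = np.log(mel_energy + 1e-10)
    # Unnormalized DCT-II over the mel bands gives the cepstral coefficients.
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return log_mel @ basis.T

# Synthetic one-second 440 Hz tone standing in for a speech clip.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
feats = mfcc(tone, sr)
print(feats.shape)  # (61, 13)
```

Each row describes one short frame in isolation, which is why a purely frame-level representation can miss longer-range temporal structure.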

In the second stage, spectrograms were used; by preserving the distribution of energy across frequencies, they improved emotion recognition and raised accuracy to 93.20%. The third stage applied feature-level fusion, combining the MFCC and spectrogram outputs. This hybrid model, evaluated with a Random Forest classifier, reached 97% accuracy, with F1-score, precision, and recall also at 97%.
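One common way to realize feature-level fusion is to concatenate the per-clip feature vectors from each branch and train a single classifier on the result. The sketch below does this with scikit-learn's `RandomForestClassifier`. It is a minimal illustration under stated assumptions: the abstract does not say how the two feature sets are combined, so concatenation is assumed here, and the feature arrays are random placeholder data (not the study's data), so the printed scores say nothing about the reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_clips, n_classes = 200, 4  # hypothetical dataset size and emotion count
labels = rng.integers(0, n_classes, size=n_clips)

# Placeholder per-clip features: random noise shifted by class so the
# toy problem is learnable. Dimensions (13 and 64) are assumptions.
mfcc_feats = rng.normal(size=(n_clips, 13)) + labels[:, None]
spec_feats = rng.normal(size=(n_clips, 64)) + 0.5 * labels[:, None]

# Feature-level fusion (assumed form): concatenate along the feature axis.
fused = np.concatenate([mfcc_feats, spec_feats], axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    fused, labels, test_size=0.25, random_state=0, stratify=labels
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# The four metrics named in the abstract, macro-averaged over classes.
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Macro averaging is one reasonable choice when emotion classes are roughly balanced; the averaging scheme used in the study is not stated in the abstract.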

The results show that fusing acoustic and visual representations significantly improves performance over either individual model: 97% for the fused model, compared with 78.62% for MFCC alone and 93.20% for spectrograms alone. The proposed approach is therefore the most effective of the three for recognizing emotion from speech.
