A Deep Learning Framework for Speech Emotion Recognition: A Gender-Aware Hierarchical Pipeline with Optimized 18-Layer Convolutional Neural Network

Savita Jain

doi:10.36948/ijfmr.2026.v08i03.80011


Author(s)	Dr. Savita Jain
Country	India
Abstract	The field of Affective Computing has emerged as a crucial domain in human-computer interaction, with Speech Emotion Recognition (SER) serving as a cornerstone for developing intuitive, context-aware systems. While traditional Automatic Speech Recognition (ASR) frameworks have achieved considerable maturity in decoding semantic content, recognizing the underlying emotional state from spoken language remains a computationally complex challenge. Real-world acoustic signals are heavily influenced by environmental noise, speaker idiosyncrasies, and physical variability across genders. This paper introduces a high-performance, structurally optimized hierarchical framework that addresses these limitations through three primary contributions: (1) a dense 182-feature extraction pipeline unifying spectral, linear predictive, dynamic energy, prosodic, and statistical shape profiles; (2) an early-stage, gender-aware hierarchical pipeline driven by a Gender Recognition (GR) circuit that splits the processing stream based on fundamental frequency distribution to eliminate cross-gender acoustic overlaps; and (3) a customized 18-layer Deep Convolutional Neural Network (CNN) integrated with meta-heuristic hyper-parameter optimization. The system is evaluated on the RAVDESS and SAVEE benchmark corpora, demonstrating superior multi-class emotion classification accuracy and operational efficiency compared to baseline Multi-Layer Perceptron (MLP) and Long Short-Term Memory (LSTM) architectures.
Keywords	Speech Emotion Recognition, Convolutional Neural Network, Gender Recognition Circuit, Affective Computing, RAVDESS, SAVEE, Feature Extraction, Deep Learning.
Field	Computer
Published In	Volume 8, Issue 3, May-June 2026
Published On	2026-05-30
DOI	https://doi.org/10.36948/ijfmr.2026.v08i03.80011
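
The gender-aware routing step described in the abstract can be illustrated with a minimal sketch. This is not the paper's GR circuit: it assumes a simple autocorrelation pitch estimate and a single hypothetical threshold of 165 Hz, and the names estimate_f0 and route_by_gender are introduced here for illustration only. The paper itself refers to the fundamental frequency distribution, which a real system would likely model over many voiced frames.

```python
import numpy as np


def estimate_f0(signal, sample_rate, fmin=50.0, fmax=400.0):
    """Estimate fundamental frequency (Hz) from the autocorrelation peak.

    Assumption: one voiced frame, searched within [fmin, fmax].
    """
    signal = signal - np.mean(signal)
    # Full autocorrelation; keep only non-negative lags.
    acf = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag


def route_by_gender(f0_hz, threshold_hz=165.0):
    """Pick a processing branch from F0 (threshold is a hypothetical value)."""
    return "female_branch" if f0_hz >= threshold_hz else "male_branch"


# Synthetic demo: a 120 Hz tone stands in for a low-pitched voice.
sample_rate = 16000
t = np.arange(1600) / sample_rate
low_voice = np.sin(2 * np.pi * 120.0 * t)
print(route_by_gender(estimate_f0(low_voice, sample_rate)))  # male_branch
```

Each branch would then feed its own emotion classifier, so that the downstream model never has to separate pitch differences caused by gender from those caused by emotion.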


E-ISSN 2582-2160



All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.



International Journal For Multidisciplinary Research

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal
