Age Estimation in Short Speech Utterances Based on Bidirectional Gated-Recurrent Neural Networks

Authors

  • Ameer A. Badr, College of Managerial and Financial Sciences
  • Alia K. Abdul Hassan, Department of Computer Science

DOI:

https://doi.org/10.30684/etj.v39i1B.1905

Abstract

Age estimation from speech has recently received growing interest because it is important for many applications, such as custom call routing, targeted marketing, and user profiling. In this work, an automatic, text-independent system for estimating speaker age from short speech utterances is proposed. Four groups of features are extracted from each utterance frame, and 10 statistical functionals are then computed for each extracted feature dimension, followed by dimensionality reduction using Linear Discriminant Analysis (LDA). Finally, bidirectional Gated-Recurrent Neural Networks (G-RNNs) are used to predict speaker age. Experiments are conducted on the VoxCeleb1 dataset to evaluate the proposed system; this is the first attempt to use this dataset for such a system. In the gender-dependent system, the Mean Absolute Error (MAE) is 9.25 years for female speakers and 10.33 years for male speakers, and the Root Mean Square Error (RMSE) is 13.17 and 13.26 years, respectively. In the gender-independent system, the MAE is 10.96 years and the RMSE is 15.47 years. The results show that the proposed system performs well on short-duration utterances, considering the high level of noise in the VoxCeleb1 dataset.
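
The statistical-functional step and the two reported error measures can be illustrated with a minimal sketch. The abstract does not list which 10 functionals are used, so the set below (mean, standard deviation, minimum, maximum, range, median, quartiles, skewness, kurtosis) is an assumption chosen only for illustration, and the function names `functionals`, `mae`, and `rmse` are hypothetical. The feature extraction, LDA, and G-RNN stages are not shown.

```python
import numpy as np


def functionals(frames: np.ndarray) -> np.ndarray:
    """Collapse a (n_frames, n_dims) feature matrix into one fixed-length vector.

    ASSUMPTION: the 10 statistics below are illustrative; the abstract
    does not say which functionals the authors used.
    """
    mean = frames.mean(axis=0)
    std = frames.std(axis=0)
    centered = frames - mean
    # Avoid division by zero for constant dimensions.
    safe_std = np.where(std > 0, std, 1.0)
    skew = (centered ** 3).mean(axis=0) / safe_std ** 3
    kurt = (centered ** 4).mean(axis=0) / safe_std ** 4
    q25, median, q75 = np.percentile(frames, [25, 50, 75], axis=0)
    lo, hi = frames.min(axis=0), frames.max(axis=0)
    stats = np.stack([mean, std, lo, hi, hi - lo, median, q25, q75, skew, kurt])
    # Result holds 10 values per feature dimension, grouped by dimension.
    return stats.T.reshape(-1)


def mae(y_true, y_pred) -> float:
    """Mean Absolute Error, in the same unit as the targets (years)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))


def rmse(y_true, y_pred) -> float:
    """Root Mean Square Error, in the same unit as the targets (years)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Because the functionals summarize over the frame axis, utterances of different lengths map to vectors of the same size, which is what makes a subsequent fixed-dimension step such as LDA applicable.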

Published

2021-03-25

How to Cite

Badr, A. A., & Abdul Hassan, A. K. (2021). Age Estimation in Short Speech Utterances Based on Bidirectional Gated-Recurrent Neural Networks. Engineering and Technology Journal, 39(1B), 129-140. https://doi.org/10.30684/etj.v39i1B.1905