A Dataset for Categorizing Phrases and Sentences as Statements, Questions, or Exclamations Depending on Sound Pitch

Published: 25 February 2025 | Version 1 | DOI: 10.17632/267dh85vpw.1
Contributors:
Ayub Abdulrahman

Description

Speech is one of humanity's most natural and efficient forms of communication and an essential channel for exchanging ideas and knowledge. Researchers are working to improve collaboration between humans and computers, and Natural Language Processing (NLP) is fundamental to that effort. NLP is an active area of artificial intelligence that enables computers to understand and respond to human language, supporting more natural human-machine interaction. Recent developments in machine learning, particularly deep learning, have transformed sound recognition systems. These methods are effective at extracting complex, high-level features from raw audio data, allowing algorithms to detect subtle differences between utterances.

Pitch is one of the most important acoustic characteristics. Variations in pitch help categorisation algorithms differentiate types of spoken content, and its importance is particularly noticeable when classifying speech into specific categories. Age, gender, and individual vocal variation can affect the accuracy of speech recognition, so models must adapt to these differences. Utterances are typically classified as statements, questions, exclamations, and occasionally commands. In some languages, such as Kurdish, identical words or phrases can carry different meanings depending solely on the tone and pitch used. Consequently, the pitch of a spoken phrase is crucial in communicating its intended message.

Considering these challenges and opportunities, this dataset is intended for classifying phrases and sentences as statements, questions, or exclamations according to their pitch. Audio samples were obtained from various sources to cover a wide range of pitch patterns and speech styles. Every sample is labelled with its communicative purpose, providing a resource for training and evaluating speech recognition models. The dataset supports the development of customised NLP applications and helps clarify how acoustic features affect language understanding.
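As an illustration of how pitch can be turned into a feature for this kind of classification, the following is a minimal sketch, not the method used to build or label this dataset. It estimates a pitch contour by frame-wise autocorrelation (a common textbook technique chosen here for simplicity) and measures how much the pitch rises towards the end of an utterance. The function names, the 75-400 Hz search range, and the frame sizes are all assumptions made for the example, and treating a final rise as a cue for a question is itself an assumption that may not hold for every language or recording in the dataset.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame by autocorrelation.

    Returns None if no lag in the search range has positive correlation
    (for example, a silent frame). The fmin/fmax bounds are assumed values.
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


def pitch_contour(samples, sample_rate, frame_len=400, hop=400):
    """Return per-frame pitch estimates in Hz, skipping frames with no pitch."""
    contour = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        pitch = estimate_pitch_hz(samples[start:start + frame_len], sample_rate)
        if pitch is not None:
            contour.append(pitch)
    return contour


def final_rise_hz(contour):
    """Mean pitch of the second half of the contour minus that of the first half."""
    half = len(contour) // 2
    first, second = contour[:half], contour[half:]
    return sum(second) / len(second) - sum(first) / len(first)


if __name__ == "__main__":
    # Synthetic stand-in for an utterance: a low tone followed by a higher one.
    sr = 8000
    low = [math.sin(2 * math.pi * 120 * i / sr) for i in range(1600)]
    high = [math.sin(2 * math.pi * 180 * i / sr) for i in range(1600)]
    contour = pitch_contour(low + high, sr)
    print("contour (Hz):", [round(p, 1) for p in contour])
    print("final rise (Hz):", round(final_rise_hz(contour), 1))
```

On real recordings, a value such as the final rise would be only one of several pitch-derived features passed to a classifier, and a dedicated pitch tracker would normally replace this simple estimator.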

Institutions

University of Halabja

Categories

Linguistics, Computer Science, Natural Language Processing, Machine Learning

Licence