Dataset on Enhancing Sociopragmatic Competence through AI-Driven Tools: Insights from Real-World Language Interactions
Description
This dataset originates from a study titled “Enhancing Sociopragmatic Competence through AI-Driven YouTube Discourse Analysis: A Data-Driven Tool for Real-World Language Interaction Skills.” The study investigates how AI-based tools can enhance learners’ sociopragmatic competence by analyzing authentic discourse from YouTube. Using a mixed-methods approach, the research focuses on four sociopragmatic skills: politeness strategies, turn-taking, face-saving tactics, and conversational implicature. The dataset includes quantitative data from pre- and post-intervention tests, task-based evaluations, and survey responses, as well as qualitative data from semi-structured interviews. It reflects learners’ experiences using AI tools to analyze and practice sociopragmatic strategies in real-world language interactions. The study reports significant improvements in learners’ sociopragmatic awareness and competence, which it presents as evidence of AI’s potential to bridge theoretical knowledge and practical application in language learning. The dataset is intended as a resource for educators, researchers, and developers seeking to use AI in language education.
Steps to reproduce
To reproduce the results of this study, researchers should first obtain ethical approval from an institutional review board. They should then recruit 100 participants with advanced language proficiency, ensuring demographic balance and familiarity with digital tools. Participant consent must be obtained before any involvement in the study.
The next step is data preparation: collect discourse samples from YouTube videos relevant to sociopragmatic themes such as politeness strategies, turn-taking mechanisms, and conversational implicature. Transcribe these videos with an AI transcription tool to facilitate textual analysis, and configure the AI-driven analysis platform to detect sociopragmatic elements in the transcriptions.
Participants are then divided into two groups: an experimental group, which uses the AI tool for sociopragmatic analysis, and a control group, which engages in traditional text-based exercises covering similar sociopragmatic content. Both groups complete tasks designed to analyze politeness strategies, face-saving tactics, conversational implicature, and turn-taking behaviors. These tasks include individual and group discussions to ensure a comprehensive evaluation of sociopragmatic competence.
Quantitative data is collected through pre- and post-tests administered to both groups, assessing their ability to recognize and apply sociopragmatic elements. Qualitative feedback is gathered through surveys and semi-structured interviews to understand participants’ experiences with the AI tool.
Data analysis involves paired t-tests (pre-test versus post-test scores within a group) and independent t-tests (experimental versus control group) to measure improvements in sociopragmatic competence, together with thematic coding of the qualitative feedback to identify patterns in participant perceptions. Throughout the process, the study adheres to strict ethical standards, ensuring participant confidentiality and data security.
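The quantitative analysis step can be illustrated with a minimal sketch, assuming test scores are available as plain lists of numbers. The function names and the sample scores below are hypothetical and are not taken from the dataset; the independent test shown is the classic pooled-variance (equal-variance) form, which is an assumption, since the study does not state which variant was used. Only the t statistic and degrees of freedom are computed; a p-value would additionally require a t-distribution table or a statistics package.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired t-test on post - pre score differences."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same number of scores")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    # Standard error of the mean difference uses the sample standard deviation.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent t-test (assumes equal variances)."""
    na, nb = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, na + nb - 2


if __name__ == "__main__":
    # Hypothetical scores for illustration only; not values from the dataset.
    experimental_pre = [55, 60, 62, 58, 65]
    experimental_post = [68, 71, 75, 66, 80]
    control_post = [60, 63, 64, 59, 67]

    t_paired, df_paired = paired_t(experimental_pre, experimental_post)
    t_indep, df_indep = independent_t(experimental_post, control_post)
    print(f"paired: t={t_paired:.3f}, df={df_paired}")
    print(f"independent: t={t_indep:.3f}, df={df_indep}")
```

In practice the same two comparisons would be run on the actual pre- and post-test columns, with the paired test applied within each group and the independent test applied between groups.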
Finally, researchers should document their findings comprehensively, including both raw and processed data, so that the results can be reproduced. Detailed instructions and analysis templates should be provided to facilitate replication in different contexts or with similar tools.