Green Tourism Corpus Dataset

Published: 6 June 2025 | Version 1 | DOI: 10.17632/jfngz4t9kr.1
Contributors:
Ikmi Nur Oktavianti

Description

English for Tourism (EFT) is a specialized course designed to teach English to tourism professionals and practitioners. As green, or sustainable, tourism has recently become prominent, the EFT course should prepare students to be aware of and familiar with the campaign. One way to do so is to create teaching materials relevant to the issue. In designing teaching materials, big data, or a corpus, is essential to support the authenticity of the materials. Thus, this study aims to develop lists for a corpus-informed EFT module that raises awareness of green tourism. The study used a corpus approach to generate a word list and lists of formulaic language, which will later inform the development of the EFT module. To create the word list, the study applied several filters, e.g., frequency, range, and lexical profiling, with the assistance of corpus tools such as AntConc, LancsBox, and AntWordProfiler. The identification of formulaic language focused on collocations and 3- and 4-gram lexical bundles; the collocations were measured using the MI score, and the lexical bundles were identified using frequency and dispersion. The resulting Green Tourism Word List contains 299 words for teaching EFT in response to the challenge of environmental issues. In addition, the lists of collocations and lexical bundles can help learners strengthen their mastery of vocabulary, especially regarding sustainable tourism.
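
The filters named above (frequency, range, MI score, and n-gram lexical bundles) can be illustrated with a minimal Python sketch. It is not the study's pipeline, which relied on AntConc, LancsBox, and AntWordProfiler. The toy corpus, the function names, the thresholds, and the adjacent-pair MI formula log2(O * N / (f(node) * f(collocate))) are all assumptions made for illustration, and range (the number of texts in which an item occurs) stands in here as a simple proxy for dispersion.

```python
import math
from collections import Counter

# Toy corpus: each inner list is one tokenized text.
# Hypothetical data for illustration, not taken from the dataset.
texts = [
    "eco friendly hotels reduce carbon footprint".split(),
    "tourists choose eco friendly hotels".split(),
    "local guides reduce carbon footprint".split(),
]


def word_frequency_and_range(texts):
    """Return (frequency, range) counters; range = number of texts containing the word."""
    freq, rng = Counter(), Counter()
    for tokens in texts:
        freq.update(tokens)
        rng.update(set(tokens))
    return freq, rng


def mi_score(texts, node, collocate):
    """Simplified MI for adjacent pairs: log2(O * N / (f(node) * f(collocate))).

    O counts occurrences of `node` immediately followed by `collocate`;
    N is the corpus size in tokens. Corpus tools typically use a wider
    collocation window, so their values can differ.
    """
    n_tokens = sum(len(tokens) for tokens in texts)
    freq = Counter(tok for tokens in texts for tok in tokens)
    observed = sum(
        1
        for tokens in texts
        for a, b in zip(tokens, tokens[1:])
        if (a, b) == (node, collocate)
    )
    if observed == 0:
        return float("-inf")
    return math.log2(observed * n_tokens / (freq[node] * freq[collocate]))


def lexical_bundles(texts, n, min_freq, min_range):
    """Return n-grams meeting both thresholds as {ngram: (frequency, range)}."""
    freq, rng = Counter(), Counter()
    for tokens in texts:
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        freq.update(grams)
        rng.update(set(grams))
    return {
        gram: (freq[gram], rng[gram])
        for gram in freq
        if freq[gram] >= min_freq and rng[gram] >= min_range
    }


freq, rng = word_frequency_and_range(texts)
print(freq["eco"], rng["eco"])
print(mi_score(texts, "eco", "friendly"))
print(lexical_bundles(texts, n=3, min_freq=2, min_range=2))
```

In this sketch an item must pass both a frequency and a range threshold before it is listed, which keeps words or bundles that are frequent in only one text out of the final lists.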

Institutions

Universitas Ahmad Dahlan

Categories

Corpus Linguistics

Funding

Ministry of Education, Culture, Research and Technology, Indonesia

107/E5/PG.02.00.PL/2024, 0609.12/LL5-INT/AL.04/2024, 067/PFR/LPPM UAD/VI/2024

Licence