Facebook bAbI Tasks for Malayalam Language

Published: 1 November 2024 | Version 1 | DOI: 10.17632/h26g4n9w5j.1
Contributors:
Bibin P A

Description

A Malayalam question answering dataset of 5,000 training samples and 5,000 testing samples was generated by translating the Facebook bAbI tasks. The bAbI tasks were originally created in English and have since been translated into other languages, including French, German, Hindi, Chinese, and Russian. The bAbI collection comprises twenty synthetic tasks that test a system's capacity for text comprehension and reasoning, and the questions range in difficulty.

For the proposed work, we translated five of the English bAbI tasks (tasks 1, 4, 11, 12, and 13) into Malayalam. They have comparable sentence patterns and are represented here as tasks 1 through 5, titled "Single Supporting Fact," "Two Argument Relations," "Basic Coreference," "Conjunction," and "Compound Coreference." Every task has 1,000 training samples and 1,000 test samples. Every sample comprises a series of statements (sometimes called a story) about people's movements around places and objects, a question, and a suitable answer.

Tasks:

Task 1: Single Supporting Fact. This task tests whether a model can identify a single important fact from a story to answer a question. The story usually contains several sentences, but only one sentence is directly useful in answering the question.

Task 2: Two Argument Relations. This task involves understanding the relationship between two entities. The model must infer relationships between pairs of objects, people, or places.

Task 3: Basic Coreference. Coreference resolution is the task of linking pronouns or phrases to the correct entities. In this task, the model must resolve simple pronominal references.

Task 4: Conjunction. This task tests the model's ability to understand sentences in which several actions or facts are joined by conjunctions such as "and" or "or". The model must process these linked statements to answer the questions correctly.

Task 5: Compound Coreference. This task is more complex because the model must resolve references to compound entities, for example a pronoun that refers to several people joined by a conjunction.
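The story/question/answer structure described above can be sketched as a small parser. This is only an illustration: it assumes the files keep the plain-text layout of the original English bAbI release (a line id, then either a statement or a tab-separated question, answer, and supporting line ids), which this record does not confirm. The names `Sample` and `parse_babi` are hypothetical, and the example lines are illustrative English text in the original bAbI style, not lines taken from the Malayalam files.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    story: List[str]       # statements seen before the question
    question: str
    answer: str
    supporting: List[int]  # line ids of the supporting statements


def parse_babi(lines):
    """Parse lines in the (assumed) original bAbI text layout into samples."""
    samples, story = [], {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        idx_text, text = raw.split(" ", 1)
        idx = int(idx_text)
        if idx == 1:
            story = {}     # line id 1 marks the start of a new story
        if "\t" in text:
            # Question line: question <TAB> answer <TAB> supporting ids
            question, answer, support = text.split("\t")
            samples.append(Sample(
                story=[story[i] for i in sorted(story)],
                question=question.strip(),
                answer=answer.strip(),
                supporting=[int(s) for s in support.split()],
            ))
        else:
            story[idx] = text
    return samples


# Illustrative English lines in the original bAbI style.
example = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary?\tbathroom\t1",
]
samples = parse_babi(example)
print(samples[0].question, "->", samples[0].answer)
```

If the Malayalam files use a different layout (for example a spreadsheet or JSON export), only the line-splitting logic would need to change; the `Sample` record itself follows the description above.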

Files

Steps to reproduce

Researchers have found that Malayalam question answering systems are difficult to use because of the fusional character of the language. There is a need for a method that provides precise responses to queries based on domain-specific documents. However, the existing techniques are not scalable and cannot handle the tremendous complexity of diverse users' texting or commenting styles. The existing systems also do not support real-time question answering with additional lexical resources and disambiguation methods.

The main goal of this study is to create an advanced Malayalam Question Answering System using a Deep Learning (DL) hybrid model that combines Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (Bi-LSTM) techniques. By addressing linguistic issues characteristic of Malayalam, such as agglutinative morphology, non-specific word order, and rich inflections, the study aims to develop an effective and accurate system capable of comprehending, processing, and producing contextually relevant responses to questions posed in Malayalam.

The proposed system passes an input query such as കിടപ്പുമുറിയുടെ കിഴക്ക് എന്താണ്? (kidappumuriyude kizhakku enthaanu?, "What is east of the bedroom?") and an input passage to Word2Vec, which converts the input into vectors using the Skip-gram method, trained on the dataset with varied vector lengths. The word vectors are then given to the CNN, which consists of a convolutional layer, a max-pooling layer, a dropout layer, and a fully connected layer. Every input consists of a series of vectors, which the convolutional layer filters using filters of a defined length. Each filter uses the Rectified Linear Unit (ReLU) function to extract different features from a query and represent them in a feature map.
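As a rough illustration of the Skip-gram idea mentioned above, the sketch below generates the (center word, context word) training pairs that a Skip-gram model learns from. It is not the authors' code: the function name `skipgram_pairs`, the whitespace tokenization of the example query, and the window size are all assumptions made for the example.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within the window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# The example query, split on whitespace with the question mark removed
# (an assumed tokenization, not necessarily the one used in the study).
query = ["കിടപ്പുമുറിയുടെ", "കിഴക്ക്", "എന്താണ്"]
pairs = skipgram_pairs(query, window=1)
for center, context in pairs:
    print(center, "->", context)
```

In a full Word2Vec setup, pairs like these drive the training of one embedding vector per word; the vector length is the setting the description says was varied.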
The feature map for each word is created in the convolutional layer, and the most important features produced by each filter are selected by the max-pooling layer. A dropout layer with a dropout rate of 0.5 is used to lessen overfitting, and its result is passed on to the max-pooling level for full filtering. A Bi-LSTM, which comprises two LSTMs with weighted outputs, connects the dropout layer and the fully connected layer. The softmax output sits at the end of the network, on top of a fully connected layer. Finally, the correct answer to the given question, അടുക്കള (adukkala, "kitchen"), is predicted by the fully connected layer of the CNN.
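The convolution, ReLU, max-pooling, and softmax steps described above can be illustrated with a small pure-Python sketch. It is a toy under stated assumptions, not the published model: the vectors, the single filter, and the two output scores are made-up values, and the dropout and Bi-LSTM stages are deliberately left out to keep the example short.

```python
import math


def conv1d_relu(vectors, kernel, bias=0.0):
    """Slide one filter over a sequence of word vectors and apply ReLU."""
    width = len(kernel)
    feature_map = []
    for start in range(len(vectors) - width + 1):
        total = bias
        for k in range(width):
            total += sum(w * x for w, x in zip(kernel[k], vectors[start + k]))
        feature_map.append(max(0.0, total))  # ReLU
    return feature_map


def max_pool(feature_map):
    """Keep the strongest response produced by the filter."""
    return max(feature_map)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 2-dimensional "word vectors" for a 4-word input (made-up values).
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, -1.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # one filter spanning 2 words

feature_map = conv1d_relu(vectors, kernel)
pooled = max_pool(feature_map)
probs = softmax([pooled, 0.0])  # two hypothetical answer classes
print(feature_map, pooled, probs)
```

A real implementation would use many filters, learned weights, and a deep learning framework; the sketch only shows how one filter turns a sequence of vectors into a single pooled feature that feeds the softmax output.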

Institutions

Kannur University

Categories

Hybrid Modeling, Convolutional Neural Network, Deep Learning, Indian Language, Bidirectional Encoder Representations From Transformers, Bidirectional Long Short-Term Memory Network

Licence