A Multifaceted Approach to Gender Bias Detection in Bengali

Published: 11 July 2025 | Version 1 | DOI: 10.17632/dj3745p2cy.1
Contributors:

Description

This project addresses gender bias in the Bengali language through a multidimensional approach to detection. Language both reflects and shapes social realities, so it is important to identify and address biases that can influence perceptions and reinforce stereotypes. By combining techniques from natural language processing (NLP), machine learning, and linguistic analysis, the project aims to uncover both overt and subtle forms of gender bias in Bengali texts, ranging from news articles and literature to social media content. It investigates how language use may differ with gender representation and aims to build tools or models that can flag biased or discriminatory expressions. The ultimate goal is not only to detect bias but also to raise awareness and contribute to more inclusive and fair language practices in Bengali-speaking communities.

Description of each column:

  • ID: Serial number
  • Text: Sentence or phrase in Bengali
  • Label: "Biased" or "Unbiased"
  • Gendered_Word: Word or phrase causing the bias (if any)
  • Bias_Type: Stereotype / Occupational Bias / Honorific Bias / Pronoun Bias / Neutral
  • Source: News, Social Media, Literature, etc.
  • Correction_Suggestion: Suggestion to neutralize the bias

Structure of the dataset:

  • Format: CSV
  • Rows: 2,451 (individual Bengali sentences or phrases)
  • Columns: 7 (including whether the text is biased, the biased word or phrase, the type of bias, and the source of the sentence)
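The column layout described above can be checked before any modelling work. The following is a minimal sketch, assuming the CSV header uses the column names exactly as listed in the description; the helper names (`load_rows`, `find_invalid_rows`, `label_counts`) are hypothetical, and the sample rows are made-up placeholders, not records from the dataset.

```python
import csv
import io
from collections import Counter

# Column names and allowed values as listed in the dataset description.
# Assumption: the CSV header matches these names exactly.
EXPECTED_COLUMNS = [
    "ID", "Text", "Label", "Gendered_Word",
    "Bias_Type", "Source", "Correction_Suggestion",
]
LABELS = {"Biased", "Unbiased"}
BIAS_TYPES = {
    "Stereotype", "Occupational Bias", "Honorific Bias",
    "Pronoun Bias", "Neutral",
}


def load_rows(fileobj):
    """Read all rows as dicts; fail early if an expected column is missing."""
    reader = csv.DictReader(fileobj)
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def find_invalid_rows(rows):
    """Return the IDs of rows whose Label or Bias_Type is not a documented value."""
    return [
        row["ID"]
        for row in rows
        if row["Label"].strip() not in LABELS
        or row["Bias_Type"].strip() not in BIAS_TYPES
    ]


def label_counts(rows):
    """Count how many rows carry each Label value."""
    return Counter(row["Label"].strip() for row in rows)


# Placeholder rows for illustration only (NOT taken from the dataset).
# Row 3 deliberately carries an undocumented label.
SAMPLE = """ID,Text,Label,Gendered_Word,Bias_Type,Source,Correction_Suggestion
1,<placeholder sentence A>,Biased,<placeholder word>,Stereotype,News,<placeholder suggestion>
2,<placeholder sentence B>,Unbiased,,Neutral,Social Media,
3,<placeholder sentence C>,Maybe,,Neutral,Literature,
"""

rows = load_rows(io.StringIO(SAMPLE))
print(label_counts(rows))
print(find_invalid_rows(rows))
```

To run the same checks on the published file, replace the `io.StringIO(SAMPLE)` argument with an opened file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded CSV. The file name and its encoding are not stated in the description, so both are assumptions to verify.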

Institutions

  • Daffodil International University

Categories

Reference Source, Source Effect, Source (History)

Licence