libras_bilingual_dataset

Published: 25 April 2025 | Version 1 | DOI: 10.17632/v7gyc4fvrc.1
Contributors: Samuel Moreira, Renan Costa

Description

The libras_bilingual_dataset contains 68,029 aligned sentence pairs in Brazilian Portuguese and Libras gloss, a linear, written representation of Brazilian Sign Language. It is provided in UTF-8 encoded Comma-Separated Values (CSV) format and includes two columns:

  • portuguese: sentences in Brazilian Portuguese
  • libras: corresponding translations in Libras gloss format

This dataset supports research in machine translation, sign language technologies, and accessible communication systems.
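
A minimal sketch of how the two-column layout described above could be read in Python, using only the standard library. The helper name `load_pairs`, the file name in the comment, and the placeholder rows are assumptions for illustration; they are not part of the dataset or its documentation.

```python
import csv
import io
from typing import Iterable, List, Tuple

# Column names as given in the dataset description.
EXPECTED_COLUMNS = ("portuguese", "libras")


def load_pairs(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Read (portuguese, libras) pairs from an open CSV file or any iterable of lines."""
    reader = csv.DictReader(lines)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return [(row["portuguese"], row["libras"]) for row in reader]


# Placeholder rows (NOT taken from the dataset) standing in for the real file.
sample = io.StringIO(
    "portuguese,libras\n"
    "<sentence 1>,<GLOSS 1>\n"
    '"<sentence, with comma>",<GLOSS 2>\n'
)
pairs = load_pairs(sample)
print(len(pairs))

# With the downloaded file (file name is hypothetical), the call would look like:
# with open("libras_bilingual_dataset.csv", encoding="utf-8", newline="") as f:
#     pairs = load_pairs(f)
```

Opening the file with `encoding="utf-8"` matches the stated encoding, and `newline=""` is what the `csv` module recommends so that quoted fields containing line breaks are parsed correctly.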

Institutions

  • Universidade Federal da Paraíba

Categories

Natural Language Processing, Machine Translation, Sign Language, Portuguese Language

Funders

  • Secretaria Nacional dos Direitos das Pessoas com Deficiência (SNDPD), Ministério dos Direitos Humanos e da Cidadania, Brasil
  • Secretaria de Governo Digital (SGD), Ministério da Gestão e da Inovação em Serviços Públicos, Brasil

Licence