Clinical and sociodemographic data on congenital syphilis cases, Brazil, 2013-2021

Published: 30 September 2022 | Version 1 | DOI: 10.17632/3zkcvybvkz.1

Description

This data set presents clinical and sociodemographic data on antenatal care, pregnant women's outcomes, and their children's Venereal Disease Research Laboratory (VDRL) test results from the cities served by the Mãe Coruja Pernambucana Program (PMCP) in the State of Pernambuco, Brazil, between 2013 and 2021. The VDRL test is a screening test for congenital syphilis at childbirth. The data set has 41,762 records and 26 attributes, including 826 positive and 40,936 negative cases of congenital syphilis. The PMCP is a social program created by the government of the State of Pernambuco, Brazil, that supports pregnant women before and after the birth of their children through the Sistema Único de Saúde (SUS), the Brazilian public healthcare system. It monitors the development of children up to the age of 5 years, with the aim of ensuring healthy and harmonious development during the first years of life. The data set contains two CSV files: "data_set.csv" holds the pre-processed data, and "attributes.csv" describes each attribute.
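
As a rough illustration of how the class balance in "data_set.csv" might be checked, the sketch below uses Python's standard csv module on a tiny in-memory stand-in for the file. The column names (vdrl_result, age) and the label values are assumptions made for illustration only; the real names are documented in "attributes.csv".

```python
import csv
import io
from collections import Counter

# Tiny in-memory stand-in for "data_set.csv". The column names and label
# values are hypothetical; check "attributes.csv" for the real ones.
SAMPLE_CSV = """vdrl_result,age
negative,24
positive,19
negative,31
"""


def count_target(handle, target_column="vdrl_result"):
    """Count how many records fall under each value of the target column."""
    reader = csv.DictReader(handle)
    return Counter(row[target_column] for row in reader)


counts = count_target(io.StringIO(SAMPLE_CSV))
print(counts)

# Class balance reported in the description: 826 positive + 40,936 negative.
assert 826 + 40_936 == 41_762
```

With the real file, the io.StringIO object would be replaced by an opened file handle, for example open("data_set.csv", newline=""), once the actual target column name has been looked up.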

Steps to reproduce

The data was extracted from the information system of the PMCP, the Sistema de Informação do Mãe Coruja (SIS-MC), and contains clinical and sociodemographic data on antenatal care, pregnant women's outcomes, and their children in the cities served by the PMCP in the State of Pernambuco, Brazil, between 2013 and 2021. As the first pre-processing step, all the data from the SIS-MC was unified, resulting in 204,543 records and 218 attributes. To clean the data set, we eliminated attributes with more than 70% missing data that were unrelated to antenatal care, pregnant women's outcomes, and their children's Venereal Disease Research Laboratory (VDRL) test results, lowering the dimensionality to 27 attributes. The target attribute (VDRL test result) had 827 positive cases of congenital syphilis, 40,992 negative cases, and 162,724 empty records; the empty records were removed, leaving 41,819 records. A further 11 records with an empty value for the family income indicator, as reported in the pregnant woman's record in SIS-MC, were removed, resulting in a data set with 41,808 records. After analyzing the baseline characteristics of the data set with the assistance of health experts (stakeholders) from the PMCP, some records considered to be outliers were also removed: (i) pregnant women born before 1960 or after 2020; (ii) family income disclosed at antenatal care greater than 20,000; and (iii) number of household residents greater than 20.
To create the attribute for the pregnant woman's age when she was attended by the PMCP, her date of birth was subtracted from the date of registration of the pregnancy in the SIS-MC. These two attributes (date of registration and date of birth) were then removed from the data set. We also converted attributes such as the number of abortions, number of living children, number of pregnancies, and number of household members from numerical to categorical. After these steps, the data set contained 41,762 records and 26 attributes. Finally, we created a new category to fill in the empty data. We used this strategy because all our attributes were binary or categorical (except for the age attribute).
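
The cleaning steps above can be sketched roughly as follows. This is not the authors' code: the field names (birth_date, registration_date, family_income, household_residents, schooling), the "missing" label, and the choice to express age in completed years are all assumptions made for illustration. The numerical-to-categorical conversion is left out for brevity.

```python
from datetime import date

# Hypothetical records; every field name is an assumption, not the SIS-MC schema.
records = [
    {"birth_date": date(1995, 6, 15), "registration_date": date(2018, 3, 1),
     "family_income": 1200.0, "household_residents": 4, "schooling": "primary"},
    {"birth_date": date(1950, 1, 1), "registration_date": date(2015, 5, 5),
     "family_income": 900.0, "household_residents": 3, "schooling": None},
    {"birth_date": date(2000, 12, 31), "registration_date": date(2020, 12, 30),
     "family_income": 800.0, "household_residents": 25, "schooling": "secondary"},
    {"birth_date": date(1990, 2, 10), "registration_date": date(2016, 2, 9),
     "family_income": 1500.0, "household_residents": 5, "schooling": None},
]


def is_outlier(rec):
    """Apply the three outlier rules listed in the text."""
    year = rec["birth_date"].year
    return (year < 1960 or year > 2020
            or rec["family_income"] > 20_000
            or rec["household_residents"] > 20)


def age_at_registration(birth, registration):
    """Completed years between birth and registration (an assumed convention)."""
    had_birthday = (registration.month, registration.day) >= (birth.month, birth.day)
    return registration.year - birth.year - (0 if had_birthday else 1)


def clean(rec, missing_label="missing"):
    """Derive age, drop the two date fields, and label empty values."""
    out = {k: v for k, v in rec.items()
           if k not in ("birth_date", "registration_date")}
    out["age"] = age_at_registration(rec["birth_date"], rec["registration_date"])
    # Empty values get their own category instead of being imputed.
    return {k: (missing_label if v is None else v) for k, v in out.items()}


cleaned = [clean(r) for r in records if not is_outlier(r)]

# Record counts reported in the text, checked step by step.
after_target_filter = 204_543 - 162_724          # drop empty VDRL results
after_income_filter = after_target_filter - 11   # drop empty family income
outliers_removed = after_income_filter - 41_762  # implied by the final count
print(after_target_filter, after_income_filter, outliers_removed)
```

The number of outlier records is not stated in the description; the figure computed on the last lines simply follows from the reported 41,808 records before outlier removal and 41,762 records in the final data set.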

Institutions

Universidade Federal de Pernambuco, Universidade Federal da Paraiba, Secretaria de Saude do Estado de Pernambuco, Universidade de Pernambuco, Universitat Autonoma de Barcelona

Categories

Congenital Anomaly, Sexually Transmitted Infection, Brazil, Syphilis, Database

Licence