Datasets and R Markdown files for the article "Survey on critical results management in Brazilian clinical laboratories: Profiling practices through multivariate analysis, prioritization, and a 'New Statistics' approach" submitted to Clinica Chimica Acta

Published: 11 February 2025 | Version 3 | DOI: 10.17632/frjs435wc8.3

Description

This repository contains supplementary materials related to the study "Survey on critical results management in Brazilian clinical laboratories: Profiling practices through multivariate analysis, prioritization, and a 'New Statistics' approach". The dataset, figures, exported results, and analysis scripts are included to ensure full transparency and reproducibility of the research findings.

Folder Structure

1_Dataset/
This folder contains the dataset used in the study, formatted for direct use in the Feature Priorizer R Markdown script.

2_Figures/
All figures generated by the Feature Priorizer are stored here at 600 DPI resolution, ensuring high-quality graphics for publication and analysis.

3_Exported/
This folder contains the exported results, including statistical outputs, tables, and processed datasets derived from the analyses.

4_Supplementary_Files/
This folder contains auxiliary files used in generating the Feature Priorizer HTML report, ensuring an enhanced visual presentation and incorporating dynamic statistical quotes.
– styles.css: Defines the formatting of the HTML report, ensuring a consistent visual presentation.
– logo.html, logo.png, logo.txt: Files related to the project's visual identity.
– Science_Stats_Reflections.jpeg: An image displayed in the HTML report, complementing the section on statistical and scientific reflections.
– statquote_cycle_state.rds: An RDS file that stores the state of the statistical quotes cycle. This file is dynamically updated to prevent repetitions, ensuring that the quotes presented in the report change with each execution.

5_Feature Priorizer – R Markdown Script
The "Feature Priorizer" is an R Markdown-based analytical pipeline (Script_Feature_Prioritizer.Rmd) developed to perform the full multivariate analysis workflow presented in the study. The script integrates:
A) Dimensionality reduction (Logistic PCA)
B) Unsupervised clustering (K-Means)
C) Feature prioritization using the Nihans Index and Pareto Analysis
D) Statistical and practical significance assessment (Chi-square test, Cohen's h)
E) Automated report generation in HTML format, including figures and tables

6_Files Related to the Feature Priorizer
– Script_Feature_Prioritizer.Rmd: The R Markdown script that executes the entire analytical pipeline
– Script_Feature_Prioritizer.html: The automatically generated HTML report containing all results, figures, and statistical summaries
– Install_packages.Rmd: A helper script that installs all necessary R packages for running the Feature Priorizer
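The statistical and practical significance assessment listed under item D of the script (Chi-square test, Cohen's h) can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the repository's R code. The function names and the counts are hypothetical, and the chi-square statistic is the plain Pearson form for a 2x2 table without continuity correction, which may differ from the options used in Script_Feature_Prioritizer.Rmd.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between two proportions on the arcsine scale."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("A row or column total is zero; the test is undefined.")
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical counts for one binary feature in two clusters of laboratories:
# cluster 1: 30 "Yes", 10 "No"; cluster 2: 12 "Yes", 28 "No".
yes_1, no_1 = 30, 10
yes_2, no_2 = 12, 28

chi2 = chi_square_2x2(yes_1, no_1, yes_2, no_2)
h = cohens_h(yes_1 / (yes_1 + no_1), yes_2 / (yes_2 + no_2))
print(f"chi-square = {chi2:.2f}, Cohen's h = {h:.2f}")
```

The chi-square statistic addresses whether the two proportions differ beyond chance, while Cohen's h expresses how large the difference is. Cohen's conventional benchmarks for |h| are 0.2 (small), 0.5 (medium), and 0.8 (large).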

Steps to reproduce

To reproduce this study using the "Feature Priorizer" R Markdown tool, researchers should follow these steps:

A) Data Collection: Use the questionnaire provided in the study to collect responses from laboratories. Ensure that responses are recorded consistently to facilitate processing.

B) Data Transformation & Feature Engineering (Data preparation): Convert the collected responses into structured features. In this study, 60 features were created, but additional features may be defined depending on the analytical context. Each feature should be encoded as a binary variable (Yes/No format) following the methodology applied in this study. To enhance interpretability and standardization, we recommend naming each feature using Lexical Blends (formed by merging parts of two or more words), following the approach used in this study. This helps create intuitive and meaningful labels for each feature.

C) Dataset Formatting & Organization: Save the dataset as an Excel file (.xlsx format) and place it inside the "1_Dataset" folder. Then, define:
C.1) The file name of the XLSX dataset;
C.2) The worksheet name (spreadsheet tab) within the file.

D) Pipeline for Running the "Feature Priorizer" R Markdown Tool
D.1) Open the R Project: Locate and open the Project_Critical_Results.Rproj file. This will launch RStudio with the correct working directory.
D.2) Install Required Packages: Open Install_packages.Rmd in RStudio, then click "Knit" to install all required R packages.
D.3) Execute the "Feature Priorizer" Script: Open Script_Feature_Prioritizer.Rmd in RStudio, then click "Knit" to execute the script and generate the output report.
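The binary encoding described in step B can be sketched as follows. This is a Python illustration under stated assumptions, not the repository's R code: the feature names (CritList, NotifLog) are made-up lexical blends, the helper encode_binary is hypothetical, and the sketch assumes answers arrive as "Yes"/"No" text that should map to 1/0. It rejects any other value, so inconsistently recorded responses are caught before the analysis.

```python
def encode_binary(responses, features):
    """Map "Yes"/"No" answers to 1/0, returning one row per laboratory."""
    mapping = {"yes": 1, "no": 0}
    matrix = []
    for lab_index, answers in enumerate(responses):
        row = []
        for feature in features:
            value = answers.get(feature, "")
            key = str(value).strip().lower()  # tolerate case and stray spaces
            if key not in mapping:
                raise ValueError(
                    f"Lab {lab_index}: unexpected value {value!r} for {feature!r}"
                )
            row.append(mapping[key])
        matrix.append(row)
    return matrix


# Hypothetical feature names and questionnaire answers for two laboratories.
features = ["CritList", "NotifLog"]
responses = [
    {"CritList": "Yes", "NotifLog": "no "},
    {"CritList": "No", "NotifLog": "Yes"},
]

encoded = encode_binary(responses, features)
print(encoded)
```

Failing on unexpected values, rather than silently treating them as "No", keeps missing or mistyped answers from being counted as absent practices.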

Categories

Algorithms, Machine Learning, Principal Component Analysis, Biostatistics, Laboratory Assessment

Licence