HugSelect Datasets
Description
The HugSelect Dataset is a structured dataset supporting the development and evaluation of HugSelect, a multi-criteria decision-making (MCDM) framework for foundation model selection. It enables reproducible research on transparent, task-specific recommendation of AI foundation models. The dataset integrates heterogeneous information from Hugging Face model repositories, including: (1) raw metadata, model descriptions, and community reviews; and (2) processed representations generated through automated pipelines, in which unstructured information is transformed into standardised features such as functional features and quality mappings. The dataset also covers the three-step evaluation of HugSelect: (3) pipeline validation data, used to assess the accuracy and reliability of the feature extraction methods; (4) a case study dataset containing task-based scenarios with reference model selections, together with the evaluation of recommendations against this curated dataset; and (5) evaluation results from TAM-based user studies involving AI practitioners. Data is provided in structured formats (CSV/JSON) and organized into raw data, processed data, validation results, case studies, and user study. Limitations include potential noise in the extracted features due to the variability of LLM outputs, and temporal variability due to evolving model repositories.
Files
Steps to reproduce
Raw Information:
HF-Models-Y1.zip = initially collected metadata and descriptions for 2 million models
united_reviews.json.zip = initially collected reviews for 70k models

Processing Pipelines:
HF-Models-T7-U.zip = filtered 70k models with functional features, quality attributes and model clusters
+ Functional Features
model_ffs_new.csv.zip = final functional features extracted from descriptions
NP_X1.json.zip = initial set of noun phrases
NP_X5.json.zip = set of functional features before LLM filtering
NP_GG_fin.json.zip = final set of functional features after LLM filtering
+ Clustering
family_assignments_organized_again.csv = model clusters: modality, task, family
+ Quality Attributes
united_f2.zip = reviews after filtering
quality_mapping_output.csv = quality mappings and sentiments per review
quality_mapping_output_AB50_all_fuzzy_full = fuzzy aggregate of quality mappings per model, based on sentiment

Pipeline Validation:
+ Functional Features
functional_validation_overall = overall stats for functional features (ff)
functional_validation_samples = per-sample comparison
+ Quality Attributes
sentiment_validation_overall = overall stats for ff
sentiment_validation_samples = per-sample comparison
quality_validation_overall = overall stats for ff
quality_validation_samples = per-sample comparison

Case Study:
+ Dataset Curation
Initial_Curated_Cases = individually assessed papers after the automatic filtering processes
Case_Papers = snippets from the papers where the utilised models are mentioned
Final_Cases = the 44 retained cases of model selection from scientific papers
+ Recommendation Evaluation
Results_CS_detailed_examples = recommendations for the case study (CS) from HugSelect and baseline LLMs, with each recommendation named
Results_CS_detailed_rankings = recommendations for the CS from HugSelect and baseline LLMs, with each recommendation ranked
Results_CS_summary_family = recommendations for the CS from HugSelect and baseline LLMs, overall evaluation stats
Results_CS_summary_modelid = recommendations for the CS from HugSelect and baseline LLMs, overall evaluation stats, model-family focused
+ Ablation Study
Each of these files is also provided for the ablation study:
- HugSelect without functional features
- HugSelect without quality attributes
- HugSelect without either
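
Several of the files above are CSVs shipped inside zip archives (for example model_ffs_new.csv.zip). As a minimal sketch of how such a file could be inspected before further processing, the snippet below reads the header and first few rows using only the Python standard library. The helper name read_zipped_csv, the UTF-8 encoding, and the assumption that each archive holds a single .csv member are illustrative and not taken from the dataset documentation. The column names are not documented here, so the sketch prints them rather than assuming any.

```python
import csv
import io
import zipfile
from itertools import islice


def read_zipped_csv(zip_path, member=None, limit=5):
    """Return (header, first `limit` rows) of a CSV stored in a zip archive.

    Hypothetical helper: if `member` is None, the first entry whose name ends
    in .csv is used, which is an assumption about the archive layout.
    """
    with zipfile.ZipFile(zip_path) as archive:
        if member is None:
            csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
            if not csv_names:
                raise ValueError(f"no CSV member found in {zip_path}")
            member = csv_names[0]
        with archive.open(member) as raw:
            # Assumes UTF-8 text; adjust the encoding if decoding fails.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text)
            header = next(reader, [])
            rows = list(islice(reader, limit))
    return header, rows


if __name__ == "__main__":
    # The file name comes from the list above; the local path is an assumption.
    header, rows = read_zipped_csv("model_ffs_new.csv.zip")
    print("columns:", header)
    for row in rows:
        print(row)
```

Reading only the first rows keeps the check cheap for the larger archives. The .json.zip files can be opened the same way by replacing the csv reader with json.load on the opened member.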