Dataset of Software Self-evolution with SelfEvolve: an Agentic Architecture for Runtime Code Generation

Published: 13 October 2025 | Version 1 | DOI: 10.17632/pysnkp9n23.1
Contributor:
Md Asif Fahim

Description

A collection of 11 curated datasets for evaluating AI agents capable of runtime self-evolution through autonomous code generation, codebase integration, and cross-session composition. Used in the ICSE 2026 paper "Self-evolving Systems: a Runtime Architecture for Autonomous Code Generation".

Dataset Categories:

Integration Tasks (4 datasets): Multi-file codebases (577-783 LOC, 7-20 classes)
  • Patient Risk Analyzer (Healthcare) - 752 LOC, 12 files, 20 classes
  • Student GPA Calculator (Education) - 783 LOC, 10 files, 10 classes
  • Salary Analyzer (HR Analytics) - 616 LOC, 13 files, 12 classes
  • Inventory Low Stock Alert (E-commerce) - 577 LOC, 8 files, 7 classes

Compositional Tasks (3 datasets): Cross-session capability building
  • Matrix Eigenvalue
  • Portfolio Risk
  • IoT Sensor Pipeline

Data Processing Tasks (4 datasets): External data manipulation
  • Book Recommender (100 books, CSV)
  • Friend Suggester (100 users, JSON)
  • Movie API (100 movies, CSV)
  • Performance Tracker (50 reviews, CSV)

Statistics: 65 files, ~80 KB compressed, 49 Python classes, 2,685 LOC, 9 domains (Healthcare, Education, HR Analytics, E-commerce, Finance, Linear Algebra, IoT, Media, Social Networks)

Contents: Each dataset includes problem.json (task specification), Python source files (PEP 8 compliant, with type hints), __init__.py (package exports), and data files (CSV/JSON where applicable).

Evaluation Results:
  • Pass@1: 92.7% (with TDD) vs 72.7% (without TDD)
  • Average iterations: 2.2 (with TDD) vs 4.7 (without TDD)
  • Wilcoxon test: W=55, p<0.001, r=0.98

Dependencies: Python 3.8+; otherwise minimal (NumPy is required for 1 dataset)

License: CC BY 4.0
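The per-dataset figures above (files, LOC, classes) can be spot-checked after download. The following is a minimal sketch, not the authors' tooling: it assumes each dataset is unpacked into its own folder holding problem.json next to its Python sources, and it counts LOC as non-blank lines, which may differ from the convention used for the published numbers. The function name `summarize_dataset` and the returned keys are hypothetical.

```python
import ast
import json
from pathlib import Path


def summarize_dataset(dataset_dir):
    """Summarize one dataset folder (hypothetical helper, layout is assumed).

    Counts Python files, non-blank source lines, and class definitions,
    and loads problem.json if it is present.
    """
    dataset_dir = Path(dataset_dir)
    py_files = sorted(dataset_dir.rglob("*.py"))

    loc = 0
    classes = 0
    for path in py_files:
        source = path.read_text(encoding="utf-8")
        # Assumption: LOC means non-blank lines (comments are included).
        loc += sum(1 for line in source.splitlines() if line.strip())
        # Count every class definition, including nested ones.
        tree = ast.parse(source)
        classes += sum(isinstance(node, ast.ClassDef) for node in ast.walk(tree))

    problem_path = dataset_dir / "problem.json"
    problem = None
    if problem_path.is_file():
        # The schema of problem.json is not documented here, so it is
        # returned as parsed JSON without further interpretation.
        problem = json.loads(problem_path.read_text(encoding="utf-8"))

    return {
        "python_files": len(py_files),
        "loc": loc,
        "classes": classes,
        "problem": problem,
    }
```

Calling `summarize_dataset("path/to/one_dataset")` (the path is a placeholder) returns a dictionary whose `python_files`, `loc`, and `classes` entries can be compared against the figures listed for that dataset. Small differences in `loc` are expected if the published counts exclude comments or docstrings.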

Institutions

  • University College Dublin

Categories

Software Engineering

Licence

CC BY 4.0