Architecture for a Trustworthy Quantum Chatbot [dataset]
Description
This dataset contains all the materials and results used in the development and empirical validation of C4Q 2.0. The evaluation data are organized into two main directories, bakend_testing and empirical_validation; the archive also includes the application source code and the data-generation scripts. Below is a brief overview of the contents:

- app/: Contains the full source code of C4Q as it stood when this paper was submitted, allowing for reproducibility and further development. The frontend's node_modules directory is not included, but these dependencies can be generated by following the installation instructions in the README.md file.
- README.md: A guide detailing how to set up and run C4Q locally.
- bakend_testing/: Contains data from the evaluation of C4Q's backend components:
  * reportBackendC4Q2.0.html: An HTML report generated by running 189 tests on the backend of C4Q.
  * classLLM_20241110101327.pth_training_metrics.csv: A CSV file documenting the Classification LLM's training and validation metrics, including training loss, validation loss, training accuracy, and validation accuracy for each epoch.
  * qaLLM_evaluation_metrics.txt: A text file listing exact match and F1 metrics per epoch for the QA LLM.
- create_data/: Includes the scripts used to generate and curate training data for the classification LLM and the QA LLM.
- empirical_validation/: Includes data from the empirical evaluation of C4Q against other chatbots:
  * Directories named by model (e.g., openai-o1/, deepseek-coder_33b/, deepseek-r1/, etc.): Each directory contains the raw answers produced by the respective model in response to our set of quantum computing and software engineering questions.
  * prompts.txt: A text file with the full list of prompts used during the empirical evaluation.
  * requirements_qiskit0.46.3: A requirements file for the Python dependencies used to create an environment with a Qiskit version < 1.0.0, enabling code snippet testing under older Qiskit releases.
  * requirements_qiskit1.3.1: A requirements file for the Python dependencies used to create an environment with a Qiskit version >= 1.0.0, ensuring reproducible tests under newer releases.
  * results.xlsx: An Excel spreadsheet containing the empirical evaluation outcomes, including correct, incomplete, and incorrect answer rates for each model, under both Qiskit environments.
  * script_gates.sh: A shell script that automates prompting of Ollama's deepseek-coder:33b and starcoder2:15b models with gate-related quantum questions.
  * script_SE.sh: A shell script that automates prompting of Ollama's deepseek-coder:33b and starcoder2:15b models with software engineering problem questions.
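As an illustration of how the per-epoch training metrics CSV described above might be inspected, the following minimal Python sketch finds the epoch with the highest validation accuracy. The column names `epoch` and `val_accuracy` and the helper `best_epoch` are assumptions made for this example, not taken from the dataset; check the actual CSV header and adjust them accordingly.

```python
import csv
import io

# Hypothetical column names: the real header in the CSV may differ.
EPOCH_COL = "epoch"
VAL_ACC_COL = "val_accuracy"


def best_epoch(csv_file, epoch_col=EPOCH_COL, metric_col=VAL_ACC_COL):
    """Return (epoch, value) for the row with the highest metric value."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        raise ValueError("metrics file contains no data rows")
    best = max(rows, key=lambda row: float(row[metric_col]))
    return best[epoch_col], float(best[metric_col])


if __name__ == "__main__":
    # Tiny in-memory sample with made-up numbers, only to show the call shape.
    sample = io.StringIO("epoch,val_accuracy\n1,0.50\n2,0.75\n3,0.70\n")
    print(best_epoch(sample))
```

To apply it to the real file, pass a handle opened with `open(path, newline="")`, where `path` points to classLLM_20241110101327.pth_training_metrics.csv inside bakend_testing/.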