Evaluating the Accuracy of Large Language Models in Predicting ICD-10 and CPT Billing Codes for Outpatient Dermatology Notes
Description
1. Supplemental Figure I: A PDF compilation of test cases 1-8, including their ground truth ICD-10 and CPT codes. Each test case represents a dermatology patient encounter, detailing history, examination findings, assessment, and management plan. A corresponding table lists the associated ICD-10, CPT, and J codes for billing purposes.
2. Supplemental Figure II: A heatmap visualizing model accuracy across various test cases. The color scale ranges from blue (highest accuracy) to red (lowest accuracy). Each row represents a model-case combination, while the columns correspond to different grading schemas. "+P" indicates scoring with punishment, while "-P" represents scoring without punishment.
3. Supplemental Figure III: A graphical representation of under- and overprediction frequencies for CPT codes generated by each model. Overpredictions refer to CPT codes exceeding the board-certified dermatologist's ground truth, while underpredictions indicate missing or incomplete CPT code assignments. The figure displays total predictions across eight cases over five trials, with percentages calculated accordingly.
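The over- and underprediction tally described for Supplemental Figure III could be computed along the following lines. This is a minimal sketch, not the authors' scoring code. It assumes that a prediction counts as an overprediction when it contains any CPT code absent from the ground truth, as an underprediction when it omits any ground-truth code, and that percentages are taken over the total number of predictions (eight cases times five trials would give 40 per model). The function names and the sample codes are hypothetical placeholders and are not drawn from the test cases.

```python
def classify_prediction(predicted, truth):
    """Flag one prediction as over- and/or underpredicted (assumed definitions)."""
    predicted, truth = set(predicted), set(truth)
    return {
        "over": bool(predicted - truth),   # extra codes not in the ground truth
        "under": bool(truth - predicted),  # ground-truth codes that are missing
    }


def prediction_rates(runs):
    """Summarize (predicted, truth) pairs as percentages of all predictions."""
    runs = list(runs)
    if not runs:
        raise ValueError("at least one prediction is required")
    flags = [classify_prediction(p, t) for p, t in runs]
    total = len(flags)
    over = sum(f["over"] for f in flags)
    under = sum(f["under"] for f in flags)
    return {
        "total": total,
        "over_pct": 100 * over / total,
        "under_pct": 100 * under / total,
    }


# Hypothetical example: four predictions scored against the same ground truth.
truth = ["99213", "11102"]
runs = [
    (["99213", "11102"], truth),           # exact match
    (["99213", "11102", "17000"], truth),  # overprediction only
    (["99213"], truth),                    # underprediction only
    (["99213", "17000"], truth),           # both over and under
]
rates = prediction_rates(runs)
print(rates)
```

Under these assumptions a single prediction can count toward both categories, so the two percentages need not sum to 100.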