A Golden Set of Problem, Solution, and Advantage Sentences from Patents

Published: 31 August 2022 | Version 2 | DOI: 10.17632/kpxdzkgs3j.2
Vito Giordano,


This data contains two different datasets: (1) The Golden Set is a dataset of sentences tagged as (A) technical problem, (B) solution to the problem, or (C) advantageous effect of the invention. It is based on a selectively extracted collection from the United States Patent and Trademark Office (USPTO) curated by Chikkamath, R., Parmar, V. R., Hewel, C., & Endres, M. (2021). The dataset is available upon request. (2) The Test data is a set of 400 randomly selected patent grants and patent applications downloaded from the USPTO. We use this data to evaluate transformer-based language models developed for extracting problems, solutions, and advantages in a real use case in an open-ended domain.



Università degli Studi di Pisa


Patent, Natural Language Processing, Problem Solving, Text Mining