Windshield recycling PATSTAT data

Published: 21-10-2019 | Version 1 | DOI: 10.17632/4r3frnc9yw.1
Riccardo Priore


The dataset includes several Excel sheets, each listing the records obtained by running SQL queries in the Patstat online database (Autumn 2017 ed.). The queries retrieve patent document data, in particular:
• the identification code of each individual patent application (appln_id attribute in the table),
• the application authorities (i.e. the patent offices with which the patent applications are filed),
• the earliest filing year,
• the docdb_family_id code uniquely identifying a single patent family (a.k.a. DOCDB family),
• the titles and abstracts of the patent applications,
• the classification codes (IPC and CPC) assigned to each patent application,
• the patent publication authority,
• the patent publication number,
• the patent publication kind,
• the name of each assignee/applicant (psn_name),
• the type of applicant (psn_sector, which distinguishes companies, universities, individuals etc. among applicants/assignees).

PDF files (named after the corresponding Excel file) may be present. Each lists the documents downloadable from PATSTAT online. The number of documents corresponds to the number of non-duplicated appln_id values (e.g. 1125 for PHASE A in the PDF, as can be verified from the PHASE A Excel file). URLs are provided for each record to link to Espacenet.

• PHASE A TABLE concerns a broad search regarding glass recycling.
• PHASE B concerns a more specific search, based on keywords AND IPC/CPC codes, regarding recycling of laminated glass by means of mechanical comminution. Records retrieved mainly on the basis of keywords can be derived from PHASE B_query A. Records selected by means of CPC code Y02W30/521 are included in PHASE B Query B TABLE.
• PHASE C Query C1 TABLE concerns recycling of glass with a polyvinyl butyral (PVB) interlayer.
• PHASE C Query C2’ TABLE concerns recycling of glass with an ethylene-vinyl acetate (EVA) interlayer.
• PHASE C Query C2’’ TABLE concerns recycling of glass with a (thermoplastic) polyurethane (TPU) interlayer.

One file with SQL scripts not detailed in other publications is included (PATSTAT SCRIPTS). The file contains two scripts to be run with Patstat: the first ranks the patent families according to the co-occurrence of IPC classification codes, while the second ranks the granted patents according to the number of annual fee payments and the expiry date of the patent. SQL scripts are included in a manuscript submitted to Data in Brief journal, accompanying the manuscript submitted to World Patent Information for review (WPI_2018_95 _R1).
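The idea of ranking patent families by IPC co-occurrence can be illustrated outside Patstat with a short Python sketch. This is not the SQL script shipped in PATSTAT SCRIPTS; it is one possible reading of "co-occurrence", under the assumption that a family scores higher the more codes from a chosen set it carries. The toy rows, the set codes_of_interest and the function names are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Toy (docdb_family_id, ipc_class_symbol) rows. The family ids and their
# code assignments are invented for illustration, not taken from the dataset.
rows = [
    (1001, "B29B 17/02"),
    (1001, "C03C 27/12"),
    (1001, "B29B 17/02"),  # duplicate row, as SQL joins often produce
    (1002, "B29B 17/02"),
    (1002, "C03C 27/12"),
    (1002, "B02C 19/00"),
    (1003, "B02C 19/00"),
]


def codes_by_family(rows):
    """Collect the distinct IPC codes assigned to each DOCDB family."""
    grouped = defaultdict(set)
    for family_id, ipc in rows:
        grouped[family_id].add(ipc)
    return grouped


def pair_counts(rows):
    """Count, for each unordered pair of IPC codes, the families holding both."""
    counts = Counter()
    for codes in codes_by_family(rows).values():
        counts.update(combinations(sorted(codes), 2))
    return counts


def rank_families(rows, codes_of_interest):
    """Rank families by how many of the codes of interest co-occur in them.

    Assumption: this scoring rule is only one way to read "co-occurrence".
    """
    scores = {
        family_id: len(codes & codes_of_interest)
        for family_id, codes in codes_by_family(rows).items()
    }
    # Highest score first; ties broken by family id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


codes_of_interest = {"B29B 17/02", "C03C 27/12", "B02C 19/00"}
pairs = pair_counts(rows)
ranking = rank_families(rows, codes_of_interest)
print(pairs.most_common(3))
print(ranking)
```

Using sets per family removes the duplicate rows before counting, so a code repeated across several joined rows is counted once per family.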


Steps to reproduce

The SQL queries needed to obtain the records included in the Excel files are listed in the DATA IN BRIEF MANUSCRIPT and can be run in Patstat online as such, or with slight modifications, depending on the attributes of particular interest. Basic information regarding the syntax of the SQL language and the features of patent data downloadable by means of Patstat is provided at the following URL:

The raw data of each Excel file can be used to derive the following information (non-exhaustive list):
1) the count of non-duplicated patent applications (see also the corresponding PDF files) or patent families (DOCDB).
2) the trend over time (i.e. by earliest filing year) of the number of filing events (by counting unique appln_id codes).
3) the distribution of the applications according to the national/regional authority with which they are filed. In addition, the data can be partitioned according to a time interval based on the earliest_filing_year values.
4) the fragmentation of the applicants/assignees: to this end, the psn_sector attribute can be used to partition the docdb_family_id values according to the type of applicant (by means of a 'manual' processing of the data). Alternatively, an SQL script to be used with Patstat online is included in the manuscript submitted to World Patent Information and entitled "A patent intelligence analysis aimed at identifying eco-friendly methodologies for recycling PVB to be used as windscreens interlayer". The script partitions the psn_sector values by the number of DOCDB families corresponding to each applicant type (i.e. the number of patent families where a COMPANY is included among the applicants, the number of patent families where a UNIVERSITY is included among the applicants, the number of patent families where one or more INDIVIDUALS, typically the (co-)inventors, are included among the applicants etc.).
5) the classification codes (either IPC or CPC) most frequently assigned to the patent applications, which can be calculated from the IPC and CPC codes listed, as explained in the manuscript submitted to the WPI journal.
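
The five analyses listed above all reduce to counting distinct identifiers within groups, which can be sketched in plain Python once a sheet has been exported to rows. This is a minimal sketch, not the method used in the manuscripts. It assumes that the Excel columns carry the Patstat attribute names (appln_id, appln_auth, earliest_filing_year, docdb_family_id, psn_sector, ipc_class_symbol). The sample values and the helper count_distinct are hypothetical.

```python
from collections import defaultdict

# Assumed column names, following the Patstat attribute names.
FIELDS = ("appln_id", "appln_auth", "earliest_filing_year",
          "docdb_family_id", "psn_sector", "ipc_class_symbol")

# Toy rows; all values are invented. One application appears on several
# rows because each classification code and applicant adds a row.
raw = [
    (1, "EP", 2010, 100, "COMPANY", "C03C 27/12"),
    (1, "EP", 2010, 100, "COMPANY", "B29B 17/02"),
    (2, "US", 2010, 100, "COMPANY", "C03C 27/12"),
    (2, "US", 2010, 100, "COMPANY", "B29B 17/02"),
    (3, "JP", 2012, 200, "UNIVERSITY", "B29B 17/02"),
    (3, "JP", 2012, 200, "INDIVIDUAL", "B29B 17/02"),
]
rows = [dict(zip(FIELDS, r)) for r in raw]


def count_distinct(rows, key, by=None):
    """Count distinct `key` values, optionally grouped by the `by` column."""
    if by is None:
        return len({r[key] for r in rows})
    groups = defaultdict(set)
    for r in rows:
        groups[r[by]].add(r[key])
    return {group: len(values) for group, values in groups.items()}


# 1) non-duplicated applications and DOCDB families
n_applications = count_distinct(rows, "appln_id")
n_families = count_distinct(rows, "docdb_family_id")

# 2) filing trend: unique appln_id per earliest filing year
per_year = count_distinct(rows, "appln_id", by="earliest_filing_year")

# 3) distribution per authority, overall and within a time interval
per_authority = count_distinct(rows, "appln_id", by="appln_auth")
recent = [r for r in rows if 2011 <= r["earliest_filing_year"] <= 2015]
per_authority_recent = count_distinct(recent, "appln_id", by="appln_auth")

# 4) families per applicant type (a family counts once per sector)
families_per_sector = count_distinct(rows, "docdb_family_id", by="psn_sector")

# 5) most frequently assigned classification codes
code_frequency = count_distinct(rows, "appln_id", by="ipc_class_symbol")
top_codes = sorted(code_frequency.items(), key=lambda kv: (-kv[1], kv[0]))

print(n_applications, n_families, per_year, per_authority)
print(families_per_sector, top_codes)
```

Counting distinct identifiers rather than rows matters here because each sheet repeats an application once per classification code and per applicant, so a plain row count would overstate every figure.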