Data relating to public procurement criteria in public tendering proceedings in Poland

Published: 4 March 2025 | Version 2 | DOI: 10.17632/xk8db3p74p.2
Contributors:

Description

The data are available in the following XLSX files:
1) Set of unique criteria: Unique criteria.xlsx
2) Test dataset matrix: test dataset matrix.xlsx
3) Results of the Lingo clustering algorithm: Lingo Max top lvl passes 0.xlsx and Lingo Max top lvl passes 4.xlsx
4) Results of cosine grouping: cosine 0.6 grouping.xlsx and cosine 0.8 grouping.xlsx

Steps to reproduce

The classification tasks used lemmatization, embeddings, and cosine similarity between embeddings, with threshold values of 0.6 and 0.8 checked. For comparison, the Lingo clustering algorithm was used. The work analyzed 113 373 proceedings in which 25 535 unique criterion names were used.

The guide for non-price criteria mentioned in the introduction was created with the participation of experts and analysts through a manual review of the database of proceedings in the Public Procurement Bulletin. The Bulletin contains 49 505 procurement notices for the period from 1 January 2023 to 28 May 2023 alone. Since 2021, when the new version of the Bulletin became available, it has accumulated 352 804 announcements.

The dataset was obtained from the Supplement to Tenders Electronic Daily (TED) on European public procurement. This database is hosted on the official website of the European Union, which provides data on European countries in many different domains. The data are provided in XML format, so they were transferred to an SQL database for analysis. In total, 415 173 announcements in which the contracting authority is an entity from Poland were downloaded and converted.

The contents of public proceedings announcements are structured as XML documents. Each XML document contains one announcement, which may consist of one or more parts. Each part contains one or more criteria (at minimum, the price criterion). Preparing the data for analysis therefore begins with extracting the criteria from the individual parts of each XML file.
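
The extraction and grouping steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline and rests on several assumptions. The element names (announcement, part, criterion) are hypothetical and much simpler than the real TED schema. An in-memory SQLite table stands in for the unspecified SQL database. A bag-of-words vector stands in for the embeddings, and lemmatization is skipped. The greedy grouping rule is chosen only for illustration, and 0.6 is treated as a minimum cosine similarity, which is one possible reading of the cut-off.

```python
import math
import sqlite3
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, simplified announcement; the real TED XML schema is richer.
SAMPLE_XML = """
<announcement id="A1">
  <part no="1">
    <criterion>Price</criterion>
    <criterion>Warranty period</criterion>
  </part>
  <part no="2">
    <criterion>Price</criterion>
    <criterion>Period of warranty</criterion>
    <criterion>Delivery time</criterion>
  </part>
</announcement>
"""


def extract_criteria(xml_text):
    """Return (announcement_id, part_no, criterion_name) rows for one announcement."""
    root = ET.fromstring(xml_text)
    rows = []
    for part in root.iter("part"):
        for crit in part.iter("criterion"):
            rows.append((root.get("id"), part.get("no"), crit.text.strip()))
    return rows


def load_into_sql(rows):
    """Store the extracted rows in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE criteria (announcement_id TEXT, part_no TEXT, name TEXT)"
    )
    conn.executemany("INSERT INTO criteria VALUES (?, ?, ?)", rows)
    return conn


def embed(name):
    """Toy stand-in for a real embedding: counts of lowercase words."""
    return Counter(name.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_by_cosine(names, threshold):
    """Greedy grouping: join the first group whose first member is similar enough."""
    groups = []
    for name in names:
        for group in groups:
            if cosine_similarity(embed(name), embed(group[0])) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


rows = extract_criteria(SAMPLE_XML)
conn = load_into_sql(rows)
unique_names = [
    r[0] for r in conn.execute("SELECT DISTINCT name FROM criteria ORDER BY name")
]
groups = group_by_cosine(unique_names, threshold=0.6)
for group in groups:
    print(group)
```

In this toy input, "Period of warranty" and "Warranty period" end up in the same group because they share two of their words, while "Price" and "Delivery time" stay on their own. A real run would replace embed with an actual embedding model applied to lemmatized criterion names.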

Institutions

Uniwersytet Szczecinski

Categories

Multiple-Criteria Decision Analysis, Tendering

Funding

National Science Center

2022/45/N/HS4/03050

Licence