Dataset of the double moon classification problem and the structure of the neural networks that perform the classification

Published: 5 January 2022 | Version 1 | DOI: 10.17632/pkxg7y9msz.1
Contributors:
Arpad Török

Description

The attached files contain the data points of the double moon classification problem and the neural networks that perform the classification. The double moon is a commonly used binary classification problem for neural networks. The size of the two semicircular arcs is set by the parameters r and w, which give the radius and the width of the arcs; in our case r = 10 and w = 3. The parameter d defines the position of the classes relative to each other. If d is positive, the two classes are linearly separable. If d is negative, the two moons "intersect" and can only be separated by a nonlinear boundary. In this example, d = -5. Neural networks can learn nonlinear decision boundaries, so they can be used to solve this problem. We attach some neural networks that classify these data points with 100% accuracy.

2_moons.txt: The first column contains the "x" coordinate, the second column the "y" coordinate, and the third column indicates the class of the data point.

x.csv: The x in the file name indicates the number of hidden layers in the network and the number of hidden neurons within a hidden layer. The networks are saved in a custom three-column format. The first column contains the number of neurons per layer, including the input and output layers. Because the networks are fully connected feedforward networks, the endpoints of the edges do not need to be saved. The second column contains the biases of the neurons; the biases of the input neurons are not interpreted. Finally, the third column contains the weights of the network.
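As an illustration of how a network stored in the three-column format might be evaluated, the following is a minimal sketch rather than the author's code. It assumes the three columns have already been read into the Python lists layer_sizes, biases and weights. The description does not state the ordering of the weights or the activation function, so the ordering used here (layer by layer, grouped by destination neuron) and the tanh default are assumptions, and the function name forward is hypothetical.

```python
import math


def forward(layer_sizes, biases, weights, inputs, activation=math.tanh):
    """Evaluate a fully connected feedforward network stored as flat columns.

    ASSUMPTIONS (not stated in the dataset description):
      - biases holds one value per neuron, layer by layer, input layer first;
        the input-layer entries are skipped because they are not interpreted.
      - weights is ordered layer by layer and, within a layer, grouped by
        destination neuron (all incoming weights of neuron 0, then neuron 1, ...).
      - the same activation is applied in every non-input layer.
    """
    if len(inputs) != layer_sizes[0]:
        raise ValueError("input length must match the first layer size")

    values = list(inputs)
    bias_pos = layer_sizes[0]  # skip the uninterpreted input-layer biases
    weight_pos = 0             # position in the flat weight list

    # Walk over consecutive layer pairs (input -> hidden, hidden -> output, ...).
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        next_values = []
        for j in range(n_out):
            total = biases[bias_pos + j]
            for i in range(n_in):
                total += weights[weight_pos] * values[i]
                weight_pos += 1
            next_values.append(activation(total))
        values = next_values
        bias_pos += n_out
    return values
```

If the stored files use a different weight ordering or activation, only the indexing inside the loop and the activation argument would need to change.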

Files

Steps to reproduce

Generate the data (the uploaded dataset can be used instead), then train and test the neural networks.
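For the data generation step, a minimal sketch of one common double moon construction is given below, using the parameters r, w and d from the description. The exact generator behind 2_moons.txt is not specified, so the placement of the lower moon (mirrored, shifted right by r and vertically by -d), the uniform sampling of radius and angle, the class labels 0 and 1, and the function name double_moon are all assumptions.

```python
import math
import random


def double_moon(n_per_class, r=10.0, w=3.0, d=-5.0, seed=0):
    """Return a list of (x, y, label) rows for a double moon data set.

    Label 0 is the upper moon, label 1 the lower moon (assumed labelling).
    """
    rng = random.Random(seed)
    rows = []
    for label in (0, 1):
        for _ in range(n_per_class):
            # Sample a point in a half ring of radius r and width w.
            radius = rng.uniform(r - w / 2, r + w / 2)
            angle = rng.uniform(0.0, math.pi)
            x = radius * math.cos(angle)
            y = radius * math.sin(angle)
            if label == 1:
                # Lower moon: mirror vertically, shift right by r and
                # vertically by -d, so a negative d makes the moons overlap.
                x, y = x + r, -y - d
            rows.append((x, y, label))
    return rows


# Example with the parameters used in this dataset: r = 10, w = 3, d = -5.
points = double_moon(500, r=10.0, w=3.0, d=-5.0)
```

Each row follows the same x, y, class layout as described for 2_moons.txt, so the generated points can stand in for the uploaded file when training and testing.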

Institutions

Budapesti Muszaki es Gazdasagtudomanyi Egyetem

Categories

Machine Learning, Classification System, Testing and Validation

Licence