Data for: Local fuzzy geographically weighted clustering: A new method for geodemographic segmentation.

Published: 29 July 2020 | Version 1 | DOI: 10.17632/kd5xprhv65.1

Description

This dataset (a compressed RAR file) includes the Matlab code files for the "Local Fuzzy Geographically Weighted Clustering" (LFGWC) algorithm and a shapefile containing socio-demographic data and cancer incidence data across 973 block groups in Manhattan, New York.

The files are:
1. LFGWC.m = The Matlab code of LFGWC (Local Fuzzy Geographically Weighted Clustering)
2. LFGWC_Call.m = The file to run the above code
3. validity.m = The Matlab code for validating the clustering output
4. licence.txt = The file describing the license terms
5. Demo = Dataset folder

The Demo folder contains the following:
1. Data.txt = The non-normalized dataset
2. Population.txt = Population for each polygon
3. Distance.txt = Distances between all objects
4. Centroid.txt = Initial cluster centres
5. Shapefile: Manhattan_Data.shp

The shapefile was originally downloaded from a benchmark dataset of small-area cancer incidence (Boscoe et al. 2016). The benchmark dataset includes 524,503 tumors diagnosed between 2005 and 2009 across 13,823 block groups covering the entire New York State (download link: https://www.satscan.org/datasets/nyscancer/index.html). The Manhattan_Data.shp shapefile covers only the county of Manhattan, not the entire state. The data have undergone slight modifications, which are explained in detail in the paper.
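The Demo text files can be inspected outside Matlab before running LFGWC_Call.m. The Python sketch below assumes each file is a plain numeric table separated by whitespace, with one row per block group. The description does not state the delimiter or layout, so adjust `delimiter` if the files differ. `load_demo` and `min_max_normalize` are hypothetical helper names, and min-max scaling is only one common choice, not necessarily the normalization used in the paper.

```python
from pathlib import Path

import numpy as np


def load_demo(folder, delimiter=None):
    """Load the four Demo text files into NumPy arrays.

    Assumption: each file is a numeric table. delimiter=None splits on
    any whitespace; pass delimiter="," for comma-separated files.
    """
    folder = Path(folder)
    names = ["Data", "Population", "Distance", "Centroid"]
    return {
        name: np.loadtxt(folder / f"{name}.txt", delimiter=delimiter, ndmin=2)
        for name in names
    }


def min_max_normalize(data):
    """Rescale each column to [0, 1]; constant columns map to 0."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid division by zero
    return (data - col_min) / col_range
```

If these assumptions hold, a quick sanity check is that Data.txt has one row per block group (973 according to the description above) and that Distance.txt is square with the same number of rows.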
Attributes of Manhattan_Data.shp:

DOHREGION = Geographic identifier
CODE = Unique ID code for joining data
POPULATION = Total population (2010 Census)
White_Pop = % white alone population (2010 Census)
Black_Pop = % black alone population (2010 Census)
Asian_Pop = % Asian alone population (2010 Census)
Other_Pop = % other race population (2010 Census)
Hispanic = % Hispanic population (2010 Census)
HH_Size = Persons per household (2010 Census)
LT_HS = % population less than high school education (25 & over)
Under_Pov = % population under poverty (2006-2010 ACS Data)
BC_Rate = Incidents of breast cancer per 1000 people
PC_Rate = Incidents of prostate cancer per 1000 people
TC_Rate = Total cancer incidents per 1000 people

For more information on the original benchmark dataset, visit: https://www.satscan.org/datasets/nyscancer/index.html
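The BC_Rate, PC_Rate and TC_Rate attributes are expressed per 1000 people. As a minimal illustration of that unit, the Python sketch below converts a case count and a population into a rate per 1000. `rate_per_1000` is a hypothetical helper and the input numbers are invented for illustration. The shapefile already stores the computed rates, and the exact procedure used to derive them (including the modifications described in the paper) is not reproduced here.

```python
def rate_per_1000(cases, population):
    """Return incidents per 1000 people, or None if population is not positive."""
    if population <= 0:
        return None
    return 1000.0 * cases / population


# Invented example: 12 cases in a block group of 1500 people.
print(rate_per_1000(12, 1500))  # 8.0
```

Returning None for a non-positive population is a design choice for this sketch, so that empty block groups are flagged rather than raising a division error.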

Categories

Geographic Information Systems, Fuzzy Set, Geodemography
