Data Normalization Method for Geo-Spatial Analysis on Ports
Based on open access data, 79 Mediterranean passenger ports are analyzed to compare their infrastructure, hinterland accessibility, and multi-modality offerings. A comparative geo-spatial analysis is also carried out using the data normalization method in order to visualize the ports' performance on maps. These data-driven analytical results can add value to sustainable development policy and planning initiatives in the Mediterranean region. The analyzed elements can also contribute to the development of passenger port performance indicators. The empirical research methods used for the Mediterranean passenger ports can be replicated for transport nodes in any region of the world to determine their relative performance on selected criteria for improvement and planning.

The Mediterranean passenger ports were initially categorized into cruise and ferry ports. The cruise ports were identified from the member list of the Association for the Mediterranean Cruise Ports (MedCruise), representing more than 80% of the cruise tourism activities per country. The identified cruise ports were mapped by selecting the corresponding geo-referenced ports from the map layer developed by the European Marine Observation and Data Network (EMODnet). The United Nations (UN) Code for Trade and Transport Locations (LOCODE) of each cruise port was used as the common criterion for this selection. Identified cruise ports not listed by EMODnet were added to the geo-database using the editing function of ArcMap (version 10.1), a geographic information system software used under license. The ferry ports were identified from open access industry-initiative data provided by Ferrylines and were mapped in the same way as the cruise ports (Figure 1). Based on the data available for the identified cruise ports, a database (see Tables A1–A3) was created for a Mediterranean-scale analysis.
The ferry ports were excluded from this database because relevant information on the selected criteria (Table 2) was unavailable. However, the cruise ports that also serve as ferry passenger ports were identified in order to maximize the scope of the analysis. Port infrastructure and hinterland accessibility data were collected from the statistical reports published by MedCruise, which compile data provided by its individual member port authorities and cruise terminal operators. Supplementary sources were the European Sea Ports Organization (ESPO) and Global Ports Holding, a cruise terminal operator with an established presence in the Mediterranean. Additionally, open access data sources (e.g. Google Maps and Trip Advisor) were consulted to identify the multi-modal transport options and to bridge data gaps on hinterland accessibility by measuring approximate distances.
Steps to reproduce
Data Normalization for Geo-Spatial Analysis: The geo-spatial analyses were carried out by restructuring the database into common numeric proxy values using the data normalization method. The data collected for each criterion (Table 2) within each category were transformed, port by port, into a normalized proxy value. For each port, the proxy values within a category were added together to give the port's level in that category. These per-category totals were then combined to give the port's cumulative level across all categories.

The datasets on the port infrastructure category (Table A1), the hinterland accessibility category (Table A2), and the passenger multi-modality category (Table A3) are denoted "α", "β", and "γ", respectively. For each port, the numeric value in a cell of "α" or "β" is denoted "n". In the multi-modality dataset "γ", only the affirmative values "Y" (Table A3) were replaced with "1"; all other cells were treated as "0". With "V_min" and "V_max" denoting the minimum and maximum values in the same data column as "n", each cell of "α" and "β" was normalized using the following formula:

    N_C = \frac{n - V_{min}}{V_{max} - V_{min}}    (1)

and for "γ", N_C = 1 or 0,

where N_C is the normalized individual data cell for a port. With the data columns of each category dataset denoted "C_1, C_2, ..., C_L", the normalized cells in the row corresponding to a port were added together using the following formula:

    N_p = \sum N_C = N_{C,C_1} + N_{C,C_2} + \dots + N_{C,C_L}    (2)

where N_{C,C_j} is the normalized cell of the port in column C_j, and N_p is the total normalized value for the port, i.e. the sum of its normalized cell values (N_C) over all criterion columns (Table 2) of one category. Equation 2 was applied to each category in turn to calculate the total normalized value of each port, resulting in "N_pα" for "α", "N_pβ" for "β", and "N_pγ" for "γ".
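Equations 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the column names, and the sample values are hypothetical, and returning 0 for a column whose minimum equals its maximum is an assumption, since the source does not say how such columns were handled.

```python
from typing import List, Sequence


def min_max_normalize(column: Sequence[float]) -> List[float]:
    """Equation 1: N_C = (n - V_min) / (V_max - V_min) for each cell n."""
    v_min, v_max = min(column), max(column)
    if v_max == v_min:
        # Assumption: a constant column carries no contrast, so use 0.
        return [0.0 for _ in column]
    return [(n - v_min) / (v_max - v_min) for n in column]


def encode_affirmative(column: Sequence[str]) -> List[int]:
    """Dataset gamma: 'Y' becomes 1, every other value becomes 0."""
    return [1 if str(v).strip().upper() == "Y" else 0 for v in column]


def port_totals(normalized_columns: Sequence[Sequence[float]]) -> List[float]:
    """Equation 2: sum the normalized cells of each port (row) over all columns."""
    return [sum(cells) for cells in zip(*normalized_columns)]


# Hypothetical three-port example; each list holds one value per port.
alpha = {
    "berths": [2, 6, 10],
    "quay_length_m": [500, 1500, 1000],
}
gamma = {
    "taxi": ["Y", "N", "Y"],
    "metro": ["N", "N", "Y"],
}

np_alpha = port_totals([min_max_normalize(c) for c in alpha.values()])
np_gamma = port_totals([encode_affirmative(c) for c in gamma.values()])

print(np_alpha)  # [0.0, 1.5, 1.5]
print(np_gamma)  # [1, 0, 2]
```

Columns are normalized independently, so criteria measured in different units contribute to N_p on the same 0 to 1 scale.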
Because accessibility decreases as distance increases, "N_pβ" was treated as a negative value. The consolidated normalized value covering all the analyzed criteria for each port was therefore calculated using the following formula:

    N_{con} = N_{pα} + (-N_{pβ}) + N_{pγ}    (3)

where N_con is the consolidated normalized value for each port.

The normalization process is illustrated for the passenger port of Venice (Table 3), which serves both cruise and ferry traffic. The normalized value of each cell was calculated using Equation 1, except for "γ", where the cells take the values 1 or 0 as explained above. Using Equation 2, the total normalized values for Venice for the datasets "α", "β", and "γ" were calculated as 0.933 (i.e. 0.435 + 0.313 + 0.185), 0.147 (i.e. 0.002 + 0.091 + 0.024 + 0.030), and 8.0 (i.e. 1 + 1 + 0 + 1 + 1 + 1 + 1 + 1 + 1), respectively. Using Equation 3, the cumulative value over all the criteria for Venice was calculated as 8.786 (i.e. 0.933 + (-0.147) + 8).
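The Venice calculation can be checked with a short Python sketch. The normalized cell values are those quoted in the text; the variable and function names are hypothetical and chosen only for illustration.

```python
# Normalized cell values for Venice as quoted in the text (Table 3).
venice_alpha = [0.435, 0.313, 0.185]          # port infrastructure
venice_beta = [0.002, 0.091, 0.024, 0.030]    # hinterland accessibility
venice_gamma = [1, 1, 0, 1, 1, 1, 1, 1, 1]    # multi-modality (Y = 1)


def consolidated_value(alpha_cells, beta_cells, gamma_cells):
    """Equations 2 and 3: per-category sums, with the beta total subtracted."""
    np_alpha = sum(alpha_cells)
    np_beta = sum(beta_cells)
    np_gamma = sum(gamma_cells)
    return np_alpha + (-np_beta) + np_gamma


n_con = consolidated_value(venice_alpha, venice_beta, venice_gamma)

print(round(sum(venice_alpha), 3))  # 0.933
print(round(sum(venice_beta), 3))   # 0.147
print(sum(venice_gamma))            # 8
print(round(n_con, 3))              # 8.786
```

Results are rounded to three decimals to match the precision reported in the text.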