Deprivation in Argentina

Published: 08-01-2021 | Version 1 | DOI: 10.17632/m4h5kbv4h2.1
Juan Primosich


The objective of this work is to construct a General Deprivation Index (GDI) using data from the Household Permanent Survey (HPS) for 32 agglomerates in the third quarter of 2017, applying the two-stage principal component analysis (PCA) procedure developed by Alberto Marradi (1981).


Steps to reproduce

The procedure consists of the following steps:
a) Select a group of indicators related to the conceptual dimensions studied.
b) Calculate a correlation matrix among all the variables included in the basket that will be submitted to PCA.
c) Examine the matrix to identify excessively high correlations (above 0.9 in absolute value) and eliminate one of the two variables involved from the basket, in order to avoid the consequences of collinearity.
d) Carry out a PCA and produce diagrams plotting the first component against the second, then against the third, to find the location of each variable in the space defined by the axes.
e) Starting from the examination of these diagrams, identify clusters of variables that occupy adjacent positions in the space of v-1 dimensions, and give each cluster a semantic label in the light of the variables that load most heavily on it.
f) On the basis of its position in the diagrams, assign each variable to one of these components, to more than one, or to none of them.
g) Carry out a separate PCA for each dimension identified in step (e); each analysis operates on one of the clusters identified in step (f). Two components are always extracted in each analysis in order to examine the location of the variables in the plane thus created.
h) By examining these diagrams, eliminate the variables that can be considered marginal compared to those forming the core of the dimension and whose semantic contribution is not deemed important.
i) In each of the separate PCAs, repeat steps (g) and (h) until a satisfactory result is achieved.
j) Once a satisfactory result is achieved, repeat the PCA, this time extracting only one component, so as to concentrate the maximum extractable variance in a single vector, and calculate the vectors of factor loadings and component score coefficients.
k) For each component (representing one dimension), if one or more variables have very low component score coefficients, or coefficients whose sign is opposite to that of the corresponding factor loadings, eliminate them from the basket and repeat the PCA.
l) For each component, repeat steps (j) and (k) until the dimension is represented by a small number of variables whose component score coefficients are reasonably balanced and have the same sign as the corresponding factor loadings.
m) Produce an index for each dimension by using the coefficients from the last PCA of the series to weight the standardized scores of all cases (individuals, territorial aggregates, etc.) on the variables retained in the basket of that last PCA.

These indices may then be related to one another (to determine the degree of association among the components representing the different dimensions) and to other variables of the same matrix, as in any data analysis.
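
The numerical parts of this procedure, steps (b), (c) and (j) to (m), can be illustrated with a minimal Python sketch. It is not the code used to build the GDI: the data are synthetic, the function names (standardize, drop_collinear, first_component, refine) are hypothetical, and the 0.1 cut-off for a "very low" coefficient is an assumption, since no value is stated above. Steps (d) to (i) depend on visual inspection of the component plots and are not automated here. In an unrotated PCA the component score coefficients equal the loadings divided by the eigenvalue, so their signs agree by construction; the sketch therefore checks only for low coefficients.

```python
import numpy as np

def standardize(X):
    """Return z-scores of each column (population standard deviation)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def drop_collinear(X, names, threshold=0.9):
    """Steps (b)-(c): drop one variable of every pair with |r| > threshold.
    Dropping the later column of the pair is an arbitrary choice."""
    R = np.corrcoef(X, rowvar=False)
    keep = list(range(X.shape[1]))
    for i in range(X.shape[1]):
        for j in range(i + 1, X.shape[1]):
            if i in keep and j in keep and abs(R[i, j]) > threshold:
                keep.remove(j)
    return X[:, keep], [names[k] for k in keep]

def first_component(Z):
    """Step (j): one-component PCA on the correlation matrix.
    Returns factor loadings and component score coefficients."""
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)   # eigenvalues in ascending order
    lam, v = eigvals[-1], eigvecs[:, -1]   # largest eigenvalue and its vector
    if v.sum() < 0:                        # fix the arbitrary sign
        v = -v
    loadings = v * np.sqrt(lam)
    coefs = v / np.sqrt(lam)
    return loadings, coefs

def refine(Z, names, min_coef=0.1):
    """Steps (k)-(l): remove the weakest variable and repeat the PCA until
    every coefficient reaches min_coef (an assumed cut-off)."""
    names = list(names)
    while True:
        loadings, coefs = first_component(Z)
        weakest = int(np.argmin(np.abs(coefs)))
        if abs(coefs[weakest]) >= min_coef or Z.shape[1] <= 2:
            return Z, names, loadings, coefs
        Z = np.delete(Z, weakest, axis=1)
        del names[weakest]

# Synthetic data: four indicators of one latent dimension, one unrelated
# variable, and one near-duplicate of x1 (all invented for illustration).
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
x = [latent + 0.75 * rng.normal(size=n) for _ in range(4)]
noise = rng.normal(size=n)
x1_copy = x[0] + 0.1 * rng.normal(size=n)
X = np.column_stack(x + [noise, x1_copy])
names = ["x1", "x2", "x3", "x4", "noise", "x1_copy"]

X_kept, kept_names = drop_collinear(X, names)          # steps (b)-(c)
Z = standardize(X_kept)
Z_final, final_names, loadings, coefs = refine(Z, kept_names)  # steps (j)-(l)
index = Z_final @ coefs                                # step (m)
```

Here index holds one score per case for the single dimension in the synthetic basket. With several dimensions, the same refine and weighting steps would be run once per cluster of variables, and the resulting indices could then be correlated with one another.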