Comparative Archaeology Database  |  Center for Comparative Archaeology  |  University of Pittsburgh



Rosario Valley Architectural Gini Coefficient and Neighborhood Dataset

Kyle Shaw-Müller and John P. Walden



The data files below provide mound volume measurements for Rosario. The mounds are listed from smallest to largest volume, accompanied by their Lorenz curve calculations. In the data, "mounds" are synonymous with "structures." Structures are the basic unit of analysis and consist of heavily eroded platforms (i.e., mounds) upon which domestic superstructures would have been built.

DOWNLOAD:

[Comma delimited UTF-8 format]
[Excel format]

FILE METADATA:

Each line in the .CSV file represents one mound. There are 1978 lines, each with 13 variables separated by commas. The variables are listed in the following order:

1 Site Name -- a code used to identify sites.
2 Volume of Mound in Cubic Meters
3 Wide Method: f' - An index of change: the difference between the prior and subsequent observations, divided by two and rounded down.
4 Wide Method: f'' - An index of acceleration: the difference between the f' values of the prior and subsequent observations, divided by two and rounded down.
5 Narrow Method: f' - An index of change: the difference between the observation and the one that follows, divided by two and rounded down.
6 Narrow Method: f'' - An index of acceleration: the difference between the f' value of the observation and that of the prior observation, divided by two and rounded down.
7 Individual # -- a running number that counts the sites used for mound calculations.
8 Pop. frac: The share of the total population that the mound represents; i.e., 1/[total number of observations].
9 Income frac: The share of the population's total volume that the mound represents; i.e., [Mound Volume]/[Sum of Volume from all observations].
10 Line of Equality sum pop.: The cumulative sum of the “population fraction” values, reaching 1 on the final observation.
11 Lorenz Curve sum income: The cumulative sum of the “income fraction” values, reaching 1 on the final observation.
12 G(i)*F(i+1): One of two product terms relating the line of equality and the Lorenz curve at consecutive observations. Summed over all observations and compared with the sum of variable 13, it is used to estimate the area between the two curves and consequently the Gini coefficient.
13 G(i+1)*F(i): The counterpart of variable 12, with the offset between consecutive observations reversed. It is used in the same way to estimate the area between the line of equality and the Lorenz curve and consequently the Gini coefficient.
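
As an illustration of how variables 7 through 13 relate to the mound volumes, the following Python sketch rebuilds them from a list of volumes. It is not the authors' own calculation. It assumes that F denotes the cumulative population share (variable 10) and G the cumulative volume share (variable 11), and that the Gini coefficient is obtained as the sum of the G(i+1)*F(i) terms minus the sum of the G(i)*F(i+1) terms, which is the usual trapezoid estimate of the area between the two curves. The function name and dictionary keys are hypothetical.

```python
from itertools import accumulate


def lorenz_columns(volumes):
    """Rebuild the Lorenz-curve columns and a Gini estimate from mound volumes.

    Assumption: F = cumulative population share, G = cumulative volume share.
    Volumes are sorted ascending, matching the ordering of the data file.
    """
    vols = sorted(volumes)
    n = len(vols)
    total = sum(vols)

    individual = list(range(1, n + 1))            # variable 7
    pop_frac = [1 / n] * n                        # variable 8
    income_frac = [v / total for v in vols]       # variable 9
    sum_pop = list(accumulate(pop_frac))          # variable 10 (F, assumed)
    sum_income = list(accumulate(income_frac))    # variable 11 (G, assumed)

    # Product terms between observation i and the one after it.
    # The final observation has no successor, so its entry is None.
    g_i_f_next = [sum_income[i] * sum_pop[i + 1] for i in range(n - 1)] + [None]
    g_next_f_i = [sum_income[i + 1] * sum_pop[i] for i in range(n - 1)] + [None]

    # Trapezoid estimate of the Gini coefficient from the two sets of terms.
    gini = sum(g_next_f_i[:-1]) - sum(g_i_f_next[:-1])

    return {
        "individual": individual,
        "pop_frac": pop_frac,
        "income_frac": income_frac,
        "sum_pop": sum_pop,
        "sum_income": sum_income,
        "g_i_f_next": g_i_f_next,    # variable 12
        "g_next_f_i": g_next_f_i,    # variable 13
        "gini": gini,
    }


if __name__ == "__main__":
    # Two mounds of 1 and 3 cubic meters give a Gini estimate of 0.25.
    print(lorenz_columns([1, 3])["gini"])
```

Under this reading, the first observation in a file of 1978 mounds would have G(1)*F(2) = 6.97672E-05 * (2/1978), or about 7.054E-08, which agrees with the value in the first line of the .CSV file.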

The first line of the .CSV file, for example, looks like this:

RV120-2,4,,,0,,1,0.000505561,6.97672E-05,0.000505561,6.97672E-05,7.05432E-08,7.05432E-08

This means that RV120-2 has a volume of 4 cubic meters. The four variables that follow ("Wide method" and "Narrow method" f' and f'') show how quickly the volume changes between consecutive values; RV120-2 has no value or zero for these variables because it begins the sequence of observations. RV120-2 is the first observation ("Individual #"=1) of this analysis. It makes up about 0.05% of the total population ("Pop. frac"=0.000505561) and has 6.97672E-05 of the population's total volume ("Income frac"=6.97672E-05). The cumulative share of the population counted with this site is about 0.05% ("sum pop."=0.000505561), and the cumulative share of the volume counted with this site is 6.97672E-05 ("sum income"=6.97672E-05). The two terms used to estimate the area between the line of equality and the Lorenz curve at this point are both 7.05432E-08 ("G(i)*F(i+1)"=7.05432E-08; "G(i+1)*F(i)"=7.05432E-08).