According to Joel Best, professor
of sociology and criminal justice at the University of Delaware, people approach statistics from one
of four perspectives: awestruck, naïve, cynical, or critical. Most people (myself included) occupy one
of the first two categories. However, if we are to believe Edward Teller, father of the hydrogen bomb,
critical statistical thinking is increasingly in demand. Over the course of his career, Teller watched
his fellow scientists start out believing that anything, if studied long enough, could be understood, then
gradually recognize that we simply can’t know everything and so must make the best of uncertainty.
In characteristic wry fashion, Teller once quipped, "Physics seems to have explanations that are
somewhat complete for everything except life."
The geospatial industry is no stranger to uncertainty; researchers often must statistically analyze
incomplete or “fuzzy” spatial datasets to discover their hidden order, or confirm a lack thereof.
This column surveys emerging applications of exploratory spatial data analysis (ESDA) and its associated spatial-statistical tools.
Awestruck, curious, or just desperate?
ESDA, or the use of statistics to better understand a spatial dataset, is popular in fields such as
public policy analysis, marketing, social science, epidemiology, and geology. Ultimately, the desire
to discover the secrets hidden by incomplete or uncertain data drives all ESDA. In some scenarios,
the researcher is certain about a few locations, but has to guess at others. For example, mining
geologists analyze a limited number of successful extraction points to deduce new areas potentially
rich in oil, gas, or minerals — a practice called geostatistics. In some projects, the researcher has
multiple datasets with overlapping spatial coverage but uncertain inter-relationships. Epidemiologists,
for example, may guess that people living near nuclear facilities have an above-average risk of getting
cancer, and statistically compare the distribution of cancer cases with suspected radiation sources,
aggregating similar demographic groups and searching for clusters. In most ESDA efforts, researchers
compare individual spatial features to their nearest neighbors to measure spatial autocorrelation, the
degree to which nearby features resemble one another. For example, are violent crimes randomly distributed or, for whatever reason, more
concentrated in downtown clusters?
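To make the clustering question concrete, here is a minimal sketch of global Moran's I, one widely used measure of spatial autocorrelation. The column does not name a specific statistic, so this choice is an assumption. The 4-by-4 grid of made-up counts, the edge-sharing ("rook") neighbor rule, and the function names `morans_i` and `rook_weights` are all hypothetical and chosen only for illustration. Values near +1 suggest that similar values sit next to each other, values near -1 suggest a checkerboard-like alternation, and values near 0 suggest no spatial pattern.

```python
def rook_weights(rows, cols):
    """Binary spatial weights for a grid: cells that share an edge are neighbors.

    Returns a dict mapping (i, j) index pairs to a weight of 1.0.
    (Hypothetical helper; other neighbor rules are equally valid.)
    """
    weights = {}
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    weights[(i, rr * cols + cc)] = 1.0
    return weights


def morans_i(values, weights):
    """Global Moran's I for a list of values and a dict of pairwise weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]          # deviations from the mean
    total_weight = sum(weights.values())      # sum of all w_ij
    # Cross-products of deviations for every pair of neighbors
    cross = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


w = rook_weights(4, 4)

# Made-up data, row by row: incidents concentrated in the two left-hand columns
clustered = [1, 1, 0, 0,
             1, 1, 0, 0,
             1, 1, 0, 0,
             1, 1, 0, 0]

# Made-up data: incidents alternating like a checkerboard
checkerboard = [1, 0, 1, 0,
                0, 1, 0, 1,
                1, 0, 1, 0,
                0, 1, 0, 1]

clustered_i = morans_i(clustered, w)      # positive: like values are adjacent
checker_i = morans_i(checkerboard, w)     # negative: every neighbor differs

print(f"clustered:    {clustered_i:.3f}")
print(f"checkerboard: {checker_i:.3f}")
```

In practice, an analyst would more likely rely on an established spatial-statistics package than on hand-rolled code, and would test whether the statistic is significant (for example, by randomly permuting the values across locations) before drawing conclusions.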
Why bother with ESDA? When the underlying data contain uncertainty or limited sample points,
statistical analysis may be the only way to tease some truth from an otherwise puzzling dataset.
And even if no amount of analysis can squeeze true certainty from a fuzzy dataset, ESDA may at least
speed up the search by mapping areas most likely to yield results under closer scrutiny.
CSISS Center for Spatially Integrated Social Science
ESDA Exploratory Spatial Data Analysis
LISA Local Indicators of Spatial Association
NIH National Institutes of Health
TM Thematic Mapper