Summer school teaching unit:
<Obtaining spatial information using spatial point processes>

Jorge Mateu, June 9, 2004

Email: mateu@mat.uji.es

Web: http://www3.uji.es/~mateu

Hours

Session 1: June 9, 9.30-11.00: 90 minutes Theoretical Session

Session 2: June 9, 11.30-13.00: 90 minutes Practical Session

Contents

 

A spatial point pattern is a set of data consisting of n locations in an essentially planar region. Examples include locations of earthquake epicentres, cell nuclei in a microscopic tissue section or trees in a forest.

 

This course introduces the theory of spatial point processes and their statistical analysis. We present the basic theoretical set-up together with the most important theoretical models of spatial point processes, describe some Monte Carlo tests, and focus on simulation and estimation techniques. Finally, we develop specific analyses to detect anisotropy and orientation in spatial point processes.

 

Session 1:

 

The aim of this first session is to provide the basic theoretical concepts of spatial point processes, introducing students through motivating examples to an area of spatial statistics that is not widely known, yet is needed in many practical situations where spatial dynamics are present. A 90-minute session allows only a general overview and a brief outlook of the methodology, with the aim of making students aware of these techniques and stimulating their interest in the field.

 

In summary, Session 1 will cover the following topics:

 

1. Introduction

 

1.1 Motivation

 

We present several spatial point patterns and pose possible questions such as: (a) Could all patterns have been generated by the same stochastic process? (b) If not, how would you describe the differences amongst them? (c) What kinds of stochastic models might be plausible for each pattern?

 

1.2 A general view

 

Point processes provide models for patterns of points. The mathematical theory was first developed in order to solve various problems where it is sensible to model the locations of events as random. Indeed, the study of spatial point patterns has a long history in ecology and forestry (Goodall, 1952, 1970; Pielou, 1977; Ripley, 1987b). Spatial point patterns have also found application in fields as diverse as archeology (Hodder and Orton, 1976), cosmology (Neyman and Scott, 1958), geography (Cliff and Ord, 1981), seismology (Ogata, 1989) and epidemiology (Diggle, 1993).

 

A benchmark hypothesis to be tested is Complete Spatial Randomness (CSR), which asserts that the points are randomly and independently distributed in the region of interest. Rejection of CSR is a minimal prerequisite for any serious attempt to model an observed pattern, since CSR acts as a dividing hypothesis between regular and aggregated patterns. Several distance-based tests, together with Monte Carlo tests (Diggle, 1983; Mateu, 1995), may be used to test the CSR hypothesis, based on (a) nearest neighbour distances and (b) point to nearest event distances.
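
As a rough illustration of a distance-based Monte Carlo test of CSR, the following Python sketch compares the mean nearest neighbour distance of an observed pattern with its values under simulated CSR patterns having the same number of points. It assumes a unit square window and ignores edge effects; the function names, the 99 simulations and the rank-based p-values are illustrative assumptions, not the course's own implementation.

```python
import math
import random


def mean_nn_distance(points):
    """Mean distance from each event to its nearest neighbouring event."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


def csr_monte_carlo_test(points, n_sim=99, seed=0):
    """Monte Carlo test of CSR on the unit square (assumed window).

    Returns (observed statistic, p_clustered, p_regular); each p-value is
    based on the rank of the observed statistic among the simulated ones.
    """
    rng = random.Random(seed)
    n = len(points)
    observed = mean_nn_distance(points)
    simulated = [
        mean_nn_distance([(rng.random(), rng.random()) for _ in range(n)])
        for _ in range(n_sim)
    ]
    # Small distances suggest aggregation, large distances suggest regularity.
    p_clustered = (1 + sum(s <= observed for s in simulated)) / (n_sim + 1)
    p_regular = (1 + sum(s >= observed for s in simulated)) / (n_sim + 1)
    return observed, p_clustered, p_regular


# Example: a 5 x 5 regular grid of points on the unit square.
grid = [(0.1 + 0.2 * i, 0.1 + 0.2 * j) for i in range(5) for j in range(5)]
stat, p_clu, p_reg = csr_monte_carlo_test(grid)
print(f"mean NN distance = {stat:.3f}, "
      f"p_clustered = {p_clu:.2f}, p_regular = {p_reg:.2f}")
```

A small p_regular points towards a regular pattern and a small p_clustered towards an aggregated one; with 99 simulations the smallest attainable p-value is 0.01.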

 

In the analysis of point patterns, statistical modelling of locations of interacting objects is studied by means of Gibbs point processes. They are built from local interactions between the individuals. Considering pairwise interaction processes, only interactions among pairs of objects are taken into account and a pair potential function or a pairwise interaction function is used to describe interactions between the individuals. The pair potential function is parametrized and the parameters have to be estimated.
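
To make the pairwise interaction idea concrete, the sketch below evaluates the unnormalised log-density of a Strauss process, one common pairwise interaction model in which every pair of points closer than a distance r contributes a factor gamma. The choice of the Strauss model and the parameter names beta, gamma and r are assumptions made for illustration only.

```python
import math


def close_pairs(points, r):
    """Number of unordered pairs of points at distance at most r."""
    n = len(points)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) <= r
    )


def strauss_log_density(points, beta, gamma, r):
    """Unnormalised log-density n*log(beta) + s*log(gamma) of a Strauss model.

    beta > 0 controls abundance, 0 <= gamma <= 1 the strength of inhibition
    and r the interaction range; the scaling factor is deliberately left out.
    """
    s = close_pairs(points, r)
    if gamma == 0.0:
        # Hard-core case: any pair closer than r is forbidden.
        return -math.inf if s > 0 else len(points) * math.log(beta)
    return len(points) * math.log(beta) + s * math.log(gamma)


# Three points, one pair of which lies within the interaction range.
pattern = [(0.10, 0.10), (0.14, 0.10), (0.60, 0.70)]
print(strauss_log_density(pattern, beta=2.0, gamma=0.5, r=0.1))
```

Here gamma = 1 removes the interaction altogether, while smaller values of gamma penalise close pairs more strongly.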

 

Gibbs point processes first appeared in the theory of statistical physics, where Gibbs distributions are applied to describe the equilibrium states of closed physical systems of interacting objects. In mathematical statistics Gibbs point processes are used as models of spatial point patterns.

 

The parameters of the point process can be estimated with an optimization-based method such as maximum likelihood estimation (MLE). The major problem in applying MLE is that the likelihood function contains an unknown scaling factor (the normalizing constant), which is intractable, so the method cannot be applied directly. Several approximate maximum likelihood methods are known. Ogata and Tanemura (1981, 1984, 1985) and Penttinen (1984) suggested sparse data approximations; these are asymptotic methods based on strong assumptions. Ogata and Tanemura (1985) also applied their approximations to marked point processes. A more feasible alternative is the computer-intensive Markov chain Monte Carlo maximum likelihood method, which simulates ergodic Markov chains whose equilibrium distributions belong to the model. Penttinen (1984) suggested solving the likelihood equation by a stochastic Newton-Raphson algorithm, whereas Moyeed and Baddeley (1991) suggested a stochastic approximation.
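
To see where the difficulty comes from, consider a sketch in generic notation (the symbols h_theta, Z(theta), W and theta_0 are introduced here for illustration and are not taken from the course notes). Writing the density of the process with respect to the unit-rate Poisson process on the window W as an unnormalised part divided by the scaling factor gives

```latex
f_\theta(x) = \frac{h_\theta(x)}{Z(\theta)}, \qquad
Z(\theta) = \sum_{n=0}^{\infty} \frac{e^{-|W|}}{n!}
  \int_{W^n} h_\theta(\{u_1,\dots,u_n\}) \, du_1 \cdots du_n .
```

The sum of integrals has no closed form for most interaction models. Monte Carlo likelihood methods typically rest on the importance sampling identity below, which holds provided that h_{theta_0} is positive wherever h_theta is, and in which X_1, ..., X_m are patterns simulated from the model at a fixed reference value theta_0:

```latex
\frac{Z(\theta)}{Z(\theta_0)}
  = \mathbb{E}_{\theta_0}\!\left[ \frac{h_\theta(X)}{h_{\theta_0}(X)} \right]
  \approx \frac{1}{m} \sum_{k=1}^{m} \frac{h_\theta(X_k)}{h_{\theta_0}(X_k)},
\qquad
\log \frac{f_\theta(x)}{f_{\theta_0}(x)}
  = \log \frac{h_\theta(x)}{h_{\theta_0}(x)} - \log \frac{Z(\theta)}{Z(\theta_0)} .
```

The log-likelihood ratio on the right can then be approximated and maximised over theta without ever computing Z(theta) itself.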

 

The problem of the scaling factor can be bypassed by using pseudo-likelihood estimation instead of maximum likelihood. The pseudo-likelihood function does not contain the scaling factor and is easier to calculate. The method was first developed for Markov random fields (Besag, 1974) and later for Strauss point processes (Besag, 1978).
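
A minimal sketch of the idea, assuming a Strauss model on the unit square: the log pseudo-likelihood is the sum of the log conditional intensities at the data points minus the integral of the conditional intensity over the window, and no scaling factor appears. The function names, the parameter names beta, gamma and r, and the midpoint-grid approximation of the integral are illustrative assumptions.

```python
import math


def neighbours_within(u, points, r):
    """Number of points of the pattern within distance r of location u."""
    return sum(1 for p in points if math.dist(u, p) <= r)


def strauss_log_pseudolikelihood(points, beta, gamma, r, grid=50):
    """Log pseudo-likelihood of a Strauss model on the unit square.

    log PL = sum_i log lambda(x_i | x without x_i) - integral_W lambda(u | x) du,
    with conditional intensity lambda(u | x) = beta * gamma**t(u, x), where
    t(u, x) counts the points of x within distance r of u.
    """
    if beta <= 0 or gamma <= 0:
        raise ValueError("beta and gamma must be positive in this sketch")
    # Sum over the data points of the log conditional intensity.
    log_sum = 0.0
    for i, p in enumerate(points):
        others = points[:i] + points[i + 1:]
        t = neighbours_within(p, others, r)
        log_sum += math.log(beta) + t * math.log(gamma)
    # Midpoint-rule approximation of the integral over the unit square.
    cell_area = 1.0 / (grid * grid)
    integral = 0.0
    for a in range(grid):
        for b in range(grid):
            u = ((a + 0.5) / grid, (b + 0.5) / grid)
            integral += beta * gamma ** neighbours_within(u, points, r) * cell_area
    return log_sum - integral


# Crude grid search over a few candidate parameter values (toy data).
pattern = [(0.1, 0.1), (0.15, 0.12), (0.5, 0.5), (0.8, 0.3), (0.3, 0.85)]
candidates = [(b, g) for b in (2.0, 5.0, 10.0, 20.0) for g in (0.25, 0.5, 1.0)]
best = max(
    candidates,
    key=lambda bg: strauss_log_pseudolikelihood(pattern, bg[0], bg[1], r=0.1),
)
print("grid-search maximiser (beta, gamma):", best)
```

With gamma = 1 the expression reduces to the log-likelihood of a homogeneous Poisson process, n log(beta) - beta on the unit square, which gives a simple sanity check of the implementation.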

 

2. Theory setup

 

2.1 Second-order properties

 

2.2 Extension to multivariate processes

 

2.3 Estimation of Second-Order Properties for Univariate processes

 

2.4 Estimation of Second-Order Properties for Multivariate processes

 

2.5 Empty space and nearest neighbour distributions

 

2.6 Estimation of empty space and nearest neighbour properties

 

2.7 Fundamentals in the theory of Marked Point Processes

 

2.8 Estimation of Marked Point Processes characteristics

 

3. Models for spatial point processes

 

3.1 Complete spatial randomness: the homogeneous planar Poisson process

3.2 The inhomogeneous planar Poisson process

3.3 The Cox process

3.4 The Poisson cluster process

3.5 Gibbs and Pairwise interaction point processes
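
As a hedged illustration of two of the models listed above, the following sketch simulates a homogeneous planar Poisson process and a Poisson cluster process of Thomas type on the unit square. The unit square window, the Gaussian offspring displacements, the enlargement of the parent window by 4 * sigma and all function and parameter names are assumptions made for this example.

```python
import random


def poisson_count(mean, rng):
    """Poisson variate: number of unit-rate exponential arrivals in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_poisson(intensity, rng):
    """Homogeneous planar Poisson process (CSR) on the unit square."""
    n = poisson_count(intensity, rng)
    return [(rng.random(), rng.random()) for _ in range(n)]


def simulate_thomas(kappa, mu, sigma, rng):
    """Poisson cluster (Thomas-type) process observed on the unit square.

    Parents form a Poisson process of intensity kappa on a window enlarged
    by 4 * sigma, so that clusters centred just outside can still contribute;
    each parent has Poisson(mu) offspring with Gaussian displacements.
    """
    margin = 4.0 * sigma
    side = 1.0 + 2.0 * margin
    offspring = []
    for _ in range(poisson_count(kappa * side * side, rng)):
        px = -margin + side * rng.random()
        py = -margin + side * rng.random()
        for _ in range(poisson_count(mu, rng)):
            x, y = rng.gauss(px, sigma), rng.gauss(py, sigma)
            if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
                offspring.append((x, y))
    return offspring


rng = random.Random(42)
csr_pattern = simulate_poisson(intensity=50.0, rng=rng)
cluster_pattern = simulate_thomas(kappa=10.0, mu=5.0, sigma=0.03, rng=rng)
print(len(csr_pattern), "CSR points;", len(cluster_pattern), "clustered points")
```

Both patterns have an expected intensity of roughly 50 points per unit area, so any visible difference between them is due to clustering rather than abundance.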

 

4. Monte Carlo Tests (MCT) and MCT-based measures of Complete Spatial Randomness

 

4.1 Test of CSR using K function-based envelopes

4.2 Test of CSR using F-G

4.3 Cluster investigation using K functions

4.4 Random labelling using K functions

4.5 Analyzing independence

4.6 Simulating a cluster pattern
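
The envelope idea behind item 4.1 can be sketched as follows: estimate the K function of the observed pattern, repeat the estimation on patterns simulated under CSR with the same number of points, and check whether the observed curve leaves the band spanned by the simulations. The sketch assumes a unit square window, uses no edge correction (the same uncorrected estimator is applied to data and simulations) and takes 39 simulations; these choices and all names are illustrative assumptions.

```python
import math
import random


def k_estimate(points, radii, area=1.0):
    """Ripley's K estimate without edge correction (simplifying assumption)."""
    n = len(points)
    dists = [
        math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    # Each unordered pair counts twice in the sum over ordered pairs.
    return [
        2.0 * area * sum(d <= r for d in dists) / (n * (n - 1)) for r in radii
    ]


def csr_envelope(n, radii, n_sim=39, seed=0):
    """Pointwise min/max envelopes of the K estimate under CSR with n points."""
    rng = random.Random(seed)
    sims = [
        k_estimate([(rng.random(), rng.random()) for _ in range(n)], radii)
        for _ in range(n_sim)
    ]
    lower = [min(s[k] for s in sims) for k in range(len(radii))]
    upper = [max(s[k] for s in sims) for k in range(len(radii))]
    return lower, upper


# Toy clustered pattern: 50 points scattered tightly around (0.5, 0.5).
rng = random.Random(1)
observed = [(rng.gauss(0.5, 0.03), rng.gauss(0.5, 0.03)) for _ in range(50)]
radii = [0.02 * k for k in range(1, 11)]
k_obs = k_estimate(observed, radii)
lower, upper = csr_envelope(len(observed), radii)
for r, lo, k, hi in zip(radii, lower, k_obs, upper):
    flag = "above" if k > hi else "below" if k < lo else "inside"
    print(f"r = {r:.2f}: K_obs = {k:.4f}, envelope = [{lo:.4f}, {hi:.4f}] ({flag})")
```

An observed curve above the upper envelope suggests aggregation at that distance, and one below the lower envelope suggests regularity.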

 

5. Simulation techniques of Gibbs point processes

 

5.1 Birth-and-death algorithms

5.2 Metropolis-Hastings algorithms

5.3 Exact simulation
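
As an illustration of the kind of algorithm covered in this block, the sketch below simulates a Strauss process on the unit square with a birth-death Metropolis-Hastings sampler. The Strauss model, the equal probabilities of proposing a birth or a death, the empty starting pattern and all names are assumptions of this example; it is a sketch rather than the course's implementation, and it says nothing about how many steps are needed for convergence.

```python
import math
import random


def conditional_intensity(u, points, beta, gamma, r):
    """Strauss conditional intensity beta * gamma**t, t = neighbours within r."""
    t = sum(1 for p in points if math.dist(u, p) <= r)
    return beta * gamma ** t


def simulate_strauss(beta, gamma, r, n_steps=5000, seed=0):
    """Birth-death Metropolis-Hastings chain on the unit square (area 1)."""
    rng = random.Random(seed)
    points = []  # start from the empty pattern
    for _ in range(n_steps):
        if rng.random() < 0.5:
            # Birth: propose a uniform location, accept with probability
            # min(1, lambda(u | x) * area / (n + 1)), with area = 1.
            u = (rng.random(), rng.random())
            ratio = conditional_intensity(u, points, beta, gamma, r)
            ratio /= len(points) + 1
            if rng.random() < ratio:
                points.append(u)
        elif points:
            # Death: propose removing a uniformly chosen point, accept with
            # probability min(1, n / (lambda(x_i | x without x_i) * area)).
            i = rng.randrange(len(points))
            others = points[:i] + points[i + 1:]
            lam = conditional_intensity(points[i], others, beta, gamma, r)
            if lam == 0.0 or rng.random() < len(points) / lam:
                points = others
    return points


pattern = simulate_strauss(beta=100.0, gamma=0.2, r=0.05)
print(len(pattern), "points in the final state of the chain")
```

Setting gamma = 1 gives a chain whose equilibrium is a Poisson process of intensity beta, and gamma = 0 gives a hard-core process in which no two points are closer than r.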

 

6. Estimation procedures for Gibbs point processes

 

6.1 Approximate Likelihood inference

6.2 Monte Carlo Likelihood inference

6.3 Pseudo-likelihood inference

6.4 Edge corrections

 

7. Anisotropy and Orientation analysis

 

7.1 Anisotropic characteristics of particle locations

7.2 Anisotropic characteristics of particle locations and sizes

7.3 Appendix: formulae for estimation

7.4 Appendix: Anisotropy test

7.5 Real-case studies

 

8. LISA functions for local product densities

 

 

 

Session 2:

 

The 90-minute practical session will explore spatial dynamics through the practical analysis of spatial point processes, using software that is freely available on the internet. In particular, we shall discuss the following software packages:

 

(a)    R software (http://cran.r-project.org/) with the corresponding libraries

SPLANCS (http://cran.us.r-project.org/src/contrib/Descriptions/splancs.html) and

SPATSTAT (http://cran.r-project.org/src/contrib/Descriptions/spatstat.html).

They provide a GIS-like environment for the display and manipulation of spatial point patterns, together with an implementation of many of the analytic methods covered in this course.

 

(b) SPATSTATVIEWER (http://fortius.act.uji.es/SSV/red.asp?rd=def.htm)

This software is still under construction and provides windows-based tools for the analysis of point processes.

 

Students will be guided through several practical exercises using one or more of the above software packages, to get a feeling for what modelling spatial point patterns involves and to answer some interesting questions. The basic steps for the practical session will be:

 

1. Reading and mapping spatial point processes

2. Simulating completely random spatial point patterns

3. Estimation of first and second-order descriptive characteristics of a point pattern

4. Testing CSR hypothesis

5. Monte-Carlo tests

6. Testing random labelling and independence for bivariate point processes

7. Fitting models: Estimating parameters

 

 

Goals

 

The goal is that students completing this teaching unit should be able to answer the following questions and meet the following goals:

 

- Identification of the necessity of spatial point pattern analysis

- (a) Could several patterns have been generated by the same stochastic process? (b) If not, how would we describe the differences amongst them? (c) What kinds of stochastic models might be plausible for each pattern?

- Are the locations spatially clustered? Do they tend to be regularly distributed, or are they random (i.e., a realization of a homogeneous Poisson process)?

- Do two different species of tree tend to occur together? Are locations of cancer cases more clustered than a random subset of a control group?

- What is the average density of trees in an area? What does a map of density look like?

- Can we describe a point pattern through the use of first and second-order characteristics?

- Can we distinguish amongst several possible models and consequently fit theoretical models of point processes?

- Can we simulate a fitted model?

- Can we analyze orientations in a point pattern?

- Critical discussion of practical modeling

 

Students’ preparation in advance

 

Required readings:

 

Students are expected to read the following documents as basic preparation for this course:

 

(a)    Specific notes with the contents of the course: course-mateu.pdf

(b)   Paper entitled Spatial Point Processes: An Overview: mateu-gregori.pdf

(c)    Paper entitled Statistical tools for Spatial Economics: mateu-albert.pdf

(d)   Paper entitled On kernel estimators of second-order measures for spatial point processes: mateu-montene.pdf

 

Additional books that may be consulted for the course preparation:

 

(a)    Diggle’s book entitled Statistical Analysis of Spatial Point Patterns (second edition). London: Edward Arnold (2003).

(b)   Proceedings book entitled Spatio-Temporal Modelling of Environmental Processes, edited by J. Mateu & F. Montes. ISBN: 84-8021-368-X (2001).

(c)    Proceedings book entitled Spatial Point Process Modelling and its Applications, edited by A. Baddeley, P. Gregori, J. Mateu, R. Stoica & D. Stoyan. ISBN: 84-8021-475-9 (2004).

 

 

Software downloads or URLs

 

Software requirements:

At each of the URLs shown below, students can find the corresponding software together with sufficient documentation in English for proper installation and use.

 

(a)    R software (http://cran.r-project.org/) with the corresponding libraries:

SPLANCS: Spatial and Space-Time Point Pattern Analysis

(http://cran.us.r-project.org/src/contrib/Descriptions/splancs.html)

SPATSTAT: Analysis of spatial point patterns

(http://cran.r-project.org/src/contrib/Descriptions/spatstat.html).

They provide a GIS-like environment for the display and manipulation of spatial point patterns, together with an implementation of many of the analytic methods covered in this course.

 

(b) SPATSTATVIEWER (http://fortius.act.uji.es/SSV/red.asp?rd=def.htm)

This software is still under construction and provides windows-based tools for the analysis of point processes.

 

URLs for data sets:

 

(a) http://www.maths.lancs.ac.uk/~diggle/

(b) http://www3.uji.es/~mateu