## Marine Biogeochemical Data Assimilation Symposium in Hobart, 27th-30th May

Hello,

At the end of May, CSIRO (Marine and Atmospheric Research, Hobart), and in particular Emlyn Jones, is organising a conference on this topic, subtitled:

New Pathways to Understanding and Managing Marine Ecosystems: Quantifying Uncertainty and Risk Using Biophysical-Statistical Models of the Marine Environment

Here is an example of what marine biogeochemical data assimilation is about. Suppose you want to model the population sizes of phytoplankton and zooplankton, as is done in "A Bayesian approach to state and parameter estimation in a Phytoplankton-Zooplankton model". Marine biologists can formulate a model based on their knowledge of the phenomenon, and in this case they propose a system of differential equations (they call this one the PZ model):

[Equations of the PZ model not shown.]

This model describes the interaction between the two species' population sizes and, although it is very simple compared to other models in the area, it already represents a non-trivial phenomenon: when there is more phytoplankton, the zooplankton have more food, so they grow through the grazing term; then both populations decrease through their respective loss terms (grazing for the phytoplankton, mortality for the zooplankton); then the phytoplankton population size increases again, and so on. It's a Lotka-Volterra model describing a predator/prey interaction. The randomness comes from the phytoplankton growth rate being a Normal random variable, drawn at every integer time, corresponding to each day; it reflects that the growth rate of phytoplankton can be different from one day to the other. The observations are daily, noisy measurements of the phytoplankton population size, the zooplankton never being measured.
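To make the description concrete, here is one way such a model could be written down and simulated. The notation is my own assumption for illustration and may not match the paper's: $p$ and $z$ are the phytoplankton and zooplankton population sizes, $\alpha$ is the phytoplankton growth rate, $c$ a grazing rate, $e$ a grazing efficiency, and $m_l$, $m_q$ linear and quadratic zooplankton mortality rates:

$$
\frac{dp}{dt} = \alpha\, p - c\, p\, z, \qquad
\frac{dz}{dt} = e\, c\, p\, z - m_l\, z - m_q\, z^2, \qquad
\alpha \sim \mathcal{N}(\mu_\alpha, \sigma_\alpha^2) \text{ redrawn each day.}
$$

The sketch below simulates this assumed system in plain Python. The parameter values, the initial state and the log-normal observation noise are all arbitrary choices made for the example, not values from the paper.

```python
import math
import random

def step_one_day(p, z, alpha, c=0.25, e=0.3, m_l=0.1, m_q=0.1, n_sub=100):
    """Advance (p, z) by one day, holding the growth rate alpha fixed.

    Uses Euler steps on log(p) and log(z), which keeps both populations positive.
    """
    dt = 1.0 / n_sub
    for _ in range(n_sub):
        rate_p = alpha - c * z                 # per-capita rate of change of p
        rate_z = e * c * p - m_l - m_q * z     # per-capita rate of change of z
        p, z = p * math.exp(dt * rate_p), z * math.exp(dt * rate_z)
    return p, z

def simulate(n_days, p0=2.0, z0=2.0, mu_alpha=0.7, sigma_alpha=0.5,
             sigma_y=0.2, seed=1):
    """Return daily hidden states [(p, z), ...] and noisy observations of p."""
    rng = random.Random(seed)
    p, z = p0, z0
    states, observations = [], []
    for _ in range(n_days):
        alpha = rng.gauss(mu_alpha, sigma_alpha)   # new growth rate each day
        p, z = step_one_day(p, z, alpha)
        states.append((p, z))
        # Assumed observation model: multiplicative log-normal noise on p only.
        observations.append(p * math.exp(rng.gauss(0.0, sigma_y)))
    return states, observations
```

Calling `simulate(30)` gives thirty days of hidden `(p, z)` pairs together with the noisy phytoplankton measurements that an analyst would actually see; the zooplankton series stays hidden, as in the setting described above.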

The questions are numerous: can we estimate the zooplankton population size from the phytoplankton population size (a problem called filtering) under parameter uncertainty? Can we predict both time series under parameter uncertainty? Can we estimate the parameters, which all have a biological interpretation (grazing rates, growth rates, mortality rates, etc.)? If we have competing models, can we use the data to decide which one is the most accurate under parameter uncertainty (a problem called model choice)? If we can for this simple model, can we also do all of that for more complex models, where both the processes and the parameters are high-dimensional?
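As an illustration of the filtering question only, here is a minimal sketch of a bootstrap particle filter that estimates the hidden zooplankton size from phytoplankton observations. It holds the parameters fixed, so it does not yet address parameter uncertainty. The model form (a Lotka-Volterra system with a daily random growth rate and log-normal observation noise), the parameter values and the initial particle distribution are assumptions I made for the example.

```python
import math
import random

# Illustrative values (assumptions, not taken from the paper).
C, E, M_L, M_Q = 0.25, 0.3, 0.1, 0.1
MU_ALPHA, SIGMA_ALPHA, SIGMA_Y = 0.7, 0.5, 0.2

def step_one_day(p, z, alpha, n_sub=50):
    """One day of the assumed PZ dynamics, Euler steps on the log scale."""
    dt = 1.0 / n_sub
    for _ in range(n_sub):
        rate_p = alpha - C * z
        rate_z = E * C * p - M_L - M_Q * z
        p, z = p * math.exp(dt * rate_p), z * math.exp(dt * rate_z)
    return p, z

def simulate(n_days, seed=1):
    """Synthetic data: hidden states and noisy observations of p."""
    rng = random.Random(seed)
    p, z = 2.0, 2.0
    states, observations = [], []
    for _ in range(n_days):
        p, z = step_one_day(p, z, rng.gauss(MU_ALPHA, SIGMA_ALPHA))
        states.append((p, z))
        observations.append(p * math.exp(rng.gauss(0.0, SIGMA_Y)))
    return states, observations

def bootstrap_filter(observations, n_particles=200, seed=0):
    """Return the filtering mean of z for each day."""
    rng = random.Random(seed)
    # Assumed initial distribution: log-normal scatter around (2, 2).
    particles = [(2.0 * math.exp(rng.gauss(0.0, 0.2)),
                  2.0 * math.exp(rng.gauss(0.0, 0.2)))
                 for _ in range(n_particles)]
    z_means = []
    for y in observations:
        # 1. Propagate each particle with its own random growth rate.
        particles = [step_one_day(p, z, rng.gauss(MU_ALPHA, SIGMA_ALPHA))
                     for p, z in particles]
        # 2. Weight by the observation density (log scale for stability).
        log_w = [-0.5 * ((math.log(y) - math.log(p)) / SIGMA_Y) ** 2
                 for p, _ in particles]
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        w = [wi / total for wi in w]
        # 3. Weighted average of z is the filtering estimate for this day.
        z_means.append(sum(wi * z for wi, (_, z) in zip(w, particles)))
        # 4. Multinomial resampling.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return z_means
```

With `states, obs = simulate(30)`, the list `bootstrap_filter(obs)` can be compared day by day against the true zooplankton values in `states`, which is only possible here because the data are synthetic.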

The fact that we want to do these things “under parameter uncertainty” obviously makes everything way more challenging. To be clear, this means that the parameters are not fixed (or estimated in a first stage); they are simply given a prior distribution. At the moment scientists can perform most of the tasks mentioned above for reasonably simple models such as the PZ model, but not for the craziest ones with million-dimensional hidden processes (e.g. spatial state space models), at least not without additional approximations. The conference will be an opportunity to discuss the computational issues currently experienced by practitioners and future improvements to make the methods scalable.
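To show what giving a parameter a prior can look like in practice, here is a rough sketch of particle marginal Metropolis-Hastings, one of the Particle MCMC algorithms, for a single unknown parameter: the mean daily growth rate, called `mu_alpha` here. Everything specific is an assumption for the example: the form of the model, the fixed values of the other parameters, the Uniform(0, 2) prior, the known initial state and the random-walk step size.

```python
import math
import random

# Fixed for simplicity (assumed values); only mu_alpha is treated as unknown.
C, E, M_L, M_Q = 0.25, 0.3, 0.1, 0.1
SIGMA_ALPHA, SIGMA_Y = 0.5, 0.2

def step_one_day(p, z, alpha, n_sub=20):
    """One day of the assumed PZ dynamics, Euler steps on the log scale."""
    dt = 1.0 / n_sub
    for _ in range(n_sub):
        rate_p = alpha - C * z
        rate_z = E * C * p - M_L - M_Q * z
        p, z = p * math.exp(dt * rate_p), z * math.exp(dt * rate_z)
    return p, z

def simulate_obs(n_days, mu_alpha=0.7, seed=1):
    """Synthetic noisy observations of p, starting from (p, z) = (2, 2)."""
    rng = random.Random(seed)
    p, z = 2.0, 2.0
    observations = []
    for _ in range(n_days):
        p, z = step_one_day(p, z, rng.gauss(mu_alpha, SIGMA_ALPHA))
        observations.append(p * math.exp(rng.gauss(0.0, SIGMA_Y)))
    return observations

def log_lik_estimate(observations, mu_alpha, rng, n_particles=50):
    """Bootstrap particle filter estimate of the log-likelihood of mu_alpha.

    Constants that do not depend on mu_alpha are dropped, since they cancel
    in the Metropolis-Hastings acceptance ratio.
    """
    particles = [(2.0, 2.0)] * n_particles     # assumed known initial state
    total = 0.0
    for y in observations:
        particles = [step_one_day(p, z, rng.gauss(mu_alpha, SIGMA_ALPHA))
                     for p, z in particles]
        log_w = [-0.5 * ((math.log(y) - math.log(p)) / SIGMA_Y) ** 2
                 for p, _ in particles]
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        total += top + math.log(sum(w) / n_particles)   # log of mean weight
        particles = rng.choices(particles, weights=w, k=n_particles)
    return total

def pmmh(observations, n_iter=100, step=0.15, seed=0):
    """Random-walk PMMH chain for mu_alpha under a Uniform(0, 2) prior."""
    rng = random.Random(seed)
    mu = 1.0
    ll = log_lik_estimate(observations, mu, rng)
    chain = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)
        if 0.0 < proposal < 2.0:               # prior density is zero outside
            ll_prop = log_lik_estimate(observations, proposal, rng)
            # Flat prior and symmetric proposal: the ratio is the
            # ratio of the two likelihood estimates.
            if math.log(1.0 - rng.random()) < ll_prop - ll:
                mu, ll = proposal, ll_prop     # keep the accepted estimate
        chain.append(mu)
    return chain
```

For instance, `pmmh(simulate_obs(15))` returns a chain of `mu_alpha` values. The key design point is that the noisy likelihood estimate of the current state is stored and reused rather than recomputed, which is what makes the chain target the correct posterior. Each iteration runs a full particle filter, which hints at why these methods become expensive for high-dimensional models.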

Julyan Arbel said, on 10 April 2013 at 09:40:

Cool! Is it an application for sequential Monte Carlo?

Pierre Jacob said, on 10 April 2013 at 09:57:

Yes, definitely, SMC is all over the place now for those applications. In particular, Particle MCMC really made it possible to do everything under parameter uncertainty, and hopefully SMC^2 will be used to do sequential inference and perhaps model choice.

PMCMC: http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9868.2009.00736.x/full

SMC^2: http://arxiv.org/abs/1101.1528