A knowledgebased reactive transport approach for thesimulation of biogeochemical dynamics in Earth systems
D. R. Aguilera, P. Jourabchi, C. Spiteri, and P. Regnier
Department of Earth Sciences—Geochemistry, Faculty of Geosciences, University of Utrecht, P.O. Box 80.021, NL3508 TA Utrecht, Netherlands (aguilera@geo.uu.nl)
[
1
]
A KnowledgeBased Reactive Transport Model (KBRTM) for simulation of coupled transport and biogeochemical transformations in surface and subsurface flow environments is presented (http:// www.geo.uu.nl/
kbrtm). The scalable Webdistributed Knowledge Base (KB), which combinesInformation Technology (IT), an automatic code generator, and database management, facilitates theautomated construction of complex reaction networks from comprehensive information stored at the levelof biogeochemical processes. The reactioncentric approach of the KBRTM system offers full flexibilityin the choice of model components and biogeochemical reactions. The procedure coupling the reactionnetworks to a generalized transport module into RTMs is also presented. The workings of our KBRTMsimulation environment are illustrated by means of two examples of redox and acidbase chemistry in atypical shelf sediment and an aquifer contaminated by landfill plumes.
Components:
9832 words, 9 figures, 5 tables
.
Keywords:
biogeochemistry; information technology; Knowledge Base; numerical methods; reactive transport.
Index Terms:
1009 Geochemistry: Geochemical modeling (3610, 8410).
Received
16 December 2004;
Revised
10 March 2005;
Accepted
4 May 2005;
Published
27 July 2005.Aguilera, D. R., P. Jourabchi, C. Spiteri, and P. Regnier (2005), A knowledgebased reactive transport approach for thesimulation of biogeochemical dynamics in Earth systems
, 6
, Q07012, doi:10.1029/2004GC000899.
1. Introduction
[
2
] Reactive transport models (RTMs) are powerful tools for capturing the dynamic interplay between fluid flow, constituent transport, and biogeochemical transformations [
Steefel and VanCappellen
, 1998]. They have been used to simulate, among others, rock weathering and soil formation [e.g.,
Ayora et al.
, 1998;
Soler and Lasaga
,1998;
Steefel and Lichtner
, 1998a, 1998b;
Thyne et al.
, 2001;
Soler
, 2003;
De Windt et al.
, 2004],nutrient dynamics in river drainage basins andestuaries [e.g.,
Soetaert and Herman
, 1995;
Billenet al.
, 1994;
Regnier et al.
, 1997;
Regnier and Steefel
, 1999;
Vanderborght et al.
, 2002], reactivetransport in groundwater, like contamination of aquifers [e.g.,
Engesgaard and Traberg
, 1996;
Brown et al.
, 1998;
Hunter et al.
, 1998;
Xu et al.
,1999;
Murphy and Ginn
, 2000;
Barry et al.
, 2002;
Brun and Engesgaard
, 2002;
Thullner et al.
, 2004;
van Breukelen et al.
, 2004], early diagenetic transformations in sediments [e.g.,
Soetaert et al.
, 1996;
Boudreau
, 1996;
Van Cappellen and Wang
,1996;
Dhakar and Burdige
, 1996;
Berg et al.
,2003;
Jourabchi et al.
, 2005], benthicpelagic cou pling in ocean systems [e.g.,
Soetaert et al.
, 2000;
Archer et al.
, 2002;
Lee et al.
, 2002] and hydrocar bon migration and maturation in sedimentary basins[e.g.,
Person and Garven
, 1994]. By integratingexperimental, observational and theoretical knowledge about geochemical, biological and transport processes into mathematical formulations, RTMs provide the grounds for prognosis, while diagnosticcomparison between model simulations and measurements can help identify gaps in the conceptualunderstanding of a specific system or uncertainties
G
3
G
3
GeochemistryGeophysicsGeosystems
Published by AGU and the Geochemical SocietyAN ELECTRONIC JOURNAL OF THE EARTH SCIENCES
GeochemistryGeophysicsGeosystems
Article
Volume 6
, Number 7
27 July 2005Q07012, doi:10.1029/2004GC000899ISSN: 15252027Copyright 2005 by the American Geophysical Union 1 of 18
in proper parameterization of biogeochemical processes [
Berg et al.
, 2003;
Jourabchi et al.
, 2005].[
3
] RTMs have traditionally been developed andused to investigate the fate and transport of aselected set of chemical constituents within agiven compartment of the Earth system (e.g., theearly diagenetic models by
Soetaert et al.
[1996],
Boudreau
[1996],
Van Cappellen and Wang
[1996],
Dhakar and Burdige
[1996], and references citedabove). As a result, they have tended to be environment and application specific with regards tothe flow regime and the biogeochemical reactionnetwork.[
4
] Although the first attempts to develop interactive software systems for automatic solution of models based on ordinary and partial differentialequations date back to the creation of digital com puters [e.g.,
Young and Juncosa
, 1959;
Lawrenceand Groner
, 1973;
Mikhailov and Aladjem
, 1981,and references therein], such developments have sofar received little attention in the field of reactivetransport modeling. Literature review shows that RTM codes allowing for more flexible definition of state variables and processes without requiring indepth knowledge of programming or numericalsolution techniques have been developed over thelast decade [e.g.,
Reichert
, 1994;
Chilakapati
,1995;
Regnier et al.
, 1997;
Chilakapati et al.
,2000;
MacQuarrie et al.
, 2001;
Meysman et al.
, 2003a, 2003b;
Van der Lee et al.
, 2003].Database tools, such as that developed by
Katsev et al.
[2004], have also been presented recently tothe reactive transport community. Model flexibilityis a critical feature since a major challenge in thefield of reactive transport modeling is the realisticrepresentation of the highly complex reaction networks (RN) that characterize the biogeochemicaldynamics of natural environments [e.g.,
Mayer et al.
, 2002;
Berg et al.
, 2003;
Quezada et al.
, 2004].At the same time, many field and laboratorybasedexperiments are also being conducted to identifynovel reaction pathways, quantify reaction ratesand microbial activity levels, describe ecologicalcommunity structures, and elucidate interactions between biotic and abiotic processes. This rapidlygrowing knowledge about biogeochemical transformation processes creates a need for efficient means of transferring new experimental findingsinto RTMs.[
5
] Here, a unified modeling approach for implementing complex reaction networks in RTMs is presented. Our simulation environment is based ona modular approach to facilitate incorporation of new theoretical and experimental information onthe rates and pathways of biogeochemical reactions. The key novel feature of the modelingenvironment is a Webdistributed Knowledge Base(KB) of biogeochemical processes, which acts asthe evolving repository of uptodate informationgained in the field of geochemistry. The implementation of such a library within a simulation environment is a major step toward the model’sflexibility, because it is at the level of an easilyaccessible open resource, the KB, that process based theoretical and experimental advances areincorporated in the modeling process. Model generation is conducted via a graphical user interface(GUI) on a Webbased ‘‘runtime’’ server, whichallows for the development of biogeochemicalreaction network modules. That is, informationstored at the level of individual biogeochemical processes can be combined into mathematicalexpressions defining completely the (bio)chemicaldynamics of the system. Since the reaction network is assembled from information stored at the process level, almost any conceivable combination of mixed kinetic and equilibrium reactionscan be implemented in our model architecture.The selected RN can easily be merged withexisting transport models, hence creating a flexible framework in which to assemble RTMs. The proposed approach allows the RTM communityto test and compare, in close collaboration withexperimentalists, alternative mathematical descriptions of coupled biogeochemical reaction networks.For example, increasingly detailed representationsof biogeochemical processes can be incorporated inthe Knowledge Base. Reaction network modules of increasing complexity may then be assembled andcoupled to surface or subsurface flow models, inorder to determine which level of biogeochemicalcomplexity is adequate to simulate chemical systemdynamics at variable spatial and temporal resolutions. By taking a ‘‘reactioncentric’’ approachwhich utilizes the unifying conceptual and mathematical principles underlying all RTMs, onedimensional (1D) transport descriptions relevant to many compartments of the Earth system (rivers,estuaries, groundwater or sediments) can be incorporated in our simulation environment. The proposed approach should thus help overcometraditional disciplinary barriers between the different subfields of RTMs.[
6
] The paper is structured as follows: First, themass conservation equation describing 1D coupledtransport and reaction is briefly presented. A generalized continuum representation is proposed,
GeochemistryGeophysicsGeosystems
G
3
G
3
aguilera et al.: earth system dynamics
10.1029/2004GC000899
2 of 18
which allows for the simulation of reactivetrans port problems characterized by different flowregimes and dispersion intensities. A brief description on how an existing Automatic Code Generator (ACG) based on symbolic programming [
Regnier et al.
, 2002] can be used to create the modelspecific source code necessary to the numericalsolution of the governing equations is then given.We demonstrate how our Webdistributed Knowledge Base concept, which combines InformationTechnology with symbolic computing techniques,directs the mathematical formulation of the biogeochemical reaction network and leads to a modeling environment offering full flexibility. Finally,the workings of our KBRTM are illustrated withtwo contrasting examples of complex redox andacidbase geochemistry in an aquatic sediment andan aquifer, respectively.
2. Mathematical Representation of ReactiveTransport Equations
[
7
] A onedimensional continuum representationof coupled mass transport and chemical reactionsin Earth systems can be described mathematically by a set of partial differential equations (PDEs) intime and space of the form
x
@
C
j
@
t
¼
@ @
x D
x
@
C
j
@
x
@ @
x v
x
C
j
j
ﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄ{zﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄﬄ}
T
þ
X
N
r
k
¼
1
l
k
;
j
s
k
ﬄﬄﬄﬄﬄﬄﬄ{zﬄﬄﬄﬄﬄﬄﬄ}
R
;
j
¼
1
;
. . .
;
m
;
ð
1
Þ
where
t
is time and
x
denotes the position alongthe 1D spatial domain. Particular solutions of equation (1) require specification of initial and boundary conditions. Further discussion about thegeneralized continuum representation of theadvectiondispersionreaction (ADR) equation isgiven by
Regnier et al.
[2002] and
Meile
[2003].The first two terms on the right hand side, insquare brackets, compose the transport operator (
T
); the last one (
R
) represents the sum of transformation processes (e.g., reactions) affectinga species
j
of concentration
C
j
. Table 1 shows that
x
,
D
and
v
are generic variables which takedifferent meanings depending on the environment considered.
s
k
represents, for kinetic reactions, therate of the
k

th
reaction and
l
k
,
j
is the stoichiometric coefficient of species
j
in that reaction.Currently, it is assumed that the reaction processes(
R
) have no effect on the physical or transport (
T
) properties of the system (i.e.,
x
,
D
and
v
areunaffected by reactions). The rate
s
k
is of arbitraryform, even nonlinear, and can be a function of several concentrations of the system. Through thiscoupling by the reaction terms, most multicomponent problems result in a set of coupled nonlinear partial differential equations (PDEs) of size
m
,number of species of the reaction network. In theevent that some of the reactions considered areassumed to be at equilibrium, algebraic expressions based on mass action laws are introduced into thesystem of equations to be solved [
Regnier et al.
,2002]. By replacing one or more of the
m
differential equations associated with reactions withalgebraic relations based on a mass action expression in the local equilibrium case, the set of ODEs istransformed into a set of differentialalgebraicequations (DAEs) [
Chilakapati
, 1995;
Hindmarshand Petzold
, 1995a, 1995b;
Brenan et al.
, 1989].Thistransformationleadstoasystemof
m
equationsto solve, including
m
k
equations associated withkinetic reactions, and
m
e
algebraic equations basedon mass action expressions.[
8
] The numerical solution of the set of discretizedPDEs and DAEs (
@
t
!
D
t
,
@
x
!
D
x
) commonlyrequires the use of implicit methods in order to becomputationally efficient [
Steefel and MacQuarrie
,1996]. We currently make use of the time splittingtechnique, which consists of solving first the trans port and then the reaction terms in sequence for asingle time step. This method is referred to as the
Table 1.
Meaning of the Generalized Variables
x
,
D
,and
v
for Different Environments
a
Surface FlowAquaticSediment Groundwater Flow PathSolutes or Suspended Solids Solids Solutes Solids Solutes
x
A
1
f f
1
f f
v V
flow
w w
+
v
flow
0
V
flow
D K
turb
D
b
D
b
+
D
sed
0
D
disp
a
From
Meile
[2003]. In porous media flow, distinction between anaqueous and a solid phase must be considered, and
C
is either aconcentration of solute or solid.
A
[L
2
], crosssection area of the surfaceflow channel;
f
[
], porosity;
V
flow
[LT
1
], externally imposed flowvelocity;
w
[LT
1
], burial velocity defined with respect to the sedimentwater interface (SWI);
v
flow
[LT
1
], flow velocity acting only on solutesexternally imposed or from porosity change; usually defined withrespect to the SWI [e.g.,
Boudreau
, 1997];
K
turb
[L
2
T
1
], longitudinalturbulent dispersion coefficient;
D
b
[L
2
T
1
], bioturbation coefficient;
D
sed
[L
2
T
1
] =
D
mol
/(1
ln(
f
2
)), tortuosity corrected molecular diffusion coefficient for solutes at in situ temperature and salinity[
Boudreau
, 1997];
D
disp
[L
2
T
1
] =
a
L
j
V
flow
j
, longitudinal dispersion,where
a
L
[L] is the longitudinal dispersivity [
Freeze and Cherry
,1979].
GeochemistryGeophysicsGeosystems
G
3
G
3
aguilera et al.: earth system dynamics
10.1029/2004GC000899
aguilera et al.: earth system dynamics
10.1029/2004GC000899
3 of 18
sequential noniterative approach (SNIA). Whensolving the reaction part an iterative method isrequired to numerically find the roots of thefunction residuals,
f
j
, which correspond to mass balance equations and, if equilibrium reactions areincluded, mass action equations. This is becausethe reaction terms can be nonlinear functions of the species concentrations. By far the most common approach for finding the root of nonlinear sets of equations is the NewtonRaphson method[
Press et al.
, 1992]. This method involves the useof a first degree Taylor series expansion tolinearize the problem for every single iterationstep. The function residuals,
f
j
, representing thereaction network (RN), and the Jacobian matrix,which contains the partial derivatives of thefunction residuals with respect to the unknownconcentrations, are the most important pieces of information required to implement the NewtonRaphson algorithm [e.g.,
Dennis and Schnabel
,1996]. Once linearized, the resulting problem issolved using linear algebra methods, such as theLU decomposition [e.g.,
Strang
, 1988]. A precisedefinition of the function residuals and the Jaco bian matrix is given by
Regnier et al.
[2002].
3. Reactive Transport Modeling Withthe Knowledge Base
[
9
] Inspection of the transport and reaction operators
T
and
R
in equation (1) shows that thefollowing information is required to define a specific reaction transport application:[
10
] 1. Domain definition: spatiotemporal size andresolution (
x
tot
,
D
x
,
t
tot
,
D
t
), where
x
tot
and
t
tot
standfor the total domain length and simulation time,respectively, and
D
x
and
D
t
are the space and timestep used for numerical integration, respectively.[
11
] 2. Transport coefficients
x
,
v
and
D
(Table 1).[
12
] 3. Reaction Network (RN): processes; rate parameters and equilibrium constants; species involved and stoichiometry.[
13
] 4. Boundary (BC) and initial (IC) conditionsfor every species of the RN.[
14
] In a modeling environment offering full flexibility, this information is specific to each RTMapplication and needs to be automatically translatedinto source code. This task is relatively easy for the physical domain definition, transport parameters, BC and IC. However, if flexibility in choosing process formulations is also important, thenthe stiff system of differential equations using alinearization method (such as the abovecited NewtonRaphson algorithm) necessitate automaticdifferentiation schemes for the calculation of theterms in the Jacobian matrix. Automatic symbolicdifferentiation offers the advantage of producingderivatives of potentially complicated functionswhich are accurate up to the precision of the programming language used (e.g., FORTRAN), plus the convenience of updating the derivativeseasily if the srcinal functions are changed [
Steefel and MacQuarrie
, 1996]. Automated differentiationfor stiff sets of differential equations is one of thekey features of our Automatic Code Generation(ACG) procedure [
Regnier et al.
, 2002].[
15
] The simultaneous implementation of a libraryof biogeochemical processes into a KnowledgeBase (KB) is an additional crucial component of the proposed simulation environment. The KBmakes it possible to take full advantage of theACG. The integration of these two componentswithin a Web system shows how the combinationof Information Technology with advanced symbolic programming allows the use of the Internet as asoftware provider in the area of ReactiveTransport modeling (Figure 1).
3.1. The Internet as a Software Provider
[
16
] An Internet system developed in PHP language (http://www.php.net) provides the adaptiveJavaScript and HTML code for the Graphical User Interface (GUI), as well as the Webbased ‘‘runtime’’ server to our modeling environment. It isaccessible at http://www.geo.uu.nl/
kbrtm. TheGUI is of evolutionary nature, that is, it dynamically adapts to changes in structure and content of the KB system as well as to the selections made bythe user.[
17
] Initially, the user accesses an interface for theKB system in the style of a Web form for selection of desired biogeochemical processes(Figure 1, step 1; further detailed in Figure 2).Speciesindependent physical parameters (definition of spatiotemporal domain and most transport coefficients in Table 1) are also defined at this stage.[
18
] Figure 2 gives a detailed description of themodel design procedure. The KB system consistsof a set of biogeochemical processes, containingdefault formulations for reactions, which are available to all users of the system (common KB) andwhich cannot be modified. However, these common processes can be edited if desired, or new
GeochemistryGeophysicsGeosystems
G
3
G
3
aguilera et al.: earth system dynamics
10.1029/2004GC000899
4 of 18
processes created using a standard template, andstored in a ‘‘private’’ KB library which is inaccessible to other users. To facilitate model development, processes can be grouped in different subsets, for instance, in terms of reaction types(see below).[
19
] The selected processes specify the userdefinedreaction network. The latter is then analyzed todetermine the list of chemical species involved inthe RN. A second Web submission form is subsequently created to specify all speciesdependent parameters (such as molecular diffusion coeffi
Figure 1.
Structure of the KnowledgeBased (KB) Reactive Transport Model (RTM) environment which uses theWeb as a software provider: 1, form submission; 2, parsing of the ASCII files into MAPLE; 3, MAPLE/ MACROFORT symbolic computing; 4, translation of MAPLE results into FORTRAN; 5, linking and compilation of the FORTRAN code; 6, transfer of the executable file to the user via email.
Figure 2.
KBInternet model design process (detailed description of step 1 in Figure 1): (1) Process edition or creation. Templates can be used to speed up the implementation of new processes. Modified or new process files arestored in a ‘‘private’’ KB which is accessible only to a single user. (2) Process selection from the common and privateKB process pools, and specification of the physical support. (3) Analysis of the selection and creation of adynamically adaptive Web form to input reaction network dependent parameters. (4) Submission of complete modelinformation (processes selected, physical parameters, and speciesdependent parameters) to the ACG via our Webserver.
GeochemistryGeophysicsGeosystems
G
3
G
3
aguilera et al.: earth system dynamics
10.1029/2004GC000899
5 of 18