arXiv:q-bio/0508013v1 [q-bio.NC] 13 Aug 2005
Effects of Fast Presynaptic Noise in Attractor Neural Networks

J. M. Cortes†‡, J. J. Torres†, J. Marro†, P. L. Garrido† and H. J. Kappen‡

† Institute "Carlos I" for Theoretical and Computational Physics, and Departamento de Electromagnetismo y Física de la Materia, University of Granada, E-18071 Granada, Spain.
‡ Department of Biophysics, Radboud University of Nijmegen, 6525 EZ Nijmegen, The Netherlands

April 5, 2007

To appear in Neural Computation, 2005. Corresponding author: Jesus M. Cortes, jcortes@ugr.es
Abstract
We study both analytically and numerically the effect of presynaptic noise on the transmission of information in attractor neural networks. The noise occurs on a very short time scale compared to that of the neuron dynamics, and it produces short-time synaptic depression. This is inspired by recent neurobiological findings showing that synaptic strength may either increase or decrease on a short time scale depending on presynaptic activity. We thus describe a mechanism by which fast presynaptic noise enhances the sensitivity of a neural network to an external stimulus. The reason is that, in general, the presynaptic noise induces nonequilibrium behavior and, consequently, the space of fixed points is qualitatively modified in such a way that the system can easily escape from the attractor. As a result, the model exhibits, in addition to pattern recognition, class identification and categorization, which may be relevant to the understanding of some of the brain's complex tasks.
1 Introduction
There is multiple converging evidence [Abbott and Regehr, 2004] that synapses determine the complex processing of information in the brain. One aspect of this statement is illustrated by attractor neural networks, which show that synapses can efficiently store patterns that are afterwards retrieved with only partial information about them. In addition to this long-time effect, however, artificial neural networks should contain some "synaptic noise". That is, actual synapses exhibit short-time fluctuations, which seem to compete with other mechanisms during the transmission of information, not to cause unreliability but to ultimately determine a variety of computations [Allen and Stevens, 1994, Zador, 1998]. In spite of some recent efforts, a full understanding of how the brain's complex processes depend on such fast synaptic variations is lacking; see below and [Abbott and Regehr, 2004], for instance. A specific matter under discussion concerns the influence of short-time noise on the fixed points and other details of the retrieval processes in attractor neural networks [Bibitchkov et al., 2002]. The observation that actual synapses undergo short-time depression and/or facilitation is likely to be relevant in this context. That is, one may understand some observations by assuming that periods of elevated presynaptic activity may cause either a decrease or an increase of neurotransmitter release and, consequently, that the postsynaptic response will be either depressed or facilitated depending on presynaptic neural activity [Tsodyks et al., 1998, Thomson et al., 2002, Abbott and Regehr, 2004].

Motivated by these neurobiological findings, we report in this paper on effects of presynaptic depressing noise on the functionality of a neural circuit. We study in detail a network in which the neural activity evolves at random in time, regulated by a "temperature" parameter. In addition, the values assigned to the synaptic intensities by a learning (e.g., Hebb's) rule are constantly perturbed with microscopic, fast noise. This perturbation involves a new parameter that allows for a continuous transition from depression to normal operation.

As a main result, this paper illustrates that, in general, the addition of fast synaptic noise induces a nonequilibrium condition. That is, our systems cannot asymptotically reach equilibrium but tend to nonequilibrium steady states whose features depend, even qualitatively, on the dynamics [Marro and Dickman, 1999]. This is interesting because, in practice, thermodynamic equilibrium is rare in nature. Instead, the simplest conditions one observes are characterized by a steady flux of energy or information, for instance. This makes the model mathematically involved; e.g., there is no general framework such as the powerful (equilibrium) Gibbs theory, which only applies to systems with a single Kelvin temperature and a unique Hamiltonian. However, our system still admits analytical treatment for some choices of its parameters and, in other cases, we uncovered the more intricate model behavior by a series of computer simulations. We thus show that fast presynaptic depressing noise during external stimulation may induce the system to escape from the attractor: the stability of fixed-point solutions is dramatically modified. More specifically, we show that, for certain versions of the system, the solution destabilizes in such a way that computational tasks such as class identification and categorization are favored. This is likely the first time such behavior is reported in an artificial neural network as a consequence of biologically motivated stochastic behavior of synapses. Similar instabilities have been reported to occur in monkeys [Abeles et al., 1995] and other animals [Miller and Schreiner, 2000], and they are believed to be a main feature in odor encoding [Laurent et al., 2001], for instance.
2 Definition of model
Our interest is in a neural network in which a local stochastic dynamics is constantly influenced by presynaptic noise. Consider a set of $N$ binary neurons with configurations $\mathbf{S} \equiv \{s_i = \pm 1;\ i = 1,\ldots,N\}$.¹ Any two neurons are connected by synapses of intensity²

$$w_{ij} = \overline{w}_{ij}\, x_j \qquad \forall i,j.$$

(1)

Here, $\overline{w}_{ij}$ is fixed, namely, determined in a previous learning process, and $x_j$ is a stochastic variable. This generalizes the hypothesis in previous studies of attractor neural networks with noisy synapses; see, for instance, [Sompolinsky, 1986, Garrido and Marro, 1991, Marro et al., 1999].
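For concreteness, equation (1) can be sketched numerically as follows. The Hebbian prescription for the fixed part $\overline{w}_{ij}$ and the noise-free choice $x_j = 1$ are illustrative assumptions only; the text leaves both the learning rule and the statistics of $x_j$ unspecified at this point:

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 100, 3
# M random binary patterns, xi[nu, i] = +/-1
xi = rng.choice([-1, 1], size=(M, N))

# Fixed part of the synapses from a Hebbian rule (an assumption for
# illustration; any previously learned w_bar would do).
w_bar = xi.T @ xi / N
np.fill_diagonal(w_bar, 0.0)

# Stochastic presynaptic factors x_j; here simply x_j = 1 (no noise),
# which recovers the Hopfield-like limiting case discussed below.
x = np.ones(N)

# Equation (1): w_ij = w_bar_ij * x_j. Note the noise carries the
# presynaptic index j only, so it multiplies columns of w_bar.
w = w_bar * x[np.newaxis, :]
```

With $x_j \equiv 1$ the noisy and learned matrices coincide; a nontrivial distribution for $x_j$ is what the rest of the paper is about.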
Once $\mathbf{W} \equiv \{\overline{w}_{ij}\}$ is given, the state of the system at time $t$ is defined by setting $\mathbf{S}$ and $\mathbf{X} \equiv \{x_i\}$. These evolve with time, after the learning process which fixes $\mathbf{W}$, via the familiar master equation, namely,

$$\frac{\partial P_t(\mathbf{S},\mathbf{X})}{\partial t} = -P_t(\mathbf{S},\mathbf{X}) \sum_{\mathbf{X}'} \sum_{\mathbf{S}'} c[(\mathbf{S},\mathbf{X}) \to (\mathbf{S}',\mathbf{X}')] + \sum_{\mathbf{X}'} \sum_{\mathbf{S}'} c[(\mathbf{S}',\mathbf{X}') \to (\mathbf{S},\mathbf{X})]\, P_t(\mathbf{S}',\mathbf{X}').$$
(2)

We further assume that the transition rate, or probability per unit time, of evolving from $(\mathbf{S},\mathbf{X})$ to $(\mathbf{S}',\mathbf{X}')$, is

$$c[(\mathbf{S},\mathbf{X}) \to (\mathbf{S}',\mathbf{X}')] = p\, c^{\mathbf{X}}[\mathbf{S} \to \mathbf{S}']\, \delta(\mathbf{X} - \mathbf{X}') + (1-p)\, c^{\mathbf{S}}[\mathbf{X} \to \mathbf{X}']\, \delta_{\mathbf{S},\mathbf{S}'}.$$
(3)

This choice [Garrido and Marro, 1994, Torres et al., 1997] amounts to considering competing mechanisms. That is, neurons ($\mathbf{S}$) evolve stochastically in time under a noisy dynamics of synapses ($\mathbf{X}$), the latter evolving $(1-p)/p$ times faster than the former. Depending on the value of $p$, three main classes may be defined [Marro and Dickman, 1999]:

1. For $p \in (0,1)$, both the synaptic fluctuation and the neuron activity occur on the same temporal scale. This case has already been preliminarily explored [Pantic et al., 2002, Cortes et al., 2004].

2. The limiting case $p \to 1$. This corresponds to neurons evolving in the presence of a quenched synaptic configuration, i.e., $x_i$ is constant and independent of $i$. The Hopfield model [Amari, 1972, Hopfield, 1982] belongs to this class in the simple case that $x_j = 1, \forall j$.

3. The limiting case $p \to 0$. The rest of this paper is devoted to this class of systems.

¹ Note that such binary neurons, although a crude simplification of nature, are known to capture the essentials of cooperative phenomena, which is the focus here. See, for instance, [Abbott and Kepler, 1990, Pantic et al., 2002].

² For simplicity, we neglect here any postsynaptic dependence of the stochastic perturbation. There is some claim that plasticity might operate on rapid time scales on postsynaptic activity; see [Pitler and Alger, 1992]. However, including $x_{ij}$ in (1) instead of $x_j$ would impede some of the algebra in sections 3 and 4.

Our interest in the latter case ($p \to 0$) is a consequence of the following facts. Firstly, there is an adiabatic elimination of fast variables for $p \to 0$ which decouples the two dynamics [Garrido and Marro, 1994, Gardiner, 2004]. Therefore, some exact analytical treatment, though not the complete solution, is then feasible. To be more specific, for $p \to 0$ the neurons evolve as in the presence of a steady distribution for $\mathbf{X}$. If we write $P(\mathbf{S},\mathbf{X}) = P(\mathbf{X}|\mathbf{S})\,P(\mathbf{S})$, where $P(\mathbf{X}|\mathbf{S})$ stands for the conditional probability of $\mathbf{X}$ given $\mathbf{S}$, one obtains from (2) and (3), after rescaling time $tp \to t$ (technical details are worked out in [Marro and Dickman, 1999], for instance), that

$$\frac{\partial P_t(\mathbf{S})}{\partial t} = -P_t(\mathbf{S}) \sum_{\mathbf{S}'} \bar{c}[\mathbf{S} \to \mathbf{S}'] + \sum_{\mathbf{S}'} \bar{c}[\mathbf{S}' \to \mathbf{S}]\, P_t(\mathbf{S}').$$
(4)

Here,

$$\bar{c}[\mathbf{S} \to \mathbf{S}'] \equiv \int d\mathbf{X}\, P^{st}(\mathbf{X}|\mathbf{S})\, c^{\mathbf{X}}[\mathbf{S} \to \mathbf{S}'],$$

(5)

and $P^{st}(\mathbf{X}|\mathbf{S})$ is the stationary solution that satisfies

$$P^{st}(\mathbf{X}|\mathbf{S}) = \frac{\int d\mathbf{X}'\, c^{\mathbf{S}}[\mathbf{X}' \to \mathbf{X}]\, P^{st}(\mathbf{X}'|\mathbf{S})}{\int d\mathbf{X}'\, c^{\mathbf{S}}[\mathbf{X} \to \mathbf{X}']}.$$
(6)

This formalism will allow us to model fast synaptic noise which, within the appropriate context, will induce a sort of synaptic depression, as explained in detail in section 4.

The superposition (5) reflects the fact that activity is the result of competition between different elementary mechanisms. That is, different underlying dynamics, each associated with a different realization of the stochasticity $\mathbf{X}$, compete and, in the limit $p \to 0$, an effective rate results from combining $c^{\mathbf{X}}[\mathbf{S} \to \mathbf{S}']$ with probability $P^{st}(\mathbf{X}|\mathbf{S})$ for varying $\mathbf{X}$. Each of the elementary dynamics tends to drive the system to a well-defined equilibrium state. The competition will, however, impede equilibrium and, in general, the system will asymptotically go towards a nonequilibrium steady state [Marro and Dickman, 1999]. The question is whether such a competition between synaptic noise and neural activity, which induces nonequilibrium, is at the origin of some of the computational strategies in neurobiological systems. Our study below seems to indicate that this is a sensible issue. As a matter of fact, we shall argue below that $p \to 0$ may be realistic a priori for appropriate choices of $P^{st}(\mathbf{X}|\mathbf{S})$.
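The superposition idea can be illustrated with a two-valued noise: the effective rate is the probability-weighted average of the elementary rates, which in general differs from any single elementary rate evaluated at, say, the mean noise value. The Glauber rate and all numerical values below are assumptions for illustration:

```python
import numpy as np

def glauber(u):
    """Elementary rate Psi(u) = 1 / (1 + exp(u))."""
    return 1.0 / (1.0 + np.exp(u))

# Two realizations of the presynaptic factor x (depressed and normal),
# with assumed stationary probabilities q and 1 - q.
phi, q = 0.5, 0.3
T, s_i, h_bar = 1.0, 1, 0.8   # h_bar: local field computed with x_j = 1

u = lambda x: 2.0 * s_i * x * h_bar / T

# Analogue of equation (5): the effective rate is the average of the
# elementary rates over the stationary noise distribution...
c_eff = q * glauber(u(-phi)) + (1.0 - q) * glauber(u(1.0))

# ...which differs from the elementary rate at the average x, because
# Psi is nonlinear. No single elementary dynamics reproduces c_eff.
x_mean = q * (-phi) + (1.0 - q) * 1.0
c_at_mean = glauber(u(x_mean))
```

This nonlinearity is the mechanical reason why the mixture of dynamics cannot, in general, be absorbed into a single Hamiltonian.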
For the sake of simplicity, we shall be concerned in this paper with sequential updating by means of single-neuron or "spin-flip" dynamics. That is, the elementary dynamic step will simply consist of local inversions $s_i \to -s_i$ induced by a bath at temperature $T$. The elementary rate $c^{\mathbf{X}}[\mathbf{S} \to \mathbf{S}']$ then reduces to a single-site rate that one may write as $\Psi[u^{\mathbf{X}}(\mathbf{S},i)]$. Here, $u^{\mathbf{X}}(\mathbf{S},i) \equiv 2T^{-1} s_i h^{\mathbf{X}}_i(\mathbf{S})$, where $h^{\mathbf{X}}_i(\mathbf{S}) = \sum_{j \neq i} \overline{w}_{ij} x_j s_j$ is the net presynaptic current arriving at (or local field acting on) the (postsynaptic) neuron $i$. The function $\Psi(u)$ is arbitrary except that, for simplicity, we shall assume $\Psi(u) = \exp(-u)\,\Psi(-u)$, $\Psi(0) = 1$ and $\Psi(\infty) = 0$ [Marro and Dickman, 1999]. We shall report on the consequences of more complex dynamics in a forthcoming paper [Cortes et al., 2005].
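A minimal Monte Carlo sketch of this single spin-flip dynamics, using the Glauber rate $\Psi(u) = [1+\exp(u)]^{-1}$ (one of the admissible choices listed in section 3), noise-free synapses $x_j = 1$, and an assumed single-pattern Hebbian $\overline{w}_{ij}$; all numerical values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

N, T = 50, 0.5
xi = rng.choice([-1, 1], size=N)      # one stored pattern (illustrative)
w_bar = np.outer(xi, xi) / N          # assumed Hebbian fixed weights
np.fill_diagonal(w_bar, 0.0)
x = np.ones(N)                        # presynaptic factors: no noise here

def sweep(s):
    """One sequential sweep: flip s_i with Glauber probability
    Psi(u) = 1/(1 + exp(u)), where u = 2 T^{-1} s_i h_i."""
    for i in rng.permutation(N):
        h_i = np.dot(w_bar[i] * x, s)   # h_i = sum_j w_bar_ij x_j s_j
        u = 2.0 * s[i] * h_i / T
        if rng.random() < 1.0 / (1.0 + np.exp(u)):
            s[i] = -s[i]
    return s

s = rng.choice([-1, 1], size=N)       # random initial configuration
for _ in range(50):
    s = sweep(s)

m = float(np.dot(s, xi)) / N          # overlap with the stored pattern
```

With these settings, well below the critical temperature of the noise-free case, the state should fall into the basin of one of the two attractors $\pm\xi$, so $|m|$ grows close to 1.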
3 Effective local fields
Let us define a function $H^{\mathrm{eff}}(\mathbf{S})$ through the condition of detailed balance, namely,

$$\frac{\bar{c}[\mathbf{S} \to \mathbf{S}^i]}{\bar{c}[\mathbf{S}^i \to \mathbf{S}]} = \exp\left\{ -\left[ H^{\mathrm{eff}}(\mathbf{S}^i) - H^{\mathrm{eff}}(\mathbf{S}) \right] T^{-1} \right\}.$$
(7)

Here, $\mathbf{S}^i$ stands for $\mathbf{S}$ after flipping at $i$, $s_i \to -s_i$. We further define the "effective local fields" $h^{\mathrm{eff}}_i(\mathbf{S})$ by means of

$$H^{\mathrm{eff}}(\mathbf{S}) = -\frac{1}{2} \sum_i h^{\mathrm{eff}}_i(\mathbf{S})\, s_i.$$
(8)

Nothing guarantees that $H^{\mathrm{eff}}(\mathbf{S})$ and $h^{\mathrm{eff}}_i(\mathbf{S})$ have a simple expression and are therefore analytically useful. This is because the superposition (5), unlike its elements $\Psi(u^{\mathbf{X}})$, does not satisfy detailed balance, in general. In other words, our system has an essential nonequilibrium character that prevents one from using Gibbs's statistical mechanics, which requires a unique Hamiltonian. Instead, there is here one energy associated with each realization of $\mathbf{X} = \{x_i\}$. This is in addition to the fact that the synaptic weights $\overline{w}_{ij}$ in (1) may not be symmetric.

For some choices of both the rate $\Psi$ and the noise distribution $P^{st}(\mathbf{X}|\mathbf{S})$, the function $H^{\mathrm{eff}}(\mathbf{S})$ may be considered a true effective Hamiltonian [Garrido and Marro, 1989, Marro and Dickman, 1999]. This means that $H^{\mathrm{eff}}(\mathbf{S})$ then generates the same nonequilibrium steady state as the stochastic time-evolution equation which defines the system, i.e., equation (4), and that its coefficients have the proper symmetry of interactions. To be more explicit, assume that $P^{st}(\mathbf{X}|\mathbf{S})$ factorizes according to

$$P^{st}(\mathbf{X}|\mathbf{S}) = \prod_j P(x_j|s_j),$$
(9)

and that one also has the factorization

$$\bar{c}[\mathbf{S} \to \mathbf{S}^i] = \prod_{j \neq i} \int dx_j\, P(x_j|s_j)\, \Psi\!\left( 2T^{-1} s_i \overline{w}_{ij} x_j s_j \right).$$
(10)

The former amounts to neglecting some global dependence of the factors on $\mathbf{S} = \{s_i\}$ (see below), and the latter restricts the possible choices for the rate function. Some familiar choices for this function that satisfy detailed balance are: the one corresponding to the Metropolis algorithm, i.e., $\Psi(u) = \min[1, \exp(-u)]$; the Glauber case $\Psi(u) = [1 + \exp(u)]^{-1}$; and $\Psi(u) = \exp(-u/2)$ [Marro and Dickman, 1999]. The latter fulfills $\Psi(u+v) = \Psi(u)\,\Psi(v)$, which is required by (10).³ It then ensues after some algebra that

$$h^{\mathrm{eff}}_i = -T \sum_{j \neq i} \left( \alpha^{+}_{ij} s_j + \alpha^{-}_{ij} \right),$$
(11)

³ In any case, the rate needs to be properly normalized. In computer simulations, it is customary to divide $\Psi(u)$ by its maximum value. Therefore, the normalization happens to depend on temperature and on the number of stored patterns. It follows that this normalization is irrelevant for the properties of the steady state; namely, it just rescales the time scale.
with

$$\alpha^{\pm}_{ij} \equiv \frac{1}{4} \ln \frac{\bar{c}(\beta_{ij};+)\,\bar{c}(\pm\beta_{ij};-)}{\bar{c}(-\beta_{ij};\mp)\,\bar{c}(\mp\beta_{ij};\pm)},$$
(12)

where $\beta_{ij} \equiv 2T^{-1}\overline{w}_{ij}$, and

$$\bar{c}(\beta_{ij}; s_j) = \int dx_j\, P(x_j|s_j)\, \Psi(\beta_{ij}\, x_j).$$
(13)

This generalizes a case in the literature for random $\mathbf{S}$-independent fluctuations [Garrido and Munoz, 1993, Lacomba and Marro, 1994, Marro and Dickman, 1999]. In that case, one has $\bar{c}(\pm\kappa;+) = \bar{c}(\pm\kappa;-)$ and, consequently, $\alpha^{-}_{ij} = 0\ \forall i,j$. However, we are here concerned with the case of $\mathbf{S}$-dependent disorder, which results in a non-zero threshold, $\theta_i \equiv \sum_{j \neq i} \alpha^{-}_{ij} \neq 0$.
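The role of $\mathbf{S}$-dependence can be checked numerically from (12) and (13). The two-point distribution below (in the spirit of equation (17), introduced in the next section) and all parameter values are assumptions for illustration; $\Psi(u) = \exp(-u/2)$ as in the text:

```python
import numpy as np

def psi(u):
    """Rate Psi(u) = exp(-u/2), satisfying Psi(u) = exp(-u) Psi(-u)."""
    return np.exp(-u / 2.0)

def c_bar(beta, s, F=0.5, phi=0.3):
    """Equation (13) for an assumed two-point noise P(x|s):
    x = -phi with probability (1 + s*F)/2, and x = 1 otherwise."""
    p_dep = 0.5 * (1.0 + s * F)
    return p_dep * psi(-beta * phi) + (1.0 - p_dep) * psi(beta)

def alphas(w_bar_ij, T=1.0, **kw):
    """Equation (12): alpha^{+/-}_{ij} from the averaged rates."""
    b = 2.0 * w_bar_ij / T            # beta_ij = 2 T^{-1} w_bar_ij
    a_plus = 0.25 * np.log(c_bar(b, +1, **kw) * c_bar(b, -1, **kw)
                           / (c_bar(-b, -1, **kw) * c_bar(-b, +1, **kw)))
    a_minus = 0.25 * np.log(c_bar(b, +1, **kw) * c_bar(-b, -1, **kw)
                            / (c_bar(-b, +1, **kw) * c_bar(b, -1, **kw)))
    return a_plus, a_minus

# S-dependent noise (F != 0) yields a non-zero alpha^-, i.e. a threshold:
ap, am = alphas(0.2)
# S-independent noise (F = 0) gives alpha^- = 0, as stated in the text:
ap0, am0 = alphas(0.2, F=0.0)
```

The vanishing of $\alpha^{-}_{ij}$ for $F=0$ follows directly from $\bar c(\beta;+)=\bar c(\beta;-)$, independently of the parameter values.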
In order to obtain a true effective Hamiltonian, the coefficients $\alpha^{\pm}_{ij}$ in (11) need to be symmetric. Once $\Psi(u)$ is fixed, this depends on the choice for $P(x_j|s_j)$, i.e., on the fast-noise details. This is studied in the next section. Meanwhile, we remark that the effective local fields $h^{\mathrm{eff}}_i$ defined above are very useful in practice. That is, they may be computed, at least numerically, for any rate and noise distribution. As long as $\Psi(u+v) = \Psi(u)\,\Psi(v)$ and $P^{st}(\mathbf{X}|\mathbf{S})$ factorizes,⁴ an effective transition rate follows as

$$\bar{c}[\mathbf{S} \to \mathbf{S}^i] = \exp\left( -s_i h^{\mathrm{eff}}_i / T \right).$$
(14)

This effective rate may then be used in computer simulations, and it may also be substituted into the relevant equations. Consider, for instance, the overlaps, defined as the product of the current state with one of the stored patterns:

$$m^{\nu}(\mathbf{S}) \equiv \frac{1}{N} \sum_i s_i\, \xi^{\nu}_i.$$
(15)

Here, $\xi^{\nu} = \{\xi^{\nu}_i = \pm 1,\ i = 1,\ldots,N\}$ are $M$ random patterns previously stored in the system, $\nu = 1,\ldots,M$.

⁴ The factorization here does not need to be in products $P(x_j|s_j)$ as in (9). The same result (14) holds for the choice that we shall introduce in the next section, for instance.

After using standard techniques [Hertz et al., 1991, Marro and Dickman, 1999]; see also [Amit et al., 1987], it follows from (4) that

$$\partial_t m^{\nu} = 2N^{-1} \sum_i \xi^{\nu}_i \left[ \sinh\left( h^{\mathrm{eff}}_i / T \right) - s_i \cosh\left( h^{\mathrm{eff}}_i / T \right) \right],$$
(16)

which is to be averaged over both thermal noise and pattern realizations. Alternatively, one might perhaps obtain dynamic equations of the type (16) by using Fokker-Planck-like formalisms as, for instance, in [Brunel and Hakim, 1999].
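The overlap observable (15) itself is straightforward to evaluate; a minimal sketch (the stored patterns are random, as in the text; the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

N, M = 200, 5
xi = rng.choice([-1, 1], size=(M, N))   # M stored random patterns

def overlaps(s, xi):
    """Equation (15): m^nu(S) = (1/N) sum_i s_i xi^nu_i, for all nu."""
    return xi @ s / xi.shape[1]

# A state equal to pattern 0 has overlap 1 with it and only
# O(1/sqrt(N)) overlaps with the other, independent patterns.
s = xi[0].copy()
m = overlaps(s, xi)
```

Tracking this $M$-vector along a simulation of rate (14) is how the retrieval and escape behavior discussed later is monitored.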
4 Types of synaptic noise
The above discussion and, in particular, equations (11) and (12) suggest that the system's emergent properties will importantly depend on the details of the synaptic noise $\mathbf{X}$. We now work out the equations in section 3 for different hypotheses concerning the stationary distribution (6).

Consider first (9) with the following specific choice:

$$P(x_j|s_j) = \frac{1 + s_j F_j}{2}\, \delta(x_j + \Phi) + \frac{1 - s_j F_j}{2}\, \delta(x_j - 1).$$
(17)

This corresponds to a simplification of the stochastic variable $x_j$. That is, for $F_j = 1\ \forall j$, the noise modifies $\overline{w}_{ij}$ by a factor $-\Phi$ when the presynaptic neuron is firing, $s_j = 1$, while the learned synaptic intensity remains unchanged when the neuron is silent. In general, $w_{ij} = -\overline{w}_{ij}\Phi$ with probability $\frac{1}{2}(1 + s_j F_j)$. Here, $F_j$ stands for some information concerning the presynaptic site $j$, such as, for instance, a local threshold or $F_j = M^{-1} \sum_{\nu} \xi^{\nu}_j$.

Our interest in case (17) is twofold: it corresponds to an exceptionally simple situation, and it reduces our model to two known cases. This becomes evident by looking at the resulting local fields:

$$h^{\mathrm{eff}}_i = \frac{1}{2} \sum_{j \neq i} \left[ (1-\Phi)\, s_j - (1+\Phi)\, F_j \right] \overline{w}_{ij}.$$
(18)

That is, exceptionally, symmetries here are such that the system is described by a true effective Hamiltonian. Furthermore, this corresponds to the Hopfield model, except for a rescaling of temperature and for the emergence of a threshold $\theta_i \equiv \sum_j \overline{w}_{ij} F_j$ [Hertz et al., 1991]. On the other hand, it also follows that, concerning stationary properties, the resulting effective Hamiltonian (8) reproduces the model in [Bibitchkov et al., 2002]. In fact, this would correspond in our notation to $h^{\mathrm{eff}}_i = \frac{1}{2} \sum_{j \neq i} \overline{w}_{ij} s_j x^{\infty}_j$, where $x^{\infty}_j$ stands for the stationary solution of a certain dynamic equation for $x_j$. The conclusion is that (except perhaps concerning dynamics, which is something worth investigating) the fast noise according to (9) with (17) does not imply any surprising behavior. In any case, this choice of noise illustrates the utility of the effective-field concept as defined above.

Our interest here is in modeling noise consistent with the observation of short-time synaptic depression [Tsodyks et al., 1998, Pantic et al., 2002]. In fact, case (17) in some way mimics the fact that increasing the mean firing rate results in decreasing the synaptic weight. With the same motivation, a more intriguing behavior ensues by assuming, instead of (9), the factorization
$$P^{st}(\mathbf{X}|\mathbf{S}) = \prod_j P(x_j|\mathbf{S}) \qquad (19)$$

with

$$P(x_j|\mathbf{S}) = \zeta(\mathbf{m})\, \delta(x_j + \Phi) + \left[ 1 - \zeta(\mathbf{m}) \right] \delta(x_j - 1). \qquad (20)$$
Here, $\mathbf{m} = \mathbf{m}(\mathbf{S}) \equiv \left( m^1(\mathbf{S}), \ldots, m^M(\mathbf{S}) \right)$ is the $M$-dimensional overlap vector, and $\zeta(\mathbf{m})$ stands for a function of $\mathbf{m}$ to be determined. The depression effect here depends on the overlap vector, which measures the net current arriving at postsynaptic neurons. The non-local choice (19)-(20) thus introduces non-trivial correlations between synaptic noise and neural activity, which are not considered in (17). Note that, therefore, we are not modelling here the synaptic depression dynamics in an explicit way as, for instance, in [Tsodyks et al., 1998]. Instead, equation (20) amounts to considering fast synaptic noise which naturally depresses the strength of the synapses after repeated activity, namely, for a high value of $\zeta(\mathbf{m})$.

Several further comments on the significance of (19)-(20), which together with $p \to 0$ is a main hypothesis here, are in order. We first mention that the system's time relaxation is typically orders of magnitude larger than the time scale for the various synaptic fluctuations reported to account for the observed high variability in the postsynaptic response of central neurons [Zador, 1998]. On the other hand, these fluctuations seem to have different sources such as, for instance, the stochasticity
are in order. We ﬁrst mention thatthe system time relaxation is typically orders of magnitude larger than the time scale for the various synaptic ﬂuctuations reported to account forthe observed high variability in the postsynapticresponse of central neurons [Zador, 1998]. On theother hand, these ﬂuctuations seem to have diﬀerent sources such as, for instance, the stochasticity5