Leabra
Contents
Introduction
Leabra stands for Local, Errordriven and Associative, Biologically Realistic Algorithm, and it implements a balance between associative (Hebbian) and errordriven learning on top of a biologicallybased pointneuron activation function with inhibitory competition dynamics (either via inhibitory interneurons or an approximation thereof), which produce kWinnersTakeAll (kWTA) sparse distributed representations. Extensive documentation is available from the online textbook: http://ccnbook.colorado.edu which serves as a second edition to the original book: Computational Explorations in Cognitive Neuroscience: Understanding the Mind by Simulating the Brain, O'Reilly and Munakata, 2000, Cambridge, MA: MIT Press. (Computational Explorations..)
As of November 3, 2014: Please see Leabra 710 Update for important information about updating to the 7.1.0 version of emergent.
Overview
The basic activation dynamics of Leabra are based on standard electrophysiological principles of real neurons, and in discrete spiking mode we implement exactly the AdEx (adapting exponential) model of Gerstner and colleagues (Scholarpedia article on AdEx). The rate code mode (which runs faster and allows for smaller networks) provides a very close approximation to the AdEx model behavior, in terms of a graded activation signal matching the actual instantaneous rate of spiking across a population of AdEx neurons. We generally conceive of a single ratecode neuron as representing a microcolumn of roughly 100 spiking pyramidal neurons in the neocortex. There are also shortterm plasticity dynamics that produce synaptic depression and potentiation, as described in Hennig (2013).
The excitatory synaptic input conductances (i.e., net input) is computed as an average, not a sum, over connections, based on normalized, sigmoidaly transformed weight values, which are subject to scaling on a connectiongroup level to alter relative contributions. Automatic scaling is performed to compensate for differences in expected activity level in the different projections. See Leabra Netin Scaling for details.
Inhibition is computed using a feedforward (FF) and feedback (FB) inhibition function (FFFB) that closely approximates the behavior of inhibitory interneurons in the neocortex. FF is based on a multiplicative factor applied to the average net input coming into a layer, and FB is based on a multiplicative factor applied to the average activation within the layer. These simple linear functions do an excellent job of controlling the overall activation levels in bidirectionally connected networks, producing behavior very similar to the more abstract computational implementation of kWTA dynamics implemented in previous versions.
There is a single learning equation, derived from a very detailed model of spike timing dependent plasticity (STDP) by Urakubo et al, 2008, that produces a combination of Hebbian associative and errordriven learning. For historical reasons, we call this the XCAL equation (eXtended Contrastive Attractor Learning), and it is functionally very similar to the BCM learning rule developed by Bienenstock, Cooper & Munro (1982). The essential learning dynamic involves a Hebbian coproduct of sending neuron activation times receiving neuron activation, which biologically reflects the amount of calcium entering through NMDA channels, and this coproduct is then compared against a floating threshold value. To produce the Hebbian learning dynamic, this floating threshold is based on a longterm running average of the receiving neuron activation. This is the key idea for the BCM algorithm. To produce errordriven learning, the floating threshold is based on a much faster running average of activation coproducts, which reflects an expectation or prediction, against which the instantaneous, later outcome is compared.
Weights are subject to a contrast enhancement function, which compensates for the soft (exponential) weight bounding that keeps weights within the normalized 01 range. Contrast enhancement is important for enhancing the selectivity of selforganizing learning, and generally results in faster learning with better overall results. Learning operates on the underlying internal linear weight value. Biologically, we associate the underlying linear weight value with internal synaptic factors such as actin scaffolding, CaMKII phosphorlation level, etc, while the contrast enhancement operates at the level of AMPA receptor expression. We also support (optionally) a distinction between fast and slow weights  biologically fast weights reflect the early LTP dynamics, which rise and fall over roughly 15 minutes after induction, while slow weights reflect the stable longterm weight values that the fast weights decay back to.
There are various extensions to the algorithm that implement special neural mechanisms associated with the prefrontal cortex and basal ganglia (PBWM), dopamine systems PVLV, the Hippocampus, and temporal integration dynamics associated with the thalamocortical circuits (LeabraTI, and CIFER).
Leabra Algorithm Equations
The pseudocode for Leabra is given here, showing exactly how the pieces of the algorithm fit together, using the equations and variables from the actual code. The emergent implementation contains a number of optimizations (including vectorization and GPU code), but this provides the core math in simple form.
Variables
LeabraUnits are organized into LeabraLayers, which sometimes have unit groups (which are now typically purely virtual, not actual Unit_Group objects). The LeabraUnit has the following key parameters, along with a number of others that are used for other nondefault algorithms and various optimizations, etc.
 act = activation sent to other units
 act_nd = nondepressed activation  prior to application of any shortterm plasticity
 net_raw = raw netinput, prior to timeaveraging
 net = timeaveraged excitatory conductance (net input)
 gc_i = inhibitory conductance, computed from FFFB inhibition function typically
 I_net = net current, combining excitatory, inhibitory, and leak channels
 v_m = membrane potential
 v_m_eq = equilibrium membrane potential  not reset by spikes  just keeps integrating
 adapt = adaptation current
 avg_ss = supershort term running average activation
 avg_s = shortterm running average activation, integrates over avg_ss, represents plus phase learning signal
 avg_m = mediumterm running average activation, integrates over avg_s, represents minus phase learning signal
 avg_l = longterm running average activation, integrates over avg_m, drives longterm floating average for Hebbian learning
Units are connected via synapses parameterized with the following variables. These are actually stored in an optimized vector format, but the LeabraCon object contains the variables as a template.
 wt = net effective synaptic weight between objects  subject to contrast enhancement compared to fwt and swt
 dwt = deltawt  change in synaptic weights due to learning
 fwt = fast weight  used for advanced fast and slow weight learning dynamic  otherwise equal to swt  stored as noncontrast enhanced value
 swt = slow weight  standard learning rate weight  stored as noncontrast enhanced value
LeabraCycle: Net input, Inhibition, Activation
For every cycle of activation updating, compute the net input, inhibition, membrane potential, and activation:
 Net input (see LeabraUnitSpec.cpp for code):

net_raw += (sum over recv connections of:) scale_eff * act * wt
 scale_eff = Leabra Netin Scaling factor that includes 1/N to compute an average, plus wt_scale.rel and abs relative and absolute scaling terms.
 act = sending unit activation
 wt = receiving connection weight value between sender and receiver
 emergent does this very efficiently by using a senderbased computation, that only sends changes (deltas) in activation values  typically only a few percent of neurons send on any given cycle.

net += dt.integ * dt.net_dt * (net_raw  net)
 time integration of net input, using net_dt (1/1.4 default), and global integration time constant, dt.integ (1 = 1 msec default)

 Inhibition (see LeabraLayerSpec.cpp for code):

ffi = ff * MAX(netin.avg  ff0, 0)
 feedforward component of inhibition with ff multiplier  has ff0 offset and can't be negative (that's what the MAX(.. ,0) part does).
 netin.avg is average of net variable across unit group or layer, depending on what level this is being computed at (both are supported)

fbi += fb_dt * (fb * acts.avg  fbi)
 feedback component of inhibition with fb multiplier  requires time integration to dampen oscillations that otherwise occur  fb_dt = 1/1.4 default

gc_i = gi * (ffi + fbi)

 Membrane potential (see LeabraUnitSpec.cpp for code)

I_net = net * (e_rev.e  v_m) + gc_l * (e_rev.l  v_m) + gc_i * (e_rev.i  v_m) + noise
 net current = sum of individual ionic channels: e = excitatory, l = leak (gc_l is a constant, 0.1 default), and i = inhibitory
 e_rev are reversal potentials: in normalized values derived from biophysical values, e_rev.e = 1, .l = 0.3, i = 0.25
 noise is typically gaussian if added
 if ex:
I_net += g_bar.l * exp_slope * exp((v_m  thr) / exp_slope)
 this is the exponential component of AdEx, if in use (typically only for discrete spiking), exp_slope = .02 default

v_m += dt.integ * dt.vm_dt * (I_net  adapt)
 in emergent, we use a simple midpoint method that evaluates v_m with a halfstep time constant, and then uses this halfstep v_m to compute full step in above I_net equation. vm_dt = 1/3.3 default.
 v_m is always computed as in discrete spiking, even when using rate code, with v_m reset to vm_r etc  this provides a more natural way to integrate adaptation and shortterm plasticity mechanisms, which drive off of the discrete spiking.

v_m_eq += dt.integ * dt.vm_dt * (I_net  adapt)
 the equilibrium version of the membrane potential does not reset with spikes, and is important for rate code per below

 Activation (see LeabraUnitSpec.cpp for code)

g_e_thr = (gc_i * (e_rev_i  thr) + gc_l * (e_rev_l  thr)  adapt) / (thr  e_rev.e)
 the amount of excitatory conductance required to put the neuron exactly at the firing threshold, thr = .5 default.

if(v_m > spk_thr) { spike = 1; v_m = vm_r } else { spike = 0 }
 spk_thr is spiking threshold (1.2 default, different from rate code thr), vm_r = .3 is the reset value of the membrane potential after spiking  we also have an optional refractory period after spiking, default = 3 cycles, where the vm equations are simply not computed, and vm remains at vm_r.
 if using spiking mode, then act = spike, otherwise, rate code function is below

if(v_m_eq <= thr) { new_act = NXX1(v_m_eq  thr) } else { new_act = NXX1(net  g_e_thr) }
 it is important that the time to first "spike" be governed by v_m integration dynamics, but after that point, it is essential that activation drive directly from the excitatory conductance (g_e or net) relative to the g_e_thr threshold  activation rates are linear in this term, but not even a welldefined function of v_m_eq  earlier versions of Leabra only used the v_m_eqbased term, and this led to some very strange behavior.
 NXX1 = noisyxoverx+1 function, which is implemented using a lookup table due to the convolving of the XX1 function with a gaussian noise kernel

XX1(x) = gain * x / (gain * x + 1)
 gain = 100 default

act_nd += dt.integ * dt.vm_dt * (new_act  act_nd)
 nondepressed rate code activation is timeintegrated using same vm_dt time constant as used in v_m, from the new activation value

act = act_nd * syn_tr (or just act_nd)
 if shortterm plasticity is in effect, then syn_tr variable reflects the synaptic transmission efficacy, and this product provides the net signal sent to the receiving neurons. otherwise syn_tr = 1.

adapt += dt.integ * (adapt.dt * (vm_gain * (v_m  e_rev.l)  adapt) + spike * spike_gain)
 adaptation current  causes rate of activation / spiking to decrease over time, adapt.dt = 1/144, vm_gain = 0.04, spike_gain = .00805 defaults

Learning
After iterating over some number of cycles, with typically the first set of 50 to 75 cycles being in the "minus phase" (no target values clamped on output layers, or other sources of plus phase signals as in LeabraTI), and the final 25 cycles being the "plus phase", where targets or other training signals are present, then learning equations are computed. These are based on running averages of activations as shown first.
 Running averages computed continuously every cycle, and note the compounding form (see LeabraUnitSpec.cpp for code)

avg_ss += dt.integ * ss_dt * (act_nd  avg_ss)
 supershort time scale running average, ss_dt = 1/2 default  this was introduced to smooth out discrete spiking signal, but is also useful for rate code

avg_s += dt.integ * act_avg.s_dt * (avg_ss  avg_s)
 short time scale running average, s_dt = 1/2 default  this represents the "plus phase" or actual outcome signal in comparison to avg_m

avg_m += dt.integ * act_avg.m_dt * (avg_s  avg_m)
 medium timescale running average, m_dt = 1/10 average  this represents the "minus phase" or expectation signal in comparison to avg_s

avg_l += (avg_m > 0.1) ? (avg_m * l_up_inc) : (l_dn_dt * own_lay.acts_p_avg_eff * (avg_m  avg_l)
 longterm running average  this is computed just once per learning trial, not every cycle like the ones above  l_up_inc = .2, l_dn_tau = 2.5 defaults
 the logic is that the additive increases can increase without bound, to drive high longterm running average, while it decays down to the natural 0 bound exponentially  x ? y : z terminology is C syntax for if x is true, then y, else z

 Learning equation (see LeabraConSpec.h for code)  most of these are intermediate variables used in computing final dwt value

srs = ru>avg_s * su>avg_s
 shortterm senderreceiver coproduct  this is the intracellular calcium from NMDA and other channels

srm = ru>avg_m * su>avg_m
 mediumterm senderreceiver coproduct  this drives dynamic threshold for errordriven learning

sm_mix = s_mix * srs + (1s_mix) * srm
 mix in some of the mediumterm factor into the shortterm factor  this is important for ensuring that when neuron turns off in the plus phase (short term), that enough trace of earlier minusphase activation remains to drive it into the LTD weight decrease region  s_mix = .9 default.

lthr = thr_l_mix * cos_diff_avg * su>avg_m * ru>avg_l
 longterm floating threshold value  drives Hebbian BCM learning component  key driver is receiving unit avg_l, but we also need to multiply by avg_m to keep value in proper dynamic range as the srs term that drives learning  avg_s could alternatively be used (to produce a more purely selforganizing learning dynamic), but given that avg_m shows up in errordriven term as well, it is more parsimonious to use it here (and it works better overall)  thr_l_mix is a constant (.05 default) determining how much of the learning is based on this hebbian longterm value
 lthr can (optionally) be weighted by the cos_diff_avg factor which is the normalized dot product (cosine) of the minus versus plus phase activations on the receiving layer, measuring the overall size of the error signal, averaged over a long time scale (100 trial tau time constant)  this accounts for the fact that some layers tend to have much larger error signals than others, based on where they are in the structural hierarchy, and that then affects the overall balance of hebbian vs. errordriven learning factors  this factor allows us to use a single constant thr_l_mix term for the entire network, whereas otherwise achieving the same net balance of learning terms requires setting different parameters for each layer in the network. We assume that evolution could figure out these differential parameters, but we could sure use the shortcut.

mthr = (1  thr_l_mix * cos_diff_avg) * srm
 mediumterm floating threshold value  weighted by 1 minus the longterm factor, so that net threshold is weighted average of the Hebbian longtermaverage factor and the errordriven mediumterm average factor.

dwt += lrate * XCAL(sm_mix, lthr + mthr)
 the change in weights is a function of the shortterm average coproduct (with a little bit of m mixed in) and the combined longterm and mediumterm floating threshold terms.
 XCAL is the "check mark" linearized BCMstyle learning function (see figure) that was derived from the Urakubo Et Al (2008) STDP model, as described in more detail in the CCN textbook: http://ccnbook.colorado.edu
XCAL(x, th) = (x < d_thr) ? 0 : (x > th * d_rev) ? (x  th) : (x * ((1d_rev)/d_rev))
 d_thr = 0.0001, d_rev = 0.1 defaults
 x ? y : z terminology is C syntax for: if x is true, then y, else z

 Weight update equation (see LeabraConSpec.h for code) (see below for alternative version using fast weights)
 The fwt value here is the linear, noncontrast enhanced version of the weight value, while wt is the sigmoidal contrastenhanced version, which is used for sending netinput to other neurons. One can compute fwt from wt and viceversa, but numerical errors can accumulate in going backand forth more than necessary, and it is generally faster to just store these two weight values (and they are needed for the fast weights version show below).

dwt *= (dwt > 0) ? (1fwt) : fwt
 soft weight bounding  weight increases exponentially decelerate toward upper bound of 1, and decreases toward lower bound of 0. based on linear, noncontrast enhanced fwt weights.

fwt += dwt
 increment the linear weights with the bounded dwt term

wt = SIG(fwt)
 new weight value is sigmoidal contrast enhanced version of fast weight

SIG(x) = 1 / (1 + (off * (1w)/w)^gain)

dwt = 0
 reset weight changes now that they have been applied.
 Fast Weights version of the weight update equation  this is optional and not widely used (as of yet)

dwt *= (dwt > 0) ? (1fwt) : fwt
 soft weight bounding  weight increases exponentially decelerate toward upper bound of 1, and decreases toward lower bound of 0. based on fast weights (in linear, noncontrast enhanced form), so that fast weights can drive learning saturation if in effect.

fwt += dwt + decay_dt * (swt  fwt)
 fast weights, if in effect, decay back to slow weights  otherwise fwt = swt and so it just takes the increment from dwt

swt += dwt * slow_lrate
 slow weights, if using fast weights, learn at a slower learning rate based on same dwt term  otherwise swt = fwt

nwt = SIG(fwt)
 new weight value is sigmoidal contrast enhanced version of fast weight (nwt is not stored  just a tmp variable)

SIG(x) = 1 / (1 + (off * (1w)/w)^gain)

wt += wt_dt * (nwt  wt)
 weight moves toward new weight value with a time constant  reflects slowed rise time seen in early LTP dynamics  if not using fast weights, then wt = nwt directly (as if wt_dt = 1)

dwt = 0
 reset weight changes now that they have been applied.

Specific Leabra Object Information
The following provide more details for the main class objects used in Leabra.
Network structure
Specs
As discussed in Specs, the specs contain all the parameters and algorithmspecific code, while the above objects contain the dynamic state information.
Special Algorithms
See LeabraWizard for special functions for configuring several of these special architectural features.
Simple Recurrent Network Context Layer
 LeabraContextLayerSpec  copies activation state of hidden layer
Scalar, Graded Value Representation
 ScalarValLayerSpec  encodes and decodes scalar, realnumbered values based on a coarse coded distributed representation (e.g., a Gaussian bump) across multiple units. This provides a very efficient and effective way of representing scalar values  individual Leabra units do not do a very good job of that, as they have a strong binary bias.
 TwoDValLayerSpec  twodimensional version of scalar val
 MotorForceLayerSpec  represents motor output and input forces in a distributed manner across unit groups representing position and velocity.
Temporal Differences and General Da (dopamine) Modulation
Temporal differences (TD) is widely used as a model of midbrain dopaminergic firing. Also included are Leabra Units that respond to simulated dopaminergic modulation (DaMod)
See Leabra TD for details.
PVLV  Pavlovian Conditioning
Simulates behavioral and neural data on Pavlovian conditioning and the midbrain dopaminergic neurons that fire in proportion to unexpected rewards (an alternative to TD). It is described in these papers: O'Reilly, Frank, Hazy & Watz, 2007 Hazy, Frank & O'Reilly, 2010. The current version (described here) is as described in the 2010 paper. A PVLV model can be made through the LeabraWizard  under Networks menu.
See Leabra PVLV for full details.
PBWM  Prefrontal Cortex Basal Ganglia Working Memory
Uses PVLV to train PFC working memory updating system, based on the biology of the prefrontal cortex and basal ganglia. For complete details, see O'Reilly and Frank, 2006.
Described in Leabra PBWM.
Other Misc Classes
 MarkerConSpec  a "null" connection that doesn't do anything, but serves as a marker for special computations (e.g., the temporal derivative computation instead of standard net input).
 LeabraLinUnitSpec  linear activation function
 LeabraNegBiasSpec  negative bias learning (for refractory neurons)
 TrialSynDepConSpec, TrialSynDepCon  synaptic depression on a trialwise time scale
 FastWtConSpec, FastWtCon  transient fast weights in addition to standard slowly adapting weights
 ActAvgHebbConSpec  hebbian learning includes proportion of timeaveraged activation (as in the "trace" rule)
Parameter Setting Tips and Tricks
See Leabra Params for detailed discussion about adjusting parameters.
Software Version Update Issues
 Leabra 4.0.15 Notes  Version 4.0.15 release notes for Leabra