# Leabra

## Introduction

Leabra stands for Local, Error-driven and Associative, Biologically Realistic Algorithm. It implements a balance between Hebbian and error-driven learning on top of a biologically based point-neuron activation function with inhibitory competition dynamics (either via inhibitory interneurons or a fast k-Winners-Take-All approximation thereof). Extensive documentation is available from the online textbook http://ccnbook.colorado.edu, which serves as a second edition to the original book: Computational Explorations in Cognitive Neuroscience: Understanding the Mind by Simulating the Brain, O'Reilly and Munakata, 2000, Cambridge, MA: MIT Press.

NOTE: the following is out of date and will be updated at some point soon -- in the meantime, please refer to http://ccnbook.colorado.edu, which has all the current information on what is actually implemented in the most recent version of emergent, along with a large number of simulations that demonstrate how it works.

Hebbian learning is performed using the conditional principal components analysis (CPCA) algorithm, with a correction factor for sparse expected activity levels.

Error-driven learning is performed using GeneRec, which is a generalization of the Recirculation algorithm, and approximates Almeida-Pineda recurrent backprop. The symmetric, midpoint version of GeneRec is used, which is equivalent to the contrastive Hebbian learning algorithm (CHL). See O'Reilly (1996; Neural Computation) for more details.

The activation function is a point-neuron approximation with both discrete spiking and continuous rate-code output.

Layer or unit-group level inhibition can be computed directly using a k-winners-take-all (KWTA) function, producing sparse distributed representations, or via inhibitory interneurons.

The net input is computed as an average, not a sum, over connections, based on normalized, sigmoidally transformed weight values, which are subject to scaling on a connection-group level to alter relative contributions. Automatic scaling is performed to compensate for differences in expected activity level in the different projections. See Leabra Netin Scaling for details.

Weights are subject to a contrast enhancement function, which compensates for the soft (exponential) weight bounding that keeps weights within the normalized 0-1 range. Contrast enhancement is important for enhancing the selectivity of self-organizing learning, and generally results in faster learning with better overall results. Learning operates on the underlying internal linear weight value, which is computed from the nonlinear (sigmoidal) weight value prior to making weight changes, and is then converted back. The linear weight is always stored as a negative value, so that shared weights or multiple weight updates do not try to linearize the already-linear value. The learning rules have been updated to assume that wt is negative (and linear).
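As a rough illustration of this linear/sigmoidal weight scheme, the following sketch pairs a sigmoidal contrast-enhancement function with its inverse. The specific functional form and the gain/offset values are assumptions for illustration, not parameters specified in this document:

```python
# Hypothetical sigmoidal contrast enhancement of weights in (0, 1).
# GAIN and OFF are assumed, illustrative values, not from this document.
GAIN = 6.0   # sharpness of the contrast enhancement (assumed)
OFF = 1.25   # offset shifting the sigmoid's midpoint (assumed)

def contrast_enhance(lin_w):
    """Map a linear weight in (0, 1) to its contrast-enhanced value."""
    return 1.0 / (1.0 + (OFF * (1.0 - lin_w) / lin_w) ** GAIN)

def linearize(sig_w):
    """Inverse mapping: recover the linear weight before learning on it."""
    a = (1.0 / sig_w - 1.0) ** (1.0 / GAIN)
    return OFF / (OFF + a)

# Round trip: learning operates on the linear value, which is then
# converted back to the contrast-enhanced form.
w_lin = linearize(contrast_enhance(0.3))
```

Weights below the sigmoid's midpoint are pushed toward 0 and those above it toward 1, which is what enhances selectivity.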

There are various extensions to the algorithm that implement things like reinforcement learning (temporal differences), simple recurrent network (SRN) context layers, and combinations thereof (including an experimental version of a complex temporal learning mechanism based on the prefrontal cortex and basal ganglia). Other extensions include a variety of options for the activation and inhibition functions, self regulation (accommodation and hysteresis, and activity regulation for preventing overactive and underactive units), synaptic depression, and various optional learning mechanisms and means of adapting parameters. These features are off by default but do appear in some of the edit dialogs --- any change from default parameters should be evident in the edit dialogs.

## Overview of the Leabra Algorithm

The pseudocode for Leabra is given here, showing exactly how the pieces of the algorithm described in more detail in the subsequent sections fit together.

```
Iterate over minus and plus phases of settling for each event.
  o At start of settling, for all units:
    - Initialize all state variables (activation, v_m, etc).
    - Apply external patterns (clamp input in minus phase; input & output
      in plus phase).
    - Compute net input scaling terms (constants, computed here so the
      network can be dynamically altered).
    - Optimization: compute net input once from all static activations
      (e.g., hard-clamped external inputs).
  o During each cycle of settling, for all non-clamped units:
    - Compute excitatory net input (g_e(t), aka eta_j or net)
      -- sender-based optimization by ignoring inactive units.
    - Compute kWTA inhibition for each layer, based on g_i^Theta:
      * Sort units into two groups based on g_i^Theta: top k and
        remaining k+1 -> n.
      * If basic kWTA, find the kth and k+1th highest values;
        if avg-based, compute the average of 1 -> k and of k+1 -> n.
      * Set inhibitory conductance g_i from g^Theta_k and g^Theta_{k+1}.
    - Compute point-neuron activation combining excitatory input and
      inhibition.
  o After settling, for all units, record final settling activations
    as either minus or plus phase (y^-_j or y^+_j).
After both phases, update the weights (based on linear current
weight values), for all connections:
  o Compute error-driven weight changes with soft weight bounding.
  o Compute Hebbian weight changes from plus-phase activations.
  o Compute net weight change as weighted sum of error-driven and Hebbian.
  o Increment the weights according to the net weight change.
```
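The phase structure above can be compressed into a toy, runnable skeleton. Everything here is illustrative: the settling loop is a stand-in relaxation, not Leabra's point-neuron update, and the single-connection "network" exists only to show where the minus phase, plus phase, and CHL-style weight update fit in:

```python
# Toy skeleton of one Leabra trial: settle in the minus phase, clamp the
# target in the plus phase, then apply the error-driven weight change.
# All dynamics and names here are illustrative stand-ins.

def settle(net_input, n_cycles=20, rate=0.2):
    """Stub settling loop: relax activation toward its net input."""
    act = 0.0
    for _ in range(n_cycles):
        act += rate * (net_input - act)  # toy dynamics, not Leabra's
    return act

def run_trial(w, x, target, lrate=0.01):
    """One minus/plus trial for a single input -> output connection."""
    y_minus = settle(x * w)            # minus phase: network's own output
    y_plus = target                    # plus phase: output clamped to target
    dwt = (x * y_plus) - (x * y_minus) # CHL error-driven term
    return w + lrate * dwt

w = run_trial(0.5, x=1.0, target=1.0)
```

When the target exceeds the minus-phase output the weight grows, and when it falls short the weight shrinks, which is the essence of the two-phase update.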


## Point Neuron Activation Function

Default parameter values:

| Parameter | Value | Parameter | Value |
|-----------|-------|-----------|-------|
| E_l       | 0.15  | gbar_l    | 0.10  |
| E_i       | 0.15  | gbar_i    | 1.0   |
| E_e       | 1.00  | gbar_e    | 1.0   |
| V_rest    | 0.15  | Theta     | 0.25  |
| tau       | .02   | gamma     | 600   |
| k_hebb    | .02   | epsilon   | .01   |


Leabra uses a point neuron activation function that models the electrophysiological properties of real neurons, while simplifying their geometry to a single point. This function is nearly as simple computationally as the standard sigmoidal activation function, but the more biologically-based implementation makes it considerably easier to model inhibitory competition, as described below. Further, using this function enables cognitive models to be more easily related to more physiologically detailed simulations, thereby facilitating bridge-building between biology and cognition.

The membrane potential V_m is updated as a function of ionic conductances g with reversal (driving) potentials E as follows:

$\Delta V_m(t) = \tau \sum_c g_c(t) \overline{g_c} (E_c - V_m(t))$

with 3 channels (c) corresponding to: e excitatory input; l leak current; and i inhibitory input. Following electrophysiological convention, the overall conductance is decomposed into a time-varying component g_c(t) computed as a function of the dynamic state of the network, and a constant gbar_c that controls the relative influence of the different conductances. The equilibrium potential can be written in a simplified form by setting the excitatory driving potential (E_e) to 1 and the leak and inhibitory driving potentials (E_l and E_i) to 0:

$V_m^{\infty} = {{g_e \overline{g_e}} \over {g_e \overline{g_e} + g_l \overline{g_l} + g_i \overline{g_i}}}$

which shows that the neuron is computing a balance between excitation and the opposing forces of leak and inhibition. This equilibrium form of the equation can be understood in terms of a Bayesian decision making framework @cite{(O'Reilly & Munakata, 2000)}.
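A small sketch, using the default parameters from the table above, shows the Euler update converging to the closed-form equilibrium. The equilibrium here is written in its general weighted-average form with the actual reversal potentials, rather than the simplified 0/1 values:

```python
# Default parameters from the table above.
E_e, E_l, E_i = 1.00, 0.15, 0.15
gbar_e, gbar_l, gbar_i = 1.0, 0.10, 1.0
tau = 0.02

def vm_update(v_m, g_e, g_i, g_l=1.0):
    """One Euler step of the membrane potential equation."""
    d = (g_e * gbar_e * (E_e - v_m)
         + g_l * gbar_l * (E_l - v_m)
         + g_i * gbar_i * (E_i - v_m))
    return v_m + tau * d

def vm_equilibrium(g_e, g_i, g_l=1.0):
    """Closed-form asymptote: conductance-weighted average of the E's."""
    num = g_e * gbar_e * E_e + g_l * gbar_l * E_l + g_i * gbar_i * E_i
    den = g_e * gbar_e + g_l * gbar_l + g_i * gbar_i
    return num / den

# Iterating the update from rest converges to the equilibrium value.
v = 0.15  # V_rest
for _ in range(2000):
    v = vm_update(v, g_e=0.4, g_i=0.2)
```

With the assumed inputs g_e = 0.4 and g_i = 0.2, the potential settles between leak/inhibition (0.15) and excitation (1.0), reflecting the balance of forces described above.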

The excitatory net input/conductance g_e(t) or eta_j is computed as the proportion of open excitatory channels as a function of sending activations times the weight values:

$\eta_j = g_e(t) = \langle x_i w_{ij} \rangle = {{1} \over {n}} \sum_i x_i w_{ij}$

See Leabra Netin Scaling for details on rescaling of these values across projections to compensate for activity levels etc.
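A minimal example of the averaged net input, with made-up sending activations and weights:

```python
# Made-up sending activations x_i and receiving weights w_ij.
x = [0.0, 1.0, 0.95, 0.0, 1.0]
w = [0.5, 0.9, 0.3, 0.7, 0.1]

# g_e is the *average* (not sum) of x_i * w_ij over connections.
g_e = sum(xi * wi for xi, wi in zip(x, w)) / len(x)  # -> 0.257
```

Averaging (rather than summing) keeps g_e in a comparable range regardless of how many connections a unit receives, which is what the projection-level scaling then adjusts.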

The inhibitory conductance is computed via the kWTA function described in the next section, and leak is a constant.

Activation communicated to other cells (y_j) is a thresholded (Theta) sigmoidal function of the membrane potential with gain parameter gamma:

$y_j(t) = {{1} \over {\left(1 + {{1} \over {\gamma [V_m(t) - \Theta]_+}} \right)}}$

where [x]_+ is a threshold function that returns 0 if x < 0 and x if x > 0. Note that if it returns 0, we assume y_j(t) = 0, to avoid dividing by 0. As it is, this function has a very sharp threshold, which interferes with graded learning mechanisms (e.g., gradient descent). To produce a less discontinuous deterministic function with a softer threshold, the function is convolved with a Gaussian noise kernel (\mu=0, \sigma=.005), which reflects the intrinsic processing noise of biological neurons:

$y^*_j(x) = \int_{-\infty}^{\infty} {{1} \over {\sqrt{2 \pi} \sigma}} e^{-z^2/(2 \sigma^2)} y_j(z-x) dz$

where x represents the [V_m(t) - \Theta]_+ value, and y^*_j(x) is the noise-convolved activation for that value. In the simulation, this function is implemented using a numerical lookup table.
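The sharp function and its noise-convolved counterpart can be sketched with a brute-force numerical convolution, a stand-in for the lookup table; the grid resolution and integration width here are arbitrary choices:

```python
import math

# Default parameters from the table above.
gamma, theta, sigma = 600.0, 0.25, 0.005

def y_sharp(v_m):
    """Sharp thresholded sigmoid: zero at or below threshold."""
    x = max(v_m - theta, 0.0)
    if x == 0.0:
        return 0.0
    return 1.0 / (1.0 + 1.0 / (gamma * x))

def y_noisy(v_m, n=2001, width=6.0):
    """Convolve y_sharp with a Gaussian kernel over +/- width*sigma."""
    total, norm = 0.0, 0.0
    for i in range(n):
        z = -width * sigma + (2 * width * sigma) * i / (n - 1)
        g = math.exp(-z * z / (2 * sigma * sigma))
        total += g * y_sharp(v_m - z)
        norm += g
    return total / norm
```

At the threshold itself the sharp function is exactly 0, while the convolved version is already nonzero: the noise kernel smears activation below threshold, producing the soft onset that graded learning needs.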

## k-Winners-Take-All Inhibition

Leabra uses a kWTA (k-Winners-Take-All) function to achieve inhibitory competition among units within a layer (area). The kWTA function computes a uniform level of inhibitory current for all units in the layer, such that the k+1th most excited unit within a layer is generally below its firing threshold, while the kth is typically above threshold. Activation dynamics similar to those produced by the kWTA function have been shown to result from simulated inhibitory interneurons that project both feedforward and feedback inhibition (O'Reilly & Munakata, 2000). Thus, although the kWTA function is somewhat biologically implausible in its implementation (e.g., requiring global information about activation states and using sorting mechanisms), it provides a computationally effective approximation to biologically plausible inhibitory dynamics.

kWTA is computed via a uniform level of inhibitory current for all units in the layer as follows:

$g_i = g^{\Theta}_{k+1} + q (g^{\Theta}_k - g^{\Theta}_{k+1})$

where 0<q<1 (.25 default used here) is a parameter for setting the inhibition between the upper bound of g^Theta_k and the lower bound of g^Theta_k+1. These boundary inhibition values are computed as a function of the level of inhibition necessary to keep a unit right at threshold:

$g_i^{\Theta} = {{g^*_e \bar{g_e} (E_e - \Theta) + g_l \bar{g_l} (E_l - \Theta)} \over {\Theta - E_i}}$

where g^*_e is the excitatory net input without the bias weight contribution --- this allows the bias weights to override the kWTA constraint.

In the basic version of the kWTA function, which is relatively rigid about the kWTA constraint and is therefore used for output layers, g^Theta_k and g^Theta_k+1 are set to the threshold inhibition value for the kth and k+1th most excited units, respectively. Thus, the inhibition is placed exactly to allow k units to be above threshold, and the remainder below threshold. For this version, the q parameter is almost always .25, allowing the kth unit to be sufficiently above the inhibitory threshold.

In the average-based kWTA version, g^Theta_k is the average g_i^Theta value for the top k most excited units, and g^Theta_k+1 is the average of g_i^Theta for the remaining n-k units. This version allows for more flexibility in the actual number of units active depending on the nature of the activation distribution in the layer and the value of the q parameter (which is typically .6), and is therefore used for hidden layers.
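Both steps -- the at-threshold inhibition for each unit and the basic kWTA placement of the layer-wide g_i -- can be sketched as follows, using the default parameters from the table above and made-up excitatory inputs:

```python
# Default parameters from the table above.
E_e, E_l, E_i = 1.0, 0.15, 0.15
gbar_e, gbar_l, gbar_i = 1.0, 0.10, 1.0
theta, g_l = 0.25, 1.0

def g_i_theta(g_e):
    """Inhibition that would hold a unit with input g_e exactly at threshold."""
    return (g_e * gbar_e * (E_e - theta)
            + g_l * gbar_l * (E_l - theta)) / (theta - E_i)

def kwta_basic(g_es, k, q=0.25):
    """Basic kWTA: place uniform g_i between the kth and k+1th thresholds."""
    gts = sorted((g_i_theta(g) for g in g_es), reverse=True)
    # gts[k-1] is g^Theta_k (kth highest), gts[k] is g^Theta_{k+1}.
    return gts[k] + q * (gts[k - 1] - gts[k])

g_es = [0.6, 0.5, 0.4, 0.3, 0.2]  # made-up excitatory net inputs
g_i = kwta_basic(g_es, k=2)
```

With this g_i applied uniformly, exactly the top k units sit above their at-threshold inhibition value, which is the kWTA constraint in action.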

## Hebbian and Error-Driven Learning

For learning, Leabra uses a combination of error-driven and Hebbian learning. The error-driven component is the symmetric midpoint version of the GeneRec algorithm @cite{(O'Reilly, 1996)}, which is functionally equivalent to the deterministic Boltzmann machine and contrastive Hebbian learning (CHL). The network settles in two phases, an expectation (minus) phase where the network's actual output is produced, and an outcome (plus) phase where the target output is experienced, and then computes a simple difference of a pre and postsynaptic activation product across these two phases. For Hebbian learning, Leabra uses essentially the same learning rule used in competitive learning or mixtures-of-Gaussians which can be seen as a variant of the Oja normalization @cite{(Oja, 1982)}. The error-driven and Hebbian learning components are combined additively at each connection to produce a net weight change.

The equation for the Hebbian weight change is:

$\Delta_{hebb} w_{ij} = x^+_i y^+_j - y^+_j w_{ij} = y^+_j (x^+_i - w_{ij})$

and for error-driven learning using CHL:

$\Delta_{err} w_{ij} = (x^+_i y^+_j) - (x^-_i y^-_j)$

which is subject to a soft-weight bounding to keep within the 0-1 range:

$\Delta_{sberr} w_{ij} = [\Delta_{err}]_+ (1-w_{ij}) + [\Delta_{err}]_- w_{ij}$

The two terms are then combined additively with a normalized mixing constant k_hebb:

$\Delta w_{ij} = \epsilon[k_{hebb} (\Delta_{hebb}) + (1-k_{hebb}) (\Delta_{sberr})]$
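Putting the four learning equations together for a single connection gives a direct transcription of the formulas above, with the default k_hebb and epsilon from the parameter table:

```python
def delta_w(w, x_m, y_m, x_p, y_p, k_hebb=0.02, eps=0.01):
    """Combined Hebbian + soft-bounded CHL change for a linear weight w."""
    d_hebb = y_p * (x_p - w)            # Hebbian (CPCA) term
    d_err = (x_p * y_p) - (x_m * y_m)   # CHL error-driven term
    # Soft weight bounding: positive changes scaled by (1 - w),
    # negative changes scaled by w.
    d_sberr = d_err * (1.0 - w) if d_err > 0 else d_err * w
    return eps * (k_hebb * d_hebb + (1.0 - k_hebb) * d_sberr)
```

Because the error-driven term is scaled by (1 - w) when positive and by w when negative, repeated updates drive the weight asymptotically toward the bounds without ever crossing them.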

## Specific Leabra Object Information

The following provide more details for the main class objects used in Leabra.

### Specs

As discussed in Specs, the specs contain all the parameters and algorithm-specific code, while the above objects contain the dynamic state information.

### Special Algorithms

See LeabraWizard for special functions for configuring several of these special architectural features.

#### Scalar, Two-D, and Motor Value Representations

- ScalarValLayerSpec -- encodes and decodes scalar, real-numbered values based on a coarse-coded distributed representation (e.g., a Gaussian bump) across multiple units. This provides a very efficient and effective way of representing scalar values -- individual Leabra units do not do a very good job of that, as they have a strong binary bias.
- TwoDValLayerSpec -- a two-dimensional version of ScalarValLayerSpec.
- MotorForceLayerSpec -- represents motor output and input forces in a distributed manner across unit groups representing position and velocity.

#### Temporal Differences and General Da (dopamine) Modulation

Temporal differences (TD) is widely used as a model of midbrain dopaminergic firing. Also included are Leabra units that respond to simulated dopaminergic modulation (DaMod).

See Leabra TD for details.

#### PVLV -- Pavlovian Conditioning

Simulates behavioral and neural data on Pavlovian conditioning and the midbrain dopaminergic neurons that fire in proportion to unexpected rewards (an alternative to TD). It is described in these papers: O'Reilly, Frank, Hazy & Watz (2007) and Hazy, Frank & O'Reilly (2010). The current version (described here) is as described in the 2010 paper. A PVLV model can be made through the LeabraWizard, under the Networks menu.

See Leabra PVLV for full details.

#### PBWM -- Prefrontal Cortex Basal Ganglia Working Memory

Uses PVLV to train PFC working memory updating system, based on the biology of the prefrontal cortex and basal ganglia. For complete details, see O'Reilly and Frank, 2006.

Described in Leabra PBWM.

#### Other Misc Classes

- MarkerConSpec -- a "null" connection that doesn't do anything, but serves as a marker for special computations (e.g., the temporal derivative computation instead of standard net input).
- LeabraLinUnitSpec -- linear activation function.
- LeabraNegBiasSpec -- negative bias learning (for refractory neurons).
- TrialSynDepConSpec, TrialSynDepCon -- synaptic depression on a trial-wise time scale.
- FastWtConSpec, FastWtCon -- transient fast weights in addition to standard slowly adapting weights.
- ActAvgHebbConSpec -- Hebbian learning that includes a proportion of time-averaged activation (as in the "trace" rule).