AXTut PfcBg
From Emergent
Adding a Prefrontal Cortex, Basal Ganglia Working Memory System
There are many different ways of giving a neural network some amount of working or active memory, to hold on to prior events. Perhaps the simplest is to add a "simple recurrent network" (SRN) context layer that holds on to the prior time step's hidden layer activations, and then feeds back into the hidden layer to provide context for the current inputs.
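The SRN idea described above can be illustrated with a minimal sketch. This is a generic tanh network in plain Python, not the Leabra implementation used in emergent; the function names, weight layout, and activation function are assumptions chosen only for illustration.

```python
import math

def srn_step(x, context, w_in, w_ctx):
    """One time step: each hidden unit sums input and context contributions."""
    hidden = []
    for j in range(len(w_in)):
        net = sum(w * xi for w, xi in zip(w_in[j], x))
        net += sum(w * ci for w, ci in zip(w_ctx[j], context))
        hidden.append(math.tanh(net))
    return hidden

def run_srn(inputs, w_in, w_ctx):
    """Run a sequence; the context layer is a copy of the previous hidden layer."""
    context = [0.0] * len(w_in)  # empty context before the first input
    history = []
    for x in inputs:
        hidden = srn_step(x, context, w_in, w_ctx)
        history.append(hidden)
        context = list(hidden)  # copy hidden activations for the next step
    return history
```

With non-zero context weights, presenting the same input twice gives different hidden activations on the second step, because the context carries over the first step's state. Note that the context is overwritten on every step, which is the limitation that active gating is meant to address.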
However, there are various limitations of this simple SRN memory, which can be overcome by having an active gating mechanism that determines when to hold onto information and when to forget it. One scientific theory is that the basal ganglia provide this function, by interacting with the prefrontal cortex, which is widely regarded as the brain area responsible for holding onto working memory. The specific implementation of this idea, called PBWM (prefrontal-cortex basal-ganglia working memory; O'Reilly & Frank, 2006, Neural Computation) is available through the Leabra wizard, and we'll use that.
First, to prepare the model for the PBWM components, we need to move the Input layer up to the same level as the Output layer. For anatomically-inspired reasons, PBWM locates various brain-stem dopamine systems in the lower level of the model. To do this, click on the red arrow in the .T3Tab.Network_0 panel, then click on the vertical arrow poking through the green Input layer border, and drag it up to the level of the Output layer. The Output layer should move out of the way, and that is all you need to do, but if things don't look right, you can drag layers around with the horizontal arrows too.
(I think the path in the following paragraph is wrong. Following wizards|LeabraWizard_0, then hitting the Network button at the bottom and selecting PBWM, works for me. I can't see the fm_hid_cons selection that is mentioned below, but it doesn't seem to make a difference. I could see no other differences. - TM)
Next, go to the .PanelTab.LeabraWizard_0, and select Network/Bg PFC. A dialog with several options and lots of information comes up. Turn off fm_hid_cons, and turn on nolrn_pfc. This makes the PFC working memory layer activated directly from the input layer, and not the hidden layer, and it makes it a direct copy of the input layer, instead of having it learn new representations. These are "hacks" that simplify the model and make it easier to understand -- performance is generally the same without them. When you hit OK, you'll get a series of dialogs with information -- just keep hitting OK until it is done. You should now see a rather more elaborate network, with many more layers.
For complete details about these layers, see the O'Reilly & Frank (2006) paper (O'Reilly, R.C. & Frank, M.J. (2006). Making Working Memory Work: A Computational Model of Learning in the Frontal Cortex and Basal Ganglia. Neural Computation, 18, 283-328.) Here is a very brief overview:
- First, note that there are four separate stripes (groups of units) in the PFC and Matrix layers -- this was determined by the n_stripes parameter in the wizard. Each stripe can be independently updated, such that this system can remember up to 4 different things at the same time, each with a different "updating policy" of when memories are updated and maintained. The active maintenance of the memory is in PFC, and the updating signals (and updating policy more generally) come from the Matrix units (a subset of basal ganglia units).
- PV* and LV* and friends at the very bottom layer of the network: these represent the dopaminergic system, which provides reinforcement learning signals to train up the dynamic gating system in the basal ganglia. The PV layers represent primary values of reward (i.e., actual externally-delivered reward values), while the LV layers represent learned ("anticipated") values -- together, they account for Pavlovian conditioning phenomena and associated dopaminergic firing data.
- Matrix: this is the dynamic gating system representing the matrix units of the basal ganglia. Every even-index unit within a stripe represents "Go", while the odd-index units represent "NoGo." The Go units cause updating of the PFC, while the NoGo units cause the PFC to maintain its existing memory representation.
- SNrThal: represents the substantia nigra pars reticulata (SNr) and the associated area of the thalamus, which produce a competition among the Go/NoGo units within a given stripe. If there is more overall Go activity in a given stripe, then the associated SNrThal unit gets activated, and it drives updating in PFC.
- PFC: has 4 different stripes each of which has a localist one-to-one representation of the input units (due to the nolrn_pfc flag). Thus, you can look at these PFC representations and see directly what the network is maintaining.
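The gating logic described in the Matrix, SNrThal, and PFC items above can be summarized in a small sketch. This is a simplified illustration in plain Python, not the emergent implementation: the function names, the use of simple sums for the Go/NoGo competition, and the treatment of ties as "maintain" are all assumptions.

```python
def snrthal_gate(matrix_acts):
    """Return True when summed Go activity exceeds summed NoGo activity.

    Even-index units are treated as Go and odd-index units as NoGo,
    following the description of the Matrix layer above.
    """
    go = sum(matrix_acts[0::2])
    nogo = sum(matrix_acts[1::2])
    return go > nogo  # assumption: a tie counts as NoGo (maintain)

def pbwm_step(pfc, matrix, inp):
    """Update each PFC stripe independently based on its own gate.

    pfc:    list of per-stripe memory contents (lists of floats)
    matrix: list of per-stripe Matrix activations (lists of floats)
    inp:    current input pattern, copied one-to-one into gated stripes
    """
    new_pfc = []
    for stripe_mem, stripe_matrix in zip(pfc, matrix):
        if snrthal_gate(stripe_matrix):
            new_pfc.append(list(inp))         # Go wins: store the input
        else:
            new_pfc.append(list(stripe_mem))  # NoGo wins: keep the memory
    return new_pfc
```

Because each stripe has its own gate, one stripe can store the current input while the others keep holding whatever they stored earlier.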
Setting the RewTarg Input
Before we can run the model, we need to do one extra bit of configuration. The PBWM model learns from rewards and punishments generated based on how it is performing the task. Only the reward values generated on the probe trials are relevant, however, so we need to tell the model when the relevant trials are. This is done using the RewTarg layer (in the bottom layer), which is a new input layer that was added by the wizard. When we set this unit activation to 1, then that tells the network that this is a trial when reward should be computed based on the difference between the network's output and the correct answer. Note that this is not the direct value of the reward itself, just the indicator of when reward should be computed.
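The role of the RewTarg flag can be sketched as follows. This is only an illustration of the idea that the flag says when to compute reward, not what the reward is; the function name, the tolerance threshold, and the binary reward values are hypothetical and are not taken from the PBWM code.

```python
def compute_reward(rew_targ, output, target, tolerance=0.5):
    """Return a reward only on trials flagged by rew_targ.

    rew_targ:  1 on trials where reward should be computed, else 0
    output:    the network's output activations
    target:    the correct output activations
    tolerance: hypothetical threshold on the largest output error
    """
    if rew_targ != 1:
        return None  # not a reward trial: no reward signal at all
    error = max(abs(o - t) for o, t in zip(output, target))
    return 1.0 if error < tolerance else 0.0
```

For example, `compute_reward(0, [0.2, 0.8], [1.0, 0.0])` returns `None` even though the output is wrong, whereas the same call with `rew_targ` set to 1 returns `0.0`.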
The procedural steps for making this RewTarg work are mostly the same for any kind of change in the input data table structure (e.g., adding more input units), so these steps are generally useful:
- First, go to the .PanelTab.LeabraWizard_0 and select Data/UpdateInputDataFmNet -- for 'data_table', select 'StdInputData'. This will automatically reconfigure your StdInputData table to include the RewTarg input (and it will adjust it to any other changes you might make in your network -- a very useful function!). You can check the results in data|InputData subgroup|StdInputData; the matrix will have an extra column.
- Next, we need to update the program that applies the input data to the network, so that it will appropriately apply the new RewTarg input to the network. This is the .programs.gp.LeabraAll_Std.ApplyInputs program in the LeabraAll_Std subgroup. In its objs section, there is an object called .programs.gp.LeabraAll_Std.ApplyInputs.LayerWriter_0, which provides the info for mapping input data to the network layers. Hit AutoConfig on this object, and it will automatically update based on the new input data and network configuration.
- Now we need to modify our .programs.CPTAXGen program to set this RewTarg input value correctly. This requires several steps:
- Update the unit names so we can refer to the rew targ input using an enum: click on the InitNamedUnits object in the init_code section of the program (under the Edit Program tab) and hit the InitNamesTable button -- this will update the UnitNames data table to match the updates in the input data table.
- Go to .data.gp.InputData.UnitNames and enter the name "rew_targ" for the single RewTarg unit.
- Go back to InitNamedUnits and do InitDynEnums -- this will add a RewTarg DynEnum in the types section (it would also update the enums based on any other changes you might have made in the UnitNames table -- again a very useful function to remember).
- Now we are finally ready to add the code to set the rew_targ input for the probe trial. Just drag a set units lit from the Network toolbox to the end of the prog_code, before the DoneWritingRowData guy, and set the enum_type to RewTarg, and the value should be R_rew_targ.
Finally, you can hit Init and Run on the LeabraTrain program to run your new network (select Yes to Initialize the weights).
Increasing the Hidden Layer Size
The network may or may not learn the task very quickly (this depends on your random initial weights, etc.). From playing with this model a bit, it turns out that the initial 16-unit hidden layer is just a bit too small to handle all the new information being represented in the PFC layer. So, click on the Hidden layer's green border in the Network_0 3D view, and you should see the edit panel for the layer in the middle panel. Locate the un_geom line, and change it to 5 x 5 (25 units) instead of 4 x 4. When you hit Apply, the layer will change size in the display, but the extra units will not be filled in. You need to click back on the Network_0 tab in the middle panel, and hit the Build button at the bottom to rebuild the network based on the changes you made.
Now Init and Run on the LeabraTrain program again, select Yes to initialize the weights, and you should see the network learning within 10-20 epochs or so (again, toggle off the display button on the Network view control panel to speed things up -- same with the TrialOutputData Grid display).
Displaying Unit Names
Once the network has learned, we can use the Step button on the Train program to see how it operates step by step (turn the network and trial grid log displays back on). By default, the step goes one settle at a time -- you can change this to LeabraTrial to get one trial at a time.
To better visualize what is happening, you can change the input, output, and PFC layers to display the name of the most active unit, rather than just the raw unit activations. This makes it just that much easier to figure out what the network is doing.
There are two steps to this. First, we need to get the unit name labels from the UnitNames data table into our network. Then, we need to configure the network display to show the labels instead of the units.
- Go to our good friend the InitNamedUnits object in the init_code of the .programs.CPTAXGen program. There is a LabelNetwork button there, but if you press it, you'll get an error about not finding a network variable to apply the names to. To create this variable, you can just copy the network variable from the args section of any of the programs in the LeabraAll_Std subgroup, and put it in the args of the CPTAXGen program (do context menu copy on the network variable and then context menu paste on args, or you can actually open up the args section of a program in the left browser and drag the network directly into your CPTAXGen args in the middle browser -- that is a convenient way to do various copies). Then do LabelNetwork again, and this time it should work.
- In the .T3Tab.Network_0 view, select the red arrow and then click on the Input layer (green border) to select it, and then use the context menu (right mouse button or mac-command-mouse) to select Disp Output Name. Repeat this for the Output and PFC layers.
Now Step your LeabraTrain program, and you should see the names of the active units.
In the network that we trained (which you can load into Network_0 by clicking on that guy and doing Object/Load and selecting ax_tutorial_cptax.net), it is very clear that the middle two stripes update for an A or a C, while the first and third stripe update for B. No stripes update for any of the probe stimuli. This encoding of the cue but not the probe is just what you'd expect the network to learn.
It might be easier to see what the network is doing if you change the pct_target to .25 or something instead of .7 -- you don't have to spend so much time clicking through AX trials.
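To make the effect of pct_target concrete, here is a rough sketch of a trial generator. It is not the CPTAXGen program: the probe set (`X`, `Y`), the way non-target trials are drawn, and all names here are assumptions for illustration. The only things taken from the text are the cues A, B, and C, the A-X target sequence, and pct_target as the proportion of target trials.

```python
import random

CUES = ["A", "B", "C"]
PROBES = ["X", "Y"]  # assumption: the probe set is not listed in this section

def make_trials(n_trials, pct_target, seed=0):
    """Generate (cue, probe) pairs; A-X appears with probability pct_target."""
    rng = random.Random(seed)
    # Every cue-probe pair other than the A-X target sequence.
    non_targets = [(c, p) for c in CUES for p in PROBES if (c, p) != ("A", "X")]
    trials = []
    for _ in range(n_trials):
        if rng.random() < pct_target:
            trials.append(("A", "X"))
        else:
            trials.append(rng.choice(non_targets))
    return trials
```

With a lower pct_target, more of the generated trials are non-target sequences, so stepping through the network shows a wider variety of cue and probe combinations.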
That is all we have for now. You might notice a project called 12ax4s.proj in this same directory -- that is an even more complex version of the CPT-AX task involving an outer-loop of 1 or 2 stimuli that determines what the inner-loop target sequence is (AX or BY). These same mechanisms can learn that more difficult task, though it takes longer. It is described in detail in the O'Reilly & Frank 2006 paper referenced above.
