CECN1 Active Maintenance

From Computational Cognitive Neuroscience Wiki

Active Maintenance

  • The project file: act_maint.proj (click and Save As to download, then open in Emergent -- NOTE: requires version 4.13 or higher)

Back to CECN1 Projects

Project Documentation

(note: this is a literal copy from the simulation documentation -- it contains links that will not work within the wiki)

  • GENERAL USAGE NOTE: To start, it is usually a good idea to do Object/Edit Dialog in the menu just above this text, which will open this documentation in a separate window that you can more easily come back to. Alternatively, you can always return to this document by clicking on the ProjectDocs tab at the top of the middle panel.

You will see that the network has three hidden units representing the three features (Monitor, Speakers, Keyboard; Figure 9.18 in the text). The input units provide individual input to the corresponding hidden unit. In an active maintenance context, one needs to have the individual features of a distributed representation mutually support each other via excitatory connections.

  • Select r.wt in the .T3Tab.Distributed tab of the middle frame (scroll down screen) and then click on each of the different hidden units to view their weights.

You should especially note the bidirectional excitatory connections among the three hidden units, which in theory might enable them to actively maintain the representations even after the input pattern is turned off.

There are two "events" in the environment, one where an input pattern is presented to the network, and another where the input is zeroed out (not presented). We are interested in how well the information in the hidden units is maintained during this second event.

You will see the network presented with inputs and the units respond, but it will probably be too quick to get a clear idea of what happened.

The grid display shows the activity of the input and hidden units during the first event (input is present) and the second event (input is removed). You should see that when the two features are active in the input, this activates the appropriate hidden units corresponding to the distributed representation of television as described in the text. However, when the input is subsequently removed, the activation does not remain concentrated in the two features, but spreads to include the other feature (Figure 9.19). Thus, it is impossible to determine which item was originally present. This spread occurs because all the units are interconnected.
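The spread described above can be reproduced in a much simpler rate model. The sketch below is not the Leabra algorithm that the simulator runs; the function name, the weight, threshold, and gain values, and the feedforward inhibition term (a crude stand-in for the simulator's inhibition while input is present) are all assumptions chosen only to illustrate the idea that fully interconnected units pull in the third feature once the input is removed.

```python
def clip01(x):
    """Clamp an activation target to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def run_distributed(w=0.4, theta=0.3, gain=4.0, ff_inhib=0.3, dt=0.2, steps=100):
    """Present Monitor+Speakers, then remove the input; return final activations.

    All parameter values are hypothetical and belong to this toy model only.
    """
    act = [0.0, 0.0, 0.0]  # Monitor, Speakers, Keyboard
    final = {}
    events = (("Input", [1.0, 1.0, 0.0]), ("Maint", [0.0, 0.0, 0.0]))
    for name, ext in events:
        inhib = ff_inhib * sum(ext)  # inhibition is present only while input is on
        for _ in range(steps):
            total = sum(act)
            new_act = []
            for i in range(3):
                # external input plus excitation from the other two hidden units
                net = ext[i] + w * (total - act[i]) - inhib
                target = clip01(gain * (net - theta))
                new_act.append(act[i] + dt * (target - act[i]))
            act = new_act
        final[name] = [round(a, 2) for a in act]
    return final


result = run_distributed()
print(result["Input"])  # activations at the end of the input event
print(result["Maint"])  # activations at the end of the maintenance event
```

With these assumed values, Keyboard stays off while the input is present and then becomes fully active during the maintenance event, because it receives excitation from both of the active units and nothing holds it back.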

Perhaps the problem is that the weights are all exactly the same for all the connections, which is not likely to be true in the brain.

  • Set the RecurrentCons wt_init.mean parameter for the Distributed Network in the master .PanelTab.ControlPanel to .5 (to make room for more variance) and then try a range of wt_init.var values (e.g., .1, .25, .4). Be sure to do multiple runs with each variance level -- you might get lucky on some trials, but it only counts if you can achieve reliable maintenance.

Question 9.9 Describe what happened as you increased the amount of variance. Were you able to achieve reliable maintenance of the input pattern?


Higher Order Distributed Representations

The activation spread in this network occurs because the units do not mutually reinforce a particular activation state (i.e., there is no attractor) -- each unit participates in multiple distributed patterns, and thus supports each of these different patterns equally. Although distributed representations are defined by this property of units participating in multiple representations, this network represents an extreme case. To make attractors in this network, we can introduce higher-order representations within the distributed patterns of connectivity.

A higher-order representation in the environment we have been exploring would be something like a television unit that is interconnected with the monitor and speakers features. It is higher-order because it joins together these two lower-level features and indicates that they go together. Thus, when monitor and speakers are active, they will preferentially activate television, which will in turn preferentially activate these two feature units. This will form a mutually reinforcing attractor that should be capable of active maintenance.

  • To test out this idea, first hit Defaults to restore the original weight parameters, and then set network to HigherOrderDistNet instead of DistributedNet. Then click on the .T3Tab.HigherOrderDistNet tab in the far right frame to view the network.

You can see that this network now has an additional hidden layer with the three higher-order units corresponding to the different pairings of features (also Figure 9.20 in text).

  • Now do Init, Run with this network.

You should observe that indeed it is capable of maintaining the information without spread. Thus, to the extent that the network can develop distributed representations that have these kinds of higher-order constraints in them, one might be able to achieve active maintenance without spread. Indeed, given the multilayered nature of the cortex (see Chapter 3), it is likely that distributed representations will have these kinds of higher-order constraints.
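To see why a higher-order unit can create an attractor, consider the following toy sketch. It is an illustration under strong assumptions, not the project's network: the features interact only through the higher-order units, each higher-order unit needs both of its features to become active (a higher threshold), and all names and parameter values are hypothetical.

```python
def clip01(x):
    """Clamp an activation target to the range [0, 1]."""
    return max(0.0, min(1.0, x))


# Features: 0 = Monitor, 1 = Speakers, 2 = Keyboard.
# Higher-order unit 0 pairs Monitor+Speakers (television); the other two
# units stand for the remaining pairings.
PAIRS = [(0, 1), (0, 2), (1, 2)]


def run_higher_order(w=0.4, theta_feat=0.1, theta_high=0.5, gain=4.0,
                     dt=0.2, steps=100):
    """Present Monitor+Speakers, remove the input, return final activations."""
    feat = [0.0, 0.0, 0.0]
    high = [0.0, 0.0, 0.0]
    for ext in ([1.0, 1.0, 0.0], [0.0, 0.0, 0.0]):  # input, then maintenance
        for _ in range(steps):
            new_feat = []
            for i in range(3):
                # top-down support from the higher-order units containing feature i
                top_down = sum(high[p] for p, pair in enumerate(PAIRS) if i in pair)
                target = clip01(gain * (ext[i] + w * top_down - theta_feat))
                new_feat.append(feat[i] + dt * (target - feat[i]))
            new_high = []
            for p, (a, b) in enumerate(PAIRS):
                # a higher-order unit needs both of its features (high threshold)
                target = clip01(gain * (w * (feat[a] + feat[b]) - theta_high))
                new_high.append(high[p] + dt * (target - high[p]))
            feat, high = new_feat, new_high
    return [round(a, 2) for a in feat], [round(a, 2) for a in high]


features, higher = run_higher_order()
print(features)  # Monitor, Speakers, Keyboard after the maintenance event
print(higher)    # the three higher-order units after the maintenance event
```

In this sketch only the television unit receives input from two active features, so it alone turns on and feeds activation back to Monitor and Speakers. Keyboard is supported only by higher-order units that never reach threshold, so it stays off after the input is removed.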

To this point, we have neglected a very important property of the brain -- noise. All of the ongoing activity in the brain, together with the somewhat random timing of individual spikes of activation, produces a background of noise that we have not included in this simulation. Although we generally assume this noise to be present and have specifically introduced it when necessary, we have not included it in most simulations because it slows everything down and typically does not significantly change the basic behavior of the models. However, it is essential to take noise into account in the context of active maintenance because noise tends to accumulate over time and degrade the quality of maintained information -- the active maintenance system must be capable of overcoming this degradation.

  • To add noise (we just add it to the membrane potential on each time step), set the noise.var parameter in the HigherOrderDist Network section of the master .PanelTab.ControlPanel to .01. Do several Runs.

You should have observed that in the presence of noise, even the higher-order distributed representations cannot prevent the spread of activation. The explanation is relatively straightforward -- the noise was sufficiently large to move the network outside of the original attractor basin and into that of another representation. This indicates that the higher-order distributed representations may not have sufficiently wide attractor basins for robust active maintenance.

A parameter that should play an important role in this network is the strength of the recurrent weights. For example, if these weights were made sufficiently weak, one would expect that the network would be incapable of active maintenance. At the other extreme, it might be the case that very strong recurrent weights would produce a more robust form of active maintenance that better resists noise. The strength of the recurrent weights is determined by the RecurrentCons.wt_scale.rel parameter in the .PanelTab.ControlPanel, which has been set to 1.

  • Change this now to .05 (and keep the noise set to .01), and do a couple of Runs.

You should observe that the network can no longer maintain any information at all once the input goes away (whereas before it still maintained activation over time, even though it was not accurate). Thus, the recurrent weight strength is clearly important for supporting basic active maintenance.
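The dependence of basic maintenance on recurrent strength can be illustrated with two mutually excitatory units. This is a minimal sketch with hypothetical parameter values, where wt_scale simply multiplies the recurrent weight; it does not reproduce how the simulator normalizes wt_scale.rel against other projections.

```python
def clip01(x):
    """Clamp an activation target to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def maintained_activity(wt_scale, w=0.5, theta=0.2, gain=4.0, dt=0.2, steps=100):
    """Drive unit a with input, remove the input, return a's final activation."""
    a = b = 0.0
    for ext in (1.0, 0.0):  # input event, then maintenance event
        for _ in range(steps):
            net_a = ext + wt_scale * w * b  # a gets the input plus support from b
            net_b = wt_scale * w * a        # b is driven only by a
            a, b = (a + dt * (clip01(gain * (net_a - theta)) - a),
                    b + dt * (clip01(gain * (net_b - theta)) - b))
    return round(a, 2)


for scale in (0.05, 1.0):
    print(scale, maintained_activity(scale))
```

In this toy, a scale of .05 leaves the recurrent input far below threshold, so activity decays to zero once the input is gone, whereas a scale of 1 keeps the pair active.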

  • Now let's see if making the recurrent weights stronger improves the ability to overcome noise. Try wt_scale.rel values of 2 and 5 with multiple Runs of each.

Question 9.10 (a) Does this seem to improve the network's ability to hold onto information over time? (b) Explain your results, keeping in mind that the recurrent weights interconnect all of the hidden units.


Isolated Representations

Although some kinds of distributed representations could potentially exhibit sufficiently robust active maintenance abilities, there is another type of representation that is guaranteed to produce very robust active maintenance. This type of representation uses isolated units that do not have distributed patterns of interconnectivity, and thus that have very wide basins of attraction. Because there is no interconnectivity between units, it is impossible for activation to spread to other representations, resulting in perfect maintenance of information even in the presence of large amounts of noise. These isolated units can be self-maintaining by having an excitatory self-connection for each unit.

  • To explore this kind of representation, set network to IsolatedNet. You can verify the connectivity by using r.wt in the network display (.T3Tab.IsolatedNet). Hit the Defaults button, and then set the noise_var under the Isolated Network section to .01, and then Run several times.

You should observe that the network is now able to maintain the information without any difficulty, even with the same amount of noise that proved so damaging to the previous network. However, this isolated network no longer has the ability to perform any of the useful computations that require knowledge of which features go together, because each unit is isolated from the others. Nevertheless, the posterior cortex can represent all of this relationship information via overlapping distributed representations, so it should be okay for a specialized active maintenance system to use more isolated representations, given their clear advantages in terms of robustness. We will explore this idea further in Section 9.5.
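A toy version of isolated, self-connected units shows why noise has so little effect on them. The sketch below assumes that the noise parameter is the variance of Gaussian noise added to the net input on each step, and it uses hypothetical values for the self-weight, threshold, and gain; it is not the simulator's implementation.

```python
import math
import random


def clip01(x):
    """Clamp an activation target to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def run_isolated(noise_var=0.01, seed=0, w_self=1.0, theta=0.5, gain=4.0,
                 dt=0.2, steps=100):
    """Present Monitor+Speakers, remove the input, report which units are on."""
    rng = random.Random(seed)
    sd = math.sqrt(noise_var)  # assumption: noise_var is a variance
    act = [0.0, 0.0, 0.0]      # Monitor, Speakers, Keyboard
    for ext in ([1.0, 1.0, 0.0], [0.0, 0.0, 0.0]):  # input, then maintenance
        for _ in range(steps):
            # each unit sees only its own input, its own activity, and noise
            act = [
                a + dt * (clip01(gain * (e + w_self * a + rng.gauss(0.0, sd) - theta)) - a)
                for a, e in zip(act, ext)
            ]
    return [a > 0.5 for a in act]


patterns = [run_isolated(seed=s) for s in range(20)]
print(all(p == [True, True, False] for p in patterns))
```

Because no unit receives input from any other unit, the only way for the pattern to change is for noise alone to push a unit across its own threshold for several steps in a row, which these assumed values make extremely unlikely.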

Robust yet Rapidly Updatable Active Maintenance

(Section 9.4.3 in the textbook)

In addition to the basic need for maintaining information over time (without the kind of activation spreading that we saw above), activation-based working memory representations also need to meet two potentially conflicting needs: they sometimes need to be maintained in the face of ongoing processing, while at other times they need to be updated as a function of current information. For example, when doing mental arithmetic, one needs to maintain some partial products while at the same time computing and updating others.

The following simple task, which is similar in many respects to the continuous performance tasks (CPT) often used to test working memory (Servan-Schreiber et al., 97; Rosvold et al., 1956), provides a clear demonstration of working memory demands. Stimuli (e.g., letters) are presented sequentially over time on a computer display. If a particular cue stimulus is shown (e.g., an A), then the subject has to remember the next stimulus, and determine if it matches the one that comes two stimuli after that. After every stimulus presentation, a button must be pressed -- one button if a match event has just occurred, and another button otherwise. Thus, whether one wants to encode a stimulus into active memory or not depends dynamically on preceding stimuli, and cannot be determined as a function of the specific stimulus itself. Further, once encoded, the stimulus must be maintained in the face of the two intervening stimuli.
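The response rule for this task can be written as a small state machine, which makes the separate "update" and "maintain" steps explicit. This is one possible reading of the description, with several assumptions: the probe arrives after two intervening stimuli (the intervening parameter), only one item is held at a time, and cues that appear while an item is being held are ignored. The function name and the response labels are hypothetical.

```python
def cpt_responses(stimuli, cue="A", intervening=2):
    """Return the correct button ("match" or "other") for each stimulus."""
    responses = []
    encode_next = False  # set by the cue: the following stimulus must be stored
    held = None          # the item currently in active memory
    remaining = 0        # intervening stimuli still to ignore before the probe
    for s in stimuli:
        response = "other"
        if held is not None:
            if remaining > 0:
                remaining -= 1       # distractor: maintain, do not update
            else:
                if s == held:        # probe: compare against the held item
                    response = "match"
                held = None          # the memory is no longer needed
        elif encode_next:
            held, remaining = s, intervening  # update active memory
            encode_next = False
        elif s == cue:
            encode_next = True
        responses.append(response)
    return responses


print(cpt_responses(list("XABCDBQ")))
```

In the example sequence, A is the cue, B is stored, C and D are ignored, and the second B is the probe, so the sixth response is the only "match". Note that nothing about the stimuli C and D themselves says they should be ignored; that follows only from the state set up by earlier stimuli.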

Because of this need both to maintain robustly and update rapidly, the working memory system cannot adopt a consistent strategy for active maintenance -- it cannot always make the active memories robust by making them insensitive to their inputs, because this would preclude updating. Similarly, if the active memories are easily updatable as a function of their inputs, they will not be robustly maintained in the face of irrelevant information on these inputs.

In this section we will see that the kind of simple active memory system that we have been exploring is missing the kind of dynamic switching between maintenance and updating that seems to be necessary. Thus, the need for this kind of dynamic regulation system provides one more reason to believe that there is a specialized neural system for supporting activation-based working memory. We will explore some ideas regarding the nature of this specialized system and its dynamic regulation in Section 9.5.

We are now going to explore an environment that starts out by presenting an input pattern and then removing that input (as before), and then a new input pattern will be presented and then removed. Under some circumstances, we can imagine that the network would want to update the active memory representations to reflect the second input, but in other circumstances, this input may be irrelevant and should be ignored. It should be clear at the outset that the same network with the same parameters cannot achieve both of these objectives. Thus, we will explore how the parameters can be manipulated to alter the network's tendency to maintain or update.

  • To select the new environment, set input_data to MaintUpdateEnv instead of MaintEnv.

We also want to use the IsolatedNet network as explored previously, because it provides the best active maintenance performance.

  • Set network to IsolatedNet.

To add realism and ensure that the basic maintenance task is not completely trivial, let's also add noise.

The first part is the same as before, but then the Input2 input is presented, followed by the Maint2 maintenance period. Note that the grid display will scroll, such that at the end of the run, only this last set of events is shown. You can use the purple bar along the right hand side (click the red arrow tool first) to scroll down and view the entire sequence.

You should observe in this case that the network updates its internal representation upon the Input2 input pattern presentation. If the task context at this point called for the active maintenance of this new input (e.g., in the CPT-like task described previously, Input1 would be the cue stimulus in this case), then this would be desirable behavior. However, it is also possible that Input2 could be a transient bit of information that should not be maintained (e.g., one of the two intervening stimuli in the CPT-like task). In this latter case, the network's behavior would be inappropriate.

The obvious parameter to manipulate to determine whether the network robustly maintains or rapidly updates is the relative strength of the recurrent self-maintenance connections compared to the input connections. The RecurrentCons.wt_scale.rel parameter in the control panel lets us adjust this, by determining the relative strength of the recurrent self-maintenance connections.

  • Try setting the wt_scale.rel parameter to 2 instead of the default of 1, and Run a couple of times.

Question 9.11 (a) Describe what happens when the Input2 pattern is presented. (b) Now try a wt_scale.rel of 3 instead of 2. What happens with Input2? (c) Explain why changing wt_scale.rel has the observed effects.


You should have observed that by changing the relative strength of the recurrent weights compared to the input weights, you can alter the network's behavior from rapid updating to robust maintenance. This suggests that if the relative strength of these connections could be dynamically controlled (e.g., by a specialized controller network as a function of prior input stimuli), then an activation-based memory system could satisfy the unique demands of working memory (i.e., robust maintenance and rapid updating).
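The tradeoff between updating and maintenance can be sketched with two self-connected units that compete with each other. This is a toy model under stated assumptions, not the simulator's computation: competition is modeled by subtracting the rival unit's net input (a crude stand-in for inhibition), wt_scale multiplies the self-connection directly, and the parameter values are hypothetical. The scale at which this toy switches from updating to maintaining is a property of these assumed values and should not be read as a prediction for the simulator.

```python
def clip01(x):
    """Clamp an activation target to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def run_maint_update(wt_scale, w_self=0.5, theta=0.1, gain=10.0, dt=0.2, steps=100):
    """Run Input1, Maint1, Input2, Maint2; return activations after each event."""
    act = [0.0, 0.0]  # unit 0 codes the Input1 item, unit 1 the Input2 item
    final = {}
    events = (("Input1", [1.0, 0.0]), ("Maint1", [0.0, 0.0]),
              ("Input2", [0.0, 1.0]), ("Maint2", [0.0, 0.0]))
    for name, ext in events:
        for _ in range(steps):
            # external input plus scaled excitatory self-connection
            net = [ext[i] + wt_scale * w_self * act[i] for i in range(2)]
            new_act = []
            for i in range(2):
                rival = net[1 - i]  # competition: the other unit's net input
                target = clip01(gain * (net[i] - rival - theta))
                new_act.append(act[i] + dt * (target - act[i]))
            act = new_act
        final[name] = [round(a, 2) for a in act]
    return final


for scale in (1.0, 5.0):
    print(scale, run_maint_update(scale)["Maint2"])
```

With the weaker self-connection, the new input out-competes the maintained unit and the memory is updated; with the stronger one, the maintained unit's recurrent input exceeds the new external input and the second pattern is ignored. A controller that changed this scale from moment to moment could therefore choose between the two behaviors.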

  • When you are done with this simulation, you can either close this project in preparation for loading the next project, or you can quit completely from the simulator.