CECN1 Transform
From Computational Cognitive Neuroscience Wiki
Exploration of Feedforward Transformations
- The project file: transform.proj (click and Save As to download, then open in Emergent)
Project Documentation
(note: this is a literal copy from the simulation documentation -- it contains links that will not work within the wiki)
This project explores how the feedforward flow of information through a network can transform information into different forms. Generally, such transformations involve collapsing across irrelevant distinctions, and emphasizing relevant ones (i.e., forming categories).
- To start, it is usually a good idea to do Object/Edit Dialog in the menu just above this text, which will open this documentation in a separate window that you can more easily come back to. Alternatively, you can always return by clicking on the ProjectDocs tab at the top of this middle panel.
We use a network that can recognize images of the 10 Arabic numerals (0-9), which you can see in the network view panel on the right (labeled with the .T3Tab.Digit_Network tab).
The Digits Network
Let's first examine the network. It has a 5x7 Input layer for the digit images, and a 2x5 Hidden layer, with each of the 10 hidden units representing a digit.
- In the network view control panel (to access, select the .PanelTab.Digit_Network tab in the middle panel), select the r.wt variable to view in the network view (it is at the bottom of the list of variables shown -- you may need to scroll down to find it). Then, select the red arrow at the far right of the window, by the network display, and click on each of the different hidden units in the network. This will display the weight values going into each unit.
You will see that the weights exactly match the images of the digits the units represent, just as our single 8 detector from the previous chapter did.
Although it should be pretty obvious from these weights how each unit will respond to the set of digit input patterns that exactly match the weight patterns, let's explore this nonetheless.
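As a minimal sketch of why a unit whose weights copy its digit image responds most strongly to that image, the net input can be approximated as the overlap (dot product) between an input pattern and a weight pattern. The 3x3 patterns and the names `overlap` and `best_detector` are hypothetical illustrations; they are not the project's 5x7 images, and this is a simplification of the net input computation the simulator performs.

```python
# Hypothetical 3x3 binary patterns (flattened row by row).
# These stand in for the project's 5x7 digit images.
PATTERNS = {
    "bar_left": [1, 0, 0,
                 1, 0, 0,
                 1, 0, 0],
    "bar_top":  [1, 1, 1,
                 0, 0, 0,
                 0, 0, 0],
    "diagonal": [1, 0, 0,
                 0, 1, 0,
                 0, 0, 1],
}

def overlap(pattern, weights):
    """Sum of input * weight over all pixels (a simple dot product)."""
    return sum(x * w for x, w in zip(pattern, weights))

# Each detector's weights are an exact copy of the pattern it represents.
weights = {name: list(p) for name, p in PATTERNS.items()}

def best_detector(pattern):
    """Name of the detector with the largest overlap for this input."""
    return max(weights, key=lambda name: overlap(pattern, weights[name]))

for name, pattern in PATTERNS.items():
    print(name, "->", best_detector(pattern))
```

In this toy case each pattern overlaps most with its own detector. Patterns with different numbers of active pixels complicate the picture, which is where the bias weights discussed later come in.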
- Select the act value to view the unit activities in the network, and then select the .PanelTab.ControlPanel tab and do Init and then Step to single-step through each of the input patterns.
As before, this presents the first input pattern (0) to the network, and updates the activations in the network over a series of cycles until equilibrium activations have effectively been reached for all units. Note that we are using the noisy XX1 rate coded activation function, as with most of our simulations.
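For reference, the noise-free XX1 function is usually written as y = g[V_m - theta]+ / (g[V_m - theta]+ + 1), where [x]+ is x if positive and 0 otherwise. The noisy version additionally smooths this curve with Gaussian noise, which the sketch below omits. The threshold and gain values are illustrative assumptions, not settings read from this project.

```python
def xx1(v_m, threshold=0.25, gain=600.0):
    """Noise-free X/(X+1) rate-code activation.

    threshold and gain are assumed, illustrative values.
    """
    x = gain * max(v_m - threshold, 0.0)  # thresholded, gain-scaled potential
    return x / (x + 1.0)

for v in (0.20, 0.25, 0.26, 0.30, 0.50):
    print(f"V_m={v:.2f} -> act={xx1(v):.3f}")
```

The output is zero at or below threshold, rises steeply just above it, and saturates toward 1.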
- Proceed to Step through the entire sequence of digits.
You should have observed that each unit was activated when its matching digit was presented, and not when any of the other digits were presented.
You have probably noticed that the pattern of hidden unit activations over the different inputs is also displayed in the Grid View display next to the network in the right-most display panel. This window displays the activity states of the network, plus the raw excitatory netinput to the hidden units, over time as rows of colored squares. Thus, you can see the whole picture in one glance.
Bias Weights
Before we continue, it is important to understand the role of the bias weights in this simulation. Recall that bias weights provide an additional constant source of net input to the units, independent of any other activity in the network, and reflect differences in intrinsic neural excitability, which have been shown to be adaptively updatable. The digit images used as input patterns have somewhat different numbers of active units (as you might have observed in the detector exercise in the previous chapter). Thus, the bias weights help compensate for these differences.
- Press the .T3Tab.Digits view tab (above right panel) to see images of all the digit input patterns. These are displayed using the same Grid View display as the network patterns. Note that there is a (thin!) purple scroll bar on the right of each of these grid view displays, which you can use (with the red arrow tool clicked) to scroll through the full set of items for the Noisy Digits and Letters, which exceed the 10 rows visible at a time.
You might have expected these differences in the number of active input units in the Digits images to result in different net input and activation levels for the corresponding hidden units. Instead, all the activations shown in the Grid View appear roughly similar. The discrepancy here is attributable to the use of bias weights, which compensate for these differences in overall activity level coming from the inputs.
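The compensation described above can be sketched as follows, assuming that each matching active pixel contributes a fixed amount to the net input and that each bias is chosen to bring the total up to a common level. The pixel counts, the `PER_PIXEL` constant, and the rule for choosing biases are all hypothetical; how the project's actual bias values were set is not stated here.

```python
# Hypothetical counts of active pixels for three input patterns
# (not the real digit images).
active_pixels = {"A": 10, "B": 14, "C": 17}

PER_PIXEL = 0.05  # assumed contribution of each matching active pixel

def raw_netin(n_active):
    """Net input from the inputs alone when a detector sees its own pattern."""
    return PER_PIXEL * n_active

# Choose each bias so that raw net input + bias reaches the same target level.
target = max(raw_netin(n) for n in active_pixels.values())
bias = {name: target - raw_netin(n) for name, n in active_pixels.items()}

for name, n in active_pixels.items():
    total = raw_netin(n) + bias[name]
    print(f"{name}: raw={raw_netin(n):.2f} bias={bias[name]:+.2f} total={total:.2f}")
```

Without the bias term the three totals differ with the number of active pixels; with it they coincide.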
- Go back to the .T3Tab.Digit_Network view, and select bias.wt in the NetView window.
You should see a pattern of different valued bias weights on the hidden units. Let's evaluate the contributions of these bias weights by running the network without them.
- In the .PanelTab.ControlPanel, click the biases_on check box to off (not checked), and then do Run (or single-step) to see how the network runs without these bias weights on.
This will run through all of the digits, and you can view the Grid View to see the resulting activities and net inputs.
Question 3.1: How did the lack of bias weights affect the hidden unit activities, and their relation to the number of active units in the input patterns?
- Turn the bias weights back on and select bias.wt to view in the Network window if it isn't already selected.
Question 3.2: Explain in specific terms how these bias weights contribute to producing the originally observed hidden unit activities. Make reference to specific input digit images, unit weights, and the network net input values with the biases on and off.
Cluster Plots
Next we will produce some cluster plots like those shown in the text.
- Run the network with the biases on again. Then select Digits for the cluster data src field in the .PanelTab.ControlPanel and then do Cluster Init and Cluster Run to generate a cluster plot.
You should get a window containing the cluster plot for the similarity relationships among the digit images. Compare the amount of overlap between activated pixels in the digit images with the cluster plot results.
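A cluster plot is built from the pairwise distances between patterns. The sketch below assumes Euclidean distance and uses three hypothetical binary patterns rather than the digit images; it only computes the distances and reports the closest pair, which is the pair an agglomerative clustering would merge first.

```python
import math
from itertools import combinations

# Hypothetical binary patterns (flattened); stand-ins for the digit images.
patterns = {
    "p1": [1, 1, 1, 0, 0, 0],
    "p2": [1, 1, 0, 0, 0, 0],
    "p3": [0, 0, 0, 1, 1, 1],
}

def euclidean(a, b):
    """Euclidean distance between two equal-length patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Distance for every unordered pair of patterns.
distances = {
    (m, n): euclidean(patterns[m], patterns[n])
    for m, n in combinations(patterns, 2)
}

closest = min(distances, key=distances.get)
for pair, d in distances.items():
    print(pair, round(d, 3))
print("closest pair:", closest)
```

Patterns that share more active pixels end up closer together, which is the overlap relationship to look for when comparing the digit images with the plot.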
Next let's look at the similarity relationships among the hidden unit representations of the digits.
- Select TrialOutputData for the cluster data src this time (hit Apply), and then do Cluster Run again.
You should get a cluster plot that looks much like that shown in Figure 3.8b in the textbook, except there is just one label for each digit category. This shows that the network has transformed the complex patterns of input similarity into equally distinct hidden representations of the digit categories.
Note that we have binarized the hidden unit activation values (i.e., changed values greater than .5 to 1, and those less than .5 to 0) for the purposes of clustering. Otherwise, small differences in the activation values of the units would distract from the main structure of the cluster plot (which is actually the same regardless of whether the activations are binarized or not). For the present purposes, we are more interested in whether a detector has fired or not, and not in the specific activation value of that detector, though in general the graded values can play a useful role in representations, as we will discuss later.
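The binarization step can be sketched as a simple threshold at .5. The activation values below are hypothetical, and mapping a value of exactly .5 to 0 is an assumption, since the text only specifies the cases above and below .5.

```python
def binarize(activations, threshold=0.5):
    """Map each activation to 1 if it exceeds the threshold, else 0."""
    return [1 if a > threshold else 0 for a in activations]

# Hypothetical hidden-layer activations for two presentations of the same digit.
trial_a = [0.02, 0.91, 0.10, 0.00]
trial_b = [0.00, 0.78, 0.31, 0.04]

print(binarize(trial_a))
print(binarize(trial_b))
```

After thresholding, the two trials have identical patterns even though their graded activations differ.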
The next step is to run the case where there are multiple instances of each digit (the NOISY_DIGITS case).
- In the .PanelTab.ControlPanel, set the input data to NOISY_DIGITS, Apply, and Step through these new patterns. You can also click on the .T3Tab.Digits tab and scroll through the noisy digits inputs there to see what they look like.
You should see that the appropriate hidden unit is active for each version of the digits (although small levels of activity in other units should also be observed in some cases).
- Now do Cluster Run again (make sure that cluster data src is still set to TrialOutputData, so it will process the hidden layer activations).
You should see the same plot as shown in Figure 3.8b in the textbook. You can compare this with the cluster plot you get when you change cluster data src to NoisyDigits and run the cluster plot (same as Figure 3.8a). This clearly shows that the network has collapsed across distinctions between different noisy versions of the same digits, while emphasizing the distinctions between different digit categories.
Selectivity and Leak
In the detector exploration from the previous chapter, we saw that manipulating the amount of leak conductance altered the selectivity of the unit's response. By lowering the leak, the unit responded in a graded fashion to the similarity of the different digit images to the detector's weight pattern. Let's see what kinds of effects this parameter has on the behavior of the present network. The control panel shows that the leak conductance (g_bar.l in the .PanelTab.ControlPanel) for the hidden units has been set to a value of 6.
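To make the role of the leak concrete, here is a minimal sketch of the equilibrium membrane potential as a conductance-weighted average of the excitatory and leak driving potentials. Inhibition is left out, and the reversal potentials and excitatory conductance values are illustrative assumptions rather than values taken from this project.

```python
def equilibrium_vm(g_e, g_bar_l, e_e=1.0, e_l=0.15):
    """Equilibrium membrane potential with excitation and leak only.

    A conductance-weighted average of the driving potentials.
    e_e and e_l are assumed normalized reversal potentials.
    """
    return (g_e * e_e + g_bar_l * e_l) / (g_e + g_bar_l)

for g_bar_l in (6.0, 5.0, 4.0):
    weak = equilibrium_vm(g_e=1.0, g_bar_l=g_bar_l)    # hypothetical partial match
    strong = equilibrium_vm(g_e=2.5, g_bar_l=g_bar_l)  # hypothetical close match
    print(f"g_bar.l={g_bar_l:.0f}: weak={weak:.3f} strong={strong:.3f}")
```

For a fixed excitatory input, a smaller leak pulls the potential less strongly toward the resting value, so both partial and close matches sit at a higher potential.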
- Reduce the g_bar.l for the hidden units from 6 to 5 and Run (still using input data of Noisy Digits).
Question 3.3 (a) What happens generally to the hidden activations with this reduction in leak value? (b) How does this affect the cluster plot of hidden unit activities? (c) How about for a g_bar.l of 4? (d) If the goal of this network was to have the same hidden representation for each version of the same digit, and different representations for different digits, how does changing the units' excitability (via the leak current) affect the success of the network, and why?
Letter Inputs
Now, we will see how the network responds to letter inputs instead of digits.
- First, set the g_bar.l leak conductance back to 6 and make sure act is selected in the Net View window. Then set input data to Letters (Apply) and press Run.
Notice the letters being presented over the input layer on the network.
- Use the scrollbar on the right of the Grid Log view next to the network to scroll the display back to the start of the letter presentations, because these have scrolled off the top of the display.
The only significant response is from the "8" hidden unit to the letter "S"; note that "S" looks very similar to "8".
- Make a cluster plot of the letter input patterns by selecting cluster data src as Letters and doing Cluster Run. This should look like Figure 3.10a in the textbook. Then, make a cluster plot of the hidden unit activations (cluster data src to TrialOutputData and Cluster Run again). This should look like Figure 3.10b.
You should be able to see in these cluster plots that these digit units do not respond very informatively to the letter stimuli.
Question 3.4 (a) Based on your experiences in the previous question, what would you expect to happen to the cluster plot of hidden responses to letter inputs as you lowered the g_bar.l leak current to a value of 4? Do this -- were you right? (b) Would you say that this hidden representation is a good one for conveying letter identity information? Why or why not? (Hint: Pay particular attention to whether any letters are collapsed in the cluster plot -- i.e., having no distance at all between them.) (c) Can you find any setting of g_bar.l that gives you a satisfactory hidden representation of letter information? Explain.
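The hint about collapsed items can be checked mechanically: two inputs are collapsed when their hidden patterns are identical, so the distance between them is zero. A minimal sketch, using hypothetical labels and binarized patterns rather than actual simulation output:

```python
from collections import defaultdict

def collapsed_groups(hidden_by_label):
    """Group labels whose hidden patterns are identical (zero distance)."""
    groups = defaultdict(list)
    for label, pattern in hidden_by_label.items():
        groups[tuple(pattern)].append(label)
    # Keep only patterns shared by more than one label.
    return [labels for labels in groups.values() if len(labels) > 1]

# Hypothetical binarized hidden patterns for four inputs.
example = {
    "X": [0, 0, 0],
    "Y": [0, 1, 0],
    "Z": [0, 0, 0],
    "W": [1, 0, 0],
}
print(collapsed_groups(example))
```

Any group returned here corresponds to inputs that the hidden layer cannot tell apart.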
