CCNBook/Sims/Networks/Categ

  • Project Name: face_categ
  • Filename: File:face categ.proj (Open Project in emergent)
  • Author: Randall C. O'Reilly
  • Email: emergent-users@grey.colorado.edu
  • Publication: OReillyMunakataFrankEtAl12
  • First Published: Jul 27 2016
  • Tags: Categorization, Network, Faces, Emotion
  • Description: This project explores how sensory inputs (in this case simple cartoon faces) can be categorized in multiple different ways, to extract the relevant information and collapse across the irrelevant.
  • Updated: 28 July 2016, 6 September 2016, 12 January 2017, 13 January 2017, 11 January 2018, 22 January 2018
  • Versions: 8.0.0, 8.0.2, 8.0.3, 8.0.4, 8.0.5, 8.0.7
  • Emergent Versions: 8.0.0, 8.0.4, 8.5.0, 8.5.1
  • Other Files: File:FaceNetwork.wts


Back to CCNBook/Sims/All or Networks Chapter.

Introduction

This project explores how sensory inputs (in this case simple cartoon faces) can be categorized in multiple different ways, to extract the relevant information and collapse across the irrelevant. It allows you to explore both bottom-up processing from face image to categories, and top-down processing from category values to face images (imagery), including the ability to dynamically iterate both bottom-up and top-down to cleanup partial inputs (partially occluded face images).

If you are following along with the text, then first do Part I in the section on Categorization, and then come back for Part II after reading about Bidirectional Excitatory Dynamics and Attractors.

It is recommended that you click here to undock this document from the main project window. Use the Window menu to find this window if you lose it, and you can always return to this document by browsing to this document from the docs section in the left browser panel of the project's main window.

Part I: Feedforward (bottom-up) Flow of Information from Inputs to Categories

The Network and Face Inputs

Let's first examine the network, shown in the Network tab in the right 3D panel. It has a 16x16 Input layer for the face images, and three different categorization output layers:

  • Emotion with "Happy" and "Sad" units, categorizes the emotion represented in the face into these two different categories.
  • Gender with "Female" and "Male" units, categorizes the face according to these two gender categories.
  • Identity with 6 labeled units with the names given to the different faces in the input (Alberto, Betty, Lisa, Mark, Wendy, Zane) -- the network can categorize the individual despite differences in emotional expression. Four additional units are available if you want to explore further by adding new faces.
Select the r.wt variable to view in the network view, and click on each of the different output category neurons in the network. This will display the weight values going into each neuron.

These weights were learned in a way that makes their representations particularly obvious, so you should be able to see sensible-looking patterns for each unit. To further understand how this network works, we can look at the input face patterns and corresponding categorization values that it was trained on (we'll explain this learning process in the Learning Chapter).

Select the Faces tab in the right panel -- look at the patterns shown in the left portion of the display -- these are the names, faces, and emotional expressions that the network was trained on.

Testing the Network

The next step in understanding the basics of the network is to see it respond to the inputs.

Select the act value to view the unit activities in the Network, and then select the ControlPanel tab and do Init and then Step Trial to single-step through the input patterns.

You will see the network process the face input and activate the appropriate output categories for it (e.g., for the first pattern, it will activate Happy, Male, and Alberto). Note that we are using the NOISY_XX1 rate-coded activation function, as with most of our simulations.
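
For readers who like to see the math, here is a small Python sketch (outside of emergent) of the basic XX1 rate-code function and a numerically smoothed "noisy" variant. The threshold, gain, and noise values below are illustrative, not necessarily the simulator's exact defaults.

```python
# Minimal sketch of the XX1 rate-code function and a Gaussian-smoothed
# ("noisy") variant, approximated numerically.  Parameter values here
# (thr, gain, noise_sd) are illustrative assumptions.
import numpy as np

def xx1(net, thr=0.5, gain=100.0):
    """Rate-coded activation: x/(x+1) applied to gain-scaled net input above threshold."""
    x = gain * np.maximum(net - thr, 0.0)
    return x / (x + 1.0)

def noisy_xx1(net, thr=0.5, gain=100.0, noise_sd=0.005, n=2001):
    """Approximate a noisy XX1 by averaging XX1 over Gaussian-jittered net inputs."""
    eps = np.linspace(-6 * noise_sd, 6 * noise_sd, n)
    kernel = np.exp(-0.5 * (eps / noise_sd) ** 2)
    kernel /= kernel.sum()
    return np.array([np.dot(kernel, xx1(v + eps, thr, gain)) for v in np.atleast_1d(net)])

nets = np.linspace(0.4, 0.6, 9)
print(np.round(xx1(nets), 3))        # sharp threshold at thr
print(np.round(noisy_xx1(nets), 3))  # soft, sigmoid-like rise around thr
```

The key point is that smoothing with noise turns the hard threshold into the soft, graded response you see in the unit activities.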

Proceed to Step Trial through the entire set of faces.

You have probably noticed that the pattern of network activations was recorded in the TrialOutputData Grid View display next to the network. This allows you to see the whole picture of network behavior in one glance.

Using Cluster Plots to Understand the Categorization Process

A Cluster Plot provides a convenient way of visualizing the similarity relationships among a set of items, where multiple different forms of similarity may be in effect at the same time (i.e., multidimensional similarity structure). First, we'll look at the cluster plot of the input faces, and then of the different categorizations performed on them, to see how the network transforms the similarity structure to extract the relevant information and collapse across the irrelevant.
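
If it helps to see the underlying computation, the sketch below builds a rough analogue of a cluster plot in Python from a set of activity vectors using standard hierarchical clustering. The four toy patterns and their names are invented for illustration; emergent's cluster plot has its own distance and linkage conventions.

```python
# Rough analogue of a cluster plot: hierarchical clustering over each item's
# layer activity vector.  The 3-unit patterns below are made-up placeholders.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import pdist

names = ["Alberto_happy", "Alberto_sad", "Betty_happy", "Betty_sad"]
patterns = np.array([
    [1.0, 0.9, 0.1],
    [1.0, 0.1, 0.9],
    [0.0, 0.9, 0.1],
    [0.0, 0.1, 0.9],
])

dists = pdist(patterns, metric="euclidean")   # pairwise distances between items
tree = linkage(dists, method="average")       # agglomerative clustering
dn = dendrogram(tree, labels=names, no_plot=True)
print(dn["ivl"])  # leaf ordering shows which items group together first
```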

In the ControlPanel, check that the cluster_data_src (source data for cluster plotting) is set to Faces, and the cluster_layer is set to Input, and then hit the ClusterPlot Run button.

You should see the resulting cluster plot in the ClusterPlotData tab.


Question 3.1: Given what you know about how a Cluster Plot works (see above link), explain the similarity structure among the different face inputs. Describe which items are most similar to each other, and next-most similar, etc. Specifically, list the ordering of the Emotion, Gender, and Identity factors in terms of how similar items are -- i.e., are different versions of the same Identity more similar to each other than faces with the same Emotion?

Now, let's see how this input similarity structure is transformed by the different types of categorization.

Set the cluster_layer to Emotion and do another ClusterPlot Run.

Question 3.2: How does the Emotion categorization transform the input similarity -- i.e., what items are now the most similar to each other?

Set the cluster_layer to the other options (Gender, Identity) and do cluster plots of those as well.

You should observe that the different ways of categorizing the input faces each emphasize some differences while collapsing across others. For example, if you go back and look at the r.wt values of the "Happy" and "Sad" Emotion units, you will clearly see that these units care most about (i.e., have the largest weights from) the mouth and eye features associated with each of the different emotions, while having weaker weights from the input features that are common across all faces.

This ability of synaptic weights to drive the detection of specific features in the input is what drives the categorization process in a network, and it is critical for extracting the behaviorally-relevant information from inputs, so it can be used at a higher level to drive appropriate behavior. For example, if Zane is looking sad, then perhaps it is not a good time to approach him for help on your homework.
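
To make the weight-based detection idea concrete, here is a toy Python sketch (the pixel values and weights are invented, not the trained network's): a unit with large weights on emotion-diagnostic mouth pixels receives much more net input from a happy mouth than from a sad one.

```python
# Toy illustration of a unit acting as a feature detector via its weights.
import numpy as np

mouth_happy = np.array([0.0, 1.0, 0.0, 1.0])   # hypothetical "happy mouth" pixels
mouth_sad   = np.array([1.0, 0.0, 1.0, 0.0])   # hypothetical "sad mouth" pixels

w_happy_unit = np.array([0.1, 0.9, 0.1, 0.9])  # large weights on happy-diagnostic pixels
w_sad_unit   = np.array([0.9, 0.1, 0.9, 0.1])  # large weights on sad-diagnostic pixels

for name, face in [("happy face", mouth_happy), ("sad face", mouth_sad)]:
    print(name,
          "-> Happy unit net:", face @ w_happy_unit,
          " Sad unit net:", face @ w_sad_unit)
```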

In terms of localist vs. distributed representations, the category units are localist within each layer, having only one unit active, uniquely representing a specific category value (e.g., Happy vs. Sad). However, if you aggregate across the set of three category layers, it actually is a simple kind of distributed representation, where there is a distributed pattern of activity for each input, and the similarity structure of that pattern is meaningful. In more completely distributed representations, the individual units are no longer so clearly identifiable, but that just makes things more complicated for the purposes of this simple simulation.
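
As a purely illustrative sketch of this point, the snippet below concatenates one-hot codes for the three category layers into a single distributed pattern, and shows that the overlap between such patterns reflects how many category values two faces share. The category orderings are assumptions made for the example.

```python
# Localist codes in separate layers combine into a simple distributed code:
# concatenate each layer's one-hot pattern and compare overlap.
import numpy as np

def code(emotion, gender, identity):
    e = np.eye(2)[emotion]    # Happy=0, Sad=1 (assumed ordering)
    g = np.eye(2)[gender]     # Female=0, Male=1 (assumed ordering)
    i = np.eye(6)[identity]   # Alberto=0, Betty=1, ... (assumed ordering)
    return np.concatenate([e, g, i])

alberto_happy = code(0, 1, 0)
alberto_sad   = code(1, 1, 0)
betty_happy   = code(0, 0, 1)

print(alberto_happy @ alberto_sad)   # 2 shared units (same gender and identity)
print(alberto_happy @ betty_happy)   # 1 shared unit (same emotion only)
```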

Having multiple different ways of categorizing the same input in effect at the same time (in parallel) is a critical feature of neural processing -- all too often researchers assume that one has to choose a particular level at which the brain is categorizing a given input, when in fact all evidence suggests that it does massively parallel categorization along many different dimensions at the same time.

Part II: Bidirectional (Top-Down and Bottom-Up) Processing

In this section, we use the same face categorization network to explore bidirectional top-down and bottom-up processing through the bidirectional connections present in the network. First, let's see these bidirectional connections.

In the Network tab, select the s.wt (sending weights) variable and click on the various category output units -- you can click back and forth between r.wt and s.wt to compare the receiving and sending weights for a given unit -- in general they should have a similar pattern with somewhat different overall magnitudes.

Thus, as we discussed in the Networks Chapter, the network has roughly symmetric bidirectional connectivity, so that information can flow in both directions, allowing the network to develop a consistent overall interpretation of the inputs that satisfies all the relevant constraints at each level (multiple constraint satisfaction).
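
The sketch below gives a very rough flavor of this bidirectional settling in Python: an input layer and a category layer share one weight matrix (used in both directions) and repeatedly update each other until the joint pattern stabilizes. The weights, layer sizes, and update rule are invented for illustration and are not emergent's actual algorithm.

```python
# Very rough sketch of bidirectional (bottom-up + top-down) settling between
# an input layer and a category layer with a shared (symmetric) weight matrix.
import numpy as np

rng = np.random.default_rng(1)
W = rng.uniform(0.0, 1.0, size=(8, 3))           # input (8 units) <-> category (3 units)

def act(net):                                     # simple squashing stand-in for the real unit
    return 1.0 / (1.0 + np.exp(-6.0 * (net - 0.5)))

ext = np.array([1, 0, 1, 1, 0, 0, 1, 0], float)   # external (sensory) drive on the input layer
inp = ext.copy()
cat = np.zeros(3)

for cycle in range(30):
    cat = act(inp @ W / len(inp))                 # bottom-up: input layer drives categories
    inp = act(0.7 * ext + 0.3 * (W @ cat))        # top-down folded back into the input layer

print("categories:", np.round(cat, 2))
print("input layer:", np.round(inp, 2))
```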

Top-Down Imagery

A simple first step for observing the effects of bidirectional connectivity is to activate a set of high-level category values and have that information flow top-down to the input layer to generate an image that corresponds to the combination of such categories. For example, if we activate Happy, Female, and Lisa, then we might expect that the network will be able to "imagine" what Lisa looks like when she's happy.

In the ControlPanel, change the input_type to TOP_DOWN instead of the current BOTTOM_UP, and then do Init and hit Step Cycle multiple times to see the activation dynamics unfold one cycle at a time.

You should see that the high-level category values for the first face in the list (Happy, Male, Alberto) were activated at the start, and then the face image filled in over time based on this top-down information.

Interactive Top-Down and Bottom-Up and Partial Faces

Next, let's try a more challenging test of bidirectional connectivity, where we have partially occluded face input images (20 pixels at random have been turned off from the full face images), and we can test whether the network will first correctly recognize the face (via bottom-up processing from input to categories), and then use top-down activation to fill in or pattern complete the missing elements of the input image, based on the learned knowledge of what each of the individuals (and their emotions) look like.

Set the input_type back to BOTTOM_UP in the ControlPanel, and change the input_data to PartialFaces instead of Faces, and then do Init and hit Step Cycle multiple times while watching the Network display.

You should observe the initial partial activation pattern, followed by activation of the category-level units, and then the missing elements of the face image gradually get filled in.


Question 3.3: Across multiple different such partial faces, what is the order in which the category units get active? How does this relate to the timing of when the missing features in the input face start to get filled in?

Another way of thinking about the behavior of this network is in terms of attractor dynamics, where each specific face and associated category values represents a coordinated attractor, and the process of activation updating over cycles results in the network settling into a specific attractor from a partial input pattern that nevertheless lies within its overall attractor basin.
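
A classic way to formalize this intuition is with a Hopfield-style energy function. The sketch below is not the algorithm used in this project, but it shows the same principle: with symmetric weights, asynchronous settling never increases the energy, so a degraded pattern that starts inside an attractor basin falls back toward the stored pattern.

```python
# Attractor dynamics in a tiny Hopfield-style net (illustrative only):
# symmetric weights store patterns as attractors; settling lowers the energy.
import numpy as np

rng = np.random.default_rng(2)
patterns = np.array([rng.choice([-1, 1], size=25) for _ in range(2)])
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0.0)                           # symmetric, no self-connections

def energy(s):
    return -0.5 * s @ W @ s

state = patterns[0].copy()
state[rng.choice(25, size=5, replace=False)] *= -1  # "occlude" 5 of 25 elements

for _ in range(5):                                  # asynchronous settling passes
    for i in rng.permutation(25):
        state[i] = 1 if W[i] @ state >= 0 else -1
    print("energy:", energy(state))

print("overlap with stored pattern:", state @ patterns[0], "/ 25")
```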

At a technical level, the ability of the network to fill in the missing parts of the input requires soft clamping of the input patterns -- the face pattern comes into each input as an extra contribution to the excitatory net input, which is then integrated with the other synaptic inputs coming top-down from the category level.
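
In pseudo-code terms, a soft-clamped input unit integrates its drive roughly like the illustrative snippet below (the variable names, clamp gain, and activation function are assumptions, not emergent's actual parameters).

```python
# Hedged sketch of soft clamping: the external face pattern enters as an extra
# excitatory contribution to the net input, alongside top-down synaptic input,
# rather than hard-setting the unit's activity.
ext_input = 0.0        # this pixel is occluded, so no external drive
top_down_net = 0.6     # excitation arriving from the category layers
clamp_gain = 0.4       # how strongly the external input is weighted (assumed value)

net_input = clamp_gain * ext_input + top_down_net   # integrated excitatory drive
activity = max(0.0, min(1.0, net_input))            # crude stand-in for the activation function
print(activity)   # the occluded pixel can still become active via top-down input
```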

You may now close the project (use the window manager close button on the project window or the File/Close Project menu item) and then open a new one, or just quit emergent entirely by using the Quit emergent menu option or clicking the close button on the root window.