|Author||Randall C. O'Reilly, Sergio Verduzco-Flores|
|First Published||Oct 12 2016|
|Tags||Cerebellum, Error-correction learning, Sparse representation, Motor control, reaching, Muscles, Arm|
|Description||Cerebellum model showing how muscle-specific error signals can drive anticipatory correction and prevention of errors via Purkinje cell inhibition driven by Inferior Olive error signals.|
|Updated||12 October 2016, 14 January 2017, 16 January 2018|
|Versions||0.1.15, 0.1.18, 0.1.19, 0.1.20|
|Emergent Versions||8.0.1, 8.0.4, 8.5.1|
This simulation illustrates how the Cerebellum learns from errors to improve motor control. It includes a detailed model of a human arm, with 12 different muscles operating in a biophysiologically-realistic manner.
This model of learning in the cerebellum (Verduzco-Flores & O'Reilly, 2015) synthesizes many ideas from the long literature on this brain area, going back to the seminal ideas in the field (Marr, 1969; Albus, 1971; Ito, 1984). The key idea in all of these models is that the Inferior Olivary (IO) climbing fiber inputs onto the Purkinje cells provide an error signal that drives learning in the Purkinje cells (via massive complex spikes), such that, somehow, the resulting modified output of the Purkinje cells reduces or eliminates the error in the future. Because the Purkinje cells are tonically active, and drive inhibition into the deep cerebellar nuclei (DCN) that are the output of the cerebellum, it seems that this IO learning signal serves to inhibit Purkinje firing, and thus disinhibit the DCN outputs. Indeed, there is some evidence that IO climbing fiber activity drives LTD (long-term depression) on the excitatory granule-cell inputs into the Purkinje cells. However, it also seems that potentiation of other inhibitory connections from inhibitory stellate and basket cells is likely to be important too.
Within this broad, widely-accepted framework, the two central questions for any complete model of cerebellum function are:
- How does the IO compute its error signals? What triggers them to fire these complex spikes that drive Purkinje learning?
- How does Purkinje disinhibition resulting from learning then correct the errors? What exactly is the output of the cerebellum doing?
Our version of this model is based on the following hypothesized answers to these questions:
- IO errors are based on a combination of two factors, interacting in the following way:
  - Visual signals based on a desired motor target in visual coordinates, and the actual outcome of the motor action, also in visual coordinates (e.g., the current position of the hand).
  - Somatosensory (muscle spindle fiber) signals indicating the current muscle lengths, relative to motor control signals specifying the target muscle lengths associated with the posture for the given visual target.
  - The visual error signals gate the muscle signals, such that muscle error signals only arise when there is an error in the visual signals.
  - Furthermore, an error is only signaled when the difference between target and current position is increasing over time. As long as the system is overall reducing the error, no IO error signal is triggered; it is only when the error starts to increase that the IO gets excited.
  - The net effect of this error mechanism is that a detailed, muscle-specific error signal is provided via the IO only when a motor plan starts to "go awry" (deviate from plan) in overall visual coordinates, and it targets those muscles that seem to be most out of alignment with their target values.
- Individual Purkinje cells are associated with particular muscles, and receive these muscle-specific error signals from the IO, and their effect is to modulate the gain of the current motor control signal being provided. Thus, if there is a given level of contraction currently being driven on a given muscle, Purkinje disinhibition amplifies this contraction in proportion to the level of disinhibition.
- Finally, there is a temporal offset in the learning dynamics of the Purkinje cells, so that an IO signal arriving at time t causes the Purkinje cells to drive LTD on synapses that were active at the earlier time t-d (where d is roughly 80 msec based on empirical and biophysical estimates). This means that the Purkinje cells effectively apply a corrective action (driven initially at the time when the error is detected) to a point earlier in time, to anticipate and prevent the error in the first place.
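The IO error hypothesis above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used in the simulation: the function name `io_error_signals`, the use of Euclidean distance for the visual error, and the use of the absolute muscle-length deviation as the muscle error are all choices made here for illustration.

```python
import math

def io_error_signals(target_xyz, hand_xyz, prev_visual_err,
                     target_lengths, actual_lengths):
    """Return (per-muscle IO errors, current visual error magnitude).

    Hypothetical sketch: the visual error gates the muscle errors, and
    the gate only opens while the visual error is increasing.
    """
    # Visual error: distance between target and hand (assumed Euclidean).
    visual_err = math.sqrt(sum((t - h) ** 2
                               for t, h in zip(target_xyz, hand_xyz)))
    # The gate opens only if the visual error grew since the last step.
    gate_open = visual_err > prev_visual_err
    # Muscle error: deviation of each muscle length from its target
    # (absolute deviation is an assumption of this sketch).
    muscle_errs = [abs(a - t) for a, t in zip(actual_lengths, target_lengths)]
    if not gate_open:
        muscle_errs = [0.0] * len(muscle_errs)
    return muscle_errs, visual_err

# Error shrinking (0.6 -> 0.5): no IO signal, even though one muscle is off target.
quiet, err_now = io_error_signals((0.0, 1.0, 0.0), (0.0, 0.5, 0.0), 0.6,
                                  [1.0, 2.0], [1.5, 2.0])
# Error growing (0.1 -> about 0.2): the off-target muscle gets an IO signal.
active, err_later = io_error_signals((0.0, 1.0, 0.0), (0.0, 1.2, 0.0), 0.1,
                                     [1.0, 2.0], [1.5, 2.0])
```

In the first call the hand is still approaching the target, so every muscle error is suppressed; in the second the hand has overshot and the error is growing, so only the muscle that is away from its target length reports an error.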
Thus, our model basically says that the cerebellum memorizes a corrective action taken in response to an error, and applies this corrective action earlier in time to prevent or minimize the error next time around.
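To make the temporal offset concrete, here is a toy sketch of LTD applied to the synapses that were active d time steps before the IO error arrived. The names `learn_offset_ltd` and `purkinje_activity`, the linear Purkinje unit, and the learning rate are assumptions made for illustration; with a hypothetical 10 msec time step, the 80 msec offset would correspond to `delay_steps=8`.

```python
def learn_offset_ltd(granule_history, io_errors, delay_steps, lrate=0.1):
    """Depress weights from granule units active delay_steps before each IO error.

    granule_history: list over time of per-unit granule activations.
    io_errors: list over time of IO error values for one Purkinje cell.
    """
    weights = [1.0] * len(granule_history[0])
    for t, err in enumerate(io_errors):
        if err <= 0.0 or t < delay_steps:
            continue
        # Look back in time: which granule units were active at t - d?
        earlier = granule_history[t - delay_steps]
        for i, act in enumerate(earlier):
            # LTD: reduce the weight, never below zero.
            weights[i] = max(0.0, weights[i] - lrate * err * act)
    return weights

def purkinje_activity(weights, granule_acts):
    """Linear Purkinje unit driven by excitatory granule inputs (a simplification)."""
    return sum(w * a for w, a in zip(weights, granule_acts))

# Toy reach: exactly one granule unit is active at each of 5 time steps.
history = [[1.0 if i == t else 0.0 for i in range(5)] for t in range(5)]
io_errors = [0.0, 0.0, 0.0, 0.0, 1.0]   # the error is detected at the last step
weights = learn_offset_ltd(history, io_errors, delay_steps=2, lrate=0.5)
trace = [purkinje_activity(weights, g) for g in history]
```

On the next pass through the same trajectory, `trace` dips at step 2 rather than at step 4, where the error was detected: the Purkinje cell pauses (and the DCN is disinhibited) two steps before the point where the error originally occurred.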
To get this model to actually work, we clearly need some kind of representation of time within the context of an unfolding motor action. This is provided by the massive number of granule cells, which encode low-order conjunctions over the various input signals coming into the cerebellum, and have an inhibitory dynamic that causes them to turn off after a brief period of activation. The net result is a "flickering" array of granule cell activation that provides a unique time-stamp for each moment in the overall trajectory, as a distributed, sparse activity pattern over the entire population of granule cells in a given small module of the cerebellum. The Purkinje cells can then learn to modulate their activation as a function of this temporally evolving input pattern.
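The flickering dynamic can be illustrated with a toy simulation. This sketch assumes binary units, a fixed threshold, and a simple counter-based refractory rule; the staggered integer drive is only a stand-in for the slowly changing conjunctions of inputs, and none of the names or parameter values come from the actual model.

```python
def simulate_granules(drives, threshold=0, max_on=2, refractory=20):
    """Binary granule units that shut off after a brief period of activity.

    drives: list over time of per-unit net input.
    A unit turns on when its drive exceeds threshold; after max_on
    consecutive active steps it is inhibited for `refractory` steps.
    """
    n = len(drives[0])
    on_count = [0] * n    # consecutive active steps so far
    off_timer = [0] * n   # remaining steps of refractory inhibition
    activity = []
    for drive in drives:
        acts = []
        for i, d in enumerate(drive):
            if off_timer[i] > 0:
                off_timer[i] -= 1
                acts.append(0)
            elif d > threshold:
                on_count[i] += 1
                acts.append(1)
                if on_count[i] >= max_on:
                    on_count[i] = 0
                    off_timer[i] = refractory
            else:
                on_count[i] = 0
                acts.append(0)
        activity.append(acts)
    return activity

# Slowly ramping input: unit i crosses threshold one step after unit i-1.
n_units, n_steps = 6, 7
drives = [[t - i for i in range(n_units)] for t in range(n_steps)]
patterns = simulate_granules(drives)
```

Even though each unit's drive stays above threshold once it has crossed it, refractoriness keeps at most two units active at a time, and in this toy case every time step ends up with a different activity pattern, which is what lets the pattern act as a time-stamp.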
Reaching Behavior and Motor Plant
To provide a realistic test of our model, we constructed a detailed model of the human arm and muscle system, including the 12 major muscle groups that control arm movements, attaching at different points on the shoulder, humerus (upper arm bone), ulna (lower arm bone), and hand. This arm moves according to the laws of physics as simulated by a physics equation solver library (ODE) that we have linked into emergent.
You can rotate the space around using the wheels on the sides of the window, and by clicking on the different tabs at the bottom of the window, to see the full 3D layout of things.
| ⇒ In the , click and to start the model running. |
You should see the arm reaching over toward the green target. On the first few reaches, notice how the hand overshoots the target a bit -- going too high -- and then it comes back down closer to the target. This is the error that the cerebellum will learn to correct. The maximum height of the initial error is indicated by the black bar, so you can keep track of how the learning is doing. The graph on the right also shows the vertical coordinate of the hand position over time (red line, hand_y), as well as the target hand position (targ_y) along with the error delta, del hand y -- it is this last measure that (when combined with the delta or error along the other two axes) represents the visual component to the IO error signal. When this delta value starts to increase over time, instead of decreasing as it does at the start of the motor trajectory, then the IO error signal is enabled.
The middle graph shows several variables plotted over time as the reach unfolds. The key one to pay attention to first is the musc io err mag -- this is the net muscle error signal computed by the IO -- you should see that just as the hand overshoots the target, this error signal pops up, enabled by the blue hand io err signal, which in turn is driven by the hand pos err mag, which is the total hand position error magnitude value. You may need to Stop, Init, and Run again several times to follow all of the action as learning unfolds.
After a few more reaches with this green IO error signal firing at the end of the reach, you should see the purple gains_mag line start to pop up, earlier in the reaching trajectory. This is the overall gain signal driven by the Purkinje cells, in response to the IO error signals! As emphasized above, the temporal offset in learning allows the Purkinje signal to come earlier in the reaching trajectory, to anticipate and correct the subsequent errors.
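As a rough sketch of how such a gain signal could act on the motor commands, the following assumes that the gain on each muscle grows linearly with how far its Purkinje cell has dropped below a tonic baseline. The function name `modulated_commands`, the linear form, and the parameters `baseline` and `gain_scale` are illustrative assumptions, not the equations used in the simulation.

```python
def modulated_commands(commands, purkinje_acts, baseline=1.0, gain_scale=1.0):
    """Amplify each muscle command in proportion to Purkinje disinhibition.

    commands: current contraction signal per muscle.
    purkinje_acts: current activity of the Purkinje cell for each muscle.
    """
    out = []
    for cmd, pk in zip(commands, purkinje_acts):
        # Disinhibition: how far the Purkinje cell is below its tonic level.
        disinhibition = max(0.0, baseline - pk)
        out.append(cmd * (1.0 + gain_scale * disinhibition))
    return out

# Muscle 0: Purkinje cell at tonic level, command unchanged.
# Muscle 1: Purkinje cell at half its tonic level, command amplified.
boosted = modulated_commands([0.5, 0.5], [1.0, 0.5])
```

A muscle whose Purkinje cell is firing at its tonic level keeps its original command, while a muscle whose Purkinje cell has paused gets a proportionally stronger contraction.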
|Question 7.10: What effect does this Purkinje signal have on the reaching behavior -- does the hand overshoot the target as much once this signal starts to take effect? Again, re-Init and Run a few times through to see the whole process unfold.|
|Question 7.11: Explain how the corrective muscle control signals applied when the hand overshot the target could, when applied earlier, prevent the overshoot in the first place. That is, in what direction would these corrective control signals move the hand, and why would applying these same muscle signals earlier in the reach prevent the error?|
Now let's see how the cerebellum pulls off this trick!
| ⇒ Click on the tab to see the network. Do Init and Run again in the ControlPanel. |
We can work backwards to see how the network functions. First, pay attention to the IO layer on the upper right. Toward the end of each reach, you will see a subset of these units get activated -- these are the muscle-specific IO error signals (note that there are 12 such IO units, one per muscle). After a few more repetitions of learning, you should start to see the Purkinje cells become inactive briefly during the reach -- specifically, the Purkinje cells that correspond to the IO signals that were active before. This is the net result of the IO-driven learning, which reduces synaptic strength from active granule cells and thereby drives a net disinhibitory signal from the cerebellum.
Next, pay attention to the flickering Granule cell layer activity, and notice how it unfolds as a function of the slowly-changing input activations. The input to this model, simulating the mossy fiber pathway from the pontine nuclei, comes from the target muscle lengths for this motor action (i.e., the motor command generated by primary motor cortex), the current muscle lengths, the velocity (rate of change) in muscle length, and then the three smaller layers representing the 3D coordinates of the target and hand, and hand velocity.
| ⇒ Click on r.wt in the network view, and click around on some Granule cells. |
You should see that they are very sparsely and randomly connected to the input layers. Thus, each granule cell samples a random subset of these inputs, ensuring that it will only be active for a small portion of the overall trajectory. We also add some activity-driven inhibition (refractoriness) to the granule units, so once they are active for a brief period, they are actively inhibited. This produces the overall flickering dynamic, and enables the granule cell layer to provide a useful representation of time within a given motor trajectory. Because the motor control and current state signals drive the granule cell layer, the pattern that evolves over it is highly specific to a particular reach trajectory, allowing it to memorize the corrections to that particular reach in a way that doesn't interfere with other reaches.
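The sparse, random sampling of inputs can be sketched as follows. This is a minimal illustration, not the project's connectivity code: the function names, the number of connections per unit, and the use of `min` over the sampled inputs as a stand-in for a low-order conjunction are all assumptions.

```python
import random

def make_sparse_connections(n_granule, n_input, n_conn, seed=0):
    """Give each granule unit a small random subset of input indices."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_input), n_conn))
            for _ in range(n_granule)]

def granule_drive(connections, inputs):
    """Conjunction-like drive: a unit is only as active as its weakest
    sampled input, so all of its inputs must be on for it to respond."""
    return [min(inputs[j] for j in conn) for conn in connections]

# 20 granule units, each sampling 4 of 50 input units (hypothetical sizes).
conn = make_sparse_connections(n_granule=20, n_input=50, n_conn=4, seed=1)
all_on = granule_drive(conn, [1.0] * 50)
all_off = granule_drive(conn, [0.0] * 50)
```

Because each unit needs its own particular combination of inputs to be active together, any given input state drives only a small fraction of the units, and different points along a reach trajectory recruit different units.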
| ⇒ You may now close the project (use the window manager close button on the project window or menu item) and then open a new one, or just quit emergent entirely by doing menu option or clicking the close button on the root window. |
- Albus, J. S. (1971). A theory of cerebellar function. Mathematical Biosciences, 10(1-2), 25–61. Retrieved from http://www.sciencedirect.com/science/article/B6VHX-45F52M2-J8/2/bba55f65c1bf9b826444584ec64ee6c3
- Ito84 could not be found
- Marr, D. (1969). A theory of cerebellar cortex. Journal of Physiology (London), 202, 437–470.
- Verduzco-Flores, S. O., & O'Reilly, R. C. (2015). How the credit assignment problems in motor control could be solved after the cerebellum predicts increases in error. Frontiers in Computational Neuroscience, 9. http://doi.org/10.3389/fncom.2015.00039