47: Striatum as a mosaic of broken mirrors

The mosaic of broken mirrors is an analogy for the striatum in the basal ganglia [Da Cunha et al 2009]. The striatum represents a vertebrate’s actions and environment in a broken and overlapping fashion. While actions have focal projections to the striatum, the contextual input is broad and diffuse [Fee MS 2012]. While the hippocampus represents the environment globally, the striatum depends on piecewise representation. This means striatal learning is unable to generalize, because mosaic fragments lack a global perspective [Da Cunha et al 2009].

In the essays I’ve used the striatum as a timeout for food seek from an odor plume. When a food odor doesn’t have any food, or the odor is behind a barrier, the animal needs some sort of timeout to give up seeking the odor, turn away from the false odor, and search elsewhere. However, the current simulation only uses the seek action for the timeout; it lacks environment context.

A simple illustration depicting two circular shapes; one has a blue and red figure inside, while the other contains a green star-like shape.
Simulation screenshot showing the model animal trapped in a perseverating seek in the center of a false odor plume. Circles represent odor plumes and the star represents food.

The above simulation screenshot shows the general problem. The circles represent food odor plumes and the star represents food. The animal has followed a false odor plume and will continue circling the center until the striatum timeout. However, there is a nearby valid odor plume with food in it. The animal should avoid the false odor plume and search the correct odor plume, but currently it can’t distinguish the two, because it’s only using its own seek action as a key. If the animal could detect environmental context differences between the two plumes, it could search more effectively.

As a context, odor neighborhoods [Jacobs 2022], [Marin et al 2021] can provide a primitive representation of place. If each place has a different set of odor molecules, the animal can use that odor scene to distinguish false odor plumes from true food odor sources.

Striatum as timeout

In the essays I’ve used the striatum as a timeout mechanism to prevent perseveration, specifically the S.d2 (striatum projection neurons with D2.i Gi inhibiting dopamine receptor), which use A2a.s (adenosine Gs stimulating receptor) to measure the buildup of Ado (adenosine). The striatum’s projection neurons are roughly evenly divided between S.d1 (D1.s Gs stimulating dopamine receptor) and S.d2. For long stimulations, S.d1 generally motivates the current action and S.d2 opposes the action and produces avoidance [Soares-Cunha et al 2020], but for short stimulations both are active for action initiation [Cui G et al 2013].

Ado signaling limits swimming in frog tadpoles [Dale 1998]. In the striatum, A2a.s receptors in S.d2 projection neurons also detect Ado buildup from neuron activity and from astrocytes that monitor neuron activity [Kang S et al 2020]. The Ado buildup activates S.d2 neurons and increases their internal PKA (protein kinase A) [Ma L et al 2022]. PKA slowly builds up during activity with a buildup time constant on the order of 10-20s and a decay constant on the order of 70s [Ma L et al 2022], but the increase appears to be log or sigmoid-like, not linear, suggesting that longer timeouts would be possible with high thresholds or opposition from S.d1. In S.d2, the PKA buildup enables the release of the opioid enkephalin [Konradi et al 2003], [Hook et al 2008], which activates the DOR.i (δ-opioid inhibitory receptor), which is necessary for the inhibitory/avoidance behavior for S.d2 [Soares-Cunha et al 2020].
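As a rough illustration of how those time constants could produce a timeout, here is a minimal sketch that treats the PKA level as a saturating first-order accumulator. The first-order form, the 0..1 scale, the 15s rise constant (a midpoint of the 10-20s range), and the threshold values are all assumptions for illustration, not the essay's simulation code or a model from [Ma L et al 2022].

```python
import math

TAU_RISE = 15.0   # seconds; assumed midpoint of the 10-20s buildup range
TAU_DECAY = 70.0  # seconds; decay constant on the order of 70s
THRESHOLD = 0.8   # hypothetical timeout threshold on a 0..1 scale


def step(level: float, active: bool, dt: float = 1.0) -> float:
    """Advance the PKA-like level by dt seconds."""
    if active:
        # Saturating rise toward 1.0 while the action is ongoing.
        return level + (1.0 - level) * (1.0 - math.exp(-dt / TAU_RISE))
    # Exponential decay toward 0.0 when the action stops.
    return level * math.exp(-dt / TAU_DECAY)


def seconds_to_timeout(threshold: float = THRESHOLD) -> int:
    """Count one-second ticks of continuous activity until the threshold is crossed."""
    level, ticks = 0.0, 0
    while level < threshold:
        level = step(level, active=True)
        ticks += 1
    return ticks


if __name__ == "__main__":
    for threshold in (0.5, 0.8, 0.95):
        print(threshold, seconds_to_timeout(threshold))
```

Because the rise saturates rather than growing linearly, raising the threshold stretches the timeout disproportionately, which is the property the paragraph above points to.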

Because the essay simulation needs a timeout to limit food seek perseveration, and the Ado and S.d2 avoidance chain could plausibly implement that timeout, I’ve been using it as the basis for the simulation’s timeout. This function seems evolutionarily plausible, because avoiding perseveration is important to keep the animal from unproductive seeking, and the implementation is fairly straightforward, only requiring already existing Ado timeout sensing and S.d2, without needing the entire basal ganglia. However, up to this point, that timeout has only used the action as a key, and has not included any context. Essays 15 and 16 did cover odor seek timeout as associative habituation, but in the context of the fruit fly mushroom body.

Action and context as striatum inputs

S.pn (striatum projection neurons: both S.d1 and S.d2) are often called medium spiny neurons because their extensive dendrites are covered with spines. Spines are small dendrite compartments that receive axon inputs. Spines can compartmentalize Ca2+ (calcium) transients [Fang LZ and Creed 2024], meaning S.pn activation is not necessarily global across the entire dendrite tree, but compartmentalized. In S.d (dorsal striatum), cortical axons attach to S.pn spines, while T.pf (parafascicular thalamus) connects to the dendrite shaft [Fee MS 2012]. T.pf signals include ongoing action feedback and efference copies from the hindbrain and midbrain, including OT (optic tectum), while the cortex provides environmental context.

Diagram illustrating the connection between cortical inputs (C), thalamic inputs (T.pf), and striatum projection neurons (S.pn) via dendritic spines.
Rough diagram of inputs to S.pn dendrites. Cortical input is to distal dendrites and spines, while T.pf is more proximal and to dendrite shaft. C (cortex), S.pn (striatum projection neuron), T.pf (parafascicular thalamus).

Songbirds have a portion of the basal ganglia devoted to singing called Area X [Kornfeld et al 2020]. Area X receives song action variability information from C.lman (lateral magnocellular nucleus of the anterior nidopallium) and timing context from C.hvc, which are areas of the songbird cortex. C.lman provides an efference copy of the action variations [Fee MS 2014]. The majority (85%) of C.hvc input is on S.pn spines, while 55% of the C.lman action information is on the dendrite shafts [Kornfeld et al 2020]. The action efference copy inputs to S.pn are not plastic, while the contextual input on the spines is plastic [Fee MS 2012]. The context drives S.pn activation to an Up state, gating the core action driver [Fee MS 2012]. Action input to the striatum is focused, while contextual information is diffuse [Fee MS 2012].

S.pn inputs can differ in their attachment to spines or dendrite shafts, and they can also differ in how they trigger S.pn and drive Up states. S.pn are normally hyperpolarized, meaning they are especially difficult to trigger. Some inputs can shift S.pn into an Up state, where they are more easily triggered. In S.v (ventral striatum), inputs from E.sub.v (ventral subiculum in the hippocampal complex) can shift S.pn into an Up state for hundreds of milliseconds [O’Donnell and Grace 1995], [Sesack and Grace 2010]. When E.sub.v is disabled, S.v spontaneous or bistable activity halts, and other inputs such as F.pfc (prefrontal cortex) can’t trigger action potentials [O’Donnell and Grace 1995]. Up state transitions can be facilitated by dopamine [Fang LZ and Creed 2024], [Lahiri and Bevan 2020] and astrocyte sensing of glutamate activity [D’Ascenzo et al 2007], [Yu X et al 2018].

For the purposes of the essay simulation, these differences in striatum input suggest it’s plausible to treat action input and contextual input as distinct types of input, following [Fee MS 2012]. Specifically, the combination of an action input and a context is required to drive the striatum timeout. This means that a seek timeout can be specific to a context and not spill over to other contexts.
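The action-plus-context requirement can be sketched as a set of accumulators keyed by context that only advance while the seek action is active. This is a minimal illustration, not the essay's simulation code: the class name `ContextTimeout`, the tick-count threshold, and the use of syllable strings as context keys are all hypothetical.

```python
from typing import Dict, Optional


class ContextTimeout:
    """Per-context timeout that accumulates only while the seek action is active."""

    def __init__(self, threshold: int = 5) -> None:
        self.threshold = threshold        # hypothetical number of ticks before giving up
        self.levels: Dict[str, int] = {}  # one accumulator per context key

    def update(self, seek_active: bool, context: Optional[str]) -> bool:
        """Advance one tick and report whether the current context has timed out."""
        if seek_active and context is not None:
            # Both inputs are required: the action enables, the context selects.
            self.levels[context] = self.levels.get(context, 0) + 1
        return self.is_timed_out(context)

    def is_timed_out(self, context: Optional[str]) -> bool:
        if context is None:
            return False
        return self.levels.get(context, 0) >= self.threshold


if __name__ == "__main__":
    timeout = ContextTimeout(threshold=5)
    for _ in range(5):
        timeout.update(seek_active=True, context="rat")
    print(timeout.is_timed_out("rat"), timeout.is_timed_out("dog"))
```

With the action alone as the key, the two plumes in the first screenshot would share one accumulator; keying by context keeps a failed seek in one neighborhood from suppressing seek in another.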

Innate and contextual odors

In vertebrates, O.sn (odor sensory neurons) axons project to O.gl (glomerules) in Ob (olfactory bulb), where they connect with O.pn (odor projection neuron: mitral and tufted cells in mammals) dendrites. Each O.gl is a large neuropil (axon and dendrite connection area) where multiple O.sn and O.pn combine. In mammals, each O.gl responds to a single O.sn odor feature. Each O.gl typically responds to several odor molecules, and each odor molecule drives multiple O.gl [Wilson and Mainen 2006], [Weiss 2020]. In other vertebrates, each O.gl can combine inputs from multiple O.sn. Insect odor processing also uses glomerules, but this shared structure reflects independent evolution, not homology, because even the underlying odor detection receptors are unrelated between insects and vertebrates [Weiss 2020]. The glomerular structure is likely simply an effective way of connecting multiple O.sn to O.pn.

As an analogy that the simulation uses, consider phonemes in a syllable, where each syllable is like an odor molecule and each phoneme is like a glomerulus. The syllable “cat” consists of “c-”, “-a-”, and “-t”, corresponding to three glomerules, and “c-” is driven by many different syllables. So, O.gl doesn’t identify the whole odor, but only a feature of the odor, like “c-”, but the features can be recombined to identify the odor.
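The syllable analogy can be written down directly. The sketch below assumes three-letter syllables and a hypothetical helper named `glomeruli`; it is only an illustration of the analogy, not the simulation's actual encoding.

```python
def glomeruli(syllable: str) -> frozenset:
    """Split a three-letter syllable (odor molecule) into positional phoneme features."""
    if len(syllable) != 3:
        raise ValueError("this sketch assumes three-letter syllables")
    onset, vowel, coda = syllable
    # The hyphens mark position, so an initial "t-" differs from a final "-t".
    return frozenset({f"{onset}-", f"-{vowel}-", f"-{coda}"})


if __name__ == "__main__":
    print(sorted(glomeruli("cat")))
    # Shared features between two different odor molecules.
    print(sorted(glomeruli("cat") & glomeruli("cot")))
```

Here “cat” and “cot” share the “c-” and “-t” features, so neither feature alone identifies the odor, but the full set of three does.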

The odor glomerules in vertebrates are divided into a smaller innate group and a larger contextual group. Most mammals have distinct Ob and O.a (accessory olfactory bulb). The lamprey Ob.m (medial Ob) projects directly to the midbrain, including Hb.m (medial habenula) and V.pt (posterior tuberculum) [Derjean et al 2010], [Beauséjour et al 2022], while the Ob.l (lateral Ob) projects to Pa (pallium/cortex) and basal ganglia [Beauséjour et al 2022], [Beauséjour et al 2024], [Suryanarayana et al 2021].

Previous essays have only used the innate Ob.m projection and ignored the Ob.l projection. This essay adds the Ob.l context projection to S.ot (olfactory tubercle), which is a part of S.v (ventral striatum) with large, direct olfactory input, and output to H.l (lateral hypothalamus) and Pv (ventral pallidum). The Ob.l context may represent odor neighborhoods, introducing a notion of place.

Odor neighborhoods

Odors rarely occur in isolation, are dynamic in space and time [Marin et al 2021], and form spatial neighborhoods [Jacobs 2022]. Olfactory cues influence E.hc (hippocampus) place fields, and place cells in blind rats are similar to those in sighted rats [Marin et al 2021]. In O.pir.p (posterior piriform cortex/olfactory cortex), place can be decoded to 90% accuracy with 240 neurons [Poo C et al 2022]. The olfactory spatial hypothesis holds that odor is used more for navigation than for identification [Jacobs 2012], where an odor neighborhood is a local area of odor mixtures.

It seems plausible that an early proto-vertebrate could use a combination of odor features from O.gl in an early S.v to restrict a seek timeout to a local place. The circuit is a straightforward extension of existing Ado timeout circuitry.

Seek striatum

For the essay’s simulation, I’m using odor context from Ob.l as a neighborhood detector. The striatum has two inputs: an action that enables the striatum during a seek and a place context identified by odor to restrict the search.

A simulation screenshot depicting various spatial representations of sensory inputs, including graphs and spatial maps related to odor detection and processing mechanisms.
Screenshot of the seek task blocked by a U-shaped barrier with odor neighborhoods represented by color and pattern. The local odor “rat” is represented by odor glomerules for “r-”, “-a-”, and “-t”.

The above screenshot shows the animal after its timeout from a failed odor seek when blocked by a barrier. The star represents food and the circle is an odor plume. Each pattern in the arena represents an odor neighborhood. The right side of the screenshot shows the active glomerules for the neighborhood, represented by “r-”, “-a-” and “-t”. I’m using syllables to represent odor molecules and phonemes to represent odor features detected by Ob glomerules.

Fruit fly mushroom body and Kenyon cells

Because the hypothetical proto-vertebrate would have a much simpler striatum than the mammal striatum, consider the comparison with the fruit fly MB (mushroom body) and its KC (Kenyon cells), which have a similar structure to the Sv projections to Pv, but at a much smaller scale. The mushroom body is highly conserved among insects and possibly predates all arthropods [Fiala and Kaun 2024], and serves as an odor pattern detector. In fruit flies, 52 O.pn project to ~2000 KC [Chan ICW et al 2024], which project to 24 MBON (mushroom body output neurons) [Seki et al 2017]. Each KC has three to seven claws [Zheng et al 2022], which are essentially single-connection dendrites.

A diagram illustrating the connectivity of Kenyon cells (KCs) in the fruit fly mushroom body, showing olfactory sensory neuron (OSN) inputs, projection neurons (PNs), and connections to mushroom body output neurons (MBONs) with their neurotransmitter types.
Architecture of the Drosophila mushroom body, adapted from [Aso Y et al 2014]. For this essay, only the left side projections of PN to KC are important. KC (Kenyon cell), PN (olfactory projection neuron), OSN (olfactory sensory neuron)

The above diagram shows the fruit fly mushroom body, but only the KC on the left are relevant for this essay. The MBON on the right would correspond to Pv (ventral pallidum) in this analogy. Each of the 2000 KC receives essentially random olfactory input from the O.pn, with 3-7 O.pn inputs per KC.
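The random KC wiring described above can be sketched with the counts from the text (52 O.pn, about 2000 KC, 3-7 claws each). The uniform random sampling and the all-claws-active response rule are assumptions for illustration; real KC thresholds and connection statistics are more nuanced, and the function names are hypothetical.

```python
import random

N_PN = 52    # O.pn count from the text
N_KC = 2000  # approximate KC count from the text


def wire_kcs(seed: int = 0):
    """Give each KC a random set of 3-7 distinct O.pn inputs (its claws)."""
    rng = random.Random(seed)
    return [
        frozenset(rng.sample(range(N_PN), rng.randint(3, 7)))
        for _ in range(N_KC)
    ]


def responding_kcs(kcs, active_pn):
    """Hypothetical rule: a KC responds only when every one of its claws is active."""
    active = set(active_pn)
    return [index for index, claws in enumerate(kcs) if claws <= active]


if __name__ == "__main__":
    kcs = wire_kcs()
    odor = range(0, 20)  # an arbitrary odor scene activating 20 of the 52 O.pn
    print(len(responding_kcs(kcs, odor)), "of", len(kcs), "KC respond")
```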

This mushroom body structure roughly corresponds to vertebrate O.pn projections to S.ot (ventral striatum olfactory tubercle), which projects to Pv. Although the connectivity pattern is similar, the two structures are not homologous in any fashion. Insect O.sn and vertebrate O.sn use entirely separate olfactory receptor families, and KCs use ACh (acetylcholine) as a neurotransmitter, while S.pn use GABA and vertebrate-specific opioids. The point of the analogy here is only to compare the scale of the O.gl and S.ot for a possible proto-vertebrate because the mammal S.ot is vastly too large to be plausible for that ancestor.

The MB has been compared to vertebrate CB-like (cerebellum-like) structures in the hindbrain [Farris 2011], suggesting that both serve as adaptive sensory filters. CB-like structures have dual inputs: one is sensory-specific input and the other is multimodal contextual input. The MB also serves as a brake on insect locomotion, because fruit flies with MB lesions are less likely to stop locomotion once they have begun moving. This locomotion stopping is similar to the seek perseveration timeout in this essay.

Diagram comparing the olfactory processing pathways in insects and vertebrates. The insect mushroom body pathway includes odor sensory neurons (O.sn), odor projection neurons (O.pn), Kenyon cells (KC), and mushroom body output neurons (MBON), leading to seek/avoid responses. The vertebrate pathway similarly includes O.sn, O.pn, striatal projection neurons (S.pn), and the ventral pallidum (Pv), also leading to seek/avoid responses.
Analogy between the insect mushroom body and the vertebrate ventral basal ganglia. KC (Kenyon cells), MBON (mushroom body output neuron), O.pn (olfactory projection neuron), O.sn (olfactory sensory neuron), Pv (ventral pallidum), S.pn (striatum projection neuron)

For scale, consider applying the syllable analogy to the lamprey Ob. The lamprey has approximately 40 olfactory receptor genes [Beauséjour et al 2020]. If we exclude about 10 innate odors from Ob.m, the remaining 30 are contextual odors in Ob.l. Consider splitting each odor as a syllable into three odor features as phonemes. If the 30 lamprey O.gl were organized like phonemes, then it might have 10 initial consonants, 10 vowels, and 10 final consonants. Suppose each S.pn receives three inputs from O.pn: one of 10 initial consonants, one of 10 vowels, and one of 10 final consonants. The resulting 1000 S.pn (10 × 10 × 10) would cover the possible syllables, expanding the dimensionality from 30 odor phonemes to 1000 odor syllables. Analogously, the Drosophila 52 O.pn phonemes expand to ~2000 KC syllables, roughly the same order of magnitude. Of course, the olfactory neighborhoods aren’t actually nicely ordered into convenient human-readable syllables, but it’s a convenient analogy, particularly for the simulation.
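The 30-to-1000 expansion can be checked with a short enumeration. The feature labels (`o0-`, `-v0-`, `-c0`) are placeholders invented for this sketch, as is the rule that an S.pn is active only when all three of its features are present.

```python
from itertools import product

# Hypothetical placeholder labels for 10 onsets, 10 vowels, and 10 codas.
ONSETS = [f"o{i}-" for i in range(10)]
VOWELS = [f"-v{i}-" for i in range(10)]
CODAS = [f"-c{i}" for i in range(10)]

GLOMERULI = ONSETS + VOWELS + CODAS  # 30 contextual odor features

# One S.pn per (onset, vowel, coda) combination.
SPN_INDEX = {
    combo: index for index, combo in enumerate(product(ONSETS, VOWELS, CODAS))
}


def active_spn(active_glomeruli):
    """Return the indices of S.pn whose three input features are all active."""
    active = set(active_glomeruli)
    return [index for combo, index in SPN_INDEX.items() if active.issuperset(combo)]


if __name__ == "__main__":
    print(len(GLOMERULI), "glomeruli expand to", len(SPN_INDEX), "S.pn")
    print(active_spn(["o3-", "-v1-", "-c4"]))
```

A single odor syllable activates exactly one S.pn in this scheme, while a mixture with two onsets activates two, which is one way the lossy reassembly in the mosaic shows up.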

Returning to the original metaphor of the mosaic of broken mirrors [Da Cunha et al 2009], the O.pn breaks odor molecules (syllables and unbroken mirror) into a broken set of odor features (phonemes and mosaic tesserae), and randomly reassembles the features in S.pn like tesserae in a mosaic, partially recovering the original syllable structure, albeit with loss. Because the features are broken pieces stripped from their original odor molecule identity, the system could add other modalities, such as lateral line or whisker sensing, or temperature, or color fragments, before combining them. Although the fruit fly KCs primarily combine odor inputs, they also include a smaller number of non-odor inputs such as visual, gustatory, mechanosensory, and proprioceptive inputs [Farris 2011].

S.core seek and S.msh roam

So far I’ve used the striatum as a timeout to avoid seek perseveration, giving up on a failed food odor. Adding an odor neighborhood context improves the accuracy of seek perseveration control. Now that odor neighborhoods are available, the animal could also avoid neighborhoods that it’s already searched, essentially creating a memory breadcrumb, as explored in essay 44.

A diagram showcasing a grid with labeled sections including 'dm,' 'dl,' 'msh.d,' 'msh.v,' 'core,' 'lsh,' 'mot,' and 'lot,' indicating different components or regions.
Rough topographic divisions of the striatum. The blue S.msh.d is used for roam, and the amber S.core and S.lsh are used for seek. S.core (Sv core), S.dl (dorsolateral striatum), S.dm (dorsomedial striatum), S.lot (lateral S.ot – olfactory tubercle of Sv), S.lsh (lateral shell of Sv), S.mot (medial S.ot), S.msh.d (medial shell of Sv, dorsal part), S.msh.v (medial shell of Sv, ventral part), Sv (ventral striatum aka nucleus accumbens)

Sv (ventral striatum) is divided into S.core (Sv core) and S.sh (Sv shell), where S.core surrounds the anterior commissure, which connects Ob and amygdala. The shell further divides into S.msh (medial shell) and S.lsh (lateral shell), with S.ot also dividing into S.mot (medial S.ot) and S.lot (lateral S.ot). S.msh itself divides into S.msh.d (dorsal S.msh) and S.msh.v (ventral S.msh). These regions have distinct genetic transcription factor types and connectivity, and S.msh may be even more complicated with further genetically defined subtypes [Chen R et al 2021].

Functionally, S.lsh and S.core are more similar, related to cues and seek [Floresco 2015], [Chen G et al 2023], [Ding YD et al 2022], [Dobrovitsky 2017], while S.msh is distinct and related to place [Al-Hasani et al 2015], [Humphries and Prescott 2010], but not cues [Domingues et al 2025]. S.sh is important for place habituation: avoiding places already visited [Floresco 2015]. S.core is more associated with seek actions, and S.msh is more associated with place preference and avoidance [Fisher et al 2025]. S.ot is less studied, but is strongly Ob and O.pir related. Assigning S.lot to the same group as S.lsh and S.mot with S.msh is not well supported functionally, but does have some transcriptional support [Chen R et al 2021]. S.msh.d generally supports RTPP (real-time place preference) and S.msh.v RTPA (real-time place avoidance) [Ding YD et al 2022], [D’Aquila 2024], [Faget et al 2024], but because other studies report S.msh.d as necessary for cued avoidance [Ramirez et al 2015], the S.msh function may not be as simple as a clear RTPA/RTPP difference. Sv has three-dimensional aspects as well, with S.msh.a (anterior S.msh) and S.msh.p (posterior S.msh) having opposing seek and avoid motivation [Castro et al 2015], [Berridge 2019], [Bond et al 2020], [Marinescu and Labouesse 2024], with differing projections to locomotion vs eating regions [Richard and Berridge 2011].

As a complication for reading the research, because the heterogeneity of Sv regions was relatively recently discovered, many papers report results for S.sh without distinguishing between S.lsh, S.msh.d, or S.msh.v, despite these regions having different or even opposing functions. Older papers often simply report results for Sv without even distinguishing S.core from S.sh. Another complication is that Sv is also important for eating, not simply dedicated to seek or avoidance. Stimulating S.sh.a immediately stops eating [Reed et al 2018]. However, eating and roaming/seeking are related because they’re mutually exclusive: the animal needs to stop roaming or seeking to eat. In some cases eating circuits may actually be stop-moving circuits, since 70% of S.sh neurons are inhibited while eating and 30% are briefly excited while eating [Marinescu and Labouesse 2024]. Note that although the divisions in S.v are broadly topographic, some sub-functions could be mixed salt-and-pepper style, particularly in the complicated S.msh region.

So, the essay can add S.msh place habituation to avoid places already visited [Floresco 2015]. This seems likely to be S.msh.d because S.msh.v is more associated with threat avoidance [Ding YD et al 2022].

Odor neighborhoods for roaming

The odor spatial hypothesis suggests that the vertebrate Ob (olfactory bulb) is used more for spatial navigation than odor identification [Jacobs 2012]. Odors form spatial neighborhoods [Jacobs 2022], and the mammalian E.hc (hippocampus) place fields are driven by odor [Jacobs 2022]. Odors are rarely in isolation, but are dynamic in space and time. Place cells in blind rats are similar to sighted rats [Marin et al 2021]. A very old model of E.hc called it the rhinencephalon (nose brain), which was displaced by the discovery of spatial place fields, but if place is grounded by an old odor neighborhood circuit, then rhinencephalon may be accurate [Jacobs 2022].

For the essay simulation I’ve created two parallel basal ganglia paths for seek and roam. The seek path is only activated when the animal is following a food odor plume. The roam path is more broadly activated when the animal is searching for food. For specific paths, S.core and S.lsh appear specific to seek [Dobrovitsky 2017], [Soares-Cunha et al 2020], [Walle et al 2024]. Sv research doesn’t investigate roam circuits per se, but RTPA (real-time place avoidance) and RTPP (real-time place preference) and conditioned place preference are centered on S.msh [Britt et al 2012], [Marinescu and Labouesse 2024].

Flowchart illustrating the pathways in the basal ganglia for seeking and roaming behaviors in response to olfactory signals. Left side shows 'Ob.l place' input leading to seek and roam pathways, including interactions between various neural regions.
Paths used by the simulation for seek timeout and roam timeout with odor neighborhood context. H.l (lateral hypothalamus), Hb.lm (lateral habenula, medial part), MLR (midbrain locomotor region), Ob.l (lateral olfactory bulb), Pv.dl (ventral pallidum, dorsolateral part), Pv.vm (ventral pallidum, ventromedial part), R1.a (anterior hindbrain locomotor region), R5.rs (mid-hindbrain turning region), S.core (ventral striatum core), S.msh.d (striatum medial shell, dorsal part), T.pf (parafascicular thalamus), V.rn (raphe nuclei)

The above diagram shows a hypothetical dual seek and roam circuit. S.core uses seek action feedback / efference copy to enable a seek timeout to avoid perseveration. S.msh.d uses the H.l (lateral hypothalamus) roaming driver to enable place habituation to avoid searching places already visited.

A screenshot of a simulation showing an animal's movement in a patterned arena with various sections representing different odor neighborhoods. Circles denote active olfactory stimuli and a star marks the location of food. Syllables and phonemes are used to represent odor features.
Screenshot of the simulation with the animal leaving an area it’s already explored. The phonemes “d-”, “-o-”, and “-g” represent the current odor neighborhood.

The above screenshot shows the simulation for roaming with odor neighborhoods. Each pattern represents a different odor neighborhood. The phonemes on the right (“d-”, “-o-”, and “-g”) represent odor features of the neighborhood. The animal is avoiding the bottom neighborhood marked by blue horizontal lines because roaming has timed out for that neighborhood.

Box plot and violin plot comparing the distances traveled in an open field for two groups: with 'PvRoam' and without 'PvRoam'.
Monte Carlo simulation of the animal’s search. PvRoam represents roaming with timeout enabled. No PvRoam represents roaming without timeout.

As a test to verify that avoiding place repetition improves food search, I ran 300 Monte Carlo simulations each with the roam timeout enabled and disabled. In the simulation code, the timeout circuit is organized by its Pv projection, which owns the corresponding Sv. So enabling the roaming timeout means enabling PvRoam. The results suggest that avoiding place repetition improves performance by reducing the long-search tail of the distribution.
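The shape of that comparison can be reproduced with a toy model. The sketch below is not the essay's simulation: it assumes a hypothetical 5x5 grid of neighborhoods, a random walk between adjacent cells, and a habituation rule that simply prefers unvisited neighbors when any exist. Only the 300-trial count comes from the text.

```python
import random
from statistics import mean

SIZE = 5  # hypothetical 5x5 grid of odor neighborhoods


def neighbors(cell):
    """Return the adjacent cells that lie inside the grid."""
    x, y = cell
    steps = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(i, j) for i, j in steps if 0 <= i < SIZE and 0 <= j < SIZE]


def search_steps(rng, habituate, max_steps=10_000):
    """Count the steps a random walker needs to reach a randomly placed food cell."""
    cell = (0, 0)
    cells = [(i, j) for i in range(SIZE) for j in range(SIZE) if (i, j) != cell]
    food = rng.choice(cells)
    visited = {cell}
    for step in range(1, max_steps + 1):
        options = neighbors(cell)
        if habituate:
            # Place habituation: prefer neighborhoods not yet visited, if any.
            fresh = [c for c in options if c not in visited]
            options = fresh or options
        cell = rng.choice(options)
        visited.add(cell)
        if cell == food:
            return step
    return max_steps  # safety cap; rarely reached on a grid this small


def run(trials=300, seed=0):
    """Return mean search length with and without place habituation."""
    rng = random.Random(seed)
    with_roam = [search_steps(rng, True) for _ in range(trials)]
    without_roam = [search_steps(rng, False) for _ in range(trials)]
    return mean(with_roam), mean(without_roam)


if __name__ == "__main__":
    with_roam, without_roam = run()
    print(f"habituation: {with_roam:.1f} steps, none: {without_roam:.1f} steps")
```

Even this crude habituation rule mostly removes the long searches where an unbiased walker keeps revisiting the same cells, which is the same qualitative effect as the shortened tail in the plot.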

Discussion: multimodal feature inputs

The essay’s simulation only used odor features as striatum inputs for identifying neighborhoods, but the mosaic model can work with multimodal inputs. For example, the lateral line sense can detect a barrier to the right of the animal. That signal can be added to the striatum mosaic to distinguish odor neighborhoods bordered by a reef from a neighborhood over sand. Temperature sensors can distinguish cold and warm neighborhoods. Similarly, even simple, non-imaging photoreceptors tuned to multiple colors could help distinguish sand from reef or deep blue ocean from shallow waters. Three or four bits of visual information could help distinguish neighborhoods without needing complicated visual processing.

Similarly, although the fruit fly mushroom body mainly has odor inputs, it also includes some visual, gustatory, and thermosensory input [Chan ICW et al 2024]. Like a striatum mosaic, the visual processing in the mushroom body isn’t complex, but it can distinguish environments.

Insect mushroom body as a cerebellum-like structure

An interesting comparison between the insect MB (mushroom body) and vertebrate CB-like (cerebellum-like) structures suggests that both act as adaptive sensory filters [Farris 2011]. Vertebrate CB-like structures, mainly in the hindbrain, use anti-Hebbian plasticity to predict and erase self-motion from sensor data [Bell et al 2008], [Montgomery et al 2012].

For example, the aquatic lateral-line sense uses water motion sensors to detect objects and prey. If the animal is swimming near an obstacle to the right, the relative water motion produces a curl around the animal, which a relatively simple circuit can decode to infer the barrier [Oteiza et al 2017]. Because the animal’s own swimming also produces water movement, a CB-like organ, R.mon (medial octavolateral nucleus), subtracts the self signal, enabling more accurate obstacle and prey detection.

Similar to the striatum context and action architecture explored in this essay, CB-like structures have a context formed by parallel fibers, which encode a multimodal combination of self-action and proprioceptive input, and a primary sensory input, which the context modulates. Also similar to this essay’s striatum model, plasticity is anti-Hebbian: repeated activation is suppressed.

There are major differences between the striatum and CB-like functionality, of course. CB-like structures form an adaptive filter to produce a more useful signal, while the striatum timeout in this essay avoids repeating search areas, and it’s hard to find any commonality in those two functions other than the very general avoidance of repetition.

Possible cortex enhancements

Because the simulation is a toy model, it hides the noise and signal problems. Odors in particular are difficult and messy sensor input because odors in water are clumpy, not the clean odor gradients and neighborhoods of the model. Real odor signals will appear and disappear, and neuron signals are generally short, between 3ms for fast AMPA receptors and 100ms for NMDA receptors, but the odor needs to be sustained much longer, on the order of several seconds.

Cortical pyramidal neurons can activate a sustained ADP (afterdepolarization) mode lasting on the order of 6-8 seconds when ACh (acetylcholine) activates mACh.q (Gq coupled acetylcholine receptor). This sustained activity could stretch an odor neighborhood signal across the gaps in a spotty odor receptor signal. A proto-vertebrate improvement could use a simple proto-cortical circuit as short term memory.
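The gap-bridging idea can be sketched as a hold timer on a boolean detection stream. This is only an illustration of the signal-stretching effect, not a model of ADP itself; the function name `sustain` and the choice of tick size are hypothetical, and with one-second ticks a hold of 6-8 ticks would correspond to the duration quoted above.

```python
def sustain(samples, hold_ticks):
    """Keep a detection active for hold_ticks ticks after the last true sample."""
    output = []
    remaining = 0
    for detected in samples:
        if detected:
            remaining = hold_ticks  # each new detection restarts the hold
            output.append(True)
        elif remaining > 0:
            remaining -= 1          # bridge the gap while the hold lasts
            output.append(True)
        else:
            output.append(False)
    return output


if __name__ == "__main__":
    spotty = [True, False, False, True, False, False, False, False, False]
    print(sustain(spotty, hold_ticks=3))
```

In this example the two brief detections, separated by a two-tick gap, merge into one continuous neighborhood signal that ends three ticks after the last detection.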

A more complicated improvement is associative pattern recognition. The mosaic striatum model can detect simple patterns, but it can’t generalize, and may be susceptible to noise and distractor signals. Typically, an odor scene will have multiple odor molecules, unlike the simulation’s simplified model. Cortex circuits could filter the noisy inputs and produce a more reliable input to the striatum, replacing direct Ob input with more conceptual O.pir (piriform cortex) engrams.

Odor and place

Although odors can form neighborhoods, they aren’t necessarily precise or reliable. One complicated improvement is dedicated cortical regions devoted to identifying place. O.pir is the main olfactory cortex in mammals, with homologous olfactory cortexes in other vertebrates. In mammals, O.pir.a (anterior O.pir) identifies odors, and O.pir.p (posterior O.pir) detects place [Poo C et al 2022]. The neuronal connectivity of O.pir.p resembles parts of E.hc, which is well-known to encode place.

Consider an evolutionary sequence of improvements that starts from a simple striatum mosaic that detects odor neighborhoods, then improves that primitive place detection with more sophisticated cortical place in O.pir.p. That single-mode odor place detection could then combine with other sensory modes using egocentric and allocentric inputs like head direction and landmarks into a more reliable place detection in E.hc.

References

Al-Hasani R, McCall JG, Shin G, Gomez AM, Schmitz GP, Bernardi JM, Pyo CO, Park SI, Marcinkiewcz CM, Crowley NA, Krashes MJ, Lowell BB, Kash TL, Rogers JA, Bruchas MR. Distinct Subpopulations of Nucleus Accumbens Dynorphin Neurons Drive Aversion and Reward. Neuron. 2015 Sep 2;87(5):1063-77. 

Beauséjour PA, Auclair F, Daghfous G, Ngovandan C, Veilleux D, Zielinski B, Dubuc R. Dopaminergic modulation of olfactory-evoked motor output in sea lampreys (Petromyzon marinus L.). J Comp Neurol. 2020 Jan 1;528(1):114-134.

Beauséjour PA, Zielinski B, Dubuc R. Olfactory-induced locomotion in lampreys. Cell Tissue Res. 2022:1-15.

Beauséjour PA, Veilleux JC, Condamine S, Zielinski BS, Dubuc R. Olfactory Projections to Locomotor Control Centers in the Sea Lamprey. Int J Mol Sci. 2024 Aug 29;25(17):9370. 

Bell CC, Han V, Sawtell NB. Cerebellum-like structures and their implications for cerebellar function. Annu Rev Neurosci. 2008;31:1-24.

Berridge KC. Affective valence in the brain: modules or modes? Nat Rev Neurosci. 2019 Apr;20(4):225-234. 

Bond CW, Trinko R, Foscue E, Furman K, Groman SM, Taylor JR, DiLeone RJ. Medial Nucleus Accumbens Projections to the Ventral Tegmental Area Control Food Consumption. J Neurosci. 2020 Jun 10;40(24):4727-4738. 

Britt JP, Benaliouad F, McDevitt RA, Stuber GD, Wise RA, Bonci A. Synaptic and behavioral profile of multiple glutamatergic inputs to the nucleus accumbens. Neuron. 2012 Nov 21;76(4):790-803. 

Castro DC, Cole SL, Berridge KC. Lateral hypothalamus, nucleus accumbens, and ventral pallidum roles in eating and hunger: interactions between homeostatic and reward circuitry. Front Syst Neurosci. 2015 Jun 15;9:90.

Chan ICW, Chen N, Hernandez J, Meltzer H, Park A, Stahl A. Future avenues in Drosophila mushroom body research. Learn Mem. 2024;31(5):a053863.

Chen, G., Lai, S., Bao, G., Ke, J., Meng, X., Lu, S., Wu, X., Xu, H., Wu, F., Xu, Y. and Xu, F., 2023. Distinct reward processing by subregions of the nucleus accumbens. Cell reports, 42(2).

Chen R, Blosser TR, Djekidel MN, Hao J, Bhattacherjee A, Chen W, Tuesta LM, Zhuang X, Zhang Y. Decoding molecular and cellular heterogeneity of mouse nucleus accumbens. Nat Neurosci. 2021 Dec;24(12):1757-1771.

Da Cunha C, Wietzikoski EC, Dombrowski P, Bortolanza M, Santos LM, Boschen SL, Miyoshi E. Learning processing in the basal ganglia: a mosaic of broken mirrors. Behav Brain Res. 2009 Apr 12;199(1):157-70.

Dale N. Delayed production of adenosine underlies temporal modulation of swimming in frog embryo. J Physiol. 1998 Aug 15;511(Pt 1):265-72.

D’Ascenzo M, Fellin T, Terunuma M, Revilla-Sanchez R, Meaney DF, Auberson YP, Moss SJ, Haydon PG. mGluR5 stimulates gliotransmission in the nucleus accumbens. Proc Natl Acad Sci U S A. 2007 Feb 6;104(6):1995-2000. 

D’Aquila, PS, 2024. Licking microstructure in response to novel rewards, reward devaluation and dopamine antagonists: possible role of D1 and D2 medium spiny neurons in the nucleus accumbens. Neuroscience & Biobehavioral Reviews, p.105861.

Derjean D, Moussaddy A, Atallah E, St-Pierre M, Auclair F, Chang S, Ren X, Zielinski B, Dubuc R. A novel neural substrate for the transformation of olfactory inputs into motor output. PLoS Biol. 2010 Dec 21;8(12):e1000567. 

Ding YD, Chen X, Chen ZB, Li L, Li XY, Castellanos FX, Bai TJ, Bo QJ, Cao J, Chang ZK, Chen GM, Chen NX, Chen W, Cheng C, Cheng YQ, Cui XL, Duan J, Fang YR, Gong QY, Hou ZH, Hu L, Kuang L, Li F, Li HX, Li KM, Li T, Liu YS, Liu ZN, Long YC, Lu B, Luo QH, Meng HQ, Peng DH, Qiu HT, Qiu J, Shen YD, Shi YS, Si TM, Tang YQ, Wang CY, Wang F, Wang K, Wang L, Wang X, Wang Y, Wang YW, Wu XP, Wu XR, Xie CM, Xie GR, Xie HY, Xie P, Xu XF, Yang H, Yang J, Yao JS, Yao SQ, Yin YY, Yuan YG, Zang YF, Zhang AX, Zhang H, Zhang KR, Zhang L, Zhang ZJ, Zhao JP, Zhou RB, Zhou YT, Zhu JJ, Zhu ZC, Zou CJ, Zuo XN, Yan CG, Guo WB. Reduced nucleus accumbens functional connectivity in reward network and default mode network in patients with recurrent major depressive disorder. Transl Psychiatry. 2022 Jun 6;12(1):236. 

Dobrovitsky V, West MO, Horvitz JC. The role of the nucleus accumbens in learned approach behavior diminishes with training. Eur J Neurosci. 2019 Nov;50(9):3403-3415. 

Domingues, A.V., Carvalho, T.T., Martins, G.J., Correia, R., Coimbra, B., Bastos-Gonçalves, R., Wezik, M., Gaspar, R., Pinto, L., Sousa, N. and Costa, R.M., 2025. Dynamic representation of appetitive and aversive stimuli in nucleus accumbens shell D1-and D2-medium spiny neurons. Nature communications, 16(1), p.59.

Faget L, Oriol L, Lee WC, Zell V, Sargent C, Flores A, Hollon NG, Ramanathan D, Hnasko TS. Ventral pallidum GABA and glutamate neurons drive approach and avoidance through distinct modulation of VTA cell types. Nat Commun. 2024 May 18;15(1):4233. 

Fang, L.Z. and Creed, M.C., 2024. Updating the striatal–pallidal wiring diagram. Nature neuroscience, 27(1), pp.15-27.

Farris, S.M., 2011. Are mushroom bodies cerebellum-like structures?. Arthropod structure & development, 40(4), pp.368-379.

Fee MS. Oculomotor learning revisited: a model of reinforcement learning in the basal ganglia incorporating an efference copy of motor actions. Front Neural Circuits. 2012 Jun 27;6:38. 

Fee MS. The role of efference copy in striatal learning. Curr Opin Neurobiol. 2014 Apr;25:194-200. 

Fisher, A.A., Gonzalez, L.S., Cappel, Z.R., Grover, K.E., Waclaw, R.R. and Robinson, J.E., 2025. Dopaminergic encoding of future defensive actions in the mouse nucleus accumbens. PNAS nexus, 4(5), p.pgaf128.

Floresco SB. The nucleus accumbens: an interface between cognition, emotion, and action. Annu Rev Psychol. 2015 Jan 3;66:25-52.

Hook, V., Toneff, T., Baylon, S. and Sei, C., 2008. Differential activation of enkephalin, galanin, somatostatin, NPY, and VIP neuropeptide production by stimulators of protein kinases A and C in neuroendocrine chromaffin cells. Neuropeptides, 42(5-6), pp.503-511.

Humphries MD, Prescott TJ. The ventral basal ganglia, a selection mechanism at the crossroads of space, strategy, and reward. Prog Neurobiol. 2010 Apr;90(4):385-417. 

Jacobs L. F. (2012). From chemotaxis to the cognitive map: the function of olfaction. Proc. Natl. Acad. Sci. U.S.A. 109(Suppl. 1) 10693–10700

Jacobs LF. How the evolution of air breathing shaped hippocampal function. Philos Trans R Soc Lond B Biol Sci. 2022 Feb 14;377(1844):20200532. 

Kang S, Hong SI, Lee J, Peyton L, Baker M, Choi S, Kim H, Chang SY, Choi DS. Activation of Astrocytes in the Dorsomedial Striatum Facilitates Transition From Habitual to Goal-Directed Reward-Seeking Behavior. Biol Psychiatry. 2020 Nov 15;88(10):797-808.

Konradi, C., Macı́as, W., Dudman, J.T. and Carlson, R.R., 2003. Striatal proenkephalin gene induction: coordinated regulation by cyclic AMP and calcium pathways. Molecular brain research, 115(2), pp.157-161.

Kornfeld, J., Januszewski, M., Schubert, P., Jain, V., Denk, W. and Fee, M.S., 2020. An anatomical substrate of credit assignment in reinforcement learning. BioRxiv, pp.2020-02.

Lahiri AK, Bevan MD. Dopaminergic Transmission Rapidly and Persistently Enhances Excitability of D1 Receptor-Expressing Striatal Projection Neurons. Neuron. 2020 Apr 22;106(2):277-290.e6. 

Ma L, Day-Cooney J, Benavides OJ, Muniak MA, Qin M, Ding JB, Mao T, Zhong H. Locomotion activates PKA through dopamine and adenosine in striatal neurons. Nature. 2022 Nov;611(7937):762-768. 

Marin AC, Schaefer AT, Ackels T. Spatial information from the odour environment in mammalian olfaction. Cell Tissue Res. 2021 Jan;383(1):473-483. 

Marinescu AM, Labouesse MA. The nucleus accumbens shell: a neural hub at the interface of homeostatic and hedonic feeding. Front Neurosci. 2024 Jul 30;18:1437210. 

Montgomery, John C., David Bodznick, and Kara E. Yopak. The cerebellum and cerebellum-like structures of cartilaginous fishes. Brain Behavior and Evolution 80.2 (2012): 152-165.

O’Donnell P, Grace AA. Synaptic interactions among excitatory afferents to nucleus accumbens neurons: hippocampal gating of prefrontal cortical input. J Neurosci. 1995 May;15(5 Pt 1):3622-39.

Poo C, Agarwal G, Bonacchi N, Mainen ZF. Spatial maps in piriform cortex during olfactory navigation. Nature. 2022 Jan;601(7894):595-599.

Ramirez F, Moscarello JM, LeDoux JE, Sears RM. Active avoidance requires a serial basal amygdala to nucleus accumbens shell circuit. J Neurosci. 2015 Feb 25;35(8):3470-7. 

Reed SJ, Lafferty CK, Mendoza JA, Yang AK, Davidson TJ, Grosenick L, Deisseroth K, Britt JP. Coordinated Reductions in Excitatory Input to the Nucleus Accumbens Underlie Food Consumption. Neuron. 2018 Sep 19;99(6):1260-1273.e4. 

Sesack SR, Grace AA. Cortico-Basal Ganglia reward network: microcircuitry. Neuropsychopharmacology. 2010 Jan;35(1):27-47.

Soares-Cunha C, de Vasconcelos NAP, Coimbra B, Domingues AV, Silva JM, Loureiro-Campos E, Gaspar R, Sotiropoulos I, Sousa N, Rodrigues AJ. Nucleus accumbens medium spiny neurons subtypes signal both reward and aversion. Mol Psychiatry. 2020 Dec;25(12):3241-3255. 

Suryanarayana, S. M., Perez-Fernandez, J., Robertson, B., & Grillner, S. (2021). Olfaction in lamprey pallium revisited—dual projections of mitral and tufted cells. Cell Reports, 34(1).

Walle R, Petitbon A, Fois GR, Varin C, Montalban E, Hardt L, Contini A, Angelo MF, Potier M, Ortole R, Oummadi A, De Smedt-Peyrusse V, Adan RA, Giros B, Chaouloff F, Ferreira G, de Kerchove d’Exaerde A, Ducrocq F, Georges F, Trifilieff P. Nucleus accumbens D1- and D2-expressing neurons control the balance between feeding and activity-mediated energy expenditure. Nat Commun. 2024 Mar 21;15(1):2543. 

Weiss, L., 2020. Information processing in the olfactory system of different amphibian species (Doctoral dissertation, Dissertation, Göttingen, Georg-August Universität, 2020).

Wilson, R.I. and Mainen, Z.F., 2006. Early events in olfactory processing. Annu. Rev. Neurosci. 29(1), pp.163-201.

Yu X, Taylor AMW, Nagai J, Golshani P, Evans CJ, Coppola G, Khakh BS. Reducing Astrocyte Calcium Signaling In Vivo Alters Striatal Microcircuits and Causes Repetitive Behavior. Neuron. 2018 Sep 19;99(6):1170-1187.e9.

Ventral Pallidum for Sustain and Timeout

Previous essays have used Pv (ventral pallidum) as part of the seek and avoidance circuit without exploring it in detail. For this essay, I’m revisiting Pv for two purposes: first, to check that the simulation’s seek and avoid model is compatible with scientific results about Pv, and second, to understand more about how those internal circuits work.

Timeouts are critical for the food-odor seek circuit to prevent the animal from getting stuck in a trap where it either can’t reach the food, or the food odor has no food. A timeout could simply disable seek and return to the default roaming random walk, or it could actively avoid the current area. When the seek times out, an active avoidance phase is more effective than returning to roaming, because the avoidance moves away from the current false cues and into a distant area more likely to have a new food source.

Diagram illustrating the seek and avoid circuit related to food detection, showing phases of roaming, detecting odor, seeking food, timing out, and avoiding false cues.
State machine for seeking food. When the animal detects an odor, it follows the odor gradient until the animal either finds food or an internal timeout shifts the seek to avoid.
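
The state machine in the figure can be sketched as a small transition function. This is a minimal illustration, not the simulation's actual code: the `Phase` names, the tick-based timing, and the `seek_timeout` and `avoid_duration` parameters are all assumptions made for the example.

```python
from enum import Enum, auto


class Phase(Enum):
    ROAM = auto()   # default random-walk search
    SEEK = auto()   # following an odor gradient
    AVOID = auto()  # moving away from a false cue


def next_phase(phase, ticks_in_phase, odor, food,
               seek_timeout=20, avoid_duration=10):
    """Return (new_phase, new_ticks_in_phase) for one tick.

    seek_timeout and avoid_duration are hypothetical tick counts.
    """
    if phase is Phase.ROAM:
        # Detecting an odor starts a seek; otherwise keep roaming.
        return (Phase.SEEK, 0) if odor else (Phase.ROAM, ticks_in_phase + 1)
    if phase is Phase.SEEK:
        if food:
            return Phase.ROAM, 0   # food found: seek ends (eating is not modeled)
        if ticks_in_phase + 1 >= seek_timeout:
            return Phase.AVOID, 0  # timeout: switch from seek to avoid
        return Phase.SEEK, ticks_in_phase + 1
    # AVOID: odor is ignored until the avoidance phase has run its course.
    if ticks_in_phase + 1 >= avoid_duration:
        return Phase.ROAM, 0
    return Phase.AVOID, ticks_in_phase + 1


def run(odor_trace, food_trace, **kwargs):
    """Run the state machine over per-tick odor and food observations."""
    phase, ticks = Phase.ROAM, 0
    history = []
    for odor, food in zip(odor_trace, food_trace):
        phase, ticks = next_phase(phase, ticks, odor, food, **kwargs)
        history.append(phase)
    return history
```

In a false odor plume (odor always present, food never found), `run([True] * 6, [False] * 6, seek_timeout=3, avoid_duration=2)` seeks for three ticks, avoids for two, and then returns to roaming.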

The simulation uses the basal ganglia as a timeout system, specifically Sv (ventral striatum) with Pv, which is interconnected with food-seek motivation based in H.l (lateral hypothalamus). The model uses Ado (adenosine) as a timeout neurotransmitter and S.d2 (striatum projection neuron with D2.i receptor) to signal the timeout. Essay 31 covered the adenosine-S.d2 system in more detail. Essentially, neural activity produces Ado from neurons and neighboring astrocytes. The Ado then activates A2a.s (adenosine G-s coupled receptors) on S.d2, which potentiates S.d2 and increases internal activity in a PKA (protein kinase A) activation chain. As Ado builds up over time, S.d2 activity increases until it triggers a switch from seek to avoid in Pv.
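
One way to picture the Ado build-up is as a leaky integrator: seek activity adds Ado each tick, clearance removes a fraction, and the timeout fires when the level crosses a threshold. The sketch below is only an illustration under those assumptions. The function name and the `production`, `decay`, and `threshold` values are hypothetical and are not taken from the simulation or from measured kinetics.

```python
def seek_timeout_tick(seek_active_ticks, production=1.0, decay=0.05,
                      threshold=10.0):
    """Return the first tick (0-based) where Ado crosses threshold, else None.

    Ado is modeled as a leaky integrator driven by ongoing seek activity.
    All parameter values are illustrative assumptions.
    """
    ado = 0.0
    for tick in range(seek_active_ticks):
        # Activity adds Ado; clearance removes a fraction of what is present.
        ado += production - decay * ado
        if ado >= threshold:
            return tick  # S.d2 drive is now strong enough to signal timeout
    return None
```

With these values Ado saturates at `production / decay = 20`, so a threshold above 20 never fires, and a brief seek ends before the threshold is reached. That is the intended behavior: only a sustained, unsuccessful seek times out.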

A flowchart illustrating the seek and timeout process in a neural simulation, showing the interactions between 'Ob', 'H.I seek', and 'R1.a' with a 'S.ot/Pv timeout' indicator.
The current simulation model uses the Sv/Pv to timeout seek motivation. H.l (lateral hypothalamus), Ob (olfactory bulb), Pv (ventral pallidum), R1.a (anterior hindbrain motor area), S.ot (olfactory tubercle portion of Sv).

The above diagram shows how the current simulation model uses Sv/Pv as a timeout. H.l (lateral hypothalamus) is responsible for seek motivation based on odor input from Ob (olfactory bulb), and it drives roaming search through R1.a (anterior hindbrain motor region). The basal ganglia, represented by S.ot (olfactory tubercle, an olfactory region of Sv) and Pv, serve as the timeout function. This essay aims to expand that simple model into a more accurate representation of the Sv/Pv timeout.

Seek and avoid

In neuroscience, seek and avoid are measured with RTPP (real-time place preference) and RTPA (real-time place avoidance) experiments, although these measurements are often interpreted as “valence” instead of actions. Circuits that produce RTPP could contribute to the seek action, and circuits that produce RTPA could contribute to avoidance. For example, Hb.lm (lateral habenula, medial part) produces RTPA when stimulated and RTPP when inhibited [Stamatakis et al 2016], and Sv, Pv, and H.l produce either RTPP or RTPA, depending on which neurons are stimulated. In Sv, S.d1 (striatum projection neuron with D1.s dopamine receptor) produces RTPP [Soares-Cunha et al 2020], [Tan et al 2024], while S.d2 produces RTPA [Bonnavion et al 2024], but only when stimulated for longer times [Soares-Cunha et al 2020]. Seek and avoidance are flipped between regions of Sv, specifically between S.msh.d (medial shell of Sv, dorsal) and S.msh.v (medial shell of Sv, ventral) [Yao Y et al 2021]. In Pv, glutamate neurons produce RTPA and GABA neurons produce RTPP [Stephenson-Jones et al 2020], which matches H.l, where glutamate produces RTPA [Stamatakis et al 2016] and GABA produces RTPP [Jennings et al 2015], [Siemian et al 2021].

Diagram illustrating the neural circuits involved in the seek and avoid behavior in the brain, showing connections between various components like S.ot, Pv, H.l, Ob, and R1.a.
Simplified seek and avoid timeout circuit. The seek circuit uses H.l as the subthalamic motor region to the R1.a anterior hindbrain motor region. The avoid circuit uses Hb.lm to V.rn raphe also to R1.a. The Sv and Pv basal ganglia switch between the circuits. H.l (lateral hypothalamus), Hb.lm (lateral habenula, medial part), Ob (olfactory bulb), Pv (ventral pallidum), R1.a (anterior hindbrain motor region), S.ot (olfactory tubercle), V.rn (raphe nuclei).

The above diagram shows a simplified timeout and avoid circuit. The blue arrows show the proposed timeout avoid path. The greyed arrows show related connections, which are either contextual or for other actions. For example, the H.l glutamate to Hb.lm avoidance is necessary for predator and toxin avoidance, such as a looming response from OT (optic tectum) [Lecca et al 2017] or pain responses from R.pb.l (lateral parabrachium) [Phua et al 2021]. Although H.l glutamate produces RTPA and also uses Hb.lm as an avoidance action path, it seems less likely to be a seek-timeout path. Because the Sv, Pv, and H.l circuit is also an eating circuit, some of the locomotion control is stopping to eat. Some of the Sv and Pv projections to H.l are eating circuits [Root et al 2015], and eating also inhibits Hb.lm avoidance [Hu H et al 2020] because the animal shouldn’t move away from its food.

Hb.lm is a key action node for avoidance, using V.rn (raphe nuclei) to drive avoidance. In zebrafish, this path is exclusively V.mr (median raphe) because the zebrafish Hb.lm only connects to V.mr [Agetsuma et al 2010]. In mammals, the target of Hb.lm is less clear cut because both V.mr and V.dr (dorsal raphe) receive Hb.lm output [Baker et al 2015] and could participate in avoidance.

Pv as a heterogeneous area

In this model, Pv is a key decision node. It receives seek-driving input from H.l and A.bl (basolateral amygdala) [Giardino et al 2018], [Heinsbroek et al 2020] and decision and timeout information from S.ot. Pv is defined by the projection of Sv, specifically using tac1 (tachykinin 1 for substance-p neurotransmitter), which S.d1 neurons exhibit. However, the neuron types and origins are heterogeneous [Ottenheimer et al 2024] and derive from neighboring regions. In part, Pv derives from Po.l (lateral preoptic area) and H.l neuron types, in part it derives from P.bst (bed nucleus of the stria terminalis, extended amygdala), in part it derives from Pd (globus pallidus external) [Ottenheimer et al 2024], and it has some functionality more similar to P.bf (basal forebrain), including ACh (acetylcholine) attention projections.

A diagram illustrating the connections and circuits involved in attention, avoidance, and decision-making within the brain, specifically highlighting the ventral pallidum (Pv), lateral hypothalamus (H.l), and other neural components.
Multiple circuits in Pv, including attention, avoidance, wake, seek, eat, selection, and feedback to Sv. A.bl (basolateral amygdala), H.l (lateral hypothalamus), H.stn (subthalamic nucleus), Hb.lm (lateral habenula, medial), P.epn (entopeduncular nucleus), Pv (ventral pallidum), Pv.a (anterior Pv), Pv.p (posterior Pv), Pv.dl (dorsolateral Pv), Pv.vm (ventromedial Pv), S.d1 (striatum projection neuron with D1.s receptor), S.d2 (striatum projection neuron with D2.i receptor), S.pv (striatum parvalbumin inhibitory neuron), Snr (substantia nigra pars reticulata).

The above diagram shows some of the difficulty of categorizing Pv functions by output projection. Pv ACh (acetylcholine) projections, particularly to A.bl, sustain attention, such as enabling odor seek [Kim R et al 2024], which is a P.bf function. Separate Pv GABA and glutamate projections to Hb.lm produce RTPP and RTPA, respectively [Stephenson-Jones et al 2020], which matches the H.l and Po.l function. Projections to H.l are more complex, producing wake [Luo YJ et al 2023] and eating [Palmer et al 2024]. Pv has choice-related output to Vta (ventral tegmentum) [Faget et al 2018], [Palmer et al 2024], which drives seek but is not motivational. Pv also has connections to the basal ganglia, similar to the S.d (dorsal striatum) and Pd (dorsal pallidum aka globus pallidus external) connections to H.stn (subthalamic nucleus), Snr (substantia nigra pars reticulata) and P.epn (entopeduncular nucleus aka globus pallidus internal) [Root et al 2015]. However, those Pd-like circuits are restricted to a particular part of Pv.dl (dorsolateral Pv). Finally, like Pd, Pv has “arkypallidal” feedback connections to Sv [Vachez et al 2021].

Decision: selection and commitment

Decision can be decomposed into a selection function and a commitment function. Selection chooses between competing options, such as left or right. Commitment ensures that the selection follows through and is not immediately distracted. Commitment is more important because without commitment, a selection isn’t a decision, while a random selection or a first-arriving selection is a workable decision. In a WTA (winner-take-all) process, the key part is the “take-all” part. Random take-all would also work. The commitment function needs a lockout function (“take-all”) but also a timeout function, each of which may be a separate circuit.
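
The split between selection and commitment can be illustrated with a small sketch in which selection is deliberately random and all of the decision behavior comes from lockout and timeout. The `Commitment` class, its tick-based `timeout_ticks`, and the seeded random generator are assumptions for illustration only. Here the timeout simply releases the lock so a new selection can happen.

```python
import random


class Commitment:
    """Random selection plus lockout and timeout (illustrative sketch)."""

    def __init__(self, timeout_ticks=5, rng=None):
        self.timeout_ticks = timeout_ticks   # hypothetical commitment length
        self.rng = rng or random.Random(0)   # seeded for repeatable examples
        self.choice = None
        self.ticks_left = 0

    def step(self, options):
        """Return the committed option for this tick.

        options: list of option names available on this tick.
        """
        if self.choice is not None and self.ticks_left > 0:
            # Lockout ("take-all"): new options are ignored while committed.
            self.ticks_left -= 1
            return self.choice
        # Timeout elapsed, or nothing selected yet: make a random selection.
        self.choice = self.rng.choice(options) if options else None
        self.ticks_left = self.timeout_ticks - 1 if self.choice is not None else 0
        return self.choice
```

With `timeout_ticks=3`, the first selection holds for three ticks even if the available options change, and only then can a different option win.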

A flow diagram illustrating the relationship between Sv (ventral striatum) and Pv (ventral pallidum) in a neural circuit, highlighting components like Vta select, Sv lockout, and Hb.lm timeout.
Possible circuit decomposition of decision between selection, lockout, and timeout. Hb.lm (lateral habenula, medial), Pv (ventral pallidum), Sv (ventral striatum), Vta (ventral tegmentum).

The above diagram shows a possible functional decomposition for Pv and decision-making. The Pv to Vta projection is important for the selection process [Palmer et al 2024]. More speculatively, the Pv feedback connection to Sv could provide a lockout function by inhibiting new selections through Sv. A similar circuit may exist in H.stn, which also projects directly to S.d [Williams 2024]. The Pv to Hb.lm projection is more clearly established as an avoidance pathway [Faget et al 2018].

One neuron, two functions

Although selection isn’t the focus of the essay, some learning theory results and some neuroscience measurements suggest that single S.d2 neurons may serve opposite roles: selecting an action, but then opposing that same action [Hodge and Yttri 2025], [Soares-Cunha et al 2020], or terminating the current activity [Tecuapetla et al 2016]. In the classical model of basal ganglia selection, S.d1 and S.d2 are oppositional: S.d1 promotes an action and S.d2 either opposes the action or promotes an opposite action [Bariselli et al 2019]. In the learning model where DA (dopamine) serves as a teaching signal, DA enhances selected actions when successful and suppresses unsuccessful actions. However, some scientists argue that this learning model doesn’t work for S.d2 if S.d1 and S.d2 are for selection with no other function [Lindsey et al 2025]. Some proposals to rescue the learning models include sustaining S.d2 activity after selection [Lindsey et al 2025].

Some prominent results show both S.d1 and S.d2 selecting the winning option [Cui G et al 2013], not opposing each other. However, studies consistently show that stimulating S.d1 produces contralateral turns but stimulating S.d2 produces ipsilateral turns [Conde-Berriozabal et al 2025], which is clearly oppositional. Possibly resolving this conflict, a short 1s S.d2 stimulation inhibits Pv and excites Vta, while a longer 2s stimulation excites Pv and inhibits Vta [Soares-Cunha et al 2020]. Another study shows that a short 350ms S.d2 stimulus does not produce RTPA, but a 2s S.d2 stimulus does produce RTPA [Hodge and Yttri 2025].

S.d2 neurons produce both GABA and the opioid enkephalin as neurotransmitters [Dai KZ et al 2022]. GABA is a fast neurotransmitter on the order of 3-5ms and only requires electrical AP (action potentials). Enkephalin is a much slower neuropeptide and is released when internal Ca2+ (calcium) and PKA (protein kinase A) levels have risen [Konradi et al 2003], [Hook et al 2008]. PKA levels rise in response to G-s protein coupled receptors like A2a.s (adenosine G-s coupled receptor). Enkephalin requires both action potentials and PKA, likely triggered by A2a.s. This A2a.s PKA signaling needs to overcome D2.i, which inhibits the PKA pathway. Technically, D2.i inhibits AC (adenylyl cyclase), which prevents cAMP accumulation, which prevents PKA activation. One result of this longer chain is that enkephalin signaling is much slower than GABA and is modulated by other neurotransmitters like DA and Ado.

This dual transmitter system means that a short stimulus might release GABA, while a longer stimulus would release enkephalin. In addition, S.d2 axons contain DOR.i (δ-opioid inhibitory receptor), which can self-inhibit its own GABA release [Steiner and Gerfen 1998]. The longer enkephalin path may disable the faster GABA path. Prolonged S.d2 stimulation produces RTPA and requires active DOR.i in Pv [Soares-Cunha et al 2020]. A similar oppositional fast vs slow transmitter system exists in the H.l to Vta connection, where GABA provides fast inhibition but a slower neurotensin neurotransmitter excites [Patterson et al 2015].

Diagram illustrating the functional decomposition of the ventral pallidum (Pv) and its role in timeout and decision-making circuits, involving interactions with the lateral habenula (Hb.lm) and ventral tegmental area (Vta).
Hypothetical fast and slow multiplexing circuit. The fast path uses GABA through Pv.g to activate DA in Vta for a selection. The slow path uses enkephalin to disinhibit an avoidance action path using Pv glutamate and Hb.lm. DA (dopamine), glu (glutamate), Hb.lm (lateral habenula, medial), Pv (ventral pallidum), Pv.g (Pv GABA neuron), S.d2 (striatum projection neuron with D2.i receptor), Vta (ventral tegmentum).

The above diagram shows a hypothetical fast and slow multiplexing circuit with GABA driving the fast selection path and enkephalin driving the slow avoidance path. The fast S.d2 GABA path disinhibits Vta by inhibiting a tonically active Pv GABA interneuron. The slow S.d2 enkephalin path inhibits a distinct tonically active Pv GABA interneuron, which disinhibits the Pv glutamate to Hb.lm avoidance path, and re-inhibits Vta DA. Re-inhibition of Vta DA serves as a lockout of subsequent decisions. Disinhibition of the Hb.lm avoidance enables timeout avoidance. With this temporal multiplexing system, a single S.d2 neuron can serve all three decision functions: selection, lockout, and timeout.
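
The temporal multiplexing idea can be sketched as a single S.d2 unit whose output switches from the fast GABA path to the slow enkephalin path once PKA has built up. This is a toy illustration of the hypothesis, not a biophysical model. The function name, the tick units, and the `pka_step` and `enk_threshold` values are assumptions, as is the simplification that DOR.i fully shuts off GABA release once enkephalin is present.

```python
def sd2_multiplex(stim_ticks, pka_step=0.25, enk_threshold=1.0):
    """Trace the two S.d2 output paths over a stimulation of stim_ticks ticks.

    Returns one dict per tick with:
      vta_da   - fast path active (GABA silences Pv.g, disinhibiting Vta DA)
      hb_avoid - slow path active (enkephalin disinhibits Pv.glu to Hb.lm)
    """
    pka = 0.0
    trace = []
    for _ in range(stim_ticks):
        pka += pka_step                 # A2a.s-driven PKA build-up (assumed linear)
        enk = pka >= enk_threshold      # slow enkephalin release once PKA is high
        gaba = not enk                  # assumed: DOR.i blocks fast GABA release
        trace.append({"vta_da": gaba, "hb_avoid": enk})
    return trace
```

A short stimulation such as `sd2_multiplex(2)` stays entirely on the selection path, while a longer one such as `sd2_multiplex(6)` starts with selection and then flips to avoidance with Vta DA re-inhibited, which is the lockout.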

Pv glutamate inputs vs tonic activity

The most prominent Pv inputs from Sv are inhibitory, which raises the question: what are they inhibiting? Either they inhibit an excitatory input or they inhibit tonically active neurons. So the glutamate inputs have an outsized importance, because without glutamate or tonic activity, the inhibition has nothing to work against.

In studying the Pv projection to Hb.lm, [Stephenson-Jones et al 2020] inhibited glutamate and GABA neurons to explore the tonic behavior. Inhibiting glutamate did not produce an effect, either RTPP or RTPA, and inhibiting GABA also did not produce an effect. This result suggests that the Pv output neurons are not tonically active, either from their own activity or other internal Pv activity. Without tonic activity, glutamate inputs are necessary to drive output.

The major glutamate inputs are from A.bl, H.l, and H.stn, but the H.stn input is specific to the Pd-like area in Pv.dl [Root et al 2015], so for the purpose of this essay I’m assuming H.stn is restricted to a specific Pv subarea with dorsal basal ganglia function and does not apply to the rest of Pv.

A diagram illustrating the neural connections involving the lateral hypothalamus (H.l), striatal projection neurons (S.d1 and S.d2), and the ventral pallidum (Pv), highlighting their roles in the seek and avoid circuits.
H.l glutamate as powering the Pv. Without H.l input the system is unpowered and has no output. Enk (enkephalin), Glu (glutamate), H.l.ox (lateral hypothalamus orexin), Hb.lm (lateral habenula, medial), Pv.g (ventral pallidum GABA output), Pv.glu (Pv glutamate), S.d1 (striatum projection neuron with D1.s dopamine receptor), S.d2 (striatum projection neuron with D2.i dopamine receptor).

The above diagram shows a hypothetical circuit using H.l.ox (orexin neurons of H.l) as a food search signal that drives both roaming random walk and directed, targeted seek. When the animal is not seeking food because it’s sated or eating, H.l.ox is silent, which unpowers the circuit. My choice of H.l.ox as a glutamate source is hypothetical. H.l has at least 17 glutamate populations [Wang Y et al 2021], including one that implements SLR (subthalamic locomotor region) [Ji C et al 2024], some that project to Hb.lm directly for aversion [Lecca et al 2017], as well as eating-related neurons, and H.l.ox.

I’ve used the enkephalin output from S.d2 because the Hb.lm is the avoidance circuit. The enkephalin receptor DOR.i (δ-opioid receptor) is coupled to inhibitory G-protein and acts primarily presynaptically but does act postsynaptically in Pv [Neuhofer and Kalivas 2023], [Rysztak and Jutkiewicz 2020]. In Pv, stimulating DOR.i inhibits 24% of Pv neurons and excites 13% [Root et al 2015]. In an alternative circuit, the S.d2 enkephalin-triggered DOR.i receptor is presynaptic on the glutamate input to Pv.g. Without that glutamate input, the Pv.g neuron falls silent, which disinhibits the Pv.glu path.
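
The "powering" idea and the presynaptic alternative can be combined in a small rate sketch: neither Pv neuron is tonically active, so both depend on H.l.ox glutamate, and S.d2 enkephalin removes the glutamate drive to Pv.g. The function name, the unit-scaled rates, and the rectified subtraction are assumptions for illustration. Treating Pv.g output as the seek side follows the RTPP results for Pv GABA neurons but is a simplification.

```python
def pv_circuit(hl_ox, sd2_enk):
    """Return seek and avoid drive from a non-tonic Pv circuit.

    hl_ox   - glutamate drive from H.l.ox (0 when sated or eating)
    sd2_enk - S.d2 enkephalin acting presynaptically (DOR.i) on the
              glutamate input to Pv.g
    """
    # Pv.g only fires if glutamate gets through the presynaptic DOR.i gate.
    pv_g = max(0.0, hl_ox - sd2_enk)
    # Pv.glu is driven by the same glutamate and inhibited by Pv.g.
    pv_glu = max(0.0, hl_ox - pv_g)
    return {"seek": pv_g, "avoid": pv_glu}
```

Three cases show the behavior: `pv_circuit(1.0, 0.0)` gives seek only, `pv_circuit(1.0, 1.0)` gives avoid only, and `pv_circuit(0.0, 1.0)` gives no output at all because the circuit is unpowered.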

A diagram showing neural pathways related to ventral pallidum circuits. The diagram is divided into four sections with representations of different neuron types and their interactions, including S.d1 neurons interacting with GABA and enkephalin, as well as their connections to the lateral habenula.
Several possible hypothetical slow RTPP and RTPA circuits, focusing on S.d1 opposition to RTPA. S.d1 could directly oppose Pv.glu avoidance with GABA, it could enhance inhibitory interneurons with substance P, or it could inhibit Pv.glu RTPA with dynorphin. Dyn (dynorphin opioid), enk (enkephalin opioid), glu (glutamate), H.l.ox (lateral hypothalamus orexin), Hb.lm (lateral habenula, medial), Pv (ventral pallidum), Pv.g (Pv GABA), Pv.glu (Pv glutamate), S.d1 (striatum projection neuron with D1.s receptor), S.d2 (striatum projection neuron with D2.i receptor), SP (substance-P neurotransmitter), tac1 (tachykinin 1 gene for SP).

Unfortunately, the exact details of the circuits aren’t known yet. It seems reasonable to assume that the S.d1 RTPP path opposes the S.d2 RTPA path using peptides or opioids instead of GABA, but S.d1 produces two additional outputs: the opioid dynorphin with its inhibitory KOR.i (κ-opioid receptor) and the peptide SP (substance P) with its excitatory NK1.q (neurokinin 1 with PLC/PKC path). Like enkephalin’s DOR.i receptor, dynorphin’s KOR.i is primarily presynaptic. The above diagram shows three hypothetical circuits, but other more complicated circuits are possible, including ones using more tonically active inhibitory GABA interneurons. In particular, S.d1 and S.d2 have auto-receptors for dynorphin and enkephalin respectively, which inhibit their own release of the opioids. Dynorphin is known to self-inhibit S.d1 neurons in Pv [Steiner and Gerfen 1998], which may be its main function. Although I’ve focused on S.d1 and S.d2 neurotransmitters for the slow circuit, another possibility is that a distinct internal Pv mechanism drives the slow avoidance circuit, independent of S.d2 enkephalin and S.d1 dynorphin or SP.

A.bl glutamate

I used H.l.ox as the source of glutamate above, but A.bl is also an important source of glutamate, and inhibiting A.bl can turn odor seek into avoid [Kim R et al 2024], which is exactly the situation here. A.bl is a cortical area, which means it’s more complicated, but has the advantage of supporting sustained, working-memory output. A.bl receives olfactory input from Ob and O.pir (piriform cortex) and outputs glutamate to Pv and to Sv. A.bl has both seek and avoid outputs with distinct projections [Sniffen et al 2024]. A.bl is necessary for conflicting seek and threat, but disabling A.bl does not prevent seek [Hernández-Jaramillo et al 2024]. In addition A.bl receives ACh input from Pv [Root et al 2015]. For this circuit, I’m using the A.bl seek output to serve the same function as H.l did in the previous description. Without A.bl seek input, the seek collapses and turns to avoidance [Kim R et al 2024].

Diagram illustrating the role of the A.bl region in glutamate signaling and its connections to various structures including the olfactory bulb (Ob), ventral pallidum (Pv), and habenula (Hb.lm).
Using A.bl as the primary glutamate source to power the Pv seek and avoidance circuit. A.bl itself is powered by ACh from Pv. A.bl (basolateral amygdala), ACh (acetylcholine), H.l (lateral hypothalamus), Hb.lm (lateral habenula, medial), Ob (olfactory bulb), P.bst (bed nucleus of the stria terminalis, extended amygdala), Pv (ventral pallidum), Pv.g (Pv GABA), Pv.glu (Pv glutamate), Sa (central amygdala), S.d1 (striatum projection neuron with D1.s receptor), S.d2 (striatum projection neuron with D2.i receptor), Sv (ventral striatum).

The ACh input from Pv to A.bl is important to sustaining attention. ACh acts on m1.q (ACh metabotropic G-q coupled receptor) in the A.bl PY (pyramidal) neurons [Unal et al 2015]. Activating m1.q puts the PY neurons into sustained excitation with an ADP (after-depolarization potential) after they receive both ACh and an AP (action potential) [Unal et al 2015]. The ADP puts the PY neuron into an Up state for 7-10 seconds, meaning it’s more easily activated by inputs than in its base state. Essentially, ACh converts A.bl firing into working memory or sustained attention.

The Pv ACh neuron inputs include H.l, Sv, Sa (central amygdala), and P.bst (bed nucleus of the stria terminalis, extended amygdala) [Schlingloff et al 2025]. This ACh modulation gives another opportunity to control seek to an odor target. An initial odor detection on the order of 500ms might only trigger sustained seek if ACh is activated by a food-seek drive from H.l and not suppressed by Sv, Sa, or P.bst. Working memory or sustained attention for the odor would require food motivation and an absence of habituation.
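
The ACh gating of sustained attention can be sketched as a coincidence detector with a decaying hold: an odor-driven spike only starts an Up state if ACh is present at the same time. The function names, the one-second tick, and the choice of `up_ticks=8` (picked from inside the 7-10 second range above) are assumptions for illustration. The boolean suppression inputs are also a simplification.

```python
def ach_active(hl_food_drive, sv_suppress=False, sa_suppress=False,
               bst_suppress=False):
    """ACh is on when H.l food-seek drive is present and nothing suppresses it."""
    return hl_food_drive and not (sv_suppress or sa_suppress or bst_suppress)


def abl_sustain(odor, ach, up_ticks=8):
    """Return per-tick flags for the A.bl Up state.

    odor, ach - equal-length lists of booleans, one entry per tick
    up_ticks  - assumed Up-state duration in ticks (1 tick = 1 second here)
    """
    remaining = 0
    up = []
    for odor_spike, ach_on in zip(odor, ach):
        if odor_spike and ach_on:
            remaining = up_ticks  # ACh and a spike coincide: start or refresh the ADP
        up.append(remaining > 0)
        if remaining > 0:
            remaining -= 1
    return up
```

A single odor detection with ACh present, `abl_sustain([True] + [False] * 9, [True] * 10)`, holds the Up state for eight ticks after the odor is gone. The same odor with ACh absent produces no sustained state.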

Simulation

The main seek path is almost entirely disconnected from the Pv timeout circuitry discussed in the essay. The main seek path is a short, fast path from Ob to V.pt (posterior tuberculum) to MLR (midbrain locomotor region) to R5.rs (mid-hindbrain reticulospinal turning area), represented by Ob to MidSeek to HindMove.

A flowchart illustrating the pathway from the olfactory bulb (Ob) to a central decision-making node ('MidSeek') that connects to the midbrain locomotor region (V.pt-MLR) and subsequently to the hindbrain region (R5.rs) for movement control.
Simulation model for the direct seek path. Ob (olfactory bulb), MLR (midbrain locomotor region), R5.rs (mid-hindbrain reticulospinal motor), V.pt (posterior tuberculum).

An earlier simulation model used S.d as a timeout for an OT orientation circuit, but the S.d lacks the direct avoidance action that Sv has with Hb.lm. However, the Pv and Hb.lm circuit is almost entirely disconnected from the V.pt-MLR, which means that the Pv modulation is quite convoluted.

A diagram illustrating a neural circuit model showing connections between various brain regions, including the olfactory bulb (Ob), midbrain (MidSeek), ventral pallidum (Pv), and areas involved in seeking and avoiding behaviors.
Convoluted avoidance path from PvSeek through HbAvoid to suppress the MidSeek action. A.bl (basolateral amygdala), Ob (olfactory bulb), Pv (ventral pallidum), R1.a (anterior hindbrain motor region), R5.rs (mid-hindbrain motor region), S.ot (olfactory tubercle).

In the avoid circuit, HbAvoid is the key avoidance node, which PvSeek uses for avoidance. An avoidance action needs to stop the ongoing action and enable a reversal of the current seek direction. In the simulation, MidSeek can reverse its direction if it receives an avoid signal. However, I don’t know if any midbrain circuit can reverse direction with an external modulating signal. The most plausible path is from V.mr as the main target of Hb.lm.

If this seek-to-avoid reversal circuit does exist, it might exist in OT, which handles both seek and avoid, is used for general left vs right decisions, and receives V.rn input. But for the sake of this essay, I’m avoiding the complexity of revisiting OT and instead assuming that MidSeek can reverse direction on its own.

An alternative is more of a switchboard configuration, where avoidance disables the seek path and enables an odor avoidance path. In animals like the lamprey and fish, Ob directly drives Hb.m for odor chemotaxis, although that path does not exist for mammals, because hippocampus output drives their Hb.m. Using that switchboard model, Pv would use V.rn as the controller to switch between the V.pt seek circuit and the Hb.m odor avoidance chemotaxis. V.rn is essentially part of the Hb.m and R1.a motor circuit, and can reach essentially the entire brain with its serotonin and non-serotonin projections.

Hysteresis

The simulation raised the problem of hysteresis again, this time partially because of its simplified PKA and enkephalin model. The simulation uses a single threshold for deciding to avoid: avoidance starts when PKA and enkephalin rise above that threshold. Unfortunately, when avoidance occurs, the simulation immediately decays the PKA, which drops it below the threshold, curtailing the avoidance and allowing the animal to reenter the failed odor plume. Because the simulation is a program, this problem could be easily fixed by adding a second, lower threshold to disable avoidance, but how could Pv accomplish this hysteresis?
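As a minimal sketch of the programmatic fix, the following compares a single threshold against a two-threshold latch. The class name, the threshold values, and the sample PKA/enkephalin levels are all hypothetical and are not taken from the essay simulation.

```python
from dataclasses import dataclass


@dataclass
class AvoidLatch:
    """Two-threshold latch: avoidance starts high and ends low (hypothetical values)."""
    on_threshold: float = 0.8   # assumed level that starts avoidance
    off_threshold: float = 0.3  # assumed lower level that ends avoidance
    avoiding: bool = False

    def update(self, level: float) -> bool:
        # Start avoiding only when the level crosses the upper threshold.
        if not self.avoiding and level >= self.on_threshold:
            self.avoiding = True
        # Stop avoiding only when the level falls to the lower threshold.
        elif self.avoiding and level <= self.off_threshold:
            self.avoiding = False
        return self.avoiding


# Illustrative PKA/enkephalin trace: rises during failed seek, decays after avoidance starts.
levels = [0.2, 0.5, 0.85, 0.7, 0.5, 0.35, 0.25, 0.2]

# Single threshold: avoidance is curtailed as soon as the level decays.
single = [level >= 0.8 for level in levels]

# Two thresholds: avoidance persists until the level falls to the lower threshold.
latch = AvoidLatch()
double = [latch.update(level) for level in levels]
```

With the single threshold, avoidance lasts one tick. With the latch, it persists through the decay, which gives the animal time to leave the failed plume.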

One solution could have Pv blocking any new decision to seek an odor. The S.d2 fast selection phase could be inhibited by low levels of enkephalin. When a new odor triggers S.d2, it would release some level of enkephalin because of the remaining PKA, which might be enough to block a new decision. An alternative solution could use the ACh to A.bl attention circuit. If the lower enkephalin level is still high enough to block ACh attention, it would block a new seek action. This A.bl solution would work especially well if A.bl habituates to an odor when it has no ACh.

References

Agetsuma M, Aizawa H, Aoki T, Nakayama R, Takahoko M, Goto M, Sassa T, Amo R, Shiraki T, Kawakami K, et al. The habenula is crucial for experience-dependent modification of fear responses in zebrafish. Nat Neurosci. 2010;13:1354-1356.

Baker PM, Mathis V, Lecourtier L, Simmons SC, Nugent FS, Hill S, Mizumori SJY. Lateral Habenula Beyond Avoidance: Roles in Stress, Memory, and Decision-Making With Implications for Psychiatric Disorders. Front Syst Neurosci. 2022 Mar 3;16:826475. 

Bariselli S, Fobbs WC, Creed MC, Kravitz AV. A competitive model for striatal action selection. Brain Res. 2019 Jun 15;1713:70-79. 

Bonnavion, P., Varin, C., Fakhfouri, G., Martinez Olondo, P., De Groote, A., Cornil, A., Lorenzo Lopez, R., Pozuelo Fernandez, E., Isingrini, E., Rainer, Q. and Xu, K., 2024. Striatal projection neurons coexpressing dopamine D1 and D2 receptors modulate the motor function of D1-and D2-SPNs. Nature neuroscience, 27(9), pp.1783-1793.

Conde-Berriozabal, S., Sitja-Roqueta, L., García-García, E., García-Gilabert, L., Sancho-Balsells, A., Fernandez-García, S., Rodriguez-Urgellés, E., Giralt, A., Castañé, A., Rodríguez, M.J. and Alberch, J., 2025. Differential impact of optogenetic stimulation of direct and indirect pathways from dorsolateral and dorsomedial striatum on motor symptoms in Huntington’s disease mice. Experimental Neurology, 383, p.114991.

Cui G, Jun SB, Jin X, Pham MD, Vogel SS, Lovinger DM, Costa RM. Concurrent activation of striatal direct and indirect pathways during action initiation. Nature. 2013 Feb 14;494(7436):238-42. 

Dai, K.Z., Choi, I.B., Levitt, R., Blegen, M.B., Kaplan, A.R., Matsui, A., Shin, J.H., Bocarsly, M.E., Simpson, E.H., Kellendonk, C. and Alvarez, V.A., 2022. Dopamine D2 receptors bidirectionally regulate striatal enkephalin expression: Implications for cocaine reward. Cell reports, 40(13).

Faget, L., Oriol, L., Lee, W.C., Zell, V., Sargent, C., Flores, A., Hollon, N.G., Ramanathan, D. and Hnasko, T.S., 2024. Ventral pallidum GABA and glutamate neurons drive approach and avoidance through distinct modulation of VTA cell types. Nature Communications, 15(1), p.4233.

Giardino WJ, Eban-Rothschild A, Christoffel DJ, Li SB, Malenka RC, de Lecea L. Parallel circuits from the bed nuclei of stria terminalis to the lateral hypothalamus drive opposing emotional states. Nat Neurosci. 2018 Aug;21(8):1084-1095. 

Heinsbroek JA, Bobadilla AC, Dereschewitz E, Assali A, Chalhoub RM, Cowan CW, Kalivas PW. Opposing Regulation of Cocaine Seeking by Glutamate and GABA Neurons in the Ventral Pallidum. Cell Rep. 2020 Feb 11;30(6):2018-2027.e3.

Hernández-Jaramillo, A., Illescas-Huerta, E. and Sotres-Bayon, F., 2024. Ventral pallidum and amygdala cooperate to restrain reward approach under threat. Journal of Neuroscience, 44(23).

Hodge, A. and Yttri, E., 2025. Striatal modulation supports context-specific reinforcement and not action selection. Cell Reports, 44(8).

Hook, V., Toneff, T., Baylon, S. and Sei, C., 2008. Differential activation of enkephalin, galanin, somatostatin, NPY, and VIP neuropeptide production by stimulators of protein kinases A and C in neuroendocrine chromaffin cells. Neuropeptides, 42(5-6), pp.503-511.

Hu, H., Cui, Y. and Yang, Y., 2020. Circuits and functions of the lateral habenula in health and in disease. Nature Reviews Neuroscience, 21(5), pp.277-295.

Jennings JH, Ung RL, Resendez SL, Stamatakis AM, Taylor JG, Huang J, Veleta K, Kantak PA, Aita M, Shilling-Scrivo K, Ramakrishnan C, Deisseroth K, Otte S, Stuber GD. Visualizing hypothalamic network dynamics for appetitive and consummatory behaviors. Cell. 2015 Jan 29;160(3):516-27.

Ji, C., Zhang, Y., Lin, Z., Zhao, Z., Jiao, Z., Zheng, Z., Shi, X., Wang, X., Li, Z., Yu, S. and Qu, Y., 2024. Activation of hypothalamic-pontine-spinal pathway promotes locomotor initiation and functional recovery after spinal cord injury in mice. bioRxiv, pp.2024-11.

Kim, R., Ananth, M.R., Desai, N.S., Role, L.W. and Talmage, D.A., 2024. Distinct subpopulations of ventral pallidal cholinergic projection neurons encode valence of olfactory stimuli. Cell reports, 43(4).

Konradi, C., Macías, W., Dudman, J.T. and Carlson, R.R., 2003. Striatal proenkephalin gene induction: coordinated regulation by cyclic AMP and calcium pathways. Molecular brain research, 115(2), pp.157-161.

Lecca S, Meye FJ, Trusel M, Tchenio A, Harris J, Schwarz MK, Burdakov D, Georges F, Mameli M. Aversive stimuli drive hypothalamus-to-habenula excitation to promote escape behavior. Elife. 2017 Sep 5;6:e30697.

Lindsey JW, Markowitz J, Gillis WF, Datta SR, Litwin-Kumar A. Dynamics of striatal action selection and reinforcement learning. Elife. 2025 May 8;13:RP101747.

Luo, Y.J., Ge, J., Chen, Z.K., Liu, Z.L., Lazarus, M., Qu, W.M., Huang, Z.L. and Li, Y.D., 2023. Ventral pallidal glutamatergic neurons regulate wakefulness and emotion through separated projections. Iscience, 26(8).

Neuhofer, D. and Kalivas, P., 2023. Differential modulation of GABAergic and glutamatergic neurons in the ventral pallidum by GABA and neuropeptides. Eneuro, 10(7).

Ottenheimer, D.J., Simon, R.C., Burke, C.T., Bowen, A.J., Ferguson, S.M. and Stuber, G.D., 2024. Single-cell sequencing of rodent ventral pallidum reveals diverse neuronal subtypes with non-canonical interregional continuity. BioRxiv, pp.2024-03.

Palmer D, Cayton CA, Scott A, Lin I, Newell B, Paulson A, Weberg M, Richard JM. Ventral pallidum neurons projecting to the ventral tegmental area reinforce but do not invigorate reward-seeking behavior. Cell Rep. 2024 Jan 23;43(1):113669. 

Patterson CM, Wong JM, Leinninger GM, Allison MB, Mabrouk OS, Kasper CL, Gonzalez IE, Mackenzie A, Jones JC, Kennedy RT, Myers MG Jr. Ventral tegmental area neurotensin signaling links the lateral hypothalamus to locomotor activity and striatal dopamine efflux in male mice. Endocrinology. 2015 May;156(5):1692-700. 

Phua SC, Tan YL, Kok AMY, Senol E, Chiam CJH, Lee CY, Peng Y, Lim ATJ, Mohammad H, Lim JX, Fu Y. A distinct parabrachial-to-lateral hypothalamus circuit for motivational suppression of feeding by nociception. Sci Adv. 2021 May 7;7(19):eabe4323. 

Root DH, Melendez RI, Zaborszky L, Napier TC. The ventral pallidum: Subregion-specific functional anatomy and roles in motivated behaviors. Prog Neurobiol. 2015 Jul;130:29-70. 

Rysztak, L.G. and Jutkiewicz, E.M., 2022. The role of enkephalinergic systems in substance use disorders. Frontiers in Systems Neuroscience, 16, p.932546.

Schlingloff, D., Szabó, Í., Gulyás, É., Király, B., Kispál, R., Stephenson-Jones, M. and Hangya, B., 2025. Most ventral pallidal cholinergic neurons are cortically projecting bursting basal forebrain cholinergic neurons. bioRxiv, pp.2025-02.

Siemian JN, Arenivar MA, Sarsfield S, Aponte Y. Hypothalamic control of interoceptive hunger. Curr Biol. 2021 Sep 13;31(17):3797-3809.e5.

Soares-Cunha C, de Vasconcelos NAP, Coimbra B, Domingues AV, Silva JM, Loureiro-Campos E, Gaspar R, Sotiropoulos I, Sousa N, Rodrigues AJ. Nucleus accumbens medium spiny neurons subtypes signal both reward and aversion. Mol Psychiatry. 2020 Dec;25(12):3241-3255. 

Stamatakis AM, Van Swieten M, Basiri ML, Blair GA, Kantak P, Stuber GD. Lateral Hypothalamic Area Glutamatergic Neurons and Their Projections to the Lateral Habenula Regulate Feeding and Reward. J Neurosci. 2016 Jan 13;36(2):302-11. 

Steiner, H. and Gerfen, C.R., 1998. Role of dynorphin and enkephalin in the regulation of striatal output pathways and behavior. Experimental brain research, 123(1), pp.60-76.

Stephenson-Jones M, Bravo-Rivera C, Ahrens S, Furlan A, Xiao X, Fernandes-Henriques C, Li B. Opposing Contributions of GABAergic and Glutamatergic Ventral Pallidal Neurons to Motivational Behaviors. Neuron. 2020 Mar 4;105(5):921-933.e5. 

Tan, B., Browne, C.J., Nöbauer, T., Vaziri, A., Friedman, J.M. and Nestler, E.J., 2024. Drugs of abuse hijack a mesolimbic pathway that processes homeostatic need. Science, 384(6693), p.eadk6742.

Tecuapetla F, Jin X, Lima SQ, Costa RM. Complementary Contributions of Striatal Projection Pathways to Action Initiation and Execution. Cell. 2016 Jul 28;166(3):703-715. 

Unal, C.T., Pare, D. and Zaborszky, L., 2015. Impact of basal forebrain cholinergic inputs on basolateral amygdala neurons. Journal of Neuroscience, 35(2), pp.853-863.

Vachez YM, Tooley JR, Abiraman K, Matikainen-Ankney B, Casey E, Earnest T, Ramos LM, Silberberg H, Godynyuk E, Uddin O, Marconi L, Le Pichon CE, Creed MC. Ventral arkypallidal neurons inhibit accumbal firing to promote reward consumption. Nat Neurosci. 2021 Mar;24(3):379-390.

Wang Q, Sun RY, Hu JX, Sun YH, Li CY, Huang H, Wang H, Li XM. Hypothalamic-hindbrain circuit for consumption-induced fear regulation. Nat Commun. 2024 Sep 4;15(1):7728.

Williams, M., 2024. Study of the network involving the subthalamic nucleus in various measures of motivation in rats (Doctoral dissertation, Aix-marseille université).

Yao Y, Gao G, Liu K, Shi X, Cheng M, Xiong Y, Song S. Projections from D2 Neurons in Different Subregions of Nucleus Accumbens Shell to Ventral Pallidum Play Distinct Roles in Reward and Aversion. Neurosci Bull. 2021 May;37(5):623-640. 

42: Optic Tectum Decision

The key element of decision making is the commitment to an action: all-or-none, both sustaining a decision and locking out competing actions. In contrast, the choice part of decision-making is less important. Even a simple random choice or first choice is an effective decision mechanism, but without the all-or-none commitment the decision isn’t a decision. The key is ensuring that once a choice is made, the animal sticks to that choice. The losing action should not interfere with the winner. Specifically, a decision suppresses dithering: switching between competing actions [Redgrave et al 1999].

This essay uses wall-following from essay 41 as the decision. If the animal detects a wall to the right, it will follow that wall for a time, improving search over random walk by reducing the search to a single dimension. In this case, the choice is left or right. That choice particularly matters because it requires communication between the two sides of the brain, and because commissures are relatively rare, that communication requires specific circuits.

Decision: two-phase commit

Decision-making has two main components: the choice (preparation) and the commitment. A decision that doesn’t sustain or that doesn’t lock out competing stimuli isn’t a decision. Decision-making can split into a preparation / selection phase, which compares options, taking time if necessary, followed by a winner-take-all commit phase where the winning action goes forward and any losing action is locked out.

Decision as a two-phase process

Most decision research focuses on the preparation phase because people are interested in choosing A vs B. Much less research covers the commitment implementation: the timeout and lockout. For the essay simulation, the commitment is more important and needs to be implemented first. Without the commitment, a losing option can continually interrupt the animal, distracting it from its goals. The requirements for commitment are something like:

  • Sustain
  • Timeout: prevent the sustain from becoming perseveration
  • Lockout: prevent competing actions
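The three requirements above can be sketched as a small state machine. This is only an illustration under assumed names and an assumed tick-based timeout. It is not the essay simulation’s implementation.

```python
from typing import Optional


class Commitment:
    """Hypothetical commit phase: sustain, timeout, and lockout in discrete ticks."""

    def __init__(self, timeout_ticks: int = 3):
        self.timeout_ticks = timeout_ticks  # assumed commitment length
        self.action: Optional[str] = None   # currently committed action
        self.elapsed = 0                    # ticks since the commitment started

    def step(self, proposal: Optional[str]) -> Optional[str]:
        if self.action is not None:
            self.elapsed += 1
            if self.elapsed >= self.timeout_ticks:
                # Timeout: release the commitment to prevent perseveration.
                self.action = None
                self.elapsed = 0
            # Sustain and lockout: the proposal is ignored while committed.
            return self.action
        if proposal is not None:
            # Choice: here simply the first proposal wins.
            self.action = proposal
            self.elapsed = 0
        return self.action


commit = Commitment(timeout_ticks=3)
proposals = ["left", "right", None, "right", "right", None]
trace = [commit.step(p) for p in proposals]
```

In the trace, the "right" proposal at the second tick is locked out, the "left" action sustains through a tick with no input, and the timeout releases the commitment before "right" can win.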

Orientation and wall-following

For a decision, this essay uses wall-following (thigmotaxis), continuing from essay 41. Wall-following needs to be treated as a decision beyond a single swimming cycle. Consider the alternative where a lateral-line sense is a simple sensory-action reflex for each swimming cycle. Without a longer conception of the decision, the animal can’t avoid perseveration: it would circle a pillar or a convex arena endlessly. Any timeout needs to curtail wall-following, not right turns. Similarly, without persistence the animal might alternate left and right wall-following in a crowded environment, where both the left and right lateral lines indicate obstacles. Of the two, the timeout issue is more critical, but the ability to continue an action is necessary to enable a “win-stay” strategy.

Because the research hasn’t located the circuit for wall following, these essays need to choose a brain location for it. The previous essay proposed R1.a (anterior hindbrain) as the driver of thigmotaxis, but an alternative uses OT (optic tectum) as an orientation center for thigmotaxis. Because OT receives lateral-line input via M.ts (torus semicircularis / inferior colliculus) [Zeymer et al 2018], it has the sensory information needed to turn toward a wall. OT is known as an orientation center. When a surprising or salient sensation appears, the animal turns toward it. If we imagine the proto-vertebrate as non-cortical, then that orienting is likely driven by OT. Even in mammals with a strongly developed cortex that can provide orientation functionality, OT is perhaps the strongest [Schall 2019].

Two architectures for orientation decisions. On the left, the orientation sustains its own decisions and is directly modulated by timeout. On the right, a separate module is responsible for sustaining the decision, possibly incorporating motor efference copies as part of the sustain system.

One immediate question is if the orientation also implements sustain. Compare the left and right model. In the left model, the orientation system implements sustain itself, driving and sustaining a turn action. The right model has a distinct sustain system, which may be driven by motor efference copies. The known connectivity of OT could support either model.

OT has independent sustaining capabilities, in part due to sodium channel modulation [Ghitani et al 2016], [Thompson AC and Aizenman 2023]. OT also has a loop with T.pf (parafascicular thalamus) and S.d (dorsal striatum), which can implement the timeout using A2a.s (adenosine G-s coupled stimulatory receptor) and adenosine accumulation. Although the left model is possible, several studies report that OT burst neurons activate at the decision point [Lintz et al 2019], [Stine et al 2023], with ramping neurons active before the decision and action [Munoz and Wurtz 1995], [Lintz et al 2019], not after the decision, suggesting the model on the right. Ppt (pedunculopontine tegmental nucleus) is well-suited as a central node of the sustain role. Ppt maintains activity from one decision to another in tasks that repeat decisions [Thompson JA et al 2016]. It also receives widespread input from hindbrain motor areas, including R1.a and R5.my.gi (medulla gigantocellular chx10 turning neurons) [Huerta-Ocampo et al 2021].

If wall-following uses the midbrain orientation circuitry, then a combination of OT and Ppt is plausible, following the model on the right. Note, though, that the sustain may not be only Ppt, but could also include other anterior hindbrain systems like R1.a, V.rn (serotonin Raphé nuclei), and possibly R.ip (interpeduncular nucleus), because all of these are associated with the brainstem sustained attention network [Alves et al 2022].

S.nr tri-value logic

S.nr (substantia nigra pars reticulata) is a key player in commitment. S.nr provides tonic suppression over essentially every voluntary action. For this essay, consider S.nr as a tri-value logic. The tonic, middle level of S.nr allows ongoing actions to continue, but inhibits starting a new action. A low S.nr value, the classic disinhibition model, allows new actions to start. A high S.nr value stops ongoing actions. The tonic level itself might be adjustable. For decision-making, this tri-value S.nr can support the commitment requirements of sustain, timeout (Stop), and lockout (passive inhibition with overriding Go).

S.nr as a tri-value system. The tonic activation allows sustain of an ongoing action. A Go signal disinhibits an action, allowing it to start. A Stop signal inhibits an ongoing action.

In the diagram above, the tonic S.nr inhibits new actions, but ongoing actions would continue, or a sufficiently strong sense input could start an action. An explicit Go signal would allow a weak sensory input to start an action. An explicit Stop signal would stop all action, regardless of the sense strength. The adjustable tonic level can be influenced by sleep and wake pressure, since S.nr.m is associated with sleep [Liu D et al 2020]. As the animal grows tired, a higher tonic S.nr would discourage new actions and encourage stopping of sustained actions, but the sleep pressure would not outright prevent action.
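The tri-value logic described above can be written as a small truth function. This is a sketch: the enum names and the weak and strong sensory thresholds are assumptions chosen for illustration, as is the choice to let an ongoing action continue under Go.

```python
from enum import Enum


class Snr(Enum):
    GO = "go"        # low S.nr: disinhibition, new actions can start
    TONIC = "tonic"  # middle S.nr: ongoing actions continue
    STOP = "stop"    # high S.nr: ongoing actions halt


def action_active(snr: Snr, ongoing: bool, sense: float,
                  weak: float = 0.2, strong: float = 0.8) -> bool:
    """Return True if the action runs this tick (thresholds are hypothetical)."""
    if snr is Snr.STOP:
        # Stop halts everything, regardless of sensory strength.
        return False
    if snr is Snr.GO:
        # Go lets even a weak sensory input start an action.
        return ongoing or sense >= weak
    # Tonic: sustain ongoing actions, but only a strong input starts a new one.
    return ongoing or sense >= strong
```

A sleep-pressure adjustment of the tonic level could be modeled by raising the `strong` threshold, although that mapping is my assumption, not something the cited studies specify.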

In the context of decision commitment, disabling S.nr eliminates orientation selectivity: mice are unable to resist orienting to any object in the whisker field [Redgrave et al 1999]. In PD (Parkinson’s disease) an overly-active S.nr produces bradykinesia (slow movements) and akinesia (lack of voluntary movement), and inhibiting S.nr can reduce akinesia and bradykinesia [Hu Y et al 2023], [Lin C et al 2024]. However, an underactive S.nr can produce dyskinesia (twisted postures) where opposing actions activate at the same time, and stimulating S.nr can reduce dyskinesia [Hu Y et al 2023].

Action sustain and timeout

An immediate consequence of commitment is timeout. Without a timeout, an unending commitment can lock an animal into a decision forever. A previous essay already covered a possible timeout circuit using adenosine as a timing neurotransmitter. The S.d2 (striatum with D2 dopamine-receptor) projection neurons have A2a.s (adenosine G-s coupled stimulatory) receptors, and some studies use A2a.s receptors to identify the S.d2 neurons, naming them S.a2a as opposed to the S.d2 convention in the essays. Adenosine builds up around active neurons, partially produced by astrocytes that monitor glutamate activity [Ma et al 2022]. This adenosine progressively activates S.d2 neurons, which stops action using the indirect path.

Basal ganglia timeout circuit. Ongoing left wall-following provides an efference copy to S.d2. As time continues, adenosine enables S.d2, and eventually activates the indirect path to stop the left action. H.stn (subthalamic nucleus), P.ge (globus pallidus, external), S.d2 (striatum D2-receptor projection neuron), S.nr (substantia nigra pars reticulata), T.pf (parafascicular thalamus).

The above diagram shows a possible timeout circuit for thigmotaxis following a left wall. During the left wall-following, an efference copy via T.pf (parafascicular thalamus) drives glutamate to S.d2 in the striatum. Sustained glutamate in S.d2 produces adenosine, which progressively activates S.d2, which then inhibits the current action using the indirect path of P.ge (external globus pallidus) to H.stn (subthalamic nucleus) to S.nr to stop the left action.
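A minimal sketch of the timing idea, assuming a leaky accumulator for adenosine: the efference drive adds adenosine each tick, adenosine clears at a fixed rate, and S.d2 fires the stop when the level crosses a threshold. The function name and all constants are hypothetical, not measured values.

```python
from typing import Optional


def ticks_until_timeout(drive: float = 1.0, gain: float = 0.1,
                        decay: float = 0.02, threshold: float = 2.0,
                        max_ticks: int = 1000) -> Optional[int]:
    """Ticks of sustained action before adenosine-driven S.d2 stops it.

    Returns None if the drive is too weak for adenosine to ever reach threshold.
    """
    adenosine = 0.0
    for tick in range(1, max_ticks + 1):
        # Sustained glutamate adds adenosine; clearance removes a fraction.
        adenosine += gain * drive - decay * adenosine
        if adenosine >= threshold:
            return tick
    return None


default_timeout = ticks_until_timeout()
strong_timeout = ticks_until_timeout(drive=2.0)
weak_timeout = ticks_until_timeout(drive=0.3)
```

In this sketch a stronger efference drive times out sooner, and a drive weak enough that adenosine plateaus below threshold never times out. Whether the biological circuit behaves this way is an open assumption.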

Note that the P.ge / H.stn circuit is complex and oscillatory. This path isn’t necessarily a straight chain as the above diagram would suggest. For example, S.d2 to H.stn / P.ge can switch the mode from irregular, unsynchronized firing to a regular oscillation [Terman et al 2002], or switch from a gamma (~80Hz) to a beta (~20Hz) frequency [Wang Y et al 2024].

Interrupting sustained action

Sometimes sustained actions need to be interrupted, either for dramatic reasons like a predator attack or more mundane situations like stubbing a toe. These interrupts need to be fully general, halting the current action, no matter which action path happens to be active. Note the similarity to sleep, where sleep needs to halt any action.

Two ways of halting action are a direct halt signal or a deadman’s switch. In a deadman’s switch, a tonic signal maintains normal behavior, and the absence of the signal stops the action. The brain uses this pattern with several instances of high-affinity Gi (G-protein coupled inhibitory) receptors that saturate in normal, tonic activity, but disengage when the neurotransmitter drops. In particular the D2.i (dopamine G-i inhibitory) receptor is high affinity and quickly saturating: it is fully active at normal tonic levels of dopamine and only shuts off when dopamine levels drop.

Adding a dopamine deadman’s switch to the timeout circuit. An interruption drops DA, which immediately activates the timeout circuit from S.d2. H.stn (subthalamic nucleus), P.ge (external globus pallidus), S.d2 (striatum D2 projection neuron), S.nr (substantia nigra pars reticulata), T.pf (parafascicular thalamus), V.da (midbrain dopamine), V.rmtg (rostromedial tegmental nucleus).

The diagram above adds a dopamine deadman’s switch to the timeout circuit. Tonic dopamine from V.da (midbrain dopamine) normally inhibits S.d2, allowing for a normal timeout. Because D2.i is a high affinity receptor, a low tonic level of dopamine activates it and quickly saturates the receptor. When dopamine drops below a threshold, the D2.i receptor will deactivate and disinhibit the S.d2 neuron, which rapidly fires the timeout using the indirect path, stopping the action. Dopamine will drop if V.rmtg (rostromedial tegmental nucleus) activates. V.rmtg is activated by pain or itch sensations and by many more general failure or disappointment systems.
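The deadman’s switch can be sketched with a simple receptor-occupancy curve. The saturating form dopamine / (dopamine + kd) is a standard single-site binding approximation that I’m assuming here, and the kd, inhibition weight, threshold, and drive values are hypothetical.

```python
def d2_occupancy(dopamine: float, kd: float = 0.05) -> float:
    """Fraction of D2.i receptors bound; a small kd models high affinity."""
    return dopamine / (dopamine + kd)


def sd2_stops_action(drive: float, dopamine: float,
                     inhibition: float = 1.0, threshold: float = 0.5) -> bool:
    """True if S.d2 fires the indirect-path stop.

    drive: combined efference and adenosine excitation of S.d2 (assumed scale).
    """
    net = drive - inhibition * d2_occupancy(dopamine)
    return net >= threshold


# Moderate drive, tonic dopamine: D2.i is saturated and S.d2 stays quiet.
tonic = sd2_stops_action(drive=0.6, dopamine=1.0)
# Halving dopamine barely changes occupancy, so the action still continues.
halved = sd2_stops_action(drive=0.6, dopamine=0.5)
# A dopamine drop (for example from V.rmtg) releases S.d2 immediately.
dropped = sd2_stops_action(drive=0.6, dopamine=0.0)
# With tonic dopamine, accumulated adenosine eventually wins: the normal timeout.
late = sd2_stops_action(drive=1.5, dopamine=1.0)
```

Because the receptor saturates, modest dopamine fluctuations do nothing and only a near-complete drop trips the switch, which is the property that makes it work as a deadman’s switch in this sketch.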

Lockout and the Sprague effect

The commitment phase needs to lock out alternative distractors. I haven’t found any research on this specific scenario. Decision research generally studies artificial forced-choice scenarios, where each choice is separated by several seconds from another choice, and is forced by a single decision point, like a T-maze or Y-maze, or turning left or right from a central cue port. The design of the typical experiment removes the scenario of sequential, continuous choices. Because of the lack of direct studies, the following discussion is more speculative. Attention is a related, but distinct research area to decision-making. Sustained attention is similar to this commitment issue.

The Sprague effect is related to OT attention. In mammals OT receives excitatory input from C.vis (visual cortex). If the left C.vis is lesioned, the animal will ignore items in the contralateral, right visual field. Paradoxically, a following lesion to the contralateral, right OT will restore attention to the neglected right visual field [Gambrill et al 2018], [Gebhardt et al 2019], [Jiang et al 2003], [Krauzlis et al 2013]. Further studies have shown this effect with the second lesion to the tectal commissure [Gambrill et al 2018] or to the specific area of contralateral S.nr [Krauzlis et al 2013] or to the entire contralateral Ppt [Valero-Cabré et al 2020].

In frogs there is a direct OT to contralateral OT connection. Unilateral OT lesion impairs bilateral visual behavior regardless of looming direction [Gambrill et al 2018]. In contrast, unilateral OT lesion produces a behavior deficit only in the lesioned hemifield [Gambrill et al 2018].

Possible Sprague effect circuit, showing lockout of contralateral wall-following. OT.d (deep layers of optic tectum), Ppt.a (anterior pedunculopontine tegmental nucleus), Ppt.p (posterior Ppt), S.nr (substantia nigra pars reticulata).

The above diagram shows a potential circuit for the Sprague effect. (For simplicity, the diagram straightens out the crossed output to the motor system.) Once a decision to follow a right wall has been made, the ongoing motor action sends an efference copy to Ppt [Caggiano et al 2018], which projects to S.nr [Durmer and Rosenquist 2001], which inhibits the contralateral OT.d [Durmer and Rosenquist 2001]. Similarly, a motor efference copy to Ppt also projects to the ipsilateral OT.d [Valero-Cabré et al 2020], which enhances attention to continue following the right wall.

This Ppt sustained attention circuit is similar to the R.is (nucleus isthmus / parabigeminal) circuit in fish [Henriques et al 2019] and birds [Marín et al 2007], covered in essay 19. Like Ppt, R.is has both ACh and GABA components, although in Ppt the components are mixed salt-and-pepper, while R.is has distinct nuclei. Ppt and R.is are sibling areas, both generated from the same progenitors in R1 (hindbrain rhombomere 1), but one is generated before the other [Morello et al 2020]. R.is is better understood because it has simpler connectivity than Ppt. When zebrafish hunt paramecia, R.is sustains attention to a target prey and inhibits attention to other visual areas [Henriques et al 2019]. R.is provides a similar effect in birds [Knudsen 2011], [Marín et al 2007], [Mysore and Knudsen 2011], [Reynaert et al 2023].

Although Ppt has much more complicated connectivity and function, like R.is, it has reciprocal connectivity with OT. Ppt is also active during actions, is highly heterogeneous, and is connected with much of the hindbrain motor system, both R.pn (pons, anterior hindbrain) and R.my (medulla, central and posterior hindbrain). Like R.is, Ppt provides ACh attention to OT [Isa et al 2021], [Mena-Segovia et al 2008], [Mena-Segovia et al 2017], [Krauzlis et al 2013], [Wolf et al 2015] and, as the Sprague studies show, it inhibits the contralateral OT via S.nr.

At the time of choice, many Ppt neurons reflect the previous action and outcome. Ppt lesions reduce the influence of recent experience on action selection. The Ppt ACh input to OT may act as a Bayesian prior [Thompson JA et al 2016].

Passive lockout

An alternative to an active lockout of alternative actions is a passive lockout, which inhibits actions without needing input from a sustain system. Once an action commits, the passive lockout prevents new actions. A possible passive lockout involves the H.stn / P.ge pair, which is hyperactive in PD. Akinesia like that of PD is exactly what’s needed for passive lockout.

H.stn / P.ge as a passive lockout system. H.stn (subthalamic nucleus), P.ge (external globus pallidus), S.nr (substantia nigra pars reticulata).

In the above circuit, H.stn and P.ge form a spontaneously oscillating circuit at beta frequencies. In PD, this circuit is hyperactive, oscillating at beta frequencies, providing broad movement inhibition [Fischer et al 2017]. This circuit drives S.nr, which inhibits the action.

This description vastly oversimplifies the P.ge / H.stn circuit. The P.ge / H.stn circuit can operate in at least two modes: inhibitory at beta frequencies (H.stn exciting S.nr), and excitatory at gamma (P.ge inhibiting S.nr) [Fischer et al 2017], [Terman et al 2002]. H.stn also has distinct subregions, with H.stn.vm (ventromedial) as almost an extension of H.l [Haynes and Haber 2013], while H.stn.l has distinct functionality [Baunez and Lardeux 2011], [Pasquereau and Turner 2017].

Studies seem to divide on whether H.stn is suitable for a commitment function. H.stn activity terminates at onset of movement [Espinosa-Parrilla et al 2013], which would argue against passive lockout. H.stn gamma increases during movement for humans [Fischer et al 2017], but others point out that H.stn beta are brief bursts, not sustained [Feingold et al 2015], and H.stn beta in humans is active for acute stopping [Wessel et al 2016].

A second passive lockout is in the striatum itself, discouraging new actions by default. S.pn (striatum projection neurons), both S.d1 (D1 receptor S.pn) and S.d2, are hyperpolarized, making them harder to drive than most neurons. Secondarily, new actions are inhibited by the feedforward, fast-spiking S.pv (parvalbumin) neurons, which inhibit S.d1 and S.d2 before they can be activated. S.pv activates before S.d1 and S.d2 [Gage et al 2010], [Lee C et al 2019], [O’Hare et al 2017], [Yim et al 2011].

Striatum passive lockout using the inhibitory S.pv neurons. Once an action starts, an endocannabinoid subcircuit disinhibits the action, allowing sustained activity. eCB (endocannabinoid neurotransmitter), S.d1 (striatum D1-receptor neurons), S.d2 (striatum D2-receptor neurons), S.pv (striatum parvalbumin inhibitory neuron), T.pf (parafascicular thalamus).

This S.pv inhibition is suppressed by sustained action using a retrograde eCB (endocannabinoid) system that disinhibits both S.d1 and S.d2 by inhibiting GABA release from S.pv [Narushima et al 2006], [Adermark et al 2009], [Mathur and Lovinger 2012].

Active initialization

A passive lockout system needs to be paired with an active initialization. If new actions are passively inhibited by default, a new action needs extra effort to cross the barrier. Possible active initialization nodes include OT, Ppt, as well as the S.d1 direct path.

The following diagram shows a possible active initialization. The passive lockout subcircuit is the same as before. The active initialization would logically use the S.d1 path. Phasic dopamine activates the Go path by enabling both the S.d1 input and the S.d1 output, because D1.s receptors are on the inputs to S.d1 and on the S.d1 axons in S.nr. This enables the direct path to inhibit S.nr, which disinhibits the action.

Explicit active Go circuit to override the passive lockout circuit. H.stn (subthalamic nucleus), LL (lateral-line), OT (optic tectum), P.ge (external globus pallidus), Ppt (pedunculopontine tegmentum), S.d1 (D1-receptor striatum), S.nr (substantia nigra pars reticulata), T.pf (parafascicular thalamus), V.da (midbrain dopamine).

This Go circuit is for wall-following, which uses the lateral-line as a wall distance sensor. The lateral line sense is input to the OT orientation circuit, which excites both the S.d1 path via T.pf and the V.da path, which will add a phasic DA burst to enhance the S.d1 circuit, giving it an extra boost to overcome the barriers.

Note that OT / Ppt also inhibits contralateral OT as described in the Sprague effect system, and the OT to V.da excitation is ipsilateral, but OT also inhibits the contralateral V.da via V.rmtg [Pradel et al 2021]. So this circuit is also part of the active lockout system.

Consider the striatum passive lockout circuit again, which discouraged new actions but enabled sustained actions. That passive lockout implies the necessity of an extra push for new actions. Phasic dopamine bursts could provide that extra push. A burst of dopamine activates the low-affinity D1.s receptors in S.d1, allowing S.d1 to override its intrinsic hyperpolarization and the feedforward S.pv inhibition and initiate a new action.

Striatum passive lockout circuit with sustain from eCB disinhibition and new actions enabled by DA burst. DA (dopamine), eCB (endocannabinoid), S.d1 (D1-receptor striatum projection neuron), S.d2 (D2-receptor striatum projection neuron), S.pv (striatum parvalbumin inhibiting interneuron), T.pf (parafascicular thalamus).
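As a rough illustration of the lockout, sustain, and DA-burst logic, here is a minimal Python sketch. It is not the essay's simulation code. The class name `LockoutUnit` and all constants are hypothetical, chosen only so that a plain input fails to start an action, a DA burst starts it, and eCB disinhibition then sustains it.

```python
from dataclasses import dataclass


@dataclass
class LockoutUnit:
    """Toy S.d1 unit with passive lockout. All constants are illustrative."""
    rest_bias: float = -1.0   # intrinsic hyperpolarization of S.pn
    pv_gain: float = 0.8      # feedforward S.pv inhibition per unit of drive
    da_boost: float = 2.5     # extra drive from a phasic DA burst
    ecb_decay: float = 0.5    # how quickly eCB disinhibition fades when idle
    ecb: float = 0.0          # eCB level, 0 (none) to 1 (S.pv fully suppressed)

    def step(self, drive: float, da_burst: bool = False) -> bool:
        # eCB suppresses GABA release from S.pv, weakening the lockout
        pv_inhibition = self.pv_gain * drive * (1.0 - self.ecb)
        net = drive + self.rest_bias - pv_inhibition
        if da_burst:
            net += self.da_boost
        active = net > 0.0
        # An active unit releases eCB; otherwise the eCB level decays
        self.ecb = 1.0 if active else self.ecb * self.ecb_decay
        return active


unit = LockoutUnit()
blocked = unit.step(2.0)                    # new action without DA: locked out
started = unit.step(2.0, da_burst=True)     # DA burst crosses the barrier
sustained = unit.step(2.0)                  # eCB disinhibition sustains it
```

With these assumed constants, the same drive that fails on the first step succeeds on the third, because the action is already running.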

Pretectum suppression of OT

This essay uses the aquatic-only lateral-line for thigmotaxis. Thigmotaxis is an interesting system because the animal must be weakly attracted to the wall but simultaneously repelled by the wall to avoid collision. M.pt (pretectum) is an obstacle avoidance system in the midbrain and OT is an orienting system. In non-mammalian vertebrates (reptiles, birds, frogs), the striatum projects directly to M.pt, which inhibits OT [Krauzlis et al 2018]. If the animal gets too close to the wall, M.pt should inhibit the OT orientation and avoid the wall, but if the animal is far enough from the wall, it should approach the wall with thigmotaxis, suppressing the avoidance circuit.

Thigmotaxis balancing attraction from OT orientation and avoidance from M.pt obstacle avoidance. LL (lateral line), M.pt (pretectum), OT (optic tectum), S (striatum), T.pf (parafascicular thalamus).

The above circuit shows this potential thigmotaxis circuit. Normally, M.pt avoids the wall and suppresses OT to keep any OT orientation from running into the wall. But during thigmotaxis, the S circuit will suppress M.pt to allow the animal to get closer to the wall.
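The balance between OT attraction and M.pt avoidance can be sketched as a single turn decision. This is a minimal sketch under my own assumptions: the function name, the three distance thresholds, and the idea that striatal suppression simply shrinks the M.pt avoidance range are illustrative, not taken from the simulation code.

```python
def thigmotaxis_turn(wall_distance, wall_side, thigmotaxis_on,
                     sense_range=3.0, avoid_range=2.0, too_close=0.5):
    """Return a turn direction: wall_side (+1 or -1) means toward the wall,
    -wall_side means away from it, and 0 means no wall-related turn."""
    if wall_distance > sense_range:
        return 0  # lateral line does not sense the wall
    # Assumption: S suppression of M.pt shrinks the avoidance range
    mpt_threshold = too_close if thigmotaxis_on else avoid_range
    if wall_distance < mpt_threshold:
        return -wall_side  # M.pt suppresses OT and turns away from the wall
    return wall_side       # OT orients toward the wall


# Wall on the right (+1) at distance 1.0
print(thigmotaxis_turn(1.0, +1, thigmotaxis_on=True))   # 1: approach the wall
print(thigmotaxis_turn(1.0, +1, thigmotaxis_on=False))  # -1: avoid the wall
```

In this sketch the animal settles near whichever threshold is active, so thigmotaxis moves the balance point closer to the wall without removing collision avoidance.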

Simulation

The simulation divides thigmotaxis into several systems. An obstacle system roughly corresponds to M.pt and keeps the animal from running into a wall. An orientation system provides an attractive drive toward the wall. These two systems are now designed as independent and general, where sensory input is external. For example, the lateral line drives both the obstacle and orient systems, but the code for the obstacle and orient systems is ignorant of the lateral line itself.

Outline of simulation modules for lateral-line thigmotaxis.

The decision commitment uses a loop with a sustain module and a striatum module. The sustain module roughly corresponds to Ppt with possible associated areas like V.dr and R1.a, because the simulation is more abstract than directly implementing each neural ganglion. The striatum module provides an adenosine-like timeout function. The sustain module also provides an active inhibitory lockout function, following the Sprague effect studies.

References

Adermark L, Talani G, Lovinger DM. Endocannabinoid-dependent plasticity at GABAergic and glutamatergic synapses in the striatum is regulated by synaptic activity. Eur J Neurosci. 2009 Jan;29(1):32-41.

Alves PN, Forkel SJ, Corbetta M, Thiebaut de Schotten M. The subcortical and neurochemical organization of the ventral and dorsal attention networks. Commun Biol. 2022 Dec 7;5(1):1343.

Baunez C, Lardeux S. Frontal cortex-like functions of the subthalamic nucleus. Front Syst Neurosci. 2011 Oct 11;5:83. 

Caggiano V, Leiras R, Goñi-Erro H, Masini D, Bellardita C, Bouvier J, Caldeira V, Fisone G, Kiehn O. Midbrain circuits that set locomotor speed and gait selection. Nature. 2018 Jan 25;553(7689):455-460. 

Durmer JS, Rosenquist AC. Ibotenic acid lesions in the pedunculopontine region result in recovery of visual orienting in the hemianopic cat. Neuroscience. 2001;106(4):765-81. 

Espinosa-Parrilla JF, Baunez C, Apicella P. Linking reward processing to behavioral output: motor and motivational integration in the primate subthalamic nucleus. Front Comput Neurosci. 2013 Dec 17;7:175. 

Feingold J, Gibson DJ, DePasquale B, Graybiel AM. Bursts of beta oscillation differentiate postperformance activity in the striatum and motor cortex of monkeys performing movement tasks. Proc Natl Acad Sci U S A. 2015 Nov 3;112(44):13687-92.

Fischer P, Pogosyan A, Herz DM, Cheeran B, Green AL, Fitzgerald J, Aziz TZ, Hyam J, Little S, Foltynie T, Limousin P, Zrinzo L, Brown P, Tan H. Subthalamic nucleus gamma activity increases not only during movement but also during movement inhibition. Elife. 2017 Jul 25;6:e23947.

Gage GJ, Stoetzner CR, Wiltschko AB, Berke JD. Selective activation of striatal fast-spiking interneurons during choice execution. Neuron. 2010 Aug 12;67(3):466-79. 

Gambrill AC, Faulkner RL, Cline HT. Direct intertectal inputs are an integral component of the bilateral sensorimotor circuit for behavior in Xenopus tadpoles. J Neurophysiol. 2018 May 1;119(5):1947-1961.

Gebhardt C, Auer TO, Henriques PM, Rajan G, Duroure K, Bianco IH, Del Bene F. An interhemispheric neural circuit allowing binocular integration in the optic tectum. Nat Commun. 2019 Nov 29;10(1):5471.

Ghitani N, Bayguinov PO, Basso MA, Jackson MB. A sodium afterdepolarization in rat superior colliculus neurons and its contribution to population activity. J Neurophysiol. 2016 Jul 1;116(1):191-200.

Haynes WI, Haber SN. The organization of prefrontal-subthalamic inputs in primates provides an anatomical substrate for both functional specificity and integration: implications for Basal Ganglia models and deep brain stimulation. J Neurosci. 2013 Mar 13;33(11):4804-14. 

Henriques, P.M., Rahman, N., Jackson, S.E. and Bianco, I.H., 2019. Nucleus isthmi is required to sustain target pursuit during visually guided prey-catching. Current Biology, 29(11), pp.1771-1786.

Hu, Y., Ma, T.C., Alberico, S.L., Ding, Y., Jin, L. and Kang, U.J., 2023. Substantia Nigra pars reticulata projections to the pedunculopontine nucleus modulate dyskinesia. Movement Disorders, 38(10), pp.1850-1860.

Huerta-Ocampo I, Dautan D, Gut NK, Khan B, Mena-Segovia J. Whole-brain mapping of monosynaptic inputs to midbrain cholinergic neurons. Sci Rep. 2021 Apr 27;11(1):9055.

Isa T, Marquez-Legorreta E, Grillner S, Scott EK. The tectum/superior colliculus as the vertebrate solution for spatial sensory integration and action. Curr Biol. 2021 Jun 7;31(11):R741-R762. 

Jiang H, Stein BE, McHaffie JG. Opposing basal ganglia processes shape midbrain visuomotor activity bilaterally. Nature. 2003 Jun 26;423(6943):982-6. 

Knudsen EI. Evolution of neural processing for visual perception in vertebrates. J Comp Neurol. 2020 Dec 1;528(17):2888-2901. 

Krauzlis RJ, Lovejoy LP, Zénon A. Superior colliculus and visual spatial attention. Annu Rev Neurosci. 2013 Jul 8;36:165-82. 

Krauzlis RJ, Bogadhi AR, Herman JP, Bollimunta A. Selective attention without a neocortex. Cortex. 2018 May;102:161-175.

Lee, C. R., Yonk, A. J., Wiskerke, J., Paradiso, K. G., Tepper, J. M., & Margolis, D. J. (2019). Opposing influence of sensory and motor cortical input on striatal circuitry and choice behavior. Current Biology, 29(8), 1313-1323.

Lin, C., Ridder, M., Zhong, J., Albornoz, E.A., Sedlak, P., Xu, L., Woodruff, T.M., Chen, F. and Sah, P., 2024. Modulation of pedunculopontine input to the basal ganglia relieves motor symptoms in Parkinsonian mice. bioRxiv, pp.2024-03.

Lintz MJ, Essig J, Zylberberg J, Felsen G. Spatial representations in the superior colliculus are modulated by competition among targets. Neuroscience. 2019 Jun 1;408:191-203. 

Liu, D., Li, W., Ma, C., Zheng, W., Yao, Y., Tso, C. F., … & Dan, Y. (2020). A common hub for sleep and motor control in the substantia nigra. Science, 367(6476), 440-445.

Ma L, Day-Cooney J, Benavides OJ, Muniak MA, Qin M, Ding JB, Mao T, Zhong H. Locomotion activates PKA through dopamine and adenosine in striatal neurons. Nature. 2022 Nov;611(7937):762-768.

Marín, G., Salas, C., Sentis, E., Rojas, X., Letelier, J.C. and Mpodozis, J., 2007. A cholinergic gating mechanism controlled by competitive interactions in the optic tectum of the pigeon. Journal of Neuroscience, 27(30), pp.8112-8121.

Mathur BN, Lovinger DM. Endocannabinoid-dopamine interactions in striatal synaptic plasticity. Front Pharmacol. 2012 Apr 19;3:66. 

Mena-Segovia J, Sims HM, Magill PJ, Bolam JP. Cholinergic brainstem neurons modulate cortical gamma activity during slow oscillations. J Physiol. 2008 Jun 15;586(12):2947-60. 

Mena-Segovia, Juan, and J. Paul Bolam. Rethinking the pedunculopontine nucleus: from cellular organization to function. Neuron 94.1 (2017): 7-18.

Morello F, Borshagovski D, Survila M, Tikker L, Sadik-Ogli S, Kirjavainen A, Estartús N, Knaapi L, Lahti L, Törönen P, Mazutis L, Delogu A, Salminen M, Achim K, Partanen J. Molecular Fingerprint and Developmental Regulation of the Tegmental GABAergic and Glutamatergic Neurons Derived from the Anterior Hindbrain. Cell Rep. 2020 Oct 13;33(2):108268. 

Munoz DP, Wurtz RH. Saccade-related activity in monkey superior colliculus. I. Characteristics of burst and buildup cells. J Neurophysiol. 1995 Jun;73(6):2313-33. 

Mysore SP, Knudsen EI. The role of a midbrain network in competitive stimulus selection. Curr Opin Neurobiol. 2011 Aug;21(4):653-60. 

Narushima M, Uchigashima M, Hashimoto K, Watanabe M, Kano M. Depolarization-induced suppression of inhibition mediated by endocannabinoids at synapses from fast-spiking interneurons to medium spiny neurons in the striatum. Eur J Neurosci. 2006 Oct;24(8):2246-52. 

O’Hare JK, Li H, Kim N, Gaidis E, Ade K, Beck J, Yin H, Calakos N. Striatal fast-spiking interneurons selectively modulate circuit output and are required for habitual behavior. Elife. 2017 Sep 5;6:e26231.

Pasquereau B, Turner RS. A selective role for ventromedial subthalamic nucleus in inhibitory control. Elife. 2017 Dec 4;6:e31627. 

Pradel K, Drwiȩga G, Błasiak T. Superior Colliculus Controls the Activity of the Rostromedial Tegmental Nuclei in an Asymmetrical Manner. J Neurosci. 2021 May 5;41(18):4006-4022. 

Redgrave, Peter, Tony J. Prescott, and Kevin Gurney. The basal ganglia: a vertebrate solution to the selection problem?. Neuroscience 89.4 (1999): 1009-1023.

Reynaert B, Morales C, Mpodozis J, Letelier JC, Marín GJ. A blinking focal pattern of re-entrant activity in the avian tectum. Curr Biol. 2023 Jan 9;33(1):1-14.e4. 

Schall JD. Accumulators, Neurons, and Response Time. Trends Neurosci. 2019 Dec;42(12):848-860. doi: 10.1016/j.tins.2019.10.001. 

Stine GM, Trautmann EM, Jeurissen D, Shadlen MN. A neural mechanism for terminating decisions. Neuron. 2023 Aug 16;111(16):2601-2613.e5. 

Terman D, Rubin JE, Yew AC, Wilson CJ. Activity patterns in a model for the subthalamopallidal network of the basal ganglia. J Neurosci. 2002 Apr 1;22(7):2963-76. 

Thompson JA, Costabile JD, Felsen G. Mesencephalic representations of recent experience influence decision making. Elife. 2016 Jul 25;5:e16572.

Thompson AC, Aizenman CD. Characterization of Na+ currents regulating intrinsic excitability of optic tectal neurons. Life Sci Alliance. 2023 Nov 2;7(1):e202302232. 

Valero-Cabré A, Toba MN, Hilgetag CC, Rushmore RJ. Perturbation-driven paradoxical facilitation of visuo-spatial function: Revisiting the ‘Sprague effect’. Cortex. 2020 Jan;122:10-39.

Wang Y, Wang L, Manssuer L, Zhao YJ, Ding Q, Pan Y, Huang P, Li D, Voon V. Subthalamic stimulation causally modulates human voluntary decision-making to stay or go. NPJ Parkinsons Dis. 2024 Nov 2;10(1):210.

Wessel JR, Jenkinson N, Brittain JS, Voets SH, Aziz TZ, Aron AR. Surprise disrupts cognition via a fronto-basal ganglia suppressive mechanism. Nat Commun. 2016 Apr 18;7:11195. 

Wolf AB, Lintz MJ, Costabile JD, Thompson JA, Stubblefield EA, Felsen G. An integrative role for the superior colliculus in selecting targets for movements. J Neurophysiol. 2015 Oct;114(4):2118-31. 

Yim, M. Y., Aertsen, A., & Kumar, A. (2011). Significance of input correlations in striatal function. PLoS Computational Biology, 7(11), e1002254.

Zeymer M, von der Emde G, Wullimann MF. The Mormyrid Optic Tectum Is a Topographic Interface for Active Electrolocation and Visual Sensing. Front Neuroanat. 2018 Oct 1;12:79. doi: 10.3389/fnana.2018.00079.

Essay 37: Odor neighborhood

Let’s revisit the striatum timeout from essay 31: striatum LTD, where food seeking used the striatum as a timeout to avoid perseveration. Without the timeout, the animal continues to seek toward the odor source even if the food is missing. This essay extends the timeout with an odor context: a cached set of locations to avoid until the timeout expires, as opposed to avoiding all locations until the timeout expires. This odor neighborhood resembles the olfactory spatial hypothesis [Jacobs 2012], which considers olfaction as primarily a navigation sense. The added specificity to failed-seek avoidance improves search for other nearby food sources.

Recap of essay 31: striatum LTD

The food seek logic from essay 31 has two search states: a general roaming search and an odor seek. If the odor seek times out, the animal avoids the current area to prevent perseveration. Essay 35 on hippocampal sequences explored using a sequence to specify the avoidance timeout.

Foraging state machine with two search modes: a general roaming search and a target-specific seek.

For the timeout, the essay uses the S.v (ventral striatum aka nucleus accumbens) to suppress failed food seeking [Lafferty et al 2020]. Without the S.v timeout, the animal perseverates at the seek task and gets stuck in the center of the odor plume.

Circuit of S.v for timing out a failed food seek. Adenosine drives a ramping timeout signal that reduces motivation by switching from the seek path via V.pt to the avoidance path via Hb.l. Ad (adenosine), Hb (habenula), Ob (olfactory bulb), Pv (ventral pallidum), S.d1 (striatum projection neuron with D1 dopamine receptor), S.d2 (striatum projection neuron with D2 dopamine receptor), V.pt (posterior tuberculum – Vta/Snc)

The above diagram shows the essay 31 circuit, largely based on the lamprey. V.pt (posterior tuberculum) is a locomotion hub that receives a direct signal from Ob.m (medial olfactory bulb) and drives downstream motor areas [Derjean et al 2010]. Hb.l (lateral habenula) drives place avoidance. S.v (ventral striatum) drives the timeout selection in Pv (ventral pallidum). Ad (adenosine) is the timeout variable, which increases as neural activity in S.d1 (striatum D1 projection neuron) and S.d2 (striatum D2 projection neuron) continues. Adenosine is a byproduct of ATP (adenosine triphosphate) energy production, and is also a gliotransmitter from astrocytes that monitor synapse activity. Essentially, the S.v subcircuit in red acts as a timeout for the main seek circuit.

Importantly, because the essay 31 timeout only uses the seek odor itself as a key, it can’t distinguish spatially distinct odors, such as different flowers for a honeybee.

Neighborhood odor as context

Because essay 31 only used the Ob seek odor as a signal, a timeout of that odor locks out all food search for that odor. That lockout may be long because the S.d2 LTD (long term depression) recovery time is on the order of 20 to 60 minutes. Consider an analogy to a bee searching a field of flowers for nectar. If one flower is missing nectar, the bee should give up on that flower, but it shouldn’t abandon the entire task until a 60 minute timer expires.

Odor neighborhoods with food odor plumes. Each colored area is an odor neighborhood and each cloud is an odor plume. Only the starred areas contain food.

In the above diagram, the stars represent food locations and the clouds represent food odor plumes. Odor plumes without food are false odors. The colors of the regions represent odor neighborhoods, where non-food odors distinguish the areas. Suppose the animal first searches in the dark orange area and fails to find food. If it next reaches the green area with the star, the timeout from the failed orange search will block the search unless the timeout is specific to the orange neighborhood.

Olfactory spatial hypothesis

The olfactory spatial hypothesis argues that a primary function for olfaction is navigation, as opposed to simply providing identification [Jacobs 2012]. This navigation-centric idea is fleshed out in the parallel map theory, which argues that the hippocampus is primarily organized around two maps: a bearing map using gradients to distant odor landmarks, and a sketch map with local landmark cues [Jacobs and Schenk 2003]. The parallel map theory associates the distant bearing map with E.dg (dentate gyrus of the hippocampus) and the local sketch map with E.ca1 (CA1 region of the hippocampus).

The current essay uses the broad idea of the olfactory spatial hypothesis and the idea of a local olfactory neighborhood. The olfactory neighborhood provides a context to restrict the striatum timeout. Functionally it resembles the local sketch map, but it’s not strictly speaking a map, only a cache of failed locations.

Lamprey dual odor path

The lamprey is a useful animal model because it represents the older jawless vertebrates that preceded the development of the jaw and the majority of more complex vertebrates and because it has a simpler brain. In the lamprey, Ob.m directly drives locomotion via V.pt (posterior tuberculum), which is homologous to the mammalian midbrain dopamine areas Vta (ventral tegmental area) and Snc (substantia nigra pars compacta). Unlike the mammalian dopamine areas, the lamprey V.pt drives locomotion directly to MLR (midbrain locomotor region) and R.rs (reticulospinal motor neurons) [Beauséjour et al 2020].

The rest of the lamprey Ob drives the pallium (cortex) and subpallium (basal ganglia). Unlike the mammalian Ob which only drives specific olfactory cortical areas, the lamprey Ob broadly connects to the entire pallium [Derjean et al 2010], [Suryanarayana et al 2021]. Note that the lamprey pallium is smaller than the Ob [Pombal and Megías 2019].

Dual olfactory projections: direct to locomotor via V.pt and indirectly through the S.ot/P.v (basal ganglia). Hb.l (lateral habenula), MLR (midbrain locomotor region), Ob.l (lateral olfactory bulb), Ob.m (medial olfactory bulb), Pv (ventral pallidum), R.rs (reticulospinal motor command), S.ot (olfactory tubercle), V.pt (posterior tuberculum)

The above diagram illustrates the dual olfactory projection. The main action path is Ob.m to the V.pt to the MLR locomotion [Derjean et al 2010], [Beauséjour et al 2020], [Beauséjour et al 2024]. Not shown is the Ob.m projection to the Hb.m (medial habenula) – R.ip (interpeduncular area) for chemotaxis. The previous essay included the Ob.m to S.ot path for the timeout, which suppressed chemotaxis to avoid perseveration. Ob.l is the new addition, providing distinguishing context to the S.ot circuit.

Striatal discrimination

To represent distinct timeouts, different contexts or olfactory neighborhoods need distinct neurons or at least different dendritic spines. The striatum architecture is well-suited for this task because of the very large number of S.pn (striatal projection neurons aka medium spiny neurons). Each S.pn can represent a distinct combination of signal and context.

Striatum architecture to represent multiple timeouts, each with a unique context key built from a unique distinguishing combination of inputs. cxt-1 (context input), Ob.m (medial olfactory bulb), Pv (ventral pallidum), S.pn (striatum projection neuron).

The above diagram shows the context-keyed timeout architecture. Each S.pn is associated with a distinguishing context, but all of these use the same primary signal. Because S.pn stores the timeout in the LTP (long term potentiation) / LTD in its dendritic spines, the multiple S.pn neurons allow for distinct persistent timeout variables. Furthermore, a single S.pn can support multiple contexts because each S.pn has several dendrites, on the order of 8-12, each of which can respond to a distinct input combination.

Note the similarity of this fan-out to granule cells in the hippocampus and cerebellum, and the Kenyon cells in Drosophila fruit fly. This expansion of the coding dimensionality allows for a large space to place odors while reducing overlaps [Laurent 2002].

Striatum UP states

In mammals S.pn are only active with sustained input from multiple distributed cortical sources [Shipp 2017]. This sustained input puts the S.pn into an UP state, which allows a primary signal to drive the neuron, but doesn’t drive an AP (action potential) directly. Typically the context UP state inputs drive distal dendrites and spines, and the primary signal drives the proximal dendrite. S.pn are hyperpolarized at rest, making it difficult for a signal to drive an AP directly. The UP state depolarizes the S.pn, allowing the signal to drive an AP. Essentially this means the context neurons are required gates for the signal.

Combinations of context neurons drive a dendrite UP state, which allows the signal to drive the projection neuron. CN (context neuron), S.nr (substantia nigra pars reticulata), S.pn (striatum projection neuron).

The above diagram shows how each S.pn has an associated context made from a conjunction of several context neurons. Each S.pn has a different combination of context neurons, each differing greatly from its neighbor [Bolam and Bevan 2006]. Multiple simultaneous context neurons are necessary for an UP state.
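The gating logic can be sketched as a conjunction: an S.pn fires only when enough of its own context neurons are active (the UP state) and the primary signal is present. This is a minimal sketch with assumed names. `driven_spns`, the `min_inputs` threshold, and the context neuron labels are all hypothetical.

```python
def driven_spns(signal, active_context, pool, min_inputs=2):
    """Return the names of S.pn units that fire for this signal and context.

    pool maps each S.pn name to the set of context neurons wired to it.
    """
    fired = []
    for name, wired_context in pool.items():
        # The UP state needs several simultaneously active context inputs
        up_state = len(active_context & wired_context) >= min_inputs
        # The context only gates; the primary signal is still required
        if up_state and signal:
            fired.append(name)
    return sorted(fired)


pool = {
    "spn_red": {"cn1", "cn2", "cn3"},
    "spn_teal": {"cn3", "cn4", "cn5"},
}
print(driven_spns(True, {"cn1", "cn2"}, pool))   # ['spn_red']
print(driven_spns(False, {"cn1", "cn2"}, pool))  # []
```

Because each S.pn has a different context combination, the same primary signal selects a different S.pn, and so a different timeout, in each neighborhood.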

Broad circuit

Taking an overview of this system, let’s see how adding this context information affects the earlier seek and timeout circuit.

Olfactory timeout circuit with Ob.l added as a context input to S.v. Ad (adenosine), Hb (habenula), Ob (olfactory bulb), Pv (ventral pallidum), S.d1 (striatum D1 projection neuron), S.d2 (striatum D2 projection neuron), V.pt (posterior tuberculum)

The above diagram shows that adding Ob.l to S.ot was the only change necessary, along with the dimension expansion of the S.pn.

Cache-like model for simulation

The striatum architecture poses a scaling problem for the simulation. The striatum has a large number of neurons, each with a large number of essentially random inputs. This architecture works because the possible combinations are predefined. Each odor neighborhood is a conjunction of odor features, each corresponding to an Ob glomerulus and O.mc (olfactory mitral cells). The many predefined conjunctions are likely to match any new odor combination. However, a simulation model using this architecture would be overly large.

Because the essay model is a toy model, it can use a much simplified system. A cache-like architecture can work because only a few odor locations are active at any time. The cache only holds the recent odor locations, and the cache entry for an odor location is removed when the timeout expires. The simulation cache only needs to store the active locations, unlike the striatum, which holds the much larger number of possible distinct locations.
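A cache of this kind might look like the following Python sketch. It is an illustration of the idea rather than the simulation's actual code: the class name `TimeoutCache`, the tick-based clock, the capacity limit, and the string neighborhood codes are all assumptions.

```python
class TimeoutCache:
    """Small cache of recently failed odor neighborhoods (illustrative)."""

    def __init__(self, timeout_ticks=100, capacity=8):
        self.timeout_ticks = timeout_ticks
        self.capacity = capacity
        self._expiry = {}  # neighborhood code -> tick when avoidance ends

    def mark_failed(self, neighborhood, now):
        """Record a failed seek in this neighborhood at tick `now`."""
        self._evict(now)
        if neighborhood not in self._expiry and len(self._expiry) >= self.capacity:
            # Cache is full: drop the entry closest to expiring
            oldest = min(self._expiry, key=self._expiry.get)
            del self._expiry[oldest]
        self._expiry[neighborhood] = now + self.timeout_ticks

    def should_avoid(self, neighborhood, now):
        """True while the neighborhood's timeout is still running."""
        self._evict(now)
        return neighborhood in self._expiry

    def _evict(self, now):
        # Remove entries whose timeout has expired
        for key in [k for k, t in self._expiry.items() if t <= now]:
            del self._expiry[key]


cache = TimeoutCache(timeout_ticks=10)
cache.mark_failed("red", now=0)
print(cache.should_avoid("red", now=5))    # True: still timed out
print(cache.should_avoid("teal", now=5))   # False: different neighborhood
print(cache.should_avoid("red", now=10))   # False: timeout expired
```

The cache size tracks only the neighborhoods currently being avoided, which is the simplification the toy model relies on.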

Simulation

The simulation adds a simplification of odor neighborhoods. Instead of simulating accurate odor plumes, each location has a place code, which then produces an odor code. In the screenshot below, the hexagonal colors represent these place codes that produce odor neighborhoods.

Simulation screenshot of the animal reaching food in a different neighborhood than the previously avoided neighborhood.

The above screenshot shows two different odor neighborhoods (teal vs red). The animal avoids the red neighborhood after failing to find food, but seeks in the teal neighborhood to find the food. If the animal had first searched the teal neighborhood without food, it would have avoided the teal neighborhood with food.

Discussion

A major simplification in the simulation is consistency and precision in odor cues. In an actual environment, odors are not reliable. For now I’m not adding that complexity, but it might explain the need for cortical circuits in O.pir (piriform olfactory cortex) and E.hc (hippocampus). If an odor is irregular, some circuit needs to maintain a consistent odor neighborhood for the timeout circuit to work. In the simulation because the Ob perfectly represents the odor neighborhood and food plume, the downstream circuits can use the Ob signal directly. If the odor varies slightly within a neighborhood, or is lost intermittently, the S.ot timeout circuit could shift to a different S.pn timeout, breaking the logic of the circuit. A later essay might explore how cortical areas like O.pir might be necessary to create a stable neighborhood.

References

Beauséjour PA, Auclair F, Daghfous G, Ngovandan C, Veilleux D, Zielinski B, Dubuc R. Dopaminergic modulation of olfactory-evoked motor output in sea lampreys (Petromyzon marinus L.). J Comp Neurol. 2020 Jan 1;528(1):114-134. 

Beauséjour PA, Veilleux JC, Condamine S, Zielinski BS, Dubuc R. Olfactory Projections to Locomotor Control Centers in the Sea Lamprey. Int J Mol Sci. 2024 Aug 29;25(17):9370.

Bolam, J. P., & Bevan, M. D. (2006). Microcircuits of the striatum. In Basal Ganglia and Thalamus in Health and Movement Disorders (pp. 29-39). Boston, MA: Springer US.

Derjean D, Moussaddy A, Atallah E, St-Pierre M, Auclair F, Chang S, Ren X, Zielinski B, Dubuc R. A novel neural substrate for the transformation of olfactory inputs into motor output. PLoS Biol. 2010 Dec 21;8(12):e1000567. 

Jacobs LF, Schenk F. Unpacking the cognitive map: the parallel map theory of hippocampal function. Psychol Rev. 2003 Apr;110(2):285-315. 

Jacobs L. F. (2012). From chemotaxis to the cognitive map: the function of olfaction. Proc. Natl. Acad. Sci. U.S.A. 109(Suppl. 1) 10693–10700 

Lafferty CK, Yang AK, Mendoza JA, Britt JP. Nucleus Accumbens Cell Type- and Input-Specific Suppression of Unproductive Reward Seeking. Cell Rep. 2020 Mar 17;30(11):3729-3742.e3.

Laurent G. Olfactory network dynamics and the coding of multidimensional signals. Nat Rev Neurosci. 2002 Nov;3(11):884-95. 

Pombal MA, Megías M. Development and Functional Organization of the Cranial Nerves in Lampreys. Anat Rec (Hoboken). 2019 Mar;302(3):512-539. 

Shipp S. The functional logic of corticostriatal connections. Brain Struct Funct. 2017 Mar;222(2):669-706. 

Suryanarayana SM, Pérez-Fernández J, Robertson B, Grillner S. Olfaction in Lamprey Pallium Revisited-Dual Projections of Mitral and Tufted Cells. Cell Rep. 2021 Jan 5;34(1):108596. 

Essay 31: Striatum as Timeout

Let’s return to the task of essay 16 on give-up time in foraging, which covered food search with a timeout. At first the animal uses a general roaming search and if it smells a food odor, it switches to a targeted seek following the odor with chemotaxis. If the animal finds food in the odor plume, it eats the food, but if it doesn’t find food, it will eventually give up and avoid the local area before returning to the roaming search.

Search state machine. Roam is the starting state, switching to seek when it detects odor, and switching to avoid after a timeout.
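The state machine can be written down as a small transition function. This is a minimal sketch, not the simulation code. The tick thresholds are hypothetical, and two transitions are my assumptions because the text does not specify them: returning to roam after finding food, and returning to roam if the odor is lost during a seek.

```python
from enum import Enum


class State(Enum):
    ROAM = "roam"
    SEEK = "seek"
    AVOID = "avoid"


def next_state(state, odor, found_food, ticks_in_state,
               give_up_ticks=50, avoid_ticks=20):
    """Return the next foraging state. Thresholds are illustrative."""
    if state is State.ROAM:
        return State.SEEK if odor else State.ROAM
    if state is State.SEEK:
        if found_food:
            return State.ROAM   # assumption: eat, then resume roaming
        if ticks_in_state >= give_up_ticks:
            return State.AVOID  # timeout: give up on this plume
        return State.SEEK if odor else State.ROAM  # assumption: odor lost
    # AVOID: leave the local area, then return to roaming
    return State.ROAM if ticks_in_state >= avoid_ticks else State.AVOID


print(next_state(State.ROAM, odor=True, found_food=False, ticks_in_state=0))
print(next_state(State.SEEK, odor=True, found_food=False, ticks_in_state=50))
```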

For another attempt at the problem, let’s take the striatum (basal ganglia) as implementing the timeout portion of this task using the neurotransmitter adenosine as a timeout signal and incorporating the multiple action path discussion from essay 30 on RTPA. Adenosine is a byproduct of ATP breakdown and is a measure of cellular activity. With sufficiently high adenosine, the striatum switches from the active seek path to an avoidance path. These circuits are where caffeine works to suppress the adenosine timeout, allowing for longer concentration.

Mollusk navigation

As mentioned in essay 30, the mollusk sea slug has a food search circuit with a similar logic to what we need here. The animal seeks food odors when it’s hungry, but it avoids food odors when it’s not hungry [Gillette and Brown 2015].

Mollusk food search circuit, illustrating a hunger-modulated switchboard. When the animal is not hungry, the switchboard reverses the odor to motor links turning it away from food.

This essay uses the same idea but replaces the hunger modulation with a timeout. When the timeout occurs, the circuit switches from a food seek action path to a food avoid action path.
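The switchboard idea reduces to reversing the sign of the odor-to-motor link. Here is a minimal sketch, assuming a two-sensor animal and a turn command where positive means turn right. The function name and the sensor layout are illustrative assumptions.

```python
def turn_command(odor_left, odor_right, timed_out):
    """Turn toward the stronger odor when seeking, away after a timeout.

    Positive values turn right, negative values turn left.
    """
    toward_odor = odor_right - odor_left
    # The switchboard reverses the odor to motor link after the timeout
    return -toward_odor if timed_out else toward_odor


print(turn_command(0.25, 0.75, timed_out=False))  # 0.5: turn right, toward odor
print(turn_command(0.25, 0.75, timed_out=True))   # -0.5: turn left, away from odor
```

The sensory input is identical in both cases; only the timeout state changes which action path the odor drives.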

Odor action paths

Two odor-following action paths exist in the lamprey, one using Hb.m (medial habenula) and one using V.pt (posterior tuberculum). The Hb.m path is a chemotaxis path following a temporal gradient. The V.pt path projects to MLR (midbrain locomotor region). The lamprey Ob.m (medial olfactory bulb) projects to both Hb.m and V.pt, which each project to different locomotor paths [Derjean et al 2010]: Hb.m to R.ip (interpeduncular nucleus) and V.pt to MLR. The zebrafish also has Ob projections to Hb and V.pt [Imamura et al 2020], [Kermen et al 2013].

Dual odor-seeking action paths in the lamprey and zebrafish. Hb (habenula), Ob.m (medial olfactory bulb), V.pt (posterior tuberculum).

Further complicating the paths, the Hb.m itself contains both an odor seeking path and an odor avoiding path [Beretta et al 2012], [Chen et al 2019]. Similarly Hb.m has dual action paths for social winning and losing [Okamoto et al 2021]. So, this essay could use the dual paths in Hb.m instead of contrasting Hb.m with V.pt, but the larger contrast should make the simulation easier to follow.

This essay’s simulation makes some important simplifications. The Hb to R.ip path is a temporal gradient path used for chemotaxis, phototaxis and thermotaxis. In a real-world marine environment, odor diffusion and water turbulence are much more complicated, producing more clumps and making a simple gradient ascent more difficult [Hengenius et al 2012]. Because this essay is only focused on the switchboard effect, this simplification should be fine.

Striatum action paths with adenosine timeout

The timeout circuit uses the striatum, which has two paths: one selecting the main action, and the second either stopping the action, or selecting an opposing action [Zhai et al 2023]. The two paths are distinguished by their responsiveness to dopamine with S.d1 (striatal projection with D1 G-s stimulating) or S.d2 (striatal projection with D2 G-i inhibiting) marking the active and alternate paths respectively. This model is a simplification of the mammalian striatum where the two paths interact in a more complicated fashion [Cui et al 2013].

Essay odor seek with timeout circuit. The seek path flows from Ob, through S.d1 to Pv to V.pt. The avoid path flows from Ob, through S.d2 to Pv to Hb. Ad (adenosine), Hb (habenula), Ob (olfactory bulb), Pv (ventral pallidum), S.d1 (striatum D1 projection neuron), S.d2 (striatum D2 projection neuron), V.pt (posterior tuberculum)

As mentioned, the two action paths are the seek path from Ob to V.pt and the avoid path from Ob to Hb. For the timeout and switchboard, the Ob has a secondary projection to the striatum. Although this circuit is meant as a proto-vertebrate simplification, Ob does project to S.ot (olfactory tubercle) and to the equivalent in zebrafish [Kermen et al 2013].

The timeout is managed by adenosine, which is a neurotransmitter derived from ATP and a measure of neural activity. The striatum has three sub-circuits for this kind of functionality, which I’ll cover in order of complexity.

S.d1 and adenosine inhibition

The first circuit only uses the direct S.d1 path and adenosine as a timeout mechanism. When the animal follows an odor, the Ob to S.d1 signal enables the seek action. As a timeout, ATP from neural activity degrades to adenosine and the buildup of adenosine is a decent measure of activity over time. The longer the animal seeks, the more adenosine builds up. If the Ob projection axon contains an A1i (adenosine G-i inhibitory) receptor, the adenosine will inhibit the release of glutamate from Ob, which will eventually self-disable the seek action.

S.d1 action path inhibited by adenosine buildup as a timeout. A1i (adenosine G-i inhibitory receptor), Ad (adenosine), mGlu5q (metabotropic glutamate G-q receptor), Ob (olfactory bulb), S.d1 (D1-type striatal projection neuron)

In practice, the striatum uses astrocytes to manage the glutamate release. An astrocyte that envelops the synapse measures glutamate release with an mGlu5q (metabotropic glutamate with G-q/11 binding) receptor and accumulates internal calcium [Cavaccini et al 2020]. The astrocyte’s calcium triggers an adenosine release as a gliotransmitter, making the adenosine level a timeout measure of glutamate activity. The presynaptic A1i receptor then inhibits the Ob signal. The timeframe is on the order of 5 to 20 minutes with a recovery of about 60 minutes, although the precise timing is probably variable. Interestingly, the timeout is a log function of activity instead of a linear measure [Ma et al 2022].

This circuit doesn’t depend on the postsynaptic S.d1 firing [Cavaccini et al 2020], which contrasts with the next LTD (long term depression) circuit which only inhibits the axon if the S.d1 projection neuron fires.

S.d1 presynaptic LTD using eCB

S.d1 self-activating LTD uses retrotransmission to inhibit its own input using eCB (endocannabinoids) as a neurotransmitter. Like the astrocyte in the previous circuit, S.d1 uses an mGlu5q receptor to trigger eCB release, but also requires that S.d1 fire, as triggered by the NMDA glutamate receptor. The axon receives the eCB retrotransmission with a CB1i (cannabinoid G-i inhibitory) receptor, which triggers presynaptic LTD [Shen et al 2008], [Wu et al 2015]. Like the previous circuit, the timeframe seems to be on the order of 10 minutes, lasting for 30 to 60 minutes.

S.d1 LTD circuit. A coincidence of glutamate detection with mGlu5q and S.d1 activation with NMDA triggers eCB release, which activates CB1i leading to presynaptic LTD. CB1i (cannabinoid G-i inhibitory receptor), mGlu5q (glutamate G-q receptor), Ob (olfactory bulb), S.d1 (striatum D1-type projection neuron).

This circuit inhibits itself over time without using adenosine or astrocytes. In the full striatum circuit, high dopamine levels suppress this LTD suppression, meaning that dopamine inhibits the timeout [Shen et al 2008].
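The coincidence requirement can be sketched as a single weight update. This is only a toy rule under stated assumptions: the function name, the thresholds, the multiplicative depression, and the hard dopamine cutoff are all hypothetical stand-ins for the much slower chemical process.

```python
def ecb_ltd_step(weight, glutamate, sd1_fired, dopamine,
                 mglu5_threshold=0.5, ltd_rate=0.05, dopamine_block=0.7):
    """One update of the Ob-to-S.d1 presynaptic weight (illustrative only)."""
    # Coincidence: mGlu5q detects glutamate AND S.d1 fires (NMDA).
    coincidence = glutamate > mglu5_threshold and sd1_fired
    # Assumption: dopamine above a cutoff blocks the LTD entirely.
    if coincidence and dopamine < dopamine_block:
        # eCB reaches the presynaptic CB1i receptor and depresses the input.
        weight *= 1.0 - ltd_rate
    return weight


if __name__ == "__main__":
    w = 1.0
    for _ in range(20):
        w = ecb_ltd_step(w, glutamate=1.0, sd1_fired=True, dopamine=0.2)
    print("weight after 20 coincident steps:", round(w, 3))
```

Without S.d1 firing, or with high dopamine, the weight stays unchanged in this sketch, which is the difference from the astrocyte circuit that the text describes.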

The next circuit adds the S.d2 path, which uses adenosine and self-activity to trigger postsynaptic LTP.

S.d2 postsynaptic LTP via A2a.s

Consider a third circuit that has the benefits of both previous circuits because it uses adenosine as a timer managed by astrocytes and is also specific to postsynaptic activity. In addition, it allows for a second action path, changing the circuit from a Go/NoGo system to a Go/Avoid action pair. This circuit uses LTP (long term potentiation) on the S.d2 striatum neurons.

Timeout circuit using postsynaptic LTP at the S.d2 neuron and adenosine as a timeout signal. As adenosine accumulates, it stimulates S.d2, which both disables S.d1 and drives the avoid path. A2a.s (adenosine G-s stimulatory receptor), Ad (adenosine), mGlu5q (glutamate G-q metabotropic receptor), Ob (olfactory bulb), S.d1 (striatum D1-type projection neuron), S.d2 (striatum D2-type projection neuron)

When the odor first arrives, Ob activates the S.d1 path, seeking toward the odor. S.d1 is activated instead of S.d2 because of dopamine. In this simple model, the Ob itself could provide the initial dopamine like C. elegans odor-detecting neurons or the tunicate’s coronal cells or the dual glutamate and dopamine neurons in Vta (ventral tegmental area).

As time goes on, adenosine from the astrocyte builds up, which activates the S.d2 A2a.s (adenosine G-s stimulatory receptor) until it overcomes dopamine suppression and increases the S.d2 activity with LTP [Shen et al 2008]. Once S.d2 activates, it suppresses S.d1 [Chen et al 2023] and drives the avoid path.
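The switch from seek to avoid can be sketched as a competition between adenosine and dopamine at S.d2. This is a simplified illustration, not the simulation's implementation: the function name, the linear adenosine buildup, the direct comparison of adenosine against dopamine, and the lack of adenosine decay are all assumptions.

```python
def sd2_timeout_run(ticks=200, dopamine=0.6, adenosine_gain=0.01):
    """Return the action chosen at each tick while an odor is present."""
    adenosine = 0.0
    actions = []
    for _ in range(ticks):
        # A2a.s (G-s) excitation competes with D2 (G-i) dopamine suppression.
        sd2_active = adenosine > dopamine
        # An active S.d2 suppresses S.d1 and drives the avoid path instead.
        actions.append("avoid" if sd2_active else "seek")
        if not sd2_active:
            # Seeking activity feeds the astrocyte timer (decay omitted).
            adenosine += adenosine_gain
    return actions


if __name__ == "__main__":
    run = sd2_timeout_run()
    print("first avoid at tick", run.index("avoid"))
```

In this toy version, raising dopamine delays the switch, and a high enough dopamine level prevents the timeout within the run.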

The combination of these circuits looks like it’s precisely what the essay needs.

Simulation

In the simulation, when the animal is hunting food and finds a food odor plume, it directly seeks toward the center and eats if it finds food. In the screenshot below, the animal is eating.

Simulation showing the animal eating food after seeking the odor plume.

Satiation disables the food seek. This might sound obvious, but hunger gating of food seeking requires specific satiety circuits on any seek path that’s food specific, which means the involvement of H.l (lateral hypothalamus) and related areas like H.arc (arcuate hypothalamus) and H.pv (periventricular hypothalamus). And, of course, the simulation requires code that only enables food odor seek when the animal is searching for food.

The next screenshot shows the central problem of the essay, when the animal seeks a food odor but there’s no food at the center.

Screenshot showing the animal stuck in the middle of the food odor plume before the timeout.

Without a timeout, the animal circles the center of the food odor plume endlessly. After a timeout, the animal actively leaves the plume and avoids that specific odor until the timeout decays.

Screenshot showing the animal escaping from the odor plume after the timeout.

This system is somewhat complex because of the need for hysteresis. A too-simple solution with a single threshold can oscillate, because as soon as the animal starts leaving, the timeout decays, which then re-enables the food seek, which then quickly times out, repeating. Instead, the system needs to make re-enabling of the food seek more difficult after a timeout.

But that adds a secondary issue: if starting food seek is made more difficult, an ongoing seek must not be held to the same stricter threshold, or it would stop as soon as it started. So, sustaining seek needs a lower threshold than starting seek. This hysteresis and seek sustain presumably need to be handled by the actual striatum circuit.
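The hysteresis described above can be sketched as a two-threshold gate on the timeout level, similar to a Schmitt trigger. This is an assumed illustration rather than the simulation's actual code: the class name and both threshold values are hypothetical.

```python
class SeekGate:
    """Two-threshold gate on a timeout level (for example, adenosine)."""

    def __init__(self, stop_above=0.8, resume_below=0.3):
        if resume_below >= stop_above:
            raise ValueError("resume_below must be less than stop_above")
        self.stop_above = stop_above      # timeout level that ends a seek
        self.resume_below = resume_below  # level the timeout must decay to
        self.seeking = True

    def update(self, timeout_level):
        if self.seeking and timeout_level > self.stop_above:
            self.seeking = False  # timed out: switch to avoid
        elif not self.seeking and timeout_level < self.resume_below:
            self.seeking = True   # timeout has decayed enough to seek again
        return self.seeking


if __name__ == "__main__":
    gate = SeekGate()
    levels = [0.1, 0.5, 0.9, 0.7, 0.5, 0.31, 0.2, 0.5]
    print([gate.update(level) for level in levels])
```

With a single threshold at 0.8, the level of 0.7 right after the timeout would immediately re-enable seek. With the two thresholds, seek stays off until the level falls below 0.3.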

Discussion

I think this essay shows that using the striatum for an action timeout for food seek is a plausible application. The circuit is relatively simple and is effective, improving search by avoiding failed areas.

However, the simulation does raise some issues, particularly the hysteresis problem. If the striatum does provide a timeout along these lines, it must somehow solve the hysteresis problem. While the animal is seeking, the ongoing LTP/LTD inhibition should use a high threshold to stop seeking, but once avoidance starts, there needs to be a high threshold to return to seeking to avoid oscillations between the two action paths.

Because LTD/LTP is a relatively long chemical process (minutes) internal to the neurons, as opposed to an instant switch in the simulation, the delay itself might be sufficient to solve the oscillation problem. It’s also possible that some of the more complicated parts of the circuit, such as P.ge (globus pallidus) and its feedback to the striatum or H.stn (subthalamic nucleus) might affect the sustain of seek or breaking it and so control the hysteresis problem.

The simulation also reinforced the absolute requirement that action paths need to be modulated by internal state like hunger. For the seek paths, both Hb.m and V.pt are heavily modulated by H.l and other hypothalamic hunger and satiety signals.

As expected, the simulation also illustrated the need for context information separate from the target odor. While the food odor is timed out, the animal can’t search the other odor plume because this essay’s animal can’t distinguish between the odor plumes, and therefore avoids both odors. With a long timeout and many odor plumes, this delays the food search. A future enhancement is to add context to the timeout. If the animal can time out a specific odor plume, it can search alternatives even if the food odor itself is identical.

References

Beretta CA, Dross N, Guiterrez-Triana JA, Ryu S, Carl M. Habenula circuit development: past, present, and future. Front Neurosci. 2012 Apr 23;6:51. 

Cavaccini A, Durkee C, Kofuji P, Tonini R, Araque A. Astrocyte Signaling Gates Long-Term Depression at Corticostriatal Synapses of the Direct Pathway. J Neurosci. 2020 Jul 22;40(30):5757-5768. 

Chen JF, Choi DS, Cunha RA. Striatopallidal adenosine A2A receptor modulation of goal-directed behavior: Homeostatic control with cognitive flexibility. Neuropharmacology. 2023 Mar 15;226:109421. 

Chen WY, Peng XL, Deng QS, Chen MJ, Du JL, Zhang BB. Role of Olfactorily Responsive Neurons in the Right Dorsal Habenula-Ventral Interpeduncular Nucleus Pathway in Food-Seeking Behaviors of Larval Zebrafish. Neuroscience. 2019 Apr 15;404:259-267. 

Cui G, Jun SB, Jin X, Pham MD, Vogel SS, Lovinger DM, Costa RM. Concurrent activation of striatal direct and indirect pathways during action initiation. Nature. 2013 Feb 14;494(7436):238-42.

Derjean D, Moussaddy A, Atallah E, St-Pierre M, Auclair F, Chang S, Ren X, Zielinski B, Dubuc R. A novel neural substrate for the transformation of olfactory inputs into motor output. PLoS Biol. 2010 Dec 21;8(12):e1000567. 

Gillette R, Brown JW. The Sea Slug, Pleurobranchaea californica: A Signpost Species in the Evolution of Complex Nervous Systems and Behavior. Integr Comp Biol. 2015 Dec;55(6):1058-69. 

Hengenius JB, Connor EG, Crimaldi JP, Urban NN, Ermentrout GB. Olfactory navigation in the real world: Simple local search strategies for turbulent environments. J Theor Biol. 2021 May 7;516:110607.

Imamura F, Ito A, LaFever BJ. Subpopulations of Projection Neurons in the Olfactory Bulb. Front Neural Circuits. 2020 Aug 28;14:561822. 

Kermen F, Franco LM, Wyatt C, Yaksi E. Neural circuits mediating olfactory-driven behavior in fish. Front Neural Circuits. 2013 Apr 11;7:62.

Ma L, Day-Cooney J, Benavides OJ, Muniak MA, Qin M, Ding JB, Mao T, Zhong H. Locomotion activates PKA through dopamine and adenosine in striatal neurons. Nature. 2022 Nov;611(7937):762-768.

Okamoto H, Cherng BW, Nakajo H, Chou MY, Kinoshita M. Habenula as the experience-dependent controlling switchboard of behavior and attention in social conflict and learning. Curr Opin Neurobiol. 2021 Jun;68:36-43. 

Shen W, Flajolet M, Greengard P, Surmeier DJ. Dichotomous dopaminergic control of striatal synaptic plasticity. Science. 2008 Aug 8;321(5890):848-51. 

Wu YW, Kim JI, Tawfik VL, Lalchandani RR, Scherrer G, Ding JB. Input- and cell-type-specific endocannabinoid-dependent LTD in the striatum. Cell Rep. 2015 Jan 6;10(1):75-87. 

Zhai S, Cui Q, Simmons DV, Surmeier DJ. Distributed dopaminergic signaling in the basal ganglia and its relationship to motor disability in Parkinson’s disease. Curr Opin Neurobiol. 2023 Dec;83:102798.

Essay 22 issues: subthalamic nucleus simulation

The essay 22 simulation explored a striatum model where the two decision paths competed: odor seeking vs random exploration, using dopamine to bias between exploration and seeking. This model resembled striatum theories like [Bariselli et al. 2019] that consider the striatum’s direct and indirect paths as competing between approach and avoidant actions.

Issues in essay 22 include both neuroscience divergence and simulation problems. Although the simulation is a loose functional model, that laxity isn’t infinite and it may have gone too far from the neuroscience.

Adenosine and perseveration

Seeking and foraging have a perseveration problem: the animal must eventually give up on a failed cue, or it will remain stuck forever. The give-up circuit in essay 22 uses the lateral habenula (Hb.l) to integrate search time until it reaches a threshold to give up. An alternative circuit in the striatum itself involves the indirect path (S.d2), the D2 dopamine receptor and adenosine, with a behaviorally relevant time scale.

When fast neurotransmitters are on the order of 10 milliseconds, creating a timeout on the order of a few minutes is a challenge. Two possible solutions in that timescale are long term potentiation (LTP) where “long” means about 20 minutes, and astrocyte calcium accumulation, which is also about 10 to 20 minutes.

Adenosine receptors (A2r) in the striatum indirect path (S.d2) measure broad neural activity from ATP byproducts that accumulate in the intercellular space. Over about 10 minutes, those A2r can enhance the indirect path, either by producing internal calcium (Ca) in the astrocytes or via LTP. Enhancing the indirect path (exploration) eventually causes a switch from the direct path (seeking) to exploration, essentially giving up on the seeking.

Ventral striatum

Although the essay models the dorsal striatum (S.d), the ventral striatum (S.v aka nucleus accumbens) is more associated with exploration and food seeking. In particular, the olfactory path for food seeking goes through S.v, while midbrain motor actions use S.d. In salamanders, the striatum only processes midbrain (“collo-“) thalamic inputs, while olfactory and direct senses (“lemno-“) go to the cortex [Butler 2008]. Assuming the salamander path is more primitive, the essay’s use of S.d in the model is a likely mistake.

But S.v raises a new issue because S.v doesn’t use the subthalamus (H.stn) [Humphries and Prescott 2010], although that model only applies to the S.v shell (S.sh), not the S.v core (S.core).

Ventral striatum pathway. MLR midbrain locomotive region, P.v ventral pallidum, S.sh ventral striatum shell, Vta ventral tegmental area.

In the above diagram of a striatum shell circuit, an odor-seek path is possible through the ventral tegmental area (Vta) but there is no space for an alternate explore path.

Low dopamine and perseveration

[Rutledge et al. 2009] investigates dopamine in the context of Parkinson’s disease (PD), which exhibits perseveration as a symptom. PD is a low dopamine condition, and adding dopamine resolves the perseveration. That result is the opposite of essay 22’s dopamine model, where low dopamine resolved perseveration.

Now, it’s possible that give-up perseveration and Parkinson’s perseveration are two different symptoms, or it’s possible that the complete absence of dopamine differs from low tonic dopamine, but in either case, the essay 22 model is too simple to explain the striatum’s dopamine use.

Dopamine burst vs tonic

Dopamine in the striatum has two modes: burst and tonic. Essay 22 uses a tonic dopamine, not phasic. The striatum uses phasic dopamine to switch attention to orient to a new salient stimulus. The phasic dopamine circuit is more complicated than the tonic system because it requires coordination with acetylcholine (ACh) from the midbrain laterodorsal tegmentum (V.ldt) and pedunculopontine (V.ppt) nuclei.

A question for the essays is whether that phasic burst is primitive to the striatum, or a later addition, possibly adding an interrupt for orientation to an earlier non-interruptible striatum.

Explore semantics

The word “explore” is used differently by behavioral ecology and in reinforcement learning, despite both using foraging-like tasks. These essays have been using explore in the behavioral ecology meaning, which may cause confusion with the reinforcement learning sense. The difference centers on a fixed strategy (policy) compared with changing strategies.

In behavioral ecology, foraging is literal foraging, animals browsing or hunting in a place and moving on (giving up) if the place doesn’t have food [Owen-Smith et al. 2010]. “Exploring” is moving on from an unproductive place, but the policy (strategy) remains constant because moving on is part of the strategy. The policy for when to stay and when to go [Headon et al. 1982] often follows the marginal value theorem [Charnov 1976], which specifies when the animal should move on.
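As a worked illustration of the marginal value theorem, the following sketch finds the leave time that maximizes the average intake rate, including travel time between patches. The saturating gain curve and all constants are assumptions picked for the example and are not taken from [Charnov 1976].

```python
import math


def gain(t, total=10.0, rate=0.5):
    """Assumed diminishing-returns food gain after t time units in a patch."""
    return total * (1.0 - math.exp(-rate * t))


def marginal_gain(t, total=10.0, rate=0.5):
    """Instantaneous intake rate in the patch (derivative of gain)."""
    return total * rate * math.exp(-rate * t)


def best_leave_time(travel_time, step=0.001, horizon=20.0):
    """Grid search for the stay time that maximizes gain / (travel + stay)."""
    best_t, best_rate = step, 0.0
    for i in range(1, int(horizon / step) + 1):
        t = i * step
        average = gain(t) / (travel_time + t)
        if average > best_rate:
            best_t, best_rate = t, average
    return best_t, best_rate


if __name__ == "__main__":
    for travel in (1.0, 4.0):
        t, r = best_leave_time(travel)
        print(f"travel={travel}: leave at t={t:.2f}, average rate={r:.2f}")
```

At the optimum, the marginal gain in the patch matches the overall average rate, and a longer travel time between patches makes it worth staying longer. The leave-or-stay rule itself doesn’t change, which is the sense in which the policy remains constant.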

In contrast, reinforcement learning (RL) uses “explore” to mean changing the policy (strategy). For example, in a two-armed bandit situation (two slot machines), the RL policy is either using machine A or using machine B, or a fixed probabilistic ratio, not a timeout and give-up policy. In that context, exploring means changing the policy, not merely switching machines.

[Kacelnik et al. 2011] points out that the two-choice economic model doesn’t match vertebrate animal behavior, because vertebrates use an accept-reject decision [Cisek and Hayden 2022]. So, while the two-armed bandit may be useful in economics, it’s not a natural decision model for vertebrates.

Avoidance (nicotinic receptors in M.ip)

The simulation uncovered a foraging problem, where the animal remained around an odor patch it had given up on, because the give-up strategy reverts to random search. Instead, the animal should leave the current place and only resume search when it’s far away.

Path of simulated animal after giving up on a food odor.

In the diagram above, the animal remains near the abandoned food odor. The tight circles are the earlier seek before giving up, and the random path afterwards is the continued search. A better strategy would leave the green odor plume and explore other areas of the space.

As a possible circuit, the habenula (Hb.m) projection to the interpeduncular nucleus (M.ip) uses both glutamate and ACh as neurotransmitters, where ACh amplifies neural output. For low signals without ACh, the animal approaches the object, but high signals with ACh switch approach to avoidance. This avoidance switching is managed by the nicotinic receptor (nAChR), which is studied for nicotine addiction [Lee et al. 2019].
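A minimal sketch of that approach-to-avoid switch follows. The function name, the all-or-none ACh co-release, the multiplicative gain, and the thresholds are all assumptions for illustration, not a model from [Lee et al. 2019].

```python
def mip_response(hb_signal, ach_threshold=0.6, ach_gain=3.0,
                 avoid_threshold=1.0):
    """Map an Hb.m signal strength to a behavior (illustrative only)."""
    if hb_signal <= 0.0:
        return "none"
    # Assumption: ACh is co-released only for strong Hb.m signals and
    # multiplies the glutamate-driven output through nicotinic receptors.
    amplification = ach_gain if hb_signal > ach_threshold else 1.0
    output = hb_signal * amplification
    return "avoid" if output > avoid_threshold else "approach"


if __name__ == "__main__":
    for signal in (0.3, 0.6, 0.9):
        print(signal, mip_response(signal))
```

In a foraging simulation, a give-up signal could raise hb_signal past the ACh threshold so that an abandoned odor is actively avoided rather than just ignored.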

An interesting future essay might explore using nicotinic aversion to improve foraging by leaving an abandoned odor plume.

References

Bariselli S, Fobbs WC, Creed MC, Kravitz AV. A competitive model for striatal action selection. Brain Res. 2019 Jun 15;1713:70-79.

Butler, Ann. (2008). Evolution of the thalamus: A morphological and functional review. Thalamus & Related Systems. 4. 35 – 58.

Charnov, Eric L. Optimal foraging, the marginal value theorem. Theoretical population biology 9.2 (1976): 129-136.

Cisek P, Hayden BY. Neuroscience needs evolution. Philos Trans R Soc Lond B Biol Sci. 2022 Feb 14;377(1844):20200518.

Headon T, Jones M, Simonon P, Strummer J (1982) Should I Stay or Should I Go. On Combat Rock. CBS Epic.

Humphries MD, Prescott TJ. The ventral basal ganglia, a selection mechanism at the crossroads of space, strategy, and reward. Prog Neurobiol. 2010 Apr;90(4):385-417.

Kacelnik A, Vasconcelos M, Monteiro T, Aw J. 2011. Darwin’s ‘tug-of-war’ vs. starlings’ ‘horse-racing’: how adaptations for sequential encounters drive simultaneous choice. Behav. Ecol. Sociobiol. 65, 547-558.

Lee HW, Yang SH, Kim JY, Kim H. The Role of the Medial Habenula Cholinergic System in Addiction and Emotion-Associated Behaviors. Front Psychiatry. 2019 Feb 28

Owen-Smith N, Fryxell JM, Merrill EH. Foraging theory upscaled: the behavioural ecology of herbivore movement. Philos Trans R Soc Lond B Biol Sci. 2010 Jul 27;365(1550):2267-78. 

Rutledge RB, Lazzaro SC, Lau B, Myers CE, Gluck MA, Glimcher PW. Dopaminergic drugs modulate learning rates and perseveration in Parkinson’s patients in a dynamic foraging task. J Neurosci. 2009 Dec 2

Essay 22: Subthalamic Nucleus

After essay 21 changed the animal’s default movement to a Lévy exploration, it’s natural to ask whether that random search is a full action, just like a seek turn or an avoid turn. And if exploration is a controlled action, then the model needs to treat exploration as a full action, like approach or avoid.

Exploration as a full locomotive system at the level of approach and avoid.

[Cisek 2020] identifies a vertebrate system for exploration, including the hippocampus (E.hc) and its associated nuclei such as the retromammillary hypothalamus (H.rm aka supramammillary). Essay 22 considers the idea of treating the subthalamic nucleus (H.stn) as part of the exploration circuit.

Subthalamic nucleus

H.stn is a hypothalamic nucleus from the same area as H.rm, which is part of the hippocampal theta circuit, which synchronizes exploration and spatial memory and learning. However, H.stn is part of the basal ganglia and not directly connected with the exploration system.

[Watson et al. 2021] finds a locomotive function of H.stn, where specific stimulation by the parafascicular thalamus (T.pf) to H.stn starts locomotion. If the stimulation is one-sided, the animal moves forward with a wide turn to the contralateral side. T.pf includes efference copies of motor actions from the MLR as well as from other midbrain actions.

Locomotion induced in the H.stn by T.pf stimulation. H.stn subthalamic nucleus, T.pf parafascicular nucleus, MLR midbrain locomotor region.

For essay 22, let’s consider the H.stn locomotion as exploration. Since H.stn is part of the basal ganglia, the bulk of essay 22 is considering how exploration might fit into the proto-striatum model of essay 18.

Striatal attention and persistence

Since the current essay simulation animal is an early Cambrian proto-vertebrate, it doesn’t have a full basal ganglia. Evolutionarily, the full basal ganglia architecture could not have sprung into being fully formed; it must have developed in smaller steps. Following a hypothetical evolutionary path, the essays are only implementing a simplified striatal model, adding features step-by-step. Unfortunately, because there’s no living species with a partial basal ganglia (all vertebrates have the full system), the essay’s steps are pure invention.

The initial striatum of essay 18 was a partial solution to a simulation problem: persistence. When the animal hit a wall head on, activating both touch sensors, it would choose randomly left or right, but because the simulation is real-time not turn-based, at the next tick both sensors remained active and the animal would choose randomly again, jittering at the wall until enough turns of the same direction escaped the barrier.

Proto-striatum for persistence by attention. Action feedback biases the choice to the last option: win-stay. B.rs reticulospinal motor command, Ob olfactory bulb, MLR midbrain locomotor region, Snc substantia nigra pars compacta (posterior tuberculum).

The main sense-to-action path is from the olfactory bulb (Ob) through the substantia nigra (Snc aka posterior tuberculum in zebrafish) to the midbrain locomotor region (MLR) and to the reticulospinal motor command neurons (B.rs), following the tracing and locomotive study of [Derjean et al. 2010] in lamprey and Vta/Snc control of locomotion in [Ryczko et al. 2017]. The proto-striatum circuit is built around that olfactory-seeking circuit, acting as persistent attention.

The proto-striatal model uses an efference copy of the last action from the MLR to bias the choice of the next action via an MLR to T.pf to striatum path. The model biases the choice through removing inhibition of the odor to action path. If the last action was left, the left odor is disinhibited, making it more likely to win.
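The effect of the win-stay bias on the jitter problem can be sketched by counting direction reversals when both sensors stay active. This is an assumed toy model, not the essay’s simulation: the function name and the stay_bias parameter are hypothetical, and the bias is applied directly to the choice probability instead of modeling the disinhibition circuit.

```python
import random


def count_reversals(stay_bias, ticks=1000, seed=0):
    """Count left/right reversals over ticks with equal sensory input.

    stay_bias=0.0 is an unbiased coin flip each tick; stay_bias=1.0
    always repeats the last action.
    """
    rng = random.Random(seed)
    last = rng.choice(["left", "right"])
    reversals = 0
    for _ in range(ticks):
        # The efference copy of the last action favors the same-side choice.
        if last == "left":
            p_left = 0.5 + stay_bias / 2
        else:
            p_left = 0.5 - stay_bias / 2
        action = "left" if rng.random() < p_left else "right"
        if action != last:
            reversals += 1
        last = action
    return reversals


if __name__ == "__main__":
    print("no bias:", count_reversals(0.0))
    print("win-stay bias:", count_reversals(0.8))
```

Without the bias the animal reverses about half the time, which is the jitter at the wall. With the bias, runs of same-direction turns become long enough to escape.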

The striatal system uses disinhibition for noise reasons. [Cohen et al. 2009] studied attention in the visual system and found that attention removed coherent noise by removing inhibition. By removing inhibition, the attended circuit is less affected by the controlling circuit’s noise.

Note: essay 19 considered an alternative solution to the attention issue by following the nucleus isthmi system in zebrafish as studied in [Gruberg et al. 2006], where the attention to the win-stay odor used acetylcholine (ACh) amplification to bias the choice.

Striatal columns: approach and avoid

An immediate difficulty with the simple proto-striatal model is the lack of priority. Although left vs right have equal priority, avoiding a predator is more important than seeking a potential food source. Unfortunately, the proto-striatum treats all options equally. As a solution, essay 18 split the striatum into columns, where each column resolves an internal conflict without priority (“within-system”) and the columns are compared separately (“between-systems”), where “within-system” and “between-system” are from [Cisek 2019].

Proto-striatum columns for maintaining attention.
Dual striatum column for approach and avoid, where MLR resolves the final conflict. B.rs reticulospinal command neuron, B.ss somatosensory (touch), MLR midbrain locomotive region, M.pag periaqueductal gray, Ob olfactory bulb, S.ot olfactory tubercle, S.d dorsal striatum.

Subthalamic nucleus and exploration

If we now treat exploration as a distinct action system, then it needs its own control system and column in the proto-striatum. The within-system choice for exploration is the left and right turns for a random walk, and the between-system choices are between the exploration system and the odor-seeking system.

As a possible neural correlate of exploration, consider the subthalamic nucleus (H.stn). The subthalamic nucleus is derived from the hypothalamus, specifically from the same area as the retromammillary area (H.rm aka supramammillary), which is highly correlated with hippocampal theta, locomotion and exploration.

[Watson et al. 2021] finds a locomotive function of H.stn, where specific stimulation by the parafascicular thalamus (T.pf) produces locomotion via the midbrain locomotive region (MLR). T.pf includes efference copies of motor actions from the MLR as well as other midbrain action efference copies. In the proto-striatum model, the feedback from MLR to striatum uses T.pf.

Exploration locomotive path through H.stn. H.stn subthalamic nucleus, MLR midbrain locomotive region, T.pf parafascicular thalamus.

Seek and explore with dual striatal columns

Suppose the striatum manages both odor seeking (chemotaxis) and default exploration (Lévy walk). The two actions conflict, with a complex priority system. When a food odor first appears, the animal should seek toward it (priority to seek), but if no food exists the animal should resume exploration (priority to explore). To resolve the between-system conflict, the two strategies need two columns with lateral inhibition to ensure that only one is selected.
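Lateral inhibition between two columns can be sketched as a small winner-take-all loop. This is an illustrative toy under assumptions: the function name, the inhibition strength, the update rate, and the rate-based units are hypothetical and do not correspond to a specific striatal circuit.

```python
def select_column(seek_input, explore_input, inhibition=1.2,
                  rate=0.2, steps=50):
    """Return (winner, seek_activity, explore_activity) after settling."""
    seek, explore = seek_input, explore_input
    for _ in range(steps):
        # Each column is driven by its input and inhibited by the other.
        seek_target = max(0.0, seek_input - inhibition * explore)
        explore_target = max(0.0, explore_input - inhibition * seek)
        seek += rate * (seek_target - seek)
        explore += rate * (explore_target - explore)
    if seek == explore:
        return "tie", seek, explore
    return ("seek" if seek > explore else "explore"), seek, explore


if __name__ == "__main__":
    print(select_column(1.0, 0.6))
    print(select_column(0.4, 0.9))
```

With mutual inhibition stronger than one, the weaker column is driven toward zero, so only one strategy reaches the MLR at a time.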

Dual striatum columns for seek and explore strategies. B.rs reticulospinal motor command, H.stn subthalamic nucleus, Ob olfactory bulb, P.ge globus pallidus external, S.d1 direct striatum projection, S.d2 indirect striatum projection, Snc substantia nigra pars compacta, Snr substantia nigra pars reticulata.

Selecting the seek column enables the odor sense to MLR path, seeking the potential food odor. Selecting the explore column enables the H.stn to MLR path, randomly searching for food.

Note: the double inversion in both paths is to reduce neuron noise [Cohen et al. 2009]. Removing inhibition reduces noise, where adding excitation would add noise. In the essay simulation, this double negation isn’t necessary.

Striatum with dopamine/habenula control

The previous dual column circuit isn’t sufficient for the problem, because it lacks a control signal to switch between exploit (seek) and explore. The striatum dopamine circuit might help this problem by bringing in the foraging implementation from essay 17.

A major problem in essay 17 was the tradeoff between persistence and perseveration in seeking an odor. Persistence ensures that seeking an odor will continue even when the odor is intermittent. Perseveration is a failure mode where the animal never gives up, like a moth to a flame. As a model, consider using dopamine in the striatum as persistence or effort [Salamone et al. 2012], and control of dopamine by the habenula as solving perseveration with a give-up circuit.

Explore and exploit (seek) columns controlled by dopamine. H.l lateral hypothalamus, Hb.l lateral habenula, H.stn subthalamic nucleus, MLR midbrain locomotive region, Ob olfactory bulb, P.em prethalamic eminence, P.ge globus pallidus external, S.d1 striatum direct projection, S.d2 striatum indirect projection, Snc substantia nigra pars compacta, Snr substantia nigra pars reticulata.

The striatum uses two opposing dopamine receptors named D1 and D2. D1 is a stimulating modulator through a G.s protein path, and D2 is an inhibiting modulator through a G.i protein path. In the above diagram, high dopamine will activate the seek column via D1 and inhibit the explore column via D2. Low dopamine inhibits the seek column and enables the explore column. So dopamine becomes an exploit vs explore controller.
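The dopamine controller can be sketched in a few lines, together with a habenula-style cost that lowers dopamine as an unsuccessful seek drags on. The function names, the linear cost, and the midpoint comparison are assumptions for illustration only.

```python
def dopamine_level(odor_value, failed_seek_ticks, cost_per_tick=0.01):
    """Assumed dopamine level: potential reward minus accumulated cost."""
    reward = odor_value                       # stands in for H.l to Vta/Snc
    cost = cost_per_tick * failed_seek_ticks  # stands in for H.l to Hb.l
    return min(1.0, max(0.0, reward - cost))


def selected_column(dopamine, midpoint=0.5):
    """High dopamine favors seek (D1); low dopamine favors explore (D2)."""
    seek_drive = dopamine - midpoint     # D1, G.s: dopamine excites seek
    explore_drive = midpoint - dopamine  # D2, G.i: dopamine inhibits explore
    return "seek" if seek_drive > explore_drive else "explore"


if __name__ == "__main__":
    for ticks in (0, 40, 80):
        level = dopamine_level(1.0, ticks)
        print(ticks, round(level, 2), selected_column(level))
```

In this sketch, a fresh odor selects the seek column, and the same odor after a long failed seek selects the explore column, which is the give-up behavior.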

In many primitive animals, dopamine is a food signal. In C. elegans the dopamine neuron is a food-detecting sensory neuron. In vertebrates, the hunger and food-seeking areas like the lateral hypothalamus (H.l) strongly influence midbrain dopamine neurons both directly and indirectly. Indirectly, H.l to lateral habenula (Hb.l) causes non-reward aversion [Lazaridis et al. 2019].

For the essay, I’m taking H.l as having multiple roles (H.l is a composite area with at least nine sub-areas [Diaz et al. 2023]), both calculating potential reward (odor) via the H.l to Vta/Snc connection, and cost (exhaustion of seek task without success) via the H.l to Hb.l to Vta/Snc connection.

References

Cisek P. Resynthesizing behavior through phylogenetic refinement. Atten Percept Psychophys. 2019 Oct

Cisek P. Evolution of behavioural control from chordates to primates. Philos Trans R Soc Lond B Biol Sci. 2022 Feb 14

Cohen MR, Maunsell JH. Attention improves performance primarily by reducing interneuronal correlations. Nat Neurosci. 2009 Dec;12(12):1594-600.

Derjean D, Moussaddy A, Atallah E, St-Pierre M, Auclair F, Chang S, Ren X, Zielinski B, Dubuc R. A novel neural substrate for the transformation of olfactory inputs into motor output. PLoS Biol. 2010 Dec 21

Diaz, C., de la Torre, M.M., Rubenstein, J.L.R. et al. Dorsoventral Arrangement of Lateral Hypothalamus Populations in the Mouse Hypothalamus: a Prosomeric Genoarchitectonic Analysis. Mol Neurobiol 60, 687–731 (2023).

Gruberg E., Dudkin E., Wang Y., Marín G., Salas C., Sentis E., Letelier J., Mpodozis J., Malpeli J., Cui H. Influencing and interpreting visual input: the role of a visual feedback system. J. Neurosci. 2006;26:10368–10371

Lazaridis I, Tzortzi O, Weglage M, Märtin A, Xuan Y, Parent M, Johansson Y, Fuzik J, Fürth D, Fenno LE, Ramakrishnan C, Silberberg G, Deisseroth K, Carlén M, Meletis K. A hypothalamus-habenula circuit controls aversion. Mol Psychiatry. 2019 Sep

Ryczko D, Grätsch S, Schläger L, Keuyalian A, Boukhatem Z, Garcia C, Auclair F, Büschges A, Dubuc R. Nigral Glutamatergic Neurons Control the Speed of Locomotion. J Neurosci. 2017 Oct 4

Salamone JD, Correa M, Nunes EJ, Randall PA, Pardo M. The behavioral pharmacology of effort-related choice behavior: dopamine, adenosine and beyond. J Exp Anal Behav. 2012 Jan

Watson GDR, Hughes RN, Petter EA, Fallon IP, Kim N, Severino FPU, Yin HH. Thalamic projections to the subthalamic nucleus contribute to movement initiation and rescue of parkinsonian symptoms. Sci Adv. 2021 Feb 5

18: Neuroscience issues with proto-striatum

The previous proto-striatum model is flawed because it focused too much on sensory input and not enough on action efferent copies. To fix this focus, the model can use midbrain locomotive region (MLR) actions as a bias selector.

Recall that the simulation needed the striatum to solve an action jitter problem by introducing a win-stay bias. Once the animal turns left, it should bias toward continued left turns. Before the fix, the animal randomly chose a direction every 50ms, reversing itself, causing problems in avoiding corners and obstacles. The simulation problem was an action-selection problem not a sensor problem.

In the vertebrate striatum, action feedback comes from the MLR via the parafascicular thalamus (T.pf). The T.pf connection to the striatum is unique, both in its targeting of striatal interneurons (S.cin and S.pv) and in its connection to the medium spiny projection neurons (S.spn), the main striatal neurons [Raju et al. 2006]. T.pf connects directly to S.spn dendrites, not merely the spines as with other inputs. This direct connection potentially gives a stronger stimulus, and its uniqueness suggests it may be an older, more primitive connection.

Action-focused striatum model

So, I’m changing the striatum model to follow an action focus. After an action fires the motor command neurons (B.rs reticulospinal), the MLR sends an efferent copy of the motor command to the striatum via T.pf.

Action feedback model for proto-striatum. B.rs reticulospinal motor command, MLR midbrain locomotive region, Ob olfactory bulb, Snc substantia nigra pars compacta, S.pv striatal parvalbumin interneuron, S.spn spiny projection neuron.

In the above diagram, the main sensor path is still from the olfactory bulb (Ob) to the substantia nigra pars compacta (Snc / posterior tuberculum) and then to MLR, basically a stimulus-response path. A previous action biases the sensory path for the next action by activating a corresponding S.spn, which disinhibits Snc, making the next sensory input more powerful.
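As a rough illustration of this bias, the following Python sketch treats the efferent copy as a gain on the matching sensory path. The function name `select_action`, the `bias_gain` value, and the reduction of the Snc disinhibition to a simple multiplier are all assumptions made for illustration, not the simulation's actual code.

```python
def select_action(odor_left, odor_right, last_action=None, bias_gain=1.5):
    """Pick 'left' or 'right'; the previous action boosts its own sensory path.

    Hypothetical sketch: the efferent copy (MLR -> T.pf -> S.spn) is modeled
    as a gain on the matching Ob -> Snc -> MLR path. bias_gain is an assumed
    value, not a measured one.
    """
    gain = {"left": 1.0, "right": 1.0}
    if last_action in gain:
        # The S.spn for the last action disinhibits the matching Snc path.
        gain[last_action] = bias_gain
    drive = {
        "left": odor_left * gain["left"],
        "right": odor_right * gain["right"],
    }
    # Ties fall to 'left' here only because it is listed first.
    return max(drive, key=drive.get)


print(select_action(1.0, 1.0, last_action="right"))
print(select_action(1.0, 1.2, last_action="left"))
print(select_action(1.0, 2.0, last_action="left"))
```

In this sketch the previous action only biases the choice. A sufficiently stronger stimulus on the other side still wins, which is the stimulus-response path with a win-stay nudge rather than a locked-in decision.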

Comparison with the previous model

As a comparison, the following diagram shows the previous striatal model. Unlike the new model, the final selected action didn’t bias the next action because there was no feedback connection. (The reset signal to S.pv is a different circuit, and doesn’t bias the decision because it applies to all choices equally.)

sense-focused proto-striatum model.
Previous proto-striatum, where a prior selected sense biased the next sense. B.ss somatosensory touch.

In addition, in the previous model the sensory input must coordinate its striatal disinhibition via S.spn with its excitation of the Snc action. Although not impossible evolutionarily, the double coordination required makes it less likely. The new model not only incorporates the action but also simplifies the sensor circuit.

Parafascicular thalamus

For personal reference, here’s a summary of the T.pf connections [Smith et al. 2022].

Connections of the parafascicular thalamus.

Essentially all the T.pf inputs are motor efference copies and all the T.pf outputs are to the basal ganglia. Inputs include the following areas: vision/optic motor (OT and pretectum), midbrain locomotive region (MLR, M.pag, V.ppt, V.ldt), diencephalon locomotive region (H.zi), consummatory action (B.bp), forebrain attention (P.bf) and cortical action (C.fef, C.moss, C.gu). The cingulate cortex might be unusual (C.cc), although it also has motor areas.

Striatum as attention

Attention is a difficult topic, in part because it’s used in so many diverse ways that the word is often more confusing than helpful [Hommel et al. 2019], [Krauzlis et al. 2014]. However, I think it’s interesting that the action-based striatum model looks like selective attention.

Simplification of proto-striatum showing resemblance to selective attention.

When a left action biases the next action to stay the same, its mechanism is to enhance the sensory path, as if it’s paying attention more to one side than another.

Engineering feedback: dopamine mistake

When implementing this idea, the simulation doesn’t need dopamine feedback. Instead of forcing dopamine into the model just because the basal ganglia has dopamine feedback, I’m taking it out. Since I’ve only implemented a prototype portion of the basal ganglia, this omission may be acceptable rather than a fatal flaw. When the full model is implemented, we’ll see if this is a mistake.

Actual simulation implementation, removing dopamine and reset feedback.

Notice that the only dopamine in this model is descending, with no ascending dopamine [Ryczko and Dubuc 2017].

References

Hommel B, Chapman CS, Cisek P, Neyedli HF, Song JH, Welsh TN. No one knows what attention is. Atten Percept Psychophys. 2019 Oct

Krauzlis RJ, Bollimunta A, Arcizet F, Wang L. Attention as an effect not a cause. Trends Cogn Sci. 2014 Sep;18(9):457-64

Raju DV, Shah DJ, Wright TM, Hall RA, Smith Y. Differential synaptology of vGluT2-containing thalamostriatal afferents between the patch and matrix compartments in rats. J Comp Neurol. 2006 Nov 10

Ryczko D, Dubuc R. Dopamine and the Brainstem Locomotor Networks: From Lamprey to Human. Front Neurosci. 2017 May 26

Smith JB, Smith Y, Venance L, Watson GDR. Thalamic Interactions With the Basal Ganglia: Thalamostriatal System and Beyond. Front Syst Neurosci. 2022 Mar 25

18: Engineering issues with proto-striatum

The planned striatum model of essay 17 quickly runs into simulation problems because it’s missing priority selection between avoiding obstacles and seeking food. Obstacle avoidance needs a higher priority than seeking an odor plume, but a naive striatum doesn’t support that priority.

Broken striatum model where toward and away have no priority. Ob olfactory bulb, B.ss somatosensory touch, B.rs reticulospinal motor command.

This model fails because this striatum gives away (avoid) actions no priority over toward (approach) actions. An animal can’t simply follow an odor blindly, ignoring obstacles, but this model doesn’t support that priority.

Tectum

Adding the tectum seems like the right solution, although I was planning on putting it off until dealing with vision.

The tectum (optic tectum / superior colliculus) is better known for its vision support, but the deeper tectum layers are a general action-decision system. Near the periaqueductal gray (M.pag), the tectum has a topographic direction-based map in its intermediate layer and an action-based map in its deep layer.

The tectum and M.pag are neighbors, almost layers of each other, and in animals like the frog, the M.pag is a deeper layer of the tectum.

Relation between M.pag and OT in mammals (left) and frog (right), where the ventricle shape determines the anatomical label for homologous areas.

The tectum is an action organizer, not just a vision organizer. For the simulation, the action matters since the simulated animal doesn’t have vision.

Amphioxus, a non-vertebrate chordate that’s a model for pre-vertebrate evolution, has a few motor-related cells with the same genetic markers as the tectum [Pergner et al. 2020]. It’s conceivable that the amphioxus tectum is more action focused, since the amphioxus frontal eye is only a dozen photoreceptors with no lens.

Action categories

The tectum has split circuits for turning and for approach and avoid [Wheatcroft et al. 2022]. The simulation can use something like the following circuit.

Split tectum and striatum circuit. B.rs reticulospinal motor command, B.ss somatosensory input, M.lr midbrain locomotor region, M.pag periaqueductal gray, Ob olfactory bulb, S.d dorsal striatum, S.ot olfactory tubercle.

Approach (toward) senses like food odors excite toward actions, and avoidant (away) senses like touch excite away actions. Because the priority areas are split, each striatum can choose between non-priority options (left vs right). The priority resolves only later in the midbrain locomotor region, using context input to decide which major direction to use. In this split model, the simplified striatum circuit can work because all of a striatum’s options have equal priority.
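The split can be sketched in Python as two equal-priority left/right choices followed by a later priority step. The function names and the fixed rule "away wins whenever touch is active" are assumptions for illustration. The essay leaves the actual resolution to M.lr with context input, which this sketch doesn't model.

```python
def choose_side(left, right):
    """Equal-priority choice inside one striatum: the stronger side wins.

    Returns None when neither side is active. Ties go to 'left' for
    simplicity; random tie-breaking is omitted from this sketch.
    """
    if left <= 0 and right <= 0:
        return None
    return "left" if left >= right else "right"


def split_select(odor, touch):
    """Hypothetical split circuit: odor and touch are (left, right) pairs."""
    toward = choose_side(*odor)    # S.ot: turn toward the stronger odor side
    touched = choose_side(*touch)  # S.d: which side touched the obstacle
    # Assumed priority rule standing in for M.lr: avoidance beats approach.
    if touched is not None:
        away = "right" if touched == "left" else "left"
        return ("away", away)
    if toward is not None:
        return ("toward", toward)
    return ("none", None)


print(split_select(odor=(0.2, 0.8), touch=(0.0, 0.0)))
print(split_select(odor=(0.2, 0.8), touch=(1.0, 0.0)))
```

Because each `choose_side` call only compares left against right, neither striatum needs to know about the other's priority, which is the point of the split.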

As a note on accuracy, the diagram misrepresents the actual olfactory path, specifically the real olfactory tubercle. In reality, olfaction has a distant, complicated path to the tectum.

Short-cut escape signal

The previous diagram is also misleading because it’s too organized, as if each function has a dedicated, planned circuit. Although the tectum itself is highly organized, the downstream and modulating circuits are more ad hoc. For example, the zebrafish has an escape mechanism that short-cuts the tectum and drives the B.rs command motor directly [Zwaka et al. 2022].

fast escape shortcut of tectal locomotion circuit.
Fast escape shortcut of tectum-mediated locomotion.

In the above diagram, the escape circuit short-circuits any decisions of the tectum and striatum. Relatedly, the “switch” area in M.lr isn’t as tidy as the diagram suggests. It’s more likely that M.lr contains multiple actions which laterally inhibit each other in a priority scheme, modulated by M.pag.

As an additional correction, many of the modulators like M.pag affect the tectum directly, instead of the diagram’s dedicated priority-resolution function.

References

Pergner J, Vavrova A, Kozmikova I, Kozmik Z. Molecular Fingerprint of Amphioxus Frontal Eye Illuminates the Evolution of Homologous Cell Types in the Chordate Retina. Front Cell Dev Biol. 2020 Aug 4

Wheatcroft T, Saleem AB, Solomon SG. Functional Organisation of the Mouse Superior Colliculus. Front Neural Circuits. 2022 Apr 29

Zwaka H, McGinnis OJ, Pflitsch P, Prabha S, Mansinghka V, Engert F, Bolton AD. Visual object detection biases escape trajectories following acoustic startle in larval zebrafish. Curr Biol. 2022 Dec 5

Essay 18: Proto-striatum

A problem with essay 17 was the lack of action stickiness, which became a problem for avoiding obstacles. When the animal hits an obstacle head-on, both touch sensors fire and the animal chooses a direction randomly. Because the decision repeats every tick (30ms) and chooses randomly to break ties, the animal flutters between both choices and remains stuck until enough random choices are in the same direction to escape the obstacle. What’s needed is a sticky choice system to keep a direction once it’s selected. In some decision studies, this is a “win-stay” capability.
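To make the flutter concrete, here is a minimal Python sketch comparing per-tick random tie-breaking with a sticky choice. It assumes, purely for illustration, that the animal escapes once its net turn reaches `needed` steps in one direction. The function name and the escape rule are hypothetical and are not taken from the simulation.

```python
import random


def ticks_to_escape(sticky, needed=5, rng=None, max_ticks=100_000):
    """Count ticks until the net turn reaches `needed` steps in one direction."""
    rng = rng or random.Random(0)
    net = 0        # net turn: +1 for each right turn, -1 for each left turn
    choice = None
    for tick in range(1, max_ticks + 1):
        # Both touch sensors fire, so the tie is broken randomly. A sticky
        # animal does this once; a non-sticky animal repeats it every tick.
        if choice is None or not sticky:
            choice = rng.choice((-1, 1))
        net += choice
        if abs(net) >= needed:
            return tick
    return max_ticks


rng = random.Random(1)
trials = [ticks_to_escape(False, rng=rng) for _ in range(200)]
print("win-stay ticks:", ticks_to_escape(True))
print("random mean ticks:", sum(trials) / len(trials))
```

Under this assumed escape rule, the sticky animal escapes in exactly `needed` ticks, while the random animal performs an unbiased random walk and on average needs on the order of `needed` squared ticks.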

A previous essay solved this issue with muscle-based timing or a dopamine-based system, but some theories of striatal function suggest the striatum might solve the problem. The core idea uses dopamine as a feedback enhancer to sway the choice toward “stay.”

Simplified proto-striatum circuit for “win-stay.” B.ss somatosensory (touch), B.rs reticulospinal motor control, M.lr midbrain locomotive region, S.pv parvalbumin GABA inhibitory interneuron, Snc substantia nigra pars compacta, S.spn striatum spiny projection neuron (aka medium spiny neuron), ACh acetylcholine, DA dopamine.

The circuit is intended not as the full vertebrate basal ganglia, but as a possible core function for a pre-vertebrate animal in the early Cambrian. The circuit here represents only the direct path, specifically only the striosome (patch) circuit, and only the downstream connections; it ignores the efferent copy and upstream enhancements. Despite being simplified, I think it’s still too complicated as a single evolutionary step.

Simplified proto-circuit

That simplified striatal circuit may be too complicated for a single evolutionary step, but lateral inhibition is a reasonable circuit.

Simplified proto-circuit with lateral inhibition.

The above simplified circuit is a simple lateral inhibition circuit with an added reset function from the motor region.

The main path is through the somatosensory touch (B.ss), through the substantia nigra pars compacta (Snc / posterior tuberculum in zebrafish) to the midbrain locomotive region (M.lr). [Derjean et al. 2010] traced a similar path for olfactory information. I’m just replacing odor with touch.

The reset function might be a simple efferent copy from the central pattern generator for timing. In a swimming animal like an eel, the spinal cord controls the oscillation of body undulation, moving the animal forward. Because the cycle is periodic, when the motor system fires at a specific phase such as an initial-segment muscle twitch, it can send a copy of the motor signal upstream as an efferent copy. That signal is periodic, clock-like, something like the theta oscillation in vertebrates, and upper layers can use that clock.

Zebrafish larvae swim in discrete bouts, each on the order of 500ms to 2sec. Since the specific mechanism that organizes bouts isn’t known, any model is just a guess, but it might motivate some of the striatal circuitry, specifically the acetylcholine (ACh) path in the striatum. The motor swimming clock could break movement into bouts with a reset signal.

Since the sense to Snc to M.lr is a known circuit [Derjean et al. 2010], lateral inhibition is a common circuit, and motor efferent copy of central pattern oscillation is also common, this simplified circuit seems like a plausible evolutionary step.
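The simplified circuit above can be abstracted in Python as a latch: the first side to win suppresses the other until a reset arrives. The class name and the treatment of the reset as a boolean flag from the motor clock are assumptions for illustration, not a model of actual neuron dynamics.

```python
class LateralInhibitionLatch:
    """Hypothetical abstraction of lateral inhibition with a motor reset."""

    def __init__(self):
        self.winner = None  # side currently suppressing its competitor

    def step(self, left, right, reset=False):
        if reset:
            # Assumed efferent copy from the swim clock ends the current bout.
            self.winner = None
        if self.winner is None:
            # No side is suppressed yet, so the stronger input takes over.
            if left > right:
                self.winner = "left"
            elif right > left:
                self.winner = "right"
        # While a winner holds, it inhibits the other side until reset.
        return self.winner


latch = LateralInhibitionLatch()
print(latch.step(0.2, 0.0))              # left touch arrives first
print(latch.step(0.2, 0.9))              # stronger right input is suppressed
print(latch.step(0.2, 0.9, reset=True))  # new bout: right can now win
```

The second call shows a weakness of this simple form: a weaker signal that arrives first keeps control until the next reset.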

Improved circuit

Some problems in the simplified circuit lead to improvements in the full circuit. The simplified circuit is susceptible to noise, leading to twitchy behavior, because sensors and nerves are noisy. Secondly, when two options compete, a weaker signal might win the competition if it arrives first. An accumulator system that averages the signals will give better comparisons.

To improve the decisions, the new circuit adds a single pair of inhibition neurons, specializes the existing neurons, and changes the connections.

Circuit improving noise and decision.

To improve decision making, the S.spn neurons are now accumulators, averaging inputs over 100ms or so, just long enough to reduce noise without harming response time too much. As an implementation detail, the S.spn neurons might either accumulate calcium (Ca) itself, or a partner astrocyte might accumulate Ca.
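One way to picture the accumulator is as a leaky integrator, sketched below in Python. The exponential-average form, the class name, and the defaults (a 100ms time constant and a 30ms tick) are assumptions chosen to match the numbers in the text. This is not a model of calcium dynamics.

```python
import math


class LeakyAccumulator:
    """Exponential running average with time constant tau_ms (assumed form)."""

    def __init__(self, tau_ms=100.0, dt_ms=30.0):
        # Fraction of the gap to the new input that is closed on each tick.
        self.alpha = 1.0 - math.exp(-dt_ms / tau_ms)
        self.value = 0.0

    def update(self, x):
        self.value += self.alpha * (x - self.value)
        return self.value


acc = LeakyAccumulator()
spike = acc.update(1.0)      # a single noisy spike moves the value only partly
for _ in range(50):
    steady = acc.update(1.0)  # a sustained input converges toward 1.0
print(spike, steady)
```

A single-tick spike only moves the accumulator part of the way toward the input, while a sustained signal converges within a few hundred milliseconds, which is the noise versus response-time tradeoff described above.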

To improve noise behavior, the added Snc inhibition neurons tonically inhibit the Snc neurons, so a stray signal from B.ss to Snc won’t inadvertently trigger the action before the decision. The dual inhibition is a slightly complicated circuit which reduces noise because an active path (disinhibited) has only sense inputs; the modulatory signals are taken away.

The dopamine feedback has the benefit of being a modulator instead of a pure feedback signal. Because it’s a multiplicative modulator, dopamine doesn’t trigger the cycle itself. When the signal ends, the dopamine feedback doesn’t continue a ghost reverberation signal.
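The difference between a multiplicative modulator and a plain additive feedback loop can be shown with a toy Python comparison. The update rules and the feedback strength `k` are illustrative assumptions, not the simulation's equations.

```python
def run(senses, mode, k=0.9):
    """Trace the output for a sequence of sense inputs under one feedback mode."""
    out, trace = 0.0, []
    for sense in senses:
        if mode == "multiplicative":
            # Feedback only scales the sense input, so no input means no output.
            out = sense * (1.0 + k * out)
        else:
            # Additive feedback keeps driving the output after the input ends.
            out = sense + k * out
        trace.append(out)
    return trace


senses = [1, 1, 1, 0, 0]  # stimulus on for three ticks, then off
print("multiplicative:", run(senses, "multiplicative"))
print("additive:      ", run(senses, "additive"))
```

In the multiplicative trace the output drops to zero on the first tick without input, while the additive trace keeps decaying for several ticks, which is the ghost reverberation the modulator avoids.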

Choice decisions: drift diffusion

Psychologists, economists, and neuroscientists have several useful models for decision making, primarily deriving from the drift diffusion model [Ratcliff and McKoon 2008], which extends a random walk model to decision-making. While most of the research appears to be centered on visual choice in the cortical (C) visual system, such as the lateral intraparietal area (C.lip), the concepts are general and the circuits simple, which could apply to many neural circuits, even outside of the mammalian cortex.

Drift-diffusion is a variation of a random walk. Each new datum adds a vector to an accumulator, walking a step, until the result crosses a threshold.
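As a minimal sketch, a one-dimensional drift diffusion for a two-choice decision might look like the following Python. The function name, the parameter values, and the use of Gaussian noise are assumptions for illustration. The one-dimensional accumulator is a simplification of the vector described above.

```python
import random


def drift_diffusion(drift, threshold=1.0, noise=0.1, rng=None, max_steps=10_000):
    """Random walk with drift; returns (choice, steps) at the first crossing."""
    rng = rng or random.Random(0)
    x = 0.0
    for step in range(1, max_steps + 1):
        # Each new datum moves the accumulator by the drift plus noise.
        x += drift + rng.gauss(0.0, noise)
        if x >= threshold:
            return ("upper", step)
        if x <= -threshold:
            return ("lower", step)
    return (None, max_steps)


print(drift_diffusion(0.25, noise=0.0))   # noise-free: crosses upper threshold
print(drift_diffusion(-0.25, noise=0.0))  # noise-free: crosses lower threshold
print(drift_diffusion(0.05, noise=0.2, rng=random.Random(3)))
```

The drift stands for the average evidence per sample and the threshold sets the speed versus accuracy tradeoff: a higher threshold takes more steps but is less likely to cross the wrong boundary by noise.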

Circuits for leaky competing accumulator (LCA) and feed-forward models of two-choice decision.

One simple model is the leaky competing accumulator (LCA) of [Usher and McClelland 2001], where each choice has an accumulator, and the accumulators inhibit each other laterally. Another model uses feedforward inhibition instead of lateral inhibition, where each sense inhibits its competitors. For this essay, these models seem like good, simple options for the simulation.
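A discrete-time sketch of the LCA idea in Python follows. Each accumulator is driven by its input, leaks in proportion to its own value, is inhibited by the sum of its competitors, and is clipped at zero. The function name, the parameter values, and the simple Euler step are assumptions for illustration, not fitted values from the paper.

```python
import math
import random


def lca(inputs, leak=0.2, inhibition=0.3, dt=0.1, threshold=1.0,
        noise=0.0, rng=None, max_steps=10_000):
    """Return (winner_index, steps) for a leaky competing accumulator race."""
    rng = rng or random.Random(0)
    x = [0.0] * len(inputs)
    for step in range(1, max_steps + 1):
        total = sum(x)
        nxt = []
        for xi, inp in zip(x, inputs):
            # Input drive, minus self-leak, minus inhibition from competitors.
            dx = (inp - leak * xi - inhibition * (total - xi)) * dt
            dx += noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            nxt.append(max(0.0, xi + dx))  # activity cannot go negative
        x = nxt
        winner = max(range(len(x)), key=lambda i: x[i])
        if x[winner] >= threshold:
            return winner, step
    return None, max_steps


print(lca((1.0, 0.8)))
print(lca((0.8, 1.0)))
print(lca((1.0, 0.9), noise=0.3, rng=random.Random(2)))
```

With noise set to zero, the accumulator with the stronger input always reaches the threshold first. With noise, the weaker option can occasionally win, more often when the inputs are close.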

In the context of the striatum, [Bogacz and Gurney 2007] analyze the basal ganglia and cortex as a choice-based decision system. They interpret the direct path (S.d1) as the primary accumulator, and the indirect path (S.d2 / P.ge / H.stn) as feed-forward inhibition. They suggest that the basal ganglia could produce near-optimal decisions in the two-choice task.

References

Bogacz R, Gurney K. The basal ganglia and cortex implement optimal decision making between alternative actions. Neural Comput. 2007;19:442–477

Derjean D, Moussaddy A, Atallah E, St-Pierre M, Auclair F, Chang S, Ren X, Zielinski B, Dubuc R. A novel neural substrate for the transformation of olfactory inputs into motor output. PLoS Biol. 2010 Dec 21

Ratcliff R, Childers R. Individual differences and fitting methods for the two-choice diffusion model of decision making. Decision. 2015;2(4):237

Usher M, McClelland JL. On the time course of perceptual choice: The leaky competing accumulator model. Psychol Rev. 2001;108:550–592

Wang XJ. Probabilistic decision making by slow reverberation in cortical circuits. Neuron. 2002;36:1–20