47: Striatum as a mosaic of broken mirrors

The mosaic of broken mirrors is an analogy for the striatum in the basal ganglia [Da Cunha et al 2009]. The striatum represents a vertebrate’s actions and environment in a broken and overlapping fashion. While actions have focal projections to the striatum, the contextual input is broad and diffuse [Fee MS 2012]. While the hippocampus represents the environment globally, the striatum depends on piecewise representation. This means striatal learning is unable to generalize, because mosaic fragments lack a global perspective [Da Cunha et al 2009].

In the essays I’ve used the striatum as a timeout for food seek from an odor plume. When a food odor doesn’t have any food, or the odor is behind a barrier, the animal needs some sort of timeout to give up seeking the odor, turn away from the false odor, and search elsewhere. However, the current simulation only uses the seek action for the timeout; it lacks environment context.

A simple illustration depicting two circular shapes; one has a blue and red figure inside, while the other contains a green star-like shape.
Simulation screenshot showing the model animal trapped into perseverating seek in the center of a false odor plume. Circles represent odor plumes and the star represents food.

The above simulation screenshot shows the general problem. The circles represent food odor plumes and the star represents food. The animal has followed a false odor plume and will continue circling the center until the striatum timeout. However, there is a nearby valid odor plume with food in it. The animal should avoid the false odor plume and search the correct odor plume, but currently it can’t distinguish the two, because it’s only using its own seek action as a key. If the animal could detect environmental context differences between the two plumes, it could search more effectively.

As a context, odor neighborhoods [Jacobs 2022], [Marin et al 2021] can serve as a primitive representation of place. If each place has a different set of odor molecules, the animal can use that odor scene to distinguish false odor plumes from true food odor sources.

Striatum as timeout

In the essays I’ve used the striatum as a timeout mechanism to prevent perseveration, specifically the S.d2 (striatum projection neurons with D2.i Gi inhibiting dopamine receptor), which use A2a.s (adenosine Gs stimulating receptor) to measure the buildup of Ado (adenosine). The striatum’s projection neurons are roughly evenly divided between S.d1 (D1.s Gs stimulating dopamine receptor) and S.d2. For long stimulations, S.d1 generally motivates the current action and S.d2 opposes the action and produces avoidance [Soares-Cunha et al 2020], but for short stimulations both are active for action initiation [Cui G et al 2013].

Ado signaling limits swimming in frog tadpoles [Dale 1998]. In the striatum, A2a.s receptors in S.d2 projection neurons also detect Ado buildup from neuron activity and from astrocytes that monitor neuron activity [Kang S et al 2020]. The Ado buildup activates S.d2 neurons and increases their internal PKA (protein kinase A) [Ma L et al 2022]. PKA slowly builds up during activity with a buildup time constant on the order of 10-20s and a decay constant on the order of 70s [Ma L et al 2022], but the increase appears to be log or sigmoid-like, not linear, suggesting that longer timeouts would be possible with high thresholds or opposition from S.d1. In S.d2, the PKA buildup enables the release of the opioid enkephalin [Konradi et al 2003], [Hook et al 2008], which activates the DOR.i (δ-opioid inhibitory receptor), which is necessary for the inhibitory/avoidance behavior for S.d2 [Soares-Cunha et al 2020].
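The buildup and decay described above can be sketched as a simple first-order integrator. This is only an illustration, not the essay simulation's code: the function names are hypothetical, the level is normalized to 0..1, the 15s rise constant is an assumed midpoint of the 10-20s range, and the threshold values are arbitrary. A saturating exponential is one simple stand-in for the non-linear, log-like rise.

```python
import math

TAU_RISE_S = 15.0   # assumed midpoint of the 10-20 s buildup range
TAU_DECAY_S = 70.0  # decay constant on the order reported in the text


def step_pka(level: float, active: bool, dt: float = 0.1) -> float:
    """Advance a normalized PKA level (0..1) by dt seconds."""
    if active:
        # Saturating rise toward 1.0 while the action is ongoing.
        return level + (1.0 - level) * (1.0 - math.exp(-dt / TAU_RISE_S))
    # Exponential decay toward 0.0 when the action stops.
    return level * math.exp(-dt / TAU_DECAY_S)


def time_to_threshold(threshold: float, dt: float = 0.1) -> float:
    """Seconds of continuous activity before the level crosses threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    level, t = 0.0, 0.0
    while level < threshold:
        level = step_pka(level, True, dt)
        t += dt
    return t
```

Because the rise saturates, a threshold near the ceiling takes disproportionately longer to reach than a low one, which is one way a fixed biochemical time constant could still yield longer timeouts.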

Because the essay simulation needs a timeout to limit food seek perseveration, and the Ado and S.d2 avoidance chain could plausibly implement that timeout, I’ve been using it as the basis for the simulation’s timeout. This function seems evolutionarily plausible, because avoiding perseveration is important to keep the animal from unproductive seeking, and the implementation is fairly straightforward, only requiring already existing Ado timeout sensing and S.d2, without needing the entire basal ganglia. However, up to this point, that timeout has only used the action as a key, and has not included any context. Essays 15 and 16 did cover odor seek timeout in the context of associative habituation, but in the context of the fruit fly mushroom body.

Action and context as striatum inputs

S.pn (striatum projection neurons: both S.d1 and S.d2) are often called medium spiny neurons because their extensive dendrites are covered with spines. Spines are small dendrite compartments that receive axon inputs. Spines can compartmentalize Ca2+ (calcium) transients [Fang LZ and Creed 2024], meaning S.pn activation is not necessarily global across the entire dendrite tree, but compartmentalized. In S.d (dorsal striatum), cortical axons attach to S.pn spines, while T.pf (parafascicular thalamus) connects to the dendrite shaft [Fee MS 2012]. T.pf signals include ongoing action feedback and efference copies from the hindbrain and midbrain, including OT (optic tectum), while the cortex provides environmental context.

Diagram illustrating the connection between cortical inputs (C), thalamic inputs (T.pf), and striatum projection neurons (S.pn) via dendritic spines.
Rough diagram of inputs to S.pn dendrites. Cortical input is to distal dendrites and spines, while T.pf is more proximal and to dendrite shaft. C (cortex), S.pn (striatum projection neuron), T.pf (parafascicular thalamus).

Songbirds have a portion of the basal ganglia devoted to singing called Area X [Kornfeld et al 2020]. Area X receives song action variability information from C.lman (lateral magnocellular nucleus of the anterior nidopallium) and timing context from C.hvc, which are areas of the songbird cortex. C.lman provides an action efference copy of the variation actions [Fee MS 2014]. The majority (85%) of C.hvc input is on S.pn spines, while 55% of the C.lman action information is on the dendrite shafts [Kornfeld et al 2020]. The action efference copy inputs to S.pn are not plastic, while the contextual input on the spines is plastic [Fee MS 2012]. The context drives S.pn activation to an Up state, gating the core action driver [Fee MS 2012]. Action input to the striatum is focused, while contextual information is diffuse [Fee MS 2012].

S.pn inputs can differ in their attachment to spines or dendrite shafts, and they can also differ in how they trigger S.pn and in whether they produce Up states. S.pn are normally hyperpolarized, meaning they are normally especially difficult to trigger. Some inputs can shift S.pn into an Up state, where they are more easily triggered. In S.v (ventral striatum), inputs from E.sub.v (ventral subiculum in the hippocampal complex) can shift S.pn into an Up state for hundreds of milliseconds [O’Donnell and Grace 1995], [Sesack and Grace 2010]. When E.sub.v is disabled, S.v spontaneous or bistable activity halts, and other inputs such as F.pfc (prefrontal cortex) can’t trigger action potentials [O’Donnell and Grace 1995]. Up state transitions can be facilitated by dopamine [Fang LZ and Creed 2024], [Lahiri and Bevan 2020] and astrocyte sensing of glutamate activity [D’Ascenzo et al 2007], [Yu X et al 2018].

For the purposes of the essay simulation, these differences in striatum input suggest it’s plausible to treat action input and contextual input as distinct types of input, following [Fee MS 2012]. Specifically, the combination of an action input and a context is required to drive the striatum timeout. This means that a seek timeout can be specific to a context and not overflow to other contexts.
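As a minimal sketch of that idea, the timeout can be keyed by the pair of action and context rather than by the action alone. The class name, the tick-based counter, and the threshold are all hypothetical and are not taken from the simulation code; the counter is a crude stand-in for the Ado and PKA buildup.

```python
from collections import defaultdict


class ContextTimeout:
    """Timeout that accumulates only while an action and a context co-occur."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        # (action, context) -> number of ticks the pair has been active
        self.counts = defaultdict(int)

    def tick(self, action, context) -> bool:
        """Record one tick; return True once this pair has timed out."""
        if action is None or context is None:
            # Both inputs are required to drive the timeout.
            return False
        key = (action, context)
        self.counts[key] += 1
        return self.counts[key] >= self.threshold
```

With a threshold of three ticks, three ticks of ("seek", "plume_a") time out that pair, while ("seek", "plume_b") starts from zero, so giving up on one plume does not suppress seeking in another.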

Innate and contextual odors

In vertebrates, O.sn (odor sensory neurons) axons project to O.gl (glomerules) in Ob (olfactory bulb), where they connect with O.pn (odor projection neuron: mitral and tufted cells in mammals) dendrites. Each O.gl is a large neuropil (axon and dendrite connection area) where multiple O.sn and O.pn combine. In mammals, each O.gl responds to a single O.sn odor feature. Each O.gl typically responds to several odor molecules, and each odor molecule drives multiple O.gl [Wilson and Mainen 2006], [Weiss 2020]. In other vertebrates, each O.gl can combine inputs from multiple O.sn. Insect odor processing also uses glomerules, but this shared structure is independent evolution, not homology, because even the underlying odor detection receptors are unrelated between insects and vertebrates [Weiss 2020]. The glomerular structure is likely simply an effective way of connecting multiple O.sn to O.pn.

As an analogy that the simulation uses, consider phonemes in a syllable, where each syllable is like an odor molecule and each phoneme is like a glomerulus. The syllable “cat” consists of “c-”, “-a-”, and “-t”, corresponding to three glomerules, and “c-” is driven by many different syllables. So, O.gl doesn’t identify the whole odor, but only a feature of the odor, like “c-”, but the features can be recombined to identify the odor.
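The analogy can be written down directly. This sketch assumes three-letter syllables with one letter per position, and the function name is hypothetical.

```python
def glomeruli_for(syllable: str) -> frozenset:
    """Split a three-letter syllable (odor molecule) into positional phoneme features."""
    if len(syllable) != 3:
        raise ValueError("this sketch only handles three-letter syllables")
    onset, vowel, coda = syllable
    # Each feature keeps its position, so "c-" and "-c" would be different glomerules.
    return frozenset({f"{onset}-", f"-{vowel}-", f"-{coda}"})
```

Here `glomeruli_for("cat")` and `glomeruli_for("rat")` share “-a-” and “-t” but differ in the onset, so a single active glomerule is ambiguous while the combination identifies the odor.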

The odor glomerules in vertebrates are divided into a smaller innate group and a larger contextual group. Most mammals have distinct Ob and O.a (accessory olfactory bulb). The lamprey Ob.m (medial Ob) projects directly to the midbrain, including Hb.m (medial habenula) and V.pt (posterior tuberculum) [Derjean et al 2010], [Beauséjour et al 2022], while the Ob.l (lateral Ob) projects to Pa (pallium/cortex) and basal ganglia [Beauséjour et al 2022], [Beauséjour et al 2024], [Suryanarayana et al 2021].

Previous essays have only used the innate Ob.m projection and ignored the Ob.l projection. This essay adds the Ob.l context projection to S.ot (olfactory tubercle), which is a part of S.v (ventral striatum) with large, direct olfactory input, and output to H.l (lateral hypothalamus) and Pv (ventral pallidum). The Ob.l context may represent odor neighborhoods, introducing a notion of place.

Odor neighborhoods

Odors rarely occur in isolation, are dynamic in space and time [Marin et al 2021], and form spatial neighborhoods [Jacobs 2022]. Olfactory cues influence E.hc (hippocampus) place fields, and place cells in blind rats are similar to those in sighted rats [Marin et al 2021]. In O.pir.p (posterior piriform cortex/olfactory cortex), place can be decoded to 90% accuracy with 240 neurons [Poo C et al 2022]. The olfactory spatial hypothesis considers odor to be more for navigation than for identification [Jacobs 2012], where an odor neighborhood is a local area of odor mixtures.

It seems plausible that an early proto-vertebrate could use a combination of odor features from O.gl in an early S.v to restrict a seek timeout to a local place. The circuit is a straightforward extension of existing Ado timeout circuitry.

Seek striatum

For the essay’s simulation, I’m using odor context from Ob.l as a neighborhood detector. The striatum has two inputs: an action that enables the striatum during a seek and a place context identified by odor to restrict the search.

A simulation screenshot depicting various spatial representations of sensory inputs, including graphs and spatial maps related to odor detection and processing mechanisms.
Screenshot of the seek task blocked by a U-shaped barrier with odor neighborhoods represented by color and pattern. The local odor “rat” is represented by odor glomerules for “r-”, “-a-”, and “-t”.

The above screenshot shows the animal after its timeout from a failed odor seek when blocked by a barrier. The star represents food and the circle is an odor plume. Each pattern in the arena represents an odor neighborhood. The right side of the screenshot shows the active glomerules for the neighborhood, represented by “r-”, “-a-” and “-t”. I’m using syllables to represent odor molecules and phonemes to represent odor features detected by Ob glomerules.

Fruit fly mushroom body and Kenyon cells

Because the hypothetical proto-vertebrate would have a much simpler striatum than the mammal striatum, consider the comparison with the fruit fly MB (mushroom body) and its KC (Kenyon cells), which has a similar structure to the Sv projections to Pv, but has a much smaller scale. The mushroom body is highly conserved among insects and possibly predates all arthropods [Fiala and Kaun 2024], and serves as an odor pattern detector. In fruit flies, 52 O.pn project to ~2000 KC [Chan ICW et al 2024], which project to 24 MBON (mushroom body output neurons) [Seki et al 2017]. Each KC has three to seven claws [Zheng et al 2022], which are essentially single-connection dendrites.

A diagram illustrating the connectivity of Kenyon cells (KCs) in the fruit fly mushroom body, showing olfactory sensory neuron (OSN) inputs, projection neurons (PNs), and connections to mushroom body output neurons (MBONs) with their neurotransmitter types.
Architecture of the Drosophila mushroom body, adapted from [Aso Y et al 2014]. For this essay, only the left side projections of PN to KC are important. KC (Kenyon cell), PN (olfactory projection neuron), OSN (olfactory sensory neuron)

The above diagram shows the fruit fly mushroom body, but only the KC on the left are relevant for this essay. The MBON on the right would correspond to Pv (ventral pallidum) in this analogy. Each of the 2000 KC receives essentially random olfactory input from the O.pn, with 3-7 O.pn inputs per KC.
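The random expansion from O.pn to KC can be sketched in a few lines. The counts come from the text, but everything else is an assumption for illustration: the function names are hypothetical, each claw is assumed to sample a distinct O.pn uniformly at random, and the `min_claws` activation threshold is arbitrary.

```python
import random

N_PN = 52     # O.pn count from the text
N_KC = 2000   # approximate KC count from the text


def build_kc_inputs(seed: int = 0) -> list:
    """Give each KC three to seven claws, each wired to a randomly chosen O.pn."""
    rng = random.Random(seed)
    kcs = []
    for _ in range(N_KC):
        claws = rng.randint(3, 7)
        # Assumption: claws of one KC connect to distinct O.pn.
        kcs.append(tuple(sorted(rng.sample(range(N_PN), claws))))
    return kcs


def active_kcs(kcs, active_pn, min_claws: int = 3) -> list:
    """Indices of KCs with at least min_claws of their inputs active."""
    return [
        i for i, inputs in enumerate(kcs)
        if sum(p in active_pn for p in inputs) >= min_claws
    ]
```

Calling `active_kcs(build_kc_inputs(), {1, 5, 9, 20})` returns the sparse subset of KCs whose claws happen to cover at least three of those four O.pn, which is the sense in which each KC detects a small odor pattern.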

This mushroom body structure roughly corresponds to vertebrate O.pn projections to S.ot (ventral striatum olfactory tubercle), which projects to Pv. Although the connectivity pattern is similar, the two structures are not homologous in any fashion. Insect O.sn and vertebrate O.sn use entirely separate olfactory receptor families, and KCs use ACh (acetylcholine) as a neurotransmitter, while S.pn use GABA and vertebrate-specific opioids. The point of the analogy here is only to compare the scale of the O.gl and S.ot for a possible proto-vertebrate because the mammal S.ot is vastly too large to be plausible for that ancestor.

The MB has been compared to vertebrate CB-like (cerebellum-like) structures in the hindbrain [Farris 2011], suggesting that both serve as adaptive sensory filters. CB-like structures have dual inputs: one is sensory-specific input and the other is multimodal contextual. The MB also serves as a brake on insect locomotion, because fruit flies with MB lesions are less likely to stop locomotion once they have begun moving. This locomotion stopping is similar to the seek perseveration timeout in this essay.

Diagram comparing the olfactory processing pathways in insects and vertebrates. The insect mushroom body pathway includes odor sensory neurons (O.sn), odor projection neurons (O.pn), Kenyon cells (KC), and mushroom body output neurons (MBON), leading to seek/avoid responses. The vertebrate pathway similarly includes O.sn, O.pn, striatal projection neurons (S.pn), and the ventral pallidum (Pv), also leading to seek/avoid responses.
Analogy between the insect mushroom body and the vertebrate ventral basal ganglia. KC (Kenyon cells), MBON (mushroom body output neuron), O.pn (olfactory projection neuron), O.sn (olfactory sensory neuron), Pv (ventral pallidum), S.pn (striatum projection neuron)

For scale, consider applying the syllable analogy to the lamprey Ob. The lamprey has approximately 40 olfactory receptor genes [Beauséjour et al 2020]. If we exclude about 10 innate odors from Ob.m, the remaining 30 are contextual odors in Ob.l. Consider splitting each odor, as a syllable, into three odor features, as phonemes. If the 30 lamprey O.gl were organized like phonemes, then it might have 10 initial consonants, 10 vowels, and 10 final consonants. Suppose each S.pn receives three inputs from O.pn: one of 10 initial consonants, one of 10 vowels, and one of 10 final consonants. Then 10 × 10 × 10 = 1000 S.pn would cover the possible syllables, expanding the dimensionality from 30 odor phonemes to 1000 odor syllables. Analogously, the Drosophila 52 O.pn phonemes expand to ~2000 KC syllables, roughly the same order of magnitude. Of course, the olfactory neighborhoods aren’t actually nicely ordered into convenient human-readable syllables, but it’s a convenient analogy, particularly for the simulation.
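The counting argument can be checked with a short enumeration. The slot names and the indexing function are hypothetical, and phonemes are represented by indices 0-9 rather than letters.

```python
from itertools import product

N_PER_SLOT = 10
SLOTS = ("onset", "vowel", "coda")

# 30 contextual O.gl: ten phoneme features for each of the three slots.
glomeruli = [(slot, i) for slot in SLOTS for i in range(N_PER_SLOT)]

# One S.pn per (onset, vowel, coda) combination.
spn = list(product(range(N_PER_SLOT), repeat=len(SLOTS)))


def spn_index(onset: int, vowel: int, coda: int) -> int:
    """Position in spn of the neuron that detects this syllable."""
    return (onset * N_PER_SLOT + vowel) * N_PER_SLOT + coda
```

Here `len(glomeruli)` is 30 and `len(spn)` is 1000, so each syllable has exactly one dedicated detector in this idealized wiring. The random wiring of the fruit fly KC would cover the same space only partially and with overlaps.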

Returning to the original metaphor of the mosaic of broken mirrors [Da Cunha et al 2009], the O.pn breaks odor molecules (syllables and unbroken mirror) into a broken set of odor features (phonemes and mosaic tesserae), and randomly reassembles the features in S.pn like tesserae in a mosaic, partially recovering the original syllable structure, albeit with loss. Because the features are broken pieces stripped from their original odor molecule identity, the system could add other modalities, such as lateral line or whisker sensing, or temperature, or color fragments, before combining them. Although the fruit fly KCs primarily combine odor inputs, they also include a smaller number of non-odor inputs such as visual, gustatory, mechanosensory, and proprioceptive inputs [Farris 2011].

S.core seek and S.msh roam

So far I’ve used the striatum as a timeout to avoid seek perseveration, giving up on a failed food odor. Adding an odor neighborhood context improves the accuracy of seek perseveration control. Now that odor neighborhoods are available, the animal could also avoid neighborhoods that it’s already searched, essentially creating a memory breadcrumb, as explored in essay 44.

A diagram showcasing a grid with labeled sections including 'dm,' 'dl,' 'msh.d,' 'msh.v,' 'core,' 'lsh,' 'mot,' and 'lot,' indicating different components or regions.
Rough topographic divisions of the striatum. The blue S.msh.d is used for roam, and the amber S.core and S.lsh are used for seek. S.core (Sv core), S.dl (dorsolateral striatum), S.dm (dorsomedial striatum), S.lot (lateral S.ot – olfactory tubercle of Sv), S.lsh (lateral shell of Sv), S.mot (medial S.ot), S.msh.d (medial shell of Sv, dorsal part), S.msh.v (medial shell of Sv, ventral part), Sv (ventral striatum aka nucleus accumbens)

Sv (ventral striatum) is divided into S.core (Sv core) and S.sh (Sv shell), where S.core surrounds the anterior commissure, which connects Ob and amygdala. The shell further divides into S.msh (medial shell) and S.lsh (lateral shell), with S.ot also dividing into S.mot (medial S.ot) and S.lot (lateral S.ot). S.msh itself divides into S.msh.d (dorsal S.msh) and S.msh.v (ventral S.msh). These regions have distinct genetic transcription factor types and connectivity, and S.msh may be even more complicated with further genetically defined subtypes [Chen R et al 2021].

Functionally, S.lsh and S.core are more similar, related to cues and seek [Floresco 2015], [Chen G et al 2023], [Ding YD et al 2022], [Dobrovitsky 2017], while S.msh is distinct and related to place [Al-Hasani et al 2015], [Humphries and Prescott 2010], but not cues [Domingues et al 2025]. S.sh is important for place habituation: avoiding places already visited [Floresco 2015]. S.core is more associated with seek actions, and S.msh associated with place preference and avoidance [Fisher et al 2025]. S.ot is less studied, but is strongly Ob and O.pir related. Assigning S.lot to the same group as S.lsh and S.mot with S.msh is not well supported functionally, but does have some transcriptional support [Chen R et al 2021]. S.msh.d generally supports RTPP (real-time place preference) and S.msh.v RTPA (real-time place avoidance) [Ding YD et al 2022], [D’Aquila 2024], [Faget et al 2024], but because other studies report S.msh.d as necessary for cued avoidance [Ramirez et al 2015], the S.msh function may not be as simple as a clear RTPA/RTPP difference. Sv has three-dimensional aspects as well, with S.msh.a (anterior S.msh) and S.msh.p (posterior S.msh) having opposing seek and avoid motivation [Castro et al 2015], [Berridge 2019], [Bond et al 2020], [Marinescu and Labouesse 2024], with differing projections to locomotion vs eating regions [Richard and Berridge 2011].

As a complication for reading the research, because the heterogeneity of Sv regions was relatively recently discovered, many papers report results for S.sh without distinguishing between S.lsh, S.msh.d, or S.msh.v, despite these regions having different or even opposing functions. Older papers often simply report results for Sv without even distinguishing S.core from S.sh. Another complication is that Sv is also important for eating, not simply dedicated to seek or avoidance. Stimulating S.sh.a immediately stops eating [Reed et al 2018]. However, eating and roaming/seeking are related because they’re mutually exclusive: the animal needs to stop roaming or seeking to eat. In some cases eating circuits may actually be stop-moving circuits, since 70% of S.sh neurons are inhibited while eating and 30% are briefly excited while eating [Marinescu and Labouesse 2024]. Note that although the divisions in Sv are broadly topographic, some sub-functions could be mixed salt-and-pepper style, particularly in the complicated S.msh region.

So, the essay can add S.msh place habituation to avoid places already visited [Floresco 2015]. This seems likely to be S.msh.d because S.msh.v is more associated with threat avoidance [Ding YD et al 2022].

Odor neighborhoods for roaming

The odor spatial hypothesis suggests that the vertebrate Ob (olfactory bulb) is used more for spatial navigation than odor identification [Jacobs 2012]. Odors form spatial neighborhoods [Jacobs 2022], and the mammalian E.hc (hippocampus) place fields are driven by odor [Jacobs 2022]. Odors are rarely in isolation, but are dynamic in space and time. Place cells in blind rats are similar to sighted rats [Marin et al 2021]. A very old model of E.hc called it the rhinencephalon (nose brain), which was displaced by the discovery of spatial place fields, but if place is grounded by an old odor neighborhood circuit, then rhinencephalon may be accurate [Jacobs 2022].

For the essay simulation I’ve created two parallel basal ganglia paths for seek and roam. The seek path is only activated when the animal is following a food odor plume. The roam path is more broadly activated when the animal is searching for food. For specific paths, S.core and S.lsh appear specific to seek [Dobrovitsky 2017], [Soares-Cunha et al 2020], [Walle et al 2024]. Sv research doesn’t investigate roam circuits per se, but RTPA (real-time place avoidance) and RTPP (real-time place preference) and conditioned place preference are centered on S.msh [Britt et al 2012], [Marinescu and Labouesse 2024].

Flowchart illustrating the pathways in the basal ganglia for seeking and roaming behaviors in response to olfactory signals. Left side shows 'Ob.l place' input leading to seek and roam pathways, including interactions between various neural regions.
Paths used by the simulation for seek timeout and roam timeout with odor neighborhood context. H.l (lateral hypothalamus), Hb.lm (lateral habenula, medial part), MLR (midbrain locomotor region), Ob.l (lateral olfactory bulb), Pv.dl (ventral pallidum, dorsolateral part), Pv.vm (ventral pallidum, ventromedial part), R1.a (anterior hindbrain locomotor region), R5.rs (mid-hindbrain turning region), S.core (ventral striatum core), S.msh.d (striatum medial shell, dorsal part), T.pf (parafascicular thalamus), V.rn (raphe nuclei)

The above diagram shows a hypothetical dual seek and roam circuit. S.core uses seek action feedback / efference copy to enable a seek timeout to avoid perseveration. S.msh.d uses the H.l (lateral hypothalamus) roaming driver to enable place habituation to avoid searching places already visited.

A screenshot of a simulation showing an animal's movement in a patterned arena with various sections representing different odor neighborhoods. Circles denote active olfactory stimuli and a star marks the location of food. Syllables and phonemes are used to represent odor features.
Screenshot of the simulation with the animal leaving an area it’s already explored. The phonemes “d-”, “-o-”, and “-g” represent the current odor neighborhood.

The above screenshot shows the simulation for roaming with odor neighborhoods. Each pattern represents a different odor neighborhood. The phonemes on the right (“d-”, “-o-”, and “-g”) represent odor features of the neighborhood. The animal is avoiding the bottom neighborhood marked by blue horizontal lines because roaming has timed out for that neighborhood.
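One way to picture the roam timeout is as a filter on which neighborhoods the animal is still willing to enter. This is a hypothetical sketch, not the simulation’s logic: the function name, the tick counter per neighborhood, and the least-visited-first ordering are all assumptions.

```python
def choose_heading(candidates, dwell_ticks, timeout_ticks):
    """Return neighborhoods that have not timed out, least-visited first.

    candidates: neighborhood names reachable from the current position
    dwell_ticks: dict of neighborhood name -> ticks already spent roaming there
    timeout_ticks: roam time after which a neighborhood is avoided
    """
    fresh = [n for n in candidates if dwell_ticks.get(n, 0) < timeout_ticks]
    return sorted(fresh, key=lambda n: dwell_ticks.get(n, 0))
```

For example, with `dwell_ticks = {"dog": 50, "rat": 5}` and a timeout of 30 ticks, `choose_heading(["dog", "rat", "cat"], dwell_ticks, 30)` drops the exhausted “dog” neighborhood and prefers the unvisited “cat” over “rat”.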

Box plot and violin plot comparing the distances traveled in an open field for two groups: with 'PvRoam' and without 'PvRoam'.
Monte Carlo simulation of the animal’s search. PvRoam represents roaming with timeout enabled. No PvRoam represents roaming without timeout.

As a test to verify that avoiding place repetition improves food search, I ran 300 Monte Carlo simulations for both the roam timeout enabled and disabled. In the simulation code, the timeout circuit is organized by its Pv projection, which owns the corresponding Sv. So enabling the roaming timeout means enabling PvRoam. The results suggest that avoiding place repetition improves performance by reducing the long-search tail of the distribution.
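A comparison like this can be organized with a small harness that runs a trial function many times and reports tail statistics alongside the median. The harness below is a generic sketch and not the essay’s simulation code: `summarize_trials` is a hypothetical name, and `trial` stands for any function that takes a random number generator and returns one search cost (for example, distance traveled before finding food).

```python
import random
import statistics


def summarize_trials(trial, n_runs: int = 300, seed: int = 0) -> dict:
    """Run trial(rng) n_runs times and summarize the distribution of results."""
    rng = random.Random(seed)  # fixed seed so both conditions are repeatable
    results = sorted(trial(rng) for _ in range(n_runs))
    return {
        "median": statistics.median(results),
        # Simple nearest-rank style 95th percentile to expose the long tail.
        "p95": results[int(0.95 * (n_runs - 1))],
        "max": results[-1],
    }
```

Running it once with the roam timeout enabled and once with it disabled, then comparing the “p95” and “max” entries, shows whether a change mainly affects the long-search tail rather than the typical search.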

Discussion: multimodal feature inputs

The essay’s simulation only used odor features as striatum inputs for identifying neighborhoods, but the mosaic model can work with multimodal inputs. For example, the lateral line sense can detect a barrier to the right of the animal. That signal can be added to the striatum mosaic to distinguish odor neighborhoods bordered by a reef from a neighborhood over sand. Temperature sensors can distinguish cold and warm neighborhoods. Similarly, even simple, non-imaging photoreceptors tuned to multiple colors could help distinguish sand from reef or deep blue ocean from shallow waters. Three or four bits of visual information could help distinguish neighborhoods without needing complicated visual processing.

Similarly, although the fruit fly mushroom body mainly has odor inputs, it also includes some visual, gustatory, and thermosensory input [Chan ICW et al 2024]. Like a striatum mosaic, the visual processing in the mushroom body isn’t complex, but it can distinguish environments.

Insect mushroom body as a cerebellum-like structure

An interesting comparison between the insect MB (mushroom body) and vertebrate CB-like (cerebellum-like) structures suggests that both act as adaptive sensory filters [Farris 2011]. Vertebrate CB-like structures, mainly in the hindbrain, use anti-Hebbian plasticity to predict and erase self-motion from sensor data [Bell et al 2008], [Montgomery et al 2012].

For example, the aquatic lateral-line sense uses water motion sensors to detect objects and prey. If the animal is swimming near an obstacle to the right, the relative water motion produces a curl around the animal, which a relatively simple circuit can decode to infer the barrier [Oteiza et al 2017]. Because the animal’s own swimming also produces water movement, a CB-like organ R.mon (medial octavolateral nucleus) subtracts the self signal, enabling more accurate obstacle and prey detection.

Similar to the striatum context and action architecture explored in this essay, CB-like structures have a context formed by parallel fibers, which encode a multimodal combination of self-action and proprioceptive input, and a primary sensory input, which the context modulates. Also similar to this essay’s striatum model, plasticity is anti-Hebbian: repeated activation is suppressed.
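The self-motion cancellation can be illustrated with a one-weight adaptive filter. This uses a plain delta rule (LMS) as a stand-in for the anti-Hebbian negative image, which is a simplification and not a model of the actual R.mon circuit. The function name, the single scalar weight, and the learning rate are all assumptions.

```python
def cancel_self_motion(samples, lr: float = 0.1) -> list:
    """Subtract the learned self-generated component from a sensor stream.

    samples: iterable of (efference_copy, sensor) pairs
    Returns the residual signal after subtracting the current prediction.
    """
    w = 0.0  # learned gain from efference copy to predicted sensor signal
    residuals = []
    for copy, sensor in samples:
        residual = sensor - w * copy   # what the prediction fails to explain
        w += lr * residual * copy      # grow the negative image of the self signal
        residuals.append(residual)
    return residuals
```

If swimming alone always produces a sensor value of 2.0 for an efference copy of 1.0, the residual decays toward zero over repeated samples. When an external source later adds 1.0 to the sensor, roughly that 1.0 passes through, since only the predictable self-generated part has been erased.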

There are major differences between the striatum and CB-like functionality, of course. CB-like structures form an adaptive filter to produce a more useful signal, while the striatum timeout in this essay avoids repeating search areas, and it’s hard to find any commonality in those two functions other than the very general avoidance of repetition.

Possible cortex enhancements

Because the simulation is a toy model, it hides the noise and signal problems. Odors in particular are difficult and messy sensor input because odors in water are clumpy, not the clean odor gradients and neighborhoods of the model. Real odor signals will appear and disappear, and neuron signals are generally short, between 3ms for fast AMPA receptors and 100ms for NMDA receptors, but an odor neighborhood signal needs to be sustained much longer, on the order of several seconds.

Cortical pyramidal neurons can enter a sustained ADP (afterdepolarization) mode lasting on the order of 6-8 seconds when ACh (acetylcholine) activates mACh.q (Gq coupled acetylcholine receptor). This sustained activity could stretch an odor neighborhood signal across the gaps in a spotty odor receptor signal. A proto-vertebrate improvement could use a simple proto-cortical circuit as short term memory.
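The gap-bridging idea can be sketched as a hold timer on a binary detection stream. This is a hypothetical illustration: the function name is made up, time is in discrete ticks, and `hold_ticks` would be chosen so that the hold matches the 6-8 second ADP duration at whatever tick length a simulation uses.

```python
def sustain(detections, hold_ticks: int) -> list:
    """Keep a spotty binary signal on for hold_ticks after each detection."""
    out = []
    remaining = 0  # ticks of sustained activity left after the last detection
    for detected in detections:
        if detected:
            remaining = hold_ticks  # each detection restarts the hold
            out.append(True)
        elif remaining > 0:
            remaining -= 1
            out.append(True)
        else:
            out.append(False)
    return out
```

With detections at ticks 0 and 5 only, a hold of 3 ticks leaves a one-tick gap at tick 4, while a hold of 4 ticks produces a continuous neighborhood signal.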

A more complicated improvement is associative pattern recognition. The mosaic striatum model can detect simple patterns, but it can’t generalize, and may be susceptible to noise and distractor signals. Typically, an odor scene will have multiple odor molecules, unlike the simulation’s simplified model. Cortex circuits could filter the noisy inputs and produce a more reliable input to the striatum, replacing direct Ob input with more conceptual O.pir (piriform cortex) engrams.

Odor and place

Although odors can form neighborhoods, they aren’t necessarily precise or reliable. One complicated improvement is dedicated cortical regions devoted to identifying place. O.pir is the main olfactory cortex in mammals, with homologous olfactory cortices in other vertebrates. In mammals, O.pir.a (anterior O.pir) identifies odors, and O.pir.p (posterior O.pir) detects place [Poo C et al 2022]. The neuronal connectivity of O.pir.p resembles parts of E.hc, which is well-known to encode place.

Consider an evolutionary sequence of improvements that starts from a simple striatum mosaic that detects odor neighborhoods, then improves that primitive place detection with more sophisticated cortical place in O.pir.p. That single-mode odor place detection could then combine with other sensory modes using egocentric and allocentric inputs like head direction and landmarks into a more reliable place detection in E.hc.

References

Al-Hasani R, McCall JG, Shin G, Gomez AM, Schmitz GP, Bernardi JM, Pyo CO, Park SI, Marcinkiewcz CM, Crowley NA, Krashes MJ, Lowell BB, Kash TL, Rogers JA, Bruchas MR. Distinct Subpopulations of Nucleus Accumbens Dynorphin Neurons Drive Aversion and Reward. Neuron. 2015 Sep 2;87(5):1063-77. 

Beauséjour PA, Auclair F, Daghfous G, Ngovandan C, Veilleux D, Zielinski B, Dubuc R. Dopaminergic modulation of olfactory-evoked motor output in sea lampreys (Petromyzon marinus L.). J Comp Neurol. 2020 Jan 1;528(1):114-134.

Beauséjour PA, Zielinski B, Dubuc R. Olfactory-induced locomotion in lampreys. Cell Tissue Res. 2022:1-15.

Beauséjour PA, Veilleux JC, Condamine S, Zielinski BS, Dubuc R. Olfactory Projections to Locomotor Control Centers in the Sea Lamprey. Int J Mol Sci. 2024 Aug 29;25(17):9370. 

Bell CC, Han V, Sawtell NB. Cerebellum-like structures and their implications for cerebellar function. Annu Rev Neurosci. 2008;31:1-24.

Berridge KC. Affective valence in the brain: modules or modes? Nat Rev Neurosci. 2019 Apr;20(4):225-234. 

Bond CW, Trinko R, Foscue E, Furman K, Groman SM, Taylor JR, DiLeone RJ. Medial Nucleus Accumbens Projections to the Ventral Tegmental Area Control Food Consumption. J Neurosci. 2020 Jun 10;40(24):4727-4738. 

Britt JP, Benaliouad F, McDevitt RA, Stuber GD, Wise RA, Bonci A. Synaptic and behavioral profile of multiple glutamatergic inputs to the nucleus accumbens. Neuron. 2012 Nov 21;76(4):790-803. 

Castro DC, Cole SL, Berridge KC. Lateral hypothalamus, nucleus accumbens, and ventral pallidum roles in eating and hunger: interactions between homeostatic and reward circuitry. Front Syst Neurosci. 2015 Jun 15;9:90.

Chan ICW, Chen N, Hernandez J, Meltzer H, Park A, Stahl A. Future avenues in Drosophila mushroom body research. Learn Mem. 2024;31(5):a053863.

Chen, G., Lai, S., Bao, G., Ke, J., Meng, X., Lu, S., Wu, X., Xu, H., Wu, F., Xu, Y. and Xu, F., 2023. Distinct reward processing by subregions of the nucleus accumbens. Cell reports, 42(2).

Chen R, Blosser TR, Djekidel MN, Hao J, Bhattacherjee A, Chen W, Tuesta LM, Zhuang X, Zhang Y. Decoding molecular and cellular heterogeneity of mouse nucleus accumbens. Nat Neurosci. 2021 Dec;24(12):1757-1771.

Da Cunha C, Wietzikoski EC, Dombrowski P, Bortolanza M, Santos LM, Boschen SL, Miyoshi E. Learning processing in the basal ganglia: a mosaic of broken mirrors. Behav Brain Res. 2009 Apr 12;199(1):157-70.

Dale N. Delayed production of adenosine underlies temporal modulation of swimming in frog embryo. J Physiol. 1998 Aug 15;511(Pt 1):265-72.

D’Ascenzo M, Fellin T, Terunuma M, Revilla-Sanchez R, Meaney DF, Auberson YP, Moss SJ, Haydon PG. mGluR5 stimulates gliotransmission in the nucleus accumbens. Proc Natl Acad Sci U S A. 2007 Feb 6;104(6):1995-2000. 

D’Aquila, PS, 2024. Licking microstructure in response to novel rewards, reward devaluation and dopamine antagonists: possible role of D1 and D2 medium spiny neurons in the nucleus accumbens. Neuroscience & Biobehavioral Reviews, p.105861.

Derjean D, Moussaddy A, Atallah E, St-Pierre M, Auclair F, Chang S, Ren X, Zielinski B, Dubuc R. A novel neural substrate for the transformation of olfactory inputs into motor output. PLoS Biol. 2010 Dec 21;8(12):e1000567. 

Ding YD, Chen X, Chen ZB, Li L, Li XY, Castellanos FX, Bai TJ, Bo QJ, Cao J, Chang ZK, Chen GM, Chen NX, Chen W, Cheng C, Cheng YQ, Cui XL, Duan J, Fang YR, Gong QY, Hou ZH, Hu L, Kuang L, Li F, Li HX, Li KM, Li T, Liu YS, Liu ZN, Long YC, Lu B, Luo QH, Meng HQ, Peng DH, Qiu HT, Qiu J, Shen YD, Shi YS, Si TM, Tang YQ, Wang CY, Wang F, Wang K, Wang L, Wang X, Wang Y, Wang YW, Wu XP, Wu XR, Xie CM, Xie GR, Xie HY, Xie P, Xu XF, Yang H, Yang J, Yao JS, Yao SQ, Yin YY, Yuan YG, Zang YF, Zhang AX, Zhang H, Zhang KR, Zhang L, Zhang ZJ, Zhao JP, Zhou RB, Zhou YT, Zhu JJ, Zhu ZC, Zou CJ, Zuo XN, Yan CG, Guo WB. Reduced nucleus accumbens functional connectivity in reward network and default mode network in patients with recurrent major depressive disorder. Transl Psychiatry. 2022 Jun 6;12(1):236. 

Dobrovitsky V, West MO, Horvitz JC. The role of the nucleus accumbens in learned approach behavior diminishes with training. Eur J Neurosci. 2019 Nov;50(9):3403-3415. 

Domingues, A.V., Carvalho, T.T., Martins, G.J., Correia, R., Coimbra, B., Bastos-Gonçalves, R., Wezik, M., Gaspar, R., Pinto, L., Sousa, N. and Costa, R.M., 2025. Dynamic representation of appetitive and aversive stimuli in nucleus accumbens shell D1-and D2-medium spiny neurons. Nature communications, 16(1), p.59.

Faget L, Oriol L, Lee WC, Zell V, Sargent C, Flores A, Hollon NG, Ramanathan D, Hnasko TS. Ventral pallidum GABA and glutamate neurons drive approach and avoidance through distinct modulation of VTA cell types. Nat Commun. 2024 May 18;15(1):4233. 

Fang, L.Z. and Creed, M.C., 2024. Updating the striatal–pallidal wiring diagram. Nature neuroscience, 27(1), pp.15-27.

Farris, S.M., 2011. Are mushroom bodies cerebellum-like structures?. Arthropod structure & development, 40(4), pp.368-379.

Fee MS. Oculomotor learning revisited: a model of reinforcement learning in the basal ganglia incorporating an efference copy of motor actions. Front Neural Circuits. 2012 Jun 27;6:38. 

Fee MS. The role of efference copy in striatal learning. Curr Opin Neurobiol. 2014 Apr;25:194-200. 

Fisher, A.A., Gonzalez, L.S., Cappel, Z.R., Grover, K.E., Waclaw, R.R. and Robinson, J.E., 2025. Dopaminergic encoding of future defensive actions in the mouse nucleus accumbens. PNAS nexus, 4(5), p.pgaf128.

Floresco SB. The nucleus accumbens: an interface between cognition, emotion, and action. Annu Rev Psychol. 2015 Jan 3;66:25-52.

Hook, V., Toneff, T., Baylon, S. and Sei, C., 2008. Differential activation of enkephalin, galanin, somatostatin, NPY, and VIP neuropeptide production by stimulators of protein kinases A and C in neuroendocrine chromaffin cells. Neuropeptides, 42(5-6), pp.503-511.

Humphries MD, Prescott TJ. The ventral basal ganglia, a selection mechanism at the crossroads of space, strategy, and reward. Prog Neurobiol. 2010 Apr;90(4):385-417. 

Jacobs L. F. (2012). From chemotaxis to the cognitive map: the function of olfaction. Proc. Natl. Acad. Sci. U.S.A. 109(Suppl. 1) 10693–10700

Jacobs LF. How the evolution of air breathing shaped hippocampal function. Philos Trans R Soc Lond B Biol Sci. 2022 Feb 14;377(1844):20200532. 

Kang S, Hong SI, Lee J, Peyton L, Baker M, Choi S, Kim H, Chang SY, Choi DS. Activation of Astrocytes in the Dorsomedial Striatum Facilitates Transition From Habitual to Goal-Directed Reward-Seeking Behavior. Biol Psychiatry. 2020 Nov 15;88(10):797-808.

Konradi, C., Macı́as, W., Dudman, J.T. and Carlson, R.R., 2003. Striatal proenkephalin gene induction: coordinated regulation by cyclic AMP and calcium pathways. Molecular brain research, 115(2), pp.157-161.

Kornfeld, J., Januszewski, M., Schubert, P., Jain, V., Denk, W. and Fee, M.S., 2020. An anatomical substrate of credit assignment in reinforcement learning. BioRxiv, pp.2020-02.

Lahiri AK, Bevan MD. Dopaminergic Transmission Rapidly and Persistently Enhances Excitability of D1 Receptor-Expressing Striatal Projection Neurons. Neuron. 2020 Apr 22;106(2):277-290.e6. 

Ma L, Day-Cooney J, Benavides OJ, Muniak MA, Qin M, Ding JB, Mao T, Zhong H. Locomotion activates PKA through dopamine and adenosine in striatal neurons. Nature. 2022 Nov;611(7937):762-768. 

Marin AC, Schaefer AT, Ackels T. Spatial information from the odour environment in mammalian olfaction. Cell Tissue Res. 2021 Jan;383(1):473-483. 

Marinescu AM, Labouesse MA. The nucleus accumbens shell: a neural hub at the interface of homeostatic and hedonic feeding. Front Neurosci. 2024 Jul 30;18:1437210. 

Montgomery, John C., David Bodznick, and Kara E. Yopak. The cerebellum and cerebellum-like structures of cartilaginous fishes. Brain Behavior and Evolution 80.2 (2012): 152-165.

O’Donnell P, Grace AA. Synaptic interactions among excitatory afferents to nucleus accumbens neurons: hippocampal gating of prefrontal cortical input. J Neurosci. 1995 May;15(5 Pt 1):3622-39.

Poo C, Agarwal G, Bonacchi N, Mainen ZF. Spatial maps in piriform cortex during olfactory navigation. Nature. 2022 Jan;601(7894):595-599.

Ramirez F, Moscarello JM, LeDoux JE, Sears RM. Active avoidance requires a serial basal amygdala to nucleus accumbens shell circuit. J Neurosci. 2015 Feb 25;35(8):3470-7. 

Reed SJ, Lafferty CK, Mendoza JA, Yang AK, Davidson TJ, Grosenick L, Deisseroth K, Britt JP. Coordinated Reductions in Excitatory Input to the Nucleus Accumbens Underlie Food Consumption. Neuron. 2018 Sep 19;99(6):1260-1273.e4. 

Sesack SR, Grace AA. Cortico-Basal Ganglia reward network: microcircuitry. Neuropsychopharmacology. 2010 Jan;35(1):27-47.

Soares-Cunha C, de Vasconcelos NAP, Coimbra B, Domingues AV, Silva JM, Loureiro-Campos E, Gaspar R, Sotiropoulos I, Sousa N, Rodrigues AJ. Nucleus accumbens medium spiny neurons subtypes signal both reward and aversion. Mol Psychiatry. 2020 Dec;25(12):3241-3255. 

Suryanarayana, S. M., Perez-Fernandez, J., Robertson, B., & Grillner, S. (2021). Olfaction in lamprey pallium revisited—dual projections of mitral and tufted cells. Cell Reports, 34(1).

Walle R, Petitbon A, Fois GR, Varin C, Montalban E, Hardt L, Contini A, Angelo MF, Potier M, Ortole R, Oummadi A, De Smedt-Peyrusse V, Adan RA, Giros B, Chaouloff F, Ferreira G, de Kerchove d’Exaerde A, Ducrocq F, Georges F, Trifilieff P. Nucleus accumbens D1- and D2-expressing neurons control the balance between feeding and activity-mediated energy expenditure. Nat Commun. 2024 Mar 21;15(1):2543. 

Weiss, L., 2020. Information processing in the olfactory system of different amphibian species (Doctoral dissertation, Dissertation, Göttingen, Georg-August Universität, 2020).

Wilson, R.I. and Mainen, Z.F., 2006. Early events in olfactory processing. Annu. Rev. Neurosci. 29(1), pp.163-201.

Yu X, Taylor AMW, Nagai J, Golshani P, Evans CJ, Coppola G, Khakh BS. Reducing Astrocyte Calcium Signaling In Vivo Alters Striatal Microcircuits and Causes Repetitive Behavior. Neuron. 2018 Sep 19;99(6):1170-1187.e9.

Essay 22 issues: subthalamic nucleus simulation

The essay 22 simulation explored a striatum model where two decision paths competed: odor seeking vs random exploration, using dopamine to bias between exploration and seeking. This model resembled striatum theories like [Bariselli et al. 2019] that consider the striatum’s direct and indirect paths as competing between approach and avoidant actions.

Issues in essay 22 include both neuroscience divergence and simulation problems. Although the simulation is a loose functional model, that laxity isn’t infinite and it may have gone too far from the neuroscience.

Adenosine and perseveration

Seeking and foraging have a perseveration problem: the animal must eventually give up on a failed cue, or it will remain stuck forever. The give-up circuit in essay 22 uses the lateral habenula (Hb.l) to integrate search time until it reaches a threshold to give up. An alternative circuit in the striatum itself involves the indirect path (S.d2), the D2 dopamine receptor and adenosine, with a behaviorally relevant time scale.

When fast neurotransmitters are on the order of 10 milliseconds, creating a timeout on the order of a few minutes is a challenge. Two possible solutions in that timescale are long term potentiation (LTP) where “long” means about 20 minutes, and astrocyte calcium accumulation, which is also about 10 to 20 minutes.

Adenosine receptors (A2r) in the striatum indirect path (S.d2) measure broad neural activity from ATP byproducts that accumulate in the intercellular space. Over 10 minutes, those A2r can enhance the indirect path, either by raising internal calcium ion (Ca) in the astrocytes or via LTP. Enhancing the indirect path (exploration) eventually causes a switch from the direct path (seeking) to exploration, essentially giving up on the seeking.
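As a rough illustration of that give-up timer, here is a minimal Rust sketch of a slow accumulator that stands in for adenosine. It is not the essay 22 simulation code. The names (`AdenosineTimer`, `Mode`) and every constant are assumptions chosen for illustration; if one tick is one second, the timeout is roughly 400 seconds.

```rust
/// Which striatal path currently controls behavior.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Seek,    // direct path (seeking)
    Explore, // indirect path, S.d2 (exploration)
}

/// Slow accumulator standing in for adenosine acting on A2r.
/// All constants are illustrative assumptions, not measured values.
struct AdenosineTimer {
    level: f32,         // accumulated adenosine, kept in [0, 1]
    seek_enabled: bool, // false after giving up, until the level recovers
}

impl AdenosineTimer {
    const RISE: f32 = 0.002; // gain per tick while seeking
    const DECAY: f32 = 0.001; // clearance per tick while not seeking
    const GIVE_UP: f32 = 0.8; // level that switches seek to explore
    const RESUME: f32 = 0.2; // level that re-enables seeking

    fn new() -> Self {
        Self { level: 0.0, seek_enabled: true }
    }

    /// Advance one simulation tick and return the selected mode.
    fn tick(&mut self, odor_present: bool) -> Mode {
        let seeking = odor_present && self.seek_enabled;
        if seeking {
            self.level = (self.level + Self::RISE).min(1.0);
        } else {
            self.level = (self.level - Self::DECAY).max(0.0);
        }
        // Hysteresis: give up at a high level, resume only after recovery.
        if self.level >= Self::GIVE_UP {
            self.seek_enabled = false;
        } else if self.level <= Self::RESUME {
            self.seek_enabled = true;
        }
        if odor_present && self.seek_enabled {
            Mode::Seek
        } else {
            Mode::Explore
        }
    }
}
```

The two thresholds give hysteresis, so the animal doesn’t flip back to seeking on the tick after it gives up. A single threshold would oscillate at the boundary.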

Ventral striatum

Although the essay models the dorsal striatum (S.d), the ventral striatum (S.v aka nucleus accumbens) is more associated with exploration and food seeking. In particular, the olfactory path for food seeking goes through S.v, while midbrain motor actions use S.d. In salamanders, the striatum only processes midbrain (“collo-”) thalamic inputs, while olfactory and direct senses (“lemno-”) go to the cortex [Butler 2008]. Assuming the salamander path is more primitive, the essay’s use of S.d in the model is a likely mistake.

But S.v raises a new issue because S.v doesn’t use the subthalamus (H.stn) [Humphries and Prescott 2010], although that model only applies to the S.v shell (S.sh), not the S.v core (S.core).

Ventral striatum pathway. MLR midbrain locomotive region, P.v ventral pallidum, S.sh ventral striatum shell, Vta ventral tegmental area.

In the above diagram of a striatum shell circuit, an odor-seek path is possible through the ventral tegmental area (Vta) but there is no space for an alternate explore path.

Low dopamine and perseveration

[Rutledge et al. 2009] investigates dopamine in the context of Parkinson’s disease (PD), which exhibits perseveration as a symptom. PD is a low dopamine condition, and adding dopamine resolves the perseveration. That result is the opposite of essay 22’s dopamine model, where low dopamine resolved perseveration.

Now, it’s possible that give-up perseveration and Parkinson’s perseveration are two different symptoms, or it’s possible that the complete absence of dopamine differs from low tonic dopamine, but in either case, the essay 22 model is too simple to explain the striatum’s dopamine use.

Dopamine burst vs tonic

Dopamine in the striatum has two modes: phasic (burst) and tonic. Essay 22 uses tonic dopamine, not phasic. The striatum uses phasic dopamine to switch attention to orient to a new salient stimulus. The phasic dopamine circuit is more complicated than the tonic system because it requires coordination with acetylcholine (ACh) from the midbrain laterodorsal tegmentum (V.ldt) and pedunculopontine (V.ppt) nuclei.

A question for the essays is whether that phasic burst is primitive to the striatum, or a later addition, possibly adding an interrupt for orientation to an earlier non-interruptible striatum.

Explore semantics

The word “explore” is used differently in behavioral ecology and in reinforcement learning, despite both using foraging-like tasks. These essays have been using explore in the behavioral ecology sense, which may cause confusion with the reinforcement learning sense. The difference centers on a fixed strategy (policy) compared with changing strategies.

In behavioral ecology, foraging is literal foraging, animals browsing or hunting in a place and moving on (giving up) if the place doesn’t have food [Owen-Smith et al. 2010]. “Exploring” is moving on from an unproductive place, but the policy (strategy) remains constant because moving on is part of the strategy. The policy for when to stay and when to go [Headon et al. 1982] often follows the marginal value theorem [Charnov 1976], which specifies when the animal should move on.
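To make the marginal value theorem concrete, here is a small Rust sketch that finds the give-up time by grid search. The theorem says the animal should leave a patch when its instantaneous gain rate falls to the long-run average rate, which is the point that maximizes gain(t) / (travel + t). The saturating gain curve and its constants are assumptions for illustration, not taken from [Charnov 1976].

```rust
/// Cumulative food gained after `t` seconds in a patch.
/// Assumed diminishing-returns curve; both constants are illustrative.
fn gain(t: f64) -> f64 {
    const MAX_FOOD: f64 = 10.0;
    const DEPLETION_RATE: f64 = 0.1;
    MAX_FOOD * (1.0 - (-DEPLETION_RATE * t).exp())
}

/// Patch residence time that maximizes the long-run intake rate
/// gain(t) / (travel + t), found by a grid search in 0.1 s steps.
fn best_leave_time(travel: f64) -> f64 {
    let mut best_t = 0.0;
    let mut best_rate = 0.0;
    for i in 1..=2000 {
        let t = i as f64 * 0.1;
        let rate = gain(t) / (travel + t);
        if rate > best_rate {
            best_rate = rate;
            best_t = t;
        }
    }
    best_t
}
```

With this curve, a longer travel time between patches gives a longer stay in each patch. The policy itself never changes, which is the behavioral ecology sense of moving on.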

In contrast, reinforcement learning (RL) uses “explore” to mean changing the policy (strategy). For example, in a two-armed bandit situation (two slot machines), the RL policy is either using machine A or using machine B, or a fixed probabilistic ratio, not a timeout and give-up policy. In that context, exploring means changing the policy, not merely switching machines.

[Kacelnik et al. 2011] points out that the two-choice economic model doesn’t match vertebrate animal behavior, because vertebrates use an accept-reject decision [Cisek and Hayden 2022]. So, while the two-armed bandit may be useful in economics, it’s not a natural decision model for vertebrates.

Avoidance (nicotinic receptors in M.ip)

The simulation uncovered a foraging problem, where the animal remained around an odor patch it had given up on, because the give-up strategy reverts to random search. Instead, the animal should leave the current place and only resume search when it’s far away.

Path of simulated animal after giving up on a food odor.

In the diagram above, the animal remains near the abandoned food odor. The tight circles are the earlier seek before giving up, and the random path afterwards is the continued search. A better strategy would leave the green odor plume and explore other areas of the space.

As a possible circuit, the medial habenula (Hb.m) projection to the interpeduncular nucleus (M.ip) uses both glutamate and ACh as neurotransmitters, where ACh amplifies neural output. For low signals without ACh, the animal approaches the object, but high signals with ACh switch approach to avoidance. This avoidance switching is managed by the nicotinic receptor (nAChR), which is studied for nicotine addiction [Lee et al. 2019].
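A toy Rust sketch of that switch is below, purely as a thought experiment. The function name `m_ip_drive` and all three constants are hypothetical. The only claim carried over from the text is that ACh appears with high signals and amplifies the output enough to flip approach to avoidance.

```rust
#[derive(Debug, PartialEq)]
enum Drive {
    Approach,
    Avoid,
}

/// Hypothetical Hb.m signal level above which ACh is co-released.
const ACH_RELEASE_THRESHOLD: f32 = 0.5;
/// Hypothetical amplification of the output when ACh is present.
const ACH_GAIN: f32 = 3.0;
/// Hypothetical output level that flips approach to avoidance.
const AVOID_THRESHOLD: f32 = 1.0;

/// Map an Hb.m signal in [0, 1] to an approach or avoid drive.
fn m_ip_drive(hb_m_signal: f32) -> Drive {
    // Glutamate carries the base signal at every level.
    let glutamate = hb_m_signal;
    // ACh only joins for strong signals, where it amplifies the output.
    let output = if hb_m_signal > ACH_RELEASE_THRESHOLD {
        glutamate * ACH_GAIN
    } else {
        glutamate
    };
    if output > AVOID_THRESHOLD {
        Drive::Avoid
    } else {
        Drive::Approach
    }
}
```

In a foraging simulation, the give-up signal could feed `hb_m_signal` so that an abandoned plume becomes something to leave instead of something to wander around.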

An interesting future essay might explore using nicotinic aversion to improve foraging by leaving an abandoned odor plume.

References

Bariselli S, Fobbs WC, Creed MC, Kravitz AV. A competitive model for striatal action selection. Brain Res. 2019 Jun 15;1713:70-79.

Butler, Ann. (2008). Evolution of the thalamus: A morphological and functional review. Thalamus & Related Systems. 4. 35 – 58.

Charnov, Eric L. Optimal foraging, the marginal value theorem. Theoretical population biology 9.2 (1976): 129-136.

Cisek P, Hayden BY. Neuroscience needs evolution. Philos Trans R Soc Lond B Biol Sci. 2022 Feb 14;377(1844):20200518.

Headon T, Jones M, Simonon P, Strummer J (1982) Should I Stay or Should I Go. On Combat Rock. CBS Epic.

Humphries MD, Prescott TJ. The ventral basal ganglia, a selection mechanism at the crossroads of space, strategy, and reward. Prog Neurobiol. 2010 Apr;90(4):385-417.

Kacelnik A, Vasconcelos M, Monteiro T, Aw J. 2011. Darwin’s ‘tug-of-war’ vs. starlings’ ‘horse-racing’: how adaptations for sequential encounters drive simultaneous choice. Behav. Ecol. Sociobiol. 65, 547-558.

Lee HW, Yang SH, Kim JY, Kim H. The Role of the Medial Habenula Cholinergic System in Addiction and Emotion-Associated Behaviors. Front Psychiatry. 2019 Feb 28

Owen-Smith N, Fryxell JM, Merrill EH. Foraging theory upscaled: the behavioural ecology of herbivore movement. Philos Trans R Soc Lond B Biol Sci. 2010 Jul 27;365(1550):2267-78. 

Rutledge RB, Lazzaro SC, Lau B, Myers CE, Gluck MA, Glimcher PW. Dopaminergic drugs modulate learning rates and perseveration in Parkinson’s patients in a dynamic foraging task. J Neurosci. 2009 Dec 2

15: Seeking Food: Perseveration and Habituation

Essay 15 adds food-seeking to the simulated slug. Before the change in essay 14, the slug didn’t seek from a distance, but it does slow when it’s above food to improve feeding efficiency. The slug doesn’t have food-approach behavior, but it does have consummatory behavior. Because the slug doesn’t seek food, it only finds food when it randomly crosses a food tile. Most of its movement is random, except for avoiding obstacles.

In the screenshot above, the slug is moving forward with no food senses and no food approach. Its turns are for obstacle avoidance. The food squares have higher visitation because of the slower movement over food. Notice that all areas are visited, although there is a statistical variation because of the obstacle.

Although the world is tiled for simulation simplicity, the slug’s direction, location, and movement are floating-point based. The simulation isn’t an integer world. This continuous model means that timing and turn radius matter.

The turning radius affects behavior in combination with timing, like the movement-persistence circuit in essay 14. The tuning affects the heat map. Some turning choices result in the animal spending more time turning in corners when the dopamine runs out. This turn-radius dependence occurs in animals as well. The larval zebrafish has 13 stereotyped basic movements, and its turns have different stereotyped angles depending on the activity.

Food approach

Food seeking adds complexity to the neural circuit, but it still uses a Braitenberg vehicle architecture. (See the discussion on odor tracking for avoided complexity.) Odor sensors stimulate muscles in uncrossed connections to approach the odor. Touch sensors use crossed connections to avoid obstacles. For simultaneous odor and touch, an additional command neuron layer resolves the conflict to favor touch.

Circuit for obstacle avoidance and food approach for simulated slug.
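The wiring can be sketched in a few lines of Rust. This is an illustrative reduction, not the simulation’s actual code. `Pair` and `muscle_drive` are hypothetical names, and the sketch assumes that contracting the left muscle turns the slug left.

```rust
/// Left/right sensor or muscle activations in [0, 1].
#[derive(Debug, Clone, Copy, PartialEq)]
struct Pair {
    left: f32,
    right: f32,
}

/// Command layer: touch overrides odor whenever a touch sensor is active.
/// Assumes contracting the left muscle turns the animal left.
fn muscle_drive(odor: Pair, touch: Pair) -> Pair {
    if touch.left > 0.0 || touch.right > 0.0 {
        // Crossed: left touch drives the right muscle, turning away.
        Pair { left: touch.right, right: touch.left }
    } else {
        // Uncrossed: left odor drives the left muscle, turning toward it.
        Pair { left: odor.left, right: odor.right }
    }
}
```

The conflict rule here is a hard override. A weighted blend of the two inputs would also fit the description, as long as touch dominates.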

The command neurons correspond to vertebrate reticulospinal neurons (B.rs) in the hindbrain. Interestingly, the zebrafish circuit does seem to have direct connections from the touch sensors to B.rs neurons, exactly as pictured. In contrast, the path from odor receptors to B.rs neurons is a longer, more complicated path.

For the slug’s evolutionary parallel, the odor’s attractiveness is hardcoded, as if evolution has selected an odor that leads to food. Even single-celled animals follow attractive chemicals and avoid repelling chemicals, and in mammals some odors are hardcoded as attractive or repelling. For now, no learning is occurring.

Perseveration (tunnel vision)

Unfortunately, this simple food circuit has an immediate, possibly fatal, problem. Once the slug detects an odor, it’s stuck moving toward it because our circuit can’t break the attraction. Although the slug never stops, it orbits the food scent, always turning toward it. The hoped-for improvement of following an odor to find food is a disaster.

In psychology, this inability to switch away is called perseveration, which is similar to tunnel vision but more pathological. Once a goal is started, the person can’t break away. In reinforcement learning terminology, the inability to switch is like an animal stuck on exploiting and incapable of breaking away to explore.

In the screenshot, the heat map shows the slug stuck on a single food tile. The graph shows the slug turning counter-clockwise, slowed down (sporadically arrested) over the food.

To solve the problem, one option is to re-enable satiation for the simulation, as was added in essay 14, but satiation only solves the problem if the tile has enough food to satiate the animal. Unfortunately, the tile might have a lingering odor but no food, or possibly an evolutionary food odor that’s unreliable, only signaling food 25% of the time. Instead, we’ll introduce habituation: the slug will become bored of the odor and start ignoring it for a time.

In the fruit fly, the timescale for habituation is about 20 minutes to build up and 20-40 minutes to recover. The exact timing is likely adjustable because of the huge variability in biochemical receptors. So, 20 minutes probably isn’t a hard constant across different animals or circuits for habituation, but more of a range from a few minutes to an hour or so.

Because the minutes-to-hour timescale for habituation is much longer than the 5ms to 2s range for neurotransmitters, the biochemical implementation is very different. Habituation seems to occur by adding receptors and/or adding more neurotransmitter generators to produce a bigger signal.

Fruit fly odor habituation

[Das 2011] studies the biochemical circuit for fruit fly odor habituation. The following diagram tries to capture the essence of the circuit. The main, unhabituated connection is from the olfactory receptor neuron (ORN) to the projection neuron (PN), which projects to the mushroom body. The main fast neurotransmitter for insects is acetylcholine (ACh), represented by beige.

The key player in the circuit is the NMDA receptor on the PN neuron’s dendrite, which works with the inhibitory LN1 GABA neuron to increase inhibition over time to habituate the odor.

The LN1 neuron drives habituation. Its GABA neurotransmitter release inhibits PN, which reduces the olfactory signal. Because LN1 itself can be inhibited, this circuit allows for a quick reversal of habituation. Habituation adds to simple inhibition by increasing the synapse strength (weight) over time when it’s used and decreasing the weight when it’s idle.

An NMDA receptor needs both a chemical stimulus (glutamate and glycine) and a voltage stimulus (post-synaptic activation, PN in this case). When activated, it triggers a long biochemical chain with many genetic variations to change synapse weight. In this case, it triggers a retrograde neurotransmitter (such as nitric oxide, NO) to the pre-synaptic LN1 axon, directing it to add new GABA release vesicles. Because adding new vesicles takes time (20 minutes) and removing the vesicles also takes time (20-40 minutes), habituation adds longer, useful behavior that will help solve the odor perseveration problem.
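The coincidence rule and the slow weight change can be sketched as follows. This is a loose functional sketch in Rust, not a model from [Das 2011]. `Ln1Synapse` is a hypothetical name, the rates are assumptions picked to match the roughly 20 minute build-up and 20-40 minute recovery, and the sketch assumes LN1 is driven by the same odor input as PN.

```rust
/// Inhibitory LN1-to-PN synapse whose weight changes over minutes.
struct Ln1Synapse {
    weight: f32, // 0 = no habituation, 1 = odor fully suppressed
}

impl Ln1Synapse {
    // Assumed rates: about 20 min to build, about 30 min to recover.
    const BUILD_PER_MIN: f32 = 1.0 / 20.0;
    const RECOVER_PER_MIN: f32 = 1.0 / 30.0;

    fn new() -> Self {
        Self { weight: 0.0 }
    }

    /// Advance one minute with ORN input `orn` in [0, 1].
    /// Returns the PN output after LN1 inhibition.
    fn step_minute(&mut self, orn: f32) -> f32 {
        // Assumes LN1 is driven by the same odor input as PN.
        let pn = (orn * (1.0 - self.weight)).max(0.0);
        // NMDA-like coincidence: presynaptic input and PN activity together.
        let coincident = orn > 0.0 && pn > 0.0;
        if coincident {
            self.weight = (self.weight + Self::BUILD_PER_MIN).min(1.0);
        } else {
            self.weight = (self.weight - Self::RECOVER_PER_MIN).max(0.0);
        }
        pn
    }
}
```

Because the weight only grows while PN is still active, suppression in this sketch settles just short of complete instead of growing without bound.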

Slug with habituation

The next slug simulation adds trivial habituation following the example of the fruit fly. The simulation adds a value when the slug senses an odor and decrements the value when the slug doesn’t detect an odor. When habituation crosses a threshold, the odor sensor is cut off from the command neurons. The behavior and heat map look like the following:

As the heat map shows, habituation does solve the fatal problem of perseveration for food approach. The slug visits all the food nodes without having any explicit directive to visit multiple nodes. Adding a simple habituation circuit creates behavior that looks like an explore vs exploit pattern without the simulation being designed around reinforcement learning. Explore vs exploit emerges naturally from the problem itself.

In the screenshot, the bright tile has no intrinsic meaning. Because habituation increases near any food tile, it can only decay when away from food. That bright tile is near a big gap that lets habituation drop, so odor tracking can recharge.

The code looks like the following:

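impl Habituation {
    /// Raise habituation while the food odor is sensed; let it decay otherwise.
    fn update_habituation(&mut self, is_food_sensor: bool) {
        if is_food_sensor {
            self.food = (self.food + Self::INC).min(1.);
        } else {
            self.food = (self.food - Self::DEC).max(0.);
        }
    }

    /// The odor sensor drives the command neurons only while below threshold.
    fn is_active(&self) -> bool {
        self.food < Self::THRESHOLD
    }
}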

Discussion

The essays aren’t designed as solutions; they’re designed as thought experiments. So, their main value is generally the unexpected implementation details or issues that come up in the simulation. For example, how habituation thresholds and timing affect food approach. The bright tile in the last heat map occurs because that food source is isolated from other food sources.

If the slug passes near food sources, the odor will continually recharge habituation and it won’t decay enough to re-enable chemotaxis. The slug needs to be away from food for some time for odor tracking to re-enable. In theory, this behavior could be a problem for an animal. Suppose the odor range is very large and the animal is very slow, like a slug. If simple habituation occurs, the slug might habituate to the odor before it reaches the food, making it give up too soon.

As a possible solution, the LN1 inhibitory neuron that implements habituation could itself be disabled, although that brings back the issue of the animal getting stuck. But perhaps the odor signal could be diminished instead of being cut off, giving the animal persistence without devolving into perseveration.
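One way to picture the diminished variant is a graded gain in place of a hard threshold. The Rust sketch below is hypothetical. `odor_gain` and `MAX_SUPPRESSION` are illustrative names and values, not part of the simulation.

```rust
/// Hypothetical cap on how much habituation can suppress the odor signal.
/// A value below 1.0 always leaves some odor drive.
const MAX_SUPPRESSION: f32 = 0.8;

/// Fraction of the odor signal passed on to the command neurons,
/// given a habituation level in [0, 1].
fn odor_gain(habituation: f32) -> f32 {
    1.0 - MAX_SUPPRESSION * habituation.clamp(0.0, 1.0)
}
```

With this gain, a fully habituated slug still feels a weak pull toward the odor, so touch or random movement can win more easily without the odor being ignored outright.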

Odor vs actual food

Another potential issue is the detection of food itself as opposed to just its odor. If the food tile has food, the animal should have more patience than if the tile is empty with just the food odor. That scenario might explain why other habituation circuits include serotonin or dopamine as modulators. If food is actually present, there should be less habituation.
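As a sketch of that idea, habituation could build more slowly when food is actually detected. The Rust below is a guess at one possible form, not the simulation’s code. `ModulatedHabituation`, `FOOD_DISCOUNT`, and the other constants are assumptions for illustration.

```rust
/// Habituation whose build-up is slowed when food is actually present.
struct ModulatedHabituation {
    level: f32, // habituation level in [0, 1]
}

impl ModulatedHabituation {
    // All three constants are illustrative assumptions.
    const INC: f32 = 0.01; // build-up per tick with odor but no food
    const DEC: f32 = 0.005; // decay per tick without odor
    const FOOD_DISCOUNT: f32 = 0.25; // build-up multiplier when food is present

    fn update(&mut self, odor: bool, food_present: bool) {
        if odor {
            // Food presence stands in for a modulator such as dopamine.
            let inc = if food_present {
                Self::INC * Self::FOOD_DISCOUNT
            } else {
                Self::INC
            };
            self.level = (self.level + inc).min(1.0);
        } else {
            self.level = (self.level - Self::DEC).max(0.0);
        }
    }
}
```

With these values, an empty tile habituates the slug four times faster than a tile with food, which gives the extra patience described above.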

The issue of precise values raises calibration as an issue, because evolution likely can’t precisely calibrate cutoff values or calibrate one neuron to another. Some of the habituation and related synapse adjustment may simply be calibration, adjusting neuron amplification to the system. In a sense, that calibration would be learning but perhaps atypical learning.

References

Das, Sudeshna, et al. “Plasticity of local GABAergic interneurons drives olfactory habituation.” Proceedings of the National Academy of Sciences 108.36 (2011): E646-E654.

Shen Y, Dasgupta S, Navlakha S. Habituation as a neural algorithm for online odor discrimination. Proc Natl Acad Sci U S A. 2020 Jun 2;117(22):12402-12410. doi: 10.1073/pnas.1915252117. Epub 2020 May 19. PMID: 32430320; PMCID: PMC7275754.