Essay 39: Food Zone

H.l (lateral hypothalamus) is a key node in the foraging system and has an interesting capability of distinguishing a food zone from a non-food zone [Jennings et al 2015]. In a sense foraging is searching for a food zone and then eating.

Foraging as a state machine.

The above diagram shows the foraging phases that I’ve already covered in earlier essays. Importantly, each phase is an independent action path as part of a distributed system, not merely a state in a state machine. To force the separate action paths to act like a state machine, each transition needs to suppress the preceding and following state. In particular the eating phase needs to inhibit the seeking system. This lateral inhibition is important because circuitry is required to force activation of only a single system at a time.
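The lateral inhibition described above can be sketched as a small winner-take-all loop. This is only an illustration and not the essay's simulation code: the `ActionPath` class, the `inhibition` and `rate` parameters, and the drive values are all hypothetical. Each path is excited by its own drive and suppressed by the summed activity of the other paths, so only one path stays active.

```python
from dataclasses import dataclass


@dataclass
class ActionPath:
    """One independent action path (hypothetical model, arbitrary units)."""
    name: str
    drive: float           # external input pushing this path to activate
    activity: float = 0.0  # current activation level


def step(paths, inhibition=1.5, rate=0.2):
    """One update: each path moves toward its drive minus lateral inhibition."""
    total = sum(p.activity for p in paths)
    updated = []
    for p in paths:
        others = total - p.activity  # summed activity of competing paths
        target = max(0.0, p.drive - inhibition * others)
        updated.append(p.activity + rate * (target - p.activity))
    for p, a in zip(paths, updated):
        p.activity = a


def settle(paths, steps=200):
    """Run the competition and return the name of the winning path."""
    for _ in range(steps):
        step(paths)
    return max(paths, key=lambda p: p.activity).name


paths = [ActionPath("seek", drive=1.0), ActionPath("eat", drive=0.6)]
winner = settle(paths)  # the path with the stronger drive silences the other
```

With an inhibition gain above 1, the weaker path is driven to zero instead of sharing control, which is the single-active-system behavior the state diagram assumes.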

The food zone is particularly interesting for filter feeding, which is naturally area based and long term, as opposed to snapping up a morsel of food. Non-vertebrate chordates are filter feeders, lamprey larvae are filter feeders, and early jawless vertebrates were also likely filter feeders [D’Aniello et al 2023]. Tunicate ascidians, the closest non-vertebrate chordates, have an extreme version of this foraging loop, where the tadpoles find a feeding place after swimming for 12 hours and then settle in place for their adult life [Anselmi et al 2024]. The ascidian foraging state machine is a straight line that ends in the eating phase in the food zone, not continuing in a loop. The ascidian search and settle might give a hint how the vertebrate foraging circuitry is organized.

Ascidians

As covered in essay 30, the ascidian larva nervous system has several seeking (taxis) systems: geotaxis (gravity avoidance – moving up), phototaxis (light avoidance), and dimming for predator and obstacle avoidance. Ascidian navigation disperses the larva from its parent and prefers to settle on the underside of ledges by avoiding gravity while avoiding light. Its settling sensors also avoid toxic or irritating areas and may try to find food-friendly areas, although the specific sensor capabilities aren’t well known. When the larva finds an appropriate place, around 14 hours after hatching, it settles for life [Hoyer et al 2024].

Functional organization of the ascidian larva navigation and settling circuit.

The above diagram is a functional representation of the ascidian larva navigation brain. For this essay the important part is the palp and food-zone sensor and the settling neurons that inhibit motor neurons. The palps are three tentacle-like protrusions from the larva head, which attach the ascidian to a rock with cement glands [Johnson et al 2024]. They contain chemosensors and mechanosensors that distinguish the settling zone from non-settling zones [Hoyer et al 2024]. Interestingly, the genetic markers for the palp neurons are similar to markers for the vertebrate forebrain.

Head cement glands still exist in some fish larvae [Pottin et al 2010] and most frog tadpoles [Nokhbatolfoghahai and Downie 2005], [Rétaux and Pottin 2011], [Sive and Bradley 1996]. Frog tadpoles will swim up and attach to the underside of leaves or to the water surface. This cement gland and settling system may have existed in the pre-vertebrate ancestor and been shared by tunicates and vertebrates. Unlike the ascidians, the pre-vertebrates likely did not permanently settle. For the sake of this essay, let’s assume they temporarily settled to filter feed in a location and only moved on if filter feeding was unsuccessful or if forced to move by predators, competitors, or environmental hazards.

Ascidian larva navigation and palp settling circuit with the settling circuit highlighted. Each of the boxes represents a single neuron or a small (5-10) group of neurons. Labels are neuron names.

The above diagram shows specific neurons in the ascidian larva brain. The important part here is the connection from the glutamate pnIN (palp interneuron) to the GABA pnRN (palp relay neuron), which inhibits all motor neurons and interneurons. Comparing vertebrate and ascidian neural systems is sketchy and probably should be avoided because both have diverged [Holland 2016]. For this essay, I’ll ignore that sound advice to try to motivate part of the vertebrate nervous system.

H.stn as an analogous node to the settling neurons. H.stn (subthalamic nucleus), MLR (midbrain locomotor region), Ob (olfactory bulb), OT (optic tectum), P.v (ventral pallidum), R.rs (reticulospinal motor command), S.v (ventral striatum), S.nr (substantia nigra pars reticulata), V.pt (posterior tuberculum)

The above diagram shows the H.stn (subthalamic nucleus) as fulfilling a similar role to the pnIN from the ascidian Ciona, suppressing seek in preparation for eating. Part of P.v (ventral pallidum) suppresses S.v (ventral striatum) during eating [Vachez et al 2021]. This P.v “arkypallidal” subset is named after similar neurons in P.ge (globus pallidus) that suppress S.d (dorsal striatum). Although the driver of this eating suppression isn’t known, the timing of the arkypallidal activation closely matches V.dr serotonin food activation [Spring and Nautiyal 2024], ramping at the end of seek and peaking after eating. Also, H.stn and P.ge form an oscillating pair, evident in Parkinson’s disease. So, it’s plausible that H.stn drives persistent suppression of the seek path in S.v through its projection to P.v, possibly influenced or driven by V.dr (dorsal raphe, serotonin). This specific path is speculation but seems compatible with experiments. The second suppression path is the well-known H.stn to S.nr (substantia nigra pars reticulata) projection that suppresses motor activity. S.nr has widespread suppression of MLR (midbrain locomotor region) and R.rs (reticulospinal motor command), and S.nr also suppresses Snc (substantia nigra pars compacta, dopamine). Note that the medial H.stn, the area connected with P.v, merges with H.l with minimal boundary [Haynes and Haber 2013].

Food zone

Let’s return to the H.l food zone in [Jennings et al 2015] and consider where the food zone information might come from. Following [Jacobs 2012], let’s treat olfaction as the central sense for navigation, which is particularly compelling for food zones.

The diagram below shows the H.l main connectivity. Not displayed is the H.l internal sensing of nutrient information, such as glucose sensing and leptin fat sensing. H.l doesn’t receive direct sensory input with the exception of R.pb (parabrachial nucleus), which sends nociceptive information like itch or pain. Because an itchy or painful place is a poor choice for filter feeding, this R.pb input is negative place information for a filter-feeding zone, but R.pb doesn’t give positive reasons to stay like food odors.

H.l connectivity encompasses much of the limbic system, driven by olfactory information. A.bl (basolateral amygdala), E.hc (hippocampus), F.pfc (prefrontal cortex), H.arc (hypothalamus arcuate), H.l (lateral hypothalamus), H.pv (paraventricular hypothalamus), H.stn (subthalamic nucleus), Hb.l (lateral habenula), M.pag (periaqueductal gray), Ob (olfactory bulb), O.pir (piriform cortex), P.bst (bed nucleus of the stria terminalis), P.v (ventral pallidum), R.pb (parabrachial), S.a (central amygdala), S.ls (lateral striatum), S.v (ventral striatum), V.dr (dorsal raphe – serotonin), Vta (ventral tegmental area – dopamine)

As the diagram suggests, the information H.l receives about food sources is very abstract. It receives cue information from A.bl (basolateral amygdala), place information from E.hc (hippocampal complex), value-like information from F.ofc (orbitofrontal cortex) and task-like information from F.vm (ventromedial prefrontal cortex). All of those areas are strongly connected with the olfactory system. While H.l doesn’t receive odor place information directly from sensors, it receives multiple organizational perspectives on odor information. P.bst (bed nucleus of the stria terminalis) receives very similar olfactory input as H.l, and it also receives negative information from R.pb. However, R.pb sends different nociceptive information to the S.a (central amygdala)/P.bst extended amygdala than it sends to H.l [Arthurs et al 2023]. The R.pb projections to H.l compared to S.a/P.bst are not redundant.

Not only are the H.l inputs abstract, but the outputs are also abstract, in contrast to direct action paths. This abstraction might be a later evolutionary development, similar to V.pt (posterior tuberculum) in zebrafish. V.pt is roughly homologous to Vta (ventral tegmental area) in mammals, but V.pt has more direct locomotor output to MLR (midbrain locomotor region), while most of Vta’s output is generally abstract.

As a note, the diagram does not include H.l ox (orexin) or H.l mch (melanin-concentrating hormone), partially for simplicity and partially because the zebrafish H.l is distinct from the ox and mch populations, suggesting that the mammalian ox and mch areas of H.l can be separated from the rest of H.l function. The diagram also omits some other connections like Ppt (pedunculopontine nucleus).

Food and serotonin

Returning to the foraging state diagram, it’s important that each “state” is a large, distributed, complex system, not a state in a state machine. The seek state includes areas like S.v, Vta, H.l, E.hc, F.pfc, and the motor regions MLR and R.rs (reticulospinal motor command) with the help of cortical areas and can include OT (optic tectum). Although the eating state is small, it is still comprised of many areas, including V.dr (dorsal raphe), OT.d, R.my.irt (medulla eating), H.l, H.pstn (parasubthalamic nucleus), R.pb and possibly some Vta and S.v subareas. Although the system is not a state machine, each “state” needs to laterally suppress the other systems to prevent multiple action paths from colliding.

Foraging state machine with dopamine and serotonin modulation. DA (dopamine), V.dr (dorsal raphe), Vta (ventral tegmental area), 5HT (serotonin).

The split between eat and seek is important, because many studies merge the behavior into a general category “feeding.” Because some experiments only measure total feeding, it can be difficult to distinguish whether the experiment is measuring a seek effect or an eating effect. For example, eating needs to suppress seek to keep the animal from wandering away from the food. If an experiment stimulates eat but inhibits seek, the animal might not search for food even if it’s ready to eat. If it doesn’t seek food, it doesn’t find food.

This distinction between eating and seeking is illustrated by the question of serotonin, which has a role in feeding. The serotonin from V.dr is a heterogeneous system, with V.dr having at least 14 different genetic clusters [Okaty et al 2020] and at least 11 different projection patterns [Ren et al 2019]. Earlier studies noted that 30% of V.dr neurons were active during eating [Fornal et al 1996], and many others have noted V.dr being active for “reward” (eating).

Suppose one component of V.dr serotonin encourages eating while discouraging seeking. If an experiment floods the brain with serotonin, it might see total feeding drop because serotonin suppresses seeking food, even if it encourages long meals when it finds food. The confusion becomes greater for studies looking for the even more abstract “reward” as opposed to concrete eating. The point being that serotonin in particular is a complicated system, not reducible to a single value or function.
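The confound can be made concrete with toy arithmetic. The sketch below is purely illustrative: the function names, the linear forms, and every number are assumptions, not measurements. It treats total feeding as the chance of finding food times the meal size once food is found, with serotonin lowering the first and raising the second.

```python
def seek_success(serotonin, base=0.8, seek_suppression=0.75):
    """Chance of finding food; serotonin (0..1) suppresses seeking (assumed)."""
    return base * (1.0 - seek_suppression * serotonin)


def meal_size(serotonin, base=1.0, eat_boost=1.0):
    """Food eaten once found; serotonin (0..1) prolongs the meal (assumed)."""
    return base * (1.0 + eat_boost * serotonin)


def total_feeding(serotonin):
    """What a total-intake experiment would measure in this toy model."""
    return seek_success(serotonin) * meal_size(serotonin)


baseline = total_feeding(0.0)  # normal seeking, normal meals
flooded = total_feeding(1.0)   # weak seeking, long meals
```

In this toy model the flooded animal eats larger meals but less food overall, so an experiment measuring only total feeding would report that serotonin reduces feeding even though it promotes eating.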

Eating related effects of serotonin. DA (dopamine), H.arc (hypothalamus arcuate), H.stn (subthalamic nucleus), P.v (ventral pallidum), S.nr (substantia nigra pars reticulata), V.dr (dorsal raphe), Vta (ventral tegmental area), Vta.g (GABA neurons of Vta), 5HT (serotonin)

The above diagram shows some of the eating-related projections. Only a few of the 14 V.dr subtypes are known. The V.dr to Vta connection is one of the known projections and drives the seek system [Courtiol et al 2021], [Wang HL et al 2019]. Unfortunately, the other projections are not known, in particular the 30% of V.dr that is active while eating [Bromberg-Martin et al 2010].

V.dr enhances satiety with 5HT2c.q (serotonin G-q stimulating receptor) in H.arc POMC satiety neurons, which suppresses the AgRP hunger peptide. Note that AgRP drops just before eating, suggesting that it’s a seek-promoting system, not an eating-promoting system [Bhave and Nectow 2021]. This predictive suppression only occurs after training, and V.dr serotonin shows inverse behavior, possibly suggesting that V.dr suppresses H.arc. Untrained V.dr serotonin only responds after tasting [Li et al 2016], but trained V.dr serotonin responds about 2 seconds before eating [Zhong et al 2017].

Filter feeding and foraging theory

Let’s consider filter feeding using foraging theory. Foraging theory studies how animals browse patches of food, such as a cluster of flowers for a bee or worms in pine cones for birds [Krebs et al 1974] or a hunting spot for a predator. In particular, foraging theory considers how long the animal should stay at a particular patch before deciding to move on: measuring the give up time. A filter-feeding proto-vertebrate needs to decide if the current food rate is good enough to stay at the current food zone.

The MVT (marginal value theorem) suggests that an animal should move on when the food intake rate at the current patch drops below the environment average [Charnov 1976]. MVT has simplifying assumptions that are challenged by the complexity in the world [Pyke 1984], [Wajnberg et al 2006]. MVT assumptions include omniscience, immortality, determinism, no competition, no predation, and no hunger. Some of those complexities are important to the essay, particularly the omniscience. In MVT the animal knows the average environment food value, but this omniscience isn’t plausible for simple animals [Tenhumberg et al 2001], and the essay animal has almost no learning at all. Realistic search is stochastic and can fail, such as a predator hunting, which is particularly important if the animal is starving. Starvation and satiation are also not covered by the MVT. If the animal is starving, it might stick with a non-optimal, low quality food source below the environment average because not finding a better patch is too risky. Simple organisms use rules of thumb instead of complex strategy, and even birds seem to use a constant give up time [Krebs et al 1974].
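For reference, the usual textbook statement of the MVT leaving rule can be written as below. The symbols are not from this essay and are introduced only for illustration: g(t) is the cumulative food gained after time t in a patch, with diminishing returns, and T is the average travel time between patches.

```latex
% Long-run average intake rate when staying t in each patch:
R(t) = \frac{g(t)}{T + t}

% The optimal residence time t^* maximizes R(t), which gives the
% marginal value condition: leave when the instantaneous gain rate
% falls to the long-run average rate.
g'(t^*) = \frac{g(t^*)}{T + t^*}
```

The omniscience assumption shows up on the right-hand side: the animal needs the environment-wide average rate, which a simple animal with almost no learning cannot estimate.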

As a side note, the foraging terms for eating (“exploiting”) and searching for a new patch (“exploring”) have been appropriated by RL (reinforcement learning) [Sutton and Barto 2018] with some differences in meaning. Reinforcement learning uses an n-armed bandit (gambling slot machine) model, where exploring means finding the reward rates of the other arms before deciding on the best arm to exploit. The RL focus is on gathering information, generally in a finite and persistent system. In contrast, this essay uses the original foraging terminology.

Covered in essay 36, vertebrate food motivation divides into hunger-driven (“homeostatic”) and opportunistic (“hedonic”) foraging. These form two levels of search and involve different circuits with some overlap. When no longer hungry, mice will not eat plain food but will still eat rich food. In terms of foraging theory, hungry mice will stay longer at poor patches, while sated mice will leave more quickly.

Simulation complexity

After starting to implement the simulation, the issue of complication became overwhelming. Specifically, adding the striatum is too complicated. Consider the issue of distinguishing the eating function of dopamine vs serotonin, when both are responsive to eating food. That similarity makes it difficult to find the system function. The system must have developed from a simpler system because the ascidian feeding or amphioxus feeding is not overly complicated. For the sake of the simulation, I’m backing off and considering only the hindbrain and hypothalamus systems, treating the striatum as a later enhancement.

Hypothalamus and raphe nuclei

The core of the simulation is the pair of H.l and V.dr. As mentioned above, H.l is driven by food zone indicators and can drive both seeking and eating. V.dr is responsive to eating and as part of the hindbrain (it derives from r1) it is a good candidate for primitive, tunicate-like filter feeding circuitry.

Simulation eating model. Ob and H.l form the forebrain food zone system, while V.dr and R.nts form the hindbrain eating system. H.l (lateral hypothalamus), Ob (olfactory bulb), R.nts (nucleus of the solitary tract), V.dr (dorsal raphe).

The diagram above is a simplification, where the Ob to H.l connection represents an ancient version of the food zone system. The V.dr to R.nts (nucleus of the solitary tract) connection includes more hindbrain structures such as medulla eating circuits. The simplification has H.l as a food zone controller and V.dr as an eating sustaining manager.

Although V.dr is a serotonin system, many V.dr neurons are non-serotonin, both glutamate and GABA. As mentioned above, the V.dr and V.mr (median raphe) serotonin neurons have at least 11-14 distinct neuron types and projection types. For the essay I’m assuming at least one serotonin neuron type is a measure of eating food. In the simulation, successful filter feeding increases the serotonin for eating.

Start and sustain

Let’s return to foraging, where the central decision is when to stop exploiting a patch if it’s not effective. Consider a simple model where the animal gives up on a patch if the feeding rate drops below a fixed threshold. Filter feeding naturally has delays between starting filter feeding, trapping some prey, and later receiving nutrients in the gut. This raises a problem: the feeding rate is zero until some food is digested, which implies the animal should give up immediately.

Foraging give-up occurs when the combination of a start signal and sustain signal drops below a threshold.

One solution is to prime the system with a start signal. While the start signal exists, the animal won’t leave even if it hasn’t digested any nutrients. In the simulation H.l is responsible for the start signal and V.dr is responsible for both the sustain and for integrating the two systems. The H.l start signal comes from the food zone detection.
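The start and sustain combination can be sketched as a small loop. This is a minimal sketch under stated assumptions, not the essay's simulation code: the function name, the leaky integrator, and all parameter values (`start_duration`, `digest_delay`, `threshold`, the 0.2 integration rate) are hypothetical. The start signal stands in for H.l, and the sustain integrator stands in for V.dr.

```python
def give_up_time(food_rate, start_duration=20, digest_delay=10,
                 threshold=0.3, max_steps=200):
    """Return the first step where start + sustain < threshold, else None.

    food_rate:      nutrient rate of the patch (arbitrary units, assumed)
    start_duration: steps the start signal stays on (assumed)
    digest_delay:   steps before any nutrient reaches the gut (assumed)
    """
    sustain = 0.0
    for t in range(max_steps):
        # Start signal primes feeding while the food zone is fresh.
        start = 1.0 if t < start_duration else 0.0
        # Nutrients only arrive after the digestion delay.
        intake = food_rate if t >= digest_delay else 0.0
        # Sustain is a leaky integrator of nutrient intake.
        sustain += 0.2 * (intake - sustain)
        if start + sustain < threshold:
            return t
    return None


rich = give_up_time(1.0)                          # sustain takes over: stays
poor = give_up_time(0.1)                          # leaves when start ends
unprimed = give_up_time(1.0, start_duration=0)    # gives up immediately
```

Without the start signal the animal abandons even a rich patch at step 0, because no nutrients have arrived yet. With it, a rich patch hands off to sustain before the start signal ends, while a poor patch is abandoned as soon as the start signal drops.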

However, the start signal raises a new issue because the start signal must stop to allow sustain to act as the primary decision variable. If H.l always sends the food zone signal to V.dr, it will remain active as long as the animal is in the food zone, preventing the animal from leaving the zone. So, H.l itself needs a timeout. The simulation uses a striatum timeout to disable the H.l food zone signal. The striatum connection can either represent the striatum layer between the olfactory and cortical layers and H.l, or it can represent H.l reciprocal input to the striatum.

The start timeout has the same issues as other striatum systems. Specifically, it needs to remain timed out until the animal leaves the food zone.
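The requirement that the timeout stay latched until the animal leaves the zone can be sketched as follows. The `StartSignal` class and its `timeout` parameter are hypothetical, and resetting immediately on leaving the zone is an assumption made for illustration rather than the behavior of the essay's simulation.

```python
class StartSignal:
    """Food-zone start signal with a latched timeout (illustrative only)."""

    def __init__(self, timeout=20):
        self.timeout = timeout    # steps the start signal may stay on
        self.elapsed = 0          # steps spent in the current food zone
        self.timed_out = False    # latched until the animal leaves

    def update(self, in_food_zone: bool) -> float:
        """Advance one step and return the start signal (1.0 or 0.0)."""
        if not in_food_zone:
            # Leaving the zone clears the latch (assumed reset rule).
            self.elapsed = 0
            self.timed_out = False
            return 0.0
        if self.timed_out:
            # Still in the same zone: the signal stays off.
            return 0.0
        self.elapsed += 1
        if self.elapsed > self.timeout:
            self.timed_out = True
            return 0.0
        return 1.0


signal = StartSignal(timeout=3)
in_zone = [signal.update(True) for _ in range(5)]  # on for 3 steps, then off
left = signal.update(False)                        # leaving resets the latch
returned = signal.update(True)                     # a new visit starts fresh
```

Without the latch, the start signal would re-prime on every step inside the zone and the animal would never reach the point where sustain alone decides whether to stay.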

Simulation

The screenshot below shows the animal feeding from a low-quality food zone. The grey star is a food zone (grey represents poor food). The nearby purple checkerboard is an avoidance zone, representing an aversive area such as itch or high carbon dioxide.

Simulation of the animal filter feeding at a poor food zone just before giving up.

In the screenshot the startup signal from H.l is temporarily sustaining feeding. It will soon time out and the animal will abandon the food zone.

Avoidance response and search

The simulation adds two other serotonin-based systems: one for avoiding toxic areas and one for search. Avoidance is one of the V.mr functions. The search serotonin represents the V.dr to Vta connection, despite the current essay disabling the seek function. These two functions may not be serotonin functions because V.mr avoidance is largely non-serotonin, and the V.dr to Vta connection is primarily glutamate. Because the avoidance and search are not the primary focus of the essay, I’m putting off the question of accuracy to a later essay.

Discussion

The essay’s big questionable decision is the omission of the striatum, particularly because I’ve already used the striatum for give-up timing. For eating as opposed to seeking, one possible area appears to be S.dl.vl, which is the orobranchial, mouth area [Foster et al 2021]. Because S.dl receives late dopamine from food in the gut, it might be a good candidate for filter feeding sustain.

Map of the striatum. dl (dorsal lateral striatum), dm (dorsal medial striatum), lsh (lateral shell), msh.d (dorsal medial shell), msh.v (ventral medial shell), ot (olfactory tubercle)

A second area is S.msh.d (dorsal medial shell) which responds to hedonic “liking” and drives strong eating [Castro et al 2015], [Richard and Berridge 2011], [Richard et al 2013]. S.msh.d drives H.l, which is central to the essay. In addition S.msh has longer, sustained dopamine (5-10s) contrasted with shorter dopamine in S.dl (100ms) [de Jong et al 2022].

From a motivational perspective, S.dl.vl and S.msh.d are strong candidates, but they lack the lateral inhibition of seek that’s necessary for the state machine to work. S.dl.vl also works through OT.d.l (optic tectum deep motor areas), which would add more complexity to this essay. In contrast the V.dr serotonin is already part of the hindbrain motor areas, and serotonin is already inhibitory toward seek. V.dr requires fewer additional systems to work. For future work, the two striatum areas are strong areas to research.

References

Anselmi C, Fuller GK, Stolfi A, Groves AK, Manni L. Sensory cells in tunicates: insights into mechanoreceptor evolution. Front Cell Dev Biol. 2024 Mar 14;12:1359207. 

Arthurs JW, Pauli JL, Palmiter RD. Activation of Parabrachial Tachykinin 1 Neurons Counteracts Some Behaviors Mediated by Parabrachial Calcitonin Gene-related Peptide Neurons. Neuroscience. 2023 May 1;517:105-116. 

Bhave VM, Nectow AR. The dorsal raphe nucleus in the control of energy balance. Trends Neurosci. 2021 Dec;44(12):946-960.

Bromberg-Martin ES, Hikosaka O, Nakamura K. Coding of task reward value in the dorsal raphe nucleus. J Neurosci. 2010 May 5;30(18):6262-72.

Castro DC, Cole SL, Berridge KC. Lateral hypothalamus, nucleus accumbens, and ventral pallidum roles in eating and hunger: interactions between homeostatic and reward circuitry. Front Syst Neurosci. 2015 Jun 15;9:90.

Charnov, E. L. (1976b). Optimal foraging: The marginal value theorem. Theoretical Population Biology, 9, 129–136.

Courtiol E, Menezes EC, Teixeira CM. Serotonergic regulation of the dopaminergic system: Implications for reward-related functions. Neurosci Biobehav Rev. 2021 Sep;128:282-293.

D’Aniello S, Bertrand S, Escriva H. Amphioxus as a model to study the evolution of development in chordates. Elife. 2023 Sep 18;12:e87028. 

de Jong JW, Fraser KM, Lammel S. Mesoaccumbal Dopamine Heterogeneity: What Do Dopamine Firing and Release Have to Do with It? Annu Rev Neurosci. 2022 Jul 8;45:109-129. 

Fornal CA, Metzler CW, Marrosu F, Ribiero-do-Valle LE, Jacobs BL. A subgroup of dorsal raphe serotonergic neurons in the cat is strongly activated during oral-buccal movements. Brain Res. 1996 Apr 15;716(1-2):123-33.

Foster NN, Barry J, Korobkova L, Garcia L, Gao L, Becerra M, Sherafat Y, Peng B, Li X, Choi JH, Gou L, Zingg B, Azam S, Lo D, Khanjani N, Zhang B, Stanis J, Bowman I, Cotter K, Cao C, Yamashita S, Tugangui A, Li A, Jiang T, Jia X, Feng Z, Aquino S, Mun HS, Zhu M, Santarelli A, Benavidez NL, Song M, Dan G, Fayzullina M, Ustrell S, Boesen T, Johnson DL, Xu H, Bienkowski MS, Yang XW, Gong H, Levine MS, Wickersham I, Luo Q, Hahn JD, Lim BK, Zhang LI, Cepeda C, Hintiryan H, Dong HW. The mouse cortico-basal ganglia-thalamic network. Nature. 2021 Oct;598(7879):188-194. 

Haynes WI, Haber SN. The organization of prefrontal-subthalamic inputs in primates provides an anatomical substrate for both functional specificity and integration: implications for Basal Ganglia models and deep brain stimulation. J Neurosci. 2013 Mar 13;33(11):4804-14. 

Holland, L. Z. (2016). Tunicates. Current Biology, 26(4), R146-R152.

Hoyer J, Kolar K, Athira A, van den Burgh M, Dondorp D, Liang Z, Chatzigeorgiou M. Polymodal sensory perception drives settlement and metamorphosis of Ciona larvae. Curr Biol. 2024 Mar 25;34(6):1168-1182.e7. 

Jacobs L. F. (2012). From chemotaxis to the cognitive map: the function of olfaction. Proc. Natl. Acad. Sci. U.S.A. 109(Suppl. 1) 10693–10700 10.1073/pnas.1201880109 

Jennings JH, Ung RL, Resendez SL, Stamatakis AM, Taylor JG, Huang J, Veleta K, Kantak PA, Aita M, Shilling-Scrivo K, Ramakrishnan C, Deisseroth K, Otte S, Stuber GD. Visualizing hypothalamic network dynamics for appetitive and consummatory behaviors. Cell. 2015 Jan 29;160(3):516-27. 

Johnson CJ, Razy-Krajka F, Zeng F, Piekarz KM, Biliya S, Rothbächer U, Stolfi A. Specification of distinct cell types in a sensory-adhesive organ important for metamorphosis in tunicate larvae. PLoS Biol. 2024 Mar 13;22(3):e3002555.

Krebs JR, Kacelnik TP (1978) Tests of optimal sampling by foraging great tits. Nature 275:27–31

Li Y, Zhong W, Wang D, Feng Q, Liu Z, Zhou J, Jia C, Hu F, Zeng J, Guo Q, Fu L, Luo M. Serotonin neurons in the dorsal raphe nucleus encode reward signals. Nat Commun. 2016 Jan 28;7:10503. 

Nokhbatolfoghahai M, Downie JR. Larval cement gland of frogs: comparative development and morphology. J Morphol. 2005 Mar;263(3):270-83. doi: 10.1002/jmor.10305. 

Okaty BW, Sturrock N, Escobedo Lozoya Y, Chang Y, Senft RA, Lyon KA, Alekseyenko OV, Dymecki SM. A single-cell transcriptomic and anatomic atlas of mouse dorsal raphe Pet1 neurons. Elife. 2020 Jun 22;9:e55523. 

Pottin K, Hyacinthe C, Rétaux S. Conservation, development, and function of a cement gland-like structure in the fish Astyanax mexicanus. Proc Natl Acad Sci U S A. 2010 Oct 5;107(40):17256-61. 

Pyke, G.H., 1984. Optimal foraging theory: a critical review. Annual review of ecology and systematics, 15, pp.523-575.

Ren J, Isakova A, Friedmann D, Zeng J, Grutzner SM, Pun A, Zhao GQ, Kolluru SS, Wang R, Lin R, Li P, Li A, Raymond JL, Luo Q, Luo M, Quake SR, Luo L. Single-cell transcriptomes and whole-brain projections of serotonin neurons in the mouse dorsal and median raphe nuclei. Elife. 2019 Oct 24;8:e49424.

Rétaux S, Pottin K. A question of homology for chordate adhesive organs. Commun Integr Biol. 2011 Jan;4(1):75-7.

Richard JM, Plawecki AM, Berridge KC. Nucleus accumbens GABAergic inhibition generates intense eating and fear that resists environmental retuning and needs no local dopamine. Eur J Neurosci. 2013 Jun;37(11):1789-802. 

Richard JM, Berridge KC. Nucleus accumbens dopamine/glutamate interaction switches modes to generate desire versus dread: D(1) alone for appetitive eating but D(1) and D(2) together for fear. J Neurosci. 2011 Sep 7;31(36):12866-79.

Sive H, Bradley L. A sticky problem: the Xenopus cement gland as a paradigm for anteroposterior patterning. Dev Dyn. 1996 Mar;205(3):265-80. 

Spring MG, Nautiyal KM. Striatal Serotonin Release Signals Reward Value. J Neurosci. 2024 Oct 9;44(41):e0602242024. 

Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). The MIT Press.

Tenhumberg, B., Keller, M. A. & Possingham, H. P. Using Cox’s proportional hazard models to implement optimal strategies: an example from behavioural ecology 2. Wasp behaviour model. Behaviour 33, 597–607 (2001).

Vachez YM, Tooley JR, Abiraman K, Matikainen-Ankney B, Casey E, Earnest T, Ramos LM, Silberberg H, Godynyuk E, Uddin O, Marconi L, Le Pichon CE, Creed MC. Ventral arkypallidal neurons inhibit accumbal firing to promote reward consumption. Nat Neurosci. 2021 Mar;24(3):379-390. 

Wajnberg, E., Bernhard, P., Hamelin, F. & Boivin, G. Optimal patch time allocation for time-limited foragers. Behav. Ecol. Sociobiol. 60, 1–10 (2006).

Wang HL, Zhang S, Qi J, Wang H, Cachope R, Mejias-Aponte CA, Gomez JA, Mateo-Semidey GE, Beaudoin GMJ, Paladini CA, Cheer JF, Morales M. Dorsal Raphe Dual Serotonin-Glutamate Neurons Drive Reward by Establishing Excitatory Synapses on VTA Mesoaccumbens Dopamine Neurons. Cell Rep. 2019 Jan 29;26(5):1128-1142.e7. 

Zhong W, Li Y, Feng Q, Luo M. Learning and Stress Shape the Reward Response Patterns of Serotonin Neurons. J Neurosci. 2017 Sep 13;37(37):8863-8875. 

Essay 37: Odor neighborhood

Let’s revisit the striatum timeout from essay 31: striatum LTD, where food seeking used the striatum as a timeout to avoid perseveration. Without the timeout, the animal continues to seek toward the odor source even if the food was missing. This essay adds to the timeout by adding an odor context as a cached set of locations to avoid until the timeout, as opposed to avoiding all locations until the timeout. This odor neighborhood resembles the olfactory spatial hypothesis [Jacobs 2012], which considers olfaction as primarily a navigation sense. The added specificity to failed-seek avoidance improves search for other nearby food sources.

Recap of essay 31: striatum LTD

The food seek logic from essay 31 has two search states: a general roaming search and an odor seek. If the odor seek times out, the animal avoids the current area to prevent perseveration. Essay 35 on hippocampal sequences explored using a sequence to specify the avoidance timeout.

Foraging state machine with two search modes: a general roaming search and a target-specific seek.

For the timeout, the essay uses the S.v (ventral striatum aka nucleus accumbens) to suppress failed food seeking [Lafferty et al 2020]. Without the S.v timeout, the animal perseverates at the seek task and gets stuck in the center of the odor plume.

Circuit of S.v for timing out a failed food seek. Adenosine drives a ramping timeout signal that reduces motivation by switching from the seek path via V.pt to the avoidance path via Hb.l. Ad (adenosine), Hb (habenula), Ob (olfactory bulb), Pv (ventral pallidum), S.d1 (striatum projection neuron with D1 dopamine receptor), S.d2 (striatum projection neuron with D2 dopamine receptor), V.pt (posterior tuberculum – Vta/Snc)

The above diagram shows the essay 31 circuit, largely based on the lamprey. V.pt (posterior tuberculum) is a locomotion hub that receives a direct signal from Ob.m (medial olfactory bulb) and drives downstream motor areas [Derjean et al 2010]. Hb.l (lateral habenula) drives place avoidance. S.v (ventral striatum) drives the timeout selection in P.v (ventral pallidum). Ad (adenosine) is the timeout variable, which increases as neural activity in S.d1 (striatum D1 projection neuron) and S.d2 (striatum D2 projection neuron) continues. Adenosine is a byproduct of ATP (adenosine triphosphate) energy production, and is also a gliotransmitter from astrocytes that monitor synapse activity. Essentially, the S.v subcircuit in red acts as a timeout for the main seek circuit.
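The ramping timeout can be sketched as an accumulator that switches the output path once it crosses a threshold. This is a hedged illustration rather than the essay 31 code: the function name and the `gain`, `decay`, and `threshold` values are hypothetical, and striatal activity is reduced to a single on/off value per step.

```python
def seek_or_avoid(odor_steps, gain=0.05, decay=0.01, threshold=1.0):
    """Return the selected action for each step of an odor sequence.

    odor_steps: iterable of booleans, True while the seek odor is present.
    Adenosine ramps up with striatal activity and slowly clears (assumed).
    """
    adenosine = 0.0
    actions = []
    for odor in odor_steps:
        activity = 1.0 if odor else 0.0  # stand-in for S.d1/S.d2 activity
        adenosine = max(0.0, adenosine + gain * activity - decay)
        if not odor:
            actions.append("roam")       # no odor: general search
        elif adenosine < threshold:
            actions.append("seek")       # seek path via V.pt
        else:
            actions.append("avoid")      # timed out: avoidance via Hb.l
    return actions


actions = seek_or_avoid([True] * 40)  # persistent odor with no food found
```

With these assumed values the animal seeks for roughly the first 25 steps of a persistent odor and then switches to avoidance, which is the perseveration guard the circuit provides.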

Importantly, because the essay 31 timeout only uses the seek odor itself as a key, it can’t distinguish spatially distinct odors, such as different flowers for a honeybee.

Neighborhood odor as context

Because essay 31 only used the Ob seek odor as a signal, a timeout of that odor locks out all food search for that odor. That lockout may be long because the S.d2 LTD (long term depression) recovery time is on the order of 20 to 60 minutes. Consider an analogy to a bee searching a field of flowers for nectar. If one flower is missing nectar, the bee should give up on that flower, but it shouldn’t abandon the entire task until a 60 minute timer expires.

Odor neighborhoods with food odor plumes. Each colored area is an odor neighborhood and each cloud is an odor plume. Only the starred areas contain food.

In the above diagram, the stars represent food locations and the clouds represent food odor plumes. Odor plumes without food are false odors. The colors of the regions represent odor neighborhoods, where non-food odors distinguish the areas. Suppose the animal first searches in the dark orange area and fails to find food. If it next reaches the green area with the star, the timeout from the failed orange search will block the search unless the timeout is specific to the orange neighborhood.
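A neighborhood-specific timeout can be sketched as a small cache keyed by neighborhood. The class, its method names, the string neighborhood labels, and the 60-minute duration are all hypothetical choices for illustration; the essay's simulation may represent the odor context differently.

```python
class NeighborhoodTimeout:
    """Cache of failed-seek neighborhoods, each with its own expiry time."""

    def __init__(self, duration=60.0):
        self.duration = duration  # lockout length in minutes (assumed)
        self.expires = {}         # neighborhood id -> expiry time

    def record_failure(self, neighborhood, now):
        """Mark a failed odor seek in this neighborhood at time `now`."""
        self.expires[neighborhood] = now + self.duration

    def may_seek(self, neighborhood, now):
        """True if odor seek is allowed in this neighborhood at time `now`."""
        expiry = self.expires.get(neighborhood)
        if expiry is not None and now >= expiry:
            del self.expires[neighborhood]  # lockout has recovered
            expiry = None
        return expiry is None


cache = NeighborhoodTimeout(duration=60.0)
cache.record_failure("orange", now=0.0)
blocked = cache.may_seek("orange", now=10.0)  # same neighborhood: locked out
allowed = cache.may_seek("green", now=10.0)   # other neighborhood: free
```

Keying the lockout on the neighborhood rather than on the seek odor alone is what lets the failed orange search leave the green area available.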

Olfactory spatial hypothesis

The olfactory spatial hypothesis argues that a primary function for olfaction is navigation, as opposed to simply providing identification [Jacobs 2012]. This navigation-centric idea is fleshed out in the parallel map theory, which argues that the hippocampus is primarily organized around two maps: a bearing map using gradients to distant odor landmarks, and a sketch map with local landmark cues [Jacobs and Schenk 2003]. The parallel map theory associates the distant bearing map with E.dg (dentate gyrus of the hippocampus) and the local sketch map with E.ca1 (CA1 region of the hippocampus).

The current essay uses the broad idea of the olfactory spatial hypothesis and the idea of a local olfactory neighborhood. The olfactory neighborhood provides a context to restrict the striatum timeout. Functionally it resembles the local sketch map, but it’s not strictly speaking a map, only a cache of failed locations.

Lamprey dual odor path

The lamprey is a useful animal model because it represents the older jawless vertebrates that preceded the development of the jaw and the majority of more complex vertebrates and because it has a simpler brain. In the lamprey, Ob.m directly drives locomotion via V.pt (posterior tuberculum), which is homologous to the mammalian midbrain dopamine areas Vta (ventral tegmental area) and Snc (substantia nigra pars compacta). Unlike the mammalian dopamine areas, the lamprey V.pt drives locomotion directly to MLR (midbrain locomotor region) and R.rs (reticulospinal motor neurons) [Beauséjour et al 2020].

The rest of the lamprey Ob drives the pallium (cortex) and subpallium (basal ganglia). Unlike the mammalian Ob which only drives specific olfactory cortical areas, the lamprey Ob broadly connects to the entire pallium [Derjean et al 2010], [Suryanarayana et al 2021]. Note that the lamprey pallium is smaller than the Ob [Pombal and Megías 2019].

Dual olfactory projections: direct to locomotor via V.pt and indirectly through the S.ot/P.v (basal ganglia). Hb.l (lateral habenula), MLR (midbrain locomotor region), Ob.l (lateral olfactory bulb), Ob.m (medial olfactory bulb), Pv (ventral pallidum), R.rs (reticulospinal motor command), S.ot (olfactory tubercle), V.pt (posterior tuberculum)

The above diagram illustrates the dual olfactory projection. The main action path is Ob.m to the V.pt to the MLR locomotion [Derjean et al 2010], [Beauséjour et al 2020], [Beauséjour et al 2024]. Not shown is the Ob.m projection to the Hb.m (medial habenula) – R.ip (interpeduncular area) for chemotaxis. The previous essay included the Ob.m to S.ot path for the timeout, which suppressed chemotaxis to avoid perseveration. Ob.l is the new addition, providing distinguishing context to the S.ot circuit.

Striatal discrimination

To represent distinct timeouts, different contexts or olfactory neighborhoods need distinct neurons or at least different dendrite spines. The striatum architecture is well-suited for this task because of the very large number of S.pn (striatal projection neurons aka medium spiny neurons). Each S.pn can represent a distinct combination of signal and context.

Striatum architecture to represent multiple timeouts, each with a unique context key built from a unique distinguishing combination of inputs. cxt-1 (context input), Ob.m (medial olfactory bulb), Pv (ventral pallidum), S.pn (striatum projection neuron).

The above diagram shows the context-keyed timeout architecture. Each S.pn is associated with a distinguishing context, but all of these use the same primary signal. Because S.pn stores the timeout in the LTP (long term potentiation) / LTD in its dendrite spines, the multiple S.pn neurons allow for distinct persistent timeout variables. Furthermore, a single S.pn can support multiple contexts because each S.pn has several dendrites, on the order of 8-12, each of which can respond to a distinct input combination.

Note the similarity of this fan-out to granule cells in the hippocampus and cerebellum, and the Kenyon cells in Drosophila fruit fly. This expansion of the coding dimensionality allows for a large space to place odors while reducing overlaps [Laurent 2002].

Striatum UP states

In mammals S.pn are only active with sustained input from multiple distributed cortical sources [Shipp 2017]. This sustained input puts the S.pn into an UP state, which allows a primary signal to drive the neuron, but doesn’t drive an AP (action potential) directly. Typically the context UP state inputs drive distal dendrites and spines, and the primary signal drives the proximal dendrite. S.pn are hyperpolarized at rest, making it difficult for a signal to drive an AP directly. The UP state depolarizes the S.pn, allowing the signal to drive an AP. Essentially this means the context neurons are required gates for the signal.

Combinations of context neurons drive a dendrite UP state, which allows the signal to drive the projection neuron. CN (context neuron), S.nr (substantia nigra pars reticulata), S.pn (striatum projection neuron).

The above diagram shows how each S.pn has an associated context made from a conjunction of several context neurons. Each S.pn has a different combination of context neurons, each differing greatly from its neighbor [Bolam and Bevan 2006]. Multiple simultaneous context neurons are necessary for an UP state.
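The gating logic above can be sketched as a conjunction: a projection neuron fires only when enough of its own context inputs are active (the UP state) and the primary signal is also present. This is a minimal boolean sketch. The class name, the context neuron labels, and the `up_threshold` count are hypothetical, and real S.pn dynamics are of course not boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectionNeuron:
    """Toy S.pn gated by a conjunction of context neurons."""

    context_inputs: frozenset  # names of context neurons wired to this S.pn
    up_threshold: int          # hypothetical count needed for an UP state

    def is_up(self, active: set) -> bool:
        """UP state requires several simultaneous context inputs."""
        return len(self.context_inputs & active) >= self.up_threshold

    def fires(self, signal: bool, active: set) -> bool:
        """The primary signal drives an AP only during an UP state."""
        return signal and self.is_up(active)


# Two neurons share the same primary signal but differ in context wiring.
orange = ProjectionNeuron(frozenset({"cn1", "cn2", "cn3"}), up_threshold=2)
green = ProjectionNeuron(frozenset({"cn4", "cn5", "cn6"}), up_threshold=2)

active_context = {"cn1", "cn3"}  # context seen in the orange neighborhood
```

Here the same odor signal would drive `orange` but not `green`, so a timeout stored on `orange` stays specific to its neighborhood.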

Broad circuit

Taking an overview of this system, let’s see how the addition of this context information affects the earlier seek and timeout circuit.

Olfactory timeout circuit with Ob.l added as a context input to S.v. Ad (adenosine), Hb (habenula), Ob (olfactory bulb), Pv (ventral pallidum), S.d1 (striatum D1 projection neuron), S.d2 (striatum D2 projection neuron), V.pt (posterior tuberculum)

The above diagram shows the addition of Ob.l to S.ot was the only change necessary, along with the dimension expansion of the S.pn.

Cache-like model for simulation

The striatum architecture poses a scaling problem for the simulation. The striatum has a large number of neurons, each with a large number of essentially random inputs. This architecture works because the possible combinations are predefined. Each odor neighborhood is a conjunction of odor features, each corresponding to an Ob glomerulus and O.mc (olfactory mitral cells). The many predefined conjunctions are likely to match any new odor combination. However, a simulation model using this architecture would be overly large.

Because the essay model is a toy model, it can use a much simplified system. A cache-like architecture can work because only a few odor locations are active at any time. The cache only holds the recent odor locations, and the cache entry for an odor location is removed when the timeout expires. The simulation cache only needs to store the active locations, unlike the striatum, which holds the much larger number of possible distinct locations.
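A cache of this kind might look like the sketch below, which keys each timeout on the pair of seek odor and odor neighborhood and drops the entry once it expires. The class name, method names, and tick-based clock are assumptions for illustration and are not taken from the essay's simulation code.

```python
class TimeoutCache:
    """Toy cache of failed (seek odor, neighborhood) pairs with expiry."""

    def __init__(self, timeout_ticks: int):
        self.timeout_ticks = timeout_ticks  # hypothetical timeout length
        self._expires = {}  # (seek_odor, neighborhood) -> expiry tick

    def mark_failed(self, seek_odor: str, neighborhood: str, now: int) -> None:
        """Record a failed search so the same place is avoided for a while."""
        self._expires[(seek_odor, neighborhood)] = now + self.timeout_ticks

    def is_blocked(self, seek_odor: str, neighborhood: str, now: int) -> bool:
        """Return True while the timeout for this odor and place is active."""
        key = (seek_odor, neighborhood)
        expiry = self._expires.get(key)
        if expiry is None:
            return False
        if now >= expiry:
            del self._expires[key]  # expired entries are removed
            return False
        return True

    def __len__(self) -> int:
        return len(self._expires)


cache = TimeoutCache(timeout_ticks=60)
cache.mark_failed("food", "orange", now=0)
```

Only the neighborhoods that recently failed occupy memory, which is the point of the simplification: the cache grows with active timeouts, not with the number of possible odor combinations.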

Simulation

The simulation adds a simplification of odor neighborhoods. Instead of simulating accurate odor plumes, each location has a place code, which then produces an odor code. In the screenshot below, the hexagonal colors represent these place codes that produce odor neighborhoods.

Simulation screenshot of the animal reaching food in a different neighborhood than the previously avoided neighborhood.

The above diagram shows two different odor neighborhoods (teal vs red). The animal avoids the red neighborhood after failing to find food, but seeks in the teal neighborhood to find the food. If the animal had first searched the teal neighborhood without food, it would have avoided the teal neighborhood with food.

Discussion

A major simplification in the simulation is consistency and precision in odor cues. In an actual environment, odors are not reliable. For now I’m not adding that complexity, but it might explain the need for cortical circuits in O.pir (piriform olfactory cortex) and E.hc (hippocampus). If an odor is irregular, some circuit needs to maintain a consistent odor neighborhood for the timeout circuit to work. In the simulation because the Ob perfectly represents the odor neighborhood and food plume, the downstream circuits can use the Ob signal directly. If the odor varies slightly within a neighborhood, or is lost intermittently, the S.ot timeout circuit could shift to a different S.pn timeout, breaking the logic of the circuit. A later essay might explore how cortical areas like O.pir might be necessary to create a stable neighborhood.

References

Beauséjour PA, Auclair F, Daghfous G, Ngovandan C, Veilleux D, Zielinski B, Dubuc R. Dopaminergic modulation of olfactory-evoked motor output in sea lampreys (Petromyzon marinus L.). J Comp Neurol. 2020 Jan 1;528(1):114-134. 

Beauséjour PA, Veilleux JC, Condamine S, Zielinski BS, Dubuc R. Olfactory Projections to Locomotor Control Centers in the Sea Lamprey. Int J Mol Sci. 2024 Aug 29;25(17):9370.

Bolam, J. P., & Bevan, M. D. (2006). Microcircuits of the striatum. In Basal Ganglia and Thalamus in Health and Movement Disorders (pp. 29-39). Boston, MA: Springer US.

Derjean D, Moussaddy A, Atallah E, St-Pierre M, Auclair F, Chang S, Ren X, Zielinski B, Dubuc R. A novel neural substrate for the transformation of olfactory inputs into motor output. PLoS Biol. 2010 Dec 21;8(12):e1000567. 

Jacobs LF, Schenk F. Unpacking the cognitive map: the parallel map theory of hippocampal function. Psychol Rev. 2003 Apr;110(2):285-315. 

Jacobs L. F. (2012). From chemotaxis to the cognitive map: the function of olfaction. Proc. Natl. Acad. Sci. U.S.A. 109(Suppl. 1) 10693–10700 

Lafferty CK, Yang AK, Mendoza JA, Britt JP. Nucleus Accumbens Cell Type- and Input-Specific Suppression of Unproductive Reward Seeking. Cell Rep. 2020 Mar 17;30(11):3729-3742.e3.

Laurent G. Olfactory network dynamics and the coding of multidimensional signals. Nat Rev Neurosci. 2002 Nov;3(11):884-95. 

Pombal MA, Megías M. Development and Functional Organization of the Cranial Nerves in Lampreys. Anat Rec (Hoboken). 2019 Mar;302(3):512-539. 

Shipp S. The functional logic of corticostriatal connections. Brain Struct Funct. 2017 Mar;222(2):669-706. 

Suryanarayana SM, Pérez-Fernández J, Robertson B, Grillner S. Olfaction in Lamprey Pallium Revisited-Dual Projections of Mitral and Tufted Cells. Cell Rep. 2021 Jan 5;34(1):108596. 

Essay 26: Ignoring distracting odors

I’ve been ignoring distracting cues in the previous essays for simplification. Since the simulated animal only encountered a single odor at a time, it never needed to select one and ignore the other. In essay 26, I’ll implement a very simple first approximation to ignoring distractors, using the P.bf (basal forebrain) control of the Ob (olfactory bulb) as a switchboard to let the selected odor through and inhibit the ignored distractor.

Simulated animal (triangle) encountering two odor plumes (circles).

In the diagram above, the animal (triangle) is seeking food using the purple odor cue as a gradient direction. When it encounters the distractor odor in blue, it should ignore the distractor, otherwise the two odors will mingle into an incorrect summed gradient and the animal will seek in the wrong direction [Cisek 2022].

Temporal chemotaxis

For essay 26, I’m switching chemotaxis (odor seeking) to use the apical temporal gradient search, using Hb.m (medial habenula) and B.ip (interpeduncular nucleus) like the phototaxis in essay 24. The apical system follows the chimera brain model of [Tosches and Arendt 2013], which suggests that odor senses and actions are distinct systems from bilateral tactile senses. For the essays, the shift is from a bilateral, Braitenberg-like [Braitenberg 1984] system to a modulated random walk like the bacterial tumble-and-run.

Olfactory tumble-and-run system using Hb.m and B.ip for temporal gradient direction, and B.rs for the modulated random walk. B.ip interpeduncular nucleus, B.rs hindbrain reticulospinal motor area, Hb.m medial habenula, Ob olfactory bulb.

The above diagram shows the problem with distractor odors. Because the tumble-and-run system uses a single temporal gradient, it necessarily adds both odors together for its input. The summed input goes to the Hb.m (medial habenula) and B.ip (interpeduncular nucleus) system to modulate the random walk direction.

When the animal crosses into the overlapping distractor odor, it will follow the combined signal, distracted from the original seek target. To avoid distraction, the system can either amplify the current odor A, or inhibit the distractors like odor B.
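The distraction problem can be illustrated with a toy temporal-gradient rule, where a rising odor lowers the tumble probability and a falling odor raises it. The function name, the `base` and `gain` parameters, and the odor values are hypothetical, and the linear rule is only one simple way to modulate a random walk.

```python
def tumble_probability(previous: float, current: float,
                       base: float = 0.2, gain: float = 0.5) -> float:
    """Tumble less when the odor is rising, more when it is falling."""
    delta = current - previous  # temporal gradient between two samples
    return min(1.0, max(0.0, base - gain * delta))


# The animal moves away from target odor A while entering distractor B.
before = {"A": 0.5, "B": 0.0}
after = {"A": 0.4, "B": 0.3}

# Summed input: the rising distractor hides the falling target.
summed = tumble_probability(sum(before.values()), sum(after.values()))

# Target only: the falling odor A correctly raises the tumble probability.
target_only = tumble_probability(before["A"], after["A"])
```

With the summed input the animal is less likely to tumble even though it is leaving odor A, which is the incorrect behavior that selecting or inhibiting odors is meant to prevent.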

Analogy with nucleus isthmi

An earlier essay 19 also had an attention / distractor problem, with a different issue of action consistency, and used a zebrafish circuit in P.ni (nucleus isthmi) as a solution. In larval zebrafish P.ni works together with OT (optic tectum) to sustain attention on prey during a hunt [Henriques et al 2019]. P.ni is an ACh (acetylcholine neurotransmitter) and GABA (inhibiting neurotransmitter) system that both amplifies the predicted prey location and inhibits surrounding areas.

Nucleus isthmi circuit as adapted by essay 19. ACh acetylcholine, OT optic tectum, Pni nucleus isthmi.

In the above diagram for the essay 19 circuit, a simultaneous left and right touch would select one action at random and sustain that choice for subsequent movement with the P.ni positive feedback circuit. The outputs are crossed because it’s an avoidance circuit: an obstacle on the left triggers a right turn.

Importantly, the positive feedback is modulatory; it doesn’t trigger an action by itself. At a synapse level, ACh triggers mAChR (ACh metabotropic receptor, Gq-coupled excitatory type) on the sensor axon, amplifying the sensor’s neurotransmitter release. The ACh and mAChR act as the decay timer, because they have a slow time constant on the order of a few seconds. If the sensor doesn’t stimulate the circuit, as when successfully avoiding the obstacle, the attention will decay over a few seconds, resetting the system to its original state.
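A minimal sketch of this modulatory feedback with a decay timer, assuming an exponential decay and a multiplicative gain. The function names, the 3 second time constant, and the gain are hypothetical stand-ins for "a few seconds" and for the receptor's amplifying effect.

```python
import math


def decay_attention(attention: float, sensor_active: bool,
                    dt: float, tau: float = 3.0) -> float:
    """Sensor input refreshes attention; otherwise it decays with time constant tau."""
    if sensor_active:
        return 1.0
    return attention * math.exp(-dt / tau)


def modulated_output(sensor: float, attention: float, gain: float = 1.0) -> float:
    """Attention amplifies the sensor signal but cannot create output alone."""
    return sensor * (1.0 + gain * attention)


# Attention set by a touch, then left to decay for one time constant.
attention = decay_attention(0.0, sensor_active=True, dt=0.1)
attention_later = decay_attention(attention, sensor_active=False, dt=3.0)
```

Because the attention term multiplies the sensor signal, full attention with no sensor input still produces zero output, matching the point that the feedback doesn't trigger an action by itself.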

A similar function applies to Ob and P.bf (basal forebrain), where P.bf acts like P.ni to sustain attention to the selected odor. “Basal forebrain” is a general name for a collection of functionally-related subcortical areas in the ventral (“basal”) forebrain, all pallidal-like (P). The specific P areas for the Ob are P.hdb (horizontal diagonal band) and Po.me (magnocellular preoptic area), but I’ll use P.bf for simplicity.

Olfactory bulb as a switchboard

In this model, Ob acts like a switchboard controlled by P.bf. P.bf selects attended odor paths in Ob, where Ob either passes the odor signal to its destination or inhibits the signal if it’s a distractor. P.bf opens and closes gated circuits in Ob.

Although the architecture of the Ob and P.bf circuit resembles the P.ni circuit, Ob appears to rely more heavily on inhibitory GABA for the gating operation, although ACh is also important [Böhm et al 2020], [De Saint Jan 2022], [Nunez-Parra et al 2020]. Since this essay is a first cut, simplified model, I’m using a single signal that represents a gating attention / inhibition signal, and glossing over the ACh vs GABA distinction.

Olfactory bulb switchboard using basal forebrain to gate selected odors. B.ip interpeduncular nucleus, B.rs reticulospinal motor, Hb.m medial habenula, Omt mitral/tufted output cells, Osn olfactory sensor neurons, P.bf basal forebrain.

In the above diagram where the switchboard selects odor A and inhibits odor B, the apical seek circuit receives only odor A’s signal. P.bf gates odors from Osn (olfactory sensory neurons) to Omt (mitral/tufted output cells), which then add to form a single signal for the temporal gradient tumble-and-run seek. For simplicity, I’ve shown the P.bf ACh and GABA signal as a simple gating control.

Once the system detects odor A, P.bf configures the switchboard to pass through A and inhibit other odors, locking out the distractor. Because the selecting signals are modulators, they don’t drive a signal until an odor signal arrives. Like the P.ni circuit, attention will time out as ACh and its slow mACh receptor decay. When the animal leaves the odor plume, the system resets because the absence of odor A collapses the feedback loop.

Although the essay’s switchboard is an improvement over the naive summation of odor signals, it’s still quite limited. There’s no active selection of a best odor, and the system can’t switch to a better odor cue. Also, since the global give-up circuit isn’t integrated with P.bf, giving up on odor A can’t select odor B. Instead the animal must leave the plume and reset the system.

Slightly more complete Ob switchboard

The Ob is a surprisingly complex system; it’s not just a simple odor system. In addition to the P.bf, Opir (olfactory piriform cortex) also modulates the Ob system, and Ob itself has lateral inhibition between Omt (mitral cell output), which is plastic, learning to discriminate odors itself, as well as modulatory input from the serotonin and noradrenaline system.

In the real Ob, many Osn for the same odor feed into a single Ogl (olfactory glomerulus), which provides input to several Omt, all representing the same odor. Each odor feature has its own Ogl system, several hundred in mammals (two in the essay simulation). Ogl is the neuropil where the Osn axons meet the Omt dendrites, in a fan-in to fan-out system. Also, each Ogl has many inhibitory Opg (periglomerular inhibitors) with multiple variations, and each Omt has several inhibitory Ogc (olfactory granule cells). The basic fan-in and fan-out structure looks like the following diagram.

Olfactory bulb glomerule fan-in and fan-out system. Bip interpeduncular nucleus, B.rs reticulospinal motor, Hb.m medial habenula, Ogc olfactory granule cell inhibitor, Ogl olfactory glomerule, Omt olfactory mitral/tufted output, Opg olfactory periglomerular inhibitor, Osn olfactory sensor neuron.

The switchboard diagram below focuses on the ACh and GABA control from P.bf. It combines multiple Osn, Opg, Omt and Ogc into single items.

Partial olfactory bulb switchboard circuit. B.ip interpeduncular nucleus, B.rs reticulospinal motor, Hb.m medial habenula, Ogc olfactory granule cell, Ogl olfactory glomeruli, Omt olfactory mitral/tufted output, Opg olfactory periglomerular inhibitor, Opir olfactory piriform cortex, Osn olfactory sensory neuron.

To break down the diagram, the core of the switchboard circuit is the Osn to Ogl to Omt to output path; everything else is gating to select or inhibit the signal.

Odor gating happens in two locations: modulating Omt’s input dendrite tree in Ogl by Opg and modulating Omt’s output by Ogc (olfactory granule cell). Because each Ogl is shared by several Omt, the Opg inhibition likely affects many or all Omt for a single Ogl. In contrast, the Ogc inhibition is individual, and the Omt and Ogc circuit creates and manages gamma oscillations, which amplify the signal and reduce its noise.

Although I’m not planning on touching cortical areas for many essays, the Opir (olfactory piriform cortex) modulates the Ob switchboard in a circuit similar to P.bf, with some differences. Since the Opir input to the many Ogc and many Ogl is not odor selective [Boyd et al 2015], Ogc must learn the meaning of the Opir input through plasticity.

Global give-up circuit

The essay’s task engagement and give-up circuit currently uses H.l (lateral hypothalamus) and Hb.l (lateral habenula) with V.dr (dorsal raphe serotonin) [Hikosaka 2010], [Chowdhury and Yamanaka 2016]. When a seek fails Hb.l suppresses H.l, H.l ends seek, and the animal moves on [Post et al 2022].

Global give-up circuit. H.l lateral hypothalamus, Hb.l lateral habenula, V.dr dorsal raphe, 5HT serotonin.

Because the global give-up circuit is entirely disconnected from the olfactory selective attention from the essay, giving up means giving up on all odors, not just the current attended odor.

Simulation

For this essay, I refactored much of the simulation code to clean up ideas from previous essays. A new hindbrain module manages the main locomotion like the zebrafish hindbrain motor area [Dunn et al 2016], which is possibly different from the tetrapod / amniote locomotion in the midbrain. Because the essay animal is currently more primitive than amniotes, this simplification seemed appropriate and makes the code organization more clear.

Olfactory locomotion is now random-walk based following apical tumble-and-run, as opposed to the earlier bilateral path through Vta (ventral tegmental area / posterior tuberculum) and OT (tectum). In zebrafish both paths exist, which I might explore later, but this essay is restricted to the apical temporal gradient search.

The seek mode now slows the animal and adjusts the Levy walk parameters to simulate ARS (area restricted search). As I’ll cover in the problems section, switching to seek mode is still hardcoded.

I split the habenula seek from habenula give-up (Hb.m from Hb.l) and pulled the gradient seek and head direction from B.ip into the habenula seek. Conceptually, the habenula seek code now represents Hb.m and B.ip as a single complex.

Simulated odor seeking with target attention and distractor inhibition.

In the screenshot above, the animal is making a u-turn to return to the food when the odor gradient (blue semicircle) is opposite the head direction (black semicircle). In the upper right, the green box outlined in red represents the attended green odor signal, while the white box outlined in blue represents the suppressed blue odor. Despite the Osn naively sensing both blue and green odors because the animal is in the overlap area, only the green odor passes through Omt to the seek system.

The square borders around the odor color represent P.bf modulation. Red is attended (100% pass through), blue is inhibited (10% pass through), and grey is unmodulated (50% pass through).
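The three modulation levels can be sketched as a gain table applied between Osn input and Omt output. The pass-through fractions are the ones listed above; the function name, the dictionary layout, and the odor values are hypothetical and may not match how the simulation code is organized.

```python
from typing import Dict, Optional

# Pass-through fractions from the P.bf modulation states described above.
PASS_THROUGH = {"attended": 1.0, "inhibited": 0.1, "unmodulated": 0.5}


def gate_odors(osn: Dict[str, float], attended: Optional[str]) -> Dict[str, float]:
    """Return the Omt output for each odor given the attended odor, if any."""
    omt = {}
    for odor, value in osn.items():
        if attended is None:
            state = "unmodulated"   # no odor selected yet
        elif odor == attended:
            state = "attended"      # selected odor passes through
        else:
            state = "inhibited"     # distractors are suppressed
        omt[odor] = value * PASS_THROUGH[state]
    return omt


# Animal in the overlap area, attending to the green odor.
omt = gate_odors({"green": 0.8, "blue": 0.6}, attended="green")

# The seek system receives the sum of the gated Omt outputs.
seek_input = sum(omt.values())
```

In this sketch the distractor still leaks through at 10%, so the summed seek input is dominated by the attended odor without being perfectly clean.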

In the diamond-shaped homunculus, the bright blue triangle represents the u-turn nudge.

As the goal vector shows, the guessed goal direction isn’t very accurate, particularly when the animal is making a turn. Currently, the animal continues to update its guess even in the middle of a turn when the odor data and averages are not appropriate for the current direction.

References

Böhm E, Brunert D, Rothermel M. Input dependent modulation of olfactory bulb activity by HDB GABAergic projections. Sci Rep. 2020 Jul 1;10(1):10696. 

Boyd AM, Kato HK, Komiyama T, Isaacson JS. Broadcasting of cortical activity to the olfactory bulb. Cell Rep. 2015 Feb 24;10(7):1032-9.

Braitenberg, V. (1984). Vehicles: Experiments in synthetic psychology. Cambridge, MA: MIT Press.

Chowdhury S, Yamanaka A. Optogenetic activation of serotonergic terminals facilitates GABAergic inhibitory input to orexin/hypocretin neurons. Sci Rep. 2016;6:36039

Cisek P. Evolution of behavioural control from chordates to primates. Philos Trans R Soc Lond B Biol Sci. 2022 Feb 14

De Saint Jan D. Target-specific control of olfactory bulb periglomerular cells by GABAergic and cholinergic basal forebrain inputs. Elife. 2022 Feb 28;11:e71965.

Dunn, Timothy, Yu Mu, Sujatha Narayan, Owen Randlett, Eva A Naumann, Chao-Tsung Yang, Alexander F Schier, Jeremy Freeman, Florian Engert, Misha B Ahrens (2016) Brain-wide mapping of neural activity controlling zebrafish exploratory locomotion eLife 5:e12741

Henriques PM, Rahman N, Jackson SE, Bianco IH. Nucleus Isthmi Is Required to Sustain Target Pursuit during Visually Guided Prey-Catching. Curr Biol. 2019 Jun 3;29(11):1771-1786.e5. 

Hikosaka O. The habenula: from stress evasion to value-based decision-making. Nat Rev Neurosci. 2010 Jul;11(7):503-13. 

Nunez-Parra A, Cea-Del Rio CA, Huntsman MM, Restrepo D. The Basal Forebrain Modulates Neuronal Response in an Active Olfactory Discrimination Task. Front Cell Neurosci. 2020 Jun 5;14:141. 

Post RJ, Bulkin DA, Ebitz RB, Lee V, Han K, Warden MR. Tonic activity in lateral habenula neurons acts as a neutral valence brake on reward-seeking behavior. Curr Biol. 2022 Oct 24;32(20):4325-4336.e5.

Tosches, Maria Antonietta, and Detlev Arendt. The bilaterian forebrain: an evolutionary chimaera. Current opinion in neurobiology 23.6 (2013): 1080-1089.

Essay 20: Olfactory avoidance

Although the essays have implemented obstacle avoidance, they haven’t yet explored olfactory avoidance. Olfactory avoidance is distinct from obstacles, not just because obstacles have higher priority, but because the olfactory system is from an entirely different nervous system than the sensorimotor system. In the chimaeral brain theory [Tosches and Arendt 2013], bilaterian brains are composed of an apical nervous system (ANS) focused on chemo senses (olfactory external and hypothalamic internal), and a blastoporal nervous system (BNS) focused on sensorimotor control like obstacle avoidance.

Olfactory path

The paths for olfactory motion compared with obstacle motion show the value of the chimaeral theory in making sense of the brain. Working backward from the midbrain locomotive region (MLR), the acetylcholine (ACh) MLR nuclei specialize: the pedunculopontine nucleus (M.ppt) supports the sensorimotor BNS, and the laterodorsal tegmental nucleus (M.ldt) supports the chemosensory ANS.

Sensor-locomotion paths: olfactory on top and somatosensory on bottom. B.ll lateral line, B.rs reticulospinal motor command, B.ss somatosensory, Hb.m medial habenula, M.ldt laterodorsal tegmental nucleus, M.ppt pedunculopontine nucleus, Ob.m medial olfactory bulb, OT tectum, R.vis visual input, Vta ventral tegmental area.

In the above diagram, food odors and warning odors use distinct paths to the MLR. Food odors from the olfactory bulb (Ob) pass through the ventral tegmental area (Vta – posterior tuberculum in zebrafish) to the MLR [Derjean et al. 2010]. Aversive odors like cadaverine pass through the medial habenula (Hb.m) to the M.ldt portion of the MLR [Stephenson-Jones et al. 2012]. The food and avoidance paths are distinct because hunger and satiety from the hypothalamus modulate the food path, while the avoidance path can pass through unmodulated. These olfactory locomotion paths correspond to the ANS.

Lamprey medial habenula path

All vertebrates share this basic architecture, including the lamprey, one of the most evolutionarily distant vertebrates. [Stephenson-Jones et al. 2012] traced the Hb.m circuit, showing that Hb.m receives inputs from the olfactory path, the parapineal (light attraction), and an electrosensory alarm, and projects to the interpeduncular nucleus (M.ip).

Lamprey olfactory warning path through the habenula to the MLR. M.ip interpeduncular nucleus.

The above diagram fills out the olfactory warning path. The interpeduncular nucleus is a key node in the avoidance circuit, and also key to locomotor-induced theta, and one of the two serotonin nodes. M.ip has major outputs to the serotonin areas, dorsal raphe (V.dr) and medial raphe (V.mr), to the central grey (M.pag) [Quina et al. 2017], and to M.ldt, as well as to structures associated with hippocampal (E.hc) theta [Lima et al. 2017].

Medial habenula behavior

In larval zebrafish, Hb.m supports olfactory avoidance [Choi et al. 2021], [Jeong et al. 2021], and light seeking [Zhang et al. 2017]. At least one study indicates that it may also affect food seeking [Chen et al. 2019]. The non-Ob input to Hb.m, the posterior septum (P.ps), produces locomotion when stimulated [Otsu et al. 2018], suggesting that later evolved functionality maintains the original basal function.

In zebrafish, M.ip only projects to serotonin areas (V.dr and V.mr), not to dopamine or MLR areas. The lamprey connectivity suggests that the M.ip to M.ldt connection was lost in fish.

The Hb.m to M.ip connection is affected by nicotine. An interesting property is that low stimulation and high stimulation have opposite effects. Low stimulation uses glutamate connections and is attractive while high stimulation adds ACh and is aversive [Krishnan et al. 2014].

Developmental genetic notes

As an interesting aside, both Hb.m and avoidant layers of OT share a genetic marker Brn3a (aka pou4f1) [Quina et al. 2009], [Fedtsova et al. 2008]. That marker also appears in the cerebellum’s inferior olive, trigeminal sensory areas, and the amphioxus motor LPN3 neuron [Bozzo et al. 2023].

M.ldt and M.ppt are sibling areas, deriving from the r1 rhombic lip [Machold et al. 2011].

Glutamate and GABA neurons in M.ip, Vta, and M.ldt all derive from r1 basal neurons [Lahti et al. 2016].

Locomotion switchboard

The addition of olfactory avoidance further complicates the switchboard that combines the various locomotor streams, especially if the olfactory path uses serotonin as a modulator as opposed to a straight glutamate connection. I’ll probably use a fixed priority for essay 20, and, as [Cisek 2022] notes, avoidance can be combined additively. At some point, though, the switchboard will need more control, especially when essays add vision and consummatory actions.
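The two combination rules mentioned above, fixed priority and additive, can be compared in a small sketch. The function names, the turn-command convention (signed floats, `None` for an inactive stream), and the ordering of obstacle, odor avoidance, and odor seek are assumptions for illustration only.

```python
from typing import Optional, Sequence


def fixed_priority(commands: Sequence[Optional[float]]) -> float:
    """Return the first active turn command, listed highest priority first."""
    for turn in commands:
        if turn is not None:
            return turn
    return 0.0  # no stream is active: go straight


def additive(commands: Sequence[Optional[float]]) -> float:
    """Sum all active turn commands into one steering value."""
    return sum(turn for turn in commands if turn is not None)


# Hypothetical order: obstacle touch, odor avoidance, odor seek.
streams = [None, -0.5, 0.25]

priority_turn = fixed_priority(streams)  # odor avoidance wins outright
summed_turn = additive(streams)          # avoidance and seek are blended
```

The fixed-priority rule ignores lower streams entirely, while the additive rule lets a seek signal soften an avoidance turn, which is one reason a later switchboard might need more control than either rule offers.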

References

Bozzo M, Bellitto D, Amaroli A, Ferrando S, Schubert M, Candiani S. Retinoic Acid and POU Genes in Developing Amphioxus: A Focus on Neural Development. Cells. 2023 Feb 14

Chen W-Y, Peng X-L, Deng Q-S, Chen M-J, Du J-L, Zhang B-B. Role of Olfactorily Responsive Neurons in the Right Dorsal Habenula-Ventral Interpeduncular Nucleus Pathway in Food-Seeking Behaviors of Larval Zebrafish. Neuroscience. 2019

Choi JH, Duboue ER, Macurak M, Chanchu JM, Halpern ME. Specialized neurons in the right habenula mediate response to aversive olfactory cues. Elife. 2021 Dec 8

Cisek P. Evolution of behavioural control from chordates to primates. Philos Trans R Soc Lond B Biol Sci. 2022 Feb 14

Derjean D, Moussaddy A, Atallah E, St-Pierre M, Auclair F, Chang S, Ren X, Zielinski B, Dubuc R. A novel neural substrate for the transformation of olfactory inputs into motor output. PLoS Biol. 2010 Dec 21

Fedtsova N, Quina LA, Wang S, Turner EE. Regulation of the development of tectal neurons and their projections by transcription factors Brn3a and Pax7. Dev Biol. 2008 Apr 1

Jeong YM, Choi TI, Hwang KS, Lee JS, Gerlai R, Kim CH. Optogenetic Manipulation of Olfactory Responses in Transgenic Zebrafish: A Neurobiological and Behavioral Study. Int J Mol Sci. 2021 Jul 3

Krishnan S, Mathuru AS, Kibat C, Rahman M, Lupton CE, Stewart J, Claridge-Chang A, Yen SC, Jesuthasan S. The right dorsal habenula limits attraction to an odor in zebrafish. Current Biology. 2014

Lahti L, Haugas M, Tikker L, Airavaara M, Voutilainen MH, Anttila J, Kumar S, Inkinen C, Salminen M, Partanen J. Differentiation and molecular heterogeneity of inhibitory and excitatory neurons associated with midbrain dopaminergic nuclei. Development. 2016 Feb 1

Lima LB, Bueno D, Leite F, Souza S, Gonçalves L, Furigo IC, Donato J Jr, Metzger M. Afferent and efferent connections of the interpeduncular nucleus with special reference to circuits involving the habenula and raphe nuclei. J Comp Neurol. 2017 Jul 1

Machold R, Klein C, Fishell G. Genes expressed in Atoh1 neuronal lineages arising from the r1/isthmus rhombic lip. Gene Expr Patterns. 2011 Jun-Jul

Otsu Y, Lecca S, Pietrajtis K, Rousseau CV, Marcaggi P, Dugué GP, Mailhes-Hamon C, Mameli M, Diana MA. Functional Principles of Posterior Septal Inputs to the Medial Habenula. Cell Rep. 2018 Jan 16

Quina LA, Wang S, Ng L, Turner EE. Brn3a and Nurr1 mediate a gene regulatory pathway for habenula development. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience. 2009

Stephenson-Jones M, Floros O, Robertson B, Grillner S. Evolutionary conservation of the habenular nuclei and their circuitry controlling the dopamine and 5-hydroxytryptophan (5-HT) systems. Proc Natl Acad Sci U S A. 2012 Jan 17

Tosches, Maria Antonietta, and Detlev Arendt. “The bilaterian forebrain: an evolutionary chimaera.” Current opinion in neurobiology 23.6 (2013): 1080-1089.

Zhang BB, Yao YY, Zhang HF, Kawakami K, Du JL. Left Habenula Mediates Light-Preference Behavior in Zebrafish via an Asymmetrical Visual Pathway. Neuron. 2017 Feb 22

Essay 17: Proto-Vertebrate Locomotion

The locomotion models in essays 14 to 16 were non-vertebrate. Essay 17 takes the same problems, avoiding obstacles and seeking food, with a model based on the vertebrate brain. Since these models are still Precambrian or early Cambrian, they don’t include the full vertebrate architecture, but try to find core components that might have been a basis for later vertebrate developments.

The animal is a slug-like creature with mucociliary forward movement, where propulsion is cilia or cilia-like and steering is muscular. This combination of slug-like motion and vertebrate brain is probably not evolutionarily accurate, but it allows touch-based obstacle avoidance without the complications of vision or lateral-line senses.

The animal seeks food by following odor plumes, and avoids obstacles by turning away when touching them. The locomotion model includes the following components:

  • [Braitenberg 1984] navigation (simple crossed vs uncrossed signals for approach and avoid).
  • Obstacle avoidance with a direct touch-to-muscle circuit.
  • Odor-seeking with distinct “what” and “where” paths.
  • Perseveration fix with an explicit give-up circuit.
  • Motivation-state (satiety) control of odor-seeking (“why” path [Verschure et al. 2014])

Proto-vertebrate model

A diagram of the proto-vertebrate model, including analogous brain regions follows:

Proto-vertebrate locomotive model. Key: B.sp spinal motor, B.rs reticulospinal motor command (medial and lateral), B.ss spinal somatosensory, H.l lateral hypothalamus, Hb.l lateral habenula, M.lr midbrain locomotive region (M.ppt), Ob olfactory bulb, Snc substantia nigra pars compacta. DA dopamine, ACh acetylcholine.

For the sake of readability, the model simplifies the actual vertebrate midline crossing patterns, leaving only a single cross between B.rs (reticulospinal) and B.sp (spinal), which represents Braitenberg navigation.

In this model, obstacle avoidance is reflexive between B.ss (somatosensory touch) and B.rs. Odor navigation (“where”) flows through Snc (substantia nigra pars compacta) to M.lr (midbrain locomotive region). In the zebrafish, the Snc area is the posterior tuberculum, and the M.lr likely represents M.ppn (pedunculopontine tegmental nucleus). The motivation-state (hunger or satiety) and “what” (food odor vs non-food) flow through H.l (lateral hypothalamus). The give-up circuit flows through Hb.l (lateral habenula).

Olfactory navigation path

[Derjean et al. 2010] traced a path in zebrafish from Ob (olfactory bulb) to the posterior tuberculum (mammal Snc) to the midbrain locomotive region (likely M.ppn), to the reticulospinal motor command neurons.

Zebrafish olfactory to motor path in [Derjean et al. 2010].

A similar olfactory to motor path has been traced in lamprey by [Suryanarayana et al. 2021] and [Beausejour et al. 2021].

I’ve labeled this path as a “where” path, based on simulation requirements, but as far as I know, that label has no scientific basis.

The Snc / posterior tuberculum area includes descending glutamate and dopamine (DA) neurons, although the Snc is better known for its ascending dopamine path. Since [Ryczko et al. 2016] reports a mammalian descending glutamate and DA path from Snc to M.ppn, portions of this descending path appear to be evolutionarily conserved. The DA appears to be an effort boost, increasing downstream activity, but most of the activity is glutamate.

Braitenberg navigation

[Braitenberg 1984] vehicles are a thought experiment for simple circuits to implement approach and avoid navigation. In the original, the vehicles have two light-detection sensors connected to drive wheels. Depending on the connection topology, sign and thresholds, the simple circuits can implement multiple behaviors.

Braitenberg vehicles for approach and escape
Braitenberg circuits for approach and escape.

A circuit that combines the output of approach and avoid circuits with some lateral inhibition can implement both approach and avoidance with avoidance taking priority. In the essay simulation, if the animal touches a wall, it will turn away from the obstacle, temporarily ignoring any odor it might be following.

Circuit for obstacle avoidance and food approach for simulated slug.
Circuit for combined odor approach and touch obstacle avoidance.
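As a rough illustration of the combined circuit, here is a minimal sketch in Python. It assumes the wheeled convention of the original Braitenberg vehicles, where a faster left motor turns the vehicle right. The function names and the hard priority rule standing in for lateral inhibition are my assumptions for illustration, not the simulation's actual code.

```python
def steer(odor_left, odor_right, touch_left, touch_right):
    """Return (left_motor, right_motor) drive for a two-sensor vehicle.

    Hypothetical sketch: a hard priority rule stands in for lateral inhibition.
    """
    # Avoidance takes priority: any touch suppresses the odor approach path.
    if touch_left > 0.0 or touch_right > 0.0:
        # Uncrossed wiring: touch on the left drives the left motor,
        # which turns the vehicle right, away from the obstacle.
        return (touch_left, touch_right)
    # Crossed wiring: odor on the left drives the right motor,
    # which turns the vehicle left, toward the odor.
    return (odor_right, odor_left)


def turn(left_motor, right_motor):
    """Positive means turn right, negative means turn left (wheel convention)."""
    return left_motor - right_motor
```

With a stronger odor on the left and no touch, turn(*steer(1.0, 0.2, 0.0, 0.0)) is negative (turn left, toward the odor). Adding a touch on the left flips the sign, so the vehicle turns away from the wall while ignoring the odor.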

Mammalian locomotion appears to use a similar circuit between the superior colliculus (OT – optic tectum) and the motor driving B.rs neurons [Isa et al. 2021]. This circuit pattern implies that approach and avoidance are separate behaviors, only reconciled at the end. For example, a punishing reinforcer that increases avoidance is not simply the mirror image of a non-reward that decreases approach. The two reinforcers modify different circuits.

“What” path vs “where” path

The mammalian visual system has separate “what” and “where” paths. One path detects what object is in focus, and one path keeps track of where the object location is. This division between object decision and navigation has been useful in the simulation, because navigation details are quickly lost in the circuit when deciding what to do with an odor.

“What” and “where” paths as configuring a switchboard.

When an animal senses an odor, say a food odor, the animal needs to identify it as a food odor, decide if the animal is hungry or sated, and decide if there’s a higher-priority task. All that processing and decision-making can lose the fine timing, phase, and amplitude details needed for precise navigation. Gradient following, for example, needs fine differences in timing or amplitude to decide whether to turn left or right. By splitting the long, complicated “what” decision from the short, simple “where” location, the circuit can benefit from both.

[Cohn 2015] describes the fruit fly mushroom body as a switchboard, where dopamine neurons configure the path for olfactory senses to travel. In the context of “what” and “where”, the “what” path configures the switchboard and the “where” path follows the connected circuit.
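The split can be sketched as a slow gate and a fast difference signal. This is only an illustration under my own assumptions: the function names, the all-or-nothing gate, and the sign convention are hypothetical and not taken from the simulation.

```python
def what_path(odor_is_food, hungry, higher_priority_task):
    """Slow decision: should this odor be followed at all?

    Returns a gate of 1.0 (switchboard connected) or 0.0 (disconnected).
    """
    if odor_is_food and hungry and not higher_priority_task:
        return 1.0
    return 0.0


def where_path(odor_left, odor_right, gate):
    """Fast path: keeps the fine left/right difference, scaled by the gate.

    Positive means turn right, negative means turn left.
    """
    return gate * (odor_right - odor_left)
```

The point of the design is that the left/right difference never passes through the slow decision. The "what" path only sets the gate, so the "where" path keeps its fine amplitude detail.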

Some odor-based navigation has a more extreme division between “what” and “where.” Following odor in water isn’t always gradient-based navigation, because odors form clumps instead of gradient plumes. Instead of following a gradient, the animal moves against the current toward the odor source. In that latter situation, the “where” path uses entirely different senses for navigation, using water flow mechanosensors, not olfactory sensors [Steele et al. 2023].

Navigation against current toward an odor plume.

The diagram above illustrates a food-searching strategy for some animals in a current, both water and air. In water, the current is more reliable for navigation than an odor gradient. When there’s no scent, the animal swims back and forth across the current. When it detects a food odor, it swims against the current. If it loses the odor, it will return to back and forth swimming. In this navigation type, entirely different senses drive the “what” and “where” paths.
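The strategy can be sketched as a heading rule. This is a minimal sketch, assuming headings in degrees and a hypothetical cast_side flag that the caller flips to produce the back-and-forth motion. Neither is specified in the essay.

```python
def heading(odor_detected, current_direction, cast_side):
    """Pick a swim heading in degrees.

    current_direction: direction the water flows toward, in degrees.
    cast_side: +1 or -1, which way the animal is currently casting
               (hypothetical parameter, flipped by the caller over time).
    """
    if odor_detected:
        # "What" path says food: the flow sensors supply the "where",
        # so swim directly against the current.
        return (current_direction + 180.0) % 360.0
    # No odor: swim across the current, perpendicular to the flow.
    return (current_direction + 90.0 * cast_side) % 360.0
```

Note that the odor only selects the behavior. The heading itself is computed entirely from the current direction, which is the sense in which different senses drive the two paths.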

Foraging and give-up time

Giving up is an essential part of goal-directed behavior. If an animal cannot ever give up, it will be stuck on the goal without escaping. In the context of foraging, the give-up time is optimized with the marginal value theorem [Charnov 1976], suggesting that an animal should move to another patch when its current reward-gaining rate drops below the average rate for the environment. Animal behavior researchers like [Kacelnik and Brunner 2002] have observed animals roughly following this theorem, although using simpler heuristics.
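The leaving rule of the marginal value theorem can be written as a simple comparison. The sketch below assumes the animal somehow knows the environment's average rate, which is exactly the part that real animals approximate with simpler heuristics. The function name and sample rates are illustrative only.

```python
def give_up_time(patch_rates, environment_average_rate):
    """Return the first time step where the patch's reward rate drops
    below the environment average, or None if it never does.

    patch_rates: reward-gaining rate in the current patch at each step
                 (illustrative values, typically declining as food depletes).
    """
    for t, rate in enumerate(patch_rates):
        if rate < environment_average_rate:
            return t
    return None
```

For a depleting patch with rates [1.0, 0.8, 0.5, 0.3, 0.1] and an environment average of 0.4, the rule says to leave at step 3, the first step where the patch pays less than the average.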

In more complex animals, the failure to give up can be pathological, such as psychological perseveration.

Odor-following state diagram including give-up timer.
Foraging state diagram illustrating the give-up timer

The give-up circuit needs some kind of internal timer or cost integrator, and a way to cancel the task. In this essay’s model, the lateral habenula (Hb.l) computes the give-up time or integrates the cost, and it cancels the task by suppressing the locomotive signal through Snc.

Habenula as a give-up circuit

Hb.l is positioned to act as a give-up circuit. It receives cost signals as non-rewarded bouts or as aversive events. [Stephenson-Jones et al. 2016] interprets the Hb.l input, P.hb (habenular-projecting pallidum), as evaluating action outcome. Hb.l can suppress both the midbrain dopamine and midbrain serotonin areas. In learned helplessness situations or depression, Hb.l is hyperactive [Webster et al. 2020], causing reduced activity.

Habenula circuit as a give-up mode in a locomotive circuit.

[Hikosaka 2010] suggests that the habenula’s role is to suppress motor activity under aversive conditions, a role evolved from its close relationship to the pineal gland’s circadian scheduling.

In a review article, [Hu 2020] discusses the suppressive effects of the habenula, also remarking on its role in signaling reward-prediction error. In particular, the article notes that the H.l (lateral hypothalamus) to Hb.l projection is aversive. It also notes that Hb.l knock-out abolishes the error signal from reward omission, but not the error signal from aversive events (shock or obstacles).

Once the threshold is crossed, the Hb.l to Snc signal produces behavioral avoidance, reduced effort and depressive-like behavior from learned helplessness. The Hb.l is the only brain area consistently hyperactive in animal models of depression.

Note, since this essay’s simulation is a non-learning behavioral model, the only “prediction” possible is an evolutionary intrinsically-attractive odor, and the only role for an error is giving up the current behavior. Here, I’m interpreting the H.l to Hb.l signal as a cost signal, integrated by Hb.l, that gives up when it crosses a threshold.
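A minimal sketch of that interpretation follows, as a leaky integrator with a threshold. The class name, the decay factor, the threshold value, and the reset on finding food are all assumptions for illustration. The essay only specifies that a cost is integrated and that crossing a threshold cancels the task.

```python
class GiveUpIntegrator:
    """Hypothetical stand-in for Hb.l: integrates cost and signals give-up."""

    def __init__(self, threshold=5.0, decay=0.9):
        self.threshold = threshold  # assumed value, not from the essay
        self.decay = decay          # assumed leak so old costs fade
        self.cost = 0.0

    def step(self, cost_signal, found_food=False):
        """Integrate one tick of cost from H.l.

        Returns True when the seek task should be cancelled, which in the
        model corresponds to suppressing the locomotive signal through Snc.
        """
        if found_food:
            # Assumption: success clears the accumulated cost.
            self.cost = 0.0
            return False
        self.cost = self.decay * self.cost + cost_signal
        return self.cost >= self.threshold
```

With the default values and a constant cost of 1.0 per tick, the integrator stays below threshold for six ticks and gives up on the seventh. A larger decay or a lower threshold makes the animal less persistent.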

Vertebrate reference

For reference, here’s a functional model of the vertebrate brain.

Functional model of vertebrate brain.

The areas in this model cluster around the hindbrain isthmus divider. B.rs are hindbrain neurons near the isthmus. M.lr (M.ppn) are midbrain neurons that migrate from the hindbrain (r1) to the midbrain. Snc is the midbrain tegmental area (the V – value area), near the isthmus, and contiguous with M.ppn. Similarly the H.l area that projects to Snc is contiguous with it. The habenula is the most distant area, located above the thalamus near the pineal gland (not in the diagram as a simplification, but associated with the pallidum areas.) So, the areas discussed here are a small part of the entire brain, but interestingly clustered around the isthmus divider near the cerebellum.

Minimal viable straw man

I think it’s important to remember that the essay simulations are an engineering project, not a scientific one. One difference is that the simulations necessarily require decisions beyond science. Another difference is that the project needs a simple core that may not correspond to any evolutionary animal. For example, even simple animals have some rudimentary vision, if only two or three pigment spots, which this model omits. It also omits learning centers like the mushroom body, and internal biological issues like coordinating breathing and blood pressure with motion.

This model in particular is more of a straw man or minimal viable product than an actual proposal for an ancestral proto-vertebrate mind. It is intended as a target that might give a base framework to criticize or build on.

Alternative olfactory paths

Another potential “what” path for innate behavior goes through the medial habenula, which is responsive to odors and produces place avoidance [Amo et al. 2014], but [Chen et al. 2019] suggests it also supports attraction for food odors.

Olfactory innate path through habenula. Key: A.co cortical amygdala, H.l lateral hypothalamus, Hb habenula (medial and lateral), IPN interpeduncular nucleus, M.pag periaqueductal gray, Ob olfactory bulb.

In mammals, the olfactory path to H.l goes through the cortical amygdala (A.co) [Cádiz-Moretti et al. 2017]. While this essay is deliberately omitting the cortex, in the lamprey the olfactory path goes through the lateral pallium (LPa, corresponding to mammalian O.pir piriform cortex) to the posterior tuberculum (Snc in mammals).

For this essay, I’ve picked the Ob to Snc path instead of the alternatives for simplicity. The habenula path is very tempting, but would require exploring the IPN and serotonin (5HT) paths to the MLR, which is more complicated than a “what” path through H.l.

Subthalamic nucleus as give-up circuit

The subthalamic nucleus (H.stn) is associated with a “stop” action, stopping downstream motor actions, either because of a new, surprising stimulus, or from higher-level commands. Since a give-up signal stops the seek goal, the stop action from H.stn might play a part in the control.

H.stn stop is in parallel to habenular give-up. Key: H.l lateral hypothalamus, H.stn subthalamic nucleus, Hb.l lateral habenula, M.lr midbrain locomotor region, Snc substantia nigra pars compacta, Snr substantia nigra pars reticulata.

H.stn is believed to have a role in patience in decision making [Frank 2006] and in encoding reward and cost [Zénon et al. 2016], which is very similar to the role of the habenula, and H.stn projects to Hb.l via P.hb habenula-projecting pallidum.

However, the H.stn’s patience is more related to holding off (stopping) action before making a decision, related to impulsiveness, while the give-up circuit is more related to persistence, continuing an action. So, while the two capabilities are related, they’re different functions. Since the current essay simulation does not have patience-related behavior arrest but does need a give-up time, the habenula seems a better fit.

Serotonin inhibition path

In zebrafish, the habenula inhibits the dorsal raphe (V.dr, serotonin neurons) but not Snc or dopamine [Okamoto et al. 2021]. The inhibition works through V.dr to the Snc/posterior tuberculum to the locomotive regions.

As with the alternative olfactory paths, this serotonin inhibition path may be more evolutionarily primitive, but would add complexity to the essay’s model, so will be held off for later exploration.

Conclusions

As mentioned above, the purpose of this model is a basis for the current essay’s simulation, and as a straw man to focus alternatives to see if there might be a better minimal model.

References

Amo, Ryunosuke, et al. “The habenulo-raphe serotonergic circuit encodes an aversive expectation value essential for adaptive active avoidance of danger.” Neuron 84.5 (2014): 1034-1048.

Beauséjour PA, Zielinski B, Dubuc R. Olfactory-induced locomotion in lampreys. Cell Tissue Res. 2022 Jan

Braitenberg, V. (1984). Vehicles: Experiments in synthetic psychology. Cambridge, MA: MIT Press.

Cádiz-Moretti B, Abellán-Álvaro M, Pardo-Bellver C, Martínez-García F, Lanuza E. Afferent and efferent projections of the anterior cortical amygdaloid nucleus in the mouse. J Comp Neurol. 2017 

Charnov, Eric L. “Optimal foraging, the marginal value theorem.” Theoretical population biology 9.2 (1976): 129-136.

Chen, Wei-yu, et al. “Role of olfactorily responsive neurons in the right dorsal habenula–ventral interpeduncular nucleus pathway in food-seeking behaviors of larval zebrafish.” Neuroscience 404 (2019): 259-267.

Cohn R, Morantte I, Ruta V. Coordinated and Compartmentalized Neuromodulation Shapes Sensory Processing in Drosophila. Cell. 2015 Dec 17

Derjean D, Moussaddy A, Atallah E, St-Pierre M, Auclair F, Chang S, Ren X, Zielinski B, Dubuc R. A novel neural substrate for the transformation of olfactory inputs into motor output. PLoS Biol. 2010 Dec 21

Frank, Michael J. “Hold your horses: a dynamic computational role for the subthalamic nucleus in decision making.” Neural networks 19.8 (2006): 1120-1136.

Hikosaka, Okihide. The habenula: from stress evasion to value-based decision-making. Nature reviews neuroscience 11.7 (2010): 503-513.

Hu, Hailan, Yihui Cui, and Yan Yang. “Circuits and functions of the lateral habenula in health and in disease.” Nature Reviews Neuroscience 21.5 (2020): 277-295.

Isa, Tadashi, et al. “The tectum/superior colliculus as the vertebrate solution for spatial sensory integration and action.” Current Biology 31.11 (2021)

Kacelnik, Alex, and Dani Brunner. “Timing and foraging: Gibbon’s scalar expectancy theory and optimal patch exploitation.” Learning and Motivation 33.1 (2002): 177-195.

Okamoto H, Cherng BW, Nakajo H, Chou MY, Kinoshita M. Habenula as the experience-dependent controlling switchboard of behavior and attention in social conflict and learning. Curr Opin Neurobiol. 2021 Jun;68:36-43. doi: 10.1016/j.conb.2020.12.005. Epub 2021 Jan 6. PMID: 33421772.

Ryczko D, Cone JJ, Alpert MH, Goetz L, Auclair F, Dubé C, Parent M, Roitman MF, Alford S, Dubuc R. A descending dopamine pathway conserved from basal vertebrates to mammals. Proc Natl Acad Sci U S A. 2016 Apr 26

Steele TJ, Lanz AJ, Nagel KI. Olfactory navigation in arthropods. J Comp Physiol A Neuroethol Sens Neural Behav Physiol. 2023

Stephenson-Jones M, Floros O, Robertson B, Grillner S. Evolutionary conservation of the habenular nuclei and their circuitry controlling the dopamine and 5-hydroxytryptophan (5-HT) systems. Proc Natl Acad Sci U S A. 2012 Jan 17;109(3)

Stephenson-Jones M, Yu K, Ahrens S, Tucciarone JM, van Huijstee AN, Mejia LA, Penzo MA, Tai LH, Wilbrecht L, Li B. A basal ganglia circuit for evaluating action outcomes. Nature. 2016 Nov 10

Suryanarayana SM, Pérez-Fernández J, Robertson B, Grillner S. Olfaction in Lamprey Pallium Revisited-Dual Projections of Mitral and Tufted Cells. Cell Rep. 2021 Jan 5

Verschure PF, Pennartz CM, Pezzulo G. The why, what, where, when and how of goal-directed choice: neuronal and computational principles. Philos Trans R Soc Lond B Biol Sci. 2014 Nov 5

Webster JF, Vroman R, Balueva K, Wulff P, Sakata S, Wozny C. Disentangling neuronal inhibition and inhibitory pathways in the lateral habenula. Sci Rep. 2020 May 22

Zénon A, Duclos Y, Carron R, Witjas T, Baunez C, Régis J, Azulay JP, Brown P, Eusebio A. The human subthalamic nucleus encodes the subjective value of reward and the cost of effort during decision-making. Brain. 2016 Jun;139(Pt 6):1830-43.

Essay 16: Learning to Ignore

Essay 16 extends the odor seeking (chemotaxis) of essay 15 by adding a single memory item. The memory caches a failed odor search, avoiding the cost of searching for false odors. The neuroscience source is the fruit fly Drosophila. The simulation is still based on a Braitenberg slug with distinct circuits for chemotaxis and for obstacle avoidance.

Mushroom body model

The fruit fly mushroom body (MB) is the learning center. MB is a modulating system: if it’s knocked out, the fruit fly behaves normally, although with only intrinsic, unlearned behavior. Essay 16 focuses on a single MBON short term memory (STM) output neuron, which may specifically be the γ2 neuron.

Architecture of fruit fly mushroom body.
Mushroom body architecture, adapted from [Aso et al. 2014]

For simplicity and focus, essay 16 isn’t implementing the KC yet. Instead, the γ2 MBON receives its input directly from a small number of odor projection neurons (PN) that were implemented in essay 15. Essentially, the input is a small set of primitive odors, where the full KC is a massive combinatorial odor spectrum.

Candidate odors from evolution

To motivate why evolution might develop learning, consider the food-seeking slug from essay 15. Since the food odor in essay 15 perfectly predicted food, there was no reason to learn anything about food. The simulation’s “evolution” has perfectly solved the artificially perfect world, selecting exactly those odors needed to find food.

Choosing the right set of candidate odors is a dilemma for evolution. Too many candidates means wasted search time. Too few candidates avoids wasted time, but misses out on opportunities, which may be a smaller problem than too many candidates because the animal can fall back to random, Brownian-motion search. This bias against including semi-predictive odors might mean that early Precambrian evolution only favored the highest predictors and skipped semi-predictive odors.

Candidate odor surrounded by distractors

The preceding image represents odors potentially available to the slug from an evolutionary design perspective. If the beige color is the only candidate for food, the slug will ignore the blue-ish colors because it never senses the odor. There’s no need for a circuit or behavior to distinguish the two. For the animal, those odors don’t exist.

Food odors don’t perfectly predict food, either because of lingering odors or simply candidate odors that aren’t always from nutritious food. For example, the fruit fly can taste sweet and it can also sense nutrition from a rise in blood sugar. That distinction between sweet and nutritious is reflected in the mushroom body with specific neurons for each [Owald et al. 2015].

Classical association

For this essay, let’s explore what could be the most trivial memory, in the context of the fruit fly MB. The MB output has only 24 neurons in 15 distinct compartments per hemisphere. Each compartment appears to have specialized roles, such as short term memory (STM) vs long term memory (LTM) [Bouzaiane et al. 2015], and water seeking compared to sugar seeking [Owald et al. 2015].

Although learning studies typically use classical association (Pavlovian) terminology, where a conditioned stimulus (CS) like the food odor becomes associated with an unconditioned stimulus (US) like consuming food, I don’t think that framing is useful for the odor-seeking behavior of the simulated slug.

Naive animal missing food before classical training

In the diagram above, which follows the classical model, the animal (arrow) misses the food (brown square) despite being in the candidate odor’s plume because it hasn’t learned to associate the odor (CS) with the food (US). It only learns the association if it finds the food through Brownian random search. Even then, if it randomly hits another food source with a different odor, it will forget the first, limiting the gain from this learning.

Even the non-learning algorithm of essay 14 performs better, because naive searching of all candidate odors is relatively successful, even if slightly time inefficient. Behaviorally, the difference is between default-approach or default-ignore. Default-approach needs to learn to ignore and default-ignore needs to learn to approach.

Learning to ignore

Learning to ignore is an alternative to the classical associative way of looking at the problem. It’s not an argument against classical conditioning in total, but it is a different perspective that highlights different features of the problem.

Successful approach to food and failure

The diagram above shows a successful approach to food and an unsuccessful approach. Both candidate odors potentially signal food because evolution ignores useless odors, but in this neighborhood the reddish odor is a non-food signal. As in essay 14, habituation will rescue the animal from perseveration, spending infinite time exploring a useless odor, but once an odor is found useless, ignoring it from the start would improve search efficiency.

Since odors in a neighborhood are likely similar, the animal is likely to encounter the useless odor soon. So, remembering a single item, like a single item cache, will improve the search by avoiding cost, until the animal reaches an area that does have nutritious food. The single-item cache lets the animal ignore patches of non-predictive odors.

Single item cache (short term memory)

A single mushroom body output neuron (MBON) and its associated dopamine neuron (DAN) can implement a single item cache by changing the weights of the KC to MBON synapses with long-term depression (LTD). Following the previous discussion, since it’s more efficient to remember the last failure than the last success, the learning is LTD at the synapse between the odor and the MBON. In fruit flies, short term memory (STM) is on the order of 2h. (For a fuller discussion between “short” and “long” term see [Sossin 2008].)

Reduced MB circuit for negative learning

In the above diagram, the O2 synapse with a ball represents the LTD cache item. If the animal senses odors for either O1 or O3, it approaches the odor. If it senses O2, it ignores the odor because of the LTD at the synapse. (The colors follow the mnemonic model. Purple represents primary sensor/odor, and blue represents apical/limbic/odor and motivation areas.)
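The single-item cache can be sketched as a table of synapse weights where at most one entry is depressed. The class name, the weight values, and the rule that a new failure restores the old synapse are assumptions for illustration. The essay only states that one failed odor is remembered at a time.

```python
class NegativeCache:
    """Hypothetical MBON sketch: all odor synapses start strong,
    and LTD weakens exactly one of them at a time."""

    def __init__(self, odors):
        self.weights = {odor: 1.0 for odor in odors}
        self.depressed = None  # the single cached failure odor

    def approach(self, odor):
        """Approach by default; ignore only the depressed odor."""
        return self.weights[odor] > 0.5

    def record_failure(self, odor):
        """Apply LTD to the failed odor, replacing any older cache item."""
        if self.depressed is not None:
            # Assumption: the old synapse recovers when a new failure is cached.
            self.weights[self.depressed] = 1.0
        self.weights[odor] = 0.0
        self.depressed = odor
```

After record_failure("O2") on a cache built from O1, O2 and O3, the animal still approaches O1 and O3 but ignores O2. A later failure on O3 replaces the cache item, so O2 becomes attractive again.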

The DAN needs to implement a failure signal to implement LTD, which is actually relatively complicated. Unlike success, which has an obvious direct stimulus when finding food, failure is ambiguous. How long the animal should persist before giving up is a difficult problem, and at the very least requires a timer even for the simplest strategy. Because habituation already implements a timeout, an easy solution is to copy the circuit or possibly use its output. So, if the animal exits the odor plume because of habituation, the DAN might signal failure.

Another possible strategy is for the DAN to continuously degrade the active signal as in habituation, and only rescue the synapse when discovering food. Results from [Berry et al. 2018] show the needed degrading over time (ramping LTD) in their study of MBON-γ2, although that study didn’t explore the rescuing of approach by finding food that we need.

So, the second strategy might require a second, opposing neuron, which I’ll probably explore later. For this essay, the DAN will produce a failure signal on timeout and a success signal on finding food, something like a reward prediction error signal from reinforcement learning [Sutton and Barto 2018], but without using a reinforcement learning architecture.
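The DAN rule described here is small enough to write down directly. This is a sketch only: the function name and the +1 / -1 / 0 encoding are my assumptions, and success is given priority when both inputs are set.

```python
def dan_signal(found_food, habituation_timed_out):
    """Return +1 for success, -1 for failure, 0 while the outcome is open.

    Hypothetical encoding: failure is borrowed from the habituation
    timeout rather than computed by a separate timer.
    """
    if found_food:
        return 1
    if habituation_timed_out:
        return -1
    return 0
```

In this sketch a -1 would trigger LTD at the active odor synapse, and a +1 would leave the synapse alone or restore it.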

Mammalian correlates

In mammals, the KC/MBON synapse with DAN modulation circuit functionally resembles the hippocampus CA3 to CA1 synapse (E.hc, E.ca1, E.ca3) with locus coeruleus (V.lc). In mammals, V.lc is known as the primary source of noradrenaline, associated with surprise and orientation, but it also contains dopamine neurons, and strongly innervates the hippocampus.

[Aston-Jones and Cohen 2005] discuss the locus coeruleus involvement in decision-making, specifically in explore vs exploit decisions. If time passes and exploitation continues to fail by not finding food, V.lc signaling encourages moving on and exploring different options, a behavior similar to ours.

The E.ca3 to E.ca1 connection (and E.ec, entorhinal cortex) is believed to detect novelty, and V.lc is active during exploration. Like the fruit fly MBON, the hippocampus uses LTD to learn a new place, using V.lc signal [Lemon et al. 2012] like the DAN.

In contrast with my simulated slug, the E.hc novelty output doesn’t directly drive food approach, because the mammalian brain is far more complex and abstract. So the comparison isn’t exact, but it is an interesting similarity.

Simulation: shared habituation

In essay 15, the simulated slug approached an intrinsically attractive odor to find food, but needed a habituation circuit to avoid perseveration. The fruit fly LN1 neurons between the ~50 main olfactory sensory neurons (ORN) and the ~150 olfactory projection neurons (PN) implement primary olfactory habituation. In this essay, I’m essentially adding a second odor to the system. Although the fruit fly has separate habituation circuits for each of the 50 primary odors, it’s interesting to see what a shared habituation circuit might look like in the simulation.

Shared habituation pre-learning circuit

The simulation heat map shows the animal spends much of its time between the odor plumes because the habituation timeout keeps refreshing, despite encountering different odors. While habituation is active, the animal doesn’t approach either odor plume, but mostly moves in the default semi-random pattern. Only when habituation times out will it approach a new odor.

Shared habituation. Warm colors are food-predicting odors, blue are distractors.

Split habituation

The fruit fly has a split habituation unlike the previous simulation. Each primary odor has an independent habituation circuit, which is synapse specific.

Split habituation in pre-learning circuit

In the simulation of split habituation, the animal spends more time investigating the odors because each new odor has its own habituation timeout. It can move from a failed odor and immediately explore a new odor.
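The difference between the two circuits can be sketched as one timer versus one timer per odor. The class names, the tick-based countdown, and the timeout value are assumptions for illustration and are not the simulation's code.

```python
class SharedHabituation:
    """One timer for all odors: habituating to any odor blocks every odor."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.remaining = 0

    def tick(self):
        self.remaining = max(0, self.remaining - 1)

    def can_approach(self, odor):
        return self.remaining == 0

    def habituate(self, odor):
        self.remaining = self.timeout


class SplitHabituation:
    """One timer per odor: a failed odor does not block a new odor."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.remaining = {}

    def tick(self):
        for odor in self.remaining:
            self.remaining[odor] = max(0, self.remaining[odor] - 1)

    def can_approach(self, odor):
        return self.remaining.get(odor, 0) == 0

    def habituate(self, odor):
        self.remaining[odor] = self.timeout
```

After habituating to odor A, the shared version refuses odor B until the timer runs out, while the split version allows B immediately. That matches the heat maps, where the shared circuit leaves the animal wandering between plumes.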

Simulation heat map for split habituation

Although the animal spends much of its time exploring the distractor candidate odors, it’s still a big improvement over random search, because it’s more likely to find food instead of a near miss.

Single distractor

Since a single distractor exactly fits the single item cache, it’s unsurprising that adding the cache immediately solves the distractor problem. In the following heat map, the animal only explores the successful candidate odor and ignores the distractor.

Simulation heat map for single-item negative cache

Multiple odors and distractors

Multiple distractor odors are more interesting for a single item cache because they introduce miss rate as a prominent issue and allow comparison between negative caching and positive caching (classical association). The table below is a summary of feeding time as a success metric for each strategy.

Algorithm           Feeding time
No odor approach    0.8%
No learning         7.4%
LTD (cache)         8.0%
LTP (classical)     5.7%
Success comparison for multiple algorithms, measured by feeding time

No odor approach

As a baseline, the first simulation disables all odor approach. The animal only reaches food when it runs into it randomly. While it’s above the food, the animal will slow, improving its efficiency somewhat. This strategy was explored in essay 14, and resembles the feeding of Trichoplax in [Smith et al. 2015].

Simulation heat map for odor-ignoring animal.

As the heat map shows, this strategy is pretty terrible. Because the animal only finds food by randomly crossing it, its success rate is purely a matter of the area covered by food. Although this strategy may have been effective with Precambrian bacteria mats, where finding food isn’t an issue, it’s a problem when finding food is a necessary task.

No learning

Intrinsic chemotaxis is important as a baseline for the learning strategies. In the fruit fly, intrinsic odor approach behavior is in the lateral horn. When the MB is disabled, the lateral horn continues to drive odor approach.

Simulation heat map for non-learning odor-seeking animal.

As the above heat map shows, intrinsic odor approach is a vast improvement over non-chemotaxis, improving food time from 0.8% to 7.4% in this environment.

Negative caching (LTD)

The single item caching that’s the focus of this post improves the food time from 7.4% to 8.0% by avoiding some of the time spent on non-food odors. The difference isn’t as dramatic as adding odor approach itself, but it’s an improvement.

Simulation heat map for single-item negative cache

In this strategy, the animal remembers the last failure odor, and ignores the odor plume the next time it reaches it. The animal explores all other odors, including other failure odors. On a cache miss (a failure on an uncached odor), the animal remembers the new failure and forgets the old one.

Classical conditioning (LTP)

The next strategy tries to simulate what classical conditioning might look like if it was used for behavior. In this simulation, the animal only follows the odor after it’s associated with food, which means the animal needs to randomly discover the food first.

Simulation heat map for single-item classical association learning

This strategy is actually worse than the non-learning case, because it only finds one food source at a time. Although the heat map shows both being visited, the areas are actually alternating. One source is the only found food for a long time until the other is randomly discovered, when the roles switch and the first is now ignored.

Simulation limitations

I think it’s important to point out some simulation limitations, particularly since I’ve added performance numbers for comparison. The simulation environment and timings can change the numbers substantially. For example, the odor plume size strongly affects the classical conditioning algorithm. If finding food without following the odor is difficult, the classical conditioning animal will have great difficulty finding a new odor.

Specifically, if the gain from following the odor is large, then classical conditioning will always have a penalty, because it loses out on that gain until it makes its association. In contrast, an explore-first strategy will always gain the odor-exploring advantage. If the gain of explore-first outweighs its cost, then a non-learning explore-first will win against associative learning.

cost_cache = p_miss * cost_miss + (1 - p_miss) * cost_hit
cost = cost_cache + cost_noncache

Consider the rough cache cost model above to see some of the issues with the negative cache. If the non-cacheable cost greatly outweighs the cache miss cost, then it doesn’t matter if the animal learns to avoid irrelevant odors. Contrariwise, if the miss cost is very large, then the miss rate is critical.
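To make the two regimes concrete, the cost model can be evaluated with some made-up numbers. The function names and all cost values below are assumptions chosen only for illustration; they are not measurements from the simulation.

```python
def cache_cost(p_miss, cost_miss, cost_hit):
    """Expected cost of the cacheable part, per the rough model above."""
    return p_miss * cost_miss + (1 - p_miss) * cost_hit


def total_cost(p_miss, cost_miss, cost_hit, cost_noncache):
    """Cacheable cost plus the cost the cache cannot affect."""
    return cache_cost(p_miss, cost_miss, cost_hit) + cost_noncache


def relative_saving(p_high, p_low, cost_miss, cost_hit, cost_noncache):
    """Fraction of total cost saved by lowering the miss rate from p_high to p_low."""
    before = total_cost(p_high, cost_miss, cost_hit, cost_noncache)
    after = total_cost(p_low, cost_miss, cost_hit, cost_noncache)
    return (before - after) / before


# Assumed values: a miss costs 10 time units, a hit costs 1.
# Case 1: non-cacheable cost dominates, so learning barely helps.
saving_dominated = relative_saving(0.9, 0.1, 10.0, 1.0, 1000.0)
# Case 2: non-cacheable cost is small, so the miss rate is critical.
saving_critical = relative_saving(0.9, 0.1, 10.0, 1.0, 1.0)
print(saving_dominated, saving_critical)
```

Under these assumed numbers, the same drop in miss rate saves under 1% of the total cost in the first case and well over half in the second.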

In addition, the miss rate is highly dependent on spatial and temporal locality. If similar odors are tightly grouped, even a small cache will have a low miss rate. But if there are many different distractor types spread randomly, the cache will miss most of the time.
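The effect of locality on a single-item cache can be shown with two hand-built distractor sequences. This is a sketch under simple assumptions: the sequences are invented, every encounter is a distractor, and a miss means the odor differs from the one currently cached.

```python
def miss_rate(distractors):
    """Fraction of distractor encounters that miss a single-item cache."""
    cached = None
    misses = 0
    for odor in distractors:
        if odor != cached:
            misses += 1
            cached = odor  # the new failure evicts the old one
    return misses / len(distractors)


# Same three distractor types, grouped versus interleaved (assumed sequences).
clustered = ["A"] * 5 + ["B"] * 5 + ["C"] * 5
interleaved = ["A", "B", "C"] * 5

clustered_rate = miss_rate(clustered)
interleaved_rate = miss_rate(interleaved)
print(clustered_rate, interleaved_rate)
```

The two sequences contain the same odors in the same proportions; only the ordering differs, yet the grouped sequence misses once per group while the interleaved one misses every time.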

References

Aso Y, Hattori D, Yu Y, Johnston RM, Iyer NA, Ngo TT, Dionne H, Abbott LF, Axel R, Tanimoto H, Rubin GM. “The neuronal architecture of the mushroom body provides a logic for associative learning.” Elife. 2014

Aston-Jones, Gary, and Jonathan D. Cohen. “An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance.” Annu. Rev. Neurosci. 28 (2005): 403-450.

Berry JA, Phan A, Davis RL. “Dopamine Neurons Mediate Learning and Forgetting through Bidirectional Modulation of a Memory Trace.” Cell Rep. 2018

Bouzaiane E, Trannoy S, Scheunemann L, Plaçais PY, Preat T. “Two independent mushroom body output circuits retrieve the six discrete components of Drosophila aversive memory.” Cell Rep. 2015 May 26;11(8):1280-92.

Lemon N, Manahan-Vaughan D. “Dopamine D1/D5 Receptors Contribute to De Novo Hippocampal LTD Mediated by Novel Spatial Exploration or Locus Coeruleus Activity.” Cerebral Cortex, Volume 22, Issue 9, September 2012, Pages 2131–2138.

Owald D, Felsenberg J, Talbot CB, Das G, Perisse E, Huetteroth W, Waddell S. “Activity of defined mushroom body output neurons underlies learned olfactory behavior in Drosophila.” Neuron. 2015 Apr 22

Smith CL, Pivovarova N, Reese TS. “Coordinated Feeding Behavior in Trichoplax, an Animal without Synapses.” PLoS One. 2015 Sep 2

Sossin, Wayne S. “Defining memories by their distinct molecular traces.” Trends in Neurosciences 31.4 (2008): 170-175.

Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT press.

Essay 15: Odor Navigation

Since essay 15 is exploring the fruit fly mushroom body, which is olfactory-focused, the simulated animal needs odor-based behavior, following attractive odors toward food. The simulation assumes an odor gradient navigation system without implementing the details. Basically, it will cheat and assume a direction vector toward the odor, because the details don’t seem to matter for the mushroom body.

In reality, the animal should use a timed gradient calculation with a single sensor (klinotaxis) or a directional navigation with multiple lateral sensors (tropotaxis). Simple animals use either navigation technique, and even bacteria can use klinotaxis with a run-and-tumble strategy.
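The difference between the two techniques can be sketched as follows. This is a toy illustration, not part of the essay 15 simulation: the concentration function, the one-dimensional world, and the deterministic reversal standing in for a tumble are all assumptions.

```python
def concentration(x, source=0.0):
    """Assumed smooth odor profile that peaks at the source."""
    return 1.0 / (1.0 + abs(x - source))


def klinotaxis_1d(start, heading, steps):
    """Single sensor: compare the current sample with the previous one in time."""
    x = start
    previous = concentration(x)
    for _ in range(steps):
        x += heading
        current = concentration(x)
        if current < previous:
            heading = -heading  # concentration fell: "tumble" (a reversal in 1D)
        previous = current
    return x


def tropotaxis_turn(left, right):
    """Two lateral sensors: turn toward the side with the stronger signal."""
    if left > right:
        return "left"
    if right > left:
        return "right"
    return "straight"


# The klinotaxis animal starts at x=5 heading away from the source at x=0.
final_x = klinotaxis_1d(5.0, 1.0, 10)
turn = tropotaxis_turn(concentration(-1.0), concentration(-3.0))
```

Klinotaxis needs memory of the previous sample but only one sensor, while tropotaxis needs no memory but requires two sensors far enough apart to see a difference.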

Odor plumes in water

The gradient itself is a big oversimplification for a marine slug as used in essay 15, because odors in water don’t have a simple gradient but clump instead [Steele et al. 2023]. Since the odor plumes, clumps and filaments drift on the water’s current, following the current upstream is more effective than computing gradients.

To follow a clumped odor plume in a water current, animals move upstream, against the flow, toward the source. The navigation is based on the current-flow mechanosensors, not an odor gradient. The odor sensing merely enables current following, which makes for an interesting interaction between chemosensory and mechanosensory circuits. Odor detection provides timing and go/no-go while the mechanosensory circuit navigates, somewhat like the “what” vs “where” split in the visual cortex.
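The go/no-go split can be sketched as a step function where the odor only decides whether to move and the flow sensor alone decides where. The function names and the choice to hold still when no odor is detected are assumptions for illustration; a real animal might cast across the current instead.

```python
import math


def upstream_heading(flow):
    """Unit vector pointing against the sensed flow (vx, vy)."""
    vx, vy = flow
    speed = math.hypot(vx, vy)
    if speed == 0:
        return (0.0, 0.0)  # no current, so no direction information
    return (-vx / speed, -vy / speed)


def step(position, flow, odor_detected, step_size=1.0):
    """Odor gates movement (go/no-go); the flow sensor alone sets direction."""
    if not odor_detected:
        return position  # assumed no-go behavior: hold position
    hx, hy = upstream_heading(flow)
    return (position[0] + step_size * hx, position[1] + step_size * hy)


# Assumed scenario: current flows in +x, odor arrives in intermittent clumps.
position = (10.0, 0.0)
flow = (2.0, 0.0)
for odor_detected in [True, False, True, True]:
    position = step(position, flow, odor_detected)
print(position)
```

Note that the odor concentration never enters the heading calculation, so clumpy, intermittent odor changes only when the animal moves, not where it goes.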

In the diagram above, the odor control of the contra-flow navigation is inhibitory, a common pattern in the vertebrate brain. For example, the striatum complex (basal ganglia) tonically inhibits its output targets, including the midbrain locomotor region and the optic tectum. When an action is selected, the striatum disinhibits the midbrain command neurons. Despite the complication of disinhibition (double inhibition), the system reduces signal noise.

When an inhibitory neuron disables a command, noise in the control signal doesn’t matter because the behavior is disabled. When the inhibitory control is removed, the command receives clean, undisturbed sensory data. In contrast, in an excitatory system where the odor sensor positively excites the command, the odor control signal would add noise to the mechanical sensor signal, reducing precision. So, despite the extra complication of a double-negative inhibitory system, it’s behaviorally superior.
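The noise argument can be illustrated with a toy comparison. This is a deliberately simplified model, not a circuit simulation: the two command functions, the flow signal value, and the odor noise samples are all assumptions.

```python
from statistics import mean


def gated_command(flow_signal, gate_open):
    """Disinhibition: odor only opens the gate, so the command is the flow signal alone."""
    return flow_signal if gate_open else 0.0


def excitatory_command(flow_signal, odor_drive):
    """Excitatory control: odor drive sums with the flow signal, so its noise leaks in."""
    return flow_signal + odor_drive


flow_signal = 1.0                    # assumed clean mechanosensory signal
odor_noise = [0.2, -0.1, 0.3, -0.2]  # assumed fluctuations in the odor drive

# Mean absolute deviation of each command from the clean flow signal.
gated_error = mean(
    [abs(gated_command(flow_signal, True) - flow_signal) for _ in odor_noise]
)
excitatory_error = mean(
    [abs(excitatory_command(flow_signal, n) - flow_signal) for n in odor_noise]
)
print(gated_error, excitatory_error)
```

In this toy model the gated command carries no odor noise at all when the gate is open, while the excitatory command inherits every fluctuation of the odor drive.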

Essay 15 relevance

Although this odor navigation probably won’t be part of the essay 15 simulation, I think it’s important to describe what’s left out when simplifying a model. If the simulation becomes too simplified, it can lose the essence of the behavior. The simplification is necessary to keep the model uncluttered and focused, but the dividing line is a judgement call.

References

Steele TJ, Lanz AJ, Nagel KI. “Olfactory navigation in arthropods.” J Comp Physiol A Neuroethol Sens Neural Behav Physiol. 2023 Jul;209(4):467-488. doi: 10.1007/s00359-022-01611-9. Epub 2023 Jan 20. PMID: 36658447; PMCID: PMC10354148.