In the simulation, food search has two phases: a roaming phase, which is a correlated random walk, and a seek phase which climbs an odor gradient to food. Seek is very efficient, although it does require a timeout to handle false odor plumes, but roaming is essentially a random walk, leaving lots of opportunities for improvement. One possible improvement is reducing repeated search by avoiding already searched areas.
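The roaming phase can be illustrated with a minimal correlated random walk. This is only a sketch under assumptions, not the simulation's code: the function name, the Gaussian turn noise, and the parameters `turn_sigma` and `speed` are hypothetical choices for illustration.

```python
import math
import random

def correlated_random_walk(steps, turn_sigma=0.3, speed=1.0, seed=None):
    """Return a list of (x, y) positions for a correlated random walk.

    Each step keeps the previous heading and perturbs it with a small
    Gaussian turn, so successive directions are correlated.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    heading = rng.uniform(0.0, 2.0 * math.pi)
    path = [(x, y)]
    for _ in range(steps):
        heading += rng.gauss(0.0, turn_sigma)  # small turn around the current heading
        x += speed * math.cos(heading)
        y += speed * math.sin(heading)
        path.append((x, y))
    return path
```

A small `turn_sigma` gives long, nearly straight runs, while a large value approaches an uncorrelated random walk that revisits the same area more often.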
Evolutionary own-trail avoidance
Some of the earliest bilaterian fossil trails show the crawling animals avoiding crossing their own trail [Sims and Kiverstein 2022], suggesting this optimization existed in even the earliest bilaterians. The fossil trails also show wall-following (thigmotaxis). By combining own-trail avoidance, wall-following, and u-turns the bilaterians generated spiral-like paths that efficiently searched the local area.
The communal unicellular slime mold Physarum polycephalum leaves a trail of slime as it moves and avoids its own trail [Reid et al 2012]. Physarum has been studied for solving complex search problems such as the traveling salesman problem and maze escape, so even a simple organism can implement own-trail avoidance. Robot navigation and mapping research has also experimented with own-trail avoidance [Balch 1993]. Ants use pheromones to make trails to food [Jackson et al 2006], which is essentially an external memory for navigation.
The simulated animal represents a chordate proto-vertebrate, and the proto-vertebrates were freely swimming filter feeders, which essentially precludes leaving an odor trail because water currents would immediately disturb the odors. For the sake of this essay, I’m ignoring this practical implausibility, because I’m interested in how memory might use existing proto-vertebrate avoidance circuits. Using breadcrumbs as external memory can be a prelude to neural internal memory [Sims and Kiverstein 2022].
Odor avoidance
Odor trail avoidance could use at least two direct action paths in the proto-vertebrate brainstem. One path goes through Hb.mv (medial ventral habenula) to R1.a (anterior hindbrain motor area), and another path goes through V.pt (posterior tuberculum) to MLR (midbrain locomotor region) to R5.rs (hindbrain reticulospinal motor). Hb.mv, R1.a, MLR, and R5.rs are all highly conserved areas in vertebrates, while V.pt is likely conserved as the equivalent to mammalian midbrain dopamine Vta (ventral tegmental area) and Snc (substantia nigra pars compacta). Other action paths exist using the cortex, but to keep the animal simple, I’m still postponing all cortical circuits.
In lampreys, the Ob.m (medial olfactory bulb) projects directly to Hb.m (medial habenula) [Stephenson-Jones et al 2012], [Suryanarayana et al 2021]. Zebrafish Ob also projects to the right Hb.d (Hb.mv in mammals), which projects to R.ip.v (ventral interpeduncular nucleus; R.ip.m, medial R.ip, in mammals) [Miyasaka et al 2014], [Choi JH et al 2021], [Krishnan et al 2014], [Turner et al 2016], a path that can support chemotaxis to avoid odors [Choi JH et al 2021], [Krishnan et al 2014].
The R.ip.v (R.ip.m for mammals) is interconnected with R1.a for motor and can drive taxis avoidance, including chemotaxis [Chen WY et al 2019], phototaxis [Chen X and Engert 2014], and thermotaxis [Palieri et al 2024].
Possible negative chemotaxis for breadcrumb avoidance using the lamprey odor to habenula path. Hb.mv (medial ventral habenula), Ob.m (medial olfactory bulb), R1.a (anterior hindbrain motor area), R.ip.m (median interpeduncular nucleus), V.mr (median raphe).
Tetrapods do not have a direct Ob to Hb connection, in contrast to the lamprey and possibly fish. In the fire-bellied toad, a basal amphibian, the Ob only projects to the habenular commissure but not to Hb itself [Freudenmacher et al 2020]. Similarly, mammals do not have a direct Ob projection to Hb.mv. Instead, Hb.mv is almost entirely driven by posterior P.se (septum) [Viswanath et al 2014], [Yamaguchi et al 2013], [Choi K et al 2016], which is mainly driven by E.hc (hippocampus). This mammalian upgrade from direct Ob sensory input to cognitive E.hc input is a major reason I’m using this Hb.mv to R.ip.m path for the essay.
V.mr (median raphe) will be important for this essay to detect the transition from roaming to avoidance, allowing U-turns or directional turns only at border crossings. When the animal first enters the avoidance odor plume, it should turn away, but it shouldn’t continue turning while it’s inside the plume.
Odor seek
I’m treating the second path through V.pt as a seek action path because of its resemblance to the Vta/Snc connectivity with MLR, and the Ob to V.pt path can drive movement [Derjean et al 2010], but I haven’t seen a study showing seek functionality. In lampreys Ob.m projects to V.pt [Derjean et al 2010], which drives MLR, which drives R5.rs (hindbrain reticulospinal motor neurons) [Beauséjour et al 2022]. Zebrafish Ob.m projects to V.pt [Imamura et al 2020].
Possible odor-seek path in lampreys and fish. MLR (midbrain locomotor), Ob.m (medial olfactory bulb), OT (optic tectum), R.rs (mid-hindbrain reticulospinal motor), V.pt (posterior tuberculum).
For this essay, I’m interpreting this circuit as a seek system to approach food odors. Because MLR is highly interconnected with OT (optic tectum), I’m considering this system as “tectal” although I’m not actually using OT for this essay.
An alternative breadcrumb path
An alternative path for memory could use Hb.lm (medial Hb.l – lateral habenula), which projects to V.mr.glu (glutamate V.mr) and Vta.pm (posterior-medial Vta). Vta.pm drives RTPA (real time place avoidance) through S.msh.v (ventral S.msh – medial shell of the ventral striatum). V.mr.glu drives E.hc (hippocampus) theta through P.msdb (median septum and diagonal band), which can also drive RTPA itself.
Memory for avoiding repetition is the underlying goal here, but this essay is focused on a fictional breadcrumb odor, in part because the path is simpler. Although Hb.lm produces avoidance like Hb.mv, it’s more complicated to explain where the Hb.lm signal comes from. In contrast the lamprey Hb.mv is directly driven by Ob.m.
Possible memory avoidance paths using the habenula, hippocampus and basal ganglia. E.hc (hippocampus), Hb.lm (medial lateral habenula), P.msdb (median septum and diagonal band), S.msh.v (ventral medial shell of the ventral striatum), V.mr.glu (median raphe glutamate projection), Vta.pm (posterior medial ventral tegmental area).
The above diagram shows a possible mammalian path for memory-based avoidance. Hb.lm is driven by S.msh (medial shell of the ventral striatum), which is associated with place preference and avoidance. Because the animal is avoiding already explored areas, this is a possible action path for mammals. However, that memory-based system requires far more neural machinery than the essay’s proto-vertebrate allows.
Medial habenula
For context (but not strictly needed for this essay), Hb.m has two areas with distinct circuits, although in mammals these two areas further subdivide into five subareas. In lampreys and fish Hb.d (mammal Hb.m) is asymmetrical. In fish, odor goes to the right Hb.m, and light input goes to the left Hb.m. The right odor input is used for chemotaxis [Chen R et al 2023], [Chen WY et al 2019], such as food seeking or avoiding predators. The light input is used as a landmark for body and head direction [Lavian et al 2024], such as using the sun as a compass.
Medial habenula divisions, dorsal and ventral, with their respective inputs and outputs. The dorsal Hb.m is for landmarks and head direction and the ventral Hb.m is for taxis, including chemotaxis and thermotaxis. Hb.md (dorsal Hb.m – medial habenula), Hb.mv (ventral Hb.m), P.bac (bed nucleus of the anterior commissure), P.ldt (laterodorsal tegmental area), P.ts (triangular septum), R1.a (anterior hindbrain motor area), R.dta (dorsal tegmental area), R.dtg-vtg (dorsal and ventral tegmental nuclei of Gudden), R.ip.l (lateral R.ip – interpeduncular nucleus), R.ip.m (median R.ip), R.nin (nucleus incertus), V.mr (median raphe).
Taxis and landmark information use different areas of R.ip. Taxis uses R.ip.m (median R.ip in mammals, ventral R.ip in fish) [Krishnan et al 2014], and direction uses R.ip.l (lateral R.ip in mammals, dorsal R.ip in fish) [Lavian et al 2024]. R.ip.l and R.ip.m correspondingly project differently. R.ip.l is interconnected with directional nuclei in R.dta (dorsal tegmental area) such as R.dtg (dorsal tegmental nucleus of Gudden) for head direction, R.vtg (ventral tegmental nucleus of Gudden), and R.nin (nucleus incertus) for eye direction. R.ip.m is connected with motivational areas such as V.mr, P.ldt (laterodorsal tegmentum) and the motor R1.a.
In mammals, Hb.m only receives input from the posterior septum, specifically P.bac (bed nucleus of the anterior commissure), P.ts (triangular septum), P.sf (septofimbrial nucleus), and P.ms (median septum) [Juárez-Leal et al 2022]. These septal areas are largely driven by E.hc. This indirect, cognitive hippocampal input for mammals contrasts with the direct sensory input for fish and lampreys.
Full habenula
For further context, Hb.l (lateral habenula) also divides into two major areas, which further subdivide into nine subnuclei. Hb.lm (medial Hb.l) will likely be important soon because it’s an avoidance path that projects directly to the V.mr and Vta.pm motivational avoidance areas, but it’s not yet important for this essay. Hb.lm is important for avoidance and Hb.ll (lateral Hb.l) for failure. Hb.ll.ov (oval subnucleus of Hb.ll) is highly studied for its role in dopamine motivation and learning, and its inputs are specific to Hb.ll.ov, which contrasts with other Hb.l inputs that are more diffuse.
Zebrafish Hb.v (Hb.l in mammals) only projects to the serotonin area V.mr, but not to dopamine areas [Agetsuma et al 2010]. Because lamprey Hb.l does project to dopamine as well as serotonin areas [Stephenson-Jones et al 2012], the zebrafish pattern may be a secondary loss, but it does suggest that V.mr is more critical to Hb.l than the dopamine projection.
Functional connectivity of the habenula. H.l.glu (lateral hypothalamus glutamate projection), Hb.ll (oval nucleus of the lateral Hb.l – lateral habenula), Hb.lm (medial Hb.l), Hb.md (dorsal Hb.m – medial habenula), Hb.mv (ventral Hb.m), P.bac (bed nucleus of the anterior commissure), P.epn (endopeduncular nucleus), P.ldt (laterodorsal tegmental area), P.ts (triangular septum), R1.a (anterior hindbrain motor area), R.dta (dorsal tegmental area), R.dtg-vtg (dorsal and ventral tegmental nuclei of Gudden), R.ip.l (lateral R.ip – interpeduncular nucleus), R.ip.m (medial R.ip), V.mr (median raphe), Vta.l (lateral Vta – ventral tegmental area), Vta.pm (posteromedial Vta).
The above diagram shows a functional diagram of Hb and some of its inputs and outputs. This essay uses Hb.mv avoidance taxis to avoid the breadcrumb odor. The next essay may use Hb.lm avoidance (non-taxis avoidance) for place avoidance memory. Hb.md landmark and Hb.ll failure are not currently used in the simulation, but may become important soon.
Simulation
The essay’s simulation has the animal searching for food using a correlated random walk as its base search strategy. Adding breadcrumbs for the animal’s own path could potentially improve the search by avoiding re-searching old areas. The simulation uses an Ob to Hb.m path, which then drives R1.a motor area. If the animal detects a breadcrumb, it turns away from its current path.
Simulation modules for the breadcrumb negative taxis. Hb.mv (ventral Hb.m – medial habenula), Ob.m (medial olfactory bulb), R1.a (anterior hindbrain motor area), R.ip (interpeduncular nucleus), V.mr (median raphe).
The above diagram shows the simulation modules in this breadcrumb avoidance action path. HbTaxis includes both Hb.m and R.ip. The Raphe module represents V.mr and stores the current avoidance action for a short time, on the order of a second. The animal should only make a U-turn if it newly encounters its trail. If it’s already avoiding the trail, it should move ballistically. The Raphe module maintains the current avoidance action to enable boundary-only turns.
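The boundary-only turn can be sketched as an edge detector with a short hold. This is a minimal sketch, not the simulation's Raphe module: the class name `RapheLatch`, the `hold_ticks` parameter, and the tick-based timing are assumptions for illustration.

```python
class RapheLatch:
    """Hypothetical short-term store for the current avoidance action."""

    def __init__(self, hold_ticks=10):
        self.hold_ticks = hold_ticks  # assumed hold time, in simulation ticks
        self.remaining = 0            # ticks left in the current avoidance

    def update(self, breadcrumb_detected):
        """Return True only when the breadcrumb is newly encountered."""
        was_avoiding = self.remaining > 0
        if breadcrumb_detected:
            self.remaining = self.hold_ticks  # refresh while the odor persists
        elif self.remaining > 0:
            self.remaining -= 1               # decay after leaving the trail
        # Turn only on the transition from not-avoiding to avoiding.
        return breadcrumb_detected and not was_avoiding


latch = RapheLatch(hold_ticks=2)
odor = [False, True, True, True, False, False, True]
turns = [latch.update(o) for o in odor]
```

In this example the latch signals a turn on the first contact and again only after the hold has fully decayed, so the animal moves ballistically while it remains inside the trail odor.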
Screenshot of the animal seeking food in an open field. The teal star represents food, the teal circle represents its odor plume, and the purple circles represent the breadcrumb trail.
Simulation roam
Roaming is driven by circadian wake and by a FoodZone detection, as used in essay 43. Parts of H.l respond to the animal entering a food zone [Jennings et al 2015]. For this essay, H.l HypMove drives roam when outside a food zone and pauses inside a food zone for filter feeding. HypMove represents the SLR (subthalamic / hypothalamic locomotor region), which is part of H.l [Ji C et al 2024].
Simulation modules for roaming. Wake and hunger drive roaming, which stops when the animal reaches a food zone. H.l (lateral hypothalamus), H.scn (suprachiasmatic nucleus – circadian), N.sp (spinal cord), P.bst (bed nucleus of the stria terminalis), SLR (subthalamic motor region), S.ls (lateral septum), R1.a (anterior hindbrain motor area).
Importantly, the roaming signal needs to be able to disable breadcrumb avoidance. If the animal is in a food zone, any roaming optimization needs to stop when roaming stops. In the essay’s simulation, I’m using R1.a HindMove as an integration point for roaming motivation with the chemotaxis.
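The gating described above can be sketched as a small function. This is an illustration under assumptions, not the simulation's HindMove code: the function name, the argument names, and the use of `None` to mean "no movement command" are all hypothetical.

```python
def hind_move(roam_enabled, avoid_turn, roam_turn):
    """Hypothetical R1.a integration; returns a turn angle or None for a pause.

    roam_enabled: roaming drive from HypMove (False inside a food zone).
    avoid_turn:   turn requested by breadcrumb taxis, or None if no trail.
    roam_turn:    turn proposed by the correlated random walk.
    """
    if not roam_enabled:
        return None        # roaming stopped, so breadcrumb avoidance stops too
    if avoid_turn is not None:
        return avoid_turn  # breadcrumb taxis overrides the random roam turn
    return roam_turn
```

Checking the roaming signal first is the design point: a breadcrumb inside a food zone never produces a turn, because avoidance is only an optimization of roaming.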
Simulation seek
The breadcrumb avoidance needs to coordinate with the Seek module. Seek follows a target odor toward food, essentially chemotaxis. The Seek module implements a bilateral, directional seek. In lamprey the V.pt receives direct input from Ob and projects to MLR [Derjean et al 2010], [Beauséjour et al 2022], [Beauséjour et al 2024], which drives locomotion through R5.rs.chx10 (mid-hindbrain reticulospinal) [Cregg et al 2020]. The Seek module is enabled by HypMove, which represents H.l, in particular its roaming signal.
Simulation modules for the seek action path, directly driven by olfactory input. H.l (lateral hypothalamus), MLR (midbrain locomotor region), N.sp (spinal cord), Ob.m (medial olfactory bulb), S.lsh (lateral shell of the ventral striatum), R5.chx10 (mid-hindbrain locomotor region), V.pt (posterior tuberculum).
The Seek module uses an entirely different locomotion action path than the breadcrumb’s avoid action path. Seek drives MidMove, representing MLR, which projects to R5.rs.chx10 (mid-hindbrain reticulospinal motor area), which is distinct from the R1.a motor area. In contrast, the breadcrumb taxis uses HbTaxis to drive the HindMove R1.a module. These two action paths only directly interact at the spinal cord motoneurons. Because there’s no central node that manages these two action paths, they need to inhibit each other as a distributed system.
Importantly, the Seek module needs a timeout to avoid perseveration. I’m using the striatum as a timeout system, as I’ve done in the last few essays. The striatum region would likely correspond to the mammalian S.lsh (lateral shell of S.v – ventral striatum) or S.core (core of S.v) because those are involved with seeking, as opposed to S.msh (medial shell of S.v), which is more involved with place.
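A seek timeout with a fixed recovery period can be sketched as a simple counter. This is a sketch under assumptions rather than the simulation's striatum module: the class name, the tick counts, and the choice to reset the counter when the odor disappears are all illustrative.

```python
class SeekTimeout:
    """Hypothetical striatum-like timeout that gates the Seek module."""

    def __init__(self, timeout_ticks=150, recovery_ticks=150):
        self.timeout_ticks = timeout_ticks    # assumed seek duration before giving up
        self.recovery_ticks = recovery_ticks  # assumed fixed recovery time
        self.elapsed = 0                      # ticks spent in the current seek
        self.recovering = 0                   # ticks left before seek may resume

    def allow_seek(self, odor_present):
        """Return True if seeking is permitted on this tick."""
        if self.recovering > 0:
            self.recovering -= 1
            return False
        if not odor_present:
            self.elapsed = 0                  # simplification: reset without odor
            return False
        self.elapsed += 1
        if self.elapsed >= self.timeout_ticks:
            self.elapsed = 0
            self.recovering = self.recovery_ticks
            return False                      # give up: perseveration stopped
        return True


gate = SeekTimeout(timeout_ticks=3, recovery_ticks=2)
allowed = [gate.allow_seek(True) for _ in range(6)]
```

With a constant odor the gate allows seeking, times out, waits through recovery, and then allows seeking again, which is the seek and timeout cycle that a false plume or a barrier can produce.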
Screenshot of the U-trap scenario as the animal times out its seek. The teal star represents the food zone and the teal disk represents its odor plume.
The above screenshot shows the animal just after the striatum timeout expires. The circular teal area is an odor plume, the teal star is the food zone, and the beige walls are barriers. While Seek is active, the animal struggles against the barrier to try to follow the odor plume. When the striatum timeout expires, the animal returns to its roaming.
Monte Carlo results
I updated the simulation framework to enable Monte Carlo experiments without using the graphical view. Each scenario timed the animal searching, finding, and eating food, with success defined as nutrients in the animal’s gut. The scenarios executed 200 times. The two scenario maps were an open field and a U-shaped trap. Each map had a scenario with a large seek odor plume and a scenario without an odor plume.
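The shape of such a headless Monte Carlo experiment can be sketched as follows. The trial here is a toy stand-in (a lattice walk toward a random food cell), not the essay's simulation, and every name and parameter (`run_trial`, `monte_carlo`, `size`, `max_ticks`) is an assumption for illustration. Its output says nothing about the essay's actual results.

```python
import random

def run_trial(rng, avoid_trail, size=15, max_ticks=400):
    """Toy stand-in for one scenario run on a square lattice.

    Returns the tick at which food was found, or None on failure.
    """
    x = y = size // 2
    food = (rng.randrange(size), rng.randrange(size))
    visited = {(x, y)}
    for tick in range(max_ticks):
        if (x, y) == food:
            return tick
        moves = [(x + dx, y + dy)
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= x + dx < size and 0 <= y + dy < size]
        if avoid_trail:
            fresh = [m for m in moves if m not in visited]
            moves = fresh or moves  # fall back when boxed in by the trail
        x, y = rng.choice(moves)
        visited.add((x, y))
    return None

def monte_carlo(avoid_trail, runs=200, seed=0):
    """Run repeated trials and summarize success rate and mean search time."""
    rng = random.Random(seed)
    times = [run_trial(rng, avoid_trail) for _ in range(runs)]
    found = [t for t in times if t is not None]
    return {
        "runs": runs,
        "success_rate": len(found) / runs,
        "mean_ticks": sum(found) / len(found) if found else None,
    }
```

Calling `monte_carlo(False)` and `monte_carlo(True)` with the same seed compares the two strategies on the same random stream, and fixing the seed keeps each experiment reproducible.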
Monte Carlo results comparing roaming without breadcrumb avoidance against trail avoidance.
Although the breadcrumb trail shows a small improvement in the open field, it’s a minor difference. For this simple implementation there isn’t a huge gain with the breadcrumbs. It’s possible that a better implementation would improve the results, but this essay was looking for large gains from a simple change.
The breadcrumb strategy did avoid crossing the animal’s trail more than the roaming-only strategy, but often the breadcrumbs pushed the animal away from the goal. If the animal made a mistaken turn away from the goal, the trail-avoidance would exacerbate that mistake by driving the animal to search further away from the goal. In contrast the default roam would often reverse its mistake.
The seek trap scenario found that a striatum timeout followed by avoidance was better than a timeout that just disabled seek. If that result generalizes, it might help explain why the S.v (ventral striatum) output region P.v (ventral pallidum) produces avoidance when triggered by the S.d2 (striatum projection neurons with D2.i inhibitory dopamine receptors) indirect path.
The seek trap also showed the need for progressively increasing timeout. Because the timeout recovery time is currently fixed, the animal could restart seeking before exiting the trap, producing a cycle of seek and timeout. The current striatum timeout matches the adenosine building on the timeframe of 120s to 180s, but it doesn’t include longer term plasticity. Plasticity would progressively increase the timeout on the order of 20 minutes to an hour.
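One way to picture a progressively increasing timeout is a capped geometric backoff. This is a sketch under assumptions, not a model of adenosine or plasticity: the doubling `growth` factor is hypothetical, the 150 s base is just a value inside the 120s to 180s range above, and the one-hour cap follows the upper end of the plasticity range.

```python
def recovery_time(base_s, consecutive_timeouts, growth=2.0, cap_s=3600.0):
    """Hypothetical recovery time after repeated seek timeouts on the same odor.

    base_s:               recovery time after the first timeout, in seconds.
    consecutive_timeouts: number of earlier timeouts without reaching food.
    growth:               assumed multiplicative increase per timeout.
    cap_s:                assumed upper bound on the recovery time.
    """
    return min(cap_s, base_s * growth ** consecutive_timeouts)


schedule = [recovery_time(150.0, n) for n in range(6)]
```

Each failed seek lengthens the pause before the next attempt, which gives the roaming phase more time to carry the animal out of the trap before seek restarts.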
Issues raised by the simulation
The simulation raised several issues because it integrated multiple action paths that I’d previously implemented independently.
The breadcrumb avoid in Hb-R.ip should respect the roam and food zone calculated in H.l.
The breadcrumb avoid is distinct from chemotaxis avoid such as avoiding predator odor, which is also in the Hb-R.ip circuit.
How does the breadcrumb avoid interact with ARTR (anterior hindbrain turning region)?
How does the R.pb (parabrachial) toxic-environment avoid interact with the Hb-R.ip taxis avoid?
How does V.rn (serotonin raphe nuclei) interact with Hb-R.ip and R1.a? These regions are highly interconnected.
Avoid itself should have a timeout. S.msh.v and Vta.pm are activated for avoidance and could serve as an avoidance timeout.
Seek (V.pt) uses a different MLR action path and mid-hindbrain R5.rs than the anterior hindbrain R1.a motor output used by SLR roam and Hb-R.ip avoidance. How is this conflict managed? In the lamprey, inhibiting SLR does not affect the Ob to MLR to R5.rs action path [Derjean et al 2010].
Seek needs to stop when roaming stops for a food zone.
Seek timeout needs to progressively increase when the initial timeout is insufficient to escape the U-trap.
Roam vs seek action paths
I’m treating the roam action path as distinct from the seek path. Roam uses Hb.m → R.ip → R1.a using SLR, but seek uses V.pt → MLR → R5.rs.chx10. These two paths use similar input from innate-odor Ob.m and final output N.sp (spinal motoneurons), but everything else is independent. I’m associating the roaming path with limbic areas and seek path with tectal-associated areas, but OT (optic tectum) is not part of the essay’s simulation. The important issue here is how the two paths interact.
For the seek path I’m using the lamprey V.pt as a proto-vertebrate seek precursor to the mammalian Vta.l and S.lsh seek system.
Subcircuit showing the distinct action paths for roaming and seeking. Roaming is associated with SLR and limbic areas, and seeking is associated with MLR and tectal-associated areas. Hb.mv (ventral medial habenula), MLR (midbrain locomotor region), N.sp (spinal motoneurons), Ob.m (medial olfactory bulb), P.ldt (laterodorsal tegmental area), R1.a (anterior hindbrain motor region), R5.chx10 (mid-hindbrain motor region), V.pt (posterior tuberculum).
Studies involving nicotine addiction have identified an inhibitory path in mammals from R.ip avoidance via P.ldt (laterodorsal tegmentum) to the Vta to S.lsh circuit [Wolfman et al 2018], [Kim K and Picciotto 2023]. R.ip ⊣ P.ldt → Vta.l → S.lsh. R.ip inhibits P.ldt, which removes P.ldt’s excitatory drive to Vta.l phasic dopamine, which in turn suppresses seek.
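The net effect of this chain can be checked by multiplying the signs of its links. This is a bookkeeping sketch only: the list encoding and the function name are assumptions, with -1 standing for an inhibitory link and +1 for an excitatory one, as in the arrow notation above.

```python
from math import prod

# Each link is (source, target, sign): -1 inhibitory, +1 excitatory.
AVOID_TO_SEEK = [
    ("R.ip", "P.ldt", -1),
    ("P.ldt", "Vta.l", +1),
    ("Vta.l", "S.lsh", +1),
]

def net_sign(path):
    """Return +1 if the first node excites the last node overall, -1 if it suppresses it."""
    return prod(sign for _, _, sign in path)
```

Because the chain has a single inhibitory link, `net_sign(AVOID_TO_SEEK)` is -1: activity in R.ip suppresses the S.lsh seek system.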
For the simulation, I’m using P.ldt as an inhibitory path from HbTaxis to inhibit Seek. In mammals P.ldt and Ppt (pedunculo-pontine tegmentum) are distinct but related areas, but non-mammal studies do not show distinct areas, at least for the studies I’ve read. I’m assuming a proto-vertebrate would have a single Ppt/P.ldt complex. Ppt is either part of the MLR or at least highly associated with it, and Ppt is highly interconnected with OT.
Simulation model
The Hb.m roam and V.pt seek action paths described above need to interact with the hunger and food-zone driving input from H.l HypMove. In this system, H.l HypMove drives both R1.a and V.pt. In mammals H.l as SLR drives R1.a for roaming [Ji C et al 2024]. Mammalian H.l is also strongly interconnected with Vta.
The R.pb.l tactile toxic avoidance needs to interact with the R1.a roaming circuit. The simulation’s R.pb.l RpbAvoid drives avoiding in R1.a HindMove, which is almost certainly incorrect. R.pb drives avoidance for place-specific irritations like itch. This R.pb itch-avoidance projection is to hypothalamic nuclei such as H.pv and H.l, and is distinct from R.pb projections to S.a (central amygdala) and P.bst (bed nucleus of the stria terminalis), which functionally handle food-related issues like sickness. In the diagram, the dotted line from RpbAvoid to HypMove does not currently exist in the simulation, but I will need to change that connectivity in a later essay.
Discussion
The experiment explored whether a simple breadcrumb odor could improve searching for food. The breadcrumb odor drives an avoidance circuit in R1.a using the same avoidance action path as for predator odors with Hb.m to R.ip. The essay’s implementation did not show a significant improvement over the roaming random walk.
One possibility is that this result is accurate and a simple breadcrumb trail is not an improvement over random walk for this scenario. The breadcrumb avoids crossing the animal’s own path. When the animal makes a wrong choice away from the food, this avoidance can exacerbate the error by forcing the animal to continue searching further away instead of crossing the trail to correct the mistake.
Another possibility is that the essay’s implementation is too simplistic or is broken. While that is possible or even likely, I’d have expected that if breadcrumbs provide an immediate large improvement, even a flawed implementation would show significant gains.
The most significant improvement for seeking was the timeout and subsequent avoidance when the animal gave up seeking the odor. This timeout avoidance was more effective than the breadcrumb avoidance of the seek, and both were more effective than giving up and resuming roam without an avoidance phase. The striatum as a timeout and memory device is both simpler than a complicated mental breadcrumb system and potentially more effective.
References
Agetsuma M, Aizawa H, Aoki T, Nakayama R, Takahoko M, Goto M, Sassa T, Amo R, Shiraki T, Kawakami K, et al. The habenula is crucial for experience-dependent modification of fear responses in zebrafish. Nat Neurosci. 2010;13:1354-1356.
Beauséjour, P.A., Zielinski, B. and Dubuc, R., 2022. Olfactory-induced locomotion in lampreys. Cell and tissue research, pp.1-15.
Beauséjour PA, Veilleux JC, Condamine S, Zielinski BS, Dubuc R. Olfactory Projections to Locomotor Control Centers in the Sea Lamprey. Int J Mol Sci. 2024 Aug 29;25(17):9370.
Chen R, Xu X, Wang XY, Jia WB, Zhao DS, Liu N, Pang Z, Liu XQ, Zhang Y. The lateral habenula nucleus regulates pruritic sensation and emotion. Mol Brain. 2023 Jun 27;16(1):54.
Chen WY, Peng XL, Deng QS, Chen MJ, Du JL, Zhang BB. Role of Olfactorily Responsive Neurons in the Right Dorsal Habenula-Ventral Interpeduncular Nucleus Pathway in Food-Seeking Behaviors of Larval Zebrafish. Neuroscience. 2019 Apr 15;404:259-267.
Chen X, Engert F. Navigational strategies underlying phototaxis in larval zebrafish. Front Syst Neurosci. 2014 Mar 25;8:39.
Choi JH, Duboue ER, Macurak M, Chanchu JM, Halpern ME. Specialized neurons in the right habenula mediate response to aversive olfactory cues. Elife. 2021 Dec 8;10:e72345.
Choi K, Lee Y, Lee C, Hong S, Lee S, Kang SJ, Shin KS. Optogenetic activation of septal GABAergic afferents entrains neuronal firing in the medial habenula. Sci Rep. 2016 Oct 5;6:34800.
Cregg JM, Leiras R, Montalant A, Wanken P, Wickersham IR, Kiehn O. Brainstem neurons that command mammalian locomotor asymmetries. Nat Neurosci. 2020 Jun;23(6):730-740.
Derjean D, Moussaddy A, Atallah E, St-Pierre M, Auclair F, Chang S, Ren X, Zielinski B, Dubuc R. A novel neural substrate for the transformation of olfactory inputs into motor output. PLoS Biol. 2010 Dec 21;8(12):e1000567.
Freudenmacher, L., Schauer, M., Walkowiak, W., & von Twickel, A. (2020). Refinement of the dopaminergic system of anuran amphibians based on connectivity with habenula, basal ganglia, limbic system, pallium, and spinal cord. Journal of Comparative Neurology, 528(6), 972-988.
Imamura F, Ito A, LaFever BJ. Subpopulations of Projection Neurons in the Olfactory Bulb. Front Neural Circuits. 2020 Aug 28;14:561822.
Jennings JH, Ung RL, Resendez SL, Stamatakis AM, Taylor JG, Huang J, Veleta K, Kantak PA, Aita M, Shilling-Scrivo K, Ramakrishnan C, Deisseroth K, Otte S, Stuber GD. Visualizing hypothalamic network dynamics for appetitive and consummatory behaviors. Cell. 2015 Jan 29;160(3):516-27.
Ji, C., Zhang, Y., Lin, Z., Zhao, Z., Jiao, Z., Zheng, Z., Shi, X., Wang, X., Li, Z., Yu, S. and Qu, Y., 2024. Activation of hypothalamic-pontine-spinal pathway promotes locomotor initiation and functional recovery after spinal cord injury in mice. bioRxiv, pp.2024-11.
Juárez-Leal I, Carretero-Rodríguez E, Almagro-García F, Martínez S, Echevarría D, Puelles E. Stria medullaris innervation follows the transcriptomic division of the habenula. Sci Rep. 2022 Jun 16;12(1):10118.
Kim, K. and Picciotto, M.R., 2023. Nicotine addiction: More than just dopamine. Current opinion in neurobiology, 83, p.102797.
Krishnan S, Mathuru AS, Kibat C, Rahman M, Lupton CE, Stewart J, Claridge-Chang A, Yen SC, Jesuthasan S. The right dorsal habenula limits attraction to an odor in zebrafish. Curr Biol. 2014 Jun 2;24(11):1167-75.
Lavian, H., Prat, O., Petrucco, L., Stih, V. and Portugues, R., 2024. The representation of visual motion and landmark position aligns with heading direction in the zebrafish interpeduncular nucleus. BioRxiv, pp.2024-09.
Miyasaka N, Arganda-Carreras I, Wakisaka N, Masuda M, Sümbül U, Seung HS, Yoshihara Y. Olfactory projectome in the zebrafish forebrain revealed by genetic single-neuron labelling. Nat Commun. 2014 Apr 9;5:3639.
Palieri V, Paoli E, Wu YK, Haesemeyer M, Grunwald Kadow IC, Portugues R. The preoptic area and dorsal habenula jointly support homeostatic navigation in larval zebrafish. Curr Biol. 2024 Feb 5;34(3):489-504.e7.
Stephenson-Jones M, Floros O, Robertson B, Grillner S. Evolutionary conservation of the habenular nuclei and their circuitry controlling the dopamine and 5-hydroxytryptophan (5-HT) systems. Proc Natl Acad Sci U S A. 2012 Jan 17;109(3):E164-73.
Suryanarayana, S. M., Perez-Fernandez, J., Robertson, B., & Grillner, S. (2021). Olfaction in lamprey pallium revisited—dual projections of mitral and tufted cells. Cell Reports, 34(1).
Turner KJ, Hawkins TA, Yáñez J, Anadón R, Wilson SW, Folgueira M. Afferent Connectivity of the Zebrafish Habenulae. Front Neural Circuits. 2016 Apr 26;10:30.
Viswanath H, Carter AQ, Baldwin PR, Molfese DL, Salas R. The medial habenula: still neglected. Front Hum Neurosci. 2014 Jan 17;7:931.
Wolfman SL, Gill DF, Bogdanic F, Long K, Al-Hasani R, McCall JG, Bruchas MR, McGehee DS. Nicotine aversion is mediated by GABAergic interpeduncular nucleus inputs to laterodorsal tegmentum. Nat Commun. 2018 Jul 13;9(1):2710.
Yamaguchi T, Danjo T, Pastan I, Hikida T, Nakanishi S. Distinct roles of segregated transmission of the septo-habenular pathway in anxiety and fear. Neuron. 2013 May 8;78(3):537-44.
H.l (lateral hypothalamus) is a key node in the foraging system and has an interesting capability of distinguishing a food zone from a non-food zone [Jennings et al 2015]. In a sense foraging is searching for a food zone and then eating.
Foraging as a state machine.
The above diagram shows the foraging phases that I’ve already covered in earlier essays. Importantly, each phase is an independent action path as part of a distributed system, not merely a state in a state machine. To force the separate action paths to act like a state machine, each transition needs to suppress the preceding and following state. In particular the eating phase needs to inhibit the seeking system. This lateral inhibition is important because circuitry is required to force activation of only a single system at a time.
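How lateral inhibition can make independent action paths behave like a state machine can be sketched as a small winner-take-all loop. This is an illustration under assumptions, not the essay's circuitry: the phase names, the drive values, and the `inhibition`, `rate`, and `steps` parameters are all hypothetical.

```python
def settle(drives, inhibition=1.0, rate=0.2, steps=50):
    """Relax mutually inhibiting action paths toward a steady activity level.

    drives: dict mapping a phase name to its input drive (0.0 to 1.0).
    Each path is pushed toward its drive minus the summed activity of the
    other paths, floored at zero.
    """
    act = dict(drives)
    for _ in range(steps):
        new_act = {}
        for name, drive in drives.items():
            others = sum(a for n, a in act.items() if n != name)
            target = max(0.0, drive - inhibition * others)
            new_act[name] = act[name] + rate * (target - act[name])
        act = new_act
    return act


# Food-zone example: eating has the strongest drive, so it suppresses
# the still-active seek and roam paths.
result = settle({"roam": 0.2, "seek": 0.6, "eat": 1.0})
winner = max(result, key=result.get)
```

No central controller picks the phase here. With strong enough mutual inhibition, the path with the largest drive silences the others, so only one phase is effectively active at a time.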
The food zone is particularly interesting for filter feeding, which is naturally area based and long term, as opposed to snapping up a morsel of food. Non-vertebrate chordates are filter feeders, lamprey larvae are filter feeders, and early jawless vertebrates were also likely filter feeders [D’Aniello et al 2023]. Tunicate ascidians, the closest non-vertebrate chordates, have an extreme version of this foraging loop, where the tadpoles find a feeding place after swimming for 12 hours and then settle in place for their adult life [Anselmi et al 2024]. The ascidian foraging state machine is a straight line that ends in the eating phase in the food zone, not continuing in a loop. The ascidian search and settle might give a hint how the vertebrate foraging circuitry is organized.
Ascidians
As covered in essay 30, the ascidian larva nervous system has several seeking (taxis) systems: geotaxis (gravity avoidance – moving up), phototaxis (light avoidance), and dimming for predator and obstacle avoidance. Ascidian navigation disperses the larva from its parent and prefers to settle on the underside of ledges by avoiding gravity while avoiding light. Its settling sensors also avoid toxic or irritating areas and may try to find food-friendly areas, although the specific sensor capabilities aren’t well known. When the larva finds an appropriate place, around 14 hours after hatching, it settles for life [Hoyer et al 2024].
Functional organization of the ascidian larva navigation and settling circuit.
The above diagram is a functional representation of the ascidian larva navigation brain. For this essay the important part is the palp and food-zone sensor and the settling neurons that inhibit motor neurons. The palps are three tentacle-like protrusions from the larva head, which attach the ascidian to a rock with cement glands [Johnson et al 2024]. They contain chemosensors and mechanosensors that distinguish the settling zone from non-settling zones [Hoyer et al 2024]. Interestingly, the genetic markers for the palp neurons are similar to markers for the vertebrate forebrain.
Head cement glands still exist in some fish larvae [Pottin et al 2010] and most frog tadpoles [Nokhbatolfoghahai and Downie 2005], [Rétaux and Pottin 2011], [Sive and Bradley 1996]. Frog tadpoles will swim up and attach to the underside of leaves or to the water surface. This cement gland and settling system may have existed in the pre-vertebrate ancestor and been shared by tunicates and vertebrates. Unlike the ascidians, the pre-vertebrates likely did not permanently settle. For the sake of this essay, let’s assume they temporarily settled to filter feed in a location and only moved on if filter feeding was unsuccessful or if forced to move by predators, competitors, or environmental hazards.
Ascidian larva navigation and palp settling circuit with the settling circuit highlighted. Each of the boxes represents a single neuron or a small (5-10) group of neurons. Labels are neuron names.
The above diagram shows specific neurons in the ascidian larva brain. The important part here is the connection from the glutamate pnIN (palp interneuron) to the GABA pnRN (palp relay neuron), which inhibits all motor neurons and interneurons. Comparing vertebrate and ascidian neural systems is sketchy and probably should be avoided because both have diverged [Holland 2016]. For this essay, I’ll ignore that sound advice to try to motivate part of the vertebrate nervous system.
H.stn as an analogous node to the settling neurons. H.stn (subthalamic nucleus), MLR (midbrain locomotor region), Ob (olfactory bulb), OT (optic tectum), P.v (ventral pallidum), R.rs (reticulospinal motor command), S.v (ventral striatum), S.nr (substantia nigra pars reticulata), V.pt (posterior tuberculum)
The above diagram shows the H.stn (subthalamic nucleus) as fulfilling a similar role as the pnIN from ascidian Ciona, suppressing seek in preparation for eating. Part of P.v (ventral pallidum) suppresses S.v (ventral striatum) during eating [Vachez et al 2021]. This P.v “arkypallidal” subset is named after similar neurons in P.ge (globus pallidus) that suppress S.d (dorsal striatum). Although the driver of this eating suppression isn’t known, the timing of the arkypallidal activation closely matches V.dr serotonin food activation [Spring and Nautiyal 2024], ramping at the end of seek and peaking after eating. Also, H.stn and P.ge form an oscillating pair, evident in Parkinson’s disease. So, it’s plausible that H.stn drives persistent suppression of the seek path in S.v through its projection to P.v, possibly influenced or driven by V.dr (dorsal raphe, serotonin). This specific path is speculation but seems compatible with experiments. The second suppression path is the well-known H.stn to S.nr (substantia nigra pars reticulata) path that suppresses motor activity. S.nr has widespread suppression of MLR (midbrain locomotor region) and R.rs (reticulospinal motor command), and S.nr also suppresses Snc (substantia nigra pars compacta dopamine). Note that the medial H.stn, the area connected with P.v, merges with H.l with minimal boundary [Haynes and Haber 2013].
Food zone
Let’s return to the H.l food zone in [Jennings et al 2015] and consider where the food zone information might come from. Following [Jacobs 2012], let’s treat olfaction as the central sense for navigation, which is particularly compelling for food zones.
The diagram below shows the H.l main connectivity. Not displayed is the H.l internal sensing of nutrient information peptides like glucose sensing and leptin fat sensing. H.l doesn’t receive direct sensory input with the exception of R.pb (parabrachial nucleus), which sends nociceptive information like itch or pain. Because an itchy or painful place is a poor choice for filter feeding, this R.pb input is negative place information for a filter-feeding zone, but R.pb doesn’t give positive reasons to stay like food odors.
H.l connectivity encompasses much of the limbic system, driven by olfactory information. A.bl (basolateral amygdala), E.hc (hippocampus), F.pfc (prefrontal cortex), H.arc (hypothalamus arcuate), H.l (lateral hypothalamus), H.pv (paraventricular hypothalamus), H.stn (subthalamic nucleus), Hb.l (lateral habenula), M.pag (periaqueductal gray), Ob (olfactory bulb), O.pir (piriform cortex), P.bst (bed nucleus of the stria terminalis), P.v (ventral pallidum), R.pb (parabrachial), S.a (central amygdala), S.ls (lateral septum), S.v (ventral striatum), V.dr (dorsal raphe – serotonin), Vta (ventral tegmental area – dopamine)
As the diagram suggests, the information H.l receives about food sources is very abstract. It receives cue information from A.bl (basolateral amygdala), place information from E.hc (hippocampal complex), value-like information from F.ofc (orbitofrontal cortex) and task-like information from F.vm (ventromedial prefrontal cortex). All of those areas are strongly connected with the olfactory system. While H.l doesn’t receive odor place information directly from sensors, it receives multiple organizational perspectives on odor information. P.bst (bed nucleus of the stria terminalis) receives very similar olfactory input as H.l, and it also receives negative information from R.pb. However, R.pb sends different nociceptive information to the S.a (central amygdala)/P.bst extended amygdala than it sends to H.l [Arthurs et al 2023]. The R.pb projections to H.l compared to S.a/P.bst are not redundant.
Not only are the H.l inputs abstract, but the outputs are also abstract, in contrast to direct action paths. This abstraction might be a later evolutionary development, similar to V.pt (posterior tuberculum) in zebrafish. V.pt is roughly homologous to Vta (ventral tegmental area) in mammals, but V.pt has more direct locomotor output to MLR (midbrain locomotor region), while most of Vta’s output is generally abstract.
As a note, the diagram does not include H.l ox (orexin) or H.l mch (melanin-concentrating hormone), partially for simplicity and partially because the zebrafish H.l is distinct from the ox and mch populations, suggesting that the mammalian ox and mch areas of H.l can be separated from the rest of H.l function. The diagram also omits some other connections like Ppt (pedunculopontine nucleus).
Food and serotonin
Returning to the foraging state diagram, it’s important that each “state” is a large, distributed, complex system, not a state in a state machine. The seek state includes areas like S.v, Vta, H.l, E.hc, F.pfc, and the motor regions MLR and R.rs (reticulospinal motor command) with the help of cortical areas and can include OT (optic tectum). Although the eating state is small, it is still comprised of many areas, including V.dr (dorsal raphe), OT.d, R.my.irt (medulla eating), H.l, H.pstn (parasubthalamic nucleus), R.pb and possibly some Vta and S.v subareas. Although the system is not a state machine, each “state” needs to laterally suppress the other systems to prevent multiple action paths from colliding.
Foraging state machine with dopamine and serotonin modulation. DA (dopamine), V.dr (dorsal raphe), Vta (ventral tegmental area), 5HT (serotonin).
The split between eat and seek is important, because many studies merge the behavior into a general category “feeding.” Because some experiments only measure total feeding, it can be difficult to distinguish whether the experiment is measuring a seek effect or an eating effect. For example, eating needs to suppress seek to keep the animal from wandering away from the food. If an experiment stimulates eat but inhibits seek, the animal might not search for food even if it’s ready to eat. If it doesn’t seek food, it doesn’t find food.
This distinction between eating and seeking matters for serotonin, which has a role in feeding. The serotonin from V.dr is a heterogeneous system, with V.dr having at least 14 different genetic clusters [Okaty et al 2020] and at least 11 different projection patterns [Ren et al 2014]. Earlier studies noted that 30% of V.dr neurons were active during eating [Fornal et al 1996], and many others have noted V.dr being active for “reward” (eating).
Suppose one component of V.dr serotonin encourages eating while discouraging seeking. If an experiment floods the brain with serotonin, it might see total feeding drop because serotonin suppresses seeking food, even if it encourages long meals when it finds food. The confusion becomes greater for studies looking for the even more abstract “reward” as opposed to concrete eating. The point is that serotonin in particular is a complicated system, not reducible to a single value or function.
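As a toy illustration of that confound, here is a minimal arithmetic sketch. The function name and all the numbers are hypothetical, picked only to show how total feeding can drop even when each meal gets longer; they are not measurements from any study.

```python
def total_feeding(meals_found: float, food_per_meal: float) -> float:
    """Total intake is the number of meals found times the food eaten per meal."""
    return meals_found * food_per_meal


# Hypothetical baseline: normal seeking, normal meal size.
baseline = total_feeding(meals_found=10, food_per_meal=1.0)

# Hypothetical serotonin flood: seeking is suppressed (fewer meals found),
# but eating is encouraged (each meal is longer).
flooded = total_feeding(meals_found=4, food_per_meal=1.5)

print(baseline, flooded)
```

An experiment that only measures the total would report that serotonin reduces feeding, even though in this toy case it increased eating per meal.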
Eating related effects of serotonin. DA (dopamine), H.arc (hypothalamus arcuate), H.stn (subthalamic nucleus), P.v (ventral pallidum), S.nr (substantia nigra pars reticulata), V.dr (dorsal raphe), Vta (ventral tegmental area), Vta.g (GABA neurons of Vta), 5HT (serotonin)
The above diagram shows some of the eating-related projections. Only a few of the 14 V.dr subtypes are known. The V.dr to Vta connection is one of the known projections and drives the seek system [Courtiol et al 2021], [Wang HL et al 2019]. Unfortunately, the other projections are not known, in particular the 30% of V.dr that is active while eating [Bromberg-Martin et al 2010].
V.dr enhances satiety with 5HT2c.q (serotonin G-q stimulating receptor) in H.arc POMC satiety neurons, which suppresses the AgRP hunger peptide. Note that AgRP drops just before eating, suggesting that it’s a seek-promoting system, not an eating-promoting system [Bhave and Nettow 2021]. The prediction suppression only occurs after training and V.dr serotonin shows inverse behavior, possibly suggesting V.dr as suppressing H.arc. Untrained V.dr serotonin only responds after tasting [Li et al 2016], but trained V.dr serotonin responds about 2 seconds before eating [Zhong et al 2016].
Filter feeding and foraging theory
Let’s consider filter feeding using foraging theory. Foraging theory studies how animals browse patches of food, such as a cluster of flowers for a bee or worms in pine cones for birds [Krebs et al 1974] or a hunting spot for a predator. In particular, foraging theory considers how long the animal should stay at a particular patch before deciding to move on: measuring the give-up time. A filter-feeding proto-vertebrate needs to decide if the current food rate is good enough to stay at the current food zone.
The MVT (marginal value theorem) suggests that an animal should move on when the current patch’s food rate falls below the environment average [Charnov 1976]. MVT has simplifying assumptions that are challenged by the complexity in the world [Pyke 1984], [Wajnberg et al 2006]. MVT assumptions include omniscience, immortality, determinism, no competition, no predation, and no hunger. Some of those complexities are important to the essay, particularly the omniscience. In MVT the animal knows the average environment food value, but this omniscience isn’t plausible for simple animals [Tenhumberg et al 2001], and the essay animal has almost no learning at all. Realistic search is stochastic and can fail, such as a predator hunting, which is particularly important if the animal is starving. Starvation and satiation are also not covered by the MVT. If the animal is starving, it might stick with a non-optimal, low-quality food source below the environment average because not finding a better patch is too risky. Simple organisms use rules of thumb instead of complex strategy, and even birds seem to use a constant give-up time [Krebs et al 1974].
As a side note, the foraging terms for eating (“exploiting”) and searching for a new patch (“exploring”) have been appropriated by RL (reinforcement learning) [Sutton and Barto 2018] with some differences in meaning. Reinforcement learning uses an n-armed bandit (gambling slot machine) model, where exploring means finding the reward rates of the other arms before deciding on the best arm to exploit. The RL focus is on gathering information, generally in a finite and persistent system. In contrast, this essay uses the original foraging terminology.
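To make the contrast between the MVT rule and a rule of thumb concrete, here is a minimal sketch of both. The function names, the tick-based time, and the numbers are my own assumptions for illustration, not part of the simulation or of the cited studies.

```python
def mvt_should_leave(patch_rate: float, environment_rate: float) -> bool:
    """MVT-style rule: leave when the patch rate falls below the environment average.

    Requires the animal to know environment_rate (the omniscience assumption).
    """
    return patch_rate < environment_rate


def give_up_time_should_leave(ticks_since_food: int, give_up_ticks: int) -> bool:
    """Rule of thumb: leave after a fixed time without food.

    Needs no knowledge of the environment, only a local timer.
    """
    return ticks_since_food >= give_up_ticks


def ticks_until_leaving(food_events, give_up_ticks):
    """Apply the rule of thumb to a sequence of 0/1 food events.

    Returns the tick at which the animal leaves, or None if it never leaves.
    """
    ticks_since_food = 0
    for tick, ate in enumerate(food_events):
        ticks_since_food = 0 if ate else ticks_since_food + 1
        if give_up_time_should_leave(ticks_since_food, give_up_ticks):
            return tick
    return None
```

For example, with food events `[1, 0, 1, 0, 0, 0, 1]` and a give-up time of 3 ticks, the timer resets at tick 2 and the animal leaves at tick 5, missing the later food. That is the cost of a constant give-up time, but it only needs a local timer.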
Covered in essay 36, vertebrate food motivation divides into hunger-driven (“homeostatic”) and opportunistic (“hedonic”) foraging. These form two levels of search and involve different circuits with some overlap. When no longer hungry, mice will not eat plain food but will still eat rich food. In terms of foraging theory, hungry mice will stay longer at poor patches, while sated mice will leave more quickly.
Simulation complexity
After starting to implement the simulation, the issue of complexity became overwhelming. Specifically, adding the striatum is too complicated. Consider the issue of distinguishing the eating function of dopamine vs serotonin, when both are responsive to eating food. That similarity makes it difficult to find the system function. The system must have developed from a simpler system because the ascidian feeding or amphioxus feeding is not overly complicated. For the sake of the simulation, I’m backing off and considering only the hindbrain and hypothalamus systems, treating the striatum as a later enhancement.
Hypothalamus and raphe nuclei
The core of the simulation is the pair of H.l and V.dr. As mentioned above, H.l is driven by food zone indicators and can drive both seeking and eating. V.dr is responsive to eating and as part of the hindbrain (it derives from r1) it is a good candidate for primitive, tunicate-like filter feeding circuitry.
Simulation eating model. Ob and H.l form the forebrain food zone system, while V.dr and R.nts form the hindbrain eating system. H.l (lateral hypothalamus), Ob (olfactory bulb), R.nts (nucleus of the solitary tract), V.dr (dorsal raphe).
The diagram above is a simplification, where the Ob to H.l connection represents an ancient version of the food zone system. The V.dr to R.nts (nucleus of the solitary tract) connection includes more hindbrain structures such as medulla eating circuits. The simplification has H.l as a food zone controller and V.dr as an eating sustaining manager.
Although V.dr is a serotonin system, some V.dr neurons are non-serotonin, using glutamate or GABA instead. As mentioned above the V.dr and V.mr (median raphe) serotonin neurons have at least 11-14 distinct neuron types and projection types. For the essay I’m assuming at least one serotonin neuron type is a measure of eating food. In the simulation successful filter feeding increases the serotonin for eating.
Start and sustain
Let’s return to foraging, where the central decision is when to stop exploiting a patch if it’s not effective. Consider a simple model where the animal gives up on a patch if the feeding rate drops below a fixed threshold. Filter feeding naturally has delays between starting filter feeding, trapping some prey, and later receiving nutrients in the gut. This raises a problem: the feeding rate is zero until some food is digested, which implies the animal should give up immediately.
Foraging give-up occurs when the combination of a start signal and sustain signal drops below a threshold.
One solution is to prime the system with a start signal. While the start signal exists, the animal won’t leave even if it hasn’t digested any nutrients. In the simulation H.l is responsible for the start signal and V.dr is responsible for both the sustain and for integrating the two systems. The H.l start signal comes from the food zone detection.
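The start-plus-sustain idea can be sketched as a single threshold test on the sum of the two signals. This is a minimal sketch under my own assumptions: the sustain signal is modeled as a leaky integrator of digested food, and the class name, threshold, decay, and gain are hypothetical values, not the simulation's actual parameters.

```python
from dataclasses import dataclass


@dataclass
class GiveUp:
    threshold: float = 0.2  # hypothetical give-up threshold
    decay: float = 0.9      # hypothetical per-tick decay of the sustain signal
    gain: float = 0.5       # hypothetical sustain gain per unit of digested food
    sustain: float = 0.0    # V.dr-like sustain signal, starts empty

    def step(self, start: float, digested: float) -> bool:
        """Update the sustain signal and return True if the animal stays."""
        # Leaky integration: sustain fades unless digestion keeps feeding it.
        self.sustain = self.decay * self.sustain + self.gain * digested
        # The animal stays while start plus sustain is above the threshold.
        return start + self.sustain >= self.threshold
```

With these numbers, `GiveUp().step(start=1.0, digested=0.0)` returns True: the start signal alone holds the animal in place before any nutrients arrive. With no start signal and no digestion, the same call returns False and the animal gives up.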
However, the start signal raises a new issue because the start signal must stop to allow sustain to act as the primary decision variable. If H.l always sends the food zone signal to V.dr, it will remain active as long as the animal is in the food zone, preventing the animal from leaving the zone. So, H.l itself needs a timeout. The simulation uses a striatum timeout to disable the H.l food zone signal. The striatum connection can either represent the striatum layer between the olfactory and cortical layers and H.l, or it can represent H.l reciprocal input to the striatum.
The start timeout has the same issues as other striatum systems. Specifically, it needs to remain timed out until the animal leaves the food zone.
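The requirement that the timeout stays timed out until the animal leaves can be sketched as a latch. This is a minimal sketch with a hypothetical tick counter standing in for the striatum timeout; the class name and `timeout_ticks` value are assumptions for illustration.

```python
class StartSignal:
    """H.l-like start signal with a timeout that latches until the zone is left."""

    def __init__(self, timeout_ticks: int = 5):
        self.timeout_ticks = timeout_ticks  # hypothetical timeout length
        self.ticks_in_zone = 0
        self.timed_out = False

    def step(self, in_food_zone: bool) -> float:
        if not in_food_zone:
            # Leaving the food zone is the only thing that clears the latch.
            self.ticks_in_zone = 0
            self.timed_out = False
            return 0.0
        if self.timed_out:
            # Still in the same zone: stay timed out, so sustain must carry the decision.
            return 0.0
        self.ticks_in_zone += 1
        if self.ticks_in_zone > self.timeout_ticks:
            self.timed_out = True
            return 0.0
        return 1.0
```

Without the latch, a timer that simply reset after firing would restart the start signal while the animal was still sitting in the same poor food zone.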
Simulation
The screenshot below shows the animal feeding from a low-quality food zone. The grey star is a food zone (grey represents poor food). The nearby purple checkerboard is an avoidance zone, representing an aversive area such as itch or high carbon dioxide.
Simulation of the animal filter feeding at a poor food zone just before giving up.
In the screenshot the startup signal from H.l is temporarily sustaining feeding. It will soon time out and the animal will abandon the food zone.
Avoidance response and search
The simulation adds two other serotonin-based systems: one for avoiding toxic areas and one for search. Avoidance is one of the V.mr functions. The search serotonin represents the V.dr to Vta connection, despite the current essay disabling the seek function. These two functions may not be serotonin functions because V.mr avoidance is largely non-serotonin, and the V.dr to Vta connection is primarily glutamate. Because the avoidance and search are not the primary focus of the essay, I’m putting off the question of accuracy to a later essay.
Discussion
The essay’s big questionable decision is the omission of the striatum, particularly because I’ve already used the striatum for give-up timing. For eating as opposed to seeking, one possible area appears to be S.dl.vl, which is the orobranchial, mouth area [Foster et al 2021]. Because S.dl receives late dopamine from food in the gut, it might be a good candidate for filter feeding sustain.
Map of the striatum. dl (dorsal lateral striatum), dm (dorsal medial striatum), lsh (lateral shell), msh.d (dorsal medial shell), msh.v (ventral medial shell), ot (olfactory tubercle)
A second area is S.msh.d (dorsal medial shell) which responds to hedonic “liking” and drives strong eating [Castro et al 2016], [Richard and Berridge 2011], [Richard et al 2013]. S.msh.d drives H.l, which is central to the essay. In addition S.msh has longer, sustained dopamine (5-10s) contrasted with shorter dopamine in S.dl (100ms) [de Jong et al 2022].
From a motivational perspective, S.dl.vl and S.msh.d are strong candidates, but they lack the lateral inhibition of seek that’s necessary for the state machine to work. S.dl.vl also works through OT.d.l (optic tectum deep motor areas), which would add more complexity to this essay. In contrast the V.dr serotonin is already part of the hindbrain motor areas, and serotonin is already inhibitory toward seek. V.dr requires fewer additional systems to work. For future work, the two striatum areas are strong areas to research.
Foster NN, Barry J, Korobkova L, Garcia L, Gao L, Becerra M, Sherafat Y, Peng B, Li X, Choi JH, Gou L, Zingg B, Azam S, Lo D, Khanjani N, Zhang B, Stanis J, Bowman I, Cotter K, Cao C, Yamashita S, Tugangui A, Li A, Jiang T, Jia X, Feng Z, Aquino S, Mun HS, Zhu M, Santarelli A, Benavidez NL, Song M, Dan G, Fayzullina M, Ustrell S, Boesen T, Johnson DL, Xu H, Bienkowski MS, Yang XW, Gong H, Levine MS, Wickersham I, Luo Q, Hahn JD, Lim BK, Zhang LI, Cepeda C, Hintiryan H, Dong HW. The mouse cortico-basal ganglia-thalamic network. Nature. 2021 Oct;598(7879):188-194.
This essay explores using the hippocampus as a sequence generator [Buzsáki and Tingley 2018] to precisely time the avoidance action after a failed seek. In essay 31, the animal started with a roaming random search, which turned into a directed seek when it smelled a food odor. If the animal failed to find food after a timeout, it would avoid the area. This essay expands on that model by improving the avoidance action. Previously, the avoidance time was modeled on a cellular timeout, which is imprecise. Instead, we can use a hippocampal sequence to time the avoidance.
State diagram for the foraging task. A random walk roaming search switches to a directed seek when the animal smells a food odor. It switches to avoidance if the seek times out.
This foraging search resembles a Lévy walk, which combines an area-restricted Brownian walk with longer movements to avoid repeated searches in an area.
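The transitions in the state diagram can be written down as a small bookkeeping sketch. The caption only gives the roam-to-seek and seek-to-avoid transitions; the return from avoid to roam and the `avoid_done` flag are my assumptions for this sketch. In the animal each "state" is a distributed system, so this is only a summary of the switching logic, not a model of the circuits.

```python
from enum import Enum, auto


class State(Enum):
    ROAM = auto()   # random walk search
    SEEK = auto()   # directed climb toward a food odor
    AVOID = auto()  # move away from an area after a failed seek


def next_state(state: State, odor: bool, seek_timed_out: bool, avoid_done: bool) -> State:
    """Return the next foraging state for one step."""
    if state is State.ROAM and odor:
        return State.SEEK
    if state is State.SEEK and seek_timed_out:
        return State.AVOID
    if state is State.AVOID and avoid_done:
        # Assumption: once avoidance finishes, the animal resumes roaming.
        return State.ROAM
    return state
```

The `avoid_done` input is the part this essay tries to improve: something has to decide how long the avoid state lasts.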
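For comparison, here is a minimal sketch of a textbook-style walk with heavy-tailed step lengths, which produces many short local steps and occasional long relocations. It samples step lengths from a power law by inverse-transform sampling. The function names, the exponent `mu`, and the seeds are assumptions for illustration; the simulation itself gets its long moves from the avoid action, not from this sampler.

```python
import math
import random


def levy_steps(n: int, mu: float = 2.0, min_step: float = 1.0, seed: int = 0):
    """Sample n step lengths with density proportional to length ** -mu (mu > 1)."""
    rng = random.Random(seed)
    exponent = -1.0 / (mu - 1.0)
    # rng.random() is in [0, 1), so (1 - u) is in (0, 1] and never zero.
    return [min_step * (1.0 - rng.random()) ** exponent for _ in range(n)]


def walk(steps, seed: int = 1):
    """Turn step lengths into a 2D path with uniformly random headings."""
    rng = random.Random(seed)
    x = y = 0.0
    path = [(x, y)]
    for length in steps:
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x += length * math.cos(angle)
        y += length * math.sin(angle)
        path.append((x, y))
    return path
```

Calling `walk(levy_steps(1000))` gives a path where most steps stay near `min_step`, with rare long jumps that carry the walker to a fresh area.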
In the previous essay, the seek action timed out using astrocyte-managed adenosine in the striatum, but the avoid timeout wasn’t specified, presumably piggybacking on the astrocyte adenosine. Improving the avoidance time and distance can use a sequence generated by the hippocampus. In mice, this kind of distance measurement and timeout is seen in C.pp (posterior parietal cortex) with neurons tiling the delay period, forming a sequence [Harvey et al 2012], [Rajan et al 2016] and in E.ca1 (hippocampus CA1 area) [Pezzulo et al 2017].
Because this essay remains as primitive as possible, and we haven’t added cortex regions yet, we can only add one simplified cortical area. One option is to consider C.pp as a primitive cortex region that can self-generate the necessary sequences. Another option is to consider E.hc (hippocampus) as the main sequence generator and treat C.pp as a later specialization for sophisticated vertebrates like mammals.
The hippocampus can be seen as a sequence generator [Buzsáki and Tingley 2018]. E.hc (hippocampus) sequences are approximately 7s from first place cell neuron to the last neuron in the sequence [Pezzulo et al 2017], and E.hc delay timing is approximately 8s [Abela et al 2015]. Similarly F.pl (prelimbic prefrontal cortex) sequence neurons tile choice encoding for 7s across an experimental trial. C.pp neurons can tile the distance from a start position in mice [Harvey et al 2012], [Rajan et al 2016]. In mice E.ca1 path integration has a maximum of 2m unless extended by landmarks [Fischler-Ruiz et al 2021].
Additionally, the timeout sequence needs a mechanism to trigger it and to drive the avoidance action. I’ll cover a possible trigger from H.sum (supramammillary) and an output from E.hc either directly to H.l (lateral hypothalamus), H.sum, or Poa (preoptic area), or using S.ls (lateral septum) as an intermediary.
E.hc place cell for delay
[Fischler-Ruiz et al 2021] studied E.ca1 (CA1 region of hippocampus) with mice traveling a virtual maze, where the liquid reward was 4m from the initial starting point. Instead of waiting for the full 4m, mice expected reward at 2m unless an odor landmark near 2m extended the range to 4m. Neurons in E.ca1 tracked the distance traveled, with short-lived fragments of neural activity tiling the delay period. A sparse number of neurons are active for a short period, and as one neuron ends another, new neuron takes its place, like a long thread made of shorter fibers. This thread frays at around 2m when mice lose track of the task.
Diagram of neurons tiling a delay period. After a simultaneous initial spike of many neurons at top left, a sequence of neurons fire and replace each other.
Mouse C.pp neurons in a virtual maze tile the delay period [Kamiński and Rutishauser 2020]. The above diagram shows the neural tiling. Each row is a single neuron, time is on the x axis, and the neurons are sorted by their activity. At top is an initial simultaneous burst of many neurons that respond to some event. This large burst is followed by smaller sets of neurons that persist for a small time. When one neuron stops, another neuron takes its place, until the sequence frays and ends.
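The tiling sequence can be used as a timer: a trigger starts the first unit, each unit hands off to the next, and the timed action ends when the chain runs out. Here is a minimal, idealized sketch. The class name, the number of units, and the dwell time are hypothetical, and real sequences overlap and fray rather than handing off cleanly.

```python
class SequenceTimer:
    """Idealized chain of units that tile a delay period, one active at a time."""

    def __init__(self, n_units: int = 7, dwell_ticks: int = 2):
        self.n_units = n_units          # hypothetical length of the chain
        self.dwell_ticks = dwell_ticks  # hypothetical ticks each unit stays active
        self.tick = None                # None means no sequence is running

    def trigger(self) -> None:
        """Start (or restart) the sequence, like the initial burst."""
        self.tick = 0

    def step(self):
        """Return the index of the active unit, or None once the sequence has ended."""
        if self.tick is None:
            return None
        unit = self.tick // self.dwell_ticks
        if unit >= self.n_units:
            # The chain has run out: the timed action should stop.
            self.tick = None
            return None
        self.tick += 1
        return unit
```

With 3 units and a dwell of 2 ticks, triggering and then stepping 8 times yields units 0, 0, 1, 1, 2, 2 and then None, so the timed period lasts `n_units * dwell_ticks` ticks. An avoid action could be sustained for as long as `step()` returns a unit.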
Roam vs Seek
H.sum is associated with active exploration and is quiet while eating [Kesner et al 2021]. H.sum is activated by food anticipation, food restriction, or ghrelin, a hunger peptide [Le May et al 2019].
In zebrafish, H.l (lateral hypothalamus) and Hc (caudal hypothalamus) are both feeding related but are anti-correlated. Hc activates when the zebrafish is hungry and drives the roaming search for food, but when a specific target (paramecium for zebrafish) is detected, H.l activates and Hc deactivates for specific hunting and for eating [Wee et al 2019].
In zebrafish, Hc is comparable to the mammal posterior hypothalamus, which includes H.sum and H.mb (mammillary body), H.tu (tuberal hypothalamus), and H.arc (arcuate nucleus). Zebrafish Hc includes all these areas [Schredelseker and Driver 2020], which are genetically and functionally distinct. Unfortunately this makes it unclear which area is driving the roaming search. H.arc for example is associated with hunger and anti-correlated with eating, using hunger neuropeptides AgRP and NPY in H.arc [Berrios et al 2021]. H.sum is also activated by food anticipation or hunger [Le May et al 2019].
AgRP drops quickly on food cues, drops more slowly for gut nutrient detection, and drops slowly or permanently for energy balance signals like blood glucose [Berrios et al 2021]. The specific path for seek dropping AgRP is H.l glutamate to H.dm (dorsal medial hypothalamus) GABA to H.arc AgRP. The H.dm effect disappears with sleep time. RTPP (real-time place preference) is 80% if AgRP is inhibited.
In addition, zebrafish H.l is not identical with the mammal H.l. For example, the orexin and MCH neuropeptide neurons in mammals are part of H.l, while they are in distinct non-H.l areas in zebrafish [Schredelseker and Driver 2020].
In other words, while this essay is taking H.sum as embodying the roam as opposed to H.l embodying the seek, the exact correlations of roam and seek are not yet known.
E.hc input and output
If E.hc drives the seek timeout’s avoid action sustain, how does the seek timeout trigger the E.hc sequence timer to start, and how does E.hc sustain the avoidance? E.hc has three major avoidance action paths it could influence: the obstacle avoidance path through OT (optic tectum) from essay 34, the temporal gradient taxis path through Hb.m – R.ip (medial habenula to interpeduncular nucleus) from essay 33, and the related motivational path through H.l (lateral hypothalamus) and Hb.l (lateral habenula).
Action paths for avoidance to the hippocampus. The left panel shows obstacle avoidance from the optic tectum via C.pp. The right panel shows taxis avoidance from the Hb.m to R.ip path. C.pp (posterior parietal cortex), E.hc (hippocampus), E.mec (medial entorhinal cortex), E.por (postrhinal cortex), H.sum (supramammillary), Hb.m (medial habenula), OT (optic tectum), P.ldt (laterodorsal tegmental nucleus), P.ms (medial septum), R.in (nucleus incertus), R.ip (interpeduncular nucleus), V.mr (median raphe)
The above diagram shows possible input paths to E.hc. On the left, OT (optic tectum) sends visual threat and obstacle information to C.pp and E.por (postrhinal / parahippocampal cortex) via T.lp (lateral posterior / pulvinar thalamus), then to E.mec (medial entorhinal cortex), and finally to E.hc. In the right panel, Hb.m-R.ip sends feedback to E.hc through multiple paths, including H.sum, V.mr (median raphe), R.in (nucleus incertus) and through P.ms (medial septum). These two paths differ not only in the information they convey, but also in their circuit effects on E.hc. While the left OT path is a data-driven path that excites glutamate neurons, the right Hb.m-R.ip path is a control path that drives E.hc interneuron control circuitry using GABA, ACh (acetylcholine), and 5HT (serotonin) to modulate the action and timing of E.hc.
For this essay, the control path is more interesting, in part because it’s more closely tied to the seek-roam control, and in part because the control circuitry is more fundamental to E.hc operation. The interneuron circuits for the cortex and the hippocampus are highly conserved for all vertebrates, but at some point in evolution they must have been new. Because this essay is exploring adding the first cortical-like area, it needs to address the interneuron controls and not merely assume their existence.
Motivational avoid feedback to the hippocampus. A motivational avoid signal from Hb.l drives E.hc through V.mr. E.hc (hippocampus), H.sum (supramammillary), Hb.l (lateral habenula), P.ms (medial septum), V.mr (median raphe)
A different, motivational path using V.mr arrives from Hb.l (lateral habenula) instead of the Hb.m – R.ip taxis path. Hb.l is more motivational and Hb.m is more of a physical taxis function. As I’ll be using later in the essay, the full avoid path first travels through the motivational Hb.l, then to E.hc for the avoid time/distance sequence, and finally to Hb.m – R.ip for physical avoidance.
Output paths for timed avoidance. The left panel shows a possible path to OT via C.pp. The right panel shows a path to Poa and H.sum via S.ls. C.pp (posterior parietal cortex), E.ca1 (hippocampus CA1 area), E.sub (hippocampus subiculum), H.l (lateral hypothalamus), H.sum (supramammillary), OT (optic tectum), Poa (preoptic area), S.ls (lateral septum).
The avoidance output path from E.hc could use one of three action paths. E.hc can drive obstacle-like avoidance through C.pp to OT. Secondly it can drive physical gradient taxis avoidance through Hb.m via P.ts (triangular septum) and P.bac (bed nucleus of the anterior commissure) [Yamaguchi et al 2013]. Finally it can drive motivational avoidance to H.l, H.sum, and Poa (preoptic area) through S.ls (lateral septum). The above diagram shows the obstacle avoidance path through C.pp to OT and the motivational avoidance path through H.l, H.sum, and Poa. (Note: Poa to M.pag to MLR is more direct avoidance than motivational.) The Poa path drives movement directly through M.pag (periaqueductal gray) and MLR (midbrain locomotor region), which the diagram omits for brevity.
Another E.hc motivational output uses S.v (ventral striatum / nucleus accumbens) and P.v (ventral pallidum / endopeduncular nucleus) to H.l and Vta (ventral tegmental area) for seeking and to Hb. for avoidance. Although this path is similar to the S.ls motivational path above, in mammals at least it’s a distinct circuit. Unlike the S.ls path, the S.v motivational path does not target H.sum or Poa but does strongly target Hb. Note that S.v and S.ls are similar structures, both derived from the same progenitor domain, LGE (lateral ganglionic eminence). S.ls is also a neighbor of S.v, which is also known as nucleus accumbens septi, Latin for nucleus adjacent to the septum.
Output path from hippocampus to medial habenula via P.bac and P.ts. E.hc (hippocampus), Hb.m (medial habenula), P.bac (bed nucleus of the anterior commissure), P.ts (triangular septum).
Mammals also have a direct E.hc output to Hb.m-R.ip via P.ts (triangular septum) and P.bac (bed nucleus of the anterior commissure, which is sometimes identified with part of P.bst bed nucleus of the stria terminalis) [Proulx et al 2014]. As described in previous essays, Hb.m has a direct function for phototaxis [Chen and Engert 2014], chemotaxis [Beretta et al 2012], thermotaxis, and some social conflict [Agetsuma et al 2010]. This mammal P.ts/P.bac is understudied, or I haven’t found any study describing its function, but by its connection E.hc could drive taxis via the Hb.m-R.ip connection. This connection does not exist in lamprey [Stephenson-Jones et al 2012], but I haven’t read any studies on this connectivity for other non-mammal animals.
Ancestral vertebrate forebrain
Let’s step back to cover the ancestral vertebrate forebrain (cortex and basal ganglia) to understand likely primitive areas, particularly the E.hc equivalent. Ancestral models generally use Pa (pallium) to describe cortical-like areas and divide pallial areas into multiple regions, generally from four to six, depending on the theory. These areas are named by their location, where MPa (medial pallium) is in the middle, DPa (dorsal pallium) is on top, and LPa (lateral pallium) is on the side.
The above diagram shows the quadripartite model of the ancestral vertebrate forebrain [Hegarty et al 2024], [Pessoa et al 2019]. The dorsal areas are cortical-like and the ventral (basal) areas are striatal and pallidal (basal ganglia). The four areas match most vertebrate structure. MPa (medial pallium) is hippocampal, DPa (dorsal pallium) is sensory neocortex, LPa (lateral pallium) is the insula cortex (eating and tasting), E.lec (lateral entorhinal) and F.ofc (orbitofrontal), and VPa (ventral pallium) is the olfactory cortex and amygdala. In the diagram, the notch at the top is deliberate because the neural tube is formed by curling up a neural sheet until the two ends nearly match at the top. In most vertebrates the two ends at MPa curl more until they reach the base, forming two hemispheres, but in teleost fish the two ends curl out, putting MPa on the lateral outside (Dl in teleosts) and VPa in the middle (Dm in teleosts) [Hegarty et al 2024], [Roth and Dicke 2013], [Porter and Mueller 2020].
Although this model is useful, it does minimize differences between vertebrates. For example, the mammalian DPa, the “neocortex,” doesn’t nicely match up with reptiles, where the amygdala-like areas (DVR) take a larger role instead.
Unrolled hippocampus, showing the consistent order among amniotes. C (isocortex/neocortex), DPa (dorsal pallium), E.ca1, E.ca2, E.ca3 (hippocampus CA1, CA2, CA3 areas), E.dg (hippocampus dentate gyrus), E.ec (entorhinal cortex), E.sub (hippocampus subiculum).
In amniotes (lizards, birds, and mammals) E.hc has a similar structure, including E.dg (dentate gyrus), E.ca1, E.ca3, E.sub (subiculum), E.lec (lateral entorhinal cortex), and E.mec (medial entorhinal cortex) [Medina et al 2017]. As mentioned previously, although the fish Dl is likely an E.hc-like structure, its internals differ from the amniotes. If the fish Dl was a subset of the mammal E.hc, it would be useful to know which areas are more primitive.
The lamprey is interesting because it may not have an E.hc / MPa equivalent at all. A recent genetic cell analysis [Lamanna et al 2023], [Hervas-Sotomayor 2023] suggests that MPa is not ancestral because the medial pallial area in lampreys matches H.em (prethalamic eminence) instead, which is more closely associated with the hypothalamus and habenula, not the forebrain and specifically not matching hippocampal markers. In addition, the lamprey extended amygdala is distinct, well-defined, and separate from the lateral pallium, suggesting it may be more helpful to treat the extended amygdala as a distinct area instead of as part of the pallium. In the case of mammals, that distinction may not apply to A.bl (basolateral amygdala), which resembles the cortical DPa more than other parts of the amygdala [Moreno and González 2007]. [Lamanna et al 2023] suggest that the quadripartite model may only apply to later vertebrates, with lampreys having an undifferentiated pallium, an amygdala, and an H.em with unknown purpose. These genetic results reinforce earlier suggestions that the lamprey “MPa” is actually an expanded H.em.
For this study, which only uses the hippocampus / MPa, the lack of a lamprey MPa suggests that the sequential hippocampal model is a later vertebrate development. Even if accurate, it would be unknown if it developed specifically with jawed vertebrates (gnathostomes) or if it was an innovation of one of the many preceding jawless vertebrates.
C.pp navigation delay
In mice, C.pp (posterior parietal cortex) tracks time and distance from a start position with neurons that tile the delay period, as shown in the earlier neuron firing diagram. Each neuron is only active for a fragment of the task, and as one neuron leaves the ensemble, another starts [Harvey et al 2012]. In their task, C.pp was required for memory-related success but was not needed for cue seeking.
C.pp is “vision for action” [Goodale and Milner 1992]. C.pp neurons are tuned to self motion and acceleration [Whitlock et al 2012]. In this context, the timing we need is motion timing, so self motion is critical. Some C.pp neurons fire up to 500ms before action and drop immediately on action.
Although these locomotion delays match what the essay needs, C.pp has a number of problems for this purpose. First, C.pp is part of the neocortex, which is specific to mammals. Birds, reptiles, and fish have a cortex (called pallium), but the area around C.pp is organized differently with a larger amygdala area (DVR) than cortical (dorsal pallium) [Aboitiz et al 2003]. The mammalian E.hc has also been described as an odor-motor region with 1D maps [Aboitiz and Montiel 2015].
Second, the most direct subcortical C.pp output is to OT (optic tectum), but otherwise C.pp requires C.mo (motor cortex) for any action. But in amphibians, the OT does not receive input from a cortical equivalent, but connects directly with the striatum [Pessoa et al 2019]. In the context of the essay, the OT doesn’t seem related to the place avoidance needed.
Third, the C.pp inputs are primarily indirect through other cortical areas, although using T.lp (lateral posterior thalamus) from OT. So, if C.pp is a primitive cortical area, it seems much more likely to be an OT-focused area, not a foraging area.
As a counterpoint, the electric fish Dc also has direct output to OT, which suggests a C.pp-like role. Dc is driven by the Dl (lateral cortex, possible E.dg) through DD (dorsal cortex, as possible E.ca3) [Fotowat et al 2019]. That study suggested the entire Pa (cortex) area is silent unless actively sensing, and sequences were not studied. Other research suggests that only Dl.v (ventral Dl) is equivalent to E.hc, particularly to E.sub [Hegarty et al 2024], [Rodríguez-Expósito et al 2017], which makes the comparison more tenuous, unless DD is something like C.rs (retrosplenial cortex), which in mammals is between E.hc and C.pp.
Similarly, the lamprey LPa (lateral pallium / cortex) has a direct OT projection [Suryanarayana et al 2022], giving it a possible C.pp-like role, but the lamprey does not appear to have a hippocampus equivalent [Lamanna et al 2023]. It does have an expanded H.em (prethalamic eminence) which is physically located where the MPa (medial pallium / hippocampus) would be. Unfortunately, no current studies have covered H.em function in the lamprey, but it seems likely to be significantly different from the hippocampus.
In favor of C.pp as relevant here, a study of electric fish shows a path from Dl (E.dg – dentate gyrus equivalent) to Ddi (E.ca3 equivalent) to Dc (C.rs.5/C.pp.5 equivalent) to OT [Fotowat et al 2019]. However, in that study the entire Pa is silent except when swimming or active sensing (electro-sensing in this case), and it’s unclear if there are any sequences as in the mouse C.pp.
A different description of Dl points out that it primarily projects to Poa and Hc (caudal hypothalamus) [Northcutt 2006]. The significance of this set of efferents is that Dl projects to the fish Hc, an area that includes H.arc, H.mb, H.sum, H.pm, and H.tu, which are developmentally conserved between fish and mammals [Wullimann 2022]. Because these areas have different functions, it would be useful to know exactly which specific areas Dl projects to. In mammals, Poa and H.sum are interconnected and are both associated with exploration and roaming [Escobedo et al 2023], [Ryoo et al 2021].
H.sum
As mentioned previously, H.sum is part of the basal hypothalamus and is highly conserved, existing in sharks [Santos-Durán et al 2022], teleost fish [Wullimann 2022], amphibians and reptiles [Domínguez et al 2016], birds [Kim DW et al 2022], and mammals [Bedont et al 2015], [Croizier et al 2015], [Ferran et al 2015]. I haven’t found an equivalent lamprey hypothalamus study, which would be especially interesting because of the lamprey’s lack of MPa / E.hc equivalent. If H.sum does exist in lampreys, its function and connectivity might show a simpler, more primitive function before adding MPa / E.hc capabilities.
H.sum has several distinct circuits, some with overlapping behavior. H.sum tac1 (Substance-P marker) is highly associated with voluntary locomotion, but not with E.hc theta [Farrell et al 2021]. H.sum is associated with active exploration and produces RTPP (real-time place preference) but is quiet while eating [Kesner et al 2021]. H.sum to Poa (preoptic area) projections are associated with avoidance of threats and show a strong RTPA (real-time place avoidance) but not anxiety or CPA (conditioned place avoidance) [Escobedo et al 2023]. The H.sum to Poa connection has collaterals to E.ca2 but not to Hb.m or E.dg (hippocampus dentate gyrus). H.sum stimulation inhibits eating and its inhibition is required for eating. The projections to E.dg are related to object novelty and open field exploration [Pan et al 2004], [Chen S et al 2020], while the projection to E.ca2 is related to social novelty and temporal memory [Chen Z et al 2022], [Thirtamara et al 2024]. H.sum is also part of the food reward circuit and is reinforcing [Ikemoto 2010].
E.hc – hippocampus as sequence generator
One model of E.hc (hippocampus) is as a general sequence generator [Buzsáki and Tingley 2018]. The idea notes that E.hc is blind to its inputs, whether olfactory, visual, or vestibular, but what is consistent is its ability to tile gaps between events, and its internal timing structures such as fitting gamma (40-100Hz) bursts inside theta cycles (~8Hz), for example a consistent seven-ish gamma cycle bursts within a single theta cycle.
E.ca2 – hippocampal CA2
E.hc is a complicated structure with many parts, but must have evolved from an initial core area. Studies with other vertebrates show comparable areas to the hippocampus E.dg (dentate gyrus), E.ca1 (CA1 area), E.ca3 (CA3 area), and E.sub (subiculum area).
Unrolled hippocampus
For this essay, I’m considering E.ca2 as the most primitive because it’s strongly tied to sequence generation [Bhasin and Nair 2022], [He et al 2021], [Lehr et al 2021], [MacDonald and Tonegawa 2021], [Stöber et al 2020] and is reciprocally connected with H.sum, which I’m already using for its connection with RTPA and exploration. E.ca2 is reciprocally connected with H.sum in the early embryonic state, earlier than E.dg and E.ca3 connections [Diethorn and Gould 2023].
In mice, E.ca2 sustains internal sequence memory during delay [Lehr and Stöber 2021]. When E.ca2 is disabled, sequences are destabilized in E.ca1 [Lehr et al 2021]. Interestingly, during sleep E.ca2 appears to fire consistently to remember the animal’s current location [Kay et al 2016].
E.ca2 is distinguished from E.ca3 by its lack of E.dg input [Insausti et al 2023], making it an interesting candidate for an early area because it has fewer dependencies and requirements.
Neural models for sequences and persistence
Neural models for sequences are much less studied than models for persistence and memory. I’m using sequences for a delay period and delay periods are well studied. In behaviorist experiments, “trace conditioning” has a short delay between the cue to be remembered and the food reward or shock punishment.
Before approaching sequences, it seems best to consider persistence, which is better studied, such as persistent working memory. [Zylberberg and Strowbridge 2017] reviews multiple models of persistence. The main difference I’ll highlight is between models focused on intrinsic cellular responses such as bursting, in contrast with models that focus on recurrent connectivity, such as Hopfield networks [Hopfield 1982]. Experimentally, the evidence between cell-autonomous vs recurrent network is an open question [Kamiński and Rutishauser 2020].
Persistent bursts are 3-6 spikes [Zylberberg and Strowbridge 2017]. Epilepsy is related to these bursts and may be a failure in inhibitory circuits to contain expansion of these bursts. These bursts are enabled by metabotropic receptors such as mACh (acetylcholine metabotropic receptor). The hindbrain includes bursting neurons, such as VOR (vestibuloocular reflex), long plateau potentials in spinal motoneurons with Ca2+ (calcium) channels and with K+ (potassium) channels. These bursting mechanisms can rebound after inhibition.
Recurrent models of persistence
Recurrent models of persistence use feedback, recurrent connections of neurons back to the same area. If properly calibrated, these loops produce attractors that can remember data [Hopfield 1982]. A difficulty in recurrent models is the issue of fast neural circuits (10-20ms) extending to seconds, and how to maintain stability when 100-fold timescales are required [Zylberberg and Strowbridge 2017]. One common response is to use longer time circuits such as NMDA receptors with 100ms time constants. The longer time constants reduce the stability issues. For example, one of the models for recurrent storage uses NMDA in a model of the VOR (vestibulo-ocular reflex) [Seung 1996], where the eye target is maintained in a line attractor.
Criticisms of recurrence and attractors note the requirement for fine tuning [Zylberberg and Strowbridge 2017]. Continuous attractors require fine tuning. For example, line attractors require precise recurrence, while noise and modulation affect the attractors. In addition, the timescale of excitation vs inhibition is important to the stability, and neurons tend to switch abruptly between discrete states as opposed to smooth, continuous variation. To reduce the issue, finely spaced discrete attractors are more tolerant than continuous attractors.
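The fine-tuning problem can be illustrated with a single linear rate unit that excites itself. This is only a sketch and is not taken from the cited papers: the equation tau * dx/dt = -x + w * x, the 2000ms target, and the function names are assumptions chosen for illustration. For this unit the memory decays with an effective time constant of tau / (1 - w), so a fast synapse needs a feedback weight w much closer to 1 than a slower, NMDA-like synapse.

```python
def effective_time_constant(tau_ms, w):
    """Decay time constant of tau * dx/dt = -x + w * x, valid for 0 <= w < 1."""
    return tau_ms / (1.0 - w)


def required_weight(tau_ms, target_ms):
    """Feedback weight that stretches tau_ms out to target_ms."""
    return 1.0 - tau_ms / target_ms


# Hypothetical target: hold activity for about 2 seconds.
for tau_ms in (10.0, 100.0, 500.0):
    w = required_weight(tau_ms, target_ms=2000.0)
    # If w drops by (1 - w), the memory time halves, so (1 - w) is a rough
    # measure of how much mistuning the loop tolerates.
    print(f"tau={tau_ms:5.0f}ms  w={w:.3f}  tolerance~{1.0 - w:.3f}")
```

In this toy case a 10ms synapse needs w tuned to within about 0.005, while a 500ms intrinsic time constant tolerates an error of about 0.25, which is the sense in which longer time constants relax the calibration requirement.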
Continuous attractors use smooth, rate-encoded neuron values, but firing rate models are difficult to match to actual neuron behavior [Compte et al 2000]. Spontaneous activity at 3.5Hz does not trigger the attractor. [Lundqvist et al 2016] notes that individual neurons bridging a multi-second memory delay are rare, but a simple attractor model would have multiple neurons bridging the entire gap. [Cui and Strowbridge 2018] also note there is little experimental support for recurrent synaptic reverberation.
Intrinsic neural persistence
Intrinsic neural persistence models focus more on cellular mechanisms for persistence, where an individual neuron fires for longer times, such as 2s, without requiring external prompting like recurrence. Since the Lundqvist criticism [Lundqvist et al 2016] also applies here, that single neurons do not cover the entire gap, intrinsic cellular persistence alone is insufficient. However, extending the time constant from the 100ms of NMDA to 500ms or even multiple seconds reduces the need for precise calibration.
Persistent firing exists in C (cortex) and Ob (olfactory bulb) using cellular mechanisms, initiated by a burst of AP (action potentials) [Cui and Strowbridge 2018]. Many cellular persistent firings are enabled by m1.q (acetylcholine metabotropic receptor tied to G-q/11 proteins), and additionally require cellular Ca2+ (calcium ions) for bursting, triggered by the input bursts. ACh enhances excitability by reducing ERG (ether-a-go-go) K+ (potassium) currents. The ERG K+ currents typically suppress bursting and persistence. By suppressing ERG K+, the ACh to m1.q path disinhibits bursting and persistence. ERG is highly expressed in deep C (layer 5/6 cortex), Snc DA (substantia nigra compacta dopamine), hindbrain, and E.ca1. ERG blockers abolish persistent firing in C.tea (temporal association area) and F.pfc (prefrontal cortex).
In the Cui and Strowbridge study [Cui and Strowbridge 2018], ACh enables persistence for at least 5s at 4.3Hz to 6.4Hz. With normal, non-ACh ERG behavior, input bursts produce a short AP burst followed by suppressed output. When ACh blocks ERG currents, input bursts produce an AP burst followed by persistent firing that requires Ca2+. In short, ACh enables persistent firing in the range of 500ms to 2s for a single neuron without needing external recurrence.
Other studies show similar modulated persistence in other areas. In E.sub, ACh stimulation of m1.q enables extended plateau potentials [Kawasaki et al 1999], allowing sustained tonic firing. In layer 5 cortex, when ACh enables persistent firing, even the smallest apical depolarization produces L5 repetitive bursts [Schwindt and Crill 1999]. The plateau potential and prolonged response last over 400ms. Similar effects for ACh and m1.q promote sustained persistence in CB (cerebellum) and O.mt (olfactory bulb mitral cell) [Bauer and Schwarz 2018].
Tiling and threading
Combining longer persistent intrinsic cellular activity with recurrence provides many options for persistence and sequence generation. As mentioned above, mouse C.pp neurons in a virtual maze tile a delay period [Kamiński and Rutishauser 2020]. Each neuron has an intrinsic persistence between 500ms and 1500ms, and only a sparse number like 10% are active at a time [Pastalkova et al 2008]. As one neuron ends its extended 500ms activity, another neuron takes its place. When the animal acts, the sequence is extinguished [Zylberberg and Strowbridge 2017].
One component of the sequence control is the E/I (excitation / inhibition) balance, managed by cortical interneurons. The E / I balance prevents runaway excitation, as in epilepsy, and instead restricts activity to a small subset of neurons. This balance is often oscillatory and combined with winner-take-all systems, where the first neurons to fire suppress slower neurons. In the case of sequences, when one neuron stops firing when its internal persistence ends, the E / I balance shifts towards allowing a new neuron to take its place.
A synfire chain, where each neuron triggers the next neuron in the sequence.
An older model of sequences is the synfire chain, where each neuron triggers the next neuron in the sequence. An argument against the strict synfire chain is its developmental implausibility, because it requires precise connectivity.
Using excitation / inhibition balance with semi-random connectivity to allow dynamic sequences. The blue neuron is currently active, and the faded blue neurons are possible next steps in the sequence.
Instead of a precise synfire chain or the precise dynamics required for recurrent systems like [Hopfield 1982], [Rajan et al 2016] proposes a developmental model for sequences. Consider a set of neurons randomly connected, and modify only some of the connections to produce sequences. Neuron development includes calibration and pruning during early development, including intrinsic firing that would allow for sequence selection. Once the network is formed, sequences propagate by cooperation between recurrence and the input.
Adding sequence projection to a readout layer for sequence output
Because sequences are dynamic, their output may require a projection layer to a more meaningful subspace [Kamiński and Rutishauser 2020]. For example, this essay only needs a single active signal to continually drive the avoidance until the sequence ends.
Illustration of a sequence managed by the E / I balance of interneurons and read-out by a projection layer.
The diagram above shows how these ideas fit together. The pyramidal neurons in the middle form the backbone of the sequence. Each neuron has an intrinsic persistent firing for ~500ms. The PV (parvalbumin) interneurons prevent more than one neuron firing. Because the current neuron is already firing, it blocks the next neuron in the chain until the first neuron completes. The sequence is decoded by a separate output layer, such as S.ls (lateral septum) or E.sub (subiculum) before sending to the destination.
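The tiling idea in the diagram can be sketched as a toy time-stepped loop. This is an illustration under stated assumptions, not the essay's simulation code: the function name, the 100ms step, the "primed" flag standing in for chain excitation, and the single-winner rule standing in for PV inhibition are all hypothetical.

```python
def simulate_tiling(n_neurons=4, persist_ms=500, dt_ms=100, steps=30):
    """Return (time_ms, firing_indices, readout) for each time step."""
    timers = [0] * n_neurons       # remaining intrinsic persistent firing
    primed = [False] * n_neurons   # excited by the predecessor, waiting to fire
    timers[0] = persist_ms         # assumed external trigger starts neuron 0
    trace = []
    for step in range(steps):
        firing = [i for i, t in enumerate(timers) if t > 0]
        # Readout layer: one "active" signal while any backbone neuron fires.
        trace.append((step * dt_ms, firing, int(bool(firing))))
        # Chain excitation primes the next pyramidal neuron in the backbone.
        for i in firing:
            if i + 1 < n_neurons:
                primed[i + 1] = True
        timers = [max(0, t - dt_ms) for t in timers]
        # Stand-in for PV inhibition: a primed neuron starts only when the
        # currently firing neuron has finished its persistent period.
        if not any(timers):
            for i in range(n_neurons):
                if primed[i]:
                    timers[i] = persist_ms
                    primed[i] = False
                    break
    return trace


for time_ms, firing, readout in simulate_tiling():
    print(time_ms, firing, readout)
```

With these assumed values the four neurons tile 2000ms, and the readout stays high for exactly that period before dropping to zero, which is the single sustained signal the avoidance path needs.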
Simulation choices
Taking the above research into consideration, the simulation needs to make some choices of a plausible action path. Because this essay builds on the previous essay 31, which used S.v to Hb.l as a seek timeout, it makes sense to continue that path from Hb.l to P.ms and E.hc via V.mr and H.sum.
Action path for the avoid timeout used by the simulation. E.hc (hippocampus), H.sum (supramammillary), Hb.l (lateral habenula), P.ms (medial septum), Poa (preoptic area), S.ls (lateral septum), S.v (ventral striatum / nucleus accumbens), V.mr (median raphe)
The output side has two tempting directions: E.hc to Poa and H.sum via S.ls, or E.hc to Hb.m-R.ip via P.ts and P.bac. The E.hc to Hb.m-R.ip path has the advantage of exploring differences between Hb.l and Hb.m. One model is Hb.l as the motivational avoid and Hb.m as the physical taxis avoid, where evolution secondarily adds motivational Hb.l avoid as a more sophisticated indirection to the older Hb.m direct avoid taxis. The other option using S.ls to Poa and H.sum has the advantage of possibly being more primitive, because the P.ts / P.bac to Hb.m path may be restricted to mammals and therefore newer. The S.ls path also has the advantage of introducing the strong E.hc to hypothalamus connection via S.ls.
Sequence design
Because the simulation is a toy model where understandability is important, the simulation doesn’t directly implement individual neurons. One advantage is that neuron firing is generally sparse. For example, only about 10% of O.pir (olfactory cortex) neurons fire for any particular odor. Taking an idea from high-dimensional computing [Laiho et al 2015], an engram (neural assembly) can be represented by a vector of digits. Each digit is essentially one sparse neuron selected from a larger set. Base-16 is one neuron selected from 16. This digital representation can be displayed as a normal hex number such as 78af. For display purposes, base-64 is the largest feasible base, because its numerical digits plus lower and upper letters produce 62 with two special digits. Zero is reserved to represent no neurons firing in the digit. So 4305 represents 3 neurons firing, one from each of three active digits, with the third digit silent.
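As a rough illustration of the digit representation, the following sketch parses a base-16 engram string and counts its active digits. The function names are hypothetical and the sketch only covers the hex case described above; it is not the simulation's actual code.

```python
HEX_DIGITS = "0123456789abcdef"


def parse_engram(text):
    """Convert a hex string such as '4305' into a list of digit values."""
    return [HEX_DIGITS.index(ch) for ch in text.lower()]


def active_count(digits):
    """Count digits with a firing neuron; zero means the digit is silent."""
    return sum(1 for d in digits if d != 0)


def format_engram(digits):
    """Display a digit vector as a hex string."""
    return "".join(HEX_DIGITS[d] for d in digits)


engram = parse_engram("4305")
print(engram, active_count(engram), format_engram(engram))
```

Each non-zero value names the one firing neuron within its digit's group, so the string doubles as a compact picture of which sparse neurons are active.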
In natural sequences, after one neuron falls out, a new random neuron takes its place. In simulation, this randomness adds complexity because to avoid repetition it would need to keep track of earlier values. For neurons this repetition avoidance is built-in because neurons can have a refractory period where they’re less responsive, giving fresh neurons a better chance to be chosen. Instead, the simulation reserves the lowest bits of each digit for sequences, leaving the highest bits for the random engram. The sequence also updates each digit sequentially instead of randomly. Zero represents the end of the sequence when all neurons stop firing. A sequence with a single base-16 digit might look like “8, 9, a, b, 0” or “4, 5, 6, 7, 0” if two bits are sequence bits, and a sequence with two digits might look like “84, 85, 95, 96, a6, a7, ....“
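The sequence stepping described above can be sketched as follows. The function names are hypothetical, and two details are assumptions made to fill gaps in the description: the last digit updates first (to match "84, 85, 95"), and a digit goes silent once its sequence bits are used up.

```python
def step_digit(digit, seq_bits=2):
    """Advance one digit; return 0 (silent) once its sequence bits run out."""
    mask = (1 << seq_bits) - 1
    if digit == 0 or (digit & mask) == mask:
        return 0
    return digit + 1


def run_sequence(start, seq_bits=2):
    """List every state from `start` until all digits are silent."""
    digits = list(start)
    n = len(digits)
    states = ["".join(format(d, "x") for d in digits)]
    i = n - 1  # assumption: the last digit updates first
    while any(digits):
        while digits[i] == 0:  # skip digits that are already silent
            i = (i - 1) % n
        digits[i] = step_digit(digits[i], seq_bits)
        states.append("".join(format(d, "x") for d in digits))
        i = (i - 1) % n  # update the digits in turn, not randomly
    return states


print(", ".join(run_sequence([0x8])))       # 8, 9, a, b, 0
print(", ".join(run_sequence([0x4])))       # 4, 5, 6, 7, 0
print(", ".join(run_sequence([0x8, 0x4])))  # starts 84, 85, 95, 96, a6, a7
```

The high bits of each digit never change while the digit is active, so the random engram identity is preserved for the whole sequence and only the low bits act as the counter.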
Simulation results
The following diagram shows a sample trajectory where the animal first seeks to the center of the odor plume and spins at the center until the S.v striatum adenosine times out as in essay 31. The timeout triggers a motivational avoid, representing V.mr and H.sum via Hb.l. The motivational avoid starts the E.hc sequence via P.ms, which activates the physical avoid representing either S.ls to Poa or Hb.m to R.ip depending on interpretation. When the sequence ends, E.hc stops driving the physical avoid, and the animal resumes its roaming search.
Simulation of the sequence avoid. The animal has just timed out of a failed search and is currently moving forward, avoiding the previous area.
Discussion
This essay raises a number of issues, mainly because the situation is understudied, and because it involves many regions that are generally studied independently. As a reminder, this essay is at best a thought experiment.
The lamprey’s lack of an MPa / E.hc (medial pallium / hippocampus) is the first interesting issue. H.em (prethalamic eminence) in the expected MPa location is also interesting. What does H.em do? In mammals it’s a progenitor area that disappears, and it produces neurons for P.epn (entopeduncular nucleus) and Cajal-Retzius neurons that populate neocortical layer 1 and organize the neocortical layers. But in lampreys the H.em functionality may be entirely different. Is its function in any way similar to a primitive E.hc? Does the lamprey lack all internally generated sequences?
In addition, what happens in lampreys to all the areas connected with MPa / E.hc? Do H.sum and H.mb (mammillary body) exist in lampreys, and if they exist, what do they do? Similar questions apply to V.mr, R.nin (nucleus incertus), S.ls, and P.ms. If E.hc is missing in lampreys, studying those areas might give more information about their function.
More zebrafish studies about the Dl / E.hc would also be very interesting, because it seems to have a different structure from the common amniote E.hc. Consider the hunger study with the reciprocal activation of Hc (caudal hypothalamus) and H.l (lateral hypothalamus) when the animal detects a food cue [Wee et al 2019]. Because Hc is composed of H.sum, H.mb, H.tu and H.arc, which have distinct functions, learning exactly which areas are activated by hunger and suppressed during seek would be interesting. H.sum and H.mb are also understudied in zebrafish. Do they have similar connectivity to Dl (MPa / E.hc) and function as in mammals? Does Dl have similar sequential functionality? It does have SWR (sharp wave ripples) [Blanco et al 2024], which suggests zebrafish has Dl sequences, but does it have the same threading / tiling of delays?
Although this essay has used E.ca2 as the center of E.hc sequences, E.ca2 has only been identified in mammals. Genetic transcription comparisons with other animals suggest E.ca3 is better conserved. If E.ca2 is a mammalian innovation, how does that affect how E.hc works in non-mammals?
Chen S, He L, Huang AJY, Boehringer R, Robert V, Wintzer ME, Polygalov D, Weitemier AZ, Tao Y, Gu M, Middleton SJ, Namiki K, Hama H, Therreau L, Chevaleyre V, Hioki H, Miyawaki A, Piskorowski RA, McHugh TJ. A hypothalamic novelty signal modulates hippocampal memory. Nature. 2020 Oct;586(7828):270-274.
Farrell JS, Lovett-Barron M, Klein PM, Sparks FT, Gschwind T, Ortiz AL, Ahanonu B, Bradbury S, Terada S, Oijala M, Hwaun E, Dudok B, Szabo G, Schnitzer MJ, Deisseroth K, Losonczy A, Soltesz I. Supramammillary regulation of locomotion and hippocampal activity. Science. 2021 Dec 17;374(6574):1492-1496.
Roth, G., Dicke, U. (2013). Evolution of Nervous Systems and Brains. In: Galizia, C., Lledo, PM. (eds) Neurosciences – From Molecule to Behavior: a university textbook. Springer Spektrum, Berlin, Heidelberg.
The ascidian circuit in essay 30 had an interesting dopamine subcircuit that looks like an indirect search, where the ascidian coronet cells modulate the underlying phototaxis and geotaxis circuits. While the function of the coronet cells is unknown, if these cells are another seeking system like following an odor, then the coronet sub circuit follows odor by modulating different seek circuits: phototaxis and geotaxis.
Ascidian analogy
Tunicates are the closest non-vertebrate chordates evolutionarily, but they have developed in vastly different directions from the vertebrates, and likely very differently from the shared common ancestor [Holland 2015]. The ascidian tunicates, which are the most studied tunicates, live their adult life as sessile filter feeders like sponges. Their eggs hatch in only 20 hours and their brief tadpole form lasts only for a few hours, just enough to swim and disperse to find a likely permanent settlement place. Their locomotive strategy is to swim up using geotaxis in the morning and swim down using phototaxis in the afternoon. If they’re lucky enough to find a ledge, they swim up into the ledge’s shadow to settle because hanging like a bat from a ledge offers more protection from some predators than resting on the ocean floor [Zega et al 2006].
As would be expected from a 20-hour brain, the navigation circuit is fairly simple. There are two distinct action paths, one for geotaxis using a heavy pigment cell and one for phototaxis using photoreceptors and another pigment cell as a shadow to provide photo-directionality. The two action paths are connected, where dimming produces upward swimming [Bostwick et al 2020].
Ascidian tadpole sub circuit for geotaxis and phototaxis. The horizontal neurons are the main action paths. The coronet DA cells modulate the action paths.
In the above diagram, the geotaxis action path starts from the otolith (“ear stone”) receptor ant2, which is functionally similar to the vestibular system (but not related), passes input to antenna relay neurons (antRN) and then to the right side motor neurons (mgIN-R and MN-r) [Ryan et al 2016]. Similarly, the phototaxis action path starts from the ocellus (eyespot) to the phototaxis relay (prRN) and to the left motor neurons, providing an opposing direction from geotaxis. Importantly for the following discussion, each path has a weak connection to the opposite direction, possibly to add some stochasticity to the movement to improve dispersion of the many tadpoles.
The function of the coronet cells is unknown, although they have some genetic connection to the palp sensory cells [Cao et al 2019]. Other papers compare the coronet cells to dopamine cells in the hypothalamus and Ob (olfactory bulb) [Horie et al 2018] or ancestral photo-hypothalamus and retina [Sharma et al 2019], possibly related to the fish saccus vasculosus area of the hypothalamus, responsible for some circadian behavior. However, the ascidian tadpole has lost circadian clock genes, which argues against circadian timing [Chung et al 2023]. The coronet cells can accumulate serotonin and the DA might promote onset of metamorphosis [Razy-Kraika et al 2012]. So, the coronet may be involved in triggering metamorphic changes at twilight, which causes the tadpole to dive to deeper waters [Lemaire et al 2021].
Whatever the source, the interesting thing about the circuit is that it’s an indirect modulation of underlying taxis action paths. The action of the coronet is gating or modulatory. While this coronet circuit is not homologous to the basal ganglia, using it as an analogy may be useful. For example, dopamine is a sleep / wake signal for the basal ganglia [Vetrivelan et al 2010]. Because low dopamine reduces basal ganglia activity both at the striatum input layer and the Snr (substantia nigra pars reticulata) output layer, it’s an effective sleep controller.
Indirect chemotaxis
Consider indirect chemotaxis, where the animal is seeking toward the odor, but the underlying action path is phototaxis or geotaxis, like the ascidian circuit above. If the animal detects an odor, it strengthens movement in the current direction. In other words, the current direction is toward or near a food odor. This strategy is like the E. coli tumble-and-run strategy, where the bacterium runs further when the odor gradient is increasing.
Consider the basal ganglia as an analogy. For example, Ob has some dopamine interneurons (Ob.sac – short axon cells) that project to S.ot (olfactory tubercle) [Burton 2017], a portion of the striatum focused on olfactory input. For the corollary of the phototaxis path, consider the Hb.m (medial habenula) phototaxis path [Zhang et al 2017].
Hypothetical indirect seek circuit where chemotaxis uses an underlying phototaxis to hunt for food. Hb (habenula), Ob (olfactory bulb), P (pallidum), R.ip (interpeduncular nucleus), R.rs (reticulospinal motor neurons), S (striatum), V.mr (median raphe).
When the odor is detected, Ob enables the basal ganglia, which enhances the phototaxis path. If the odor isn’t detected, the default semi-suppressed behavior means the direction is semi-random. This indirect control would allow for seeking odor when the underlying navigation is phototaxis and geotaxis.
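The gating idea can be written down as a small sketch, with the caveat that the essay does not implement this circuit. Every name and number here is a hypothetical placeholder: `base_gain` stands for the semi-suppressed default, `odor_gain` for the basal ganglia fully enabling the phototaxis path, and the noise term for the semi-random direction.

```python
import random


def gate_gain(odor_detected, base_gain=0.3, odor_gain=1.0):
    """Gain applied to the phototaxis path; odor raises it (assumed values)."""
    return odor_gain if odor_detected else base_gain


def turn_command(phototaxis_turn, odor_detected, rng, noise_scale=1.0):
    """Blend the taxis turn with random turning according to the gate."""
    gain = gate_gain(odor_detected)
    noise = rng.uniform(-noise_scale, noise_scale)
    # Odor does not supply a direction; it only decides how strongly the
    # existing phototaxis signal wins over random wandering.
    return gain * phototaxis_turn + (1.0 - gain) * noise


rng = random.Random(0)
print(turn_command(0.5, True, rng))   # follows the phototaxis turn
print(turn_command(0.5, False, rng))  # mostly random turning
```

The point of the sketch is that the odor input never appears in the turn direction itself, which is what makes the chemotaxis indirect.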
Discussion
After writing this description, I think this model may be a bit sketchy for something like chemotaxis, although it’s a reasonable model for sleep. Because I’m not sure the idea is likely to be productive, I’m holding off on doing any implementation, but writing down the description in case it makes sense later.
Let’s return to the task of essay 16 on give-up time in foraging, which covered food search with a timeout. At first the animal uses a general roaming search and if it smells a food odor, it switches to a targeted seek following the odor with chemotaxis. If the animal finds food in the odor plume, it eats the food, but if it doesn’t find food, it will eventually give up and avoid the local area before returning to the roaming search.
Search state machine. Roam is the starting state, switching to seek when it detects odor, and switching to avoid after a timeout.
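The state machine can be sketched in a few lines. This is a minimal illustration rather than the simulation's code: the class name, the tick-based timers, and the default durations are assumptions, and returning to roam after finding food is a simplification because eating is outside the three states shown.

```python
from enum import Enum


class State(Enum):
    ROAM = "roam"
    SEEK = "seek"
    AVOID = "avoid"


class Forager:
    def __init__(self, seek_timeout=3, avoid_duration=2):
        self.state = State.ROAM
        self.ticks = 0
        self.seek_timeout = seek_timeout      # hypothetical give-up time
        self.avoid_duration = avoid_duration  # hypothetical avoid period

    def _enter(self, state):
        self.state = state
        self.ticks = 0

    def step(self, odor, food=False):
        if self.state is State.ROAM:
            if odor:
                self._enter(State.SEEK)
        elif self.state is State.SEEK:
            self.ticks += 1
            if food:
                self._enter(State.ROAM)   # assumption: eating ends the search
            elif self.ticks >= self.seek_timeout:
                self._enter(State.AVOID)  # give up on this odor plume
        else:  # State.AVOID ignores odor until the avoid period ends
            self.ticks += 1
            if self.ticks >= self.avoid_duration:
                self._enter(State.ROAM)
        return self.state


animal = Forager()
print([animal.step(odor=True).value for _ in range(6)])
```

With a constant odor and no food, the animal seeks, times out, avoids while ignoring the odor, and only then returns to roaming.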
For another attempt at the problem, let’s take the striatum (basal ganglia) as implementing the timeout portion of this task using the neurotransmitter adenosine as a timeout signal and incorporating the multiple action path discussion from essay 30 on RTPA. Adenosine is a byproduct of ATP breakdown and is a measure of cellular activity. With sufficiently high adenosine, the striatum switches from the active seek path to an avoidance path. These circuits are where caffeine works to suppress the adenosine timeout, allowing for longer concentration.
Mollusk navigation
As mentioned in essay 30, the mollusk sea slug has a food search circuit with a similar logic to what we need here. The animal seeks food odors when it’s hungry, but it avoids food odors when it’s not hungry [Gillette and Brown 2015].
Mollusk food search circuit, illustrating a hunger-modulated switchboard. When the animal is not hungry, the switchboard reverses the odor to motor links turning it away from food.
This essay uses the same idea but replaces the hunger modulation with a timeout. When the timeout occurs, the circuit switches from a food seek action path to a food avoid action path.
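The switchboard idea can be sketched in a few lines of Python. This assumes a pair of left and right odor sensors for illustration only (the essay's Hb path uses a temporal gradient instead), and the function name and sign convention are hypothetical.

```python
def switchboard_turn(odor_left, odor_right, avoid):
    """Return a turn command: positive turns left, negative turns right.

    In seek mode the animal steers toward the stronger odor side.
    In avoid mode the switchboard reverses the odor-to-motor links.
    """
    toward = odor_left - odor_right
    return -toward if avoid else toward
```

The same sensory input drives opposite motor output depending only on the `avoid` flag, which the mollusk sets from hunger and this essay sets from the timeout.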
Odor action paths
Two odor-following action paths exist in the lamprey, one using Hb.m (medial habenula) and one using V.pt (posterior tuberculum). The Hb.m path is a chemotaxis path following a temporal gradient, while the V.pt path projects to MLR (midbrain locomotor region). The lamprey Ob.m (medial olfactory bulb) projects to both Hb.m and V.pt, which each project to different locomotor paths [Derjean et al 2010]: Hb.m to R.ip (interpeduncular nucleus) and V.pt to MLR. The zebrafish also has Ob projections to Hb and V.pt [Imamura et al 2020], [Kermen et al 2013].
Dual odor-seeking action paths in the lamprey and zebrafish. Hb (habenula), Ob.m (medial olfactory bulb), V.pt (posterior tuberculum).
Further complicating the paths, the Hb.m itself contains both an odor seeking path and an odor avoiding path [Beretta et al 2012], [Chen et al 2019]. Similarly, Hb.m has dual action paths for social winning and losing [Okamoto et al 2021]. So, this essay could use the dual paths in Hb.m instead of contrasting Hb.m with V.pt, but the larger contrast should make the simulation easier to follow.
This essay’s simulation makes some important simplifications. The Hb to R.ip path is a temporal gradient path used for chemotaxis, phototaxis and thermotaxis. In a real-world marine environment, odor diffusion and water turbulence are much more complicated, producing more clumps and making a simple gradient ascent more difficult [Hengenius et al 2012]. Because this essay is only focused on the switchboard effect, this simplification should be fine.
Striatum action paths with adenosine timeout
The timeout circuit uses the striatum, which has two paths: one selecting the main action, and the second either stopping the action, or selecting an opposing action [Zhai et al 2023]. The two paths are distinguished by their responsiveness to dopamine with S.d1 (striatal projection with D1 G-s stimulating) or S.d2 (striatal projection with D2 G-i inhibiting) marking the active and alternate paths respectively. This model is a simplification of the mammalian striatum where the two paths interact in a more complicated fashion [Cui et al 2013].
Essay odor seek with timeout circuit. The seek path flows from Ob, through S.d1 to Pv to V.pt. The avoid path flows from Ob, through S.d2 to Pv to Hb. Ad (adenosine), Hb (habenula), Ob (olfactory bulb), Pv (ventral pallidum), S.d1 (striatum D1 projection neuron), S.d2 (striatum D2 projection neuron), V.pt (posterior tuberculum).
As mentioned, the two action paths are the seek path from Ob to V.pt and the avoid path from Ob to Hb. For the timeout and switchboard, the Ob has a secondary projection to the striatum. Although this circuit is meant as a proto-vertebrate simplification, Ob does project to S.ot (olfactory tubercle) and to the equivalent in zebrafish [Kermen et al 2013].
The timeout is managed by adenosine, which is a neurotransmitter derived from ATP and a measure of neural activity. The striatum has three sub-circuits for this kind of functionality, which I’ll cover in order of complexity.
S.d1 and adenosine inhibition
The first circuit only uses the direct S.d1 path and adenosine as a timeout mechanism. When the animal follows an odor, the Ob to S.d1 signal enables the seek action. As a timeout, ATP from neural activity degrades to adenosine, and the buildup of adenosine is a decent measure of activity over time. The longer the animal seeks, the more adenosine builds up. If the Ob projection axon contains an A1i (adenosine G-i inhibitory) receptor, the adenosine will inhibit the release of glutamate from Ob, which will eventually self-disable the seek action.
S.d1 action path inhibited by adenosine buildup as a timeout. A1i (adenosine G-i inhibitory receptor), Ad (adenosine), mGlu5q (metabotropic glutamate G-q receptor), Ob (olfactory bulb), S.d1 (D1-type striatal projection neuron)
In practice, the striatum uses astrocytes to manage the glutamate release. An astrocyte that envelops the synapse measures glutamate release with an mGlu5q (metabotropic glutamate with G-q/11 binding) receptor and accumulates internal calcium [Cavaccini et al 2020]. The astrocyte’s calcium triggers an adenosine release as a gliotransmitter, making the adenosine level a timeout measure of glutamate activity. The presynaptic A1i receptor then inhibits the Ob signal. The timeframe is on the order of 5 to 20 minutes with a recovery of about 60 minutes, although the precise timing is probably variable. Interestingly, the timeout is a log function instead of a linear measure of activity [Ma et al 2022].
This circuit doesn’t depend on the postsynaptic S.d1 firing [Cavaccini et al 2020], which contrasts with the next LTD (long term depression) circuit which only inhibits the axon if the S.d1 projection neuron fires.
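The self-limiting behavior of this first circuit can be illustrated with a small leaky-integrator sketch. It is a toy model under stated assumptions: the divisive form of the A1i inhibition, the first-order adenosine clearance, and the `gain`, `decay`, and `a1_weight` values are all hypothetical and are not fitted to the cited timings.

```python
def seek_drive_trace(steps, odor=1.0, gain=0.05, decay=0.01, a1_weight=1.0):
    """Return the Ob-to-S.d1 glutamate drive at each step of a constant odor.

    Hypothetical model: the astrocyte turns glutamate release into adenosine,
    and presynaptic A1i receptors scale the release back down.
    """
    adenosine = 0.0
    trace = []
    for _ in range(steps):
        # Presynaptic A1i: adenosine reduces glutamate release from Ob.
        glutamate = odor / (1.0 + a1_weight * adenosine)
        trace.append(glutamate)
        # Astrocyte: glutamate raises adenosine, which slowly clears.
        adenosine += gain * glutamate - decay * adenosine
    return trace
```

With a constant odor the drive starts at full strength and declines as adenosine accumulates, so a fixed seek threshold on the drive would eventually be crossed without any explicit clock.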
S.d1 presynaptic LTD using eCB
S.d1 self-activating LTD uses retrotransmission to inhibit its own input using eCB (endocannabinoids) as a neurotransmitter. Like the astrocyte in the previous circuit, S.d1 uses an mGlu5q receptor to trigger eCB release, but it also requires that S.d1 fire, as triggered by the NMDA glutamate receptor. The axon receives the eCB retrotransmission with a CB1i (cannabinoid G-i inhibitory) receptor, which triggers presynaptic LTD [Shen et al 2008], [Wu et al 2015]. Like the previous circuit, the timeframe seems to be on the order of 10 minutes, lasting for 30 to 60 minutes.
S.d1 LTD circuit. A coincidence of glutamate detection with mGlu5q and S.d1 activation with NMDA triggers eCB release, which activates CB1i leading to presynaptic LTD. CB1i (cannabinoid G-i inhibitory receptor), mGlu5q (glutamate G-q receptor), Ob (olfactory bulb), S.d1 (striatum D1-type projection neuron).
This circuit inhibits itself over time without using adenosine or astrocytes. In the full striatum circuit, high dopamine levels suppress this LTD suppression, meaning that dopamine inhibits the timeout [Shen et al 2008].
The next circuit adds the S.d2 path, which uses adenosine and self-activity to trigger postsynaptic LTP.
S.d2 postsynaptic LTP via A2a.s
Consider a third circuit that has the benefits of both previous circuits because it uses adenosine as a timer managed by astrocytes and is also specific to postsynaptic activity. In addition, it allows for a second action path, changing the circuit from a Go/NoGo system to a Go/Avoid action pair. This circuit uses LTP (long term potentiation) on the S.d2 striatum neurons.
Timeout circuit using postsynaptic LTP at the S.d2 neuron and adenosine as a timeout signal. As adenosine accumulates, it stimulates S.d2, which both disables S.d1 and drives the avoid path. A2a.s (adenosine G-s stimulatory receptor), Ad (adenosine), mGlu5q (glutamate G-q metabotropic receptor), Ob (olfactory bulb), S.d1 (striatum D1-type projection neuron), S.d2 (striatum D2-type projection neuron).
When the odor first arrives, Ob activates the S.d1 path, seeking toward the odor. S.d1 is activated instead of S.d2 because of dopamine. In this simple model, the Ob itself could provide the initial dopamine like C. elegans odor-detecting neurons or the tunicate’s coronal cells or the dual glutamate and dopamine neurons in Vta (ventral tegmental area).
As time goes on, adenosine from the astrocyte builds up, which activates the S.d2 A2a.s (adenosine G-s stimulatory receptor) until it overcomes dopamine suppression and increases the S.d2 activity with LTP [Shen et al 2008]. Once S.d2 activates, it suppresses S.d1 [Chen et al 2023] and drives the avoid path.
The combination of these circuits looks like it’s precisely what the essay needs.
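The Go/Avoid competition in the third circuit can be sketched as a pair of rectified units. This is only an illustration: the linear weighting of dopamine against adenosine, the function name, and the unitless levels are assumptions, not values from the cited studies.

```python
def select_action(ob, dopamine, adenosine):
    """Pick the action path for one moment of odor input.

    Hypothetical rate model:
    - S.d2 is excited by adenosine (A2a.s) and inhibited by dopamine (D2 G-i).
    - S.d1 is driven by Ob input gated by dopamine (D1 G-s) and is
      suppressed by S.d2 activity.
    """
    sd2 = max(0.0, ob * (adenosine - dopamine))
    sd1 = max(0.0, ob * dopamine - sd2)
    if sd2 > sd1:
        return "avoid"
    if sd1 > 0.0:
        return "seek"
    return "none"
```

Early in a seek, adenosine is low and `select_action(1.0, 0.5, 0.0)` returns `"seek"`. Once adenosine exceeds the dopamine suppression, as in `select_action(1.0, 0.5, 1.5)`, the same odor input returns `"avoid"`.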
Simulation
In the simulation, when the animal is hunting food and finds a food odor plume, it directly seeks toward the center and eats if it finds food. In the screenshot below, the animal is eating.
Simulation showing the animal eating food after seeking the odor plume.
Satiation disables the food seek. This might sound obvious, but hunger gating of food seeking requires specific satiety circuits for any seek path that’s food specific, which means the involvement of H.l (lateral hypothalamus) and related areas like H.arc (arcuate hypothalamus) and H.pv (periventricular hypothalamus). And, of course, the simulation requires code to only enable food odor seek when the animal is searching for food.
The next screenshot shows the central problem of the essay, when the animal seeks a food odor but there’s no food at the center.
Screenshot showing the animal stuck in the middle of the food odor plume before the timeout.
Without a timeout, the animal circles the center of the food odor plume endlessly. After a timeout, the animal actively leaves the plume and avoids that specific odor until the timeout decays.
Screenshot showing the animal escaping from the odor plume after the timeout.
This system is somewhat complex because of the need for hysteresis. A too-simple solution with a single threshold can oscillate, because as soon as the animal starts leaving, the timeout decays, which then re-enables the food seek, which then quickly times out, repeating. Instead, the system needs to make re-enabling of the food seek more difficult after a timeout.
But that adds a secondary issue: if starting a seek requires the timeout signal to fall below a lower threshold, then an ongoing seek needs a raised threshold, or it would stop almost immediately. So, sustaining a seek must be easier than starting one. This hysteresis and seek sustain presumably need to be handled by the actual striatum circuit.
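One common way to get this kind of hysteresis is a pair of thresholds, as in a Schmitt trigger. The sketch below is not the essay's simulation code; the class name and the `give_up` and `recover` levels are hypothetical, and the input is assumed to be an adenosine-like timeout signal that rises during seek and decays afterward.

```python
class SeekGate:
    """Two-threshold gate on a timeout signal, to prevent seek/avoid oscillation."""

    def __init__(self, give_up=1.0, recover=0.3):
        self.give_up = give_up    # level that ends an ongoing seek
        self.recover = recover    # level the signal must decay to before seek re-enables
        self.seek_enabled = True

    def update(self, timeout_signal):
        """Return True if seeking is allowed at this timeout signal level."""
        if self.seek_enabled and timeout_signal >= self.give_up:
            self.seek_enabled = False
        elif not self.seek_enabled and timeout_signal <= self.recover:
            self.seek_enabled = True
        return self.seek_enabled
```

Between the two thresholds the gate keeps whatever state it already had, so a signal of 0.5 sustains an ongoing seek but does not restart one that has timed out.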
Discussion
I think this essay shows that using the striatum for an action timeout for food seek is a plausible application. The circuit is relatively simple and is effective, improving search by avoiding failed areas.
However, the simulation does raise some issues, particularly the hysteresis problem. If the striatum does provide a timeout along these lines, it must somehow solve the hysteresis problem. While the animal is seeking, the ongoing LTP/LTD inhibition should use a high threshold to stop seeking, but once avoidance starts, there needs to be a high threshold to return to seeking to avoid oscillations between the two action paths.
Because LTD/LTP is a relatively long chemical process (minutes) internal to the neurons, as opposed to an instant switch in the simulation, the delay itself might be sufficient to solve the oscillation problem. It’s also possible that some of the more complicated parts of the circuit, such as P.ge (globus pallidus) and its feedback to the striatum or H.stn (subthalamic nucleus) might affect the sustain of seek or breaking it and so control the hysteresis problem.
The simulation also reinforced the absolute requirement that action paths need to be modulated by internal state like hunger. For the seek paths, both Hb.m and V.pt are heavily modulated by H.l and other hypothalamic hunger and satiety signals.
As expected, the simulation also illustrated the need for context information separate from the target odor. While the food odor is timed out, the animal can’t search the other odor plume because this essay’s animal can’t distinguish between the odor plumes, and therefore avoids both odors. With a long timeout and many odor plumes, this delays the food search. A future enhancement is to add context to the timeout. If the animal can time out a specific odor plume, it can search alternatives even if the food odor itself is identical.
Unsurprisingly, since essay 26 was a first cut at selective attention, it exposed a number of problems with both the neuroscience and the simulation model itself.
Specific give up
The current give up circuit is a global circuit, which doesn’t depend on the current stimulus. For this essay, the animal has two potential odors, and because the give up is global, when the animal gives up, it gives up on both odors.
Global give-up circuit for olfactory seek. H.l lateral hypothalamus, Hb.l lateral habenula, Vdr dorsal raphe, 5HT serotonin.
An improvement would be a cue-specific give up capability. When the animal gives up on odor A, it should investigate odor B. Instead it gives up on both. I need to add some mechanism to create a cue-specific give up capability.
As a possible neural analog, the adenosine receptor can work as a local give-up circuit by integrating neural activity. Since adenosine is essentially a waste product from neural activity, long activity will accumulate adenosine. The A1 adenosine receptor detects the adenosine and inhibits activity, since it’s a Gi receptor.
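A cue-specific give-up can be sketched by keeping one adenosine-like accumulator per cue instead of a single global one. The class, the cue labels, and the `gain`, `decay`, and `threshold` parameters below are hypothetical illustrations of the idea, not part of the current simulation.

```python
from collections import defaultdict


class CueGiveUp:
    """Per-cue leaky accumulators standing in for local adenosine buildup."""

    def __init__(self, gain=0.25, decay=0.9, threshold=1.0):
        self.gain = gain            # adenosine added per tick of seeking a cue
        self.decay = decay          # fraction of adenosine kept each tick
        self.threshold = threshold  # level at which the cue is given up
        self.adenosine = defaultdict(float)

    def step(self, seeking_cue=None):
        """Advance one tick; only the cue being sought accumulates adenosine."""
        for cue in list(self.adenosine):
            self.adenosine[cue] *= self.decay
        if seeking_cue is not None:
            self.adenosine[seeking_cue] += self.gain

    def gave_up(self, cue):
        return self.adenosine[cue] >= self.threshold
```

Because only the synapses for odor A accumulate adenosine while the animal seeks A, giving up on A leaves odor B available to investigate.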
Olfactory complexity and attention
The essay’s odor model is extremely oversimplified, because odor receptors are feature detectors, not molecule receptors, and odors are combinations of molecules. Since a specific odor is a combination of features, P.bf (basal forebrain) can’t be a simple winner-take-all inhibitory circuit as implemented in this essay. Instead, attention needs to be a set of features that excludes the distractor odor’s features.
Olfactory gamma and beta
Although the essay treats the olfactory bulb data as direct signals, oscillations are a major feature of the olfactory bulb. Strong odors trigger gamma (40-100Hz) signals in Omt (mitral/tufted output cells), enhanced by ACh (acetylcholine) from P.bf. Feedback from O.pir (olfactory piriform cortex) triggers beta (15-30Hz) oscillations. In addition, interactions with breathing in mammals synchronize with theta (4-10Hz). In that last case, though, since the simulation animal is aquatic, breathing isn’t an appropriate synchronizer.
Temporal gradient seek issues
Odor seeking in essay 26 uses temporal gradient descent modulated by head direction in Hb.m (medial habenula) and B.ip (interpeduncular nucleus). The animal combines its head direction with the temporal gradient to estimate the odor direction, and it saves the result as a goal vector. As the animal turns, it can improve the direction estimate. In the phototaxis example of essay 25, the saved goal vector direction helped with intermittent data, where it could remember the light location for a few seconds.
Problems with the current odor direction. A quick switch in location incorporates data from the old direction, leading to an incorrect estimate.
However, the system as implemented in the model is extremely limited. It can’t truly triangulate to locate the odor, but can only improve the single direction. In the diagram above, the animal can only select one of the two vectors as an estimate. It can’t combine the two into a better estimate of the center. Also, in the diagram, the earlier estimate is no longer useful because the animal has moved.
Now, the issue might be purely in the simulation. If B.ip and Vdr (dorsal raphe serotonin) are calculating this kind of estimate, it’s likely their computation is better than the current simulation.
The selection is a trade off where a stronger gradient is likely a better estimate, but if the animal moves too far from the earlier sample, the old direction is no longer relevant. Since the animal lacks the sophistication of an allocentric map to resolve the discrepancy, it discards the old value.
The current implementation decays the old estimate to allow newer estimates to overwrite it even if the later gradient is weaker. Essentially the memory is like a leaky integrator, as is appropriate for placing it in the serotonin neurons and/or associated glia with short term (5s) memory as in simple zebrafish motor memory [Dragomir et al 2020].
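The decaying goal vector can be sketched as a stored direction with a leaky strength. This is a guess at the shape of the mechanism rather than the actual simulation code; the class name, the per-tick `decay` factor, and the overwrite rule are assumptions.

```python
class GoalVector:
    """Remembered odor direction whose strength leaks away over time."""

    def __init__(self, decay=0.8):
        self.decay = decay      # fraction of strength kept per tick (hypothetical)
        self.direction = 0.0    # radians, in the animal's head-direction frame
        self.strength = 0.0

    def update(self, head_direction, gradient):
        """Decay the old estimate, then overwrite it if the new gradient is stronger."""
        self.strength *= self.decay
        if gradient > self.strength:
            self.direction = head_direction
            self.strength = gradient
        return self.direction
```

A weaker gradient cannot replace a fresh strong estimate, but it can replace the same estimate a few ticks later, once the stored strength has decayed below it.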
Bayesian updates
In a future essay, it might be interesting to explore this issue to see if a simple Bayesian system could be implemented in low-complexity circuits, where stronger data would update the current model more than weaker data.
Self motion and gradient vectors
When the animal is turning, the running average no longer represents a straight line. For the gradient vector, the system assumes the recent average was measured along the current head direction, but turns violate this assumption. To avoid miscalculating gradient vectors, the animal should suppress measurement during turns.
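Suppressing measurement during turns could look like the following sketch, which resets the running average whenever the turn rate is too high. The exponential running average, the `alpha` and `max_turn` parameters, and the reset-on-turn policy are assumptions for illustration.

```python
class GradientSampler:
    """Temporal odor gradient that is only reported on straight segments."""

    def __init__(self, alpha=0.2, max_turn=0.1):
        self.alpha = alpha        # running-average update rate (hypothetical)
        self.max_turn = max_turn  # turn rate above which samples are discarded
        self.average = None       # running odor average along the current heading

    def update(self, odor, turn_rate):
        """Return odor minus the running average, or None when no valid gradient exists."""
        if abs(turn_rate) > self.max_turn:
            self.average = None   # old samples were taken along a different heading
            return None
        if self.average is None:
            self.average = odor   # first sample of a new straight segment
            return None
        gradient = odor - self.average
        self.average += self.alpha * (odor - self.average)
        return gradient
```

Returning `None` rather than zero lets the downstream goal vector tell "no measurement" apart from "flat gradient".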
Swimming and theta
The gradient seek issues above are compounded by swimming with a fixed head. Early vertebrates would have had a fixed head like sharks, meaning that each swimming stroke would move the head from side to side. That sideways movement would affect the odor gradient and head direction.
Inconsistent head vs body direction and odor measurement while swimming with a fixed head.
A simplistic fix would take an odor gradient sample only on each swim stroke, only reporting at the stroke end for consistency and to average from the beginning of the stroke to the end. That solution would give a consistent measurement in a reasonably consistent direction, as opposed to sampling randomly in a cycle.
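A sketch of the per-stroke sampling, assuming a fixed number of odor samples per swim stroke (the function name and that fixed count are hypothetical):

```python
def stroke_gradients(odor_samples, samples_per_stroke):
    """Average odor over each complete stroke and return stroke-to-stroke changes.

    Averaging over the whole stroke cancels the side-to-side head sweep,
    and one value is reported per stroke, at the stroke end.
    """
    means = []
    last_start = len(odor_samples) - samples_per_stroke
    for start in range(0, last_start + 1, samples_per_stroke):
        stroke = odor_samples[start:start + samples_per_stroke]
        means.append(sum(stroke) / samples_per_stroke)
    return [later - earlier for earlier, later in zip(means, means[1:])]
```

For example, samples `[1, 3, 2, 4, 3, 5]` with two samples per stroke oscillate within each stroke, but the stroke averages 2, 3, 4 give a steady gradient of 1.0 per stroke.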
Log encoding vs linear encoding
For simplicity, I’ve used linear encoding for signals in the essays, because the basic functional architecture remains the same, and the simulation isn’t precise enough to need more complexity. But for odors, the dynamic range between a single molecule detection and an overpowering odor doesn’t scale well with a linear representation.
In particular, the odor weight from the simple distance gradient, together with the above-mentioned temporal gradient issues, might be better modeled with a log signal. Basically, the issue I raised above with gradient vector sampling might be more tractable with a different encoding, and log encoding might make the actual neural circuit less finicky than the current linear model.
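To illustrate why a log signal might help, the sketch below compares the two encodings across a wide dynamic range. The `floor` detection limit and the function names are hypothetical.

```python
import math


def log_encode(concentration, floor=1e-6):
    """Encode odor concentration in decades; `floor` avoids log of zero."""
    return math.log10(max(concentration, floor))


def gradient(encode, earlier, later):
    """Temporal gradient between two samples under a given encoding."""
    return encode(later) - encode(earlier)
```

Under log encoding, a tenfold increase gives the same gradient of about 1.0 whether the animal is at the faint edge of a plume (`gradient(log_encode, 1e-4, 1e-3)`) or near its center (`gradient(log_encode, 1.0, 10.0)`), while a linear encoding gives 0.0009 in the first case and 9.0 in the second.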
Seek mode switching
The essay’s simulation lacks a specific mode switching circuit. In vertebrates the peptide core (hypothalamus, PAG, B.pb area) switches action modes from roaming to seek to eating to rest and sleep. These modes are motivated and depend on internal needs and scheduling impulses programmed by evolution.