Vision, Hearing, Eye Movements, and the Brain

Jennifer M. Groh, Ph.D.



The Brain and Space

I. Reference frames and coordinate transformations

Spatial locations are defined in a frame of reference. A frame of reference is the reference point or axes used to describe where something is.


For example, in this picture, you can say the coffee cup is on the table, to the left of the chair, or to the right of the book. These are all perfectly good ways of describing the location of the coffee cup, in three different frames of reference.

Suppose you knew the book is on the table and the coffee cup is to the left of the chair. Is that enough information to say where the coffee cup is with respect to the book? No! The locations of the two objects are defined in different reference frames, and to determine their relative positions requires knowing the relationship between those reference frames. In other words, you would need to know where the chair is with respect to the table (and you would need to know the precise directions and distances, rather than loose words like "on" or "to the left").
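The point above can be made concrete with a small sketch. The coordinates below are hypothetical, chosen only for illustration; the key step is that the cup's position relative to the book is undefined until the chair-to-table relationship is supplied.

```python
# Hypothetical 2-D coordinates (in meters), purely for illustration.
# Each object's position is known only relative to a different reference object.
book_in_table_frame = (0.3, 0.0)    # book, in a table-centered frame
cup_in_chair_frame = (-0.5, 0.2)    # cup, in a chair-centered frame

# Without the chair-to-table relationship, cup-relative-to-book is undefined.
# Supplying that missing link makes the computation possible:
chair_in_table_frame = (1.0, -0.4)  # the extra information needed

# Re-express the cup in the table frame, then subtract the book's position.
cup_in_table_frame = tuple(c + o for c, o in zip(cup_in_chair_frame, chair_in_table_frame))
cup_in_book_frame = tuple(t - b for t, b in zip(cup_in_table_frame, book_in_table_frame))
print(cup_in_book_frame)  # the cup's position in a book-centered frame
```

Note that the answer requires exact directions and distances, not loose words like "on" or "to the left" — which is precisely why the verbal version of the puzzle cannot be solved.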



How do our brains define the locations of objects? This depends on the kind of object. If the object is visible, then our eyes will report its location to the brain in an eye-centered frame of reference. The frame of reference is eye-centered because of the optics of the eye - to reach the retina, light must pass through the pupil, and individual photoreceptors at a particular spot on the retina can only "view" the location in the visual scene that is lined up with the pupil for that position on the retina. This is why photoreceptors have receptive fields - they can only be affected by light coming from a very restricted region of the visual environment. When the eye moves, the location in the environment that an individual photoreceptor views moves as well.


When the eye looks straight ahead, the image of the tree falls in the center of the retina.


When the eye looks up or down, the image of the tree shifts to a different location on the retina. The retina signals the locations of stimuli in an eye-centered frame of reference.
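The captions above can be summarized in one line of arithmetic. This is a one-dimensional sketch with assumed angles (degrees, positive = upward): a photoreceptor's position on the retina fixes which direction it views *relative to gaze*, so the retinal image of a fixed object shifts whenever the eye moves.

```python
# One-dimensional sketch of the eye-centered frame (hypothetical angles, deg).
def retinal_angle(stimulus_in_world, gaze_direction):
    """Eye-centered (retinal) angle of a stimulus for a given gaze direction."""
    return stimulus_in_world - gaze_direction

tree = 0.0  # the tree sits straight ahead of the viewer
print(retinal_angle(tree, 0.0))   # eyes straight ahead -> 0.0, image at the retinal center
print(retinal_angle(tree, 10.0))  # eyes 10 deg up -> -10.0, image shifts on the retina
```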




This shifting frame of reference poses a variety of problems. First, the stability of visual perception: how do we maintain a sense of where things are when the eyes are constantly moving? Second, memory: how do we compare what we see now with what we saw at some earlier point in time, despite the eyes having moved in the meantime?

And there is a third problem. The visual system is not our only sensory system. Our auditory and somatosensory systems also provide information about external objects and events, and these two senses use different frames of reference. For example, receptors on the skin report where on the skin a tactile stimulus is located - a body-surface centered frame of reference. The eyes and body can move with respect to one another, so there is no fixed correspondence between the visual eye-centered and somatosensory body-surface centered reference frames.

The auditory system calculates the locations of sounds using minute differences in sound arrival time and loudness across the two ears, as well as subtle filtering of the frequency content by the folds of the ears. These cues can be used to infer the direction a sound is coming from, defined with respect to the ears. Since the ears are fixed to the head and do not move independently of it, this reference frame is referred to as a head-centered reference frame.

So the third problem is this: how can the brain integrate or combine information from different sources when their spatial reference frames differ?


  The visual and auditory components of stimuli are encoded in different reference frames. The sound of the bird is to the right in a head-centered reference frame - the sound is louder and arrives sooner to the right ear. But if the eyes are turned to the right, then the image of the bird is in the center in an eye-centered reference frame. The relationship between a location defined with respect to the head and a location defined with respect to the eyes depends on the orientation of the eyes with respect to the head.
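The bird example can be written out in code. The angles below are hypothetical (degrees, positive = rightward); the point is the caption's last sentence: relating a head-centered location to an eye-centered one requires knowing the orientation of the eyes in the head.

```python
# The bird example, in code (hypothetical angles, degrees, positive = rightward).
sound_re_head = 20.0  # the song is 20 deg right of the head's midline
eyes_in_head = 20.0   # the eyes are turned 20 deg to the right

# The same bird, expressed in an eye-centered frame:
image_re_eyes = sound_re_head - eyes_in_head
print(image_re_eyes)  # 0.0: the bird's image falls at the center of the retina
```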


Research in the Groh laboratory is devoted to addressing these questions. We have designed models for coordinate transformations (transforming from one reference frame to another). We have found evidence for conversion of auditory and somatosensory signals into an eye-position dependent reference frame. Our recent experiments have explored the coding of auditory space at several stages of the auditory pathway, and have suggested that coordinate transformations begin much earlier in the neural hierarchy than was previously suspected. Our results suggest that integration of visual and auditory information may occur in a composite or hybrid reference frame that reflects both head- and eye-centered components of the locations of stimuli.

So what do Copernicus and your brain have in common? Well, like your brain, Copernicus solved a frame of reference problem. Astronomers had puzzled for centuries over the strange trajectories of the other planets in our solar system. Copernicus realized that the planets were all moving around the sun, not the earth; the earth-centered assumption was what had made the planetary trajectories seem so confusing. Later, Kepler determined that these orbits follow highly predictable elliptical trajectories around the sun. A sun-centered, or heliocentric, view of the solar system emerged.

Groh lab publications on reference frames:

Groh, JM and Pai, D.  2010. Looking at sounds:  neural mechanisms in the primate brain. In, Primate Neuroethology.  A. Ghazanfar and M. Platt, eds. Oxford University Press. 

Kopčo, N., Lin, I-F., Shinn-Cunningham, B.G., and Groh, J.M. 2009. Reference frame of the ventriloquism aftereffect. J. Neurosci., 29:13809-13814.

Maier, J.X. and Groh, J.M. 2009. Multisensory guidance of orienting behavior. Hearing Research, 258:106-112.

Mullette-Gillman, O. A., Cohen, Y. E. and Groh, JM.  2008. Motor-related signals in the intraparietal cortex encode locations in a hybrid, rather than eye-centered, reference frame.  Cerebral Cortex, epub (PDF version). 

Porter, KK., Metzger, RR, and Groh, JM. 2007. Visual- and saccade-related signals in the primate inferior colliculus. Proceedings of the National Academy of Sciences, 104(45):17855-60. (PDF version)

Bulkin, DA. and Groh, JM. 2006. Seeing sounds: Visual and auditory interactions in the brain. Current Opinion in Neurobiology, 16:415-9. (PDF version)

Porter, KK., Metzger, RR, and Groh, JM. 2006. The representation of eye position in primate inferior colliculus. Journal of Neurophysiology, 95:1826-42. (PDF version)

Mullette-Gillman, OA., Cohen, YE, Groh, JM 2005. Eye-centered, head-centered, and complex coding of visual and auditory targets in the intraparietal sulcus. Journal of Neurophysiology, 94:2331-2352. (PDF version) (Link to article at J Neurophysiol). See also the editorial by Larry Snyder.

Metzger, RR, Mullette-Gillman, OA, Underhill, AM, Cohen, YE, and Groh, JM. 2004. Auditory saccades from different eye positions in the monkey: implications for coordinate transformations. J. Neurophysiol., 92(4):2622-7. (PDF version) (Link to article at J Neurophysiol)

Werner-Reiss, U, Kelly, KA, Trause, AS, Underhill, AM and Groh, JM. 2003. Eye position affects activity in primary auditory cortex of primates. Current Biology, 13:554-562. (PDF version)

Groh, JM, Trause, A.S., Underhill, A.M., Clark, K.R., and Inati, S. 2001. Eye position influences auditory responses in primate inferior colliculus. Neuron, 29:509-518. (PDF version) See also the preview in Neuron by Gregg Recanzone, and Nature's Science Update coverage of this work, "Seeing is a hearing aid".

Boucher, L, Groh, JM, and Hughes, HC. 2001. Visual latency and the mislocalization of perisaccadic stimuli. Vision Research, 41:2631-2644. (PDF version)

Groh, JM, Born, RT, and Newsome, WT. 1997. How is a sensory map read out? Effects of microstimulation in area MT on smooth pursuit and saccadic eye movements. J. Neurosci., 17:4312-4330. (PDF version)

Groh, JM and Sparks, DL. 1996. Saccades to somatosensory targets: I. Behavioral characteristics. J. Neurophysiol., 75: 412-427. (PDF version)

Groh, JM and Sparks, DL. 1996. Saccades to somatosensory targets: II. Motor convergence in primate superior colliculus. J. Neurophysiol., 75: 428-438. (PDF version)

 Groh, JM and Sparks, DL. 1996. Saccades to somatosensory targets: III. Influence of eye position on somatosensory activity in primate superior colliculus. J. Neurophysiol., 75: 439-453. (PDF version)

Groh, JM and Sparks, DL. 1992. Two models for transforming auditory signals from head-centered to eye-centered coordinates. Biol. Cybern., 67(4):291-302.(PDF version)


The Brain and Space

II. Coding Formats

The brain has two methods for encoding information at its disposal. (This is an oversimplification, but probably not a severe one.) One way is which neurons are active. For example, beginning with the retina, which neurons are responding to light informs the brain about where the light is coming from. This occurs due to the optics of the eye: the light must pass through a small opening, the pupil, in order to reach the retina at the back of the eye. The retina’s photoreceptors peer out at the world through that tiny pinhole, and each individual photoreceptor is only able to view a small portion of the visual scene. The portion that a given photoreceptor can “see” is known as its receptive field.


Light from the top flashlight hits the lower portion of the retina. Light from the bottom flashlight hits the upper portion of the retina. Thus, where on the retina neurons are responding to light indicates to the brain where the light is coming from.



The retina contains over 100 million photoreceptors (Osterberg, 1935).  Each one has a different receptive field, based on its position on the retinal surface and thus what direction light must come from in order to pass through the pupil and be absorbed by that photoreceptor.  Across the population of all photoreceptors, the spatial layout of light present in the visual scene is reproduced as a spatial layout of activation in the array of photoreceptors. 

So the code the brain receives is something like this:



The presence/absence of visual stimuli at different locations in the visual scene can be signaled by the presence/absence of activity in neurons at those corresponding retinal locations.



Again, a gross oversimplification – neurons are not either “on” or “off” – they have graded response patterns that can vary continuously within a range – but the point here is that if you were a Blind Martian and you couldn’t see for yourself but you had some science-fiction-y ability to monitor the activity of the retinal neurons of us Earth Creatures, you’d know quite a lot about the presence and location of light in the scene just from knowing which neurons were active. 

One reason I’ve made this on/off oversimplification is to set up an analogy with electronics and computing. This which-neurons-are-on kind of code resembles the binary digital format used in so many types of electronic devices.  We can redraw the above graph as a binary number like this:





This binary number signals a quantity. Each digit’s location in the number has meaning – a zero or a one in the rightmost digit, or bit, signals the presence or absence of a 1 (2⁰). The leftmost digit or bit signals the presence or absence of 256 (2⁸), and totaling these components allows us to express the number in our usual base-ten fashion.
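The place-value rule just described can be written as a one-liner. The particular bit pattern below is an assumed example (nine bits, with only the leftmost and rightmost set):

```python
# Each bit contributes its positional weight; the weights sum to the value.
bits = [1, 0, 0, 0, 0, 0, 0, 0, 1]  # 9 bits; leftmost worth 2**8 = 256
value = sum(bit * 2 ** i for i, bit in enumerate(reversed(bits)))
print(value)  # 257 = 256 + 1
```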

In short, we can think of the brain’s which-neurons-are-on coding as its form of digital coding.

But many electrical devices are analog: the magnitude of some signal, such as a voltage, varies continuously with the information to be encoded.





An analog ammeter, for measuring electrical current.  The amount of current is indicated by how far the needle deflects.





The brain also uses analog coding:  neural activity can vary continuously between being completely silent and discharging action potentials at a rate of hundreds per second.  If a neural digital code is which-neurons-are-on, a neural analog code is how-on-are-they.  So, trivially, you can rethink the above discussion substituting finer-grained values to replace the 0’s and 1’s.  But more importantly, the level of activity of neurons can be meaningful in and of itself. And different kinds of codes can be used in different contexts.
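The two formats can be contrasted with a pair of toy readouts. All numbers below are hypothetical, and real neurons give graded, noisy responses, but the distinction in where the information lives is the one described above:

```python
# Digital-like place code: WHICH neuron is active carries the message.
place_code = [0, 0, 1, 0, 0]      # the active neuron's index = the encoded location
decoded_location = place_code.index(1)
print(decoded_location)  # 2

# Analog rate code: HOW active a neuron is carries the message.
firing_rate = 85.0                # spikes per second (hypothetical)
encoded_magnitude = firing_rate / 100.0  # the level of activity itself is meaningful
print(encoded_magnitude)  # 0.85
```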

We’ve been investigating the coding format used by the auditory system.  Does the brain use a digital, which-neurons-are-on, format for encoding sound location, or does it use an analog, how-on-are-they, format?  The auditory system has to be clever to determine where sounds are coming from.  No physical mapping of sound location onto neuron location occurs in the ear.  Instead, the brain infers the location of sounds by comparing what each ear hears.  A sound located to the right will be slightly louder in the right ear than in the left ear, and it will arrive in the right ear sooner by a tiny amount – not even a fraction of a second, but a fraction of a millisecond!  How much louder, and how much sooner, varies with the direction the sound is coming from.  The biggest differences occur when the sound is all the way to the left or right, and there is no difference when a sound is straight ahead (or behind). 
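The size of the arrival-time cue can be estimated with a standard spherical-head approximation (the head radius and speed of sound below are assumed round values, and the formula is a textbook idealization, not a description of neural processing):

```python
import math

# Rough interaural time difference for a distant source, spherical-head model.
def itd_seconds(azimuth_deg, head_radius_m=0.09, speed_of_sound_mps=343.0):
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound_mps) * (theta + math.sin(theta))

print(itd_seconds(0.0))   # 0.0: no difference for a sound straight ahead
print(itd_seconds(90.0))  # ~0.00067 s: largest when the sound is all the way to the side
```

Even at its maximum, the difference is well under a millisecond, which is why the cue is so remarkable.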



An example neuron from the primate Inferior Colliculus, an early auditory area.  The neuron discharges more vigorously for sounds located 90 deg. to the right – where interaural timing and level differences reach their maximum value – than for any other location.  The majority of neurons showed a similar preference for either straight right or straight leftward sounds. (Groh et al., J. Cog. Neurosci., 2003.)



In short, the information from which the brain has to infer the location of sounds is not itself spatial, as it is in the visual system, but is based on the magnitudes of interaural timing and level differences.   And, interestingly, we have found that in two early stages of the primate brain, sound location is encoded in neural activity primarily in an analog format.  Most neurons discharge for most sound locations, but the amount of activity that they exhibit depends on the location of the sound.  The greatest activity occurs for sounds near the axis of one ear or the other.  A Deaf Martian could get a pretty good idea of where sounds are coming from (in the horizontal dimension) by monitoring the relative activity of left-preferring vs. right-preferring neurons.
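The Deaf Martian's readout can be sketched as follows. The sigmoid tuning curve and its parameters are assumptions chosen for illustration, not fitted to data; the essential features match the description above: most neurons respond to most locations, activity rises toward one ear's axis, and comparing the two populations recovers azimuth.

```python
import math

# Hypothetical monotonic rate code, 0-100 spikes/s.
def rate(azimuth_deg, preferred_side):
    """Firing rate of a neuron preferring one side; side: +1 = right, -1 = left."""
    return 100.0 / (1.0 + math.exp(-preferred_side * azimuth_deg / 20.0))

def readout(azimuth_deg):
    """The 'Deaf Martian' readout: right- minus left-preferring activity."""
    return rate(azimuth_deg, +1) - rate(azimuth_deg, -1)

print(readout(0.0))       # 0.0: the two populations balance for a sound straight ahead
print(readout(90.0) > 0)  # True: rightward sounds drive the right-preferring pool harder
print(readout(-90.0) < 0) # True: leftward sounds drive the left-preferring pool harder
```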

Groh lab publications on coding formats:

Werner-Reiss, U. and Groh, JM.  2008.  A rate code for sound azimuth in monkey auditory cortex:  implications for human neuroimaging studies.  Journal of Neuroscience.  28:3747-3758. (PDF version)

Porter, KK. and Groh, JM. 2006. The "other" transformation required for visual-auditory integration: representational format. Progress in Brain Research, 155:313-23. (PDF version)

Porter, KK., Metzger, RR, and Groh, JM. 2006. The representation of eye position in primate inferior colliculus. Journal of Neurophysiology, 95:1826-42. (PDF version)

Groh, JM, Kelly KA and Underhill, AM. 2003. A monotonic code for sound azimuth in primate inferior colliculus. Journal of Cognitive Neuroscience, 15(8):1217-1231. (PDF version) (Link to article at JoCN website). 

Groh, JM. 2001. Converting neural signals from place codes to rate codes. Biol. Cybern., 85:159-65. (PDF version)

Born, RT, Groh, JM, Zhao, R, and Lukaswewycz, SJ. 2000. Segregation of object and background motion in visual area MT: effects of microstimulation on eye movements. Neuron, 26:725-734. (PDF version). See also this news & views piece in Current Biology by Treue & Ilg

Groh, JM, Born, RT, and Newsome, WT. 1997. How is a sensory map read out? Effects of microstimulation in area MT on smooth pursuit and saccadic eye movements. J. Neurosci., 17:4312-4330. (PDF version)

The Brain and Space

III. Seeing Ears and Hearing Eyes

[Under construction]

Groh lab publications on visual-auditory interactions :

Porter, KK., Metzger, RR, and Groh, JM. 2007. Visual- and saccade-related signals in the primate inferior colliculus. Proceedings of the National Academy of Sciences, 104(45):17855-60. (PDF version)

Kopčo, N; Lin, I-F. Shinn-Cunningham, B. G. and Groh, J. M.  2009.  Reference frame of the ventriloquism aftereffect.  J. Neurosci, 29:13809-13814. (PDF version) 

Auditory Prostheses in the Brain

Over the last several decades, auditory prostheses have revolutionized the treatment of deaf patients. However, nearly all of this success has involved prostheses placed in the cochlea, in the ear, rather than in the brain. A subset of patients cannot use a cochlear implant due to structural problems with the cochlea or due to bilateral damage to the auditory nerve. These patients can only be helped by a brain prosthesis. Yet the current generation of prostheses placed within the brain has not been as successful as cochlear implants: patients typically are not able to understand speech well enough to use the telephone (e.g. Lenarz et al., 2001; Colletti and Shannon, 2005; Tatagiba and Gharabaghi, 2005). This is true both of implants placed in the cochlear nucleus (the more commonly used site) and the inferior colliculus (more recently investigated in a small number of patients). Why these implants have not fulfilled their promise is not clear. This project seeks to shed light on what might be going wrong and how these problems might be solved. The project focuses on an animal model, the non-human primate, and is motivated by the idea that testing in an animal model affords opportunities for combined anatomical, electrophysiological, and behavioral approaches that would be difficult or impossible to conduct in human patients.

Groh lab publications on microstimulation, frequency discrimination, and auditory prostheses :

Ross, D. A. and Groh, JM.  2010.  Effects of microstimulation in the primate inferior colliculus on auditory perception:  implications for the auditory midbrain implant.  Society for Neuroscience Abstracts.

Ross, DA and Groh, JM.  2009.  Performance of monkeys on a frequency discrimination task involving pitch direction (higher vs. lower) judgments.  Society for Neuroscience Meeting. Washington, DC

Groh, JM.  1998.  Reading neural representations.  Neuron, 21:661-664. (PDF version)

Wickersham, I. and Groh, JM.  1998.  Electrically evoking sensory experience.  Current Biology, 8:R412-R414. (PDF version)

Groh, JM, Born, RT, and Newsome, WT.  1997.  How is a sensory map read out?  Effects of microstimulation in area MT on smooth pursuit and saccadic eye movements.  Journal of Neuroscience, 17:4312-4330. (PDF version)

Groh, JM, Born, RT, and Newsome, WT.  1996.  Interpreting sensory maps in visual cortex.  International Brain Research Organization News, 24: 11-12.