The Forgetting Machine


  Figure 1.4: Evolution of a Hopfield network

  Starting from an initial configuration (at left), the network evolves by reducing its energy, gradually changing its activation pattern, until it reaches the closest memory (Memory A, in this case).

  The Hopfield model offers us a plausible mechanism for how the brain stores memories—as patterns of neural activations. Whereas the Britannica definition describes memory as a behavioral process, now we can also see it as the product of the physical activity of neurons. In other words, we have built a bridge between psychology and neuroscience, and have begun to peer inside the black box.

  We saw that the Hopfield model assigns memories based on connectivity changes in the network. But how can the brain change which neurons are connected to each other? In short, while each neuron connects with some 10,000 more, not all of these connections are active. Some are constantly reinforced, like a high-traffic highway that offers a convenient link between two places, while others resemble a deserted, potholed street, one that could in principle connect two places but in practice does not. Just as a path that is unused will eventually become overgrown and impassable, so neural connections that are seldom used may disappear. Building on our traffic analogy, changing the connectivity of a network is like blocking off some streets and rerouting cars to others instead, increasing their traffic. These connectivity changes eventually bring about changes in what information these neurons encode. This is known as neural plasticity, and it is the key mechanism used by the brain to generate and store specific memories.

  The idea that memories relate to neural connectivity goes back to Santiago Ramón y Cajal, in the nineteenth century,5 but the most important contribution to this hypothesis was offered by Donald Hebb in 1949, in a book that would become one of the classics of neuroscience.6 Hebb postulated that the joint activation of neurons reinforces the connections between them, a phenomenon usually summarized in the famous phrase, “Neurons that fire together wire together.” This is hardly a far-fetched notion: if two neurons tend to fire at the same time, it is quite likely that this is because they encode similar information, and thus it makes sense that they are connected, and that their connection is reinforced. Similarly, the wiring between neurons that tend to fire at different times is weakened. This process gives rise to the formation of what are known as Hebbian cell assemblies—that is, groups of interconnected neurons that represent different memories. Hebb’s theory was experimentally verified by Tim Bliss and Terje Lømo, who observed that the coactivation of neurons had a durable effect on strengthening their synaptic connections.7 This reinforcement in the wiring between neurons, called long-term potentiation or LTP, lasted for several weeks or even months under repeated stimulation, and provided clear experimental evidence of the mechanism underlying the formation and storage of memories. Confirming this, a great number of experiments have shown that blocking this LTP mechanism (by means of various pharmacological compounds) inhibits the formation of memories.8
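
  To make this mechanism concrete, here is a minimal sketch, in Python, of a Hopfield-style network trained with a Hebbian rule. The network size, the two stored patterns, and the amount of noise are illustrative choices, not values from the studies cited; the point is only that repeatedly updating the units pulls a degraded activation pattern back toward the closest stored memory, as in Figure 1.4.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Two "memories": patterns of +1/-1 activations over N model neurons.
N = 100
memory_a = rng.choice([-1, 1], size=N)
memory_b = rng.choice([-1, 1], size=N)

# Hebbian rule: the connection between two units is strengthened when they
# are active together ("neurons that fire together wire together").
W = (np.outer(memory_a, memory_a) + np.outer(memory_b, memory_b)) / N
np.fill_diagonal(W, 0)  # no self-connections

def recall(state, sweeps=10):
    """Evolve the network; each update lowers (or keeps) its energy."""
    state = state.copy()
    for _ in range(sweeps):
        for i in range(len(state)):            # asynchronous updates
            state[i] = 1 if W[i] @ state >= 0 else -1
    return state

# Start from a corrupted version of Memory A (about 30% of units flipped)...
noisy = memory_a * rng.choice([1, -1], size=N, p=[0.7, 0.3])
# ...and the dynamics settle back toward the closest stored memory.
print("overlap with Memory A:", recall(noisy) @ memory_a / N)  # close to 1.0
```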

  At this point, it seems that we have managed a general answer to one of our questions. The model offered by Hopfield’s networks, along with the concept of neural plasticity, gives us an idea of how the activations of groups of neurons encode memories. However, as usual, this answer gives rise to many further questions. In particular, how can the brain, with its mere three pounds of matter, store so many memories in such rich detail? Or, more explicitly: Do we have enough neurons to account for such a feat?

  The human brain has approximately 100 billion neurons, or 10¹¹, a one followed by 11 zeros.9 For comparison, there are between two and four hundred billion stars in the Milky Way, putting the number of neurons in the brain on the same order of magnitude. To give you an idea of how many this is, if each of your neurons were a grain of sand, you would have enough to fill a cargo truck.10 Another way to think about the number of neurons we have is by their density—there are about 50,000 per cubic millimeter in the cerebral cortex, which means that roughly 50,000 neurons could fit on the head of a pin. As each neuron is wired to another 10,000, this puts the number of connections on the order of 10,000 times 10¹¹, or 10¹⁵, which is roughly the number of grains of sand in a beach 100 meters long.11

  Given all this, it would seem the brain should have no difficulty in storing all of our memories. However, we face two problems. First of all, not all neurons are dedicated to storing memories. In fact, neurons with such a function may make up only a small fraction, since a significant number of neurons must also be devoted to visual and auditory processing, motion control, decision making, emotions, and so on. Second, theoretical calculations show that the number of memories that can be stored by a given number of neurons is limited because of interference effects: in short, if there are too many memories, they begin to become mixed up with one another. Calculations estimate that, given N neurons, a model like Hopfield’s can store some 0.14N memories without interference.12 Hence, if we assume that, for example, just 1 percent of the brain’s 100 billion neurons are involved in the encoding of memories,13 and considering that the total number of memories that can be stored would be only about 14 percent of this number, this gives us a total of approximately 10⁸, or 100 million memories. Of course we must take these estimates with a grain of salt, since the fraction of neurons devoted to memory storage could well be even less than 1 percent, or the brain might not store memories in the way Hopfield proposed but using some less efficient system, in which case our memory capacity would be further reduced. But even if the number of memories we can store were to be one or two orders of magnitude smaller—so about a million—the number seems large enough to be sufficient.
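
  As a back-of-the-envelope check of this estimate, here is the arithmetic spelled out, using the same assumptions as above (1 percent of neurons devoted to memory, and a Hopfield-like capacity of 0.14 memories per neuron):

```python
total_neurons = 100e9       # ~10^11 neurons in the human brain
memory_fraction = 0.01      # assume just 1 percent are devoted to storing memories
capacity_per_neuron = 0.14  # Hopfield-type capacity: ~0.14 memories per neuron

memories = total_neurons * memory_fraction * capacity_per_neuron
print(f"{memories:.1e}")    # 1.4e+08, i.e. on the order of 10^8 memories
```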

  Figure 1.5

  The number of neurons in the human brain is on the same order as the number of stars in the Milky Way. The image of the Milky Way (left) was taken by the European Southern Observatory. The image of neurons (right) was taken by Julieta Campi in my laboratory.

  Alas, the limitation of the preceding arguments is that there is a deep chasm between understanding how the brain can use, say, Hopfield networks to encode single abstract entities like Memory A, Memory B, etc., and understanding the mechanism whereby it stores memories like those recalled by Roy Batty as he faced Deckard, or the many nuances and specific details we remember from a party with friends. In other words, we believe we remember our past as a movie that we can relive through memory. But how does the brain manage to store all these “movies” in such detail? How do we extrapolate from the mechanism for storing specific concepts (Memory A, Memory B) to the process by which the brain does something much more complex, like reconstructing lived experiences? Moreover, even specific concepts have myriad forms and nuances. My mother in her red ball gown is quite different from my mother wearing an apron in the kitchen, or a yellow T-shirt on the terrace. We’ve seen how Hopfield networks might help us match any of these to a memory of “my mother,” but many of these nuances are also stored as memories in their own right. Each of these memories unfolds into many others, as my mother in her yellow T-shirt on the terrace may be kneading pasta, drinking coffee, or marinating meat for a barbecue. This is what is called combinatorial explosion: each concept gives rise to a multiplicity of more specific concepts, each of which in turn subdivides into many others, and so on.

  So how do we do it? How do we store all this information? The surprising answer is that we basically do not. We remember almost nothing. The idea that we remember a great deal of the subtleties and details of our experiences, as if we are playing back a movie, is nothing more than an illusion, a construct of the brain. And this is perhaps the greatest secret in the study of memory: the astounding truth that, starting from very little information, the brain generates a reality and a past that make us who we are, despite the fact that this past, this collection of memories, is extremely slippery; despite the fact that the mere act of bringing a memory to our consciousness inevitably changes it; despite the fact that what underlies my awareness of a unique, immutable “self” that makes me who I am is constantly changing. This is precisely the subject of the following chapters, but before delving into the details of how little we remember, we begin by analyzing how much information from the external world—in particular, how much visual information—we perceive at all.

  Chapter 2

  HOW MUCH DO WE SEE?

  In which we introduce information theory, analyze the amount of visual information transmitted to the brain, and discuss the resolution of the eye, eye movements and their measurement using eye trackers, and the perception of art

  A group of researchers at the University of Pennsylvania asked the following question: How much information gathered by our eyes is transmitted to the brain? To find out, they used guinea pigs, and recorded the activity of their retinal neurons as they were shown videos of natural scenes—the kind of visual information the eyes usually handle.1

  To interpret the results of this experiment, we must first define what “information” is and understand how it can be measured by recording the firing of neurons. Let us imagine, for example, that the video viewed by our guinea pig subjects contained at any given time one of only two possible objects: a face or a plant. Mathematically, we can represent the content of the video at a given moment using a single binary digit,2 or a bit of information: 0 if the object is a face and 1 if it is a plant. Imagine now that the video can contain one of four possible objects: a face, a plant, an animal, or a house. In this case, representing the number of possible options requires two bits, or, in other words, two binary numbers: for example, 00 may represent the house, 01 the animal, 10 the plant, and 11 the face.3 If a neuron fires with different intensity in response to each of the four objects, then from its firing we can discern which object is present in the video—and can say that the neuron provides two bits of information, which is as much as can be extracted from this video at a given moment. If the neuron fires with the same intensity to, for instance, “face” and “animal,” and with a second, different intensity to “house” and “plant,” then from its firing we can narrow the identity of the object to one of a group of two, meaning that in this case, the neuron provides one bit of information—half of what it was possible to extract from the video containing two bits of data. These principles, and the calculations they enable us to carry out, are widely used in neuroscience and make up what is called information theory, a discipline developed by Claude Shannon in the mid-twentieth century to study the coding and transmission of information.4
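
  For readers who like to see the numbers, the face/plant/animal/house example boils down to a short entropy calculation; the probabilities below simply assume that the four objects are equally likely, which is where the figure of two bits comes from.

```python
import numpy as np

def entropy_bits(probabilities):
    """Shannon entropy: the average number of bits needed to identify one outcome."""
    p = np.asarray(probabilities, dtype=float)
    return float(-np.sum(p * np.log2(p)))

# Four equally likely objects in the video: face, plant, animal, house.
print(entropy_bits([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits -> two binary digits

# A neuron that cannot tell "face" from "animal" (or "house" from "plant")
# only narrows the object down to one of two groups: 1 bit of information.
print(entropy_bits([0.5, 0.5]))                # 1.0 bit
```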

  Information theory underpins everything from the internet to cellular technology, and today, the idea of measuring information in bits is commonplace. A group of eight bits, or a byte—initially representing the number of bits needed to encode the 256 characters of the extended ASCII code—is the unit by which we measure the storage capacity of a hard drive, some of the most frequently used measurements being the kilobyte (KB: one thousand bytes), the megabyte (MB: one million bytes), the gigabyte (GB: one billion bytes), and the terabyte (TB: one trillion bytes). The color-level resolution of a computer monitor or a digital image, technically known as “color depth,” is also expressed in bits. If a monitor displays each pixel in only one color, either on or off (like the vintage green phosphor monitors of the kind used in The Matrix), then its color resolution is obviously one bit per pixel. A black-and-white monitor uses 8 bits (or one byte) per pixel, corresponding to 256 shades of gray, while a color monitor can have 24 bits (or three bytes) per pixel, one byte for each primary color (red, green, and blue) used to generate the rest of the color palette.5

  Figure 2.1 shows four versions of a photo of Claude Shannon, each with a different resolution. The photo at top left is a 30 × 30 grid of pixels with one-bit color resolution and has the least information (30 × 30 × 1 = 900 bits); in it we can barely perceive a silhouette. To the right is a 300 × 300 grid in which the details of the photo are more readily recognizable; the information in this photo is 300 × 300 × 1 = 90,000 bits, or about 10 KB. Each image on the bottom contains the same number of pixels as the image immediately above it, but now each pixel has a color resolution of 8 bits. The photo at bottom right has 720,000 bits of information (300 × 300 × 8), or about 0.1 MB, making Shannon’s face clearly recognizable.
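
  The bit counts quoted above follow directly from multiplying the number of pixels by the color depth; a short sketch, using the image sizes of Figure 2.1:

```python
def image_bits(width_px, height_px, bits_per_pixel):
    """Information content of an uncompressed image, in bits."""
    return width_px * height_px * bits_per_pixel

print(image_bits(30, 30, 1))        # 900 bits: barely a silhouette
print(image_bits(300, 300, 1))      # 90,000 bits, about 10 KB
print(image_bits(300, 300, 8))      # 720,000 bits, about 0.1 MB
```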

  Figure 2.1

  Images of Claude Shannon with 30 × 30 pixels (left) and 300 × 300 pixels (right), using one bit of color per pixel (only black and white, top) and eight bits per pixel (256 shades of gray, bottom)

  With our terminology firmly in place, let us return to the experiments by the researchers at the University of Pennsylvania, and our original question: How much information do the eyes transmit to the brain? Using information theory to compute how much information the neurons had about the videos, the investigators concluded that, on average, the retinal ganglion neurons, which transmit visual information to the brain through the optic nerve, encode between six and thirteen bits of information per second. Considering that the retina of a guinea pig contains about 100,000 of these ganglion neurons, and assuming each of these encodes information independently, this means that the brain of a guinea pig receives approximately one million bits of information per second. Finally, given that the human eye has ten times as many ganglion neurons as that of the guinea pig, the researchers were able to estimate that the human eye transmits information to the brain at around 10 million bits per second, or 10 Mbps, a number that may sound familiar since it’s the transmission speed of a standard Ethernet connection.
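
  The chain of estimates can be retraced in a few lines; the 10 bits per second per neuron used below is simply the midpoint of the six-to-thirteen range reported in the study.

```python
bits_per_neuron_per_s = 10              # midpoint of the reported 6-13 bits/s
guinea_pig_ganglion_cells = 100_000
human_ganglion_cells = 10 * guinea_pig_ganglion_cells

guinea_pig_bits_per_s = bits_per_neuron_per_s * guinea_pig_ganglion_cells
human_bits_per_s = bits_per_neuron_per_s * human_ganglion_cells

print(guinea_pig_bits_per_s)            # ~1,000,000 bits per second
print(human_bits_per_s / 1e6, "Mbps")   # ~10 Mbps, like a standard Ethernet link
```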

  Let us dwell upon this result a little longer. The transmission of visual information to the brain occurs at about one megabyte per second. If we assume that on average we are awake sixteen hours a day, this means that the brain receives a total of 57.6 GB of information per day (3,600 seconds × 16 hours × 1 MB). In other words, every two and a half weeks we could fill a one-terabyte hard drive with the content of what we have seen. But does the eye transmit everything within its reach?
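
  Spelled out, the daily tally looks like this, using the rounded figure of one megabyte per second from above:

```python
megabytes_per_second = 1           # ~10 Mbps rounded to ~1 MB per second
seconds_awake_per_day = 16 * 3600  # sixteen waking hours per day

gb_per_day = megabytes_per_second * seconds_awake_per_day / 1000
print(gb_per_day)                  # 57.6 GB of visual input per waking day
print(1000 / gb_per_day)           # ~17 days (about two and a half weeks) to fill 1 TB
```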

  In one of his much-anticipated presentations of Apple’s newest gadgets—one of the last such presentations he made as CEO of the company—Steve Jobs introduced the iPhone 4. One of the phone’s main innovations was the “Retina” display, which is now standard in Apple products from the iPad to MacBooks. Jobs announced that Retina displays had a resolution of 326 pixels per inch (ppi), fitting four times as many pixels into the same screen area as the previous iPhone, and exceeding the 300 ppi that, according to Jobs, is the maximum that can be resolved by the human retina with the iPhone held at a standard distance of between 10 and 12 inches (some 30 cm). In other words, the eye can barely distinguish individual pixels in an image rendered at a resolution of 300 ppi from 30 cm away.6 If I stand 30 cm away from the whiteboard in my office, my field of view (that which I can see if I focus on a given point) is around 30 inches by 20 inches (75 cm horizontally by 50 cm vertically). Thus, in principle, the number of pixels that my eye could perceive in my field of view is 54 megapixels (that is, 30 inches × 300 ppi × 20 inches × 300 ppi = 54,000,000 pixels), about ten times the resolution of the iPhone 4’s digital camera. (I say “in principle” because there is a flaw in the calculation—more about this later.) Of course, if I stand at a distance greater than 30 cm, my field of view expands, but this expansion is balanced by the loss of resolution that results from the increased distance. As we saw before, the color of each pixel can be defined using three bytes, which means that 54 megapixels corresponds to 54 × 3 = 162 MB of memory. To get a feel for image continuity, a standard digital video camera captures 30 frames per second. Thirty frames per second at 162 MB per frame gives a total of 4.8 GB processed by my eyes per second. The exact value of this number is irrelevant; what matters is the order of magnitude: gigabytes per second. According to the researchers at the University of Pennsylvania, remember, the amount of information the eyes transmit to the brain is about a megabyte per second. This means that there is a three-order-of-magnitude reduction between the information that could, in principle, be transmitted by the eyes and the information that reaches the brain. In other words, the brain “sees” only about one thousandth of the information in its field of view.
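
  Here is the whiteboard calculation written out; the field-of-view dimensions, color depth, and frame rate are the ones assumed above.

```python
ppi = 300                     # finest detail resolvable at ~30 cm
width_in, height_in = 30, 20  # field of view at the whiteboard, in inches
bytes_per_pixel = 3           # 24-bit color
frames_per_second = 30        # rate of a standard video camera

pixels = (width_in * ppi) * (height_in * ppi)
mb_per_frame = pixels * bytes_per_pixel / 1e6
gb_per_second = mb_per_frame * frames_per_second / 1000

print(pixels / 1e6)           # 54.0 megapixels
print(mb_per_frame)           # 162.0 MB per frame
print(gb_per_second)          # ~4.9 GB/s, vs. ~1 MB/s actually sent to the brain
```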

  Why this enormous difference? Was there an error in our arithmetic?

  The above numbers are mathematically correct—but they implicitly assume that the eye processes information with a uniform resolution of 300 ppi throughout the entire field of view. Assuming a uniform resolution makes sense, given that, for example, right now I am able to see everything in front of me in full detail. Or, at least, I think I am. But the ability to see the external world in detail is nothing more than an illusion, a construct of the brain. What we actually see in detail is what lies at the center of our gaze, within a visual angle of one or two degrees. A small (less than 2 mm) depression at the center of our retina, called the fovea, is responsible for producing our area of clear, sharp vision—an area roughly the size of our thumbnail at the end of our outstretched arm.

  This fact, surprising as it sounds, is easy to corroborate. One need only extend both arms with the thumbs next to each other and pointing up; focusing on one of the two thumbnails makes it all but impossible to notice any detail of the other (if you doubt this, write a few letters on each nail and try reading them). Moreover, if we keep our gaze fixed on the first thumb and move the other arm a few inches to the side, we cannot even see the second thumb in detail, let alone its nail. How is it, then, that we see the world in front of us with such seeming clarity? The illusion arises from the fact that our eyes continually jerk from side to side, making unconscious movements called saccades.