From the introduction to The Numerati, on how advances in computing gave rise to a new data-crunching elite. This passage appeared in a BusinessWeek excerpt published on Aug. 28, 2008.
Imagine you're in a café, perhaps the noisy one I'm sitting in at this moment. A young woman at a table to your right is typing on her laptop. You turn your head and look at her screen. She surfs the Internet. You watch.
Hours pass. She reads an online newspaper. You notice that she reads
three articles about China. She scouts movies for Friday night and
watches the trailer for Kung Fu Panda.
She clicks on an ad that promises to connect her to old high school
classmates. You sit there taking notes. With each passing minute,
you're learning more about her. Now imagine that you could watch 150
million people surfing at the same time. That's more or less what Dave
Morgan does.
"What is it about romantic-movie lovers?" Morgan asks, as we sit in
his New York office on a darkening summer afternoon. The advertising
entrepreneur is flush with details about our ramblings online. He can
trace the patterns of our migrations, as if we were swallows or
humpback whales, while we move from site to site. Recently he's become
intrigued by the people who click most often on an ad for car rentals.
Among them, the largest group had paid a visit to online obituary
listings. That makes sense, he says, over the patter of rain against
the windows. "Someone dies, so you fly to the funeral and rent a car."
But it's the second-largest group that has Morgan scratching his head.
Romantic-movie lovers. For some reason Morgan can't fathom, loads of
them seem drawn to a banner ad for Alamo Rent A Car.
Groundhog Day
Morgan, a cheery 43-year-old, wears his hair pushed to the side, as if
when he was young his mother dipped a comb into water, drew it across,
and the hair just stayed there. He grew up in Clearfield, a small town
in western Pennsylvania a short drive from Punxsutawney. Every year on
the second day of February, halfway between the winter solstice and the
vernal equinox, a crowd in that town gathers around a large caged
rodent still groggy from hibernation. They study the animal's response
to its own shadow. According to ancient Celtic lore, that single bit of
data tells them whether spring will come quickly or hold off until late
March. Morgan has migrated as far as can be from such folk predictions.
At his New York startup, Tacoda, he hires statisticians to track our
wanderings on the Web and figure out our next moves. Morgan was a
pioneer in Internet advertising during the dot-com boom, starting up an
agency called 24/7 Real Media. During the bust that followed he founded
another company, Tacoda, and moved seamlessly into what he saw as the
next big thing: helping advertisers pinpoint the most promising Web
surfers for their message.
Tacoda's entire business gorges on data. The company has struck deals with thousands of online publications, from The New York Times to BusinessWeek.
Their sites allow Tacoda to drop a bit of computer code called a cookie
into our computers. This lets Tacoda trace our path from one site to
the next.
The company focuses on our behavior and doesn't bother finding out
our names or other personal details. (That might provoke a backlash
concerning privacy.) But Tacoda can still learn plenty. Let's say you
visit The Boston Globe and read a column on the Toyota Prius. Then you look at the car section on AOL.
Good chance you're in the market for wheels. So Tacoda hits you at some
point in your Web wanderings with a car ad. Click on it, and Tacoda
gets paid by the advertiser—and gleans one more detail about you in the
process. The company harvests 20 billion of these behavioral clues
every day.
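A minimal sketch of the kind of bookkeeping described above, in Python. Everything in it is hypothetical: the cookie IDs, the category labels, the `record_visit` and `pick_ad` helpers, and the two-visit threshold are assumptions made for illustration, not a description of Tacoda's actual system.

```python
from collections import Counter, defaultdict

# Hypothetical store: anonymous cookie ID -> counts of content categories seen.
# No names or personal details are kept, only behavior.
profiles = defaultdict(Counter)

def record_visit(cookie_id, category):
    """Log one page view for an anonymous cookie."""
    profiles[cookie_id][category] += 1

def pick_ad(cookie_id, threshold=2):
    """Return the category to advertise, or None if no interest stands out.

    The threshold of two visits is an arbitrary assumption.
    """
    counts = profiles[cookie_id]
    if not counts:
        return None
    category, hits = counts.most_common(1)[0]
    return category if hits >= threshold else None

# The example from the text: a Prius column, then a car section.
record_visit("cookie-123", "autos")   # column on the Toyota Prius
record_visit("cookie-123", "autos")   # car section on another site
record_visit("cookie-123", "movies")  # an unrelated page view

ad = pick_ad("cookie-123")
print(ad)  # autos
```

Two visits to car pages are enough here to trigger a car ad, while a single movie page is not. A real system would presumably weigh recency and many more signals.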
Sometimes Morgan's team spots groups of Web surfers who appear to
move in sync. The challenge then is to figure out what triggers their
movements. Once this is clear, the advertisers can anticipate people's
online journeys—and sprinkle their paths with just the right ads. This
requires research. Take the curious connection between fans of romance
movies and the Alamo Rent A Car ad. To come to grips with it, Morgan
and his colleagues have to dig deeper into the data. Do car renters
arrive in larger numbers from a certain type of romance movie, maybe
ones that take place in an exotic locale? Do members of this group have
other favorite sites in common? The answers lie in the strings of ones
and zeros that our computers send forth. Maybe the statistics will show
that the apparent link between movie fans and car renters was just a
statistical quirk. Or perhaps Morgan's team will unearth a broader
trend, a correlation between romance and travel, lust and wanderlust.
That could lead to all kinds of advertising insights. In either case,
Morgan can order up hundreds of tests. With each one he can glean a
little bit more about us and target the ads with ever more precision.
He's taking analysis that once ran through an advertiser's gut, and
replacing it with science. We're his guinea pigs—or groundhogs—and we
never stop working for him.
Fat Digital Dossiers
When it comes to producing data, we're prolific. Those of us
wielding cell phones, laptops, and credit cards fatten our digital
dossiers every day, simply by living. Take me. As I write on this
spring morning, Verizon, my cell-phone company, can pin me down within
several yards of this café in New Jersey. Visa can testify that I'm
well caffeinated, probably to overcome the effects of the Portuguese
wine I bought last night at 8:19. This was just in time for watching a
college basketball game, which, as TiVo
might know, I turned off after the first half. Security cameras capture
time-stamped images of me near every bank and convenience store.
And don't get me started on my Web wanderings. Those are already a
matter of record for dozens of Internet publishers and advertisers
around the world. Dave Morgan is just one in a large and curious crowd.
Late in the past century, to come up with this level of reporting, the
East German government had to enlist tens of thousands of its citizens
as spies. Today we spy on ourselves and send electronic updates minute
by minute.
This all started with computer chips. Until the 1980s, these bits of
silicon, bristling with millions of microscopic transistors, were still
a novelty. But they've grown cheaper and more powerful year by year,
and now manufacturers throw them into virtually anything that can
benefit from a dab of smarts. They power our cell phones, the controls
in our cars, our digital cameras, and, of course, our computers. Every
holiday season, the packages we open bring more chips into our lives.
These chips can record every instruction they receive and every job
they do. They're fastidious note takers. They record the minutiae of
our lives. Taken alone, each bit of information is nearly meaningless.
But put the bits together, and the patterns describe our tastes and
symptoms, our routines at work, the paths we tread through the mall and
the supermarket. And these streams of data circle the globe. Send a
friend a smiley face from your cell phone. That bit of your behavior,
that tiny gesture, is instantly rushing, with billions of others,
through fiber-optic cables. It's soaring up to a satellite and back
down again and checking in at a server farm in Singapore before you've
put the phone back in your pocket. With so many bits flying around, the
very air we breathe is teeming with motes of information.
If someone could gather and organize these far-flung electronic
gestures, our lives would pop into focus. This would create an
ever-changing, up-to-the-minute mosaic of human behavior. The prospect
is enough to make marketers quiver with excitement. Once they have a
bead on our data, they can decode our desires, our fears, and our
needs. Then they can sell us precisely what we're hankering for.
Filtering Out the Noise
It sounds a lot simpler than it is. Sloshing oceans of data, from
e-mails and porn downloads to sales receipts, create immense chaotic
waves. In a single month, Yahoo!
alone gathers 110 billion pieces of data about its customers, according
to a 2008 study by the research firm comScore. Each person visiting
sites in Yahoo's network of advertisers leaves behind, on average, a
trail of 2,520 clues. Piece together these details, you might think,
and our portraits as shoppers, travelers, and workers would jell in an
instant. Summoning such clarity, however, is a slog. When I visit
Yahoo's head of research, Prabhakar Raghavan, he tells me that most of
the data trove is digital garbage. He calls it "noise" and says that it
can easily overwhelm Yahoo's computers. If one of Raghavan's scientists
gives an imprecise computer command while trawling through Yahoo's
data, he can send the company's servers whirring madly through the
noise for days on end. But a timely tweak in these instructions can
speed up the hunt by a factor of 30,000. That reduces a 24-hour process
to about three seconds. His point is that people with the right smarts
can summon meaning from the nearly bottomless sea of data. It's not
easy, but they can find us there.
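The speedup figure is easy to check. A short Python sketch of the arithmetic, using only the two numbers given in the text:

```python
# Check the arithmetic in the text: a 24-hour job sped up 30,000-fold.
SECONDS_PER_HOUR = 3600
original_seconds = 24 * SECONDS_PER_HOUR   # 86,400 seconds
speedup = 30_000

reduced_seconds = original_seconds / speedup
print(reduced_seconds)  # 2.88, i.e. "about three seconds"
```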
The only folks who can make sense of the data we create are crack
mathematicians, computer scientists, and engineers. They know how to
turn the bits of our lives into symbols. Why is this necessary? Imagine
that you wanted to keep track of everything you ate for a year. If
you're like I was in the fourth grade, you go to the stationery store
and buy a fat stack of index cards. Then, at every meal you write the
different foods on fresh cards. Meat loaf. Spinach. Tapioca pudding.
Cheerios. After a few days, you have a growing pile of cards. The
problem is, there's no way to count or analyze them. They're just a
bunch of words. These are symbols too, of course, each one representing
a thing or a concept. But they are near impossible to add or subtract,
or to drop into a graph illustrating a trend. Put these words in a
pile, and they add up to what the specialists call "unstructured data."
That's computer talk for "a big mess." A better approach would be to
label all the meats with M, all the green vegetables with G, all the
candies with C, and so on. Once the words are reduced to symbols, you
can put them on a spreadsheet and calculate, say, how many times you
ate meat or candy in a given week. Then you can make a graph linking
your diet to changes in your weight or the pimple count on your face.
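The index-card idea can be sketched in a few lines of Python. The lookup table, the `label` helper, and the week of cards below are all made up for illustration. Filing tapioca pudding under C is our own call, and anything without a label is marked "?".

```python
from collections import Counter

# Hypothetical lookup table turning free-text foods into category symbols.
# M = meat, G = green vegetable, C = candy or sweet.
LABELS = {
    "meat loaf": "M",
    "spinach": "G",
    "tapioca pudding": "C",  # assumption: pudding counts as a sweet
}

def label(food):
    """Reduce a word on a card to a symbol; '?' means no label yet."""
    return LABELS.get(food.strip().lower(), "?")

# One week of made-up index cards: (day, food written on the card).
cards = [
    ("Mon", "Meat loaf"), ("Mon", "Spinach"),
    ("Tue", "Tapioca pudding"), ("Tue", "Meat loaf"),
    ("Wed", "Cheerios"), ("Wed", "Spinach"),
]

# Once the words are symbols, counting them is trivial.
weekly_counts = Counter(label(food) for _, food in cards)
print(weekly_counts["M"], weekly_counts["G"], weekly_counts["C"], weekly_counts["?"])
# 2 2 1 1
```

The pile of words could not be added up. The symbols can, and the resulting counts are ready for a spreadsheet or a graph.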