Posterous DOS attack. Someone should write the story posted on August 12, 2010

News

I have a blog on Posterous, a very interesting and agile Web service. Just now I received an apologetic email from the company's CEO, Sachin Agarwal. He says that over the last six days, Posterous has been victimized by powerful denial-of-service attacks. When the first one hit, last Wednesday, the Posterous team raced to move to new data centers. Another one hit on Friday. It was a crazy six days for Posterous. A tech leader named Vince, Agarwal writes, "worked like a mad man until he passed out on his desk."
Briefly, in 2002/03, I was acting info tech editor at BusinessWeek. If this were a Thursday night back then, I'd be preparing for Friday's story meeting, in which I'd propose a 4-column narrative on the Posterous attacks. Here's a very innovative and popular start-up, I'd argue, whose very existence was threatened by these attacks. (I still don't know where they came from.) It would make a great story. It would give us insights into the dangers surrounding us and a look at a start-up battling them.
But why, another editor would surely ask, should we care about Posterous? (Of course, if the editor in chief appeared to be interested in my story, that question might remain unasked. These meetings were exercises in Kremlinology.) In any case, I'd have to make a case for Posterous. Still, I hope someone somewhere is considering writing a story about it. I'm far too busy with my book to report the story. But it's one I'd like to read.
***
You might be wondering about my blog on Posterous. I started a BusinessWeek alumni network on Ning. When Ning demanded payment for what started out as a free site, I migrated all of the content to Posterous. I look at it not as an active blog, but as a private archive of BusinessWeek history.


Goldblatt exhibit: Eyes on South Africa posted on August 10, 2010

General


We finally got to the Jewish Museum in New York to see the David Goldblatt exhibit. He's a South African who photographed people in his country, white, black and "colored" alike, making their best efforts to live normal lives under apartheid, which created the most abnormal of circumstances. It's a wonderful exhibit if you get the chance.


Jaron Lanier critiques IBM's Jeopardy challenge posted on August 9, 2010

Jeopardy book

Jaron Lanier,
the technologist and author who worries that we're getting carried away
with our machines, includes Watson, IBM's Jeopardy-playing computer, in
his latest broadside at the tech industry. Unfortunately, he
misinterprets the question-answering technology. In a long New York Times op-ed, he writes:
...I.B.M. scientists recently unveiled a “question answering” machine that
is designed to play the TV quiz show “Jeopardy.” Suppose I.B.M. had
dispensed with the theatrics, declared it had done Google one better
and come up with a new phrase-based search engine. This framing of
exactly the same technology would have gained I.B.M.’s team as much
(deserved) recognition as the claim of an artificial intelligence, but
would also have educated the public about how such a technology might
actually be used most effectively.
The challenge for a Jeopardy-playing computer is not simply to carry
out searches based on phrases. Google and other search engines already
do that, at least for simply phrased queries. Far beyond pointing toward Web pages, Watson must generate specific answers,
each with its own confidence ranking. This enables the computer to
calculate if it can risk a bet on the clue. It's a far more difficult
challenge than the one Lanier portrays.
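To make the difference concrete, here is a minimal sketch of the "answer plus confidence" idea. It is not IBM's code: the `Candidate` class, the `should_buzz` function, the 0.5 threshold and the confidence numbers are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer: str
    confidence: float  # estimated chance the answer is right, 0.0 to 1.0 (made-up values below)


def should_buzz(candidates, threshold=0.5):
    """Return the highest-confidence candidate if it clears the threshold, else None."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.confidence)
    return best if best.confidence >= threshold else None


# Hypothetical candidates for a single clue; the numbers are illustrative only.
candidates = [
    Candidate("Macbeth", 0.82),
    Candidate("Banquo", 0.11),
    Candidate("Duncan", 0.04),
]

choice = should_buzz(candidates)
print(choice.answer if choice else "stay silent")
```

A search engine can stop at a ranked list of pages. A Jeopardy player needs one answer and a number it trusts enough to bet on, which is what the threshold stands in for here.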
That said, he raises interesting questions about the marketing of machine
"intelligence." His point, which he elaborated upon in his book, You Are Not a Gadget,
is that we're too often ceding our decision-making to machines and
"swarms" of online communities. By using Amazon
recommendations or software to compose a harmony line, Lanier says
we're abandoning the human brain--by far the most sophisticated known
work of circuitry in the universe. Instead, we're delegating this work
to far simpler algorithms.
The tech industry promotes this, he says, by branding technologies as
"intelligent" and comparing some of them to the brain. It often
anthropomorphizes. This is where Watson comes in. Long before the Jeopardy challenge, IBM had a team of researchers working on Question-Answer technology. By channeling this research toward Jeopardy, the company was (and is) clearly looking for a branding opportunity. And by giving Watson a human name and voice, it anthropomorphizes the machine.
Is this a good thing? Well, putting a computer into a match against humans imposes a series of constraints that push researchers very hard. They must prepare the computer for long and confusing clues, and they have to design the system to come up with an answer it can bet on within three to five seconds. This advances the technology. (Whether or not this pays off commercially is still open to question.)
But let's assume that Watson and its kin produce ever more sophisticated answers for us in coming years. Are we going to accept their responses as "truth" and our own judgments as something less than that? I don't think so. Watson is at its most fascinating and entertaining when it makes mistakes. It is when the machine is struggling or clueless that you most appreciate the astounding complexity of our language and the intricate web of connections in our minds.


Jeopardy challenges: knowing what to look for posted on August 8, 2010

Jeopardy book

I've written the first six chapters of the IBM-Jeopardy book,
and my editor at Houghton Mifflin is having her way with them now.
Since we're on a forced-march schedule, I'll be revising this first
half of the book over the next three months while researching and
writing the second half.
Meantime, I thought I'd throw in occasional blog posts on some of the challenges
that IBM's Jeopardy-playing computer, Watson, faces as it plays the
game. One of the big issues is figuring out from the question what it's
supposed to be looking for. For this, it hunts for what researchers
call a "lexical answer type," or LAT, in the Jeopardy clue.
Take this clue from the July 29 game. Under the category, "Who killed me, Shakespeare?" it reads:
"Banquo--this guy who sent the hitmen, though he also got his own hands bloody." What
the computer should be looking for here is "this guy." That's the LAT.
It's not that easy, because initially it appears as though "this guy"
might be Banquo, and not the guy who sent the hitmen. But Watson has
lots of training on finding the LATs, and the word "this" is a very
significant pointer. (The answer, by the way, is Macbeth.)
Some are much harder. In the category "Hip Hop," Jeopardy players in February grappled with this clue: "Not surprisingly, his father Heraclides & his grandfather were both physicians & his mother was a midwife."
The word is never mentioned, but the LAT we're looking for is both a son and a grandson. Often, when Watson
sees "father," it knows to look for a father. But in this case, the
father is given, which means it must look for a son. I can't go into
all the details of how Watson does this here. (Some cannot be disclosed
until after Watson's showdown with human champions next winter.) My point is that even to figure out what it should be looking for requires an immense amount of work. It involves
scores of different algorithms carrying out hunts and building up a host of
statistical probabilities.
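A toy sketch shows why the word "this" helps and where it runs out. This is not how Watson finds LATs; the `find_lat` function and its single regular expression are my own illustration of the simplest possible heuristic.

```python
import re


def find_lat(clue):
    """Toy heuristic: return the word right after 'this' or 'these', or None if absent."""
    match = re.search(r"\b(?:this|these)\s+([A-Za-z]+)", clue, flags=re.IGNORECASE)
    return match.group(1).lower() if match else None


macbeth_clue = (
    "Banquo--this guy who sent the hitmen, though he also got his own hands bloody."
)
hippocrates_clue = (
    "Not surprisingly, his father Heraclides & his grandfather were both "
    "physicians & his mother was a midwife."
)

print(find_lat(macbeth_clue))      # the pointer word is present
print(find_lat(hippocrates_clue))  # no 'this' anywhere, so the heuristic finds nothing
```

The first clue hands over "guy." The second gives the heuristic nothing to grab, which is where the scores of other algorithms have to take over.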
Compared to us, Watson is a wasteful question-answerer. We know things. It knows nothing, and often must carry out exhaustive hunts even to find a LAT. I might add that in this "Hip Hop"
category, Watson will no doubt be dedicating billions of computing cycles
to the analysis of words and phrases associated with Hip Hop music. As it turns out, the category has
nothing to do with it. The answers to the five clues are:
Hiphuggers, Hop Scotch, a hopper, the Bunny Hop, and, as referenced
above, Hippocrates.


The New York Mosque posted on August 8, 2010

News

I don't usually write about politics or religion, but the controversy around the proposed mosque near Ground Zero in New York has been obsessing me lately.
Some
of the protesters against the mosque refer to it as a "victory" mosque,
meaning that certain Muslims will celebrate it as the site of the
"victory" on 9/11/01. Since there are some 2 billion Muslims on earth,
that's probably a safe bet. But consider this: Thousands of U.S.
Muslims are serving in our armed forces in Iraq, Afghanistan and
elsewhere. Do those people consider 9/11 a "victory"? And are the
mosque critics being fair to these soldiers by grouping them with the people
they're risking their lives, on our behalf, to fight?
A second
point: We are extremely fortunate in this country to live in ethnic and
religious peace. There is prejudice, injustice, and anger, of course,
and incidents of violence. But considering that we're an enormously
diverse nation of more than 300 million people, I think the level of
peace and (relative) harmony here is nothing short of remarkable. Among the Muslim populations in the UK and France, there is a much greater sense of grievance and alienation. This has fomented violent uprisings in the Paris suburbs and has fed violent extremism in the UK. We have been spared these problems, in part because we don't have the same colonial histories in Muslim countries (though we're gaining them now), and because our own Muslim population is multi-ethnic and integrated into American society.
The worst thing we could do is communicate to American Muslims that they are a distrusted and detested minority. It would bring us closer to religion-based conflict in this country. And yet that's the message in the protests against the mosque in New York.


When it comes to tracking customers, few match the Wall Street Journal posted on August 4, 2010

Privacy

Advertisers who track user behavior online always put in this qualifier: It's anonymous. In other words, they track a Web surfer who seems interested in new cars or romantic movies, but not the specific person. In its latest in a series on data tracking, the Wall Street Journal today reports that this anonymity is "in name only." New technologies can come close to zeroing in on the person with just a smattering of data. Peter Eckersley, staff scientist at the Electronic Frontier Foundation, a privacy advocacy group, says 33 bits of information about a person is enough to single that person out.
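The arithmetic behind the 33-bit figure is easy to check. The sketch below is standard information-theory bookkeeping, not necessarily the exact calculation Eckersley used. The world population figure is a rough 2010 estimate I'm assuming, and `bits_of_information` is a hypothetical helper name.

```python
import math

# Rough 2010 world population; an assumption for illustration, not a sourced figure.
WORLD_POPULATION_2010 = 6_900_000_000

# Bits needed to give every person on earth a unique label.
bits_needed = math.log2(WORLD_POPULATION_2010)


def bits_of_information(share_of_population):
    """Bits revealed by a fact that is true of the given share of people (0 < share <= 1)."""
    return -math.log2(share_of_population)


print(f"bits needed to single out one person: {bits_needed:.1f}")
print(f"distinct labels in 33 bits: {2**33:,}")
print(f"bits from a fact true of 1 person in 365: {bits_of_information(1 / 365):.1f}")
```

Thirty-three bits can label more people than there are on the planet, and each fact a tracker learns, such as a birthday or a zip code, chips away at the total.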
Yet the Wall Street Journal, a vigorous customer tracker itself, doesn't have to go to all that trouble. A reader, Mark Naples, pointed out in an email that the Journal, one of the few media outlets with a paywall, collects personally identifiable info online and has the ability to marry it with the behavior data scooped up with cookies.
Another commenter on this site, Michael Sandora, details the same points on his blog, Indigestion. The difference between most data trackers and the Wall Street Journal, he writes, is this:
To a data tracker, I am a cookie number interested in bluegrass
music, jam-bands, Star Wars, Indiana Jones, and reading about digital
media.
To the Wall Street Journal’s subscription service, I am Michael
Sandora, email address: msandora@thisnotmyemail.com, credit card number
####-####-####-####, bluegrass listener, Star Wars fan, and digital
media follower.
What's more, the Journal's privacy policy, dating from 2008, reserves the right to share this valuable trove with "other select companies to send you promotional materials about
their products and services (that is, unless you've told us not to do so...)"
I subscribe to the Journal, use their site, and really don't have problems with their blending my behavioral data with the personally identifiable stuff. I don't mind targeted advertising. And as someone who has lived off of advertising in media my entire career, I want journalism to find a functional business model for the Internet age. But if the Journal is going to write a series on data privacy, they should pay more than passing attention to the practices of their own company.


Privacy loses every time posted on August 2, 2010

Privacy

Monday morning, and before I've finished my first cup of coffee, I see two stories about the fall of privacy. First, the United Arab Emirates is shutting down
BlackBerry data services in their corner of the Arabian Peninsula
because they can't eavesdrop on the heavily encrypted messages. Next, I
see in the Wall Street Journal
(behind paywall) that the advertising side of Microsoft, in 2008,
fought off a plan that would have thwarted cookies
(as a default setting) in the Internet Explorer 8.0 browser. How could
Microsoft sell ads, they argued, with a browser that keeps advertisers from learning
about the Web-surfing patterns of their potential customers?
Both the UAE and Microsoft have reasons to do what they're doing.
The UAE is an oasis of relative freedom in a region that's short of it.
People of all nationalities work in Abu Dhabi and Dubai. I was there
last March. You meet Filipinos, Indians, Kenyans, Europeans, Moroccans. It's a
regular UN. No place would be easier for Al Qaeda to do banking,
organizing, bombing. You can even drive to the UAE from Yemen (though Google
maps, for one reason or another, isn't able to give me the directions).
I'm sure this move by the government angers many in the country (not
least the BlackBerry subscribers), but there's a defensible national
security argument for it. It's at least as solid as the reasoning
behind the 2001 Patriot Act in the U.S.
Microsoft
also had its reasons not to interfere with cookies. It had to do with
the profits in its online business, which struggles mightily against
Google, among others. Given the choice between contracts from paying
advertisers and appreciation of privacy-loving and non-paying Web
surfers, they went with the bucks.
And that's my point. Privacy
almost always loses. People say they care about it, but most of us are really
like the UAE and Microsoft. Given a choice between the promise of
security and privacy, we usually opt for security. (We march like sheep
through the scanners at the airport, letting them ogle and grope us,
and we even tolerate it when they snap, NO JOKES!)
At the same
time, most of us drop our privacy concerns in a snap to
save $5 at the supermarket, with a customer loyalty card, or five minutes at a toll booth. What's more, if we
really cared deeply about privacy on the Internet, more of us would ditch Web mail and enable private browsing on our computers (and go to the trouble of typing a lot more passwords). And we'd heave the
biggest surveillance machines, our cell phones, into the nearest
gutter. I, for one, choose not to.
What's this all mean? We have hand-me-down notions of privacy that don't really fit our modern machines, networks and lives. In coming years, we'll see that some invasions of privacy (like cookies, in my opinion) are largely abstract. But we'll find others that are all too real. (I fear them in areas of police and medical surveillance.) For now, though, privacy loses, just about every time, to economics and promises of safety.


WSJ: Advertiser tracking on the rise posted on July 31, 2010

Datamining

The Wall Street Journal publishes a report today (behind paywall) on cookies, and the growth of consumer-tracking on major Web sites. For the report, they analyzed big Web sites, including their own, and found that many dropped more than 100 cookies into visitors' computers. (The Journal dumps 60 cookies, slightly below the 64-cookie average on the 50 largest sites.) The only big site that doesn't track visitors is Wikipedia.org.
As a reader (and former editor) I found the Journal story maddeningly vague. It says that cookies are on the rise, but doesn't give any historical context. It mentions data-analysis companies that are doing highly detailed work, but doesn't name them. And while it states what type of analysis they could do with this detailed data, it doesn't give examples of how it's being used. To wit:
"Some tracking files can record a person's keystrokes online and then transmit the text to a data-gathering company that analyzes it for content, tone, and clues to a person's social connections..... Data-gathering companies [can] build personal profiles that could include age, gender, race, zip code, income, marital status, and health concerns, along with recent purchases and favorite TV shows and movies."
Why not name a few of these companies, and, while they're at it, ask advertisers how such detailed profiles are being used? Also, note the use of the word "could" in the last sentence. Is there evidence that these unnamed companies are actually building these profiles? We don't know.
I dealt with these issues often while researching The Numerati. The problem here, as in much of the data economy, is the gap between the astonishingly rich trove of data and the undeveloped business model for it. Most companies simply don't know how to put the data to use. How do you deal with millions of detailed consumer profiles when you only have four or ten or 20 different types of ad campaigns? You ignore most of the details and put the people into enormous buckets. (Credit-card companies are a notable exception. They can create thousands of different offers and test them against different groups. But they've been at this since long before the age of cookies.)
Eventually advertisers will learn to make use of this information, if a privacy uprising doesn't shut cookies down. But for now much of this detail we're communicating with our clicks and keystrokes is piling up in data centers, largely ignored.


Ask.com tries different question-answering posted on July 27, 2010

Jeopardy book

One of the common (and mistaken) assumptions about IBM's
Jeopardy-playing computer, Watson, is that it has a database of answers
to Jeopardy clues, and that it's just a matter of finding the right one.
For Jeopardy, which has a staff of writers coming up with puzzlers,
such a database would be impossible. Consider this clue from earlier
this month: Under the category "Jonah's Druthers," it reads:
"Aboard ship in a storm, the men 'cast' these items of chance; Jonah's
came up, but he'd rather it didn't." (I think I would have used "hadn't"
for that last verb.) The answer, which isn't that hard for lots of
humans, is "lots." But can you imagine a database waiting with an answer
for that clue? No, Watson has to do loads of hunting, syntactical
analysis and statistical work in three to five seconds to come up with
answers.
But according to the NY
Times, Ask.com is returning to its question-answering AskJeeves
roots with a new Q/A service. This one, unlike Watson, will index some
500 million questions and answers. Most of these, I'm assuming, will be
simple fact answers to simply-phrased questions, what Watson's builders
call "factoids." How far is it from Philadelphia to Pittsburgh? How much
does a Buick LeSabre cost? Most search engines, including Google, are
already providing answers to these types of questions in the search
results. You can often see them without clicking.
The challenge will be to keep the answers fresh. The price changes on
that Buick. Nicolas Sarkozy won't be the president of France forever. A Q/A
database, to stay relevant, has to be very lively, always checking and
refreshing itself.
***
We're driving back from a wonderful wedding in the suburbs of Detroit.
The honeymooners are now in Paris, and we're in Clearfield, Pa., the
home of Dave Morgan, founder of Tacoda and Simulmedia, and the first character I
introduced in The Numerati. Looking around here for dinner last night, I
can understand why he decamped to Manhattan. Though the scenery in this
part of western Pa., especially at dusk on a summer evening, is gorgeous.


Confessions of a geezer at the movies posted on July 22, 2010

General



I went to the movies last night and came to grips with a challenge I face: If I want to enjoy popular culture, and maybe even thrive in the work place, I'm going to need to achieve some level of expertise in video games. Games will provide the architecture, and increasingly, the interface, for much of what we do.
The movie was Inception, starring Leonardo DiCaprio. It involves an adventure that continues through various levels of dreams. If it had just been about dreams and alternate realities, I think I would have loved it. But this movie behaved like an action video game. Loads of shooting and explosions, lots of buildings and bodies falling, and you're wondering the whole time: What level of the dream are we in? And you learn that certain people, when they're shot, descend to a lower level of dreams.
With all the pyrotechnics and the various levels, it felt like a video game. Leonardo had to descend into a series of alternate worlds and master them in order to reach his prize. As David Denby writes in the New Yorker, it had a thing or two in common with the Greek myth about Orpheus, who has to descend into the underworld to retrieve Eurydice. That would probably make a good video game too.