Wednesday, November 30, 2011

AI Class 6 -- Game Theory

The Prisoner's Dilemma

Here's the scam: Alice and Bob are arrested and separately offered a plea bargain for testifying against the other. The payoff to each of them is different depending on what the other person does (we call refusing to testify Cooperation and ratting the other out Defection):

If both Cooperate, they both get off scot-free;
If both Defect, they split a small penalty;
If one Defects and the other Cooperates, the Defector gets a small reward and the Cooperator gets jail time.

This is encoded in the Prisoner's Dilemma payoff matrix:

                    Alice:Defect  Alice:Cooperate
     Bob:Defect     A=-1, B=-1    A=-5, B=1
     Bob:Cooperate  A=1,  B=-5    A=0,  B=0

(Note that this is not a zero-sum game because the two payoffs don't add up to zero in every cell...but I think that's a different story.)

There are three standard Strategy types in Game Theory:

Dominant: A move that does better than any other, no matter what the other player does. In this case it is Defect because, if Alice Defects, she will get -1 (versus -5 for Cooperating) if Bob also Defects, or 1 (versus 0) if he Cooperates.

Equilibrium: Neither player can benefit from making a unilateral switch to a different move. In this case the Equilibrium is Both Defect because either player will have a worse payoff (-5 instead of -1) if they change to Cooperate on their own. This is the Nash Equilibrium, named after John -- played by Russell Crowe in A Beautiful Mind -- Nash...

Pareto Optimum: An outcome where no other outcome would make one player better off without making the other worse off. Both Defect is not an Optimum because both players would do better with Both Cooperate (0 each instead of -1). Both Cooperate is a Pareto Optimum for Prisoner's Dilemma because they both get 0, and the only way for one of them to get more (1 for Defecting) is to stick the other with -5.
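The three strategy types can be checked mechanically against the payoff matrix. What follows is a minimal sketch, not anything from the class: the function names are made up for illustration, and it assumes the usual textbook definitions (strict dominance, no profitable unilateral switch, and no other outcome that helps one player without hurting the other). Under that last definition the two lopsided outcomes also pass the Pareto test, since the Defector's reward can't be bettered; Both Cooperate is the only one of the three where neither player ends up in jail.

```python
from itertools import product

MOVES = ("Defect", "Cooperate")

# Payoffs from the matrix above, keyed by (alice_move, bob_move) -> (alice_payoff, bob_payoff).
PAYOFF = {
    ("Defect", "Defect"): (-1, -1),
    ("Cooperate", "Defect"): (-5, 1),
    ("Defect", "Cooperate"): (1, -5),
    ("Cooperate", "Cooperate"): (0, 0),
}

def payoff(player, mine, theirs):
    """Payoff to `player` (0 = Alice, 1 = Bob) for playing `mine` against `theirs`."""
    key = (mine, theirs) if player == 0 else (theirs, mine)
    return PAYOFF[key][player]

def dominant_moves(player):
    """Moves that beat every alternative no matter what the other player does."""
    return [
        m for m in MOVES
        if all(payoff(player, m, t) > payoff(player, alt, t)
               for alt in MOVES if alt != m
               for t in MOVES)
    ]

def nash_equilibria():
    """Outcomes where neither player gains by switching moves unilaterally."""
    result = []
    for a, b in product(MOVES, MOVES):
        alice_ok = all(payoff(0, a, b) >= payoff(0, alt, b) for alt in MOVES)
        bob_ok = all(payoff(1, b, a) >= payoff(1, alt, a) for alt in MOVES)
        if alice_ok and bob_ok:
            result.append((a, b))
    return result

def pareto_optima():
    """Outcomes that no other outcome improves for one player without hurting the other."""
    def improves(new, old):
        no_worse = all(n >= o for n, o in zip(new, old))
        better = any(n > o for n, o in zip(new, old))
        return no_worse and better
    return [
        outcome for outcome, pay in PAYOFF.items()
        if not any(improves(other, pay) for other in PAYOFF.values())
    ]

print("Dominant move for Alice:", dominant_moves(0))
print("Dominant move for Bob:  ", dominant_moves(1))
print("Nash equilibria:        ", nash_equilibria())
print("Pareto optima:          ", pareto_optima())
```

Both players come out with Defect as the dominant move and Both Defect as the only equilibrium, which is the only outcome that fails the Pareto test.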

So, in general, if Alice doesn't trust Bob and thinks she will never see him again, her best option is to Defect: Even though there is the possibility of getting a slap on the wrist, she doesn't risk getting thrown in the slammer. But if you play this game over and over with the same person, Defecting every time leads to a worse overall outcome for both players than Cooperating every time. Therefore, if you trust your partner to not bail on you, you should both play Both Cooperate, the Pareto Optimum.

The problem (a talk on this topic is what inspired my GI's Dilemma kinetic sculpture) is this: If you know the number of plays you will be engaging in, it is tempting to Defect on the last play in order to get the reward and a slightly higher overall payoff. Of course your opponent also knows this, so you need to Defect one play earlier to catch them out. This strategy cascades down through the plays, all the way back to the first one, and often makes it impossible to ever play the Pareto Optimum.
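To put numbers on that cascade, here is a rough sketch that totals the payoffs over a fixed number of plays. The game length of 10 and the helper names are assumptions picked for illustration; the per-play payoffs are the ones from the matrix above.

```python
# Per-play payoffs keyed by (alice_move, bob_move) -> (alice_payoff, bob_payoff).
PAYOFF = {
    ("D", "D"): (-1, -1),
    ("C", "D"): (-5, 1),
    ("D", "C"): (1, -5),
    ("C", "C"): (0, 0),
}

def moves(rounds, defect_from):
    """Cooperate until play number `defect_from` (0-based), then Defect for the rest."""
    return ["C" if i < defect_from else "D" for i in range(rounds)]

def total(alice_moves, bob_moves):
    """Sum each player's payoff over a sequence of plays."""
    alice = sum(PAYOFF[(a, b)][0] for a, b in zip(alice_moves, bob_moves))
    bob = sum(PAYOFF[(a, b)][1] for a, b in zip(alice_moves, bob_moves))
    return alice, bob

ROUNDS = 10  # hypothetical game length, known to both players

# Both Cooperate on every play.
print(total(moves(ROUNDS, ROUNDS), moves(ROUNDS, ROUNDS)))          # (0, 0)
# Alice Defects on the last play only; Bob Cooperates throughout.
print(total(moves(ROUNDS, ROUNDS - 1), moves(ROUNDS, ROUNDS)))      # (1, -5)
# Bob plans to Defect on the last play, so Alice Defects one play earlier.
print(total(moves(ROUNDS, ROUNDS - 2), moves(ROUNDS, ROUNDS - 1)))  # (0, -6)
# The cascade runs all the way back: both Defect on every play.
print(total(moves(ROUNDS, 0), moves(ROUNDS, 0)))                    # (-10, -10)
```

Each step back looks like a small win or a dodge for whoever makes it, but the end of the line is -10 apiece instead of 0.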

The strange thing about this is that it leads to a much worse overall payoff for both players. And this is what is called Rational in AI...

So my question is, just what exactly is Rational? Is it covering your ass? Or is it getting the best outcome? I wonder if Rational robots would be able to see past the infinite-regression of cascading Mutually Assured Defection to a landscape where Optimal Cooperation was just assumed?

Monday, November 21, 2011

AI Class 6, Midterm Exam: !!100%!!

I guessed right on the Philosophy (both the questions in my previous post were True because in a completely random Environment any Agent behavior is considered Rational -- great to find that out during the Test, eh?), tortured the Logic to death in the correct way, and stumbled in the right direction through the Conditional Independence exercise. Hard to believe but apparently True: What's the Probability of that?

As an after-the-fact proof, here's the Exam and my notes with the answers I decided upon...

AI Class 6, Midterm Exam...ugh

Actually the Midterm is not so bad really...but maybe I should wait until the scores come out before saying that.

It does get off to a rough start with a couple of true/false philosophy of set theory questions:

"There exists (at least) one environment in which every agent is rational."
"For every agent, there exists (at least) one environment in which the agent is rational."

Which seem to be throwing folks into un-bounded loops. It may be that they are actually Logic questions in Agent/Environment clothing. As I noted before, the word "rational" was only used in the summary of what we learned in lesson one and the only way we would have learned its meaning was by inference. Added to this is the fact that the scope of all possible Agents and Environments is not defined anywhere that I can find: Does it include the empty set? If so, what would be considered "Rational" behavior? I guess we'll find out when the Exam is graded.

Otherwise the questions are pretty well specified and -- again modulo my jumping the gun -- don't require a lot of tedious calculation à la Professor Thrun's video pleasures. But let me just say now, "I hate Logic", so that when I fail all 6 sub-parts of Question 12 I can also say, "I told myself so". I tried to use the demo code's DPLLDemo.java to validate my mental gymnastics and got different answers depending on how I formatted the questions. So maybe it's even beyond the abilities of the computer to solve -- or else I should start filing bug reports.

At the half-way mark we finally seem to be getting into interesting territory with Markov Models. This is what I tried to do at SFI those many years ago so maybe I'll come to understand what I was on about then. But in general I think I've discovered an unfortunate truth:

...I don't actually like Artificial Intelligence...

The trouble is, so far in this very basic class, AI is being used to find simple -- validated -- solutions to fairly complicated problems, generally using the excruciatingly tedious iterated algorithms for which computers were invented. But what I'm interested in is getting Complex results from Simple systems. And that is my working definition of Complex: it is not Complicated for that very Simple reason.

But I guess it's a good thing to torture myself with the Complicated for a while so I know what I'm not missing....

Tuesday, November 8, 2011

Now for a little change of pace...

We had our first significant snowfall last night. Not much and mostly melted off by now...but...up in the hills, in our Urban Wildland Interface Zone, things get pretty dicey pretty quick because there's no direct sunlight until summer.

So... Around noon today someone coming down the hill managed to drive off the road. By off the road I mean down a steep embankment end-over-end for at least 100 yards, coming to rest -- fortunately -- on his wheels about 100 yards from another -- fortunately -- more accessible road. Unfortunately the vehicle was not visible through the dense vegetation from either vantage, but -- fortunately again -- the guy he hit on his way down was able to point out where it happened.

Photo: Tom Chilton -- Hondo VFD

After about a half-hour of hiking around and yelling to each other we found the car and patient, who, thanks to modern vehicle restraint systems, was not seriously injured and had better vital signs than me. Our medics packaged him up on a backboard in a Stokes litter and we belayed him down the rest of the embankment that he hadn't managed to negotiate with his vehicle and onto a Big Wheel. The Big Wheel is exactly that, a Big Wheel with a bunch of handles onto which you strap a litter, so 4 or 6 folks can hump a patient out of some god-forsaken location (we used it this summer to get a guy having a heart attack an hour into a National Forest trail back to civilization-such-as-it-is because the chopper couldn't find anywhere to land).

We loaded the patient into a City of Santa Fe Med unit -- our County guys were busy and probably kicking themselves for missing the fun -- and shipped him off to the Horsepiddle. I'll bet the City guys were happy about the Big Wheel thing, but they are too professionally pre-occupied to thank mere volunteers.

Way better than doing homework, to which I must now return...

Friday, November 4, 2011

Local Color IV -- NM state division

Tourists Visit N.M. for The Beaches?

That's the title of today's ABQ Journal article about re-branding New Mexico to attract more tourists' money. Our new Tourism Secretary Monique Jacobson spent eight months of state funding doing focus groups with folks from elsewhere, only to discover that they might like our beaches if they weren't so boring. So we have some new principle brands to tout (underlining is mine):

The focus groups were used to help develop the basic principles to build the brand from, she said. The five the state decided on are authenticity, discovery, connection and adventure. Others, like “from the earth” were discarded based on feedback from the groups.

Even though we have a high percentage of sciency types around here, mathematical accuracy is not on our list...

Wednesday, November 2, 2011

Naturally Stupid II

GDMIT, GDMIT, GDMIT...I got 87% on the last AI Class homework because I mis-copied one number which threw off the Linear Regression calculations -- I did all the work right, just got the wrong results.

If this was not the automated educational automat of the future I could probably go to the professor and plead my case and maybe even get half credit for being dumb but not stupid. Instead I need to learn to act more like a computer and not make mistakes.

That being said, if you want to see how I managed to get the hard parts of Homework 3 correct, get the spreadsheet here. Note that I've corrected the mistaken regression number so the calculation matches what the computers at Stanford believe.

Tuesday, November 1, 2011

A Modest Proposal, III

For the Tate Modern Turbine Hall... Just in case I get the commission, and my expected MacArthur:

Polish the floor and let two Roomba robots loose in the space. I might have to modify the 'bots to have "people sensors" -- presuming that they don't already, this is part of the research leading up to the installation -- such that they would be attracted to the crowds but also try to keep a discreet distance, thus showing signs of distress if cornered. Braitenberg in Service of the Arts!

And I bet the sound would be great if you could silence the hoi polloi.

Ich bin un Bricoleur