Saturday, May 24, 2008
Build Your Own Poker Cheat Bot Part Two!
Here is John Devlin's second-part posting on building his poker bot. Very interesting stuff, and if I ever do write the sequel to "Dirty Poker," which would of course be called "Dirty Poker 2," I will be asking Devlin for his participation in the chapter on online poker bots.
Devlin:
Last week, I posted about how I built a working, real-money online poker bot. This week, we'll discuss how to get started building your own bot, address some specific bot-building techniques, talk about the woeful state of online poker user privacy, respond to a few reader questions and comments, and close out with some recommended reading.
This series is both a detailed manual on how to build and make money with an online poker bot, and a story of how two guys from the Dallas poker underground managed to realize such a bot in practice.
We'll be discussing everything necessary to build what's quickly becoming the de facto standard in online poker botting: a sophisticated parallel processing rig incorporating collusion AI, massive hand history databases, and real-time information flow to dominate the opposition. This is where the world is headed:
NOTE: While I believe building and running an online poker bot to be neither cheating nor unethical, building a ring of such bots is a different story. As a long-time poker player, I absolutely don't condone cheating people out of their money by colluding in online poker or any other game. We're going to discuss things like bot collusion not so we can (all of us) go out and build colluding bot rings, but because these are the techniques people are using right now, today and tomorrow, to gain an unfair edge in online poker.
In order to build, or learn how to defeat, such a contraption, we're going to have to cover a lot of ground:
Artificial Intelligence. Neural nets, genetic algorithms, rules engines, decision trees.
Poker Strategy. Not the stuff you see on TV. Expert EV-driven poker strategy as it exists today, at the tables, in books, and on the forums.
Input Simulation. How to generate an appropriately timed and positioned stream of mouse moves, clicks, and keyboard input.
Operating Systems. Code injection, API hooking, kernel objects, multi-threading, DLLs. In general: forcing Windows to do what you want.
Reverse Engineering. How to reverse engineer poker client applications and the network data streams they rely on.
Probability and Statistics. Bayes' theorem, probability distributions, confidence intervals, and other goodies.
If you're new to poker, and particularly if you're new to programming, this will be a little bit of a trial by fire. Building a bot is basically the programmer equivalent of joining a Fight Club.
A guy who came to Fight Club for the first time, his ass was a wad of cookie dough. After a few weeks, he was carved out of wood.
Sort of. Read on.
Why are so many people interested in building bots?
Last weekend I got a phone call from a friend (a guy I'll have more to say about later), informing me that How I Built a Working Poker Bot, Part 1 had hit the front page of Digg, reddit, del.icio.us, and other sites, and that Coding the Wheel was getting enough traffic to occasionally throttle the server.
Since then, readers have posted upwards of a thousand comments to the story, here and around the net, and I've gotten a couple hundred emails ranging from technical diatribes to job offers to good old-fashioned hate mail.
The response has been unexpected. Thanks to all who took the time to read the first article, and to those who've joined the discussion so far. In hindsight it seems obvious, but many, many people have a vested interest in building, or discouraging the building of, software tools which emulate human behavior. It goes way beyond online poker. Then again, in online poker the profit motive is compelling:
A successful online poker bot is worth hundreds of thousands of dollars even if all your bot does is break even. (I do not agree with this point) Sensationalistic? Maybe, but it's grounded in the very dull, non-sensational mathematics of rakeback and player promotions, which I've outlined below, and which we'll revisit in future posts.
Botting Rule #232: Wherever mouse clicks or other human-computer interactions are worth money, hordes of people will try to write software to simulate those interactions.
Why online poker clients are basically spyware
I want to preface this by saying: the online poker sites aren't evil. They're trying to protect their business, and that's perfectly normal. But I think they've strayed from the beaten path when it comes to respecting the privacy of their users.
Many online poker players have no idea that major poker sites silently install a browser plugin which has access to every page you visit in your browser, and potentially every piece of information on that page (the text of private emails; credit card numbers; user login information; etc):
Many online poker players have no idea that poker clients can and will snoop around on your system, potentially viewing sensitive data:
By examining your list of running processes
By reading body and titlebar text from every window you have open
By taking occasional screenshots
By snooping around on your file system, and in the system registry
By doing who knows what else, since there's zero oversight.
This is the definition of spyware. These "safeguards" constitute a basic invasion of your digital privacy and you should be angry about it. It doesn't matter whether the sites actually collect, pay attention to, store, or transmit that information. It doesn't matter if they do it with the best of intentions. It doesn't matter if they tell you they're going to do it in the EULA. They quite simply have no business doing it at all, and if they're going to do it, they should be doing a better job.
In other words, online poker is really the worst of both worlds: extreme invasion of privacy unheard of outside the realm of spyware, and zilch to show for it - zero effective safeguards against bots or other supposedly malicious software.
And yes: it's true that many sites will recognize well-known botting software such as WinHoldem. But they currently do very little in the way of preventing home-grown bots.
Botting Rule #472: The biggest advantage of a poker bot is its obscurity.
Using Spy++ or Winspector to get basic information from the poker client
Spy++ and Winspector are useful window analysis and debugging tools which allow you to view window properties and messages for any window in the system, including poker client windows:
Spy++ has extracted the handle (HWND), caption, class, style, location, and other properties of a Poker Stars table window. We can also drill down and find its child windows, parent window (if any), window class (PokerStarsTableFrameClass, in case you're curious) and other assorted information.
We can also use Spy++ to snoop on the exact messages received by a given window:
Spy++ is included in all versions of Visual Studio, as part of an MSDN subscription. Winspector has a larger feature set than Spy++ and is available as a free download from http://www.windows-spy.com/.
Later, we'll be using this tool to help us figure out what the poker client is doing, beneath the hood. It's also an extremely useful all-purpose window debugging tool, and an important part of a bot developer's (or any Windows developer's) toolkit.
Why a bot doesn't have to win, in order to win
I've stated repeatedly that a poker bot only needs to break even in order to generate a profit. Here's how it works, for those who aren't familiar with rakeback or other online poker promotions:
IF you play X hands of online poker within a certain amount of time (such as a calendar year).
THEN the poker site gives you Y amount of money, or "player points" which are worth money.
That's a simplified, but otherwise accurate, description. And listen. Y can be very large, even "life-changingly" large. Let's look at some actual figures.
Consider that it's fairly standard for a (human) player to play 8 or even 12 tables simultaneously.
Furthermore, in online poker you're generally getting between 50 and 70 hands per table per hour. Let's call it 55, just to be conservative.
So that's 10 tables @ 55 hands per table per hour, for 10 hours a day, 6 days a week, 50 weeks a year.
If you do the math, you'll find it comes out to around 1.65 million hands per year. If your bot is playing anything other than the microlimits, this is easily enough to qualify you for (just as an example) the Poker Stars Supernova Elite promotion, which when properly liquidated is worth around seventy or eighty thousand dollars the year you achieve it, and two or three times that amount the following year (because of the multiplier effect of FPP bonuses).
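The arithmetic is easy to check. Here is a minimal Python sketch; the function and parameter names are mine, chosen for illustration, and the inputs are simply the figures quoted above.

```python
def hands_per_year(tables, hands_per_table_hour, hours_per_day,
                   days_per_week, weeks_per_year):
    """Total hands played across all tables in one year."""
    return (tables * hands_per_table_hour * hours_per_day
            * days_per_week * weeks_per_year)

# Figures from the text: 10 tables, 55 hands per table per hour,
# 10 hours a day, 6 days a week, 50 weeks a year.
total = hands_per_year(10, 55, 10, 6, 50)
print(f"{total:,} hands per year")
```

Changing any one input (say, 12 tables instead of 10) scales the total proportionally, which is why multi-tabling matters so much for volume-based promotions.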
Similarly, other sites offer rakeback promotions (~30% rakeback is fairly standard) along with assorted perks. If your bot is playing break-even poker, then your rakeback - quite substantial, as you move up from the microlimits - is pure profit. Ironically, your profit would shrink to zero if the site stopped collecting rake! (Update: a few readers have questioned this statement, claiming that a break-even bot with rake equates to a winning bot without rake. This is true, but the statement was tongue in cheek. It's ironic that a break-even bot ultimately ends up making money because of, rather than in spite of, the rake.) The thing that more than any other makes games tough to beat - the rake - makes it possible for your bot to turn a healthy profit. And that, as a long-time online poker player, makes me smile.
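To put a rough number on "pure profit," the sketch below multiplies hands played by the average rake the player pays per hand and by the rakeback rate. The 30% rate and the 1.65 million hands are the figures quoted above. The rake-per-hand value is a made-up placeholder, not a figure from any site, so substitute your own. The function and constant names are also just illustrative.

```python
def annual_rakeback(hands, avg_rake_paid_per_hand, rakeback_rate):
    """Rakeback returned over a year: total rake paid times the fraction refunded."""
    return hands * avg_rake_paid_per_hand * rakeback_rate

HANDS_PER_YEAR = 1_650_000   # volume estimate from the text
RAKEBACK_RATE = 0.30         # "~30% rakeback is fairly standard"
AVG_RAKE_PER_HAND = 0.05     # ASSUMPTION: hypothetical dollars of rake paid per hand

rakeback = annual_rakeback(HANDS_PER_YEAR, AVG_RAKE_PER_HAND, RAKEBACK_RATE)

# A break-even bot wins nothing at the tables, so rakeback is its entire profit.
table_winnings = 0.0
print(f"Estimated annual profit: ${table_winnings + rakeback:,.2f}")
```

The average rake paid per hand climbs with the stakes, which is why the rakeback becomes substantial as you move up from the microlimits.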
Botting Rule #1274: Many online poker sites would love to allow bots, if only their users would let them.
Botting Rule #47: Online poker players fear bots all out of proportion to the average bot's ability to win in competitive poker.
How to simulate human input (an overview)
In next week's post we'll cover this topic in detail, but for now: in order for your bot to be able to perform actions such as betting, raising, or folding, it needs to know how to talk to the poker client. There are at least three basic mechanisms you can use:
(Most difficult) Reverse-engineer the network protocol and communicate directly with the poker server.
(Somewhat difficult) Hook into the poker client just beneath the user-interface. In this scenario, you don't simulate mouse clicks; you figure out what internal function(s) handle those mouse clicks, and call those functions directly.
(Easy) Simulate user input - mouse clicks, keyboard input, whatever - at the operating system or even the driver level.
The first two techniques are mostly black magic. This third technique - user input simulation - can be further broken down into:
Using the SendInput API.
Directly generating and posting WM_MOUSEMOVE, WM_LBUTTONDOWN, and other messages to the poker window.
Using the (deprecated) keybd_event and mouse_event APIs.
Writing a custom mouse or keyboard driver.
Possibly other techniques.
And regardless of which method you use, you'll want to make it realistic:
By incorporating subtle timing randomizations.
By creating realistic mouse-movement trails.
By occasionally clicking and interacting with unrelated windows.
You'll find that the SendInput method offers the best tradeoff between power and ease of use, but this is a complex topic. I mean: the interactions between user input events and the various Windows subsystems are complex, and will require tweaking to get right. But the code to generate those interactions is fairly simple.
Why poker bots will soon be accepted as some of the strongest players in the world
In the first-ever Man vs. Machine Poker Championship, a University of Alberta research team led by Dr. Jonathan Schaeffer (the guy behind Chinook, the program that effectively solved the game of checkers) pitted the Polaris poker bot against famous poker professionals Phil "the Unabomber" Laak and Ali Eslami.
Laak/Eslami won the match with 2 wins, 1 draw, and 1 loss. That's not a hugely convincing margin, and in fact, as the University of Alberta website points out:
The match was a success in many ways. Polaris proved that it was able to compete with some of the best poker players in the world. In fact, the 2-1-1 record of the humans is a little misleading. The actual difference in monetary outcome was just $395 which is a very small amount. The format of the match did a great deal to reduce the large variance in the game of poker, but it does not remove it all. The $395 sum could be as few as one or two hands where Polaris decided to fold when the human who got the same cards decided to continue. For this reason, a future match should prove particularly interesting, as the bot continues to develop in strength.
As for Phil Laak, he had this to say about Polaris, and about poker bots in general:
"We're already at the point where artificial intelligence crushes players that are unsophisticated, beats handily intermediate players, and loses small or wins small against savvy opponents... For Round 1, I'd say the bots have it.
Polaris's performance is reminiscent of the 1996 match in which world chess champion Garry Kasparov fended off the Deep Blue supercomputer for the last time."
The IBM team responsible for Deep Blue made a few tweaks, and one year later, in 1997, Deep Blue came back and won a historic match, 3.5 to 2.5 (that's two wins, one loss, and three draws). Kasparov's allegations of unfair software-coaching practices were mostly ignored, and IBM never granted him a rematch. That caused some people to question the authenticity of Deep Blue's victory, for almost a decade.
Until 2006, when another chess program, Deep Fritz, handily beat world champion Vladimir Kramnik. The crucial difference between Deep Blue and Deep Fritz:
Deep Blue ran on specially designed supercomputer-grade hardware
Deep Fritz ran on a workstation PC with two Intel Core 2 Duo CPUs
For now I just want to make the point that intelligent, commercial-quality poker bots are a reality in the low and middle limits, and have been for a few years.
MSNBC: Are poker 'bots' raking online pots?
Wired: On the Internet, Nobody Knows You're a Bot
Coding Horror: The Rise of the Poker Bots
ThisIsMoney.co.uk: We put poker bots to the test...
Why DLL injection is so powerful
Several readers have stated that techniques like DLL injection are either a) overkill or b) likely to get you in trouble with the online poker sites. So I'd like to address this issue in a little more detail.
First, here's a video which demonstrates a simple, harmless use of DLL injection in practice: overwriting the Poker Stars cashier balance with a balance dreamt up by the user.
Inject the DLL
Subclass the Cashier window
Detect when the Cashier window is invoked
Override WM_PAINT and display a "fake" balance to the user
Next, I recommended it earlier, and I'm going to recommend it again: Jeffrey Richter's excellent book on advanced Windows API development: Windows via C/C++. Buy it. Borrow it. Steal it if you must. Most of the information you'll find inside is available somewhere on the Internet, but you'll spend hours upon hours tracking it down. And nobody - nobody - speaks on this subject with Richter's authority.
Here are some things to keep in mind when deciding whether or not DLL injection is for you:
DLL injection isn't a hack. It's a formal capability offered by the Windows API, without which hundreds of legitimate applications (ranging from computer-based training apps to instant messaging applications) would stop working.
There are at least 6 ways to inject a DLL (or binary code) into another process on Windows operating systems: 1) via the Registry, 2) using Windows hooks, 3) as a Debugger, 4) with Remote Threads, 5) by creating a Trojan DLL, and 6) via CreateProcess/WriteProcessMemory.
DLL injection isn't particularly difficult to implement, and it's not a poorly-documented, error-prone procedure. It more or less just works.
A given poker client can't simply declare that all injected DLLs are evil. The operating system, as well as legitimate third-party applications, can both cause DLLs to be mapped into a process's address space. If Poker Stars, World of Warcraft, or whatever other application were to simply shut down whenever this happened, they'd be unusable.
While it's possible to detect when DLLs are injected into the process's address space, it's not possible to determine whether a given DLL is innocent or malicious. Sure, you can look for the DLL's name, but the name can be changed, with every invocation if need be. You can create some sort of checksum to try to identify malicious DLLs at the binary level. Again, easy to get around by creating a self-modifying .EXE.
That said, is it possible to create a bot without DLL injection? Absolutely. But if you can't get your code to run in the poker client's process, your powers are somewhat limited. For example, you won't be able to subclass poker client application windows. You won't be able to query various process data structures. You won't be able to intercept Windows messages intended for the poker client. And so forth.
Why screen-scraping is a bad idea
A lot of people have argued that some of the screen-scraping techniques I mentioned - such as pixel-testing for hole cards - are difficult to code, error-prone, and tedious to maintain. I couldn't agree more. Screen scraping is a last resort. The bot's Input Component needs to be structured such that it falls back on screen-scraping techniques only when all other options are exhausted.
That said, a mature bot capable of playing at multiple sites will probably require some sort of screen-scraping capability. The trick is to create a generic screen-scraping mechanism driven by XML or some other easily-edited format. That way, when some minor aspect of the poker client UI changes, all you have to do is update the "schema" for that particular online poker venue, without rebuilding the poker bot executables.
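As a rough illustration of what such a data-driven setup might look like, here is a minimal Python sketch that loads named screen regions from an XML document. The element and attribute names (venue, region, x, y, width, height), the venue name, and the coordinates are all invented for the example; they aren't taken from any real poker client or from the bot described in this series.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# ASSUMPTION: a hypothetical per-venue "schema". In practice this would
# live in its own file so it can be edited without rebuilding anything.
SCHEMA_XML = """
<venue name="ExamplePoker">
  <region name="hole_card_1" x="310" y="420" width="50" height="70"/>
  <region name="hole_card_2" x="365" y="420" width="50" height="70"/>
  <region name="pot_size" x="350" y="200" width="120" height="20"/>
</venue>
"""

@dataclass(frozen=True)
class Region:
    """A rectangle, in table-window coordinates, to be examined."""
    x: int
    y: int
    width: int
    height: int

def load_regions(xml_text):
    """Parse a venue schema and return (venue name, {region name: Region})."""
    root = ET.fromstring(xml_text)
    regions = {
        node.get("name"): Region(
            int(node.get("x")), int(node.get("y")),
            int(node.get("width")), int(node.get("height")),
        )
        for node in root.findall("region")
    }
    return root.get("name"), regions

venue, regions = load_regions(SCHEMA_XML)
print(venue, regions["pot_size"])
```

The point of the design is that the code only ever asks for a region by name. If a client update moves the pot display, only the coordinates in the XML change.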
This will be the subject of (you guessed it) a future article.
Recommended Reading
This is by no means a complete list of recommended programming and/or poker books. However, all the books on this list a) are relevant to building a poker bot and b) contain information which isn't readily available on the Internet, at least not without a good deal of sleuthing. I own a dog-eared, coffee-splattered copy of every book on this list, and have read most of them multiple times. Note: the programming books are written from a C++ and Windows perspective, but the techniques can be ported to other environments/languages.
Programming books:
Windows via C/C++. Jeffrey Richter. Everything you ever wanted to know about the Windows API. Should be required reading for all Windows developers, and for anybody working with .NET, as it addresses many of the API constructs on which .NET is built.
Exploiting Online Games: Cheating Massively Distributed Systems (Addison-Wesley Software Security Series). Greg Hoglund and Gary McGraw. If you actually plan on building an online poker bot, this is the book for you. Discusses many of the same techniques discussed in this series, but with a focus on World of Warcraft and other MMORPGs. All of the techniques discussed apply to poker, however.
Exploiting Software: How to Break Code (Addison-Wesley Software Security Series). Greg Hoglund and Gary McGraw. Companion volume to the above, with an emphasis on local applications (e.g., the poker client application running on your machine).
Poker books:
The Theory of Poker. David Sklansky. Quite possibly the most influential book in poker, written by the father of modern poker theory, David Sklansky. The value of this book is that it explains poker theory in a way that's universally applicable to all poker games. Required reading for all poker players and poker bot programmers.
The Mathematics of Poker. Bill Chen and Jerrod Ankenman. Difficult but extremely rewarding exposition of poker mathematics. In order to program a fluent poker AI you'll want to understand a lot, though not necessarily all, of this material.
Professional No-Limit Hold 'em: Volume I. Matt Flynn, Sunny Mehta, and Ed Miller. In my opinion this is the best book on no-limit Hold'em ever written, and the notion of planning hands around commitment and SPR squares very well with the poker bot AI we'll get into later.
No Limit Hold'em: Theory and Practice. David Sklansky and Ed Miller. Another excellent no-limit Hold'em book.
Gambling Theory and Other Topics. Mason Malmuth. A somewhat dry, rarely-read masterpiece, with some non-trivial mathematics. Not everything here is directly applicable to playing, or building software for, online poker, but it's all useful.
Getting Started in Hold 'em. Ed Miller. Introductory text. There are many good introductions to poker and to Texas Hold'em; this is one of the better ones, and introduces a short-stacked strategy that will be useful in deriving a competitive bot AI for real-money games, down the road.
Conclusion
We took a break from the source code in this installment as I don't want to alienate too quickly the readers who've subscribed in the past week. In the next installment, we're going to have to plunge right back in, beginning with an in-depth introduction to the architecture of the bot's AI component, expressed in C++, C#, or the language of your choice, and concluding with specific code you can use to simulate a realistic stream of human input. We'll also perform exploratory surgery on various online poker clients, exposing their inner resources and code structures to the light of day. Last but not least, we'll answer the question: how do I snoop on the poker client's log or hand history file in real time?
I will be looking out for Devlin's next update in poker bot building and will blog it when it comes out.