Libratus (AI) vs. Humans: Texas Holdem

angrysoba

In a quick perusal of the forums here I failed to find a thread on this.

A computer called Libratus is battling humans in the game of Texas Holdem.

It finished above the humans overall on the first day of play, and is now in the second day. A livestream is here, apparently, although I have not watched it, and I know little about poker anyway.

Another team of researchers has already claimed that their AI, DeepStack, has beaten professional poker players here.
 
The problem I have with programs like this in relation to poker is that "card counting" is banned in casinos and if the house suspects this is happening, players are thrown out and can be banned. The programs count cards.
 
The problem I have with programs like this in relation to poker is that "card counting" is banned in casinos and if the house suspects this is happening, players are thrown out and can be banned. The programs count cards.

Card counting is a blackjack idea - not really relevant to poker...

It seems like the AI is beating the humans so far - http://www.theverge.com/2017/1/12/14250420/ai-beating-humans-poker-tournament-cmu-libratus .

This is heads-up (2 player only) no-limit holdem. I wouldn't be surprised if they have finally written an AI that can convincingly beat any human at that game.

In the simpler game of limit holdem computers have been able to beat humans for a few years now. No-limit holdem is a much harder game for a computer - but it seems the researchers have made a lot of progress - on the headsup variation at least.

I'm guessing the full ring game (many players) is a much harder problem again and it will be a few more years before computers can beat world class players at that.

- Drelda
 
The problem I have with programs like this in relation to poker is that "card counting" is banned in casinos and if the house suspects this is happening, players are thrown out and can be banned. The programs count cards.

I don't know anything about Texas Hold 'em, having never played it. However, according to Wikipedia, the game only uses 52 cards.

Other websites I have stumbled across say that only one deck is used, and reshuffled each hand. So counting cards is not an issue.
 
Other websites I have stumbled across say that only one deck is used, and reshuffled each hand. So counting cards is not an issue.
Definitely only one deck - shuffled each hand - in all standard poker games.
 
Card counting is a blackjack idea - not really relevant to poker...

You are right. Got my card games wrong.
 
I imagine the computer wouldn't have any "tells"....

From what I understand, one of the problems the computer has is how to bluff properly instead of simply making bets proportionate to the strength of its hand.

Or at least that is the starting point. Generally the computer has to avoid being predictable and still hit on a winning strategy.

Also, I don't know whether the computer has any way of registering its opponents' tells as well.
 
Texas Holdem is made to seat the most players from one 52-card deck. Each player gets only two cards of their own, and everyone shares a common set of five community cards (the last of which is "the river"), so 8-10 players can be dealt in from 52 cards rather than needing 56-70.

But I could never do well at it. I suspected that the odds changed somehow vs. five-card draw? I think the hands end up too close to each other, what with all those common cards?
 
Also, I don't know whether the computer has any way of registering its opponents' tells as well.

I suspect it doesn't even try to pick up tells - or even patterns of play by its opponents.

Most of the serious poker AI programs from the last few years try to play a game theory optimal strategy - i.e. they try to approximate a Nash equilibrium strategy. This means they are trying to play a strategy that cannot be exploited - no matter what their opponent does. This also means they don't try to exploit their opponents' weaknesses at all. They try to win by playing in a way that can't be beaten - and that means they can't deviate from the optimal strategy to exploit an opponent - because any deviation would open them up to being counter-exploited!

Of course - none of them are playing a truly optimal strategy - it's just too hard to calculate that - so they use shortcuts and tricks to approximate it.
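To give a flavour of how these approximations are usually built (purely a toy sketch of the general "minimise regret through self-play" idea - the thread doesn't say which algorithm Libratus actually uses, so don't read this as its method), here is regret matching finding the mixed equilibrium of rock-paper-scissors, the simplest game where being unexploitable means randomising:

```python
import random

# Toy regret-matching self-play for rock-paper-scissors. This is only a
# sketch of the general "minimise regret via self-play" idea behind modern
# poker solvers - it is NOT the actual algorithm used by Libratus.

ACTIONS = ["rock", "paper", "scissors"]
# PAYOFF[a][b] = payoff to the player choosing a against a player choosing b
PAYOFF = {
    "rock":     {"rock": 0,  "paper": -1, "scissors": 1},
    "paper":    {"rock": 1,  "paper": 0,  "scissors": -1},
    "scissors": {"rock": -1, "paper": 1,  "scissors": 0},
}

def current_strategy(regrets):
    """Mix in proportion to positive regret; uniform if there is none."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(ACTIONS)] * len(ACTIONS)
    return [p / total for p in positive]

def train(iterations=200_000):
    regrets = {0: [0.0] * 3, 1: [0.0] * 3}       # one regret table per player
    strategy_sum = {0: [0.0] * 3, 1: [0.0] * 3}  # running total of strategies
    for _ in range(iterations):
        strats = {p: current_strategy(regrets[p]) for p in (0, 1)}
        picks = {p: random.choices(range(3), weights=strats[p])[0] for p in (0, 1)}
        for p in (0, 1):
            me, opp = picks[p], picks[1 - p]
            got = PAYOFF[ACTIONS[me]][ACTIONS[opp]]
            for i, a in enumerate(ACTIONS):
                # Regret = what action a would have earned minus what I got.
                regrets[p][i] += PAYOFF[a][ACTIONS[opp]] - got
            for i in range(3):
                strategy_sum[p][i] += strats[p][i]
    total = sum(strategy_sum[0])
    return [s / total for s in strategy_sum[0]]  # average strategy of player 0

if __name__ == "__main__":
    print(train())  # drifts towards roughly [0.33, 0.33, 0.33]
```

The average strategy over all iterations is what approaches the equilibrium; the real bots apply the same kind of idea to an astronomically bigger game tree.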

I don't know any details about this particular AI - so it's possible they have taken a different approach.

- Drelda
 
No-limit holdem is a much harder game for a computer - but it seems the researchers have made a lot of progress - on the headsup variation at least.

I'm not sure I understand what you mean by the bolded statement. I was a blackjack/roulette dealer and one of the things they teach you is to look at the chips as "units" (devoid of emotional content). It teaches the dealer to not freak out when they deal monster games. The end result is a dealer that can step into any game with confidence because the odds stay the same no matter the "color" of a chip.

I'm also wondering if the computer is programmed to bluff. My first thought is that it wouldn't be able to do that, which leaves strictly playing by-the-book which becomes very predictable and ultimately beatable.
 
I imagine the computer wouldn't have any "tells"....

And presumably the AI cannot 'see' the human players, so it cannot pick up 'tells' as we humans typically understand them.

Therefore, the only tell the AI can go by is the bets, and maybe the time it takes humans to come up with a particular bet (although this has a large error -- there are lots of variables that can cause the time to vary).

From the little I have seen of more or less professional poker players, they tend to minimize the tells from that -- they either bet the minimum required, or go all in. There's very little in between. Sometimes there are small bets to 'test the waters', so to speak, but I have the impression this is done more the more 'amateur' the player is, and also more in the earlier stages of the game. The more professional the player, or the more progressed the state of the game (signified by the blind bets required), the more it is mostly "check", "all-in" and "fold very early". The latter is based on the players' knowledge of how high a probability certain card combinations have (I would expect humans to make a LOT more errors in correctly recalling said probabilities compared to an AI. It's also the difference between good and bad players: the better players have a better knowledge of said probabilities, maybe not mathematically correct, but by experience). They also use the bets, and the time they take themselves to come up with a bet, to psychologically push the other players.

So I guess there's a little bit the AI can go by to psychologically evaluate human players. And, more importantly, it makes a lot fewer errors and sticks to the ideal strategy (which it may have come up with by itself). I would also think the AI is much better at ignoring certain psychological pushes human players fall for. I would expect the AI to have an advantage over human players because of that. Small, but real. Since poker/Texas Hold'em is a winner-takes-all game, the small advantage may appear larger than it actually is.
 
I'm not sure I understand what you mean by the bolded statement (No-limit holdem is a much harder game for a computer). I was a blackjack/roulette dealer and one of the things they teach you is to look at the chips as "units"(devoid of emotional content). It teaches the dealer to not freak out when they deal monster games. The end result is a dealer that can step into any game with confidence because the odds stay the same no matter the "color" of a chip.
It's not the size of the chips that differs in no-limit poker - it's the size of each bet. In limit poker each bet is limited to be quite small - whereas in no-limit you can bet anything up to the amount in front of you in a single bet. This makes the game much more complex.

I'm also wondering if the computer is programmed to bluff. My first thought is that it wouldn't be able to do that, which leaves strictly playing by-the-book which becomes very predictable and ultimately beatable.
Yes the computer will bluff. In fact it will bluff with a frequency that is carefully calculated to balance its actions and make it difficult to exploit.
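For the textbook single-street toy model, that balancing frequency falls straight out of the pot odds. A sketch of the standard indifference calculation (not a claim about how this particular bot computes its frequencies):

```python
def balanced_bluff_fraction(pot, bet):
    """Fraction of the bettor's betting range that should be bluffs so the
    caller is indifferent: the caller risks `bet` to win `pot + bet`."""
    return bet / (pot + 2 * bet)

def defender_call_frequency(pot, bet):
    """How often the caller must call so a pure bluff is break-even:
    a bluff risks `bet` to win `pot`."""
    return pot / (pot + bet)

# Example: a pot-sized bet on the river.
pot, bet = 100, 100
print(balanced_bluff_fraction(pot, bet))   # ~0.33 -> one bluff for every two value bets
print(defender_call_frequency(pot, bet))   # 0.5  -> the caller must continue half the time
```

With a pot-sized bet, roughly one bet in three can be a bluff before calling becomes automatically profitable for the opponent, and the caller in turn has to continue about half the time to stop pure bluffs from printing money.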

- Drelda
 
It's not the size of the chips that differs in no-limit poker - it's the size of each bet. In limit poker each bet is limited to be quite small - whereas in no-limit you can bet anything up to the amount in front of you in a single bet. This makes the game much more complex.


Yes the computer will bluff. In fact it will bluff with a frequency that is carefully calculated to balance its actions and make it difficult to exploit.

- Drelda

Speaking of bluffs: oftentimes a human player knows of a bluff, or knows a bluff won't work, because of 'tells': body language, comments made earlier, or signals received long before the player decided to bluff.

Since the AI has no body, and is much better at controlling which signals it sends, I would expect it to be much better at bluffing, though not necessarily at recognizing a bluff (that depends on what input the AI gets - would it get a video feed of the human players so it could learn to read body language and tells?).
 
Since the AI has no body, and is much better at controlling which signals it sends, I would expect it to be much better at bluffing, though not necessarily at recognizing a bluff (that depends on what input the AI gets - would it get a video feed of the human players so it could learn to read body language and tells?).

I'm pretty sure they aren't doing that now - in theory it would be possible but difficult - but it isn't really the point of what they are trying to do.

In online poker (where you can't see your opponents) there are no such tells / signals - the only information you have is the actions of your opponents. This makes it different from live poker - in that it's a purer game - all about strategy and less about reading people.

I think this match is trying to answer whether AI can beat humans at this 'pure' game - rather than trying to emulate live poker.

- Drelda
 
I think this match is trying to answer whether AI can beat humans at this 'pure' game - rather than trying to emulate live poker.

- Drelda

Well, according to this video, the main point here is to go beyond what is called a "perfect information" game such as chess and go, and to deal with uncertainty in terms of how people behave, hence bluffing is important here. Ultimately general intelligence, and not domain-specific intelligence, is the goal.

 
This is also what DeepStack is about:

Artificial intelligence has seen a number of breakthroughs in recent years, with games often serving as significant milestones. A common feature of games with these successes is that they involve information symmetry among the players, where all players have identical information. This property of perfect information, though, is far more common in games than in real-world problems. Poker is the quintessential game of imperfect information, and it has been a longstanding challenge problem in artificial intelligence. In this paper we introduce DeepStack, a new algorithm for imperfect information settings such as poker. It combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition about arbitrary poker situations that is automatically learned from self-play games using deep learning. In a study involving dozens of participants and 44,000 hands of poker, DeepStack becomes the first computer program to beat professional poker players in heads-up no-limit Texas hold'em. Furthermore, we show this approach dramatically reduces worst-case exploitability compared to the abstraction paradigm that has been favored for over a decade.

Link
 
They are live now. I am listening to this guy, Jason Les, who seems to be upset about the possibility that the humans are telling the computer how to play.

He is saying it in an I-don't-know-but-I've-been-told manner, but it is weirdly similar to what Garry Kasparov was saying when he was losing to Deep Blue - as though the grandmaster of a game were being beaten by Joe the Plumber whispering in his calculator's ear.

https://www.twitch.tv/libratus_vs_jasonles
 
Of course, the humans are getting lots of advice from internet chat. Jason is stressed out about the possibility that "the bot" is getting similar help.
 
I suppose the owners of this AI would be able to clean up online by allowing it to play under multiple identities in various forums.
 
I suppose the owners of this AI would be able to clean up online by allowing it to play under multiple identities in various forums.
Maybe - though most online poker sites try to make it very difficult to use any kind of automation. Also there aren't that many high stakes heads up games available at any one time.

If they could play at an equivalent level in full ring games (many players) there would be more opportunity - but that is probably a much harder game to solve.

- Drelda
 
A couple of months back WPT had a short segment in a few of their episodes (one or the other of their now various half hours)... I'll guess it was DeepStack.

As I recall it went to 1-1 and if they broadcast details of a rubber match, I missed it.

There are some accurate comments in this thread, but also some major errors, both in procedure (really?) and in the skills shown by the pros in, among other things, bet sizing, pattern recognition in play and bets, tells, and especially how quickly and accurately they calculate the odds on the fly. Like bang on (within a couple of percent) in a few seconds. It matters. A lot.

And if you think there aren't tells in an online game, try playing without looking at your hole cards.

Annette Obrestad will tell you how...

http://www.espn.com/espn/poker/columns/story?id=3518570&columnist=bluff_magazine


:p
 
It seems that the computer is convincingly beating the humans so far:

They have a long way to go yet so this could easily reverse.

It works out at about 9bb/100 - which is a huge edge if it continues like that.
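(For anyone unfamiliar with the unit: bb/100 is profit measured in big blinds per 100 hands. A quick conversion sketch with made-up figures - not the actual match totals:)

```python
def winrate_bb_per_100(total_profit, big_blind, hands):
    """Win rate in big blinds per 100 hands."""
    return (total_profit / big_blind) / hands * 100

# Made-up example figures (not the real match numbers):
print(winrate_bb_per_100(total_profit=450_000, big_blind=100, hands=50_000))  # 9.0 bb/100
```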
 
The past few years of poker have seen a huge surge in the standard of play (online mostly; live poker is a long way behind in general). This has been a result of the growth of the game-theoretic approach to play. The reality is that against good players tells aren't reliable. Actually, they're not that reliable even against some average players.

Poker appears to have a solution (there are certain proofs that Nash ranges exist for any HU situation). It's just like the Simpsons where Lisa plays Bart at rock, paper, scissors, knowing that Bart always chooses rock. She can make an exploitative play of always choosing paper. But she could also choose to play optimally by randomly choosing a throw each with a 1/3 probability, meaning that Bart can never latch on and learn to beat her (he could, at best, break even by employing the same counter strategy).
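You can sanity-check the Lisa/Bart point in a few lines (a toy calculation using the usual +1/0/-1 scoring): the best response to "always rock" wins every round, while nothing gains anything against the uniform mix.

```python
# Payoff to the row player in rock-paper-scissors (+1 win, 0 draw, -1 loss).
PAYOFF = {
    ("rock", "rock"): 0, ("rock", "paper"): -1, ("rock", "scissors"): 1,
    ("paper", "rock"): 1, ("paper", "paper"): 0, ("paper", "scissors"): -1,
    ("scissors", "rock"): -1, ("scissors", "paper"): 1, ("scissors", "scissors"): 0,
}
ACTIONS = ["rock", "paper", "scissors"]

def best_response_value(opponent_strategy):
    """EV of the best pure counter-strategy against a given mixed strategy."""
    return max(
        sum(prob * PAYOFF[(my, opp)] for opp, prob in opponent_strategy.items())
        for my in ACTIONS
    )

bart = {"rock": 1.0, "paper": 0.0, "scissors": 0.0}
lisa_gto = {"rock": 1 / 3, "paper": 1 / 3, "scissors": 1 / 3}

print(best_response_value(bart))      # 1.0 -> Bart loses every round to "always paper"
print(best_response_value(lisa_gto))  # 0.0 -> nothing can beat the uniform mix
```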

Poker works the same, just with orders of magnitude greater complexity. The player can raise, fold, call, with some optimal distribution for any situation. And there is no reason why a computer shouldn't be able to do that (and, in the long term, do it better than a human). For limit hold'em (where players can only bet a predetermined fixed amount) the game has been close to solved and computers can play very well.

The difficulty for no limit hold'em is not that computers can't bluff, betting without a hand is trivial, or that they can't read opponents, that's unnecessary, it's that the game tree gets very large, very quickly, and that multiple solutions exist for small changes.

One thing that gets discussed a lot in poker is "what should my pre-flop opening raise be?". It actually seems that it doesn't matter as much as we once thought. Not so long ago, people would raise to 3 big blinds and that was that. Now you see players online will raise to 3bb in some positions, 2.5bb in others, some other player will raise to 2.4bb. And none of them appear to be right or wrong. What changes is the range you can do it with (your "range" being the total combinations of starting cards you would play in this way). You can raise more hands with a smaller opening size, and fewer with a larger one (this being a function of hands that you can adequately defend for different opponent responses). And this means that a computer needs to be able to determine a proper pre-flop strategy for facing 2.1bb raises, 2.2bb raises, 2.3bb raises, and so on. And that's just pre-flop. Now consider how big the game tree becomes for flop bets, which are likely to have even more variation not just in bet sizing, but in equity changes for different board scenarios (AQ having reduced equity on a 45T flop, but T9 having gained equity).
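To get a feel for how fast the tree grows once many bet sizes are allowed, here is a deliberately crude counting sketch - a single street of a made-up toy game, no cards, just action sequences, so the absolute numbers mean nothing but the growth rate is the point:

```python
def count_betting_sequences(bet_sizes, raises_left):
    """Count distinct betting sequences at one decision point of a crude toy
    tree: a player may check/call, fold, or raise to any of `bet_sizes`
    different amounts, with at most `raises_left` raises remaining.
    Card dealing is ignored entirely - this only counts action sequences."""
    if raises_left == 0:
        return 2  # call or fold ends the street
    # check/call ends it, fold ends it, or one of the raise sizes continues
    return 2 + bet_sizes * count_betting_sequences(bet_sizes, raises_left - 1)

for sizes in (1, 3, 10, 100):
    print(sizes, count_betting_sequences(sizes, raises_left=4))
# 1 size   ->         10 sequences (roughly the limit-poker situation)
# 3 sizes  ->        242
# 10 sizes ->     22,222
# 100 sizes -> ~2 x 10**8   (and this is a single street with no cards)
```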

Now, since we can come up with a lot of good maths to approximate proper frequencies for different scenarios, designing a computer that can respond with an approximation of optimal ranges and plays is entirely possible and, similar to a game like chess, doing it better than humans is just a matter of time. Solving the game and creating a perfect poker computer, however, is probably a long way away.

What does this mean for poker? Probably the same as chess. Live players will play for a long time to come, but when everybody is capable of running a poker engine that can beat the world's best human, playing online for money will be a thing of the past.
 
What does this mean for poker? Probably the same as chess. Live players will play for a long time to come, but when everybody is capable of running a poker engine that can beat the world's best human, playing online for money will be a thing of the past.

I agree in the long run - but it might take longer than you imagine.

Clearly a lot of progress has been made in the AI for headsup nolimit poker - but multi-player games are going to be much harder to 'solve'. It might be a long time before AI can play world class 10 player NLHE.

Also the approach of aiming for a game theory optimal strategy works for these exhibition matches because it is a true zero-sum game - there is no rake (the fee the house takes on each hand). For a 10 player game with a rake - it's possible that a perfect 'optimal' strategy would be an overall loser because of the rake. It wouldn't lose to other players - but it would have to do more than that and make enough from them to pay for the rake.

It might be that in order to make money in a 10 player game with a rake, it is necessary to depart from optimal strategy and try to read / exploit opponents. This would mean opening up to counter-exploitation of course - so it becomes a complex balance of making as much as possible from opponent weaknesses whilst minimizing your own exploitability.

I don't know if this is the case - I don't think anyone knows what a true equilibrium strategy looks like for 10 player NLHE because it's vastly more complex than the headsup game - which is itself right on the limit of current technology.

However the possibility that equilibrium strategy for fullring NL games could be a losing strategy to the rake at least gives us some hope of pushing the robot apocalypse down the road a few more years - as going beyond the optimal strategy would be much harder again.

If you don't get what I mean about an 'optimal' strategy not being the 'best' strategy - consider your example of rock, paper, scissors. Here the Nash equilibrium strategy is to make a random selection each round - but in a game with a rake this strategy would be guaranteed to lose. You would have to be able to make predictions about your opponent's patterns of play to have any chance of beating the rake.
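A toy version of that with made-up numbers (a $1 stake, a 5% rake, draws ignored on the exploitative side):

```python
# Toy numbers (all made up): 100 rounds of rock-paper-scissors at $1 a round,
# with the house raking 5% of the stake every round.
rounds, stake, rake = 100, 1.00, 0.05

# The Nash (uniform random) strategy breaks exactly even against ANY opponent
# before the rake, so its expected result is just the rake paid out:
nash_result = 0.0 - rounds * stake * rake
print(nash_result)      # -5.0 : guaranteed long-run loser once there's a rake

# To beat the rake you need an exploitative edge bigger than the rake itself,
# e.g. reading the opponent well enough to win 56% of rounds (draws ignored):
win_rate = 0.56
edge_per_round = stake * (2 * win_rate - 1)          # 0.12 per round
exploit_result = rounds * (edge_per_round - stake * rake)
print(exploit_result)   # +7.0 : only opponent-modelling pays for the rake
```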

- Drelda
 
If you don't get what I mean about an 'optimal' strategy not being the 'best' strategy - consider your example of rock, paper, scissors. Here the Nash equilibrium strategy is to make a random selection each round - but in a game with a rake this strategy would be guaranteed to lose. You would have to be able to make predictions about your opponent's patterns of play to have any chance of beating the rake.

- Drelda

I am not very familiar with how professional poker is played, but it seems like raw luck (getting good cards) would eventually swamp skill as skill sets narrow. This would depend, in part, on how long the game goes on - that is, how much statistical variation is "smoothed out" with extended sampling.

Is it the case now that the top player usually wins? Is there a small set of Tiger Woods types dominating, or does luck, even with just humans, rise to be the deciding factor?

I guess what I'm wondering is where the variance dominates - skill or luck - and how it may shift around.
 
I guess what I'm wondering is where the variance dominates - skill or luck - and how it may shift around.
Luck dominates in the short term - skill dominates in the long term.

i.e. if I played one hand of poker against a world class player (I am very very far from that!) then the probability of me winning would be less than 50% - but not much less. Maybe 48%.

If I played 100 hands there is still a very good chance I would be ahead at the end of it - maybe 30%.

If I played 10,000 hands then I would have to be very lucky to be ahead. However if the skill difference was less - say a good player against a world class player - then it's quite possible for the lesser player to be ahead.

For this exhibition match they are playing 120k hands - but are also playing mirror hands and using other tricks to greatly reduce the effect of variance. Without those tricks 120k hands would be unlikely to be enough to resolve the difference in skill to any significance. Even with 120k hands it might well end up too close to call with statistical significance.
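To put rough numbers on the luck/skill split (every figure here is an assumption for illustration - a 2 bb/100 edge and an 80 bb/100 standard deviation are ballpark guesses, not measurements from this match), a normal approximation gives the chance that the genuinely better player is actually ahead after N hands:

```python
import math

def prob_better_player_ahead(winrate_bb100, stdev_bb100, hands):
    """Normal approximation: probability the player with a true edge of
    `winrate_bb100` (big blinds per 100 hands) is ahead after `hands` hands,
    given a per-100-hand standard deviation of `stdev_bb100`."""
    n = hands / 100.0                      # number of 100-hand blocks
    mean = winrate_bb100 * n               # expected total result (in bb)
    stdev = stdev_bb100 * math.sqrt(n)     # standard deviation of the total
    z = mean / stdev
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))   # P(total > 0)

# Assumed numbers purely for illustration: a 2 bb/100 edge and an
# 80 bb/100 standard deviation (a plausible ballpark for heads-up no-limit).
for hands in (100, 10_000, 120_000, 1_000_000):
    print(hands, round(prob_better_player_ahead(2, 80, hands), 3))
```

Under those assumed numbers the better player is barely more than a coin flip to be ahead after 100 hands, and still trails a fair chunk of the time even after 120k plain hands - which is why mirrored hands and other variance-reduction tricks matter for a match like this.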

People generally don't realise how much effect variance can have in poker. It's very possible for a 'bad' player to think they are 'good' because they have won money over many thousands of hands - when in fact they were just lucky. Often such players then move up to higher stakes games - but then their luck runs out - and they need a lot more of it because the skill difference at higher stakes is a lot greater.

These days a lot of 'top' players have played a huge number of hands online - enough to be confident they aren't just lucky. However they don't have enough data to say with confidence which top player is better than another. Resolving the small skill difference between 2 very good players (without variance reducing tricks) would take something like a million hands - which isn't very practical.

- Drelda
 
(much snipped)

These days a lot of 'top' players have played a huge number of hands online - enough to be confident they aren't just lucky. However they don't have enough data to say with confidence which top player is better than another. Resolving the small skill difference between 2 very good players (without variance reducing tricks) would take something like a million hands - which isn't very practical.

- Drelda

I would expect, given what you outlined, that there would be no useful ranking then of "top players," or at least not without including dozens and dozens of people. Is that the case? Does no limit hold 'em have a small pool of people who consistently come out on top, or do the names flip around almost randomly (mirroring variance in hand quality) between tournaments?

ETA: I suppose, even if there were a pool of top winners, it may have to do with other things not typically attributed to "skill." Like the ability to get a large entry fee, or not breaking down emotionally after a series of poor performances, or avoiding the consequences of addiction to performance enhancing stimulants... I don't know, but things outside of the game which impact who shows up on the scoreboard.
 
I agree in the long run - but it might take longer than you imagine.

Clearly a lot of progress has been made in the AI for headsup nolimit poker - but multi-player games are going to be much harder to 'solve'. It might be a long time before AI can play world class 10 player NLHE.

I honestly don't know enough about computers to put a time frame on it. PokerSnowie can handle 6-max analysis to a very respectable standard though.

Also the approach of aiming for a game theory optimal strategy works for these exhibition matches because it is a true zero-sum game - there is no rake (the fee the house takes on each hand). For a 10 player game with a rake - it's possible that a perfect 'optimal' strategy would be an overall loser because of the rake. It wouldn't lose to other players - but it would have to do more than that and make enough from them to pay for the rake.

Rake is an interesting point. Regarding the standard that bots need to make, most likely is that if all players are good enough they would all begin losing to rake even while some maintain an edge over each other. This is actually already true for some known spots, particularly in higher rake micros games, where you ought to defend your big blind sub-optimally vs. button opens.

It might be that in order to make money in a 10 player game with a rake, it is necessary to depart from optimal strategy and try to read / exploit opponents. This would mean opening up to counter-exploitation of course - so it becomes a complex balance of making as much as possible from opponent weaknesses whilst minimizing your own exploitability.

I don't know if this is the case - I don't think anyone knows what a true equilibrium strategy looks like for 10 player NLHE because it's vastly more complex than the headsup game - which is itself right on the limit of current technology.

However the possibility that equilibrium strategy for fullring NL games could be a losing strategy to the rake at least gives us some hope of pushing the robot apocalypse down the road a few more years - as going beyond the optimal strategy would be much harder again.

If you don't get what I mean about an 'optimal' strategy not being the 'best' strategy - consider your example of rock, paper, scissors. Here the Nash equilibrium strategy is to make a random selection each round - but in a game with a rake this strategy would be guaranteed to lose. You would have to be able to make predictions about your opponent's patterns of play to have any chance of beating the rake.

- Drelda

I'm not entirely sure I understand your last paragraphs. The "correct" exploitative strategy (i.e. the play that we would make if we knew exactly what our opponent's response would be) will always yield the maximum ev. It doesn't make sense to play exploitatively whilst also minimising how exploitable we are. Any adjustment from the GTO play will necessarily leave you open to counter strategies and lower your ev vs. an adaptive opponent.

The important distinction between poker and RPS is that in RPS when a GTO player plays Bart the GTO bot doesn't have any more ev than when it plays another GTO player. The GTO bot will win 1/3, lose 1/3, draw 1/3 against both Bart and another GTO bot. In poker this isn't true. Any spot for which your opponent deviates from GTO increases your ev. A hypothetical GTO bot would be beating the toughest games and I doubt it would be close.

The only way this wouldn't be true is if the other players also get sufficiently close to GTO play that the edge is swallowed by rake, and at that point there is by definition nothing left that is sufficiently exploitable.
 
I guess what I'm wondering is where the variance dominates - skill or luck - and how it may shift around.

With lower edges comes higher variance. To make this clear, imagine that we're playing poker and I'm trying to lose as much as possible. I can play in such a way that I donate large amounts of chips and almost never win a hand. You see little variance because my play is so terrible that I make it almost impossible for you to lose.

Take a counter example where I play against my exact replica. We both play poker in the exact same manner. Now, all that controls where the chips move is the random distribution of the cards.

As for whether the better player usually wins, that's hard to say because of the wide variety of formats and different edges that players have over each other. Players aren't "forced" to play each other like they are in other competitive sports. In Premiership football, all the teams have to play each other twice. In poker, I'm not sitting down if the world's best are all at the table. What you can say is that the percentage of players online who are either breaking even or losing over all their hands played is in the 90s. Eventually, the money in poker trickles up to the top.
 
I would expect, given what you outlined, that there would be no useful ranking then of "top players," or at least not without including dozens and dozens of people. Is that the case?
Well people make those rankings - based on recent winnings mostly - but I agree they don't really represent who is the 'best' player - just who has had a good run recently.

Does no limit hold 'em have a small pool of people who consistently come out on top, or do the names flip around almost randomly (mirroring variance in hand quality) between tournaments?
Yes the names flip around. Very good players are more likely to win a tournament than a bad player - but not by much. Let's say there was a tournament with 1000 players. The best player in the world might have a 1% chance of winning the tournament - i.e. 10 times more likely than if everyone was equal. The poor players might only have a 0.05% chance of winning - but there are a lot of them - so it's not at all uncommon for a relatively unskilled amateur player to win a tournament.

ETA: I suppose, even if there were a pool of top winners, it may have to do with other things not typically attributed to "skill." Like the ability to get a large entry fee, or not breaking down emotionally after a series of poor performances, or avoiding the consequences of addiction to performance enhancing stimulants... I don't know, but things outside of the game which impact who shows up on the scoreboard.
Yes - all those things and more are extremely important. To be a long term winner you need technical skill at poker - and all those other things.

Note - the extreme variance in poker is what makes it work. You don't get high stakes chess tournaments with relatively unskilled players putting up money - because they know they have no chance. Poker lets recreational players gamble in an enjoyable way - with a reasonable chance of winning - and the knowledge that if they gain enough skill they could be long term winners. That's what keeps the ecosystem of poker working.

- Drelda
 
(much snipped)
Note - the extreme variance in poker is what makes it work. You don't get high stakes chess tournaments with relatively unskilled players putting up money - because they know they have no chance. Poker lets recreational players gamble in an enjoyable way - with a reasonable chance of winning - and the knowledge that if they gain enough skill they could be long term winners. That's what keeps the ecosystem of poker working.

- Drelda

It sounds like our poker playing program has a different goalpost than I assumed. It really only needs to play to the level of participation to match human-level performance, since it's going to be hard to establish it as an enduring winner in an environment where we don't expect to identify a "best" (as contrasted with chess). So we have more of a Turing test than a "poker solved" situation.
 
It sounds like our poker playing program has a different goalpost than I assumed. It really only needs to play to the level of participation to match human-level performance, since it's going to be hard to establish it as an enduring winner in an environment where we don't expect to identify a "best" (as contrasted with chess). So we have more of a Turing test than a "poker solved" situation.

The reason it's hard to point to a "best" poker player is that to determine that, we either would have to compare their strategies to a GTO solution (they won't share all of the former, and we don't know the latter), or you'd need top players to play a sufficient number of hands against each other. And because that would likely involve the best players at best having a very, very small winrate and an awful lot of variance (potentially very long runs of losing), they have no incentive to do that. As an example, one of the very best no limit hold'em cash players in the world is Sauce123 (or Ben Sulsky, if you prefer). He rarely plays NLH any more because he finds it very difficult to get players to sit and play him for any sustained period. Pros don't play in games where they're likely losers. I think he mostly plays Pot Limit Omaha now, a game which many have switched to at the top level because it's far further from being solved than NLH and larger edges are still possible.

It's not that there isn't a "best" poker player for a given game, it's just that they won't play 500k hands vs. each other for us to be sure. And if they're very very close to each other in standard, even that might not be enough to say for certain.
 
I'm not entirely sure I understand your last paragraphs. The "correct" exploitative strategy (i.e. the play that we would make if we knew exactly what our opponent's response would be) will always yield the maximum ev.
The maximum ev for that hand - but our actions in that hand may affect our opponents' future responses to us - in ways that reduce our ev in those future hands (i.e. the metagame effect). Hence we can try to increase our ev by exploitation - but not to the extent that it radically changes our opponents' future responses to us - particularly if those changes are likely to remove lots of other chances to more quietly exploit.

Take an example - imagine we somehow had perfect knowledge of GTO strategy and a perfect model of our opponents' responses. We are on the river with the worst hand, considering whether to go all-in. Let's say GTO says we should do this only 0.01% of the time - but we happen to know - given our previously solid image and perfect model - that this opponent will definitely fold if we go all in. If we slam 100% of the time here our opponents will quickly adjust and we lose all kinds of more subtle profitable spots where they overfold. However if we do it just 10% of the time - maybe they won't adjust to that - and we can make a lot more than GTO - without anyone noticing.
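The immediate value of that kind of deviation is easy to write down (toy pot and bet sizes, made up purely for illustration):

```python
def river_bluff_ev(pot, bet, fold_prob):
    """Immediate EV of shoving the worst hand on the river:
    win the pot when they fold, lose the bet when they call."""
    return fold_prob * pot - (1 - fold_prob) * bet

pot, bet = 100, 200   # made-up sizes: an overbet shove into a pot of 100
print(river_bluff_ev(pot, bet, fold_prob=1.0))   # +100 vs the opponent who always folds
print(river_bluff_ev(pot, bet, fold_prob=0.5))   # -50 once they start calling half the time
```

The calculation only captures the immediate hand - the whole point above is that the hidden cost arrives later, through the opponent's adjustments.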

Any adjustment from the GTO play will necessarily leave you open to counter strategies and lower your ev vs. an adaptive opponent.
Yes - but most players aren't very good at adapting - so it is possible to exploit them without them effectively countering - if you are subtle about it... Playing GTO is giving every opponent credit to be able to play a perfect counter strategy at all times. Playing maximum exploitation is giving them zero credit to adjust. I'm suggesting there is middle ground - where you increase your ev over GTO - in ways that actual opponents overall aren't able to counter.

Maybe as you say - a GTO strategy would be enough to crush any human for plenty more than the rake - in which case online poker will probably die once that has been developed for full-ring games. If GTO for full ring doesn't beat the rake (maybe it's incredibly tight, for instance) then no one will try to play GTO and there is some hope for online poker (for a bit longer anyway) as well-executed exploitative play may still allow long term winners.

- Drelda
 
The maximum ev for that hand - but our actions in that hand may affect our opponents' future responses to us - in ways that reduce our ev in those future hands (i.e. the metagame effect). Hence we can try to increase our ev by exploitation - but not to the extent that it radically changes our opponents' future responses to us - particularly if those changes are likely to remove lots of other chances to more quietly exploit.

Take an example - imagine we somehow had perfect knowledge of GTO strategy and a perfect model of our opponents' responses. We are on the river with the worst hand, considering whether to go all-in. Let's say GTO says we should do this only 0.01% of the time - but we happen to know - given our previously solid image and perfect model - that this opponent will definitely fold if we go all in. If we slam 100% of the time here our opponents will quickly adjust and we lose all kinds of more subtle profitable spots where they overfold. However if we do it just 10% of the time - maybe they won't adjust to that - and we can make a lot more than GTO - without anyone noticing.

All this is really saying is that if we know our opponent well enough, and they have exploitable weaknesses, we can exploit them. But this is kind of a moot point, because if we know our opponent's strategy and future strategies we don't need GTO at all.


Yes - but most players aren't very good at adapting - so it is possible to exploit them without them effectively countering - if you are subtle about it... Playing GTO is giving every opponent credit to be able to play a perfect counter strategy at all times. Playing maximum exploitation is giving them zero credit to adjust. I'm suggesting there is middle ground - where you increase your ev over GTO - in ways that actual opponents overall aren't able to counter.

Again, this is true but I think it's fairly trivial. And in practical terms, the rise of game theory in poker has been because playing exploitably started losing big. Games now, where players are balancing ranges, playing mixed strategies with their ranges, and attempting to be unexploitable are far tougher than even say ten years ago.

Maybe as you say - a GTO strategy would be enough to crush any human for plenty more than the rake - in which case online poker will probably die once that has been developed for full-ring games. If GTO for full ring doesn't beat the rake (maybe it's incredibly tight, for instance) then no one will try to play GTO and there is some hope for online poker (for a bit longer anyway) as well-executed exploitative play may still allow long term winners.

- Drelda

There was an Ike Haxton (another who is arguably the best cash player in the world) quote where he said something like "Hold'em is a very difficult game to beat, and nobody's even playing that well". What he meant was that we're still so far from solving the game (as I said in my first post, nobody's even proven optimal pre-flop opening ranges for 100bb 6-max) and yet some players' approximations are making it impossible to gain edges over rake in certain spots.

The difficulty in attempting to exploit them is that it would take us many many thousands of hands to learn what their full strategy is and learn how to exploit it, and in that time they may improve. The only solution left in the toughest games has been to move towards a game theoretic approach. It's not in practical doubt that someone like Sauce123's balanced ranges would destroy something like the penny stakes, and we know this because they've beaten far better players.
 
Again, this is true but I think it's fairly trivial. And in practical terms, the rise of game theory in poker has been because playing exploitably started losing big. Games now, where players are balancing ranges, playing mixed strategies with their ranges, and attempting to be unexploitable are far tougher than even say ten years ago.
Sure - I'm not arguing that GTO isn't important or necessary - I'm arguing that GTO isn't the 'ultimate' best strategy. That would be a strategy that makes the most money in the long run. I think it would understand GTO but also deviate from it to profitably exploit when it can get away with it.

In fact it would be a much harder problem to solve than just GTO - it would have to solve GTO (which is currently beyond our reach for multiplayer games) and then solve how to model opponents, model their reactions to our actions, and decide which deviations from GTO are worthwhile.

I agree that current players are far from GTO - and that GTO would represent a better strategy than people play today. But it's not the end of the line - there are strategies that would perform better than GTO against real opponents.

- Drelda
 
Okay, I think I get where you're coming from. It's actually analogous to the approach taken in softer games: playing with some sense of balance (weighting value bets to bluffs, solid opening ranges etc.) while deviating for specific opponents. In, say, a 50nl (25c/50c blinds) online cash game, you're going to be aware that all the players will have lots of exploitable weaknesses, but also that you can't go completely crazy without them catching on - e.g. knowing they 4-bet bluff too much and so constantly raising all-in against them will only work a few times before they start calling.

In that case, I think the answer is that you're right about your hypothetical strategy doing better than GTO, but it only applies in a world in which that opponent can never reach GTO himself.
 
The humans are fighting back! After a good session for them it's now a lot closer:
