What is the optimal algorithm for the game, 2048?

Question

I have recently stumbled upon the game 2048. You merge similar tiles by moving them in any of the four directions to make "bigger" tiles. After each move, a new tile appears at random empty position with value of either2 or 4. The game terminates when all the boxes are filled and there are no moves that can merge tiles, or you create a tile with a value of2048.

One, I need to follow a well-defined strategy to reach the goal. So, I thought of writing a program for it.

My current algorithm:

while (!game_over) {
    for each possible move:
        count_no_of_merges_for_2-tiles and 4-tiles
    choose the move with large number of merges
}

What I am doing is at any point, I will try to merge the tiles with values 2 and 4, that is, I try to have 2 and 4 tiles, as minimum as possible. If I try it this way, all other tiles were automatically getting merged and the strategy seems good.

But, when I actually use this algorithm, I only get around 4000 points before the game terminates. Maximum points AFAIK is slightly more than 20,000 points which is way larger than my current score. Is there a better algorithm than the above?

This is how they should have made the Windows 8 home screen ;) -- Interesting game. The guy who wrote an AI solver claims that he uses alpha-beta searching with adaptive deepening. I wonder how you can apply alpha-beta (or its un-optimized ancestor, minimax) to a non-adversarial game like this one. — 500 - Internal Server Error, Mar 12 at 22:54
@500-InternalServerError: IfI were to implement an AI with alpha-beta game tree pruning, it would be assuming that the new blocks are adversarially placed. It's a worst-case assumption, but might be useful. — Charles, Mar 14 at 20:52

Thomas · Accepted Answer · 2014-04-09 13:12:30Z

I'm the author of the AI program that others have mentioned in this thread. You can view the AI inaction or read the source.

Currently, the program achieves about a 90% win rate running in javascript in the browser on my laptop given about 100 milliseconds of thinking time per move, so while not perfect (yet!) it performs pretty well.

Since the game is a discrete state space, perfect information, turn-based game like chess and checkers, I used the same methods that have been proven to work on those games, namelyminimax search with alpha-beta pruning. Since there is already a lot of info on that algorithm out there, I'll just talk about the two main heuristics that I use in thestatic evaluation function and which formalize many of the intuitions that other people have expressed here.

Monotonicity

This heuristic tries to ensure that the values of the tiles are all either increasing or decreasing along both the left/right and up/down directions. This heuristic alone captures the intuition that many others have mentioned, that higher valued tiles should be clustered in a corner. It will typically prevent smaller valued tiles from getting orphaned and will keep the board very organized, with smaller tiles cascading in and filling up into the larger tiles.

Here's a screenshot of a perfectly monotonic grid. I obtained this by running the algorithm with the eval function set to disregard the other heuristics and only consider monotonicity.

2048智能算法讨论

Smoothness

The above heuristic alone tends to create structures in which adjacent tiles are decreasing in value, but of course in order to merge, adjacent tiles need to be the same value. Therefore, the smoothness heuristic just measures the value difference between neighboring tiles, trying to minimize this count.

A commenter on Hacker News gave an interesting formalization of this idea in terms of graph theory.

Here's a screenshot of a perfectly smooth grid, courtesy of this excellent parody fork.

2048智能算法讨论

Free Tiles

And finally, there is a penalty for having too few free tiles, since options can quickly run out when the game board gets too cramped.

And that's it! Searching through the game space while optimizing these criteria yields remarkably good performance. One advantage to using a generalized approach like this rather than an explicitly coded move strategy is that the algorithm can often find interesting and unexpected solutions. If you watch it run, it will often make surprising but effective moves, like suddenly switching which wall or corner it's building up against.

Edit:

Here's a demonstration of the power of this approach. I uncapped the tile values (so it kept going after reaching 2048) and here is the best result after eight trials.

2048智能算法讨论

Yes, that's a 4096 alongside a 2048. =) That means it achieved the elusive 2048 tile three times on the same board.

I developed a 2048 AI using expectimax optimization, instead of the minimax search used by @ovolve's algorithm. The AI simply performs maximization over all possible moves, followed by expectation over all possible tile spawns (weighted by the probability of the tiles, i.e. 10% for a 4 and 90% for a 2). As far as I'm aware, it is not possible to prune expectimax optimization (except to remove branches that are exceedingly unlikely), and so the algorithm used is a carefully optimized brute force search.

Performance

The AI in its default configuration (max search depth of 8) takes anywhere from 10ms to 200ms to execute a move, depending on the complexity of the board position. In testing, the AI achieves an average move rate of 6-10 moves per second over the course of an entire game. If the search depth is limited to 6 moves, the AI can easily execute 20+ moves per second, which makes for someinteresting watching.

To assess the score performance of the AI, I ran the AI 100 times (connected to the browser game via remote control). For each tile, here are the proportions of games in which that tile was achieved at least once:

The minimum score over all runs was 27536; the maximum score achieved was 377792. The median score is 157652. The AI never failed to obtain the 2048 tile (so it never lost the game even once in 100 games).

Here's the screenshot of the best run:

2048智能算法讨论

This game took 14395 moves over 37 minutes, or an average of 6.6 moves per second.

Implementation

My approach encodes the entire board (16 entries) as a single 64-bit integer (where tiles are the nybbles, i.e. 4-bit chunks). On a 64-bit machine, this enables the entire board to be passed around in a single machine register.

Bit shift operations are used to extract individual rows and columns. A single row or column is a 16-bit quantity, so a table of size 65536 can encode transformations which operate on a single row or column. For example, moves are implemented as 4 lookups into a precomputed "move effect table" which describes how each move affects a single row or column (for example, the "move right" table contains the entry "1122 -> 0023" describing how the row [2,2,4,4] becomes the row [0,0,4,8] when moved to the right).

Scoring is also done using table lookup. The tables contain heuristic scores computed on all possible rows/columns, and the resultant score for a board is simply the sum of the table values across each row and column.

This board representation, along with the table lookup approach for movement and scoring, allows the AI to search a huge number of game states in a short period of time (over 10,000,000 game states per second on my 3-year-old laptop (single-core)).

The expectimax search itself is coded as a recursive search which alternates between "expectation" steps (testing all possible tile spawn locations and values, and weighting their optimized scores by the probability of each possibility), and "maximization" steps (testing all possible moves and selecting the one with the best score). The tree search terminates when it sees a previously-seen position (using atransposition table), when it reaches a predefined depth limit, or when it reaches a board state that is highly unlikely (e.g. it was reached by getting 6 "4" tiles in a row from the starting position). The typical search depth is 4-8 moves.

The heuristic scoring algorithm is very straightforward: open squares count for +20000, and for each row and column, +20000 if its largest value is on the edge (as opposed to being in the two middle squares). The former heuristic performs extremely well by itself (usually getting to 4096), but frequently dies after the maximum piece ends up in the middle 4 tiles. Thus, the latter heuristic guides the largest pieces towards the edges, where they are easier to combine.

That the AI achieves the 16384 tile is a huge milestone; I will be surprised to hear if any human players have achieved 16384 on the official game (i.e. without using tools like savestates or undo). Being able to achieve this result fully 13% of the time makes me optimistic that the 32768 tile is achievable with some work!

You can try the AI for yourself. The code is available at https://github.com/nneonneo/2048-ai.

I have refined the algorithm and beaten the game! It may fail due to simple bad luck close to the end (you are forced to move down, which you should never do, and a tile appears where your highest should be. Just try to keep the top row filled, so moving left does not break the pattern), but basically you end up having a fixed part and a mobile part to play with. This is your objective:

2048智能算法讨论

This is the model I chose by default.

1024 512 256 128
  8   16  32  64
  4   2   x   x
  x   x   x   x

The chosen corner is arbitrary, you basically never press one key (the forbidden move), and if you do, you press the contrary again and try to fix it. For future tiles the model always expects the next random tile to be a 2 and appear on the opposite side to the current model (while the first row is incomplete, on the bottom right corner, once the first row is completed, on the bottom left corner).

Here goes the algorithm. Around 80% wins (it seems it is always possible to win with more "professional" AI techniques, I am not sure about this, though.)

initiateModel();

while(!game_over)
{    
    checkCornerChosen(); // Unimplemented, but it might be an improvement to change the reference point

    for each 3 possible move:
        evaluateResult()
    execute move with best score
    if no move is available, execute forbidden move and undo, recalculateModel()
 }

 evaluateResult() {
     calculatesBestCurrentModel()
     calculates distance to chosen model
     stores result
 }

 calculateBestCurrentModel() {
      (according to the current highest tile acheived and their distribution)
  }

A few pointers on the missing steps. Here: 2048智能算法讨论

The model has changed due to the luck of being closer to the expected model. The model the AI is trying to achieve is

 512 256 128  x
  X   X   x   x
  X   X   x   x
  x   x   x   x

And the chain to get there has become:

 512 256  64  O
  8   16  32  O
  4   x   x   x
  x   x   x   x

The O represent forbidden spaces...

So it will press right, then right again, then (right or top depending on where the 4 has created) then will proceed to complete the chain until it gets:

2048智能算法讨论

So now the model and chain are back to:

 512 256 128  64
  4   8  16   32
  X   X   x   x
  x   x   x   x

Second pointer, it has had bad luck and its main spot has been taken. It is likely that it will fail, but it can still achieve it:

2048智能算法讨论

Here the model and chain is:

  O 1024 512 256
  O   O   O  128
  8  16   32  64
  4   x   x   x

When it manages to reach the 128 it gains a whole row is gained again:

  O 1024 512 256
  x   x  128 128
  x   x   x   x
  x   x   x   x

I copy here the content of a post on my blog

The solution I propose is very simple and easy to implement. Although, it has reached the score of 131040. Several benchmarks of the algorithm performances are presented.

2048智能算法讨论

Algorithm

Heuristic scoring algorithm

The assumption on which my algorithm is based is rather simple: if you want to achieve higher score, the board must be kept as tidy as possible. In particular, the optimal setup is given by a linear and monotonic decreasing order of the tile values. This intuition will give you also the upper bound for a tile value: 2048智能算法讨论 where n is the number of tile on the board.

(There's a possibility to reach the 131072 tile if the 4-tile is randomly generated instead of the 2-tile when needed)

Two possible ways of organizing the board are shown in the following images:

2048智能算法讨论

To enforce the ordination of the tiles in a monotonic decreasing order, the score si computed as the sum of the linearized values on the board multiplied by the values of a geometric sequence with common ratio r<1 .

2048智能算法讨论

Several linear path could be evaluated at once, the final score will be the maximum score of any path.

Decision rule

The decision rule implemented is not quite smart, the code in Python is presented here:

@staticmethod
def nextMove(board,recursion_depth=3):
    m,s = AI.nextMoveRecur(board,recursion_depth,recursion_depth)
    return m

@staticmethod
def nextMoveRecur(board,depth,maxDepth,base=0.9):
    bestScore = -1.
    bestMove = 0
    for m in range(1,5):
        if(board.validMove(m)):
            newBoard = copy.deepcopy(board)
            newBoard.move(m,add_tile=True)

            score = AI.evaluate(newBoard)
            if depth != 0:
                my_m,my_s = AI.nextMoveRecur(newBoard,depth-1,maxDepth)
                score += my_s*pow(base,maxDepth-depth+1)

            if(score > bestScore):
                bestMove = m
                bestScore = score
    return (bestMove,bestScore);

An implementation of the minmax or the Expectiminimax will surely improve the algorithm. Obviously a more sophisticated decision rule will slow down the algorithm and it will require some time to be implemented.I will try a minimax implementation in the near future. (stay tuned)

Benchmark

T1 - 121 tests - 8 different paths - r=0.125
T2 - 122 tests - 8-different paths - r=0.25
T3 - 132 tests - 8-different paths - r=0.5
T4 - 211 tests - 2-different paths - r=0.125
T5 - 274 tests - 2-different paths - r=0.25
T6 - 211 tests - 2-different paths - r=0.5

2048智能算法讨论

In case of T2, four tests in ten generate the 4096 tile with an average score of 2048智能算法讨论 42000

Code

The code can be found on GiHub at the following link: https://github.com/Nicola17/term2048-AI It is based on term2048 and it's written in Python. I will implement a more efficient version in C++ as soon as possible.

I became interested in the idea of an AI for this game containing no hard-coded intelligence (i.e no heuristics, scoring functions etc). The AI should"know" only the game rules, and "figure out" the game play. This is in contrast to most AIs (like the ones in this thread) where the game play is essentially brute force steered by a scoring function representing human understanding of the game.

AI Algorithm

I found a simple yet surprisingly good playing algorithm: To determine the move for a given board, the AI plays the game til the end usingrandom moves. This is done many times while keeping track of the end score. Then the average end scoreper starting move is calculated. The move with the highest score is chosen.

Using just 100 runs per move the AI achieves the 2048 tile 80% of the times and the 4096 tile 50% of the times. Using 10000 runs gets the 2048 tile 100%, 70% for 4096 tile, and about 1% for the 8192 tile.

See it in action

The best achieved score is shown here:

2048智能算法讨论

An interesting fact about this AI is that the random-play games are (unsurprisingly) quite bad, yet choosing the best (or least bad) move leads to very good game play: A typical AI game can reach 70000 points and last 3000 moves, yet the random-play runs yield an average of 340 extra points and only 40 moves before dying. (You can see this for yourself by running the AI and opening the debug console.)

This graph illustrates this point: The blue line shows the board score after each move. The red line shows the AI's best end game score at that point. In essence, the red values are "pulling" the blue values upwards towards them, as they are the AI's best guess. Note that the red line is just a tiny bit over the blue line at each point, yet the blue line continues to increase more and more.

2048智能算法讨论

I find this quite surprising, that the AI doesn't need to actually foresee good game play in order to chose the moves that produce it.

Searching later I found this algorithm might be classified as a Pure Monte Carlo Tree Search algorithm.

Implementation and Links

First I created a JavaScript version which can be seen in action here. This version can run 100's of runs in decent time. Open the console for extra info. (source)

Later, in order to play around some more I used @nneonneo highly optimized infrastructure and implemented my version in C++. This version allows for up to 100000 runs per move and even 1000000 if you have the patience. Building instructions provided. It runs in the console and also has a remote-control to play the web version. (source)

Results

Surprisingly, increasing the number of runs does not drastically improve the game play. There seems to be a limit to this strategy at around 80000 points with the 4096 tile and all the smaller ones, very close to the achieving the 8192 tile. Increasing the number of runs from 100 to 100000 increases the odds of getting to around this score limit (from 5% to 40%) but not breaking through it.

Running 10000 runs with a temporary increase to 1000000 near critical positions managed to break this barrier less than 1% of the times achieving a max score of 129892 and a 8192 tile.

Improvements

After implementing this algorithm I tried many improvements including using the min or max scores, or a combination of min,max,and avg. I also tried using depth: Instead of trying K runs per move, I tried K moves per movelist of a given length ("up,up,left" for example) and selecting the first move of the best scoring move list.

Later I implemented a scoring tree that took into account the conditional probability of being able to play a move after a given move list.

However, none of these ideas showed any real advantage over the simple first idea. I left the code for these ideas commented out in the C++ code.

I did add a "Deep Search" mechanism that increased the run number temporarily to 1000000 when any of the runs managed to accidentally reach the next highest tile. This offered a time improvement.

I'd be interested to hear if anyone has other improvement ideas that maintain the domain-independence of the AI.

2048 Variants and Clones

Just for fun, I've also implemented the AI as a bookmarklet, hooking into the game's controls. This allows the AI to work with the original game andmany of its variants.

This is possible due to domain-independent nature of the AI. Some of the variants are quite distinct, such as the Hexagonal clone.

I think I found an algorithm which works quite well, as I often reach scores over 10000, my personal best being around 16000. My solution does not aim at keeping biggest numbers in a corner, but to keep it in the top row.

Please see the code below:

while( !game_over ) {
    move_direction=up;
    if( !move_is_possible(up) ) {
        if( move_is_possible(right) && move_is_possible(left) ){
            if( number_of_empty_cells_after_moves(left,up) > number_of_empty_cells_after_moves(right,up) ) 
                move_direction = left;
            else
                move_direction = right;
        } else if ( move_is_possible(left) ){
            move_direction = left;
        } else if ( move_is_possible(right) ){
            move_direction = right;
        } else {
            move_direction = down;
        }
    }
    do_move(move_direction);
}

Algorithm

while(!game_over)
{
    for each possible move:
        evaluate next state

    choose the maximum evaluation
}

Evaluation

Evaluation =
    128 (Constant)
    + (Number of Spaces x 128)
    + Sum of faces adjacent to a space { (1/face) x 4096 }
    + Sum of other faces { log(face) x 4 }
    + (Number of possible next moves x 256)
    + (Number of aligned values x 2)

Evaluation Details

128 (Constant)

This is a constant, used as a base-line and for other uses like testing.

+ (Number of Spaces x 128)

More spaces makes the state more flexible, we multiply by 128 (which is the median) since a grid filled with 128 faces is an optimal impossible state.

+ Sum of faces adjacent to a space { (1/face) x 4096 }

Here we evaluate faces that have the possibility to getting to merge, by evaluating them backwardly, tile 2 become of value 2048, while tile 2048 is evaluated 2.

+ Sum of other faces { log(face) x 4 }

In here we still need to check for stacked values, but in a lesser way that doesn't interrupt the flexibility parameters, so we have the sum of { x in [4,44] }.

+ (Number of possible next moves x 256)

A state is more flexible if it has more freedom of possible transitions.

+ (Number of aligned values x 2)

This is a simplified check of the possibility of having merges within that state, without making a look-ahead.

Note: The constants can be tweaked..

There is already an AI implementation for this game: here. Excerpt from README:

The algorithm is iterative deepening depth first alpha-beta search. The evaluation function tries to keep the rows and columns monotonic (either all decreasing or increasing) while minimizing the number of tiles on the grid.

There is also a discussion on ycombinator about this algorithm that you may find useful.

I wrote a 2048 solver in Haskell, mainly because I'm learning this language right now.

My implementation of the game slightly differs from the actual game, in that a new tile is always a '2' (rather than 90% 2 and 10% 4). And that the new tile is not random, but always the first available one from the top left.

As a consequence, this solver is deterministic.

I used an exhaustive algorithm that favours empty tiles. It performs pretty quickly for depth 1-4, but on depth 5 it gets rather slow at a around 1 second per move.

Below is the code implementing the solving algorithm. The grid is represented as a 16-length array of Integers. And scoring is done simply by counting the number of empty squares.

bestMove :: Int -> [Int] -> Int
bestMove depth grid = maxTuple [ (gridValue depth (takeTurn x grid), x) | x <- [0..3], takeTurn x grid /= [] ]

gridValue :: Int -> [Int] -> Int
gridValue _ [] = -1
gridValue 0 grid = length $ filter (==0) grid  -- <= SCORING
gridValue depth grid = maxInList [ gridValue (depth-1) (takeTurn x grid) | x <- [0..3] ]

I thinks it's quite successful for its simplicity. The result it reaches when starting with an empty grid and solving at depth 5 is:

Move 4006
[2,64,16,4]
[16,4096,128,512]
[2048,64,1024,16]
[2,4,16,2]

Game Over

Source code can be found here: https://github.com/popovitsj/2048-haskell

My strategy was simple:

Choose to make the highest tile in bottom-right corner
Never swipe the tiles up
Allways have 4 tiles on the bottom of the screen

2048智能算法讨论

up vote-4down vote

This algorithm is not optimal for winning the game, but it is fairly optimal in terms of performance and amount of code needed:

  if(can move neither right, up or down)
    direction = left
  else
  {
    do
    {
      direction = random from (right, down, up)
    }
    while(can not move in "direction")
  }

answered Mar 14 at 21:53

API-Beast
27818

11

Why does this (in your opinion) work? – aoeu Mar 18 at 19:04

3

It's a joke.... – Felipe Tonello Mar 19 at 18:39

5

it works better if you say


random from (right, right, right, down, down, up)

so not all moves are of equal probability. :) – Daren Mar 20 at 15:48

2

Yes, it is based on my own observation with the game. Until you have to use the 4th direction the game will practically solve itself without any kind of observation. This "AI" should be able to get to 512/1024 without checking the exact value of any block. – API-Beast Mar 20 at 18:44

2

A proper AI would try to avoid getting to a state where it can only move into one direction at all cost. – API-Beast Mar 20 at 18:47

show5 more comments

You can treat the computer placing the '2' and '4' tiles as the 'opponent'. — Wei Yen, Mar 15 at 2:53
@WeiYen Sure, but regarding it as a minmax problem is not faithful to the game logic, because the computer is placing tiles randomly with certain probabilities, rather than intentionally minimising the score. — koo, Mar 15 at 14:55
Even though the AI is randomly placing the tiles, the goal is not to lose. Getting unlucky is the same thing as the opponent choosing the worst move for you. The "min" part means that you try to play conservatively so that there are no awful moves that you could get unlucky. — FryGuy, Mar 16 at 4:17
I had an idea to create a fork of 2048, where the computer instead of placing the 2s and 4s randomly uses your AI to determine where to put the values. The result: sheer impossibleness. Can be tried out here:sztupy.github.io/2048-Hard — SztupY, Mar 17 at 1:03
@SztupY Wow, this is evil. Reminds me ofqntm.org/hatetris Hatetris, which also tries to place the piece that will improve your situation the least. — Patashu, Mar 17 at 2:27

秒客网

2048智能算法讨论

基础算法

Minimax

问题

基本思路

解题

Alpha-beta剪枝

针对2048游戏的实现

建模

格局评价

单调性

平滑性

空格数

孤立空格数

对对方选择的剪枝

搜索深度

算法的改进

参考文献

What is the optimal algorithm for the game, 2048?

11 Answers

Monotonicity

Smoothness

Free Tiles

Edit:

Performance

Implementation

Algorithm

Benchmark

Code

AI Algorithm

Implementation and Links

Results

Improvements

2048 Variants and Clones

相关文章