BrainTris
Welcome to the BrainTris Lab!
In this application, we can teach a neural network to play Tetris. Will it learn to play better
than a human? The goal of the application is to demonstrate the learning capabilities of neural
networks, specifically the Deep Q-Network (DQN), and to help us understand the Q-Learning method.
The setup is a neural network and a Tetris game. During training, the network "sees" the game
board:
- the fixed, fallen elements (the stack content)
- the position and orientation of the current element in play
- all the holes that have formed during the game
- the next piece
The network's reward and punishment can be tuned through weights on selected metrics of the game
board and on the goal itself, the number of lines cleared.
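As a rough illustration of the Q-Learning idea mentioned above, the sketch below computes the standard one-step Q-Learning target and applies a simple tabular update. It is not BrainTris's actual training code: the function names `td_target` and `q_update`, the discount factor `gamma`, and the step size `alpha` are assumptions chosen for this example. In a DQN, the table update is replaced by a gradient step that moves the network's prediction toward the same target.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-Learning target: r + gamma * max over a' of Q(s', a').

    `next_q_values` holds the estimated Q-values of every action in the
    next state. When the game is over there is no next state, so the
    target is just the reward. (Hypothetical helper, not the app's API.)
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def q_update(q_value, target, alpha=0.1):
    """Move the current estimate a fraction `alpha` toward the target."""
    return q_value + alpha * (target - q_value)


# Example: reward 1.0, two possible next actions valued 0.5 and 2.0.
target = td_target(1.0, [0.5, 2.0], done=False, gamma=0.9)
new_q = q_update(0.0, target, alpha=0.5)
```

The reward fed into such a target is exactly what the reward and punishment settings of the application shape.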
Tetris
Tetris - a timeless classic
Tetris is not just a video game; it's a cultural phenomenon that has captivated players
worldwide for more than four decades. Its simple yet addictive mechanics, as well as its ability
to combine strategic thinking and quick reflexes, make it one of the most iconic and influential
games in history.
The Origins and Concept
Tetris was created in 1984 by Alexey Pajitnov, a Soviet software engineer. The idea originated
from pentominoes, which are geometric shapes consisting of five identical squares. Pajitnov
decided to work with shapes made of four squares, hence the "tetra" prefix, which means four in
Greek. Combined with "tennis," Pajitnov's favorite sport, the name "Tetris" was formed.
The basic concept of the game is extremely simple: blocks of various shapes, called tetrominoes,
fall from the top of the screen. The player must rotate and place these blocks to create
complete, gap-free lines at the bottom of the screen. As soon as a line is completed, it
disappears, scoring points for the player and making room for new blocks. The game ends when the
blocks reach the top of the screen, and there is no more space for new ones.
Gameplay and Strategy
Tetris's addictive appeal lies in being easy to learn but difficult to master. As the game
progresses, the blocks fall faster and faster, putting increasing pressure on the player to make
quick and precise decisions. Successful gameplay requires not only fast fingers but also
strategic planning. The player must think ahead, considering the shapes of the next few incoming
blocks, and place the current block accordingly to maximize the chances of clearing lines.
Particularly important is the T-spin maneuver, where a T-shaped block is rotated into a tight
space, earning bonus points. In addition, the Tetris clear (clearing four lines simultaneously
with a straight "I" block) results in the highest score, and is often the goal of professional
players.
The Impact and Legacy of Tetris
Tetris achieved instant success in the Soviet Union, then quickly spread worldwide after being
licensed and released on various platforms. With the launch of the Nintendo Game Boy handheld
console in 1989, Tetris became explosively popular. The game is often cited as one of the main
reasons for the Game Boy's huge success, and it sold millions of copies worldwide.
Over the years, Tetris has seen countless incarnations, appearing on almost every imaginable
platform, from arcades to mobile phones, modern consoles, and PCs. Competitions are held, it is
the subject of scientific research (for example, the phenomenon known as the "Tetris effect,"
where people see blocks in their minds even while sleeping), and it has become deeply embedded
in popular culture.
Tetris's enduring appeal lies in its universality. No language knowledge is needed to understand
it, and despite its simplicity, it can be endlessly deep and challenging. This is the kind of
game that can be played for five minutes on a bus, or for hours at a time, immersed in the
meditative rhythm of arranging blocks. Tetris is not just a game; it's an enduring puzzle that
has entertained and challenged people for generations, and will likely remain with us for a very
long time.
Game and Learning Modes
The application offers three distinct game modes:
- Human play
- Network training
- Network play
Human play
In this game mode, a human player can try the Tetris game. The game is controlled with the
arrow keys and the space bar.
- [ ← ]: move element left
- [ → ]: move element right
- [ ↓ ]: drop element fast
- [ SPACE ]: rotate element
We get points for dropped elements and cleared lines. After a certain number of cleared lines,
the game levels up, and the falling speed of the elements increases. If the game board (stack)
fills up, the game ends.
Network training (Train mode)
This mode provides an opportunity to train a neural network with the specified parameters. The
network receives visual input from the game board, including fallen elements, the position and
orientation of the current element, the holes that have formed, and the next piece. The
network's reward and punishment depend on the game board's metrics and the number of lines
cleared.
The Train mode includes the following sub-menu items:
- [ Start ]: Starts the network training process.
- [ Save ]: Allows saving the current training state to a file.
- [ Load ]: Allows loading a saved training state from a .zip file.
- [ Select File ]: In the [ Load ] menu, you can browse for the file to be loaded, or drag and
drop it into the drop-zone area.
- [ Options ]: Opens a modal window for setting training parameters. (See the Training Options
section.)
- [ Reset ]: Resets the training state to default settings.
Network play (Play mode)
This mode lets you watch the trained neural network play a real game. Holding down the down
arrow key while the network plays speeds up its game.
Stop
The running process of any game mode can be interrupted with the stop function. The system then
switches to idle, waiting for further commands.
Training Options
Clicking the [Options] button opens a modal window where parameters for training the neural
network can be set. These parameters influence the reward weights and training behavior.
Reward and Punishment
Positive parameter values indicate reward, negative values indicate punishment. When designing
the reward system, we can tune the following characteristics:
- [Completed Lines]: The weight of the reward received for completed lines.
- [Lines Cleared Exponent]: The exponent applied to the number of lines cleared at once. It
controls how steeply the reward grows when several lines are cleared together.
- [Aggregate Height]: Reward/punishment for the total height of the stack (usually negative).
- [Holes]: Reward/punishment for the number of holes (usually negative).
- [Bumpiness]: Reward/punishment for the unevenness of the stack (usually negative).
- [Well Depth]: Reward/punishment for the depth of "wells" (either negative or positive).
- [Penalty Row]: The weight of punishment for penalty rows. If no line is cleared from the
stack for a while, the game punishes the player by inserting a randomly filled bottom row.
We can also associate a negative reward with this during training.
- [Game End Reward]: Reward/punishment received at the end of the game (usually negative).
- [Survival Reward]: The weight of the reward received for survival (successful placement of
an element without ending the game; usually positive).
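To make the weighting idea concrete, here is a minimal sketch of how such a reward could be computed from a board. Everything in it is an assumption made for illustration: the board layout (a list of rows, top row first, 1 = filled cell), the metric definitions, the weight values in `WEIGHTS`, and the formula itself (a weighted sum, with the line count raised to the exponent). The application's real formula may differ, and the Well Depth and Penalty Row terms are left out to keep the example short.

```python
def column_heights(board):
    """Height of each column, measured from the floor to its topmost block."""
    rows, cols = len(board), len(board[0])
    heights = []
    for c in range(cols):
        height = 0
        for r in range(rows):
            if board[r][c]:
                height = rows - r
                break
        heights.append(height)
    return heights


def count_holes(board):
    """Count empty cells that have at least one filled cell above them."""
    holes = 0
    for c in range(len(board[0])):
        seen_block = False
        for r in range(len(board)):
            if board[r][c]:
                seen_block = True
            elif seen_block:
                holes += 1
    return holes


def bumpiness(heights):
    """Sum of height differences between neighbouring columns."""
    return sum(abs(a - b) for a, b in zip(heights, heights[1:]))


# Hypothetical weights: positive values reward, negative values punish.
WEIGHTS = {
    "completed_lines": 10.0,
    "aggregate_height": -0.5,
    "holes": -1.0,
    "bumpiness": -0.2,
    "game_end": -20.0,
    "survival": 1.0,
}
LINES_EXPONENT = 2.0  # hypothetical value for [Lines Cleared Exponent]


def step_reward(board, lines_cleared, game_over,
                weights=WEIGHTS, exponent=LINES_EXPONENT):
    """Weighted sum of board metrics after one piece has been placed."""
    heights = column_heights(board)
    reward = weights["completed_lines"] * (lines_cleared ** exponent)
    reward += weights["aggregate_height"] * sum(heights)
    reward += weights["holes"] * count_holes(board)
    reward += weights["bumpiness"] * bumpiness(heights)
    reward += weights["game_end"] if game_over else weights["survival"]
    return reward


example_board = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]
example_reward = step_reward(example_board, lines_cleared=0, game_over=False)
```

With an exponent above 1, clearing several lines at once is worth more than clearing the same number of lines one at a time, which is one way to nudge the network toward multi-line clears.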
Statistical Moving Average Windows
Moving average windows help us observe the long-term direction of the network's development. If
progress stalls or deteriorates over many games, the network has probably not found a solution
with the current reward system settings.
- [Moving Avg. Window]: The size of the short-term moving average window for statistics.
- [Long-Term Window]: The size of the long-term window for statistics.
The ratio of these two moving averages indicates the direction of progress.
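The ratio of the two averages can be sketched as follows. This is only an illustration under assumptions: the class name `TrendTracker`, the default window sizes, and the choice of "lines cleared per game" as the tracked value are not taken from the application.

```python
from collections import deque


class TrendTracker:
    """Tracks a short-term and a long-term moving average of game results."""

    def __init__(self, short_window=10, long_window=100):
        # deque(maxlen=...) silently drops the oldest value once it is full.
        self.short = deque(maxlen=short_window)
        self.long = deque(maxlen=long_window)

    def add(self, lines_cleared):
        """Record the result of one finished game."""
        self.short.append(lines_cleared)
        self.long.append(lines_cleared)

    def trend(self):
        """Return short average / long average, or None if it is undefined."""
        if not self.long:
            return None
        long_avg = sum(self.long) / len(self.long)
        if long_avg == 0:
            return None
        short_avg = sum(self.short) / len(self.short)
        return short_avg / long_avg


tracker = TrendTracker(short_window=2, long_window=4)
for result in [1, 1, 3, 3]:
    tracker.add(result)
ratio = tracker.trend()  # short average 3.0 divided by long average 2.0
```

Under this definition, a ratio above 1 means recent games are better than the long-term average, and a ratio below 1 means performance is slipping.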
Other Settings
- [Penalty in Training]: Activates penalty rows during training. The added difficulty makes
the network take longer to learn.
- [Extra Tetrominoes]: Adds additional tetrominoes to the game that are not part of classic
Tetris. This adds significant difficulty and lengthens the training process.
- [Force Render to Learn]: Forces the full Tetris game to be rendered during training, which
can affect performance.
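The penalty-row mechanic (a randomly filled row inserted at the bottom of the stack) could look roughly like the sketch below. The board layout (list of rows, top row first, 1 = filled), the function name `insert_penalty_row`, and the rule that the new row always keeps at least one gap are assumptions made for this example, not details confirmed by the application.

```python
import random


def insert_penalty_row(board, rng=random):
    """Return a new board with a random row pushed in at the bottom.

    The existing rows move up by one and the top row is dropped. A real
    game would also check whether blocks were pushed out of the top,
    which would end the game. (Hypothetical helper, not the app's API.)
    """
    cols = len(board[0])
    row = [rng.randint(0, 1) for _ in range(cols)]
    # Assumption: keep at least one gap so the penalty row is never
    # already complete when it appears.
    row[rng.randrange(cols)] = 0
    return board[1:] + [row]


start_board = [
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
]
penalized = insert_penalty_row(start_board, random.Random(0))
```

Passing a seeded `random.Random` instance makes the inserted row reproducible, which is convenient when comparing training runs.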
Operations
- [Reset to Defaults]: Resets all options to their default values.
- [Apply]: Applies the current settings, and starts or continues training.
- [Cancel]: Closes the options window without saving changes.
Training Panel and Statistics
Training Panel
When training mode is active, the "AI Panel" appears on the screen, providing real-time feedback
on the network's performance and the game's status.
AI Vision
Shows how the neural network "sees" the game board.
Numerical statistical data
- Games: Number of games played.
- Lines: Number of lines cleared so far in the current game.
- Avg Rows: Average number of rows cleared, tracked from game to game.
- Trend: Shows the learning trend based on short-term and long-term moving averages.
- Max Lines: The maximum number of lines cleared in a single game so far.
- Max Level: Maximum level reached.
- Upd./sec: Number of network updates per second.
Diagrams
- Average Rows Cleared: A graph showing the evolution of average cleared rows. If it shows an
upward curve, learning is progressing well.
- Loss: A graph of the network's loss function (Mean Absolute Error).
- Convergence: A graph of the average loss function's convergence. If it shows a downward
curve, the network is approaching a solution.
- Learning Rate: A graph of the learning rate, which automatically changes to facilitate the
network's learning. The lower it is, the finer steps the network takes to find the correct
path.
Reward options
We can see the currently set values of the network's reward system.
Lab
Well, here we are at the end of an exciting journey, where the simple yet profound world of
Tetris met the complex challenges of artificial intelligence. With the BrainTris application, my
intention was not just to create a game, but also a laboratory where we can explore the power of
Q-Learning and the learning capabilities of neural networks in practice.
I hope that the tools and detailed documentation I have provided will inspire you to experiment.
Feel free to modify the reward system parameters, observe how different parameters influence the
network's behavior, and discover what strategies the AI adopts to master Tetris.
I wish you much success in your experiments and in creating a neural network that plays as
perfectly as possible! I trust that this application will not only provide entertainment but
also a valuable learning opportunity in the fascinating field of artificial intelligence and
machine learning. Explore the possibilities, and let BrainTris help you understand how an AI
thinks and learns!
Thank you for trying to create a smart artificial intelligence!