You would probably try approaches like Monte Carlo Tree Search rather than classical reinforcement learning with neural nets or other classifiers (or a combination of both methods). The core idea is to simulate the future of the game by trying random choices for the next input. You can then simulate, let's say, 50 steps into the future and take the step that yields the highest reward under your reward system. Here is a minimal implementation of MCTS for Python and gym Answer from maxawake on reddit.com
🌐
Medium
medium.com › data-science › train-your-own-chess-ai-66b9ca8d71e4
Train Your Own Chess AI. Watch your creation defeat you | by Logan Spears | TDS Archive | Medium
August 17, 2021 - The input, representing the board position, can be encoded using bitboards (64 bits, one for each square on the chess board) for each piece type, plus a few remaining bits for move index, color to move, and en passant square. Together this input data forms a string of 808 bits (1s and 0s) that can be converted to floats and fed into the model directly. Now that we have planned out our inputs, outputs, and general model concept, let's start building a dataset. I play on Lichess.org because it’s free, open source, and run by developers.
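The bitboard idea in the snippet above can be sketched roughly as follows. This is a minimal illustration, not the article's code: it assumes a hypothetical layout of 12 piece planes (6 piece types for each color) of 64 bits each, giving 768 bits, and leaves out the extra bits for move index, color to move, and en passant that make up the article's 808. The function names and the a1 = bit 0 square ordering are assumptions.

```python
# Hypothetical layout: 12 piece planes x 64 squares = 768 bits.
PIECES = "PNBRQKpnbrqk"  # FEN letters: white uppercase, black lowercase


def fen_to_bitboards(fen):
    """Return {piece_letter: 64-bit int} from the piece-placement field of a FEN."""
    boards = {p: 0 for p in PIECES}
    placement = fen.split()[0]
    for rank_index, row in enumerate(placement.split("/")):
        rank = 7 - rank_index  # FEN lists rank 8 first
        file = 0
        for ch in row:
            if ch.isdigit():
                file += int(ch)  # a digit is a run of empty squares
            else:
                boards[ch] |= 1 << (rank * 8 + file)  # a1 is bit 0, h8 is bit 63
                file += 1
    return boards


def bitboards_to_floats(boards):
    """Flatten the bitboards into a list of 768 floats (0.0 or 1.0)."""
    return [float((boards[p] >> sq) & 1) for p in PIECES for sq in range(64)]


START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
features = bitboards_to_floats(fen_to_bitboards(START_FEN))
```

A list like `features` could be passed to a model as-is; a real encoder would append the remaining state bits the snippet mentions.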
🌐
LazyGuy-_-'s Website
lazy-guy.github.io › blog › chessllama
Chess Llama - Training a tiny Llama model to play chess
Chess Llama was trained using Hugging Face Transformers. The training consisted of 5 epochs with a batch size of 16, on a single Nvidia L4 GPU for 18 hours, using Google Cloud's Vertex AI platform.
Discussions

How do you train Agent for something like Chess or Game of the Generals?
🌐 r/reinforcementlearning
17
9
May 6, 2021
Machine Learning in chess
I want to dip my toe into machine learning and I want to do it in chess. True, we already have the open-source `LC0` and `Stockfish NNUE`, but I think the level of `ML` used there is very specialized, deep, and niche. I don't need exceptional chess strength, but instead I'm looking ... More on chess.com
🌐 chess.com
July 7, 2021
How would I go about making a chess engine using NN..
What you're describing isn't too far off AlphaZero, which uses the complementary strengths of search and neural nets. The idea is that you train a neural network to predict the probability of winning from a given board state, then use Monte Carlo tree search to simulate the game playing out in different ways, where the value of each future state reached is evaluated using the neural network. The network is trained using reinforcement learning, and AlphaZero marked the first time neural nets had been used in chess with great results. DeepMind also published a paper this year on training an engine using supervised learning: the model takes the board state as input and outputs the next move. It learnt from 10 million example games generated by Stockfish and is nowhere near as good as AlphaZero, but still plays at grandmaster level. More on reddit.com
🌐 r/learnmachinelearning
11
4
July 15, 2024
Training an AI chess model- Dataset with list of FEN boards?
I would read the AlphaZero paper; maybe it gives some clues about how to represent the chess board. Or ask on the Leela or Stockfish Discord. As for a dataset of FEN boards, you can generate your own by getting a PGN file with thousands or millions of games and then generating a FEN for each move. I would read the AlphaZero paper first, though, to see if that's even desirable. I believe AlphaZero used some form of reinforcement learning with self-play, so it wasn't exactly taking in random FENs; it was playing itself and updating the model based on the results of the games, or something like that. More on reddit.com
🌐 r/chess
3
1
December 10, 2022
🌐
Medium
medium.com › @nihalpuram › training-a-chess-ai-using-tensorflow-e795e1377af2
Training a Chess AI using TensorFlow | by Nihal Puram | Medium
September 14, 2021 - Training a Chess AI using TensorFlow Using the power of deep learning and Stockfish to train a neural network to play the game of chess. Why am I doing this? To put it plainly, I love chess. Don’t …
🌐
Maia Chess
maiachess.com
Maia Chess
Maia is a neural network chess model that captures human style. Enjoy realistic games, insightful analysis, and a new way of seeing chess.
🌐
Xebia
xebia.com › home › blog › no rules, just data: how my chess engine learned the game on its own
Train Chess Engine With Data-driven Learning And No Preset Rules
July 5, 2025 - Predicting the outcome of a position gives the model strategic context that pure move prediction lacks. What amazed me most was that Knightmare learned to play chess without ever being told what chess is. No hardcoded rules. No concepts of "control the center" or "develop your pieces."
🌐
GitHub
github.com › zjeffer › chess-deep-rl
GitHub - zjeffer/chess-deep-rl: Research project: create a chess engine using Deep Reinforcement Learning · GitHub
The client: https://ghcr.io/zjeffer/chess-rl_selfplay-client:latest · To know whether the new network is better than the previous one, let the new network play a large number of games against the previous best. Whichever wins the most games is the new best network. Use that network for self-play again, and repeat indefinitely. I tried this with the newest network against a completely random neural network. These are the results after 10 games. Evaluated models: Model 1 = models/randommodel.h5, Model 2 = models/model.h5. Results: Model 1: 0, Model 2: 5, Draws: 5.
Starred by 175 users
Forked by 13 users
Languages   Jupyter Notebook 73.0% | Python 23.7% | TeX 3.0% | Shell 0.3%
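The "new network versus previous best" loop described in this repository's snippet can be sketched roughly as below. This is not the repository's code: `pick_best`, `toy_game`, the alternating colors, and the rule that a tie keeps the incumbent are all assumptions made for illustration, and the "players" are plain numbers standing in for networks.

```python
import random


def pick_best(candidate, best, play_game, n_games=100):
    """Play n_games and return (winner, (wins, losses, draws)) from the candidate's view.

    play_game(candidate, best, candidate_is_white) must return
    +1 if the candidate wins, -1 if it loses, and 0 for a draw.
    """
    wins = losses = draws = 0
    for i in range(n_games):
        candidate_is_white = (i % 2 == 0)  # alternate colors for fairness
        result = play_game(candidate, best, candidate_is_white)
        if result > 0:
            wins += 1
        elif result < 0:
            losses += 1
        else:
            draws += 1
    # Assumption: the incumbent keeps its place unless the candidate wins more games.
    winner = candidate if wins > losses else best
    return winner, (wins, losses, draws)


def toy_game(candidate, best, candidate_is_white, rng=random.Random(0)):
    """Stand-in for a real game: players are just numeric 'strengths'."""
    if rng.random() < 0.2:  # some games are drawn
        return 0
    p_win = candidate / (candidate + best)
    return 1 if rng.random() < p_win else -1


winner, tally = pick_best(candidate=3.0, best=1.0, play_game=toy_game, n_games=100)
```

In a real pipeline `play_game` would run an actual game between two loaded models, and the winner would generate the next round of self-play data.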
🌐
Reddit
reddit.com › r/reinforcementlearning › how do you train agent for something like chess or game of the generals?
r/reinforcementlearning on Reddit: How do you train Agent for something like Chess or Game of the Generals?
May 6, 2021 -

I was thinking of building an environment and testing some RL methods on a game called Game of the Generals using OpenAI Gym. But my biggest question is how to train the agent.

To train it, my intuition is that I need tons of replays of the game being played encoded into something that can be digested by the code, right?

How do you train something like chess or Game of the Generals on its own? Is it possible?

Top answer
1 of 3
8
You would probably try approaches like Monte Carlo Tree Search rather than classical reinforcement learning with neural nets or other classifiers (or a combination of both methods). The core idea is to simulate the future of the game by trying random choices for the next input. You can then simulate, let's say, 50 steps into the future and take the step that yields the highest reward under your reward system. Here is a minimal implementation of MCTS for Python and gym
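The idea in this answer (try each next move, play randomly for up to 50 steps, keep the move with the best average reward) can be sketched as follows. Note that this is flat Monte Carlo search, a simpler relative of full MCTS with no search tree and no UCB selection, and it is not the implementation the answer links to. `LineWorld` and its `legal_actions()` / `step()` interface are a hypothetical toy environment, not the gym API.

```python
import copy
import random


class LineWorld:
    """Toy stand-in for a game: walk on a line; reaching +3 wins (+1), -3 loses (-1)."""

    def __init__(self):
        self.pos = 0

    def legal_actions(self):
        return [-1, +1]

    def step(self, action):
        self.pos += action
        done = abs(self.pos) == 3
        reward = (1.0 if self.pos == 3 else -1.0) if done else 0.0
        return reward, done


def rollout(env, first_action, depth, rng):
    """Take first_action on a copy of env, then play randomly for up to `depth` steps."""
    sim = copy.deepcopy(env)  # never mutate the real game state
    reward, done = sim.step(first_action)
    total, steps = reward, 1
    while not done and steps < depth:
        reward, done = sim.step(rng.choice(sim.legal_actions()))
        total += reward
        steps += 1
    return total


def choose_action(env, n_rollouts=200, depth=50, rng=None):
    """Return the legal action with the highest average rollout reward."""
    rng = rng or random.Random()

    def average_return(action):
        return sum(rollout(env, action, depth, rng) for _ in range(n_rollouts)) / n_rollouts

    return max(env.legal_actions(), key=average_return)


best_move = choose_action(LineWorld(), rng=random.Random(0))
```

Full MCTS would additionally keep statistics per visited state and bias later simulations toward promising branches rather than sampling every first move uniformly.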
2 of 3
2
I can speak for chess (because, to be honest, I am working on a project like that). The way I did it, or to be fair the way I am trying to do it, is using DQN. As input I take the chess position and encode it (using either the bitmap representation or the features representation). I feed that to the NN and expect the move to play as output. The output is actually a Q-value for each move (action), and I choose the move with the highest Q-value. In the training phase, I adjust the Q-values using the rewards and the Q-values from the next position, and by doing that the agent starts learning. Where can you get the training examples? You get them by storing the agent's attempts at playing the game (you store the position, action, reward, and next position). At first these attempts are more or less random, but after some training and adjustment of the Q-values, the agent's play becomes much more accurate. But for games like chess (which has a huge number of states, approx. 10^45), the training phase can become daunting because you need millions of games or even more for the DQN to start converging.
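The update this answer describes (store position, action, reward, and next position, then nudge Q-values toward the reward plus the best next Q-value) can be sketched as below. As a loudly stated simplification, a dictionary Q-table stands in for the neural network a DQN would use, and `ACTIONS`, `alpha`, and `gamma` are hypothetical toy values rather than anything from the answer.

```python
import random
from collections import defaultdict, deque

ACTIONS = [-1, +1]             # hypothetical toy action set
replay = deque(maxlen=10_000)  # stores (state, action, reward, next_state, done)
q = defaultdict(float)         # Q-table standing in for the network


def choose(state, epsilon, rng):
    """Epsilon-greedy: mostly pick the highest Q-value, sometimes explore."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(transition, alpha=0.5, gamma=0.9):
    """Move Q(state, action) toward reward + gamma * max over a' of Q(next_state, a')."""
    state, action, reward, next_state, done = transition
    future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * future
    q[(state, action)] += alpha * (target - q[(state, action)])


def train_from_replay(batch_size, rng):
    """Sample stored attempts and apply the update to each one."""
    batch = rng.sample(list(replay), min(batch_size, len(replay)))
    for transition in batch:
        q_update(transition)


# One stored attempt: from state 0, action +1 ended the game with reward 1.
replay.append((0, +1, 1.0, 1, True))
q_update(replay[-1])
```

In an actual DQN the table lookup becomes a forward pass over the encoded position, and `q_update` becomes a gradient step on the squared difference between the prediction and `target`.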
🌐
Carlini
nicholas.carlini.com › writing › 2023 › chess-llm.html
Playing chess with large language models - Nicholas Carlini
September 22, 2023 - Building a chess bot that queries GPT-3.5-turbo-instruct to play chess at the level of a skilled human player.
🌐
Chess.com
chess.com › forum › view › general › machine-learning-in-chess
Machine Learning in chess - Chess Forums - Chess.com
July 7, 2021 - It learned how to play by using the base rules to figure out what actually worked and what did not. It literally learned by trial and error. ... #1 "requiring to train trillions of training data before an engine improves in strength" AlphaZero reached top grandmaster strength by playing 700000 games against itself.
🌐
YouTube
youtube.com › major league hacking
Train an AI to Play Chess (Part 1) - YouTube
Today we're going to start building an AI to play chess. We'll look at how to format a chess board for a machine learning model, learn about a useful chess p...
Published   May 2, 2023
Views   1K
🌐
arXiv
arxiv.org › html › 2409.12272v2
Mastering Chess with a Transformer Model
December 29, 2025 - Transformer models have demonstrated impressive capabilities when trained at scale, excelling at difficult cognitive tasks requiring complex reasoning and rational decision-making. In this paper, we explore the application of transformers to chess, focusing on the critical role of the position ...
🌐
Reddit
reddit.com › r/learnmachinelearning › how would i go about making a chess engine using nn..
r/learnmachinelearning on Reddit: How would I go about making a chess engine using NN..
July 15, 2024 -

I'm very new to neural networks and machine learning in general, but I have managed to make one from scratch that teaches an agent to figure out the direction to its food (it sort of works). I then wondered how a chess engine could be made using a neural network. One suggestion I got was to calculate all the possible moves in a given board state and simulate each game based on a rank output by the network. This sounded computationally expensive. Another method I found is to feed in data from many real-world chess games in some sort of notation or format. This could be a specific board state followed by a change, and the neural network might use something like gradient descent to adjust weights and biases based on many board states in many chess matches. This is my picture of how to make a chess engine, and I know it could be wrong (please correct me). I'm just exploring this area, and this is a question I've wanted to ask.

🌐
freeCodeCamp
freecodecamp.org › news › create-a-self-playing-ai-chess-engine-from-scratch
Create a Self-Playing AI Chess Engine from Scratch with Imitation Learning
September 21, 2023 - You can copy the file path from the folder structure, or if you are on Windows 11 you can press ctrl + shift + c to automatically copy the file path. Great! Now you have Stockfish available in Python! Now you need a dataset so you can train the AI chess engine! You can do this by making Stockfish play games and remembering each position and the moves you could take from there.
🌐
Substack
paulabartabajo.substack.com › p › lets-build-a-chess-game-using-small
Let's build a Chess game using a small and local Language Model
September 6, 2025 - The data is in PGN format, which is a standard format for chess games. Feel free to expand the dataset to include more games by downloading other players' games from this URL. ... For example, the extracted data for Magnus Carlsen is stored in a JSON file that we push to the Hugging Face Hub as Paulescu/MagnusInstruct. ... We used supervised fine-tuning to train the model to "imitate" Magnus.
🌐
Noctie
noctie.ai
Noctie.ai (chess AI)
Noctie is a chess AI that mimics human play from beginner to grandmaster level. That means, humanlike opening choice, mistakes and move timings. Trained on billions of games between humans and carefully fine-tuned over several years.
🌐
GitHub
github.com › k-lombard › Deep-Learning-Chess-AI
GitHub - k-lombard/Deep-Learning-Chess-AI · GitHub
After implementing the rest of the AI, namely the alpha-beta search algorithm that identifies a move by comparing positions with our trained neural net, we then assessed the accuracy of the moves that the AI would output. This metric was more subjective, as we were not able to objectively judge the quality of a move. However, based on our personal knowledge of chess and chess analysis software, we were able to estimate the quality of our chess AI’s moves. One final way of measuring the quality of our model was to have our AI play opponents of varying strengths and assign an Elo rating to the engine.
Starred by 9 users
Forked by 3 users
Languages   Python 96.8% | C 2.5% | Cython 0.6% | Fortran 0.1% | Shell 0.0% | PowerShell 0.0%
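The combination mentioned in this repository's snippet, alpha-beta search with positions scored by a trained network, can be sketched roughly as follows. This is a generic textbook alpha-beta over a toy tree of nested lists, not the repository's code: `children` and `evaluate` are hypothetical callbacks, and `evaluate` here just reads a number where a real engine would call its neural net.

```python
import math


def alphabeta(node, depth, alpha, beta, maximizing, evaluate, children):
    """Minimax value of `node`, skipping branches that cannot change the result."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)  # a real engine would query the trained net here
    if maximizing:
        value = -math.inf
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False, evaluate, children))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing side will never allow this line
        return value
    value = math.inf
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True, evaluate, children))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing side already has something better
    return value


def best_move(root, depth, evaluate, children):
    """Index of the root child with the highest alpha-beta value."""
    scores = [
        alphabeta(child, depth - 1, -math.inf, math.inf, False, evaluate, children)
        for child in children(root)
    ]
    return scores.index(max(scores))


# Toy tree: inner lists are positions, numbers are leaf evaluations.
tree = [[3, 5], [2, 9]]
kids_of = lambda node: node if isinstance(node, list) else []
score_of = lambda node: float(node) if not isinstance(node, list) else 0.0

root_value = alphabeta(tree, 2, -math.inf, math.inf, True, score_of, kids_of)
```

In this toy tree the opponent would answer the first move with 3 and the second with 2, so the search prefers the first move.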
🌐
Tel Aviv University
cs.tau.ac.il › ~wolf › papers › deepchess.pdf
DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess
The training relies entirely on datasets of several million chess games, and no further domain specific knowledge is incorporated. The experiments show that the resulting neural network (referred to as DeepChess) is on a par with state-of-the-art chess playing programs, which
🌐
Chessprogramming
chessprogramming.org › Deep_Learning
Deep Learning - Chessprogramming wiki
In 2015, Matthew Lai trained Giraffe's deep neural network by TD-Leaf [11]. Zurichess by Alexandru Moșoi uses the TensorFlow library for automated tuning - in a two-layer neural network, the second layer is responsible for a tapered eval to phase endgame and middlegame scores [12]. In 2016, Omid E. David, Nathan S. Netanyahu, and Lior Wolf introduced DeepChess, obtaining grandmaster-level chess playing performance using a learning method incorporating two deep neural networks, which are trained using a combination of unsupervised pretraining and supervised training.