
Leela Chess Zero

Date: 2022-12-13 09:24:22
Tags: Lc0, Zero, Chess, AlphaZero, Leela, MCTS

Leela Chess Zero - Chessprogramming wiki

Leela Chess Zero was initiated and announced by Stockfish co-author Gary Linscott. It is open source. The goal is to build a strong chess-playing entity using the same kind of deep learning and Monte-Carlo tree search (MCTS) techniques as AlphaZero, as described in DeepMind's 2017 and 2018 papers, but with distributed training of the weights of the deep convolutional neural network (CNN, DNN, DCNN).

Lc0 is the actual chess engine: it performs the MCTS and reads the self-taught CNN, whose weights are stored in a separate file. Lc0 is written in C++ and can be compiled for various platforms and backends.

Like AlphaZero, Lc0 evaluates positions using non-linear function approximation based on a deep neural network, rather than the linear function approximation used in classical chess programs. The network takes the board position as input and outputs a position evaluation (QValue) and a vector of move probabilities (PValue, policy). Once trained, the network is combined with a Monte-Carlo tree search (MCTS) that uses the policy to narrow the search down to high-probability moves and uses the value to evaluate positions in the tree, in place of the fast rollout policy that the original AlphaGo used for this purpose. MCTS selection is done by a variation of Rosin's UCT improvement dubbed PUCT (Predictor + UCT).
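To make the selection step concrete, here is a minimal sketch of a PUCT-style rule in Python. It follows the commonly cited AlphaZero form, score = Q + c_puct * P * sqrt(total visits) / (1 + N), and is not taken from the Lc0 source code. The names `Child` and `puct_select`, the constant `c_puct = 1.5`, and the `max(1, ...)` guard for an unvisited parent are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Child:
    """Search statistics for one move from the current node (hypothetical layout)."""
    prior: float             # P: move probability from the policy output
    visits: int = 0          # N: how often this move has been explored
    value_sum: float = 0.0   # sum of values backed up through this move

    @property
    def q(self) -> float:
        # Mean backed-up value; unvisited moves default to 0.0 (an assumption).
        return self.value_sum / self.visits if self.visits else 0.0


def puct_select(children: Dict[str, Child], c_puct: float = 1.5) -> str:
    """Return the move that maximizes Q + U under a PUCT-style rule."""
    total_visits = sum(child.visits for child in children.values())
    # Guard so that priors still break ties when nothing has been visited yet.
    sqrt_total = math.sqrt(max(1, total_visits))

    def score(child: Child) -> float:
        exploration = c_puct * child.prior * sqrt_total / (1 + child.visits)
        return child.q + exploration

    return max(children, key=lambda move: score(children[move]))


children = {
    "e2e4": Child(prior=0.5, visits=10, value_sum=5.0),
    "d2d4": Child(prior=0.3, visits=2, value_sum=0.8),
    "g1f3": Child(prior=0.2),
}
print(puct_select(children))
```

With these made-up statistics the unvisited move gets the largest exploration bonus, so it is selected even though its prior is the lowest. Setting `c_puct` to 0 removes the bonus and the rule falls back to picking the move with the highest mean value.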

When they say that the rollout policy (I believe they borrowed the term "rollout" from backgammon) is a linear softmax function, they're referring to a generalization of the sigmoid function used in logistic regression.
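The relationship between softmax and the sigmoid can be checked numerically. The sketch below uses only the Python standard library; the function names are chosen for illustration. It relies on the identity that a two-way softmax over the scores (z, 0) assigns the first option the probability 1 / (1 + exp(-z)), which is exactly the sigmoid of z.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Logistic function used in binary logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores: List[float]) -> List[float]:
    """Turn arbitrary real scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# With two options and the second score fixed at 0, softmax reduces to sigmoid.
for z in (-3.0, -0.5, 0.0, 1.25, 4.0):
    two_way = softmax([z, 0.0])[0]
    assert abs(two_way - sigmoid(z)) < 1e-12

# With more options, softmax still yields a full probability distribution.
probs = softmax([2.0, 1.0, 0.1])
assert abs(sum(probs) - 1.0) < 1e-12
```

A "linear" softmax policy in this sense would compute each move's score as a weighted sum of features and then pass the scores through `softmax`.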

As in AlphaZero, the Zero suffix means that no knowledge other than the rules of the game is supplied: a superhuman player is built by starting from truly random self-play games and applying reinforcement learning based on the outcomes of those games.

The distributed training is realized with a sophisticated client-server model. The client, written entirely in the Go programming language, incorporates Lc0 to produce self-play games. Controlled by the server, the client may download the latest network, then plays games against itself and uploads them to the server. The server, in turn, regularly produces and distributes new neural network weights once a certain number of games is available from contributors. The training software consists of Python code; the pipeline requires NumPy and TensorFlow running on Linux. The server is written in Go, along with Python and shell scripts.
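The client-server cycle described above can be pictured with a toy, in-memory simulation. This is only a sketch of the control flow, not the real Lc0 client or server: there is no networking, no chess, and no training. The class `ToyServer`, the function `run_client`, and the threshold of three games per network are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyServer:
    """Stand-in for the training server (hypothetical, in-memory only)."""
    games_per_network: int = 3   # assumed threshold before a new network is produced
    network_id: int = 0          # identifier of the latest published network
    pending_games: List[str] = field(default_factory=list)

    def latest_network(self) -> int:
        return self.network_id

    def upload_game(self, game: str) -> None:
        self.pending_games.append(game)
        if len(self.pending_games) >= self.games_per_network:
            # Placeholder for the training step: consume the games, publish a new network.
            self.pending_games.clear()
            self.network_id += 1


def run_client(server: ToyServer, n_games: int) -> List[int]:
    """Play n_games, always fetching the newest network first; return the network ids used."""
    used = []
    for i in range(n_games):
        network = server.latest_network()   # "download" the latest weights
        used.append(network)
        server.upload_game(f"self-play game {i} by network {network}")
    return used


server = ToyServer()
print(run_client(server, 7))   # network ids used for each of the 7 games
```

In the actual project many clients contribute games concurrently, so a new network becomes available to every contributor once enough games have accumulated across all of them.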

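Xiangqi Cyclone NNUE (象棋旋风NNUE) is a xiangqi (Chinese chess) engine designed by the Xiangqi Cyclone development team in 2021; even on an ordinary laptop it can crush human xiangqi champions.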

From: https://www.cnblogs.com/funwithwords/p/16977671.html
