Microsoft Decloaks After Suphx AI Reaches Top Ranking

Today at the 2019 World Artificial Intelligence Conference in Shanghai, Microsoft Global Executive VP Shen Xiangyang announced that Microsoft Research Asia (MSRA) has created the strongest mahjong A.I. to date. Using advanced machine learning algorithms developed by MSRA, “Suphx” has a better grasp of how to deal with riichi’s uncertain information and evolving game state, a step up from mastering perfect-information games like Go and chess.

The Suphx AI reached 10-dan on Tenhou in just four months!

“ⓝ” indicates a non-player, or A.I. account, on Tenhou.

ⓝSuphx (or Super Phoenix) began training on Tenhou in March of this year. After 4,555 games played, Suphx ascended to 10-dan. This beats previous A.I. players “ⓝNAGA25” and “ⓝ爆打”, which peaked at 8-dan and 9-dan, respectively. Because it is limited to free lobbies, ⓝSuphx is unlikely to get an opportunity to rank any higher than the Phoenix room.

Link: ⓝSuphx’s Nodocchi

New A.I. Learning Tools

MSRA attributes part of Suphx’s success to the application of new developments in machine learning that help train artificial intelligence algorithms. Adaptive decision making, prophetic coaching, and full forecasting help Suphx evaluate situations with imperfect information and look ahead for potential paths to success.

(What follows is stated to the best of my understanding based on what I have been able to find. The subject is complex and I am no expert, so forgive me if I miss the mark a little.)

Adaptive decision making is a process of expanding and contracting the field of possibilities that Suphx explores. It keeps a wide field of possibilities open at the beginning of a hand, while little information is available. As more tiles are revealed, the A.I. re-evaluates the game state and narrows its focus accordingly.
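To make the widening and narrowing idea concrete, here is a small Python sketch. It is my own illustration and not Suphx's actual method: the function names, the linear rule, and the width limits are all assumptions. It simply ties the number of candidate discards worth exploring to how many of the 136 tiles are still unseen.

```python
def candidate_width(unseen_tiles, total_tiles=136, min_width=2, max_width=14):
    """Hypothetical rule: explore more candidates while more tiles are unseen."""
    fraction_unseen = max(0.0, min(1.0, unseen_tiles / total_tiles))
    return min_width + round((max_width - min_width) * fraction_unseen)


def narrow_candidates(scores, width):
    """Keep only the `width` highest-scoring candidate discards."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:width]


# Early in a hand most tiles are unseen, so the field stays wide.
early = candidate_width(unseen_tiles=120)
# Late in a hand many tiles are visible, so the field contracts.
late = candidate_width(unseen_tiles=30)

# Made-up scores for three candidate discards (assumption, for illustration).
scores = {"1m": 0.2, "9p": 0.9, "5s": 0.5}
shortlist = narrow_candidates(scores, width=2)
```

The only point of the sketch is that `early` comes out larger than `late`, so the search contracts as information accumulates.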

Prophetic coaching is a training tool that guides the direction of machine learning during self-training. It uses hidden information to keep Suphx’s learning on the right track as it analyzes the available information and statistical outcomes. In effect, the hidden information steers Suphx in the right direction while it learns to play, without that information being given away to it.
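One way to picture training with hidden information that is later withheld is the sketch below. This is an assumption on my part, not MSRA's published procedure: the linear schedule, the feature layout, and every name here are hypothetical. Hidden features are shown in full at the start of training and are masked more and more often until only the visible information remains.

```python
import random


def hidden_keep_probability(step, total_steps):
    """Assumed linear schedule: 1.0 at the start of training, 0.0 at the end."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return max(0.0, 1.0 - step / total_steps)


def mask_hidden(hidden_features, keep_probability, rng):
    """Zero out each hidden feature with probability 1 - keep_probability."""
    return [x if rng.random() < keep_probability else 0.0 for x in hidden_features]


def training_input(visible, hidden, step, total_steps, rng):
    """Concatenate visible features with the (possibly masked) hidden ones."""
    keep = hidden_keep_probability(step, total_steps)
    return list(visible) + mask_hidden(hidden, keep, rng)


rng = random.Random(0)  # seeded so the example is repeatable
visible = [1.0, 0.0]        # e.g. features of tiles the player can see
hidden = [1.0, 1.0, 1.0]    # e.g. features of opponents' concealed tiles

start = training_input(visible, hidden, step=0, total_steps=100, rng=rng)
end = training_input(visible, hidden, step=100, total_steps=100, rng=rng)
```

At step 0 every hidden feature passes through, and at the final step all of them are zeroed, so the finished player depends only on what a real player could see.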

Full forecasting helps Suphx keep the end goal and meta-game in focus. Rather than strictly analyzing past results to inform current decisions, it looks past the current hand to the final outcome of the game. If the goal is to achieve positive points at the end of the game, then first place might be considered optimal, with second place an acceptable result. Suphx weighs point differences and statistical possibilities, and spreads its effort across the remaining rounds. This kind of thinking might allow Suphx to seek a draw or even throw a hand to another player in order to remain in a leading position.
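The placement reasoning can be shown with a bit of arithmetic. The sketch below is only an illustration under my own assumptions: the placement values and the probabilities for each option are made-up numbers, and `expected_placement_value` is a hypothetical helper, not anything taken from Suphx. It compares pushing a hand against folding by the expected value of the final placement.

```python
# Illustrative values for finishing 1st, 2nd, 3rd and 4th (assumed numbers).
PLACEMENT_VALUE = (90.0, 45.0, 0.0, -135.0)


def expected_placement_value(placement_probabilities, values=PLACEMENT_VALUE):
    """Expected end-of-game value given probabilities of each final placement."""
    if len(placement_probabilities) != len(values):
        raise ValueError("need one probability per placement")
    if abs(sum(placement_probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * v for p, v in zip(placement_probabilities, values))


# Made-up forecasts of final placement for two ways of playing the current hand.
options = {
    "push": (0.50, 0.20, 0.10, 0.20),  # more firsts, but more fourths too
    "fold": (0.35, 0.45, 0.15, 0.05),  # fewer firsts, far fewer fourths
}

best = max(options, key=lambda name: expected_placement_value(options[name]))
```

With these numbers, pushing works out to about 27 and folding to about 45, so `best` is "fold". Giving up the hand is the better choice for the game as a whole even though it wins less often.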