Framing Our Experience With Artificial Intelligence
An academic paper has been circulating recently describing an artificial intelligence deep learning model that attempts to improve the accuracy of AI decision making in imperfect information games, in this case using riichi as the game of choice. It's full of discipline-specific language and information that will take some unpacking unless you're already familiar with the field, but I've included some links at the end of this article to help you out if needed. (You can find the original paper on Cornell University's arXiv server.)
What interested me enough to write about it was the inclusion of past scenarios as part of the decision-making matrix and how that models human learning. (Which it should, I suppose, since that's what deep learning is meant to do.) The AI model stores information on its own hand and the discards and calls of all players for up to six previous hands, as well as data points on riichi and kans for the past hand only. (Page 5, paragraph 3) This information provides a rolling short-term memory of game states that informs decisions about what to discard and what might be too risky, while accounting for changes in opponent strategies as their standing in the game changes. "Learning" from this information happens over time as the accuracy of the AI's predictions is evaluated against past results and the system attempts to optimize itself toward a given objective; the more data, the better.
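To make the idea a little more concrete, here is a minimal sketch of what a rolling short-term memory of game states could look like in code. This is my own illustration, not the paper's architecture: the class and field names are hypothetical, and the actual model encodes this information as features fed into a neural network rather than storing it in a plain Python structure.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class HandRecord:
    """One hand's worth of observations (field names are illustrative, not from the paper)."""
    own_tiles: list          # the AI's own hand at each decision point
    discards: dict           # seat -> list of discarded tiles
    calls: dict              # seat -> list of chi/pon/kan calls
    riichi: dict | None = None  # riichi declarations, kept for the most recent hand only
    kans: dict | None = None    # kan details, kept for the most recent hand only


class RollingMemory:
    """Keeps a short window of recent hands, loosely mirroring the paper's
    use of up to six previous hands of discard and call data."""

    def __init__(self, window: int = 6):
        self._hands: deque[HandRecord] = deque(maxlen=window)

    def add_hand(self, record: HandRecord) -> None:
        # Older hands fall out automatically once the window is full.
        # Riichi/kan details only matter for the most recent hand,
        # so strip them from everything that is no longer newest.
        for old in self._hands:
            old.riichi = None
            old.kans = None
        self._hands.append(record)

    def as_features(self) -> list[HandRecord]:
        """Return the window oldest-to-newest, ready to be encoded into
        whatever tensor format a model might expect."""
        return list(self._hands)
```

The deque with `maxlen=6` is doing the "rolling" part of the work here: each new hand pushes the oldest one out, which is the same forgetting behavior the paper builds into its input features.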
I am struck by how this spells out and attempts to codify something many of us likely take for granted: experience.
(If you have a refined knowledge of human learning and neural pathways, please forgive me as I paint with some pretty broad strokes now.)
Humans are blessed with short-term and long-term memories, both of which are tapped during decision making but which play different roles in our day-to-day considerations. Long-term memory provides information built on the foundation of everything an individual has been through before, stored in the substrate of the brain as neurons form and pathways are reinforced when similar experiences repeat over time.
Short-term memory provides context for the immediate situation, the stuff that has just happened. The brain takes immediate observations and makes predictions about what might happen next by passing those observations through short- and long-term memory and extracting patterns, all of which is later folded into the body of experience during sleep.
My point in bringing this up is that playing mahjong involves a very complex series of decisions in an ever-changing field of imperfect information. You can never know with 100% certainty that the choice you are making with each discard is the right one. Studying theory and memorizing statistics can help sharpen your focus, but there will never be a better way to learn than by building your own body of experience.
The more data, the better, so get out there and play.
(Oh, also: the AI model described scored 1850 on Tenhou over 300 hanchan, compared to an average of 1600 scored by intermediate human players over the same number of games.)
Comment below!
Reference Articles on Artificial Intelligence