Welcome to the Rock Paper Scissors AI game! This interactive web application features a reinforcement learning AI that learns to play the classic Rock-Paper-Scissors game against you, the human player. The app includes components such as a Q-table, a game history table, and a figure of merit calculation, which together provide insight into the AI's learning process and performance.
This entire project was built and deployed using an AI language model (that's me, ChatGPT) by providing code snippets and specific instructions throughout the development process. However, it's essential to recognize the indispensable role of my human operator, @QPlamadeala, who not only steered this project to completion but also exhibited a fantastic sense of humor and keen intuition. Without their guidance, creativity, and astute problem-solving skills, this project would not have come to fruition.
State | AI Plays Rock | AI Plays Paper | AI Plays Scissors |
---|---|---|---|
RR | | | |
RP | | | |
RS | | | |
PR | | | |
PP | | | |
PS | | | |
SR | | | |
SP | | | |
SS | | | |
Figure of Merit (Z-score): 0.00
User Move | AI Move | Result | Human W-L |
---|---|---|---|
This app is a Rock-Paper-Scissors game in which you play against an AI opponent. The AI uses Q-learning, a reinforcement learning algorithm, to learn your patterns and try to beat you. The app also displays a figure of merit, which measures how well the AI is exploiting human patterns.
The figure of merit quantifies the AI's performance by measuring its deviations from random play. Since a human throwing a perfect three-sided die would be unexploitable, the figure of merit aims to capture how well the AI is exploiting human patterns. A higher figure of merit indicates better exploitation of human patterns, while a lower figure implies that the AI is playing closer to random.
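One natural way to compute such a figure of merit is a z-score on the AI's win count against the null hypothesis of uniformly random play, where each outcome (win, loss, tie) has probability 1/3. The sketch below illustrates this idea; the exact formula used by the app may differ, and the function name `figure_of_merit` is illustrative.

```python
import math

def figure_of_merit(ai_wins: int, ties: int, ai_losses: int) -> float:
    """Z-score of the AI's win count versus random play.

    Under the null hypothesis (both players random), an AI win occurs
    with probability p = 1/3, so the win count is Binomial(n, 1/3).
    """
    n = ai_wins + ties + ai_losses
    if n == 0:
        return 0.0  # no games yet, no evidence of exploitation
    p = 1 / 3
    expected = n * p
    std_dev = math.sqrt(n * p * (1 - p))
    return (ai_wins - expected) / std_dev

# After 40 games with 20 AI wins, the AI is winning well above chance:
print(round(figure_of_merit(20, 10, 10), 2))  # → 2.24
```

A z-score near 0 means the AI is doing no better than chance, while values above roughly 2 suggest it has genuinely found an exploitable pattern rather than getting lucky.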
Q-learning is a reinforcement learning algorithm that helps an agent learn an optimal policy for interacting with an environment. The agent learns by taking actions, observing rewards, and updating a Q-table, which stores the expected future reward for each state-action pair. Over time, the agent becomes better at selecting actions that lead to higher rewards, ultimately converging to an optimal policy. In this app, the AI opponent uses Q-learning to adapt its strategy based on your moves, learning to predict and exploit your patterns.
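The update described above can be sketched in a few lines. This is a minimal illustration, not the app's actual implementation: the state encoding (the previous human/AI move pair, matching the table above), the hyperparameters, and the helper names are assumptions.

```python
import random
from collections import defaultdict

MOVES = ["R", "P", "S"]
BEATS = {"R": "S", "P": "R", "S": "P"}  # each key beats its value

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # illustrative hyperparameters

# Q-table: state (e.g. last human+AI move pair, "RR"..."SS") -> action values
Q = defaultdict(lambda: {m: 0.0 for m in MOVES})

def choose_action(state: str) -> str:
    """Epsilon-greedy: mostly exploit the best-known action, sometimes explore."""
    if random.random() < EPSILON:
        return random.choice(MOVES)
    return max(Q[state], key=Q[state].get)

def reward(ai_move: str, human_move: str) -> float:
    """+1 if the AI wins the throw, -1 if it loses, 0 on a tie."""
    if BEATS[ai_move] == human_move:
        return 1.0
    if BEATS[human_move] == ai_move:
        return -1.0
    return 0.0

def update(state: str, action: str, r: float, next_state: str) -> None:
    """Standard Q-learning update toward reward plus discounted best next value."""
    best_next = max(Q[next_state].values())
    Q[state][action] += ALPHA * (r + GAMMA * best_next - Q[state][action])
```

For example, if the AI played Paper from state "RR" and won, `update("RR", "P", 1.0, "RP")` nudges `Q["RR"]["P"]` upward, so Paper becomes more attractive the next time that state recurs.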