Reinforcement Learning Demo with a Tower of Hanoi Game and Dialogic Scaffolds
Ollama: Connecting...
Tower of Hanoi — Scaffold Training
Moves: 0
Optimal: 7
Compliance: —
A
B
C
New Game
Request Hint
Undo
Scaffold Agent
Welcome! I'm your adaptive learning assistant. Try to solve the Tower of Hanoi puzzle by moving all disks from tower A to tower C.
Rule reminder:
You can only move one disk at a time, and you cannot place a larger disk on top of a smaller one.
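The rule above and the "Optimal" counter can be illustrated with a minimal sketch. It assumes a three-disk puzzle, which is consistent with an optimal count of 7 (2^3 - 1) but is not stated by the demo itself. The function names `solve_hanoi` and `is_legal` are hypothetical and are not taken from the demo's code.

```python
def solve_hanoi(n, source="A", target="C", spare="B"):
    """Return a shortest list of (disk, from_tower, to_tower) moves for n disks."""
    if n == 0:
        return []
    # Move n-1 disks out of the way, move the largest disk, then restack.
    return (
        solve_hanoi(n - 1, source, spare, target)
        + [(n, source, target)]
        + solve_hanoi(n - 1, spare, target, source)
    )


def is_legal(moves, n, source="A"):
    """Replay moves and check both rules: one top disk at a time,
    and never a larger disk on top of a smaller one."""
    towers = {"A": [], "B": [], "C": []}
    towers[source] = list(range(n, 0, -1))  # largest disk at the bottom
    for disk, src, dst in moves:
        # Only the top disk of a tower may be moved.
        if not towers[src] or towers[src][-1] != disk:
            return False
        # The destination's top disk must be larger than the moved disk.
        if towers[dst] and towers[dst][-1] < disk:
            return False
        towers[dst].append(towers[src].pop())
    return True


moves = solve_hanoi(3)  # assumption: the demo uses 3 disks
print(len(moves))       # 7, i.e. 2**3 - 1
print(is_legal(moves, 3))
```

A move counter like the one shown in the game panel could be compared against `len(solve_hanoi(n))` to judge how close a player's solution is to optimal.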
RL Training Metrics & Activity Log
Episodes: 0
Cumulative Reward: 0.00
Policy Loss: —
Scaffold Success Rate: 0%
Training Progress
Thompson Sampling (sample-efficient)
PPO Deep RL (needs many episodes)
Interactions: 0
Trajectories: 0
Avg Return: 0.00
Entropy: 0.00
Activity Log
00:00:00 System initialized