Reinforcement Learning — Sandeep Danda

What

Coursework from grad school. A set of notebooks that walk through classic reinforcement learning on small, legible environments before pointing the same methods at a toy stock-trading setup.

Approach

Started with tabular Q-learning and SARSA on gridworlds to build intuition for exploration versus exploitation. Moved to Deep Q-Networks with experience replay for continuous state spaces, and applied the same code to a tiny trading environment built on top of Gymnasium (the Farama Foundation's maintained successor to OpenAI Gym).
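
For readers who have not seen the two tabular methods side by side, here is a minimal sketch of how they differ. It is not taken from the notebooks. The function names, the list-of-lists Q-table, and the default alpha and gamma values are all assumptions made for illustration. The only difference between the two updates is the bootstrap term: Q-learning uses the best next action, SARSA uses the action the policy actually takes next.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon explore (random action), else exploit (greedy)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_learning_update(q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def sarsa_update(q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=0.99):
    """On-policy update: bootstrap from the action actually chosen next."""
    target = r if done else r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


# Hypothetical two-state, two-action table, just to exercise the updates.
q_table = [[0.0, 0.0], [1.0, 2.0]]
rng = random.Random(0)
action = epsilon_greedy(q_table[1], epsilon=0.0, rng=rng)  # greedy pick in state 1
q_learning_update(q_table, s=0, a=1, r=1.0, s_next=1, done=False, alpha=0.5, gamma=0.9)
```

With epsilon set to 0 the policy is purely greedy, and with epsilon set to 1 it is purely random. The exploration versus exploitation trade-off is the choice of a value in between, often decayed over training.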

Stack

Python, NumPy, PyTorch, Jupyter, Gymnasium. No GPU needed for any of it.