Joonkyu Min

I am an undergraduate student at Seoul National University, majoring in Electrical and Computer Engineering. I am currently an intern at CLVR Lab, where I work on robot safety under the supervision of Joseph J. Lim.

Email / GitHub / LinkedIn


What is Value Iteration?

April 13, 2025

In classical dynamic programming (DP) for reinforcement learning, value iteration is an algorithm for computing the optimal value function of a discrete Markov decision process (MDP) whose transition probabilities and rewards are known.

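Starting from an arbitrary initial value function and repeatedly applying the Bellman optimality operator produces a sequence of value functions that converges to the optimal value function $V^*$. Convergence holds for $0 < \gamma < 1$ because the operator is then a $\gamma$-contraction in the max norm, so $V^*$ is its unique fixed point.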
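The update described above can be written as $V_{k+1}(s) = \max_a \big[ R(s, a) + \gamma \sum_{s'} P(s' \mid s, a) V_k(s') \big]$. Below is a minimal sketch of this loop in Python with NumPy. The function name `value_iteration`, the array layouts of `P` and `R`, the stopping tolerance `tol`, and the two-state toy MDP are all assumptions made for illustration; they do not come from the reference.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Tabular value iteration (illustrative sketch).

    P: array of shape (A, S, S), P[a, s, t] = probability of s -> t under action a.
    R: array of shape (S, A), expected immediate reward for taking a in s.
    Returns the approximate optimal values and a greedy policy.
    """

    def q_values(V):
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        return R + gamma * np.einsum("ast,t->sa", P, V)

    V = np.zeros(R.shape[0])
    for _ in range(max_iters):
        # One application of the Bellman optimality operator.
        V_new = q_values(V).max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Greedy policy with respect to the final value estimate.
    policy = q_values(V).argmax(axis=1)
    return V, policy


# Hypothetical toy MDP: two states, actions 0 = "stay" and 1 = "move".
# Staying in state 1 pays reward 1; every other transition pays 0.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay in the current state
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: move to the other state
])
R = np.array([
    [0.0, 0.0],  # state 0: stay, move
    [1.0, 0.0],  # state 1: stay, move
])

V_star, policy = value_iteration(P, R, gamma=0.9)
print(V_star)  # approximately [9, 10]
print(policy)  # state 0 moves, state 1 stays
```

In this toy problem the result can be checked by hand: staying in state 1 forever is worth $1 / (1 - 0.9) = 10$, and moving there from state 0 is worth $0.9 \times 10 = 9$.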

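Value iteration cannot be applied directly to most real-world problems: each sweep visits every state and requires a known transition model, which is infeasible when the state space is large or continuous. It nevertheless underlies modern deep reinforcement learning methods such as Deep Q-Networks (DQN), which approximate the same Bellman optimality backup with a neural network trained on sampled transitions.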
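To make the link to DQN concrete, it may help to write the same backup in terms of the action-value function. The notation here is an assumption for illustration: $Q_\theta$ is a neural network with parameters $\theta$, $\theta^-$ are the parameters of a periodically updated target network, and $(s, a, r, s')$ is a sampled transition.

$$
Q_{k+1}(s, a) = R(s, a) + \gamma \sum_{s'} P(s' \mid s, a) \max_{a'} Q_k(s', a')
$$

$$
y = r + \gamma \max_{a'} Q_{\theta^-}(s', a'), \qquad L(\theta) = \big( y - Q_\theta(s, a) \big)^2
$$

The first line is value iteration on $Q$, with an exact expectation over next states. The second line is, roughly, how DQN replaces that expectation with a single sampled next state $s'$ and replaces the table with a network fitted to the target $y$.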


Reference

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed. The MIT Press, 2018.

