What is the difference between value iteration and policy iteration? [closed]
Let’s look at them side by side. The key parts for comparison are highlighted. Figures are from Sutton and Barto’s book, Reinforcement Learning: An Introduction.

Key points:

Policy iteration consists of policy evaluation followed by policy improvement; the two steps are repeated until the policy converges.

Value iteration consists of finding the optimal value function, followed by a single policy extraction step. …
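To make the contrast concrete, here is a minimal sketch of both algorithms in plain Python. It is not taken from the book: the two-state MDP `P`, the discount `GAMMA`, the tolerance `THETA`, and all function names are assumptions chosen for illustration. Transitions are stored as `P[state][action] = [(probability, next_state, reward), ...]`.

```python
# Hypothetical toy MDP (an assumption for illustration, not from the book).
# P[state][action] = list of (probability, next_state, reward).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9    # discount factor (assumed)
THETA = 1e-10  # convergence tolerance for value sweeps (assumed)


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def greedy_policy(P, V, gamma):
    """Policy extraction / improvement: pick the best action under V."""
    return {s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma)) for s in P}


def policy_evaluation(P, policy, gamma, theta):
    """Iteratively compute the value function of a fixed policy."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(P, V, s, policy[s], gamma)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:
            return V


def policy_iteration(P, gamma=GAMMA, theta=THETA):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = {s: next(iter(P[s])) for s in P}  # arbitrary start: first action
    while True:
        V = policy_evaluation(P, policy, gamma, theta)
        improved = greedy_policy(P, V, gamma)
        if improved == policy:
            return policy, V
        policy = improved


def value_iteration(P, gamma=GAMMA, theta=THETA):
    """Sweep Bellman optimality backups, then extract the policy once."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = max(q_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:
            break
    return greedy_policy(P, V, gamma), V


if __name__ == "__main__":
    print("policy iteration:", policy_iteration(P))
    print("value iteration: ", value_iteration(P))
```

In this sketch the structural difference shows up in where `greedy_policy` is called: `policy_iteration` calls it inside the outer loop after every full evaluation, while `value_iteration` folds the `max` over actions into each value sweep and calls `greedy_policy` only once at the end. On this toy MDP both should arrive at the same policy.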