“Reinforcement learning may need to go through millions of years in simulation time to actually be able to learn a policy,” Suh adds. Yet physics-based approaches aren’t as effective as reinforcement learning when it comes to contact-rich manipulation planning — Suh and Pang wondered why. They conducted a detailed analysis and found that a technique known as smoothing enables reinforcement learning to perform so well. Reinforcement learning performs smoothing implicitly by trying many contact points and then computing a weighted average of the results. In each instance, their model-based approach achieved the same performance as reinforcement learning, but in a fraction of the time.