Many existing AI methods aren’t able to overcome this conflict, known as the stabilize-avoid problem, and would be unable to reach their goal safely. MIT researchers have developed a new technique that can solve complex stabilize-avoid problems better than other methods. Their approach has two steps. First, they reframe the stabilize-avoid problem as a constrained optimization problem. Second, they reformulate that constrained optimization problem into a mathematical representation known as the epigraph form and solve it using a deep reinforcement learning algorithm. “But deep reinforcement learning isn’t designed to solve the epigraph form of an optimization problem, so we couldn’t just plug it into our problem.