PerfectPhysics is an evaluation suite that tests how well world models and video generation models adhere to well-known physical constants and laws.
The benchmark includes curated physics experiments ranging from motion physics (e.g. gravity, projectile motion) to fluid dynamics (e.g. viscosity, fluid motion). Each scenario will provide initial context frames and will require models to generate the future frames. Core physical constants such as gravitational acceleration, viscosity, friction, and others will then be estimated from the generated videos. Each challenge will focus on a different physics experiment and will be evaluated on a different set of physical constants.
Submissions will be evaluated using an internal pipeline that differs by experiment. For example, motion physics experiments will typically use a SAM-based segmentation mask to determine the depth-calibrated position of the tracked object, from which gravity can be estimated accurately. The goal is to measure whether video generation models respect physical constraints (such as free-fall gravitational acceleration, friction, and viscosity) across the generated frames.
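
To make the gravity example concrete, here is a minimal sketch of how gravitational acceleration could be recovered once per-frame object heights are available. It is not the internal pipeline: the function name `estimate_gravity`, the 30 fps frame rate, and the synthetic trajectory are all assumptions for illustration, and the segmentation and depth-calibration steps are taken as already done. The sketch fits a quadratic to height over time and reads the acceleration off the leading coefficient.

```python
import numpy as np

def estimate_gravity(times_s, heights_m):
    """Fit h(t) = h0 + v0*t - 0.5*g*t^2 and return g in m/s^2.

    Hypothetical helper: assumes heights are already in metres,
    e.g. from depth-calibrated centroids of a segmentation mask.
    """
    # np.polyfit returns coefficients from highest degree to lowest,
    # so the first value is the coefficient of t^2, which equals -g/2.
    a, _v0, _h0 = np.polyfit(times_s, heights_m, deg=2)
    return -2.0 * a

# Synthetic free-fall trajectory standing in for tracked positions
# (assumed 30 fps, 20 frames, drop from 2 m, g = 9.81 m/s^2).
fps = 30.0
t = np.arange(20) / fps
h = 2.0 - 0.5 * 9.81 * t**2

g_hat = estimate_gravity(t, h)
rel_err = abs(g_hat - 9.81) / 9.81  # relative deviation from the reference value
```

On a generated video, the same fit would be applied to the tracked positions, and the deviation of the estimate from the reference constant would indicate how closely the model follows free-fall dynamics.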