How Far Is Video Generation From World Model