What Makes MetaStone-S1 the Leading Reflective Generative Model for AI Reasoning?
Researchers from MetaStone-AI and USTC introduce MetaStone-S1, a reflective generative model that reaches performance comparable to OpenAI o3-mini through a new Reflective Generative Form.

Key Innovations

Reflective Generative Form

Unified Policy and Reward Modeling: MetaStone-S1 integrates the policy model (which generates reasoning trajectories) and the step-level Process Reward Model (PRM) into a single architecture with shared parameters. This…
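
The shared-parameter idea can be illustrated with a toy sketch. It is not MetaStone-S1's actual architecture, which this excerpt does not describe in detail. It assumes only that one backbone feeds two lightweight heads: a policy head that proposes the next token and a scoring head that rates a reasoning step. Every name here (SharedBackboneModel, policy_probs, step_score) and every size is hypothetical, and the mean-of-embeddings backbone is a stand-in for a real transformer.

```python
import math
import random


class SharedBackboneModel:
    """Toy illustration: one shared backbone, two heads (policy + step score)."""

    def __init__(self, vocab_size=8, hidden=4, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

        # Shared parameters, used by both heads (assumption for illustration).
        self.embed = rand_matrix(vocab_size, hidden)
        # Policy head: maps the hidden state to next-token logits.
        self.policy_head = rand_matrix(hidden, vocab_size)
        # Scoring head: maps the same hidden state to one step-level score.
        self.score_head = [rng.uniform(-1, 1) for _ in range(hidden)]

    def backbone(self, tokens):
        # Placeholder for a transformer: average the token embeddings.
        hidden = len(self.embed[0])
        return [sum(self.embed[t][j] for t in tokens) / len(tokens)
                for j in range(hidden)]

    def policy_probs(self, tokens):
        # Softmax over next-token logits computed from the shared hidden state.
        h = self.backbone(tokens)
        vocab = len(self.policy_head[0])
        logits = [sum(h[j] * self.policy_head[j][v] for j in range(len(h)))
                  for v in range(vocab)]
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def step_score(self, tokens):
        # Sigmoid of a linear readout from the same shared hidden state.
        h = self.backbone(tokens)
        z = sum(a * b for a, b in zip(h, self.score_head))
        return 1.0 / (1.0 + math.exp(-z))


model = SharedBackboneModel()
step = [1, 3, 5]                    # hypothetical token ids for one reasoning step
probs = model.policy_probs(step)    # distribution over the next token
score = model.step_score(step)      # step-level quality score in (0, 1)
```

Because both heads read the same hidden state, scoring a step reuses the computation already done for generation, which is the practical appeal of sharing parameters between the policy and the reward model.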
