Closing the Loop: Execution-Guided Continuous Generation for Adaptive Model Reasoning
We propose a feedback-driven decoding method in which each generated candidate is iteratively refined using execution traces or reward-based adjustments. By conditioning each new generation on structured feedback from previous attempts, the method drives progressive error reduction and adaptive correction. This approach strengthens model reasoning, mitigates compounding failure modes, and improves convergence in both code generation and reinforcement-learning-based post-training.
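To make the feedback loop concrete, the sketch below shows one minimal form of execution-guided refinement, assuming a caller-supplied `generate_fn` that maps a prompt to candidate code; the function name, prompt format, and feedback template are illustrative assumptions, not the method's actual interface. Each round executes the candidate against tests, captures the resulting execution trace, and conditions the next generation on that trace.

```python
# Illustrative sketch only: `generate_fn`, the prompt format, and the feedback
# template are assumptions for exposition, not the proposed method's API.
import subprocess
import sys
import tempfile
from typing import Callable, Optional

def refine_with_execution_feedback(
    prompt: str,
    test_code: str,
    generate_fn: Callable[[str], str],  # hypothetical: maps a prompt to candidate source code
    max_rounds: int = 4,
) -> Optional[str]:
    """Regenerate a candidate, conditioning each round on the execution
    trace (stderr or a failing assertion) produced by the previous round."""
    feedback = ""
    for _ in range(max_rounds):
        candidate = generate_fn(prompt + feedback)

        # Run the candidate together with the tests in a fresh interpreter.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(candidate + "\n\n" + test_code)
            path = f.name
        try:
            result = subprocess.run(
                [sys.executable, path],
                capture_output=True, text=True, timeout=10,
            )
            passed, trace = result.returncode == 0, result.stderr
        except subprocess.TimeoutExpired:
            passed, trace = False, "Execution timed out after 10 seconds."

        if passed:
            return candidate  # all tests passed: stop refining

        # Structured feedback for the next attempt: the tail of the failing trace.
        tail = trace.splitlines()[-10:]
        feedback = (
            "\n\n# The previous attempt failed with this execution trace:\n"
            + "\n".join("# " + line for line in tail)
            + "\n# Fix the error and regenerate the complete solution.\n"
        )
    return None  # no passing candidate within the refinement budget
```

In the reward-based variant mentioned above, the execution trace would be replaced or augmented by a reward signal from the post-training environment, while the same refine-or-stop loop structure is retained.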