Agents are Decision-Makers First: Leveraging Graph of Decisions for Intermediate Reward Modeling

GoD-IRM introduces intermediate reward modeling for structured decision-making in language models, assigning rewards at each divergence point in a reasoning trajectory. This approach enables fine-grained credit assignment, improving model robustness in long-horizon problem-solving. By reinforcing decision-making rather than just final outputs, GoD-IRM aligns language models more closely with traditional agent-based RL.
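The core idea — assigning discounted credit at each divergence point rather than only at the final answer — can be sketched in a few lines. This is an illustrative toy, not GoD-IRM itself; `DecisionNode` and `assign_intermediate_rewards` are hypothetical names, and simple geometric discounting stands in for whatever reward model the paper actually learns.

```python
from dataclasses import dataclass

@dataclass
class DecisionNode:
    """One divergence point: the step taken and the alternatives not taken."""
    chosen: str
    alternatives: list
    reward: float = 0.0  # intermediate reward assigned at this node

def assign_intermediate_rewards(trajectory, final_reward, gamma=0.9):
    """Propagate the outcome reward backward through the trajectory, so each
    earlier decision point receives geometrically discounted credit."""
    g = final_reward
    for node in reversed(trajectory):
        node.reward = g
        g *= gamma
    return trajectory
```

With the final step rewarded in full and each preceding divergence point discounted by `gamma`, every decision in a long-horizon trajectory receives a fine-grained training signal instead of a single end-of-episode reward.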

February 2025 · Diksha Shrivastava, Mann Acharya, Dr. Tapas Badal

Closing the Loop: Execution-Guided Continuous Generation for Adaptive Model Reasoning

We propose a feedback-driven decoding method where each generated candidate is iteratively refined using execution traces or reward-based adjustments. By conditioning generation on structured feedback from previous attempts, the method enforces progressive error minimization and adaptive correction. This approach enhances model reasoning, reduces compounding failure modes, and improves convergence in both code generation and reinforcement learning-based post-training.
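The generate–execute–refine loop described above can be summarized in a short control-flow sketch. This is a minimal stand-in, assuming a caller supplies `generate` (a model call conditioned on prior feedback) and `execute` (a sandbox returning success plus an execution trace); both names are hypothetical, not the paper's API.

```python
def refine_with_feedback(generate, execute, prompt, max_rounds=5):
    """Iteratively refine a candidate by conditioning each new generation
    on the execution trace of the previous attempt."""
    feedback = ""
    candidate = None
    for _ in range(max_rounds):
        candidate = generate(prompt, feedback)
        ok, trace = execute(candidate)
        if ok:
            return candidate          # converged: execution succeeded
        feedback = trace              # structured feedback guides next round
    return candidate                  # best effort after max_rounds
```

Because each round sees the previous trace, errors are corrected progressively rather than compounding across independent samples.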

January 2025 · Diksha Shrivastava, Mann Acharya, Dr. Tapas Badal

Can Language Models Formulate ML Problems?

LLMs struggle to identify ML problems in real-world data, limiting their reliability for analytical tasks. While agentic systems offer partial solutions, true automation requires reasoning over complex systems. This blog examines these challenges and explores a new data representation model as a potential step forward.

November 2024 · Diksha Shrivastava

The Need for Hypotheses Generation Cycles, Similar Link Prediction & Agency for Dynamic Databases

A robust framework for reasoning requires more than memorization; it must dynamically form and refine hypotheses. Inspired by theorem-proving frameworks, I propose a dynamic database with static relationships and evolving entities, enabling hypothesis cycles and similar link prediction. This method allows LLMs to infer hidden relationships across subsystems, addressing challenges in AI-driven scientific discovery and decision-making.
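To make "similar link prediction" concrete, here is a deliberately crude sketch: propose unseen links between entities that share neighbors in the current graph. The function name and the shared-neighbor heuristic are my illustrative assumptions, not the proposed framework, which would refine such candidate links through hypothesis cycles.

```python
from collections import defaultdict

def similar_link_candidates(edges):
    """Propose unseen links between entities that share neighbors, ranked
    by how many neighbors they share (a common-neighbors heuristic)."""
    nbrs = defaultdict(set)
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    candidates = []
    nodes = sorted(nbrs)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if v not in nbrs[u]:            # only propose links not yet present
                shared = nbrs[u] & nbrs[v]
                if shared:
                    candidates.append((u, v, len(shared)))
    return sorted(candidates, key=lambda t: -t[2])
```

In a dynamic database, each proposed link would become a hypothesis to test against evolving entities, closing the generate-and-refine cycle the post argues for.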

November 2024 · Diksha Shrivastava

Developing Swan AI & the Six Graphical Representations for Complex Systems

I developed Swan AI to explore hybrid vector-graph representations for complex, interrelated systems. The goal was a data pipeline enabling AI to search, converse, and query while preserving hierarchical relationships. Existing knowledge graphs and vector databases lacked dynamic dependency modeling, prompting my exploration of six graphical representations, including hybrid vector-graph models and TensorDB. The core research question: Can LLMs infer hidden relationships in unstructured, hierarchical data to automate decision-making?
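A toy version of the hybrid vector-graph idea helps fix intuition: each entity holds an embedding plus explicit parent/child edges, so similarity search can be restricted to a subtree of the hierarchy. `HybridStore` and its cosine-plus-subtree query are a hypothetical sketch, not Swan AI's actual pipeline or TensorDB.

```python
import math

class HybridStore:
    """Toy hybrid vector-graph store: embeddings for similarity search,
    parent/child edges for hierarchy-aware scoping."""

    def __init__(self):
        self.vecs = {}       # entity name -> embedding
        self.children = {}   # entity name -> child names

    def add(self, name, vec, parent=None):
        self.vecs[name] = vec
        self.children.setdefault(name, [])
        if parent is not None:
            self.children.setdefault(parent, []).append(name)

    def query(self, vec, under=None):
        """Rank entities by cosine similarity, optionally restricted to the
        subtree rooted at `under`."""
        scope = self._subtree(under) if under else set(self.vecs)

        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)

        return sorted(((cos(vec, self.vecs[n]), n) for n in scope), reverse=True)

    def _subtree(self, root):
        seen, stack = set(), [root]
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(self.children.get(n, []))
        return seen
```

Scoping retrieval to a subtree is one way to preserve hierarchical relationships that a flat vector database would lose.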

October 2024 · Diksha Shrivastava