Nature's Insight: A Novel Framework and Comprehensive Analysis of Agentic Reasoning Through the Lens of Neuroscience


Awesome-Neuroscience-Agent-Reasoning

📢 News

Figure: The proposed neuroscience-inspired framework for agentic reasoning. The left panel illustrates the human brain’s reasoning process, where sensory inputs are processed through modality-specific cortices and integrated in higher association areas such as the parietal and prefrontal cortices. This enables abstract reasoning and decision-making, supported by predictive coding mechanisms and memory retrieval from the hippocampus. Inspired by this cognitive flow, the right panel presents a corresponding architecture for AI agents, consisting of sensory input, multi-level information processing, foundational understanding (via foundation models), factual memory storage (knowledge base), and a centralized reasoning module for adaptive and context-aware decision-making. White arrows denote top-down predictive signals based on predictive coding; black arrows represent the forward reasoning process; and dashed lines indicate the conceptual mapping between human brain functions and agent modules.
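To make the mapping concrete, the sketch below wires the right-panel modules together as a tiny Python pipeline: modality-specific encoding, foundation-model-style understanding, knowledge-base retrieval, a centralized reasoning step, and a top-down predictive signal. It is only a minimal illustration of the flow described in the caption; all class and method names are hypothetical and not taken from the paper or any released code.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Factual memory store, loosely analogous to hippocampal memory retrieval."""
    facts: dict = field(default_factory=dict)

    def retrieve(self, query: str) -> str:
        return self.facts.get(query, "no stored fact")


class NeuroInspiredAgent:
    """Toy agent wiring together the modules named in the figure caption."""

    def __init__(self, knowledge_base: KnowledgeBase):
        self.kb = knowledge_base
        self.prediction = None  # top-down expectation (predictive coding, white arrows)

    def encode(self, modality: str, raw_input: str) -> str:
        """Modality-specific processing (visual / auditory / tactile 'cortices')."""
        return f"[{modality}] {raw_input}"

    def understand(self, percepts: list) -> str:
        """Foundational understanding; a stand-in for a foundation model."""
        return " + ".join(percepts)

    def reason(self, situation: str, goal: str) -> str:
        """Centralized reasoning module: integrates percepts, memory, and the goal."""
        memory = self.kb.retrieve(goal)                       # factual memory storage
        decision = f"act on '{goal}' given {situation}; memory: {memory}"
        self.prediction = f"expected outcome of: {decision}"  # fed back to perception
        return decision


if __name__ == "__main__":
    kb = KnowledgeBase(facts={"make tea": "the kettle is in the kitchen"})
    agent = NeuroInspiredAgent(kb)
    percepts = [agent.encode("vision", "kitchen scene"),
                agent.encode("audition", "boiling sound")]
    print(agent.reason(agent.understand(percepts), goal="make tea"))
```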

Figure: Overview of the reasoning process and classification of reasoning behaviors from a neuro-perspective. This diagram presents a comprehensive framework of reasoning inspired by human cognitive and neural mechanisms. At the center, a hierarchical reasoning pipeline, spanning sensory data input, information processing, higher-order cognition, and conclusion generation, mirrors the flow of information in biological systems. Surrounding this core are five major categories of reasoning behaviors: perceptual reasoning, driven by multisensory integration; dimensional reasoning, encompassing spatial and temporal inference; relational reasoning, involving analogical thinking and relational matching; logical reasoning, covering inductive, deductive, and abductive logic; and interactive reasoning, focusing on agent-agent and agent-human collaboration within dynamic environments. Together, these components establish a neuro-cognitively grounded taxonomy that bridges biological inspiration and computational implementation in artificial intelligence systems.

Latest Reasoning Surveys

  • Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey (arXiv 2025) Paper Code
  • Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models (arXiv 2025) Paper Code
  • From System 1 to System 2: A Survey of Reasoning Large Language Models (arXiv 2025) Paper Code
  • Logical Reasoning in Large Language Models: A Survey (arXiv 2025) Paper
  • Towards reasoning era: A survey of long chain-of-thought for reasoning large language models (arXiv 2025) Paper Code
  • Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models (arXiv 2025) Paper
  • A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond (arXiv 2025) Paper Code
  • A Survey of Reasoning with Foundation Models (arXiv 2023) Paper Code

Agent Reasoning Framework

Figure: Taxonomy of Agentic Reasoning Techniques Inspired by Neuroscience. This hierarchical structure organizes reasoning methods in artificial agents according to cognitive mechanisms inspired by neuroscience, including dimensional, perceptual, logical, and interactive reasoning, and highlights how biologically plausible mechanisms can be integrated into artificial intelligence systems so that agents emulate human-like reasoning across diverse tasks and environments.
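The same taxonomy, as it organizes the sections of this list, can be summarized in machine-readable form. The mapping below is only a hedged sketch of that structure; the category and subtype names follow the headings used in this README and are not an artifact released with the paper.

```python
# Top-level structure of the neuroscience-inspired taxonomy, mirroring the
# section headings of this list. Illustrative only.
AGENTIC_REASONING_TAXONOMY = {
    "perception-based": ["visual", "lingual", "auditory", "tactile"],
    "dimension-based": ["spatial", "temporal"],
    "logic-based": ["inductive", "deductive", "abductive"],
    "interaction-based": ["agent-agent", "agent-human"],
}

if __name__ == "__main__":
    for category, subtypes in AGENTIC_REASONING_TAXONOMY.items():
        print(f"{category} reasoning: {', '.join(subtypes)}")
```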

Perception-based Reasoning

Part 1: Visual Reasoning

VLM based

  • GeReA: Question-Aware Prompt Captions for Knowledge-based Visual Question Answering (arXiv 2024) Paper Code
  • LISA: Reasoning segmentation via large language model (CVPR 2024) Paper Code
  • KN-VLM: KNowledge-guided Vision-and-Language Model for visual abductive reasoning (Research Square 2025) Paper
  • Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge (ECCV 2024) Paper Code

LLM based

  • Large language models are visual reasoning coordinators (NeurIPS 2023) Paper Code
  • Enhancing LLM Reasoning via Vision-Augmented Prompting (NeurIPS 2024) Paper
  • Improving zero-shot visual question answering via large language models with reasoning question prompts (ACM 2023) Paper Code
  • Visual Chain-of-Thought Prompting for Knowledge-Based Visual Reasoning (AAAI 2024) Paper
  • Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning (NeurIPS 2024) Paper Code
  • Visual chain of thought: bridging logical gaps with multimodal infillings (arXiv 2023) Paper Code
  • End-to-End Chart Summarization via Visual Chain-of-Thought in Vision-Language Models (arXiv 2025) Paper
  • LLaVA-o1: Let vision language models reason step-by-step (arXiv 2024) Paper Code

Neuro-symbolic based

  • ExoViP: Step-by-step Verification and Exploration with Exoskeleton Modules for Compositional Visual Reasoning (COLM 2024) Paper Code
  • Visual programming: Compositional visual reasoning without training (CVPR 2023) Paper Code
  • ViperGPT: Visual inference via Python execution for reasoning (CVPR 2023) Paper Code

RL based

  • HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning (ECCV 2024) Paper Code
  • Vision-R1: Incentivizing reasoning capability in multimodal large language models (arXiv 2025) Paper Code Code
  • Visual-RFT: Visual reinforcement fine-tuning (arXiv 2025) Paper Code
  • MedVLM-R1: Incentivizing medical reasoning capability of vision-language models (VLMs) via reinforcement learning (arXiv 2025) Paper Code
  • VLM-RL: A Unified Vision Language Models and Reinforcement Learning Framework for Safe Autonomous Driving (arXiv 2024) Paper Code

Part 2: Lingual Reasoning

CoT based

  • Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (NeurIPS 2022) Paper
  • Self-Consistency Improves Chain of Thought Reasoning in Language Models (ICLR 2023) Paper
  • Tree of Thoughts: Deliberate Problem Solving with Large Language Models (NeurIPS 2023) Paper Code
  • Graph of Thoughts: Solving Elaborate Problems with Large Language Models (AAAI 2024) Paper Code
  • Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data (EMNLP 2023) Paper Code
  • Active Prompting with Chain-of-Thought for Large Language Models (ACL 2023) Paper Code
  • Large Language Models Are Reasoning Teachers (ACL 2023) Paper Code
  • Chain of Code: Reasoning with a Language Model-Augmented Code Emulator (ICML 2024) Paper Code
  • Abstraction-of-Thought Makes Language Models Better Reasoners (EMNLP 2024) Paper Code
  • Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic (COLING 2024) Paper Code
  • Path-of-Thoughts: Extracting and Following Paths for Robust Relational Reasoning with Large Language Models (arXiv 2024) Paper
  • Stepwise Self-Consistent Mathematical Reasoning with Large Language Models (arXiv 2024) Paper Code
  • Chain-of-Thought Reasoning Without Prompting (arXiv 2024) Paper
  • Interleaved-Modal Chain-of-Thought (CVPR 2025) Paper Code
  • CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning (arXiv 2025) Paper
  • Chain of Draft: Thinking Faster by Writing Less (arXiv 2025) Paper Code

RL based

  • Making Large Language Models Better Reasoners with Step-Aware Verifier (arXiv 2023) Paper Code
  • Large Language Models Cannot Self-Correct Reasoning Yet (ICLR 2024) Paper
  • Free Process Rewards without Process Labels (arXiv 2024) Paper Code
  • Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations (arXiv 2024) Paper Code
  • Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents (arXiv 2024) Paper Code
  • Self-playing Adversarial Language Game Enhances LLM Reasoning (NeurIPS 2024) Paper Code
  • Does RLHF Scale? Exploring the Impacts From Data, Model, and Method (arXiv 2024) Paper
  • OVM, Outcome-supervised Value Models for Planning in Mathematical Reasoning (NAACL 2024) Paper Code
  • Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs (arXiv 2024) Paper Code
  • AutoPSV: Automated Process-Supervised Verifier (arXiv 2024) Paper Code
  • ReST-MCTS: LLM Self-Training via Process Reward Guided Tree Search (arXiv 2024) Paper Code
  • Improve Mathematical Reasoning in Language Models by Automated Process Supervision (arXiv 2024) Paper Code
  • DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (arXiv 2025) Paper Code
  • Reasoning with Reinforced Functional Token Tuning (arXiv 2025) Paper Code
  • Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning (arXiv 2025) Paper Code
  • Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling (arXiv 2025) Paper Code
  • Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search (arXiv 2025) Paper Code
  • Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning (arXiv 2025) Paper Code
  • QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search (arXiv 2025) Paper
  • DHP: Discrete Hierarchical Planning for Hierarchical Reinforcement Learning Agents (arXiv 2025) Paper
  • DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails (arXiv 2025) Paper Code
  • On the Emergence of Thinking in LLMs I: Searching for the Right Intuition (arXiv 2025) Paper
  • Kimi k1.5: Scaling Reinforcement Learning with LLMs (arXiv 2025) Paper Code

Part 3: Auditory Reasoning

Model/Multimodal Integration

  • Joint audio and speech understanding (IEEE ASRU 2023) Paper Code
  • Listen, think, and understand (ICLR 2024) Paper Code
  • Toward Explainable Physical Audiovisual Commonsense Reasoning (ACMMM 2024) Paper
  • BAT: Learning to Reason about Spatial Sounds with Large Language Models (ICML 2024) Paper Code
  • GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities (arXiv 2024) Paper Code
  • What do MLLMs hear? Examining reasoning with text and sound components in Multimodal Large Language Models (arXiv 2024) Paper

Counterfactual Learning

  • Disentangled counterfactual learning for physical audiovisual commonsense reasoning (NeurIPS 2024) Paper
  • Learning Audio Concepts from Counterfactual Natural Language (ICASSP 2024) Paper

Part 4: Tactile Reasoning

  • Beyond Sight: Finetuning Generalist Robot Policies with Heterogeneous Sensors via Language Grounding (arXiv 2025) Paper Code
  • Octopi: Object Property Reasoning with Large Tactile-Language Models (arXiv 2024) Paper Code
  • TALON: Improving Large Language Model Cognition with Tactility-Vision Fusion (ICIEA 2024) Paper
  • Vision-language model-based physical reasoning for robot liquid perception (IROS 2024) Paper

Dimension-based Reasoning

Part 5: Spatial Reasoning

  • Visual Spatial Reasoning (TACL 2023) Paper
  • SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities (CVPR 2024) Paper
  • Large Language Models are Visual Reasoning Coordinators (NeurIPS 2023) Paper
  • Is a Picture Worth a Thousand Words? Delving into Spatial Reasoning for Vision-Language Models (NeurIPS 2024) Paper
  • Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs (CVPR 2024) Paper
  • Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark (AAAI 2024) Paper
  • SpatialPIN: Enhancing Spatial Reasoning Capabilities of Vision-Language Models through Prompting and Interacting 3D Priors (NeurIPS 2024) Paper
  • SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models (NeurIPS 2024) Paper
  • Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation (AAAI 2025) Paper
  • Metric Reasoning in Large Language Models (ACM GIS 2024) Paper
  • Weakly-supervised 3D Spatial Reasoning for Text-based Visual Question Answering (IEEE TIP 2023) Paper
  • Towards Grounded Visual Spatial Reasoning in Multi-Modal Vision Language Models (DMLR @ICLR 2024) Paper
  • StarCraftImage: A Dataset for Prototyping Spatial Reasoning Methods for Multi-Agent Environments (CVPR 2023) Paper
  • A Spatial Hierarchical Reasoning Network for Remote Sensing Visual Question Answering (IEEE 2023) Paper
  • Structured Spatial Reasoning with Open Vocabulary Object Detectors (arXiv 2024) Paper
  • A Pilot Evaluation of ChatGPT and DALL-E 2 on Decision Making and Spatial Reasoning (arXiv 2023) Paper
  • SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning (arXiv 2025) Paper Code
  • Dialectical Language Model Evaluation: An Initial Appraisal of the Commonsense Spatial Reasoning Abilities of LLMs (arXiv 2023) Paper
  • Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning (IJCAI 2024) Paper
  • What's "Up" with Vision-Language Models? Investigating Their Struggle with Spatial Reasoning (EMNLP 2023) Paper Code
  • Reasoning Paths with Reference Objects Elicit Quantitative Spatial Reasoning in Large Vision-Language Models (arXiv 2024) Paper Code
  • Chain-of-Symbol Prompting For Spatial Reasoning in Large Language Models (COLM 2024) Paper
  • GRASP: A Grid-Based Benchmark for Evaluating Commonsense Spatial Reasoning (arXiv 2024) Paper
  • Graph-Based Spatial Reasoning for Tracking Landmarks in Dynamic Laparoscopic Environments (IEEE RA-L) Paper
  • TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation (arXiv 2024) Paper
  • End-to-End Navigation with Vision Language Models: Transforming Spatial Reasoning into Question-Answering (arXiv 2024) Paper Code
  • I Know About "Up"! Enhancing Spatial Reasoning in Visual Language Models Through 3D Reconstruction (arXiv 2024) Paper Code
  • Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning (arXiv 2024) Paper Code

Part 6: Temporal Reasoning

LLM based

  • Text-to-ECG: 12-Lead Electrocardiogram Synthesis Conditioned on Clinical Text Reports (ICASSP 2023) Paper
  • Can Brain Signals Reveal Inner Alignment with Human Languages (EMNLP 2023 Findings) Paper Code
  • TempoGPT: Enhancing Temporal Reasoning via Quantizing Embedding (arXiv 2025) Paper Code
  • PromptCast: A New Prompt-based Learning Paradigm for Time Series Forecasting(IEEE TKDE 2023) Paper
  • Large Language Models Can Learn Temporal Reasoning (ACL 2024) Paper
  • Back to the future: Towards explainable temporal reasoning with large language models (WWW 2024) Paper
  • Enhancing Temporal Sensitivity and Reasoning for Time-Sensitive Question Answering (EMNLP 2024 Findings) Paper
  • Temporal Reasoning Transfer from Text to Video (ICLR 2025) Paper Code
  • Timo: Towards Better Temporal Reasoning for Language Models (COLM 2024) Paper Code
  • Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning (ICML 2024) Paper Code
  • Getting Sick After Seeing a Doctor? Diagnosing and Mitigating Knowledge Conflicts in Event Temporal Reasoning (NAACL 2024 Findings) Paper
  • Temporal reasoning for timeline summarisation in social media (arXiv 2024) Paper
  • Video LLMs for Temporal Reasoning in Long Videos (arXiv 2024) Paper
  • Enhancing temporal knowledge graph forecasting with large language models via chain-of-history reasoning (ACL 2024 Findings) Paper

Graph based

  • Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs (ICML 2017) Paper
  • Event Graph Guided Compositional Spatial-Temporal Reasoning for Video Question Answering (IEEE TIP 2024) Paper Code
  • Temporal knowledge graph reasoning with historical contrastive learning (AAAI 2023) Paper
  • Temporal inductive path neural network for temporal knowledge graph reasoning (Artificial Intelligence 2024) Paper
  • Large language models-guided dynamic adaptation for temporal knowledge graph reasoning (NeurIPS 2024) Paper
  • An improving reasoning network for complex question answering over temporal knowledge graphs (Applied Intelligence 2023) Paper
  • Once Upon a Time in Graph: Relative-Time Pretraining for Complex Temporal Reasoning (EMNLP 2023) Paper Code
  • Timegraphs: Graph-based temporal reasoning (arXiv 2024) Paper
  • Search from History and Reason for Future: Two-stage Reasoning on Temporal Knowledge Graphs (ACL 2021) Paper
  • Temporal knowledge graph reasoning based on evolutional representation learning (SIGIR 2021) Paper
  • TempoQR: Temporal Question Reasoning over Knowledge Graphs (AAAI 2022) Paper
  • Learning to Sample and Aggregate: Few-shot Reasoning over Temporal Knowledge Graphs (NeurIPS 2022) Paper
  • THCN: A Hawkes Process Based Temporal Causal Convolutional Network for Extrapolation Reasoning in Temporal Knowledge Graphs (TKDE 2024) Paper

Symbolic based

  • Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution (CVPR 2021) Paper Code
  • TEILP: Time prediction over knowledge graphs via logical reasoning (AAAI 2024) Paper Code
  • Self-Supervised Logic Induction for Explainable Fuzzy Temporal Commonsense Reasoning (AAAI 2023) Paper

Logic-based Reasoning

  • The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision (ICLR 2019) Paper Code
  • DeepLogic: Joint learning of neural perception and logical reasoning (TPAMI 2022) Paper
  • A survey on neural-symbolic learning systems (Neural Networks) Paper
  • Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning (EMNLP 2023 Findings) Paper Code
  • LogicAsker: Evaluating and Improving the Logical Reasoning Ability of Large Language Models (EMNLP 2024) Paper Code
  • Faithful Logical Reasoning via Symbolic Chain-of-Thought (ACL 2024) Paper Code
  • Generalization on the Unseen, Logic Reasoning and Degree Curriculum (JMLR 2024) Paper
  • LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers (EMNLP 2023) Paper Code
  • Complex Logical Reasoning over Knowledge Graphs using Large Language Models (arXiv 2023) Paper Code
  • Improved Logical Reasoning of Language Models via Differentiable Symbolic Programming (ACL 2023 Findings) Paper Code
  • GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models (ICLR 2025) Paper
  • Premise Order Matters in Reasoning with Large Language Models (ICML 2024) Paper

Part 7: Inductive Reasoning

  • Inductive reasoning in humans and large language models (Cognitive Systems Research 2024) Paper
  • Hypothesis Search: Inductive Reasoning with Language Models (ICLR 2024) Paper Code
  • Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement (ICLR 2024) Paper
  • Inductive or Deductive? Rethinking the Fundamental Reasoning Abilities of LLMs (arXiv 2024) Paper
  • Incorporating Context Graph with Logical Reasoning for Inductive Relation Prediction (SIGIR 2022) Paper

Part 8: Deductive Reasoning

  • Audio Entailment: Assessing Deductive Reasoning for Audio Understanding (AAAI 2025) Paper
  • Deductive Verification of Chain-of-Thought Reasoning (NeurIPS 2023) Paper Code
  • Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples (NeurIPS 2023) Paper
  • Certified Deductive Reasoning with Language Models (TMLR 2024) Paper Code
  • How Far Are We from Intelligent Visual Deductive Reasoning? (COLM 2024) Paper Code
  • Learning deductive reasoning from synthetic corpus based on formal logic (ICML 2023) Paper
  • Strategic deductive reasoning in large language models: A dual-agent approach (ICPICS 2024) Paper
  • Multi-Step Deductive Reasoning Over Natural Language: An Empirical Study on Out-of-Distribution Generalisation (IJCLR-NeSy 2022) Paper Code

Part 9: Abductive Reasoning

  • Multi-modal action chain abductive reasoning (ACL 2023) Paper
  • Visual Abductive Reasoning (CVPR 2022) Paper Code
  • Language models can improve event prediction by few-shot abductive reasoning (NeurIPS 2023) Paper
  • Abductive Reasoning in Logical Credal Networks (NeurIPS 2024) Paper
  • Towards Learning Abductive Reasoning Using VSA Distributed Representations (NeSy 2024) Paper Code

Interaction-based Reasoning

Part 10: Reasoning based on Agent-Agent Interaction

  • DERA: Enhancing large language model completions with dialog-enabled resolving agents (arXiv 2024) Paper Code
  • RoCo: Dialectic multi-robot collaboration with large language models (ICRA 2024) Paper
  • ChatEval: Towards better LLM-based evaluators through multi-agent debate (arXiv 2023) Paper
  • Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate (EMNLP 2024) Paper
  • CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation (ICLR 2025) Paper Code
  • Building cooperative embodied agents modularly with large language models (ICLR 2024) Paper Code

Part 11: Reasoning based on Agent-Human Interaction

  • A virtual conversational agent for teens with autism spectrum disorder: Experimental results and design lessons (ACM 2020) Paper
  • PEER: A collaborative language model (arXiv 2022) Paper
  • SAPIEN: Affective Virtual Agents Powered by Large Language Models (ACIIW 2023) Paper
  • Human-level play in the game of Diplomacy by combining language models with strategic reasoning (Science 2022) Paper
  • Language grounded multi-agent reinforcement learning with human-interpretable communication (NeurIPS 2024) Paper

Benchmark

Visual Reasoning

  • VQA: Visual question answering (CVPR 2015) Paper
  • Making the V in VQA matter: Elevating the role of image understanding in visual question answering (CVPR 2017) Paper Code
  • GQA: A new dataset for real-world visual reasoning and compositional question answering (CVPR 2019) Paper
  • Roses Are Red, Violets Are Blue... but Should VQA Expect Them To? (CVPR 2021) Paper
  • A corpus for reasoning about natural language grounded in photographs (arXiv 2018) Paper Code
  • Super-CLEVR: A virtual benchmark to diagnose domain robustness in visual reasoning (CVPR 2023) Paper Code
  • OK-VQA: A visual question answering benchmark requiring external knowledge (CVPR 2019) Paper
  • A-OKVQA: A benchmark for visual question answering using world knowledge (ECCV 2022) Paper Code
  • CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning (CVPR 2017) Paper Code

Lingual Reasoning

  • MR-Ben: A meta-reasoning benchmark for evaluating system-2 thinking in LLMs (arXiv 2024) Paper Code
  • RM-Bench: Benchmarking reward models of language models with subtlety and style (ICLR 2025) Paper Code
  • LR2Bench: Evaluating Long-chain Reflective Reasoning Capabilities of Large Language Models via Constraint Satisfaction Problems (arXiv 2025) Paper Code
  • Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models (arXiv 2025) Paper Code
  • LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion (arXiv 2025) Paper Code
  • BIG-Bench Extra Hard (arXiv 2025) Paper Code
  • ResearchBench: Benchmarking LLMs in scientific discovery via inspiration-based task decomposition (arXiv 2025) Paper
  • MastermindEval: A Simple But Scalable Reasoning Benchmark (arXiv 2025) Paper Code
  • Z1: Efficient Test-time Scaling with Code (arXiv 2025) Paper Code

Auditory Reasoning

  • AudioCaps: Generating captions for audios in the wild (NAACL 2019) Paper Code
  • Clotho: An audio captioning dataset (ICASSP 2020) Paper Code

Tactile Reasoning

  • Transferable tactile transformers for representation learning across diverse sensors and tasks (arXiv 2024) Paper Code
  • Touch100k: A large-scale touch-language-vision dataset for touch-centric multimodal representation (arXiv 2024) Paper Code
  • AnyTouch: Learning unified static-dynamic representation across multiple visuo-tactile sensors (arXiv 2025) Paper Code
  • Beyond sight: Finetuning generalist robot policies with heterogeneous sensors via language grounding (arXiv 2025) Paper Code

Spatial Reasoning

  • RAVEN: A dataset for relational and analogical visual reasoning (CVPR 2019) Paper
  • GRIT: General robust image task benchmark (NeurIPS 2022) Paper Code
  • CoDraw: Collaborative drawing as a testbed for grounded goal-driven communication (ACL 2019) Paper Code
  • Touchdown: Natural language navigation and spatial reasoning in visual street environments (CVPR 2019) Paper Code
  • Vision-and-language navigation: Interpreting visually-grounded navigation instructions in real environments (CVPR 2018) Paper Code
  • SpatialSense: An adversarially crowdsourced benchmark for spatial relation recognition (CVPR 2018) Paper Code

Temporal Reasoning

  • AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning (CVPR 2021) Paper
  • TimeBench: A comprehensive evaluation of temporal reasoning abilities in large language models (ACL 2024) Paper Code
  • TRAM: Benchmarking Temporal Reasoning for Large Language Models (ACL 2024 Findings) Paper Code
  • Towards benchmarking and improving the temporal reasoning capability of large language models (ACL 2023) Paper Code
  • MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models (EMNLP 2023 Findings) Paper Code
  • Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos (arXiv 2024) Paper Code
  • Generic Temporal Reasoning with Differential Analysis and Explanation (ACL 2023) Paper
  • V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning (arXiv 2025) Paper
  • MusTQ: A Temporal Knowledge Graph Question Answering Dataset for Multi-Step Temporal Reasoning (ACL 2024 Findings) Paper

Logical Reasoning

  • CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text (EMNLP 2019) Paper Code
  • ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning (ICLR 2020) Paper Code
  • Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4 (arXiv 2023) Paper Code
  • ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning (ACL 2022 Findings) Paper Code
  • LogiQA 2.0: An improved dataset for logical reasoning in natural language understanding (TASLP 2023) Paper Code
  • The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning (ECCV 2022) Paper Code
  • True Detective: A Deep Abductive Reasoning Benchmark Undoable for GPT-3 and Challenging for GPT-4 (arXiv 2022) Paper Code
  • From LSAT: The Progress and Challenges of Complex Reasoning (TASLP 2021) Paper Code
  • Training Verifiers to Solve Math Word Problems (arXiv 2021) Paper Code
  • LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages (NeurIPS 2024) Paper Code
  • LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models (ACL 2024) Paper Code
  • FOLIO: Natural Language Reasoning with First-Order Logic (EMNLP 2024) Paper Code
  • Diagnosing the First-Order Logical Reasoning Ability Through LogicNLI (EMNLP 2021) Paper Code
  • Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models (arXiv 2023) Paper
