Results 311 to 320 of about 2,849,722 (390)
Some of the following articles may not be open access.

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

arXiv.org
DeepSeek-R1-Zero has successfully demonstrated the emergence of reasoning capabilities in LLMs purely through Reinforcement Learning (RL). Inspired by this breakthrough, we explore how RL can be utilized to enhance the reasoning capability of MLLMs ...
Wenxuan Huang   +8 more
semanticscholar   +1 more source

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Conference on Empirical Methods in Natural Language Processing
Large reasoning models (LRMs) like OpenAI-o1 have demonstrated impressive long stepwise reasoning capabilities through large-scale reinforcement learning.
Xiaoxi Li   +7 more
semanticscholar   +1 more source

Reasoning with Exploration: An Entropy Perspective

AAAI Conference on Artificial Intelligence
Balancing exploration and exploitation is a central goal in reinforcement learning (RL). Despite recent advances in enhancing language model (LM) reasoning, most methods lean toward exploitation, and increasingly encounter performance plateaus.
Daixuan Cheng   +6 more
semanticscholar   +1 more source

Reasons, practical reason, and practical reasoning

Ratio, 2004
The concepts of reasons as supporting elements, of practical reason as a capacity, and of practical reasoning as a process are central in the theory of action. This paper provides a brief account of each. Several kinds of reason for action are distinguished.
openaire   +1 more source

O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning

arXiv.org
Recently, long-thought reasoning LLMs, such as OpenAI's O1, adopt extended reasoning processes similar to how humans ponder over complex problems. This reasoning paradigm significantly enhances the model's problem-solving abilities and has achieved ...
Haotian Luo   +8 more
semanticscholar   +1 more source

L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning

arXiv.org
Reasoning language models have shown an uncanny ability to improve performance at test-time by "thinking longer", that is, by generating longer chain-of-thought sequences and hence using more compute.
Pranjal Aggarwal, S. Welleck
semanticscholar   +1 more source

Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning

arXiv.org
Chain-of-thought reasoning has significantly improved the performance of Large Language Models (LLMs) across various domains. However, this reasoning process has been confined exclusively to textual space, limiting its effectiveness in visually intensive ...
Alex Su   +4 more
semanticscholar   +1 more source

Reasoning Models Don't Always Say What They Think

arXiv.org
Chain-of-thought (CoT) offers a potential boon for AI safety as it allows monitoring a model's CoT to try to understand its intentions and reasoning processes.
Yanda Chen   +14 more
semanticscholar   +1 more source

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

arXiv.org
Large reasoning models (LRMs), such as OpenAI-o1 and DeepSeek-R1, demonstrate impressive long-horizon reasoning capabilities. However, their reliance on static internal knowledge limits their performance on complex, knowledge-intensive tasks and hinders ...
Xiaoxi Li   +7 more
semanticscholar   +1 more source

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

arXiv.org
Large Language Models have demonstrated remarkable reasoning capability in complex textual tasks. However, multimodal reasoning, which requires integrating visual and textual information, remains a significant challenge.
Yi Yang   +11 more
semanticscholar   +1 more source
