Results 311 to 320 of about 2,849,722 (390)
Some of the following articles may not be open access.
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models
arXiv.org
DeepSeek-R1-Zero has successfully demonstrated the emergence of reasoning capabilities in LLMs purely through Reinforcement Learning (RL). Inspired by this breakthrough, we explore how RL can be utilized to enhance the reasoning capability of MLLMs ...
Wenxuan Huang +8 more
semanticscholar +1 more source
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Conference on Empirical Methods in Natural Language Processing
Large reasoning models (LRMs) like OpenAI-o1 have demonstrated impressive long stepwise reasoning capabilities through large-scale reinforcement learning.
Xiaoxi Li +7 more
semanticscholar +1 more source
Reasoning with Exploration: An Entropy Perspective
AAAI Conference on Artificial Intelligence
Balancing exploration and exploitation is a central goal in reinforcement learning (RL). Despite recent advances in enhancing language model (LM) reasoning, most methods lean toward exploitation, and increasingly encounter performance plateaus.
Daixuan Cheng +6 more
semanticscholar +1 more source
Reasons, practical reason, and practical reasoning
Ratio, 2004
The concepts of reasons as supporting elements, of practical reason as a capacity, and of practical reasoning as a process are central in the theory of action. This paper provides a brief account of each. Several kinds of reason for action are distinguished.
openaire +1 more source
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
arXiv.org
Recently, long-thought reasoning LLMs, such as OpenAI's O1, adopt extended reasoning processes similar to how humans ponder over complex problems. This reasoning paradigm significantly enhances the model's problem-solving abilities and has achieved ...
Haotian Luo +8 more
semanticscholar +1 more source
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
arXiv.org
Reasoning language models have shown an uncanny ability to improve performance at test-time by "thinking longer", that is, by generating longer chain-of-thought sequences and hence using more compute.
Pranjal Aggarwal, S. Welleck
semanticscholar +1 more source
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning
arXiv.org
Chain-of-thought reasoning has significantly improved the performance of Large Language Models (LLMs) across various domains. However, this reasoning process has been confined exclusively to textual space, limiting its effectiveness in visually intensive ...
Alex Su +4 more
semanticscholar +1 more source
Reasoning Models Don't Always Say What They Think
arXiv.org
Chain-of-thought (CoT) offers a potential boon for AI safety as it allows monitoring a model's CoT to try to understand its intentions and reasoning processes.
Yanda Chen +14 more
semanticscholar +1 more source
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
arXiv.org
Large reasoning models (LRMs), such as OpenAI-o1 and DeepSeek-R1, demonstrate impressive long-horizon reasoning capabilities. However, their reliance on static internal knowledge limits their performance on complex, knowledge-intensive tasks and hinders ...
Xiaoxi Li +7 more
semanticscholar +1 more source
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization
arXiv.org
Large Language Models have demonstrated remarkable reasoning capability in complex textual tasks. However, multimodal reasoning, which requires integrating visual and textual information, remains a significant challenge.
Yi Yang +11 more
semanticscholar +1 more source