Results 221 to 230 of about 692,148 (292)
Some of the following articles may not be open access.
DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning
Nature
General reasoning represents a long-standing and formidable challenge in artificial intelligence (AI). Recent breakthroughs, exemplified by large language models (LLMs) [1,2] and chain-of-thought (CoT) prompting [3], have achieved considerable success on ...
DeepSeek-AI +197 more
semanticscholar +1 more source
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
arXiv.org
Mathematical reasoning poses a significant challenge for language models due to its complex and structured nature. In this paper, we introduce DeepSeekMath 7B, which continues pre-training DeepSeek-Coder-Base-v1.5 7B with 120B math-related tokens sourced ...
Zhihong Shao +8 more
semanticscholar +1 more source
Theory & Psychology, 1994
The concept of rationality has its roots in a historic philosophical conception of human beings as creatures of reason. To act on the basis of reason is to act on the basis of reasons, which in turn implies a process of reasoning. An objectivist conception of rationality sees its essence as lying in the use of reasoning processes that conform to ...
openaire +1 more source
Annual Review of Psychology, 1990
Strict theories of reasoning are schoolmarmish in their insistence on rules and structure, but this gives them an advantage when inference is relatively well behaved. In the case of reasoning with deductively valid arguments, Strict theories give a convincing account of the universality of certain inference forms and the productivity of reasoning in ...
openaire +2 more sources
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
arXiv.org
Reinforcement Learning with Verifiable Rewards (RLVR) has recently demonstrated notable success in enhancing the reasoning performance of large language models (LLMs), particularly on mathematics and programming tasks. Similar to how traditional RL helps ...
Yang Yue +6 more
semanticscholar +1 more source
Quarterly Journal of Experimental Psychology, 2006
In two experiments we investigated three-term reasoning with spatial relational assertions using the preposition between as compared to projective prepositions (such as to the left of). For each kind of assertion we distinguish the referent expression (i.e., the grammatical subject) from the relatum expression (i.e., the internal argument of the ...
Hörnig, Robin +2 more
openaire +3 more sources
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
arXiv.org
We introduce InternVL 3.5, a new family of open-source multimodal models that significantly advances versatility, reasoning capability, and inference efficiency along the InternVL series. A key innovation is the Cascade Reinforcement Learning (Cascade RL) ...
Weiyun Wang +62 more
semanticscholar +1 more source
arXiv.org
Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful approach to enhancing the reasoning capabilities of Large Language Models (LLMs), while its mechanisms are not yet well understood.
Shenzhi Wang +17 more
semanticscholar +1 more source
Video-R1: Reinforcing Video Reasoning in MLLMs
arXiv.org
Inspired by DeepSeek-R1's success in eliciting reasoning abilities through rule-based reinforcement learning (RL), we introduce Video-R1 as the first attempt to systematically explore the R1 paradigm for incentivizing video reasoning within multimodal ...
Kaituo Feng +7 more
semanticscholar +1 more source
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models
arXiv.org
DeepSeek-R1-Zero has successfully demonstrated the emergence of reasoning capabilities in LLMs purely through Reinforcement Learning (RL). Inspired by this breakthrough, we explore how RL can be utilized to enhance the reasoning capability of MLLMs ...
Wenxuan Huang +8 more
semanticscholar +1 more source