Some of the following articles may not be open access.

DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning

Nature
General reasoning represents a long-standing and formidable challenge in artificial intelligence (AI). Recent breakthroughs, exemplified by large language models (LLMs) [1,2] and chain-of-thought (CoT) prompting [3], have achieved considerable success on ...
DeepSeek-AI   +197 more
semanticscholar   +1 more source

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

arXiv.org
Mathematical reasoning poses a significant challenge for language models due to its complex and structured nature. In this paper, we introduce DeepSeekMath 7B, which continues pre-training DeepSeek-Coder-Base-v1.5 7B with 120B math-related tokens sourced ...
Zhihong Shao   +8 more
semanticscholar   +1 more source

Reason, Reasons and Reasoning

Theory & Psychology, 1994
The concept of rationality has its roots in a historic philosophical conception of human beings as creatures of reason. To act on the basis of reason is to act on the basis of reasons, which in turn implies a process of reasoning. An objectivist conception of rationality sees its essence as lying in the use of reasoning processes that conform to ...
openaire   +1 more source

Reasoning

Annual Review of Psychology, 1990
Strict theories of reasoning are schoolmarmish in their insistence on rules and structure, but this gives them an advantage when inference is relatively well behaved. In the case of reasoning with deductively valid arguments, Strict theories give a convincing account of the universality of certain inference forms and the productivity of reasoning in ...
openaire   +2 more sources

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

arXiv.org
Reinforcement Learning with Verifiable Rewards (RLVR) has recently demonstrated notable success in enhancing the reasoning performance of large language models (LLMs), particularly on mathematics and programming tasks. Similar to how traditional RL helps ...
Yang Yue   +6 more
semanticscholar   +1 more source

Between Reasoning

Quarterly Journal of Experimental Psychology, 2006
In two experiments we investigated three-term reasoning with spatial relational assertions using the preposition between as compared to projective prepositions (such as to the left of). For each kind of assertion we distinguish the referent expression (i.e., the grammatical subject) from the relatum expression (i.e., the internal argument of the ...
Hörnig, Robin   +2 more
openaire   +3 more sources

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

arXiv.org
We introduce InternVL 3.5, a new family of open-source multimodal models that significantly advances versatility, reasoning capability, and inference efficiency along the InternVL series. A key innovation is the Cascade Reinforcement Learning (Cascade RL) ...
Weiyun Wang   +62 more
semanticscholar   +1 more source

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

arXiv.org
Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful approach to enhancing the reasoning capabilities of Large Language Models (LLMs), while its mechanisms are not yet well understood.
Shenzhi Wang   +17 more
semanticscholar   +1 more source

Video-R1: Reinforcing Video Reasoning in MLLMs

arXiv.org
Inspired by DeepSeek-R1's success in eliciting reasoning abilities through rule-based reinforcement learning (RL), we introduce Video-R1 as the first attempt to systematically explore the R1 paradigm for incentivizing video reasoning within multimodal ...
Kaituo Feng   +7 more
semanticscholar   +1 more source

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

arXiv.org
DeepSeek-R1-Zero has successfully demonstrated the emergence of reasoning capabilities in LLMs purely through Reinforcement Learning (RL). Inspired by this breakthrough, we explore how RL can be utilized to enhance the reasoning capability of MLLMs ...
Wenxuan Huang   +8 more
semanticscholar   +1 more source
