Generalizability of Code Clone Detection on CodeBERT [PDF]
Transformer networks such as CodeBERT already achieve outstanding results for code clone detection in benchmark datasets, so one could assume that this task has already been solved. However, code clone detection is not a trivial task. Semantic code clones, in particular, are challenging to detect. We show that the generalizability of CodeBERT decreases
Sonnekalb, Tim+3 more
semanticscholar +10 more sources
Accepted for publication at ICSE'16 (preprint, unrevised)
Jeffrey Svajlenko+4 more
semanticscholar +8 more sources
Assessing the Code Clone Detection Capability of Large Language Models [PDF]
This study aims to assess the performance of two advanced Large Language Models (LLMs), GPT-3.5 and GPT-4, in the task of code clone detection. The evaluation involves testing the models on a variety of code pairs of different clone types and levels of ...
Zixian Zhang, Takfarinas Saber
semanticscholar +5 more sources
CC2Vec: Combining Typed Tokens with Contrastive Learning for Effective Code Clone Detection [PDF]
With the development of the open source community, the code is often copied, spread, and evolved in multiple software systems, which brings uncertainty and risk to the software system (e.g., bug propagation and copyright infringement).
Shihan Dou+5 more
semanticscholar +5 more sources
Towards Understanding the Capability of Large Language Models on Code Clone Detection: A Survey [PDF]
Code cloning, the duplication of code fragments, is common in software development. While some reuse aids productivity, excessive cloning hurts maintainability and introduces bugs. Hence, automatic code clone detection is vital. Meanwhile, large language
Shihan Dou+9 more
semanticscholar +4 more sources
Source Code Comments: Overlooked in the Realm of Code Clone Detection [PDF]
Reusing code can produce duplicate or near-duplicate code clones in code repositories. Current code clone detection techniques, like Program Dependence Graphs, rely on code structure and their dependencies to detect clones. These techniques are expensive, using large amounts of processing power, time, and memory.
Kuttal, Sandeep Kaur; Ghosh, Akash
arxiv +6 more sources
A Systematic Review on Code Clone Detection [PDF]
Code cloning refers to the duplication of source code. It is the most common way of reusing source code in software development. If a bug is identified in one segment of code, all the similar segments need to be checked for the same bug.
Qurat Ul Ain+4 more
doaj +4 more sources
TCCCD: Triplet-Based Cross-Language Code Clone Detection
Code cloning is a common practice in software development, where developers reuse existing code to accelerate programming speed and enhance work efficiency. Existing clone-detection methods mainly focus on code clones within a single programming language.
Yong Fang+3 more
doaj +2 more sources
CCFinder: A Multilinguistic Token-Based Code Clone Detection System for Large Scale Source Code [PDF]
A code clone is a code portion in source files that is identical or similar to another. Since code clones are believed to reduce the maintainability of software, several code clone detection techniques and tools have been proposed.
Toshihiro Kamiya+2 more
semanticscholar +2 more sources
Code Clone Detection based on Event Embedding and Event Dependency [PDF]
The code clone detection method based on semantic similarity has important value in software engineering tasks (e.g., software evolution, software reuse).
Cheng Huang+3 more
semanticscholar +5 more sources