Structured Pruning for Deep Convolutional Neural Networks: A survey [PDF]
The remarkable performance of deep convolutional neural networks (CNNs) is generally attributed to their deeper and wider architectures, which can come with significant computational costs.
Yang He, Lingao Xiao
A Simple and Effective Pruning Approach for Large Language Models [PDF]
As their size increases, Large Language Models (LLMs) are natural candidates for network pruning methods: approaches that drop a subset of network weights while striving to preserve performance.
Mingjie Sun +3 more
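For a sense of what such an approach looks like in code, here is a minimal sketch of unstructured magnitude pruning in PyTorch. The paper's own criterion (Wanda) additionally scales weight magnitudes by input activation norms, which this sketch omits; the layer size and sparsity level below are arbitrary choices, not the paper's settings.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude entries of a weight matrix.

    A generic illustration of unstructured weight pruning; Wanda's actual
    criterion additionally weights |W| by input activation norms.
    """
    k = int(weight.numel() * sparsity)                # number of weights to drop
    if k == 0:
        return weight
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = weight.abs() > threshold                   # keep weights above the cutoff
    return weight * mask

# Example: prune a random linear layer to 50% sparsity.
W = torch.randn(256, 256)
W_pruned = magnitude_prune(W, sparsity=0.5)
print(f"sparsity: {(W_pruned == 0).float().mean():.2f}")
```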
LLM-Pruner: On the Structural Pruning of Large Language Models [PDF]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation. However, such impressive capability typically comes with a substantial model size, which presents significant challenges in both the deployment ...
Xinyin Ma, Gongfan Fang, Xinchao Wang
DepGraph: Towards Any Structural Pruning [PDF]
Structural pruning enables model acceleration by removing structurally-grouped parameters from neural networks. However, the parameter-grouping patterns vary widely across different models, making architecture-specific pruners, which rely on manually ...
Gongfan Fang +4 more
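The coupling that makes structural pruning architecture-specific is easy to see in a toy case: removing a hidden unit from one layer forces removing the matching input weights of every layer that consumes it. The PyTorch sketch below hand-codes this for two consecutive linear layers; DepGraph's contribution is to derive such groupings automatically for arbitrary networks. The row-norm importance score and keep ratio here are illustrative assumptions, not the paper's.

```python
import torch
import torch.nn as nn

def prune_coupled_linears(fc1: nn.Linear, fc2: nn.Linear, keep_idx: torch.Tensor):
    """Remove hidden units from fc1 and the matching input columns of fc2.

    A hand-written two-layer special case of the dependency bookkeeping that
    DepGraph builds automatically for arbitrary architectures.
    """
    new_fc1 = nn.Linear(fc1.in_features, len(keep_idx), bias=fc1.bias is not None)
    new_fc1.weight.data = fc1.weight.data[keep_idx]        # drop output rows
    if fc1.bias is not None:
        new_fc1.bias.data = fc1.bias.data[keep_idx]
    new_fc2 = nn.Linear(len(keep_idx), fc2.out_features, bias=fc2.bias is not None)
    new_fc2.weight.data = fc2.weight.data[:, keep_idx]     # drop matching input columns
    if fc2.bias is not None:
        new_fc2.bias.data = fc2.bias.data
    return new_fc1, new_fc2

fc1, fc2 = nn.Linear(64, 128), nn.Linear(128, 10)
keep = torch.argsort(fc1.weight.norm(dim=1), descending=True)[:96]  # keep 96 of 128 units
fc1, fc2 = prune_coupled_linears(fc1, fc2, keep)
print(fc1.weight.shape, fc2.weight.shape)  # torch.Size([96, 64]) torch.Size([10, 96])
```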
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning [PDF]
The popularity of LLaMA (Touvron et al., 2023a;b) and other recently emerged moderate-sized large language models (LLMs) highlights the potential of building smaller yet powerful LLMs.
Mengzhou Xia +3 more
Beyond neural scaling laws: beating power law scaling via data pruning [PDF]
Widely observed neural scaling laws, in which error falls off as a power of the training set size, model size, or both, have driven substantial performance improvements in deep learning.
Ben Sorscher +4 more
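The power-law scaling referred to here can be made concrete with a few lines of arithmetic: if test error follows E(n) = a * n^(-b), each tenfold increase in training data shrinks error by only the constant factor 10^(-b), which is what makes beating the power law via data pruning attractive. The constants a and b below are arbitrary illustrative values, not fitted to any experiment.

```python
# Power-law scaling: test error falls as E(n) = a * n**(-b) with dataset size n.
# With b = 0.3, 10x more data cuts error by only 10**(-0.3), roughly 2x.
a, b = 1.0, 0.3
for n in [10**3, 10**4, 10**5, 10**6]:
    print(f"n = {n:>8}: predicted error = {a * n**(-b):.4f}")
```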
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning [PDF]
We consider the problem of model compression for deep neural networks (DNNs) in the challenging one-shot/post-training setting, in which we are given an accurate trained model, and must compress it without any retraining, based only on a small amount of ...
Elias Frantar, Dan Alistarh
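As a rough illustration of the one-shot setting (not the paper's algorithm, which performs Hessian-based greedy weight removal with closed-form updates), the NumPy sketch below prunes one output row of a linear layer and re-fits the surviving weights by least squares, so that the layer's output on a small calibration set is preserved. All sizes are arbitrary.

```python
import numpy as np

def prune_row_with_reconstruction(w, X, n_prune):
    """Drop the n_prune smallest-magnitude weights of one output row, then
    re-fit the surviving weights by least squares so the layer output w @ X
    is preserved on calibration inputs X (shape: d_in x n_samples).

    A simplified stand-in for the layerwise objective OBC solves.
    """
    keep = np.argsort(np.abs(w))[n_prune:]            # surviving coordinates
    target = w @ X                                    # original layer output
    w_new = np.zeros_like(w)
    w_new[keep], *_ = np.linalg.lstsq(X[keep].T, target, rcond=None)
    return w_new

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 512))                        # calibration activations
w = rng.normal(size=64)
w_sparse = prune_row_with_reconstruction(w, X, n_prune=32)
err = np.linalg.norm(w @ X - w_sparse @ X) / np.linalg.norm(w @ X)
print(f"relative output error after 50% pruning: {err:.3f}")
```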
A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations [PDF]
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources.
Hongrong Cheng +2 more
Channel Pruning for Accelerating Very Deep Neural Networks [PDF]
In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer, by a LASSO regression based channel ...
Yihui He, Xiangyu Zhang, Jian Sun
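A toy version of the channel-selection step the abstract describes: an L1 (LASSO) penalty drives per-channel coefficients to zero, and zeroed channels are pruned; the paper's second step then reconstructs the remaining weights by least squares. The problem sizes and regularization strength below are illustrative, and the per-channel contributions are simulated rather than taken from a real CNN.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Sketch of LASSO-based channel selection: learn a sparse coefficient vector
# over input channels so the layer output is reconstructed from a subset of
# channels; channels whose coefficient is driven to zero are pruned.
rng = np.random.default_rng(0)
n_samples, n_channels = 1000, 32
contribs = rng.normal(size=(n_samples, n_channels))  # per-channel contributions to the output
y = contribs[:, :8].sum(axis=1)                      # ground truth uses only 8 channels

lasso = Lasso(alpha=0.05, fit_intercept=False)       # sparse least-squares fit with L1 penalty
lasso.fit(contribs, y)
kept = np.nonzero(lasso.coef_)[0]
print(f"channels kept: {kept}")                      # the second step would re-fit the
                                                     # surviving weights by least squares
```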
HRank: Filter Pruning Using High-Rank Feature Map [PDF]
Neural network pruning offers a promising prospect to facilitate deploying deep neural networks on resource-limited devices. However, existing methods are still challenged by the training inefficiency and labor cost in pruning designs, due to missing ...
Mingbao Lin +6 more
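A minimal sketch of the rank criterion the title suggests: score each filter by the average matrix rank of its output feature maps over a batch, and mark the lowest-scoring filters for pruning. The synthetic activations below are constructed so that rank varies by channel; in real use they would come from hooking a convolutional layer's outputs.

```python
import torch

def average_filter_rank(fmaps: torch.Tensor) -> torch.Tensor:
    """Average matrix rank of each channel's feature maps over a batch.

    Channels whose feature maps have consistently low rank are treated as
    less informative and pruned first.
    """
    return torch.linalg.matrix_rank(fmaps).float().mean(dim=0)  # (batch, c) -> (c,)

# Synthetic activations whose rank varies by channel: channel c is a sum of
# (c % 8 + 1) rank-one maps, standing in for real conv-layer outputs.
b, c, h, w = 16, 64, 28, 28
u, v = torch.randn(b, c, 8, h), torch.randn(b, c, 8, w)
active = (torch.arange(8)[None, :] < (torch.arange(c) % 8 + 1)[:, None]).float()
fmaps = torch.einsum("bcrh,bcrw,cr->bchw", u, v, active)

scores = average_filter_rank(fmaps)
prune_idx = torch.argsort(scores)[:16]   # the 16 lowest-rank channels
print(scores[:8], prune_idx)
```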

