Mamba-Spike: Enhancing the Mamba Architecture with a Spiking Front-End for Efficient Temporal Data Processing

open access: yes
The field of neuromorphic computing has gained significant attention in recent years, aiming to bridge the gap between the efficiency of biological neural networks and the performance of artificial intelligence systems. This paper introduces Mamba-Spike,
Qin, Jiahao, Liu, Feng
core  

Falcon Mamba: The First Competitive Attention-free 7B Language Model

open access: yes
In this technical report, we present Falcon Mamba 7B, a new base large language model based on the novel Mamba architecture. Falcon Mamba 7B is trained on 5.8 trillion tokens with carefully selected data mixtures.
Hacid, Hakim +6 more
core  

MambaOut: Do We Really Need Mamba for Vision?

open access: yes
Mamba, an architecture with RNN-like token mixer of state space model (SSM), was recently introduced to address the quadratic complexity of the attention mechanism and subsequently applied to vision tasks.
Wang, Xinchao, Yu, Weihao
core  

Locating and Editing Factual Associations in Mamba

open access: yes
We investigate the mechanisms of factual recall in the Mamba state space model. Our work is inspired by previous findings in autoregressive transformer language models suggesting that their knowledge recall is localized to particular modules at specific ...
Atkinson, David +2 more
core  