Mamba-Reg: Vision Mamba Also Needs Registers

open access: yes, 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Similar to Vision Transformers, this paper identifies artifacts also present within the feature maps of Vision Mamba. These artifacts, corresponding to high-norm tokens emerging in low-information background areas of images, appear much more severe in Vision Mamba -- they exist prevalently even with the tiny-sized model and activate extensively across ...
Feng Wang 0047   +8 more
openaire   +5 more sources

Mamba Retriever: Utilizing Mamba for Effective and Efficient Dense Retrieval

open access: yes, Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
In the information retrieval (IR) area, dense retrieval (DR) models use deep learning techniques to encode queries and passages into embedding space to compute their semantic relations. It is important for DR models to balance both efficiency and effectiveness.
Hanqi Zhang   +4 more
openaire   +3 more sources

Mamba Hawkes Process

open access: yes, CoRR
Irregular and asynchronous event sequences are prevalent in many domains, such as social media, finance, and healthcare. Traditional temporal point processes (TPPs), like Hawkes processes, often struggle to model mutual inhibition and nonlinearity effectively.
Anningzhe Gao, Shan Dai, Yan Hu
openaire   +3 more sources

Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting

open access: yes
New Mamba-based architecture.
Liang, Aobo   +4 more
openaire   +3 more sources

Online Decision Mamba

open access: yes, 2025 IEEE 7th International Conference on Cognitive Machine Intelligence (CogMI)
Online in-context reinforcement learning enhances offline-trained policies through online fine-tuning. We introduce Online Decision Mamba (ODM), an architecture that replaces the attention mechanism in Online Decision Transformers (ODT) with the Mamba ...
Trenton W. Ruf, Banafsheh Rekabdar
openaire   +2 more sources

idoiagamiz/SCALE-MAMBA: v1.0.0

open access: yes, 2022
Repository for the SCALE-MAMBA MPC ...
NigelSmart   +4 more
core   +1 more source

Mamba Modulation: On the Length Generalization of Mamba

open access: yes, CoRR
The quadratic complexity of the attention mechanism in Transformer models has motivated the development of alternative architectures with sub-quadratic scaling, such as state-space models. Among these, Mamba has emerged as a leading architecture, achieving state-of-the-art results across a range of language modeling tasks.
Peng Lu 0006   +6 more
openaire   +2 more sources

A Survey of Mamba

open access: yes, CoRR
As one of the most representative DL techniques, Transformer architecture has empowered numerous advanced models, especially the large language models (LLMs) that comprise billions of parameters, becoming a cornerstone in deep learning. Despite the impressive achievements, Transformers still face inherent limitations, particularly the time-consuming ...
Haohao Qu   +7 more
openaire   +2 more sources

Differential Mamba

open access: yes, Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Sequence models like Transformers and RNNs often overallocate attention to irrelevant context, leading to noisy intermediate representations. This degrades LLM capabilities by promoting hallucinations, weakening long-range and retrieval abilities, and reducing robustness.
Nadav Schneider   +2 more
openaire   +3 more sources

A Foundation Model Based CT Biomarker for Non‐Invasive Prediction of Response to Neoadjuvant Immunochemotherapy in Non‐Small Cell Lung Cancer

open access: yes, Advanced Science, EarlyView.
This study introduces a foundation model‐based biomarker for risk stratification of pathological response in non‐small cell lung cancer. A Vision Mamba super‐resolution model standardizes heterogeneous CT images. A multi‐task Swin Transformer then fine‐tunes a pre‐trained lung foundation model to jointly optimize tumor segmentation and response ...
Yanglan Xu   +10 more
wiley   +1 more source
