DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection [PDF]
We present DINO (DETR with Improved deNoising anchOr boxes), a state-of-the-art end-to-end object detector.
Hao Zhang +7 more
semanticscholar +1 more source
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [PDF]
Our objective in this work is video-text retrieval – in particular a joint embedding that enables efficient text-to-video retrieval. The challenges in this area include the design of the visual architecture and the nature of the training data, in that ...
Max Bain +3 more
semanticscholar +1 more source
SoundStream: An End-to-End Neural Audio Codec [PDF]
We present SoundStream, a novel neural audio codec that can efficiently compress speech, music and general audio at bitrates normally targeted by speech-tailored codecs. SoundStream relies on a model architecture composed by a fully convolutional encoder/
Neil Zeghidour +4 more
semanticscholar +1 more source
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection [PDF]
Accurate detection of objects in 3D point clouds is a central problem in many applications, such as autonomous navigation, housekeeping robots, and augmented/virtual reality.
Yin Zhou, Oncel Tuzel
semanticscholar +1 more source
Sparse R-CNN: End-to-End Object Detection with Learnable Proposals [PDF]
We present Sparse R-CNN, a purely sparse method for object detection in images. Existing works on object detection heavily rely on dense object candidates, such as k anchor boxes pre-defined on all grids of image feature map of size H × W. In our method,
Pei Sun +10 more
semanticscholar +1 more source
MixFormer: End-to-End Tracking with Iterative Mixed Attention [PDF]
Tracking often uses a multistage pipeline of feature extraction, target information integration, and bounding box estimation. To simplify this pipeline and unify the process of feature extraction and target information integration, we present a compact ...
Yutao Cui +3 more
semanticscholar +1 more source
The Moral Philosophy of Lucretius and Aquinas: Competing Ends and Means [PDF]
The author first explains wisdom and its importance to moral philosophy. Secondly, he follows with a consideration of the nature of things and the soul as told by Lucretius. Then he presents a brief summary of St. Thomas's understanding of the soul and how his
Jason Nehez
doaj +1 more source
MDETR - Modulated Detection for End-to-End Multi-Modal Understanding [PDF]
Multi-modal reasoning systems rely on a pre-trained object detector to extract regions of interest from the image. However, this crucial module is typically used as a black box, trained independently of the downstream task and on a fixed vocabulary of ...
Aishwarya Kamath +5 more
semanticscholar +1 more source
DriveGPT4: Interpretable End-to-End Autonomous Driving Via Large Language Model [PDF]
Multimodal large language models (MLLMs) have emerged as a prominent area of interest within the research community, given their proficiency in handling and reasoning with non-textual data, including images and videos.
Zhenhua Xu +7 more
semanticscholar +1 more source
From Logical Form to Form of Life: Daniel Hutto’s Critical Review of Wittgenstein’s Thoughts [PDF]
Since Wittgenstein is known for his two philosophies, one of the concerns of interpreters of his thoughts is to understand the relation between his two philosophies.
Atieh Zandieh
doaj +1 more source

