Some of the following articles may not be open access.

Qwen-Image Technical Report

arXiv.org
We present Qwen-Image, an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.
Chenfei Wu   +38 more
semanticscholar   +1 more source

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

arXiv.org
We introduce Phi-4-Mini and Phi-4-Multimodal, compact yet highly capable language and multimodal models. Phi-4-Mini is a 3.8-billion-parameter language model trained on high-quality web and synthetic data, significantly outperforming recent open-source ...
Abdelrahman Abouelenin   +72 more
semanticscholar   +1 more source

Writing Technical Reports

IEEE Engineering Management Review, 1977
The author provides rules for producing good technical reports. Experience has convinced him that when an engineer is under pressure to produce a report, he wants specifics, not generalities; a law, not a lecture. And that's what he gets here.
openaire   +1 more source

Qwen3-VL Technical Report

arXiv.org
We introduce Qwen3-VL, the most capable vision-language model in the Qwen series to date, achieving superior performance across a broad range of multimodal benchmarks. It natively supports interleaved contexts of up to 256K tokens, seamlessly integrating
Shuai Bai   +64 more
semanticscholar   +1 more source

Technical report

2020
The design and implementation of service-based architectures raise a multitude of research questions from the fields of software engineering, system modeling and analysis, and the adaptability and integration of applications.
Adriano, Christian   +22 more
openaire   +1 more source

Seed1.5-VL Technical Report

arXiv.org
We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning. Seed1.5-VL is composed of a 532M-parameter vision encoder and a Mixture-of-Experts (MoE) LLM with 20B active parameters.
Dong Guo   +196 more
semanticscholar   +1 more source

Technical Report Column1

SIGA
Welcome to the Technical Reports Column. If your institution publishes technical reports that you'd like to have included here, please contact me at the email address above.
D. Kelley
semanticscholar   +1 more source

Kimi-VL Technical Report

arXiv.org
We present Kimi-VL, an efficient open-source Mixture-of-Experts (MoE) vision-language model (VLM) that offers advanced multimodal reasoning, long-context understanding, and strong agent capabilities - all while activating only 2.8B parameters in its ...
Kimi Team Angang Du   +90 more
semanticscholar   +1 more source

MedGemma Technical Report

arXiv.org
Artificial intelligence (AI) has significant potential in healthcare applications, but its training and deployment face challenges due to healthcare's diverse data, complex tasks, and the need to preserve privacy.
Andrew Sellergren   +80 more
semanticscholar   +1 more source

Kimi-Audio Technical Report

arXiv.org
We present Kimi-Audio, an open-source audio foundation model that excels in audio understanding, generation, and conversation. We detail the practices in building Kimi-Audio, including model architecture, data curation, training recipe, inference ...
KimiTeam   +39 more
semanticscholar   +1 more source
