Wednesday, August 27, 2025, 5 PM
News from the Cloudflare Blog:
Wednesday, August 27, 2025
AI Gateway now gives you access to your favorite AI models, dynamic routing and more — through just one endpoint.
By Michelle Chen
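The "one endpoint" idea can be sketched as below. This is a hedged illustration, not code from the announcement: the `/compat/chat/completions` path, the provider-prefixed model name, and the `ACCOUNT_ID` / `GATEWAY_ID` placeholders are assumptions based on AI Gateway's OpenAI-compatible URL scheme; check the AI Gateway docs for the exact paths.

```typescript
// Build the assumed unified AI Gateway URL: one endpoint, with the target
// model selected per-request via a provider-prefixed model name.
function gatewayUrl(accountId: string, gatewayId: string): string {
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/compat/chat/completions`;
}

// An OpenAI-style request body; "openai/gpt-4o-mini" is an assumed
// provider-prefixed model identifier.
const body = {
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Hello from AI Gateway" }],
};

// The actual call would be a POST with your gateway credentials, e.g.:
// await fetch(gatewayUrl("ACCOUNT_ID", "GATEWAY_ID"), {
//   method: "POST",
//   headers: { "Content-Type": "application/json", Authorization: "Bearer <token>" },
//   body: JSON.stringify(body),
// });
```

Swapping providers then means changing only the `model` string, not the endpoint.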
Wednesday, August 27, 2025
How we built the most efficient inference engine for Cloudflare’s network
Infire is an LLM inference engine that employs a range of techniques to maximize resource utilization, allowing us to serve AI models more efficiently with better performance for Cloudflare workloads.
By Vlad Krasnov
Wednesday, August 27, 2025
We’re expanding Workers AI with new partner models from Leonardo.Ai and Deepgram. Start using state-of-the-art image generation models from Leonardo and real-time TTS and STT models from Deepgram.
By Michelle Chen
Wednesday, August 27, 2025
How Cloudflare runs more AI models on fewer GPUs: A technical deep-dive
Cloudflare built an internal platform called Omni. This platform uses lightweight isolation and memory over-commitment to run multiple AI models on a single GPU.
By Sven Sauleau
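The memory over-commitment idea behind Omni can be illustrated with a toy admission check. This is a conceptual sketch, not Cloudflare's implementation: the `GpuSlot` class, the 1.5x over-commit factor, and the per-model reservation numbers are all invented for illustration. The bet is that co-located models rarely hit their peak memory at the same time, so the scheduler can admit reservations exceeding physical GPU memory.

```typescript
// Toy over-commit admission: admit models onto one GPU until their summed
// memory reservations would exceed physical memory times an over-commit factor.
interface Model {
  name: string;
  reservedGiB: number;
}

class GpuSlot {
  private admitted: Model[] = [];

  constructor(
    private physicalGiB: number,
    private overcommitFactor: number,
  ) {}

  tryAdmit(model: Model): boolean {
    const reserved = this.admitted.reduce((sum, m) => sum + m.reservedGiB, 0);
    // Reject only when the over-commit budget would be exceeded.
    if (reserved + model.reservedGiB > this.physicalGiB * this.overcommitFactor) {
      return false;
    }
    this.admitted.push(model);
    return true;
  }

  get count(): number {
    return this.admitted.length;
  }
}

// An 80 GiB GPU with a 1.5x factor accepts up to 120 GiB of reservations.
const gpu = new GpuSlot(80, 1.5);
```

A real scheduler would also need isolation between co-located models and a plan for the rare case where peaks do coincide, which is where the lightweight isolation mentioned above comes in.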
Copyright © 2025 Cloudflare, Inc. 101 Townsend Street, San Francisco, CA 94107 www.cloudflare.com | Community | Unsubscribe