Wednesday, August 27, 2025, 5 PM
News from the Cloudflare Blog:
Wednesday, August 27, 2025
AI Gateway now gives you access to your favorite AI models, dynamic routing and more — through just one endpoint.
By Michelle Chen
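The "one endpoint" idea can be sketched as below. This is a hedged illustration, not code from the announcement: the `/compat/chat/completions` path, the provider-prefixed model name, and the `ACCOUNT_ID` / `GATEWAY_ID` placeholders are assumptions based on AI Gateway's OpenAI-compatible URL scheme; check the AI Gateway docs for the exact paths.

```typescript
// Build the assumed unified AI Gateway URL: one endpoint, with the target
// model selected per-request via a provider-prefixed model name.
function gatewayUrl(accountId: string, gatewayId: string): string {
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/compat/chat/completions`;
}

// An OpenAI-style request body; "openai/gpt-4o-mini" is an assumed
// provider-prefixed model identifier.
const body = {
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Hello from AI Gateway" }],
};

// The actual call would be a POST with your gateway credentials, e.g.:
// await fetch(gatewayUrl("ACCOUNT_ID", "GATEWAY_ID"), {
//   method: "POST",
//   headers: { "Content-Type": "application/json", Authorization: "Bearer <token>" },
//   body: JSON.stringify(body),
// });
```

Swapping providers then means changing only the `model` string, not the endpoint.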
Wednesday, August 27, 2025
How we built the most efficient inference engine for Cloudflare’s network
Infire is an LLM inference engine that employs a range of techniques to maximize resource utilization, allowing us to serve AI models more efficiently with better performance for Cloudflare workloads.
By Vlad Krasnov
Wednesday, August 27, 2025
We’re expanding Workers AI with new partner models from Leonardo.Ai and Deepgram. Start using state-of-the-art image generation models from Leonardo and real-time TTS and STT models from Deepgram.
By Michelle Chen
Wednesday, August 27, 2025
How Cloudflare runs more AI models on fewer GPUs: A technical deep-dive
Cloudflare built an internal platform called Omni. This platform uses lightweight isolation and memory over-commitment to run multiple AI models on a single GPU.
By Sven Sauleau
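The memory over-commitment idea behind Omni can be illustrated with a toy admission check. This is a conceptual sketch, not Cloudflare's implementation: the `GpuSlot` class, the 1.5x over-commit factor, and the per-model reservation numbers are all invented for illustration. The bet is that co-located models rarely hit their peak memory at the same time, so the scheduler can admit reservations exceeding physical GPU memory.

```typescript
// Toy over-commit admission: admit models onto one GPU until their summed
// memory reservations would exceed physical memory times an over-commit factor.
interface Model {
  name: string;
  reservedGiB: number;
}

class GpuSlot {
  private admitted: Model[] = [];

  constructor(
    private physicalGiB: number,
    private overcommitFactor: number,
  ) {}

  tryAdmit(model: Model): boolean {
    const reserved = this.admitted.reduce((sum, m) => sum + m.reservedGiB, 0);
    // Reject only when the over-commit budget would be exceeded.
    if (reserved + model.reservedGiB > this.physicalGiB * this.overcommitFactor) {
      return false;
    }
    this.admitted.push(model);
    return true;
  }

  get count(): number {
    return this.admitted.length;
  }
}

// An 80 GiB GPU with a 1.5x factor accepts up to 120 GiB of reservations.
const gpu = new GpuSlot(80, 1.5);
```

A real scheduler would also need isolation between co-located models and a plan for the rare case where peaks do coincide, which is where the lightweight isolation mentioned above comes in.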
Copyright © 2025 Cloudflare, Inc. 101 Townsend Street, San Francisco, CA 94107 www.cloudflare.com | Community | Unsubscribe