modular

Wednesday, August 6, 2025

New in Modular Platform 25.5: Large Scale Batch Inference, standalone Mojo packages, open source MAX Graph API, and seamless MAX <> PyTorch integration


Introducing Large Scale Batch Inference: a highly asynchronous, at-scale batch API built on open standards and powered by Mammoth. We’re launching this new capability through our partner SF Compute, enabling high-volume AI performance with a fast, accurate, and efficient platform that seamlessly scales workloads across any hardware.

Modular

Modular Platform 25.5: Introducing Large Scale Batch Inference

Modular Platform 25.5 is built for developers who need performance at scale. The standout feature is Large Scale Batch Inference, a high-throughput, OpenAI-compatible API built on open standards and powered by Mammoth, our intelligent orchestration layer. It enables seamless scaling across both NVIDIA and AMD hardware and is already in production through our partner SF Compute. You can try it today with over 20 open models.
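Since the batch API is OpenAI-compatible, a request file would presumably follow OpenAI's batch JSONL format: one JSON object per line, each with a `custom_id`, `method`, `url`, and request `body`. The sketch below builds such a file locally; the model name is a hypothetical placeholder, and the exact endpoint accepted by the service is an assumption based on the OpenAI convention.

```python
import json

def build_batch_file(prompts, model="hypothetical-model-name"):
    """Build OpenAI-style batch JSONL: one chat-completion request per prompt."""
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"request-{i}",   # used to match responses back to requests
            "method": "POST",
            "url": "/v1/chat/completions",  # assumed OpenAI-compatible endpoint
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(request))
    return "\n".join(lines)

batch_jsonl = build_batch_file(
    ["Summarize MAX in one sentence.", "What is Mojo?"]
)
```

The resulting string would be written to a `.jsonl` file and uploaded to the batch endpoint; responses carry the same `custom_id` values, so results can be joined back to their prompts regardless of completion order.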

We’ve also made major improvements across the stack:

  • Standalone Mojo Conda packages make Mojo development and deployment easier than ever.
  • The MAX Graph API is now fully open source, including supporting code and unit tests. You can build portable, GPU-accelerated graphs directly in Python.
  • MAX graphs can now be seamlessly integrated into PyTorch workflows. Use the new @graph_op decorator to turn any MAX graph into a PyTorch operator.

This release also includes performance boosts for MAX, expanded Mojo language features, and dramatically smaller serving packages for NVIDIA GPUs.

Read the full blog post

Livestream: Introducing Modular Platform 25.5

Get an inside look at what’s new in Modular Platform 25.5. The team will walk through the latest updates to MAX and Mojo, our enhanced PyTorch integration, and Large Scale Batch Inference powered by Mammoth, followed by a live Q&A.

RSVP to join virtually

Modular Community Meetup

Join us at the Modular office in Los Altos, CA for our next community meetup! Hear from the Modular team as we share updates and insights on building the future of AI infrastructure. Can't make it in person? Select the virtual attendance option and we'll send you a link to join.

Save your spot

Posted by