Alluxio Q2: Customer Growth & Sub-Millisecond AI Data Latency
Alluxio, the AI and data acceleration platform, today announced strong results for the second quarter of its fiscal year 2026. During the quarter, the company released Alluxio Enterprise AI 3.7, a major release that delivers sub-millisecond time-to-first-byte (TTFB) latency for AI workloads accessing data on cloud storage.
Alluxio also reported new customer wins across multiple industries and AI use cases, including model training, model deployment, and feature store query acceleration. In addition, MLPerf Storage v2.0 benchmark results underscored Alluxio’s leadership in AI infrastructure performance, with the platform achieving exceptional GPU utilization and I/O acceleration across diverse training and checkpointing workloads.
“This was an exceptional quarter for Alluxio, and I couldn’t be prouder of what the team has achieved,” said Haoyuan Li, Founder and CEO of Alluxio. “With Alluxio Enterprise AI 3.7, we’ve eliminated one of the most stubborn bottlenecks in AI infrastructure: cloud storage performance. By combining sub-millisecond latency with our industry-leading, throughput-maximizing distributed caching technology, we’re delivering even greater value to our customers building and serving AI models at scale. The strong customer momentum and outstanding MLPerf benchmark results further reinforce Alluxio’s critical role in the AI infrastructure stack.”
Key Features of Alluxio Enterprise AI 3.7
- Ultra-Low Latency Caching for Cloud Storage – Alluxio AI 3.7 introduces a distributed, transparent caching layer that reduces latency to sub-millisecond levels when retrieving AI data from cloud storage. It achieves up to 45× lower latency than S3 Standard and 5× lower latency than S3 Express One Zone, plus up to 11.5 GiB/s (98.7 Gbps) of throughput per worker node, with linear scalability as nodes are added.
- Enhanced Cache Preloading – The Alluxio Distributed Cache Preloader now supports parallel loading, delivering up to 5× faster cache preloading to ensure hot data availability for faster AI training and inference cold starts.
- Role-Based Access Control (RBAC) for S3 Access – New granular RBAC capabilities allow tight integration with identity providers (OIDC/OAuth 2.0, Apache Ranger), controlling user authentication, authorization, and permitted operations on cached data.
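For context on the headline latency claim: TTFB is the interval between issuing a read and receiving the first byte of the response. The sketch below shows one generic way such a figure can be measured against any readable stream; the in-memory stream and the `open_stream` callable are illustrative stand-ins, not Alluxio's API.

```python
import io
import time

def measure_ttfb(open_stream):
    """Measure time to first byte for a callable that returns a
    readable binary stream (e.g. an object-store GET response body)."""
    start = time.perf_counter()
    stream = open_stream()
    first_byte = stream.read(1)  # blocks until the first byte arrives
    ttfb = time.perf_counter() - start
    return first_byte, ttfb

# Usage: an in-memory buffer stands in for a real GET against cloud
# storage or a cache layer; a real measurement would include network time.
data = b"checkpoint-shard-0001"
first, ttfb_s = measure_ttfb(lambda: io.BytesIO(data))
print(first, round(ttfb_s * 1000, 3), "ms")
```

In practice, comparisons like the 45× figure above come from repeating such a measurement many times against each storage tier and comparing the distributions, not a single read.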
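The parallel loading behind the preloader speedup can be pictured as fanning object fetches out across a worker pool instead of loading sequentially. This is a generic illustration of that pattern, not Alluxio's implementation; `fetch_object` is a hypothetical stand-in for a real storage read.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_object(key):
    # Hypothetical stand-in: a real preloader would read the object
    # from cloud storage and populate the cache with its bytes.
    return key, len(key)

def preload(keys, parallelism=8):
    """Warm a cache by fetching many objects concurrently."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return dict(pool.map(fetch_object, keys))

cache = preload([f"s3://bucket/shard-{i:04d}" for i in range(32)])
print(len(cache))
```

For I/O-bound fetches, concurrency like this hides per-object latency, which is why parallel preloading can cut warm-up time severalfold.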
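RBAC of the kind described maps an authenticated identity to roles, and roles to operations permitted on cached data. A toy sketch of the idea follows; the role names, path prefixes, and data structure are illustrative only, not Alluxio's configuration format.

```python
# Toy RBAC check: each role grants a set of (operation, path-prefix) pairs.
ROLES = {
    "data-scientist": {("read", "s3://models/")},
    "pipeline":       {("read", "s3://models/"), ("write", "s3://features/")},
}

def is_allowed(role, op, path):
    """Return True if the role grants the operation on the given path."""
    return any(op == granted_op and path.startswith(prefix)
               for granted_op, prefix in ROLES.get(role, ()))

print(is_allowed("data-scientist", "read", "s3://models/llama3/weights.bin"))
print(is_allowed("data-scientist", "write", "s3://features/daily.parquet"))
```

In a production system the role assignments would come from the identity provider (OIDC/OAuth 2.0 claims or Apache Ranger policies) rather than a hard-coded table.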
Customer Momentum in H1 2025
The first half of 2025 saw record market adoption of Alluxio AI, with customer growth exceeding 50% compared to the previous period. Organizations across the tech, finance, e-commerce, and media sectors have increasingly deployed Alluxio’s AI acceleration platform to boost training throughput, streamline feature store access, and speed up inference workflows. With growing deployments across hybrid and multi-cloud environments, demand for Alluxio AI reflects rapidly rising expectations for high-performance, low-latency AI data infrastructure. Notable customers added in the half include:
- Salesforce
- Dyna Robotics
- Geely
Substantial I/O Performance Gains Confirmed in MLPerf Storage v2.0 Benchmark
Alluxio’s distributed caching architecture underscores its commitment to maximizing GPU efficiency and AI workload performance. In the MLPerf Storage v2.0 benchmarks:
- Training Throughput
- ResNet50: 24.14 GiB/s supporting 128 accelerators with 99.57% GPU utilization, scaling linearly from 1 to 8 clients and 2 to 8 workers.
- 3D-Unet: 23.16 GiB/s with 8 accelerators and 99.02% GPU utilization, similarly scaling linearly.
- CosmoFlow: 4.31 GiB/s with 8 accelerators at 74.97% GPU utilization, nearly doubling performance when scaling clients.
- LLM Checkpointing
- Llama3-8B: 4.29 GiB/s read and 4.54 GiB/s write (read/write times: 24.44 s and 23.14 s).
- Llama3-70B: 33.29 GiB/s read and 36.67 GiB/s write (read/write times: 27.39 s and 24.86 s).
“The MLPerf Storage v2.0 results validate the core value of our architecture: keeping GPUs fed with data at the speed they require,” Li added. “High GPU utilization translates directly into faster training times, better throughput for large models, and higher ROI on infrastructure investments. These benchmarks show that Alluxio can deliver performance at scale, across diverse workloads, without compromising flexibility in hybrid and multi-cloud environments.”
The post Alluxio Q2: Customer Growth & Sub-Millisecond AI Data Latency first appeared on AI-Tech Park.