Alluxio

2.6K posts

@Alluxio

Open Source Data Orchestration for Analytics and Machine Learning in the Cloud. @TachyonProject is now @Alluxio!

Joined October 2015

197 Following · 1.3K Followers

Alluxio@Alluxio·2d

Object storage is durable, but AI workloads need: • fast reads • efficient metadata ops • better access semantics Keep object storage as the source of truth. Add a data layer near compute. 👉na2.hubs.ly/H04PhqF0 #AIInfrastructure #MLOps

Alluxio@Alluxio·3d

Scaling GenAI to 300TB/day requires a fast data path. Security leader @Uptycs moved beyond traditional caching for better scale. Results: ⚡ Sub-second responses 📉 90% CPU reduction See how: na2.hubs.ly/H04N12T0 #GenAI #BigData

Alluxio@Alluxio·4d

GPUs often sit idle when data is locked in a different region. This guide shows how to optimize your data path to maximize throughput and cut cross-region egress costs. 👉 na2.hubs.ly/H04LXL20 #AI #GPU #DataPath

Alluxio@Alluxio·5d

@Coupang solved the "Data Wait" in ML training. By using a distributed cache across hybrid GPU clusters, they hit: ⚡ Instant job starts (no manual copying) 🚀 40% faster I/O than parallel file systems 🌐 Total code portability Full story: na2.hubs.ly/H04KC3T0 #ML #AI #GPU

Alluxio@Alluxio·Apr 3

Parquet on object storage can get expensive for retrieval-heavy workloads. Co-authored with @salesforce engineers, this white paper looks at reducing round trips for RAG, feature retrieval, and similar access patterns. 🔗 na2.hubs.ly/H04GQMH0 #AIInfrastructure #RAG

Alluxio@Alluxio·Apr 2

S3 is great for durability and scale. But for low-latency, semantics-heavy workloads, it can become the bottleneck. A tiered architecture with S3 + Alluxio helps close that gap.👇 na2.hubs.ly/H04Ftlg0 #AIInfrastructure #CloudStorage

Alluxio@Alluxio·Apr 1

A lot of AI infra friction comes from the data path: slow dataset access, too much data movement, and GPUs sitting idle waiting on data. This white paper breaks down the problem and the architecture behind it: na2.hubs.ly/H04D1ZH0 #AIInfrastructure #MLOps

Alluxio@Alluxio·31 Mar

@xiaohongshu cut nightly model training to 5.5 hours with Alluxio, plus 10x faster model downloads and 80% lower distribution cost. At scale, ML is often limited by data movement. Learn more: na2.hubs.ly/H04z1J30 #MLOps #MultiCloud #AIInfrastructure

Alluxio@Alluxio·30 Mar

Object storage is durable, but write-heavy pipelines often hit its limits. With Alluxio S3 Write Cache, 10 KB PUT latency dropped from ~30–40 ms to ~4–6 ms, with faster read-after-write and scaling.👇 na2.hubs.ly/H04z1JX0 #AIInfrastructure #ObjectStorage #MachineLearning

Alluxio@Alluxio·27 Mar

Storage I/O plays a bigger role in AI training than many teams expect. MLPerf Storage v2.0 showed strong results in data loading and checkpointing, with up to 99.57% GPU utilization. 👇 na2.hubs.ly/H04xL640 #AIInfrastructure #GPUComputing

Alluxio@Alluxio·26 Mar

PyTorch performance tuning is bigger than the training loop. Data access, GPU efficiency, and distributed execution all affect throughput. This guide covers practical ways to improve training efficiency. 👇 na2.hubs.ly/H04v70l0 #PyTorch #MLOps #AIInfrastructure

Alluxio@Alluxio·25 Mar

AI workloads need more than scalable object storage. Alluxio reduces small-object write latency by 5–8x and speeds up Safetensors model loading by 18x, helping teams keep GPUs moving. 👉na2.hubs.ly/H04v6Pf0 #AIInfrastructure #GPUComputing

Alluxio@Alluxio·24 Mar

ML scale is not just a compute problem. It is also a data access problem. Blackout Power Trading used Alluxio to scale from 5K to 100K+ models in the same 15-minute window. 👇 na2.hubs.ly/H04sHwQ0 #AIInfrastructure #MLOps

Alluxio@Alluxio·23 Mar

AI workloads do not scale on object storage alone. Add a data layer that removes the I/O bottleneck to get: ☑️ Low latency ☑️ High throughput ☑️ Less data movement ☑️ Better data access for GPUs 👉na2.hubs.ly/H04rcrW0 #AIInfrastructure #GPUComputing

Alluxio@Alluxio·20 Mar

#GTC week puts the spotlight on compute. But GPU utilization still depends on: • storage throughput • small files • data movement The data path still matters. 👇 na2.hubs.ly/H04pQs80 #GTC2026 #AIInfrastructure

Alluxio@Alluxio·19 Mar

What a week at #GTC26! Huge thanks to @Oracle for hosting us. ⚡ If you're buying fast compute but hitting the I/O wall, @Alluxio solves GPU starvation. Come talk to our team today at the Oracle Booth #1613 before we wrap up! #AIInfrastructure #DataArchitecture #OCI #GPU

Alluxio@Alluxio·18 Mar

Stop GPU starvation at #GTC2026! ⚡️ Join us at 2:25 PM to dive into our joint architecture with @OracleCloud. Learn how our compute-side caching continuously feeds OCI's high-performance GPUs, plus how @FireworksAI hit 1TB/s+ throughput! #AIInfrastructure #GPU #DataEngineering

Alluxio@Alluxio·17 Mar

The master node is a bottleneck for AI training. We built DORA: a fully decentralized, masterless architecture that keeps data close to compute with <1ms latency. Dive into the tech: na2.hubs.ly/H04kVLT0 #MLOps #AI #GPUs #DataEngineering

Alluxio@Alluxio·16 Mar

At #NVIDIAGTC today: Learn how Alluxio + @OracleCloud eliminate data bottlenecks and keep GPUs fully utilized. 🕒4:15 PM | Expo Hall Theater Or visit us at 📍 Oracle Booth #1613 👉 na2.hubs.ly/H04jL1y0 #AIInfrastructure #GPU #MachineLearning

Alluxio@Alluxio·13 Mar

🤖 Real-world robotics training creates tens of TBs of data daily. @DynaRobotics used Alluxio to eliminate 30% training slowdowns on H100 clusters and unlock multi-cloud GPU training. 👉na2.hubs.ly/H04gKXM0 #EmbodiedAI #Robotics #AIInfrastructure #GPU
