Lambda

1.8K posts

@LambdaAPI

The Superintelligence Cloud

San Francisco, CA · Joined July 2012
240 Following · 19.8K Followers
Pinned Tweet

Lambda @LambdaAPI
Lambda has raised over $1.5B in equity to build superintelligence cloud infrastructure. TWG Global, USIT, and existing investors led the Series E round to position Lambda to execute on its mission to give everyone the power of superintelligence. One person, One GPU. Press release: bit.ly/3X3lOOb

Lambda @LambdaAPI
We tested NVIDIA HGX B200 and GB300 NVL72 with TorchTitan and reproducible config changes:
- Llama 3.1 8B on 8x B200: 55% MFU
- Llama 3.1 70B on 16x B200: 50% MFU
- Llama 3.1 405B on 128 GPUs (GB300): 53% MFU
At 16k–32k sequence lengths, MFU climbs toward 60%.
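
For context on those numbers: MFU is achieved model FLOPs per second divided by the GPUs' aggregate peak. A minimal sketch of that accounting, assuming the standard ~6 FLOPs per parameter per token for training; the tokens/s throughput and peak-TFLOPs figure below are illustrative placeholders, not measurements from this run.

```python
# Minimal MFU accounting sketch. ASSUMPTIONS: ~6 FLOPs/param/token for
# forward+backward, and a placeholder peak of 2,250 dense BF16 TFLOPs per GPU.
def mfu(n_params: float, tokens_per_s: float, n_gpus: int, peak_tflops: float) -> float:
    achieved = 6.0 * n_params * tokens_per_s      # model FLOPs/s actually done
    peak = n_gpus * peak_tflops * 1e12            # aggregate peak FLOPs/s
    return achieved / peak

# Llama 3.1 8B on 8 GPUs; 200k tokens/s is an illustrative throughput.
print(f"MFU ~= {mfu(8e9, 200_000, 8, 2_250):.0%}")   # -> MFU ~= 53%
```

Longer sequences raise the attention share of the FLOPs budget, which is why the 6N approximation undercounts at 16k–32k and measured MFU climbs.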

Lambda @LambdaAPI
The xAI "low utilization" story has people mixing up two different metrics. Fleet utilization tells you how many GPUs are running. Model FLOPS Utilization (MFU) tells you how much of its peak compute each running GPU is actually capturing. Both matter, but they're not the same.
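
A toy sketch of the distinction, with made-up numbers; the only point is that the two ratios have different denominators.

```python
# Hypothetical numbers only: a fleet can look busy while each busy GPU
# captures a small fraction of its peak compute.
def fleet_utilization(gpus_busy: int, gpus_total: int) -> float:
    return gpus_busy / gpus_total                  # share of GPUs doing any work

def model_flops_utilization(achieved_tflops: float, peak_tflops: float) -> float:
    return achieved_tflops / peak_tflops           # share of peak a busy GPU captures

print(fleet_utilization(9_000, 10_000))            # 0.9 -> 90% of the fleet running
print(model_flops_utilization(450.0, 2_250.0))     # 0.2 -> each at only 20% MFU
```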

kishin @mulamx
I need some GPU credits. Anyone?

Lambda @LambdaAPI
Open-sourced on @huggingface. Downloaded 8.2k times last month.

Lambda @LambdaAPI
Agent harnesses reach the many when the models inside them are efficient. Our goal: compress frontier-grade skills into a footprint small enough to run unquantized on lightweight compute.
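
Rough arithmetic on what "unquantized on lightweight compute" implies: bf16/fp16 weights cost 2 bytes per parameter before KV cache and activations. The model sizes and the 16 GB memory budget below are assumptions for illustration, not Lambda's stated targets.

```python
# ASSUMPTIONS: bf16 weights (2 bytes/param), a 16 GB device budget, and
# round model sizes chosen for illustration. Ignores KV cache and activations.
def weight_gb(n_params: float) -> float:
    return n_params * 2 / 1e9                      # bf16: 2 bytes per parameter

for n in (1e9, 3e9, 7e9, 13e9):
    verdict = "fits" if weight_gb(n) <= 16 else "does not fit"
    print(f"{n/1e9:.0f}B params -> {weight_gb(n):.0f} GB of weights ({verdict} in 16 GB)")
```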

Lambda @LambdaAPI
The shift to AI factories needs strong infrastructure and strong leadership. Tomorrow in Los Gatos, Lambda joins @nvidia and @aligneddc to discuss women’s leadership, talent trends, and the future of AI infrastructure.

Lambda @LambdaAPI
FlashAttention-4 on NVIDIA Blackwell: 1,613 TFLOPs/s, 71% hardware utilization, and up to 2.7× faster than Triton.

Lambda @LambdaAPI
Same GPU. More tokens per second. Lower cost per token. The most optimized open-source attention kernel yet.
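
The FlashAttention-4 kernel itself isn't shown in this thread, so as a stand-in, here is a minimal throughput harness using PyTorch's scaled_dot_product_attention, which can dispatch to a FlashAttention-style fused kernel on supported GPUs. Shapes are arbitrary, and the numbers it prints will differ from the figures quoted above.

```python
import time
import torch
import torch.nn.functional as F

def bench_attention(batch=8, heads=32, seq=4096, dim=128, iters=20):
    q, k, v = (torch.randn(batch, heads, seq, dim, device="cuda", dtype=torch.bfloat16)
               for _ in range(3))
    for _ in range(3):                             # warmup
        F.scaled_dot_product_attention(q, k, v)
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        F.scaled_dot_product_attention(q, k, v)
    torch.cuda.synchronize()
    dt = (time.perf_counter() - t0) / iters
    flops = 4 * batch * heads * seq * seq * dim    # QK^T plus attn@V matmuls
    print(f"{dt * 1e3:.2f} ms/iter, {flops / dt / 1e12:.0f} TFLOPs/s")

bench_attention()
```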

Lambda @LambdaAPI
.@deepseek_ai v4 Pro's checkpoint is stored in a mix of FP4 and FP8, depending on the layer. This means the entire model can fit on a single NVIDIA 8x B200 node without trouble. @vllm_project: "Checkpoint is FP4+FP8 mixed: MoE expert weights are stored in FP4 while the remaining (attention / norm / router) params stay in FP8."
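
Back-of-envelope arithmetic for the memory-fit claim. The 192 GB of HBM per B200 is published spec; the parameter split between FP4 expert weights and FP8 everything-else below is hypothetical, since exact counts aren't given here.

```python
# 192 GB HBM per B200 is published spec. The parameter split is HYPOTHETICAL:
# for any plausible MoE/attention split, FP4 (0.5 B/param) + FP8 (1 B/param)
# packing keeps the checkpoint far below the node's total HBM.
HBM_GB = 192 * 8                                   # one 8x B200 node

moe_params = 600e9                                 # hypothetical FP4 expert weights
other_params = 40e9                                # hypothetical FP8 attn/norm/router

weights_gb = (moe_params * 0.5 + other_params * 1.0) / 1e9
print(f"weights: {weights_gb:.0f} GB vs node HBM: {HBM_GB} GB")  # 340 GB vs 1536 GB
```

The slack between weights and total HBM is what leaves room for KV cache and activations at serving time.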

Lambda @LambdaAPI
A few highlights:
- A 7B agent that outperforms GPT-4o, not by scaling up, but by training smarter inside its own loop.
- Lossless weight compression that speeds up inference by 177%.
- A public arena where 23 teams ran 103,000 adversarial battles to stress-test LLM security.
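
The post doesn't say which compression scheme delivered that speedup, so as a generic illustration of "lossless," here is a bit-exact round trip with zlib over a stand-in weight matrix; real weight-compression schemes exploit the structure of the weights far better than a general-purpose codec.

```python
import zlib
import numpy as np

w = np.random.randn(1024, 1024).astype(np.float16)          # stand-in weight matrix
packed = zlib.compress(w.tobytes(), level=6)
restored = np.frombuffer(zlib.decompress(packed), dtype=np.float16).reshape(w.shape)

assert np.array_equal(w, restored)                           # bit-for-bit identical
print(f"compression ratio: {w.nbytes / len(packed):.2f}x")
```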