emi (@gpuemi) - Twitter Profile

emi retweeted

steve@gpusteve·1h

GLM-5.2 Fast is up on the @vercel ai gateway now. 150 - 250 tps. fastest serverless speeds out there, give it a try: vercel.com/wafer-ai/~/ai-…

0

1

5

105

emi retweeted

Vercel Developers@vercel_dev·1h

GLM 5.2 Fast via Wafer is now available exclusively on Vercel AI Gateway. 2x faster token throughput vs other providers in internal benchmarks. model: 'zai/glm-5.2-fast' vercel.com/changelog/glm-…

5

8

71

17.6K

emi retweeted

Y Combinator@ycombinator·1d

Bring your whole team and dozens of AI coding agents into the same chat threads, then let @linzumi_ai keep the fleet coordinated and unblocked. And for a limited time, try state of the art open-weights intelligence free: GLM 5.2 at high speed, via their @wafer_ai partnership. → linzumi.com

34

38

263

161.5K

emi retweeted

Analytic Valley Girl Chris@ChrisExpTheNews·3d

I do not get this neuroticism. Your electricity bill from leaving the lights on all the time is like ten dollars a year. It is not 1926.

Matt Smethurst@MattSmethurst

40% of fatherhood is walking around the house, turning off lights.

483

486

29.5K

2.6M

emi retweeted

Ankit Gupta@agupta·2d

in the near term you should probably just use @wafer_ai which seems to be the cheapest provider instead of building your own rig. but the fact you can even consider doing it now is a very good sign for personal on-device AI in the next decade as trillions of dollars go into chip science.

Jordan Nanos@JordanNanos

GLM 5.2 costs $1.40/4.40 per Mtok at 40 tok/sec and people seriously consider buying GPU rigs for it

13

142

51.3K
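The pricing in the quoted post above can be turned into a rough hourly figure. This is a back-of-the-envelope sketch, not a definitive cost model. It assumes "$1.40/4.40 per Mtok" means $1.40 per million input tokens and $4.40 per million output tokens (the post does not spell this out), it ignores input-token cost, and the rig price used in the usage line is a purely hypothetical placeholder, not a figure from the thread.

```python
def output_cost_per_hour(price_per_mtok: float, tokens_per_second: float) -> float:
    """Dollar cost of one hour of continuous output at a given decode speed."""
    tokens_per_hour = tokens_per_second * 3600
    return price_per_mtok * tokens_per_hour / 1_000_000


def hours_to_break_even(rig_cost: float, price_per_mtok: float,
                        tokens_per_second: float) -> float:
    """Hours of nonstop generation before API spend equals a rig's purchase price.

    Ignores electricity, depreciation, and input-token cost (all assumptions).
    """
    return rig_cost / output_cost_per_hour(price_per_mtok, tokens_per_second)


# Figures from the post: $4.40 per Mtok (assumed to be the output price), 40 tok/sec.
hourly = output_cost_per_hour(4.40, 40)

# HYPOTHETICAL rig price, for illustration only.
break_even_hours = hours_to_break_even(10_000, 4.40, 40)

print(f"API output cost: ${hourly:.4f} per hour of continuous generation")
print(f"Break-even for a hypothetical $10,000 rig: {break_even_hours:,.0f} hours")
```

Under these assumptions, continuous generation at 40 tok/sec costs well under a dollar per hour through the API, which is the gap the two posts are pointing at.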

emi retweeted

dhru@dhruzzz·3d

@wafer_ai the pricing is good too

0

2

6

1.4K

emi retweeted

Aether Oracle@aether_oracle·4d

Holy shit 222 tokens per second with GLM 5.2

wafer@wafer_ai

🚨 BREAKING: wafer now runs the fastest, lowest-latency GLM-5.2 anywhere
ranked #1 across every provider on Artificial Analysis:
⚡ 222 output tok/s (next best: 173)
⚡ 12.6s end-to-end response time (next best: 16.9s)
try it: app.wafer.ai

1

2

8

860

emi retweeted

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·4d

Opus-SuperFast

wafer@wafer_ai

🚨 BREAKING: wafer now runs the fastest, lowest-latency GLM-5.2 anywhere
ranked #1 across every provider on Artificial Analysis:
⚡ 222 output tok/s (next best: 173)
⚡ 12.6s end-to-end response time (next best: 16.9s)
try it: app.wafer.ai

7

3

219

20.2K

emi retweeted

Olek@oleksoleksoleks·4d

GLM 5.2 @ 220 tok/s wafer.ai

18

4

326

45.6K

emi retweeted

SemiAnalysis@SemiAnalysis_·5d

100% of AI chip startups have slides/“simulated performance data” showing that their chip is way better, but 99% of custom ASICs fail. Why? The MATH isn’t MATH until you realize that AI chips are about Software. It is relatively easy to build a chip and put numbers onto slides; it is hard to build great software. That is why 99% of AI chip startups fail.

51

61

798

144.7K

emi retweeted

wafer@wafer_ai·4d

🚨 BREAKING: wafer now runs the fastest, lowest-latency GLM-5.2 anywhere
ranked #1 across every provider on Artificial Analysis:
⚡ 222 output tok/s (next best: 173)
⚡ 12.6s end-to-end response time (next best: 16.9s)
try it: app.wafer.ai

62

32

658

217.6K

emi retweeted

wafer@wafer_ai·5d

🚨 BREAKING: GLM-5.2 is now on app.wafer.ai and openrouter.ai/provider/wafer
It's the top open-weights model right now. #1 open model on the Artificial Analysis Intelligence Index with a score of 51, a 1M-token context window. It’s also the first open-weights model to cross 80% on Terminal-Bench.
try it now: app.wafer.ai

3

14

98

6.5K

emi retweeted

wafer@wafer_ai·5d

computa activate GLM 5.2 on app.wafer.ai

1

3

13

1.1K

emi retweeted

Alfredo Andere@AlfredoAndere·Jun 17

as with most large research projects that concretize in one pretty graph, it's hard to gauge the months of work, and hundreds of engineer and scientist hours that went into this result. grateful for the love and care kenny and team put into this and excited to finally share.

Kenny Workman@kenbwork

We introduce TherapeuticsBench Preclinical Pharmacology (TxBench-PP), a verifiable benchmark for small-molecule preclinical pharmacology and the first focused slice of a broader benchmarking effort across drug-discovery stages and therapeutic modalities. TxBench-PP tests whether agents can recover accurate conclusions from realistic assay artifacts rather than memorized facts from the literature.
The benchmark contains 100 evaluations indexed by program stage, assay type, and task structure, spanning mechanism-of-action (MoA) and pharmacodynamic (PD) reasoning, compound-target engagement, causal target validation, developability and safety, and translational efficacy. The strongest model-harness configuration was Claude Opus 4.8 + Pi at 59.3%, followed by GPT-5.5 + Pi at 55.3%.
While experiments are rate-limited by natural processes, human decisions and organizational consensus often make up significant components of program timelines in drug discovery. Agents promise to accelerate discovery, development, and translation by compressing these interpretation and decision-making loops. However, the practical use of agentic systems in industrial workflows requires standardized and trusted methods of evaluating performance. This is especially challenging in drug discovery because the ecosystem is a sprawling landscape of assay categories, development stages, therapeutic modalities, and decision types. Benchmarks must therefore measure realistic tasks while providing focused treatment of the many local scientific judgments that make up the biotech ecosystem.
We evaluated 16 model-harness configurations, comprising 11 models across three agent harnesses, on 100 preclinical pharmacology tasks. Each configuration was run three independent times per task, yielding 4,800 agent trajectories. Performance varied by program stage: accuracy ranged from 27% in screening and hit prioritization to 55% in drug response. Difficult program stages involved decisions across QC, statistics, and chemical or biological judgment of molecular candidates.
Trajectory analysis reveals gaps in scientific judgement. Failures included incorrect perception of assay outputs, reliance on literature priors over supplied evidence, and assay-specific reasoning mistakes.
Manuscript, results and subset of evals/trajectories available below:

0

3

11

2.2K

emi retweeted

steve@gpusteve·6d

we RECENTLY delivered low latency voice for neon health!! see below to see how we delivered blazing fast speeds.

wafer@wafer_ai

minimizing time to first token is CRUCIAL for voice AI deployment. optimizing GLM-5.1 for Neon Health took ttft 800ms → 550ms at 25% higher peak load.
what we learned:
• kv locality is a scheduling primitive (95%+ hit rate)
• prefill admission > prefill speed - chunked prefill
• short decode steps beat speculative decoding under bursts
• stable first token > max gpu utilization
• optimize the client-observed number, not server-side ttft
under 2 weeks, under a BAA, US-only residency.

2

1

20

2.3K

emi retweeted

wafer@wafer_ai·6d

minimizing time to first token is CRUCIAL for voice AI deployment. optimizing GLM-5.1 for Neon Health took ttft 800ms → 550ms at 25% higher peak load.
what we learned:
• kv locality is a scheduling primitive (95%+ hit rate)
• prefill admission > prefill speed - chunked prefill
• short decode steps beat speculative decoding under bursts
• stable first token > max gpu utilization
• optimize the client-observed number, not server-side ttft
under 2 weeks, under a BAA, US-only residency.

2

1

13

3.2K
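The last lesson in wafer's post above ("optimize the client-observed number, not server-side ttft") can be illustrated with a small timing helper. This is a minimal sketch and not wafer's tooling: `measure_client_ttft` and `fake_stream` are hypothetical names, and the fake stream stands in for whatever streaming client a real deployment would use. The point is that the clock starts before the request is consumed and stops when the first token reaches the caller, so network and queueing delay are included.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_client_ttft(stream: Iterable[str]) -> Tuple[float, float, int]:
    """Return (ttft_seconds, total_seconds, token_count) as observed by the caller."""
    start = time.perf_counter()
    ttft = None
    count = 0
    for _ in stream:
        if ttft is None:
            # First token has arrived at the client: this is the client-observed TTFT.
            ttft = time.perf_counter() - start
        count += 1
    total = time.perf_counter() - start
    # An empty stream has no first token; report the total wait instead.
    return (ttft if ttft is not None else total), total, count


def fake_stream(first_token_delay: float, n_tokens: int,
                step: float = 0.0) -> Iterator[str]:
    """HYPOTHETICAL stand-in for a streaming response (queueing + prefill, then decode)."""
    time.sleep(first_token_delay)
    for i in range(n_tokens):
        if i:
            time.sleep(step)
        yield f"tok{i}"


ttft, total, count = measure_client_ttft(fake_stream(0.05, 3, step=0.01))
print(f"ttft={ttft * 1000:.0f}ms total={total * 1000:.0f}ms tokens={count}")
```

In practice the same wrapper would go around the real client's token iterator, and the resulting figure can be compared with server-reported TTFT to see how much latency is added outside the model server.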

emi@gpuemi·Jun 17

good luck to p26! <3

steve@gpusteve

today is yc demo day. just about a year ago, @gpuemi and i stepped onto that stage and presented wafer (f.k.a. herdora). what felt like the end of a chaotic batch turned out to be the beginning of everything that mattered. for everyone presenting today: enjoy the moment, celebrate how far you've come, and take the photos. wishing u the best, p26♥️

1

0

8

3.4K

emi retweeted

steve@gpusteve·Jun 17

today is yc demo day. just about a year ago, @gpuemi and i stepped onto that stage and presented wafer (f.k.a. herdora). what felt like the end of a chaotic batch turned out to be the beginning of everything that mattered. for everyone presenting today: enjoy the moment, celebrate how far you've come, and take the photos. wishing u the best, p26♥️

53

3

200

78.5K

emi retweeted

david friedberg@friedberg·Jun 14

they’re not jobs if they’re not valued. they’re not valued if there aren’t customers out there willing to pay them for their great work. needing the government to “create” a job is tantamount to welfare and that level of welfare resolves these individuals to a dependency on the government and lack of economic mobility. and chains our people, collectively, to a more indentured future. you may be well intentioned but you have, and always will, fail to see the destitute folly of government as a job creation engine. i have tried to engage you on this topic, in good faith, with empiricism and reasoning, but you have only dodged my points and pivoted to some populist refrain about the importance of taxation and the evils of productivity-driven success. i can only assume you’re dodging these truths because you and the rest of the politburo leadership have deemed the conversation unsafe speech and put your oligopoly at risk. let’s leave it at that then. perhaps if your ways get their day, we can all bask in the glories of the dark ages ahead.

336

952

12.1K

465.2K

emi@gpuemi·Jun 14

@AlfredoAndere @WillManidis :(

0

41

Alfredo Andere@AlfredoAndere·Jun 12

@WillManidis Many such cases

1

0

3

514

Will Manidis@WillManidis·Jun 12

it brings me no joy to report I spent a year wondering why I was constantly sleepy and had a low sleep score on my whoop that was totally cured by simply not wearing the whoop.

98

83

7.2K

790.4K
