Applied Compute (@appliedcompute) - Twitter Profile

Pinned Tweet

Applied Compute@appliedcompute·8 Apr

x.com/i/article/2041…

5

18

136

196.4K

Applied Compute@appliedcompute·3d

1

12

1.2K

Applied Compute@appliedcompute·3d

Using the Context Engine to build a pipeline on APEX-Agents produces up to 16.9% relative improvement at fixed reasoning, with consistent gains on GDPVal. For enterprises, this turns context into a compounding asset: every production rollout makes the next one better.

1

0

12

1.6K

Applied Compute@appliedcompute·3d

Introducing the AC Context Engine: enterprise-grade infrastructure to continuously encode nuanced institutional knowledge into a living artifact (Contextbase). We find that our Contextbases can be the unlock to moving the Pareto frontier on cost and intelligence.

2

5

97

44.6K

Applied Compute@appliedcompute·22 Apr

Read the full research report: appliedcompute.com/research/infer…

1

12

1.2K

Applied Compute@appliedcompute·22 Apr

We release the three workload files along with a lightweight harness for replaying them. See the repository here: github.com/Applied-Comput…

1

0

12

1.4K

Applied Compute@appliedcompute·22 Apr

We study three production use cases: agentic coding, code QA, and office work. For each, we capture full traces from production deployments. These workloads are long-context and long-horizon, extending into hundreds of tool call turns for each. Each row in the workload file is a single agent trace, including the input prompt, generation, and tool call lengths needed to synthetically replay against an OpenAI-compatible endpoint.

1

0

20

2.2K
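
The row layout described in the tweet above (input prompt, generation, and tool call lengths per trace) suggests how a replay could be planned. The following is a minimal sketch under stated assumptions: the field names `prompt_tokens`, `generation_tokens`, and `tool_output_tokens`, and the helper `plan_replay`, are hypothetical and are not taken from the released workload files or harness. It only computes the size of each turn's request; sending those requests to an OpenAI-compatible endpoint is left out.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Turn:
    generation_tokens: int   # assumed: assistant output length for this turn
    tool_output_tokens: int  # assumed: tool result length appended afterwards


@dataclass
class Trace:
    prompt_tokens: int       # assumed: length of the initial input prompt
    turns: List[Turn]


@dataclass
class ReplayRequest:
    turn_index: int
    context_tokens: int      # tokens already in context when the turn starts
    max_tokens: int          # generation length to request for this turn


def plan_replay(trace: Trace) -> Iterator[ReplayRequest]:
    """Yield one synthetic request per turn of a single agent trace.

    The context grows by the assistant output plus the tool output after
    every turn, which is what makes these workloads long-context.
    """
    context = trace.prompt_tokens
    for index, turn in enumerate(trace.turns):
        yield ReplayRequest(index, context, turn.generation_tokens)
        context += turn.generation_tokens + turn.tool_output_tokens


if __name__ == "__main__":
    example = Trace(prompt_tokens=1000,
                    turns=[Turn(50, 200), Turn(30, 500), Turn(80, 0)])
    for request in plan_replay(example):
        print(request)
```

In a real replay each `ReplayRequest` would presumably become one chat completion call with a synthetic context of `context_tokens` tokens and a generation cap of `max_tokens`; how the released harness actually does this is not stated in the thread.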

Applied Compute@appliedcompute·22 Apr

Inference demand in 2026 has surged, but not for single-turn workloads that most engines are benchmarked on. Agentic workloads have a different structure: traces consist of many tool-calling turns with heavy-tailed distributions over assistant and tool output. These workloads introduce a new set of challenges for efficient serving. We pulled production traces from over 100 post-training runs and are open sourcing these workloads to help define a new target for inference engine optimization.

6

12

133

34.1K

Applied Compute retweeted

Moritz Stephan@moritz_stephan·15 Apr

it was a blast working with @spdling, @rhythmrg, @raymondmfeng and the rest of the @appliedcompute team. Splitting capability maximization (i.e. be good at finding bugs) and product alignment (i.e. short rollouts) into two distinct phases while training made a big difference here and can be useful for other specialized models when real-world product constraints matter

Cognition@cognition

Today we're releasing SWE-check, a specialized bug detection model we RL-trained with @appliedcompute that matches frontier performance on internal in-distribution evals and makes meaningful progress on out-of-distribution evals, all while running 10x faster.

0

2

43

4.3K

Applied Compute@appliedcompute·15 Apr

Our work on SWE-check with @cognition is a good window into how we work. We collaborate closely with the team, train a specialized model inside their real environment, and iterate from feedback. Training a specialized model gives teams the flexibility to choose where they want to sit on the cost-latency-performance Pareto frontier. In this case, we specifically optimized for cost and latency given the product requirements. Try it for yourself in Windsurf Next today, and read the technical details in the post below!

Cognition@cognition

Today we're releasing SWE-check, a specialized bug detection model we RL-trained with @appliedcompute that matches frontier performance on internal in-distribution evals and makes meaningful progress on out-of-distribution evals, all while running 10x faster.

2

4

68

6.2K

Applied Compute retweeted

Bryan Lee@_brylee10·10 Apr

at AC i’ve learned forward deployed work is among my favorites. a personal favorite memory was getting a high five from a customer after a day in the office and a successful prod deployment. closely collaborating with companies and diving into the nitty-gritty of their systems to make agents work is challenging but rewarding. it’s “full stack” in the sense that it involves eng, research, and understanding customer needs, which makes each day different and gets me excited.

Applied Compute@appliedcompute

There is a large delta between what models can do and what they deliver in company-specific workflows. We bridge that gap through forward deployment. In a given week, our engineers might build eval frameworks from scratch, deploy a large-scale context ingestion engine, and present results to F500 leadership. We fine-tune models on proprietary data no frontier lab has seen and optimize agent performance against real-world outcomes. We're excited by engineers with rigor, high customer empathy, and a bias toward action in ambiguity. appliedcompute.com/blog/unlocking…

0

4

22

2.4K

Applied Compute@appliedcompute·10 Apr

There is a large delta between what models can do and what they deliver in company-specific workflows. We bridge that gap through forward deployment. In a given week, our engineers might build eval frameworks from scratch, deploy a large-scale context ingestion engine, and present results to F500 leadership. We fine-tune models on proprietary data no frontier lab has seen and optimize agent performance against real-world outcomes. We're excited by engineers with rigor, high customer empathy, and a bias toward action in ambiguity. appliedcompute.com/blog/unlocking…

0

45

37.4K

Applied Compute@appliedcompute·8 Apr

If you believe in this future, you should join us: jobs.ashbyhq.com/Applied%20Comp…

0

10

2.5K

Applied Compute@appliedcompute·31 Mar

@Wing_VC @EricNewcomer Read more: wing.vc/et30/list#mid-…

0

4

918

Applied Compute@appliedcompute·31 Mar

Thanks to @Wing_VC and @EricNewcomer for recognizing us in the 2026 Enterprise Tech 30 list alongside so many exceptional teams. Lots more to build.

4

5

26

4.1K

Applied Compute@appliedcompute·27 Mar

We post-trained AC-Small on ~2,000 expert tasks in law, consulting, and finance. It improved on every held-out professional benchmark we tested, including GDPVal, Toolathlon, and APEX V1, with no regression in general capabilities. The strongest gain was in medicine (+13.3 pp), a domain entirely absent from training. What generalized was procedural discipline. Expert data encodes how professionals structure, verify, and revise their work, and that discipline transfers across domains. Enterprises that capture and structure the work of their best professionals will build models that can outperform general alternatives on the tasks that matter most to their business.

Mercor@mercor_ai

Does training on APEX-Agents dev set generalize beyond the benchmark? @appliedcompute post-trained GLM-4.7 on ~2,000 expert Mercor tasks and achieved state-of-the-art legal performance on APEX Agents. We then evaluated the model on other enterprise benchmarks. On GDPVal, AC-Small’s win+tie rate rose from 55.0% to 62.7% (+7.7pp), ranking 5th overall and ahead of Opus 4.5.

0

6

50

7.8K

Applied Compute@appliedcompute·26 Mar

The FDE role of the AI era has fundamentally changed. It's no longer just about building dashboards and connecting data pipes - it's about building evals, deploying agents that improve in production, winning trust across the org chart, and closing feedback loops that compound over time. We wrote about what to expect when deploying AI in the enterprise today.

Michael Chen@michaelzchen5

x.com/i/article/2037…

3

4

67

12.2K
