Turing

9.5K posts


@turingcom

Accelerating superintelligence to drive economic growth.

Palo Alto, CA · Joined September 2018
2.1K Following · 15.8K Followers
Pinned Tweet
Turing
Turing@turingcom·
That COBOL system you retired 3 years ago? It's sitting in a repo — unmaintained, unused, but valuable. Have a legacy codebase? Schedule a call below.
2
6
19
19.3K
David Singleton
Excited to announce that @hbarra, @alcor and I are joining Meta Superintelligence Labs with the entire @Dreamer team today.

The last few months have been extraordinary: we built Dreamer, put the beta in the world just a month ago, and saw magic come to life for real people. Since then, thousands of people have used Dreamer to build personal, intelligent software with our Sidekick in the world’s newest and most popular programming language: English! They're building and sharing agents to manage email, calendar, and to-dos; create learning tools for their kids; learn new languages; plan trips with friends; become better cooks; help them with work; achieve their health goals; or simply express themselves creatively: all sorts of surprising and uniquely personal needs. These are agents as unique as the people building them, because they're built exactly the way each person wants them to be. We’ve captured some of our favorites at dreamer.com/community-lett….

What matters most here isn’t the early momentum; it’s what Dreamer has enabled people to do. People are building things they’ve wanted for years. They’re solving real, important problems no traditional software company would ever prioritize, because they’re too niche, too bespoke, too personal. What company would ever build for an “n of 1”?

Our bet from the beginning has been that software should be personal, malleable, and shaped by the person using it. The constraint was never people’s imagination. It was the fact that building software is out of reach for most people. This early chapter gives us conviction that the idea resonates, the need is real, and the moment is now.

@alexandr_wang was helpful to us from the very beginning, and when we showed Dreamer to Mark Zuckerberg and @natfriedman earlier this year, it was clear right away that we share the same vision of the future: one where billions of people have the power to create software that makes their lives better.

We’re thrilled to accelerate this mission by joining Meta Superintelligence Labs and licensing our technology to Meta. Read more at meta.com/superintellige….

Deeply grateful to our investors @jillchase124 and @ninaachadjian for supporting our vision for a more personal, creative, and intelligent future for software. Thank you for the trust, the thought partnership, and for being in our corner at every step.

To everyone in our community who built with us: thank you. You've taught us what's possible, and you're the proof this works. We're so grateful, and we're just getting started!
David Singleton tweet media
75
30
518
262.8K
Turing
Turing@turingcom·
Turing operates at the intersection of frontier research and enterprise deployment. Our experience with leading AI labs informs what’s realistic, reliable, and ready for production. That perspective helps enterprises move faster, avoid costly missteps, and deploy AI systems that scale within real regulatory and operational constraints. turing.com/blog/frontier-…
1
4
7
105
Turing
Turing@turingcom·
Human-guided AI is how AI works in regulated environments. In compliance, fraud, and audit workflows, speed is not enough. Systems must be explainable, auditable, and defensible.

Autonomous-first AI fails where accountability matters:
- Hallucinations
- Silent drift
- Unclear decisions
- Weak audit trails

“The model said so” does not hold up.

The shift is architectural:
-> Confidence-based routing
-> Deterministic validation
-> Human gating before execution
-> End-to-end traceability

This is partial autonomy:
- Routine work scales
- Edge cases get expert review
- Every decision is reconstructable

Governance is not a layer. It is the system.
1
4
7
106
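The architectural pattern described in the tweet above (confidence-based routing, a human gate before execution, and end-to-end traceability) can be sketched in a few lines. This is a minimal illustration, not Turing's implementation; the 0.85 threshold and all field names are assumptions.

```python
import time
import uuid

CONFIDENCE_THRESHOLD = 0.85  # assumed cutoff; tune per workflow

def route_decision(case_id, model_output, confidence, audit_log):
    """Route one model decision: auto-execute when confidence clears the
    threshold, otherwise gate it behind human review. Every path appends
    a trace entry so the decision is reconstructable end to end."""
    entry = {
        "trace_id": str(uuid.uuid4()),   # end-to-end traceability
        "case_id": case_id,
        "model_output": model_output,
        "confidence": confidence,
        "timestamp": time.time(),
        "route": ("auto_execute" if confidence >= CONFIDENCE_THRESHOLD
                  else "human_review"),  # expert gate before execution
    }
    audit_log.append(entry)
    return entry["route"]

audit_log = []
route_decision("txn-001", "approve", 0.97, audit_log)  # routine work scales
route_decision("txn-002", "flag", 0.42, audit_log)     # edge case, gated
```

The point of the pattern is that the audit log is not bolted on afterward: the routing function cannot return without writing a reconstructable trace entry.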
Turing
Turing@turingcom·
Turing Research is launching a groundbreaking initiative to capture and utilize the complete, unfiltered operational history of companies, creating the definitive dataset for training the next generation of frontier models.

Project Lazarus is an initiative to acquire and permanently preserve the full, unfiltered operational history of defunct or inactive companies at scale. We focus on private codebases, version histories, internal documentation, post-mortems, experimentation logs, infrastructure tooling, and everyday work artifacts that collectively reflect how real organizations actually operate.

These materials capture the reality of knowledge work: incomplete specifications, tradeoffs made under time pressure, accumulated technical debt, evolving systems, and decisions made under uncertainty. Unlike polished outputs, operational traces preserve the causal structure of work across weeks, months, and years.

We prioritize industries with high complexity and outsized GDP impact, including financial services, healthcare and pharma, advanced manufacturing, and enterprise software. These domains contain long-horizon decision making, regulatory constraints, supply chain dependencies, and high-value intellectual property that are critical for training economically useful AI systems.

The data is structured for advanced methodologies such as reinforcement learning, imitation learning, and long-horizon task evaluation, enabling models to learn multi-step reasoning, organizational decision processes, and system diagnosis over extended timelines.

For founders, Project Lazarus is also preservation. A company’s history is a compressed record of human judgment, experimentation, and problem-solving. Instead of disappearing, that work compounds by becoming part of the foundation shaping the next generation of autonomous AI systems.
Turing tweet media
4
14
46
5.5K
Turing
Turing@turingcom·
Request a sample task featuring a curated issue prompt, validated patch, pass/fail test states & metadata on difficulty, solvability, and repository source: turing.com/case-study/cur…
0
3
8
119
Turing
Turing@turingcom·
CASE STUDY: Better code models need better benchmarks. We partnered with a client to build a dataset that shows where models actually break, not just where they succeed.

- 200+ SWE-bench-style Java tasks
- 20+ real GitHub repositories
- Each task includes a validated patch, reproducible tests, and a trainer-authored issue prompt

The goal was simple: reflect how bugs are found and fixed in the real world.

The problem: Most benchmarks rely on clean, solvable examples. Real pull requests are not like that. They are messy, uneven, and often hard to resolve. Our client needed to understand:
- Where their model succeeds
- Where it fails
- How well it generalizes across real codebases

The approach: We curated tasks from high-quality Java repositories with strict criteria:
- Reproducible test failures before the patch
- Clean passes after the patch
- Meaningful logic changes only
- Stable compilation throughout

Each repo was containerized in Docker to ensure consistent, isolated test execution. When issues were missing, Turing trainers wrote them. Every prompt was:
- Problem-focused
- Neutral and solution-agnostic
- Aligned with test behavior for clean evaluation

We also balanced difficulty:
- About 30 percent solvable
- About 70 percent designed to expose failure modes

The outcome: A benchmark that does more than measure accuracy. It reveals capability. Teams can now:
- Test model performance on real bugs
- Identify breakpoints across complexity and context
- Analyze failure patterns with precision

This is how you move from optimistic benchmarks to real insight.
1
3
8
2.3K
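The strict curation criteria in the case study above reduce to one invariant per task: the suite must reproducibly fail before the patch and cleanly pass after it. A hypothetical sketch of that check, assuming test results are recorded as simple name-to-status dicts (not Turing's actual task format):

```python
def is_valid_task(pre_patch, post_patch):
    """A curated task qualifies only if at least one test reproducibly
    fails before the patch and the whole suite passes after it."""
    fails_before = any(status == "fail" for status in pre_patch.values())
    passes_after = all(status == "pass" for status in post_patch.values())
    return fails_before and passes_after

# Example: one failing test before the fix, a clean pass after.
pre  = {"testParse": "fail", "testFormat": "pass"}
post = {"testParse": "pass", "testFormat": "pass"}
assert is_valid_task(pre, post)

# A task whose tests already pass before the patch is rejected:
# it cannot demonstrate that the patch fixed anything.
assert not is_valid_task({"testParse": "pass"}, {"testParse": "pass"})
```

Running each suite twice (pre- and post-patch) inside an isolated container is what makes this check deterministic across machines.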
Turing retweeted
Turing
Turing@turingcom·
Turing is featured in @ServiceNowRSRCH's Enterprise Ops Gym paper. We built the task and evaluation backbone:
- 1,000 prompts
- 7 single-domain plus 1 hybrid workflow
- 7-to-30-step planning horizons
- Expert reference executions with logged tool calls
- Deterministic validation for success and side-effect control

Enabling structured comparison of enterprise agent performance across domains and complexity tiers.

Dataset -> Paper -> Website -> Code. Below.
Turing tweet media
1
7
14
27.9K
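"Deterministic validation for success and side-effect control" in the tweet above can be illustrated as a state-diff check: the goal keys must change as expected, and nothing else may change. A sketch under the assumption that environment state is snapshotted as a flat dict (not the paper's actual harness):

```python
def validate_run(initial_state, final_state, expected_changes):
    """Deterministic success check: every expected change must appear in
    the final state, and no key outside those changes may differ from
    the initial state (side-effect control)."""
    for key, value in expected_changes.items():
        if final_state.get(key) != value:
            return False  # task goal not achieved
    touched = set(initial_state) | set(final_state)
    for key in touched - set(expected_changes):
        if initial_state.get(key) != final_state.get(key):
            return False  # unintended side effect detected
    return True

before     = {"ticket_42": "open",   "ticket_43": "open"}
after_good = {"ticket_42": "closed", "ticket_43": "open"}
after_bad  = {"ticket_42": "closed", "ticket_43": "closed"}  # extra change

assert validate_run(before, after_good, {"ticket_42": "closed"})
assert not validate_run(before, after_bad, {"ticket_42": "closed"})
```

Because the check compares concrete states rather than judging transcripts, the same run always scores the same way, which is what makes cross-model comparison structured.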
Turing retweeted
Turing
Turing@turingcom·
Case Study: Most AI agent evals are flawed. They measure outputs. Real agents operate across 80–200+ actions, tools, and OS environments where failure is gradual, not binary.

At Turing, we built a new evaluation framework:
- 900+ deterministic tasks
- 450+ parent–child pairs
- 1,800+ evaluable scenarios via prompt–execution swapping
- 6 domains, balanced across Windows, macOS, Linux
- 40% open-source, 60% closed-source tools

Each task includes full telemetry:
- screen recordings
- event logs (clicks, keystrokes, scrolls)
- timestamped screenshots
- structured prompts, subtasks, and metadata

The key idea: structured failure. Instead of injecting errors, we create them by swapping execution and intent:
- Parent prompt + Child execution
- Child prompt + Parent execution

This produces controlled, classifiable failures:
- Critical mistake
- Bad side effect
- Instruction misunderstanding

With calibrated complexity (80–225 actions) and strict QA, this becomes a fully reproducible benchmark.

Result: We can measure not just if agents succeed, but:
- where they break
- how errors propagate
- how robust they are across real environments

Agents don’t fail at the answer. They fail in the process. Read more case studies below.
4
5
18
51.7K
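The swapping construction above maps cleanly to a few lines of code: each parent–child pair yields its two original (matched) scenarios plus the two swaps, which is consistent with 450+ pairs expanding to 1,800+ evaluable scenarios at 4 per pair. A hypothetical sketch with assumed task dicts, not Turing's actual pipeline:

```python
def expand_pair(parent, child):
    """From one parent-child task pair, build four scenarios: the two
    originals (intent matches execution) and the two prompt-execution
    swaps, which yield controlled, classifiable failures."""
    return [
        {"prompt": parent["prompt"], "trace": parent["trace"], "label": "match"},
        {"prompt": child["prompt"],  "trace": child["trace"],  "label": "match"},
        {"prompt": parent["prompt"], "trace": child["trace"],  "label": "mismatch"},
        {"prompt": child["prompt"],  "trace": parent["trace"], "label": "mismatch"},
    ]

parent = {"prompt": "Archive last month's invoices",
          "trace": ["open app", "select all", "archive"]}
child  = {"prompt": "Archive only unpaid invoices",
          "trace": ["open app", "filter unpaid", "archive"]}

scenarios = expand_pair(parent, child)
assert len(scenarios) == 4  # 4 per pair: 450+ pairs -> 1,800+ scenarios
```

Because each mismatched scenario pairs a known intent with a known-divergent execution, the resulting failure is controlled and can be classified rather than merely detected.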
Turing retweeted
Jonathan Siddharth
Jonathan Siddharth@jonsidd·
Important Project Lazarus update: Back in December, @Turingcom pioneered acquiring real-world startup/enterprise codebases and operational data to train frontier AI models. @steph_palazzolo at @theinformation broke the story on day one. Incredible reporting that helped define a new category.

The companies might be dead, but the human intelligence that built them can live on, powering the next generation of frontier models. Lazarus from Turing resurrects the spirits of dead companies.

Now we're scaling massively. Buying all data assets from active and inactive companies. Founder or investor or operator with data to monetize? Hit me up. DM or email jonsid@turing.com
Jonathan Siddharth tweet media
Stephanie Palazzolo@steph_palazzolo

Can't go public or sell yourself? Try selling your codebase to an AI lab as training data! In this morning's AI Agenda, we get into this growing trend, as data curation firms like Turing and AfterQuery pick up failed startups' codebases. theinformation.com/articles/turin…

6
19
91
20.6K
Turing
Turing@turingcom·
@SnowD3n_india It looks like you may still have an unanswered question.
2
0
0
13