Derek Ouyang (@derekouyang) - Twitter Profile

Pinned Tweet

Derek Ouyang@derekouyang·17 Mar

Let's say you're trying to understand racial health disparities in the U.S. Does it matter which source of administrative race/ethnicity information you use? We analyzed a dataset of 5M+ patients with linked electronic health records and Census Bureau microdata to find out. 🧵👇

Derek Ouyang@derekouyang·2d

@sebkrier Appreciate the spotlight on our efforts to reduce regulatory bloat at RegLab (reglab.stanford.edu)! This is a great list, and we're working on projects across many of these domains.

Séb Krier@sebkrier·6d

📜 Since there’s renewed interest in how AI could help with governance, here are 14 specific government processes where AI agents could make a measurable difference today:

Impenetrable forms and applications: citizens face complex, jargon-filled forms that cause them to miss benefits or fail to comply with regulations. AI can replace forms with plain-language conversations that extract data from documents, calculate eligibility, and only ask relevant questions. In the US, citizens spend an estimated 6.5 billion hours per year on federal tax compliance. The IRS sends you no pre-filled return despite already having your W-2s and 1099s. An AI agent could pull all income data the government already holds, pre-populate a return, flag deductions you're likely eligible for based on your profile, and file - reducing the process from hours to a single review-and-confirm step.

Regulatory bloat: guidance layered on regulations layered on statutes creates thousands of pages of rules that no person or caseworker can realistically navigate. Rules become too complex and get applied selectively by frontline workers. Agents can be used to map entire regulatory regimes, flag redundancies and conflicts, and let policymakers simulate how proposed rules would actually perform before enacting them. Stanford's RegLab built STARA, an AI system that surveyed San Francisco's municipal code and identified hundreds of outdated reporting mandates, which resulted in a 351-page ordinance to eliminate or consolidate more than a third of the city's 528 mandated reports.

Obsolete code and IT systems: the US Social Security Administration runs on 60-million-line COBOL codebases from the 1980s; the IRS processes returns on systems from the 1960s; the World Bank's own internal review found its siloed divisions (IFC, IDA, IBRD) couldn't communicate across systems and its bureaucracy resisted modernisation. In each case, no one internally understands the code, so agencies can't fix a bug without months of waiting and enormous contractor fees. Agentic coding tools let internal teams point an AI at a legacy codebase and start making changes themselves.

Fraud and improper payments: after Hurricanes Katrina and Rita, FEMA distributed $6bn in relief with $600m to $1.4bn in improper/fraudulent payments according to GAO. During COVID, the US lost an estimated $100-200bn to fraudulent unemployment insurance claims alone, many filed by bots. As bad actors adopt AI to generate synthetic identities, forge documents, and file claims at scale, that gap will widen fast. Agencies need their own AI agents doing real-time cross-referencing of claims against income data, identity records, and behavioural patterns.

Siloed, department-centric service delivery: around 600,000 people leave US prisons each year. Each must separately navigate the Bureau of Prisons (release paperwork), SSA (Social Security card), state DMV (ID), Medicaid (healthcare), SNAP (food), HUD (housing), American Job Centers (employment), and a parole office; each with its own application, eligibility rules, and case system. These dependencies are sequential: without ID you can't get benefits, without benefits you can't stabilise housing, without housing you can't hold a job. An AI agent could intake one person's situation at release, determine eligibility across every level of government, and file applications in the right dependency order.

Identity verification as a bottleneck to service access: 800m people worldwide can't legally prove their identity according to the World Bank, mostly in Sub-Saharan Africa and South Asia. Without ID you can't open a bank account, receive a cash transfer, or access most government services. India's Aadhaar is a nice positive example: 1.4bn biometric IDs, 523m new bank accounts, and a claimed $11bn saved by eliminating ghost beneficiaries; but this took a decade of state capacity to build and still fails often enough to lock out legitimate users. AI agents could compress this by cross-referencing whatever documents a person does have (a utility bill, a phone number history, a community attestation etc) against available records and flagging confidence levels for human review.

Benefits eligibility screening: the US has over 80 federal means-tested programmes, each with its own application and documentation requirements. A single mother qualifying for SNAP, Medicaid, CHIP, WIC, EITC, Section 8, and childcare assistance faces what is effectively seven separate bureaucracies. An AI agent could intake one life-situation description, determine eligibility across every programme simultaneously, pre-fill and submit applications in parallel, and flag benefits cliffs (where a small income increase would trigger a sharp loss in support) before they hit.

Building permit approvals: getting a construction permit in many US cities takes 3–12 months of back-and-forth between applicants and planning departments, often over PDF submissions reviewed manually against zoning codes. An AI agent could parse submitted plans against the local zoning and building code, flag non-compliant elements immediately, and return a preliminary approval or specific revision list within hours instead of months. A related case study: DeepMind recently helped the UK government translate mountains of old paper maps, PDFs, and scanned documents into usable data for modern planning systems with the Gemini-powered ‘Extract’ tool.

Public records requests (FOIA): federal agencies have backlogs of hundreds of thousands of FOIA requests, with median response times stretching to months or years. Staff manually search filing systems and redact sensitive information page by page. An AI agent could search document repositories for responsive records, auto-redact exempt information (personal data, classified material, deliberative process content), and draft a release package for human sign-off. However this only works where records are digitised and searchable, and much of the government still runs on fragmented legacy systems where documents aren't centrally indexed…

Court scheduling and case management: state courts lose enormous time to scheduling conflicts, continuances, and manual case tracking. In many jurisdictions, hearing dates are still set by phone or in-person. An AI agent could manage the full docket: auto-scheduling based on judge availability, attorney conflicts, and case priority, sending reminders, and rescheduling continuances without human clerk intervention. Over time you could also start exploring automating some low-value claims through novel arbitration pipelines, freeing up court capacity for more consequential cases.

Business registration and licensing: starting a business in most jurisdictions requires navigating 5–15 separate registrations: state incorporation, EIN from the IRS, state tax registration, local business licence, zoning compliance, health permits, liquor licences, professional licences, etc. An AI agent could take "I want to open a restaurant serving alcohol at [address] in Brooklyn," query every relevant federal, state, and city database, produce the complete permit checklist in dependency order, pre-fill each application with the business details, and flag the long-lead items (e.g. liquor licence) that need to start immediately.

Social worker caseload documentation: child protective services and adult social care workers spend the majority of their time on paperwork rather than with clients: writing visit notes, filing reports, updating case management systems. For every case, caseworkers complete roughly 400 forms totalling ~2,500 pages (multiplied across the 24–31 cases they typically carry simultaneously). An AI agent could listen to (or read transcripts of) a home visit, auto-generate the structured case note, update the system of record, and flag any safeguarding triggers, giving caseworkers their time back for actual care.

Medicare/Medicaid claims adjudication: CMS processes over 1bn claims per year, with complex rules about covered services, bundling, medical necessity, and provider eligibility. Improper payments run to tens of billions annually, and 77% of these were due to insufficient documentation, not fraud. In parallel, Medicare Advantage denies 17% of initial claims, yet 57% of those denials are overturned on appeal. Agents could adjudicate straightforward claims automatically against the coverage rules, flag anomalous billing patterns in real time, and route only genuinely complex cases to human reviewers.

Public comment synthesis for rulemaking: when a federal agency proposes a new rule, it often receives thousands to millions of public comments (the FCC received 22 million on net neutrality). Staff must read, categorise, and respond to each substantive comment. This may well get worse as people use agents to submit plausible-looking comments multiple times. Agents can help the government filter through these, cluster comments by theme, identify unique substantive arguments, flag form-letter campaigns, and draft the agency's response-to-comments document (a task that currently takes teams of lawyers months).
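
The "dependency order" idea in the re-entry example above (ID before benefits, benefits before housing, housing before a job) can be illustrated with a topological sort. This is only a minimal sketch: the step names and the prerequisite map are hypothetical simplifications of the chain described in the thread, not any agency's actual rules, and the link from Social Security card to state ID is an assumption made for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite map: each step lists the steps that must be
# finished first. Loosely based on the chain described in the thread;
# the social_security_card -> state_id link is an assumption.
PREREQS = {
    "release_paperwork": set(),
    "social_security_card": {"release_paperwork"},
    "state_id": {"release_paperwork", "social_security_card"},
    "medicaid": {"state_id"},
    "snap": {"state_id"},
    "housing": {"medicaid", "snap"},
    "employment": {"housing"},
}


def filing_order(prereqs):
    """Return the steps in an order where every prerequisite comes first."""
    return list(TopologicalSorter(prereqs).static_order())


if __name__ == "__main__":
    for step in filing_order(PREREQS):
        print(step)
```

Steps with no mutual dependency (here `medicaid` and `snap`) may come out in either order, which matches the thread's point that some applications can be filed in parallel.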

Derek Ouyang@derekouyang·17 Mar

10/ Check out the full paper, “Evaluating the impact of discordant and missing demographic information on population health assessments using linked electronic health records and Census Bureau microdata”, at journals.plos.org/digitalhealth/…, and our other work at reglab.stanford.edu

Derek Ouyang@derekouyang·17 Mar

9/ It was also a personal honor to be able to do this work as a Census sworn researcher, and to work with awesome collaborators within a secure research environment at the U.S. Census Bureau. (This study survived DOGE and two government shutdowns! #iykyk)

Derek Ouyang@derekouyang·15 Nov

At a time of increased focus on "government efficiency", this #EMNLP2025 accepted paper shows how LLMs can empower civil servants rather than replace them, keeping humans in the loop while reducing time spent on redundant and menial tasks. See more at reglab.stanford.edu!

Derek Ouyang@derekouyang·15 Nov

We developed a two-stage LLM pipeline by fine-tuning models on these tips. Stage 1 filters out tips that aren’t under the EPA’s jurisdiction, and Stage 2 routes remaining tips to the Civil or Criminal Division. We achieve 82.4% accuracy (compared to 31.8% in the current system).
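
The two-stage control flow described above (filter first, then route) might look roughly like the sketch below. It is not the paper's implementation: the fine-tuned models are replaced by hypothetical keyword stand-ins (`toy_jurisdiction`, `toy_division`), and every name here is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RoutingResult:
    in_jurisdiction: bool
    division: Optional[str]  # "civil", "criminal", or None when filtered out


def route_tip(
    tip: str,
    is_in_jurisdiction: Callable[[str], bool],
    pick_division: Callable[[str], str],
) -> RoutingResult:
    """Stage 1 filters out-of-jurisdiction tips; Stage 2 routes the rest."""
    if not is_in_jurisdiction(tip):
        return RoutingResult(in_jurisdiction=False, division=None)
    return RoutingResult(in_jurisdiction=True, division=pick_division(tip))


# Hypothetical stand-ins for the two fine-tuned classifiers.
def toy_jurisdiction(tip: str) -> bool:
    return any(word in tip.lower() for word in ("dumping", "emissions", "chemical"))


def toy_division(tip: str) -> str:
    return "criminal" if "deliberately" in tip.lower() else "civil"


if __name__ == "__main__":
    print(route_tip("Factory deliberately dumping chemical waste",
                    toy_jurisdiction, toy_division))
```

Passing the classifiers in as callables keeps the routing logic separate from the models, so each stage can be swapped or evaluated on its own.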

Derek Ouyang@derekouyang·15 Nov

The EPA receives thousands of tips annually from the public about environmental violations. 80% of them shouldn’t even be going to the EPA, and the rest often get sent to the wrong division. We’ve built an LLM-based system to assist with tip routing. aclanthology.org/2025.emnlp-mai…

Derek Ouyang retweeted

Alex Spangher @ Neurips2025@AlexanderSpangh·12 Nov

✨ Very overdue update: I'll be starting as an Assistant Professor in CS at University of Minnesota, Twin Cities, Fall 2026. I will be recruiting PhD students!! Please help me spread the word! [Thread] 1/n

Derek Ouyang@derekouyang·11 Sep

Thanks @KelseyTuoc for referencing our ADU study! Couldn't have said it better: "I don’t think we need better enforcement that would prevent this housing from getting built — we need fewer rules for people building much-needed housing!" RegLab is working on this with cities!

Kelsey Piper@KelseyTuoc

I wrote for the Argument today about all the ways our society is set up to reward cheating: strict rules, not really enforced, because we lack the will either to enforce them or to change them. theargumentmag.com/p/the-honesty-…

Derek Ouyang@derekouyang·15 Aug

@StevenRoosa @StanfordHAI @SFCityAttorney @LindseyGailmard @EmRobitschek @christinestsang No codebase, but check out reglab.github.io/stara/

Steven Roosa@StevenRoosa·3 Jul

@StanfordHAI @SFCityAttorney @LindseyGailmard @EmRobitschek @christinestsang @derekouyang Is there a github repo?

Stanford HAI@StanfordHAI·19 Jun

📢 New policy brief: Legal reform can get bogged down by outdated or cumbersome regulations. Our latest brief with Stanford RegLab scholars presents an AI tool that helps governments—such as the @SFCityAttorney—identify and eliminate such “policy sludge.” hai.stanford.edu/policy/cleanin…
