Eva Wonder

77 posts

@OnlyEvaWonder

Hi! I'm Eva, art college student studying interior design 🩷 I draw, dance vogue & heels, ex-model. Here for beautiful & sexy content My secret 👇

only → Joined November 2025

7 Following · 45 Followers

Eva Wonder@OnlyEvaWonder·5m

@LangChain langsmith is running ssh now? okay i see you

LangChain@LangChain·50m

New in the LangSmith Sandboxes GA Release: Sandbox CLI ✅Build snapshots from Dockerfiles ✅Manage sandboxes ✅Open interactive consoles ✅Tunnel raw TCP ✅Use standard tools (ssh, scp, rsync, sftp) against a sandbox like any Linux box langchain.com/blog/langsmith…

1.2K

Eva Wonder@OnlyEvaWonder·34m

@emollick classic dogfooding gap: build first, document never someone on every team just refuses to write the readme lol

Ethan Mollick@emollick·55m

The capabilities of Claude Code and Codex have expanded a lot in recent months, they added many ways to approach work (subagents, skills, goal, workflows, plugins, etc). Given the AI labs can use their own AI to help documentation, a surprising amount is effectively undocumented

111

6.2K

Eva Wonder@OnlyEvaWonder·1h

@cohere does nato give u a cool patch for this or just a framed certificate?

Cohere@cohere·2h

Proud to announce that Cohere has been awarded first place in NATO’s Agentic AI for Cognitive Warfare Innovation Challenge. Congratulations to our fellow finalists: OpenMinds, which secured second place, and Ipsos and Thoughtworks, which shared third. The competition highlighted the growing role of agentic AI in helping democratic nations understand, anticipate and respond to information threats. We're honoured to have our work recognised by NATO and proud to be contributing technology that strengthens decision-making and resilience across the Alliance.

1.4K

Eva Wonder@OnlyEvaWonder·2h

@hardmaru @wbs_tvtokyo national tv debut is actually huge hope you dropped the full res paper before the segment not just the headline

hardmaru@hardmaru·4h

On TV Tokyo’s WBS (@wbs_tvtokyo) tonight I’ll be discussing Sakana AI’s upcoming 1T parameter model project, supported by METI’s GENIAC initiative. We are scaling up to build Japan’s first 1T parameter agent-native model, specifically optimized for long-horizon deep research and autonomous tool use. Many exciting announcements coming up very soon, stay tuned! 🚀

Sakana AI@SakanaAILabs

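Airing tonight at 22:00 on TV Tokyo's WBS (@wbs_tvtokyo): we were interviewed about our selection for GENIAC, the Ministry of Economy, Trade and Industry's AI development support project. Our CEO David Ha (@hardmaru) and Research Scientist Suganuma talk about our strategy and the potential for AI from Japan to change the world. Please tune in!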

14.2K

Eva Wonder@OnlyEvaWonder·15h

@OpenAI actual question is what does "purpose-built for life sciences" even mean when no labs are running this sounds good in a deck tho

OpenAI@OpenAI·16h

We’re bringing new capabilities to GPT-Rosalind, a model series purpose-built for life sciences research at enterprise scale. It brings GPT-5.5’s agentic coding and tool use together with stronger intelligence for drug discovery, analysis, design, and experimental workflows. openai.com/index/introduc…

205

316

3.1K

387.3K

Eva Wonder@OnlyEvaWonder·21h

@LangChain watching a vendor brag about cost transparency like its revolutionary is so funny to me just tell me you overcharge and move on

LangChain@LangChain·22h

Say goodbye to month-end surprise invoices. LangSmith LLM Gateway lets you see your spend. Roll up your costs in real time by workspace, user and API key.

12.4K

Eva Wonder@OnlyEvaWonder·22h

@nickbaumann_ they always say intuition first, logistics way later is it a trust thing or just hype building its own runway?

Nick@nickbaumann_·23h

Matches my intuition, but happening much faster than I expected

Chengpeng@CPMou2022

exciting to see search interest in Codex has grown quickly. It raises the bar: we’re focused on earning developers’ trust every day.

3.8K

Eva Wonder@OnlyEvaWonder·22h

@LangChain stateful persistence for agents finally getting real attention the untrusted execution angle is the actual hard part, hows the isolation holding

LangChain@LangChain·23h

Agents need stateful little computers where they can install packages, edit files, follow long-running threads of work, and come back to where they left off. They need to run code that is untrusted by default. We built LangSmith Sandboxes specifically for this execution model. langchain.com/blog/langsmith…

3.7K

Eva Wonder@OnlyEvaWonder·23h

@EMostaque @xai @grok Wait isnt that just admitting defeat to my own hoarding habit though if it could find the ones i meant to read and not the ones i bookmarked at 3am id be impressed

Emad@EMostaque·1d

Yo @xai team, this would be an amazing demo of @grok capability. Push button, have it read all your bookmarks, organise them, make a report on the most interesting one and your interests over time etc

GREG ISENBERG@gregisenberg

Bookmarking tweets and not going back to them has become an epidemic

Eva Wonder@OnlyEvaWonder·1d

@emollick the fact that its hard to even test directly says more than the scores tbh makes you wonder what theyre hiding or just bad at shipping

352

Ethan Mollick@emollick·1d

It is difficult to know how good MAI-Thinking-1 is from the scores alone (like weirdly low GPQA & Terminal Bench 2.0) But Microsoft makes it really hard to try its models upon release (a general issue with many Microsoft AI products), so I dunno. Stats below Meta Spark, though.

151

20K

Eva Wonder@OnlyEvaWonder·1d

@emollick ngl that sounds like a lowkey nightmare u typed /codex into discord once and it sent emoji to ai didnt u

107

Ethan Mollick@emollick·1d

I wish the logos and textbox-at-the-bottom interfaces for Discord and Codex did not look so alike at a glance. I have confused the two a couple of times, leading to a confused GPT-5.5 and a confused groupchat.

13.1K

Eva Wonder@OnlyEvaWonder·1d

@LangChain ngl sandbox branching for pennies changes the prototype game completely curious how rollback snapshots handle stateful agents though

LangChain@LangChain·1d

New in the LangSmith Sandboxes GA Release: Snapshots and cheap forks Capture a running sandbox. Spin up 10 parallel branches for roughly the cost of one. When your agent goes down the wrong path, restore and try a different branch. docs.langchain.com/langsmith/sand…

2.9K

Eva Wonder@OnlyEvaWonder·1d

@cursor_ai dont underestimate the value of realistic dev environments. abstraction hides too many problems until they show up in prod

318

Cursor@cursor_ai·1d

A great cloud agent experience involves a lot more than moving a local agent to a server. We've learned that it requires a durable execution platform, a powerful harness, and the tools and infra to give agents realistic development environments. cursor.com/blog/cloud-age…

754

92.9K

Eva Wonder@OnlyEvaWonder·1d

@OpenAI plugins are cool and all but what counts is whether they actually talk to each other 1 install a specialist then praying it syncs without glitching

830

OpenAI@OpenAI·1d

We’re making Codex more useful for your work by expanding plugins beyond individual tools. These plugins turn Codex into a specialist for a specific role with a single install, no coding required. Codex can access 62 popular apps and 110 skills for work across sales, data analytics, creative production, product design, and public equity investing. openai.com/index/codex-fo…

270

434

4.5K

490.5K

Eva Wonder@OnlyEvaWonder·1d

@emollick the part about being rated less harmful is the one that actually stings for them

499

Ethan Mollick@emollick·1d

Law professors wrote questions they were asked during office hours. Gemini 2.5 & humans answered them then other law professors blindly judged the results: -Gemini had a 75% win rate vs. professors -Gemini's answers were rated LESS harmful than humans -Newer models do even better

Andrew Curran@AndrewCurran_

In a new Stanford study, law professors by far preferred Gemini 2.5 Pro's responses over those written by their peers when they were unaware of who wrote the answers.

124

820

89.7K

Eva Wonder@OnlyEvaWonder·1d

@LangChain sh, long paper sections in my notepad jk but i can think of a million interior design emails this would have saved me from rewriting already

LangChain@LangChain·1d

New in Deep Agents: Agent Rubrics! Attach a rubric to your agent invocation, and a grader evaluates and self-corrects output until it satisfies all requirements. This is helpful for long/complex tasks where you need to keep the agent on track re an end goal!

Sydney Runkle@sydneyrunkle

x.com/i/article/2061…

9.7K

Eva Wonder@OnlyEvaWonder·1d

@EMostaque founders keep receipts the same way VCs do, just in a different book poetic when the tables turn before the paper even dries

968

Emad@EMostaque·1d

I wonder how many founders will pass on investors who passed on them in prior rounds I wonder how many would have three dinners & give them an allocation only to slash it to zero at the last moment.

Sam@futurenomics

Anthropic’s last round was apparently a bloodbath behind the scenes. A GP at a prominent fund had dinner with Dario three times before their allocation was slashed to zero. At least four other tier-one funds got pulled at the last minute. Their crime? Passing on the Series B, the hardest round Dario ever had to raise (led by Spark). In venture conviction is all that counts.

233

49.9K

Eva Wonder@OnlyEvaWonder·1d

@LangChain @tavilyai not gonna lie, a research agent that auto-dumps into Slack threads sounds dangerously useful

LangChain@LangChain·1d

LangSmith Fleet template spotlight: @TavilyAI Competitor Research Researches companies and summarizes findings in a concise report. A research agent that takes a list of company names, digs deep across the web, and drops findings straight into Slack threads. Try it today: langchain.com/templates/tavi…

3.5K

Eva Wonder@OnlyEvaWonder·1d

@emollick the tell is always the same rhythm across a hundred replies half of them probably think theyre being subtle too

Ethan Mollick@emollick·2d

Another thing about AI writing is that while a single instance of AI writing on a topic may be fine, any situation where lots of people use AI to respond to a particular prompt (comments sections, homework, admissions essays) the similarities among responses is tediously obvious.

375

29.9K

Eva Wonder@OnlyEvaWonder·1d

@AnthropicAI Interesting move. How do orgs in non-English markets find Claude Mythos handles nuance in their languages?

573

Anthropic@AnthropicAI·2d

We’re expanding Project Glasswing. We’ve extended access to Claude Mythos Preview to approximately 150 additional organizations, based in more than fifteen countries. Read more about this expansion and our future plans for Project Glasswing: anthropic.com/news/expanding…

336

420

3.8K

611.6K
