/sesh/null

603 posts

/sesh/null

@nerdsane

VP, Observability-Data @datadoghq | peripatetic | minimalist { engineer | athlete | artist } | I have opinions-of-my-own

New York, NY · Joined December 2017
726 Following · 248 Followers
/sesh/null retweeted
Ameet Talwalkar @atalwalkar
We’ve released a technical report for Toto 2.0 detailing the data, architecture, training recipe, μP/u-μP hyperparameter transfer pipeline, and benchmark results behind our 5-model open-weight release. Report linked below.
Ameet Talwalkar @atalwalkar

Today we’re releasing Toto 2.0: a family of open-weights time series foundation models spanning 4M to 2.5B parameters. The question we set out to answer was simple (yet previously open): Do time series foundation models get reliably better as they scale? Our answer: yes! 🧵

English
1
9
57
5.6K
/sesh/null retweeted
AJ Stuyvenberg @astuyve
NEW from Datadog: it's Lapdog! Ever wondered what your AI agent was actually doing? Our latest free project runs locally and traces reasoning and tool calls in Codex, Claude Code, and Pi. You can now see what your agent is REALLY doing, live: lapdog.datadoghq.com
English
40
51
694
263.1K
/sesh/null retweeted
Othmane @ThisIsOthmane
Scaling finally works for Time Series Foundation Models. Introducing Toto 2.0: open-weights TSFMs from 4M to 2.5B params, where every size beats the last from a single hyperparameter config. #1 on leading benchmarks: BOOM, GIFT-Eval, and TIME. Most TSFM families ship multiple sizes that all perform roughly the same. This one doesn't.
English
1
9
18
3.2K
/sesh/null @nerdsane
The load-bearing frequency of ‘load-bearing’ in LLM discussions is becoming structurally load-bearing on my sanity
English
1
0
5
72
Diptanu Choudhury @diptanu
This is really cool! An agent building design systems autonomously based on research, and using @tensorlake sandboxes to synthesize a proposal, verify and fix if anything looks off! This pairs really well with Claude Design since you can now also get an autonomously built design system!
/sesh/null @nerdsane

@AnthropicAI shipped Claude Design yesterday. Now you can build a website in an hour, but you still need a design system. @arni0x9053 had this idea two weeks ago and decided to build it last weekend - an agentic system that sources, synthesizes, develops and organizes design languages. From idea to launch, 24 hours using Temper (a runtime that I have been working on).
Powered by:
@modal (@akshat_b), TensorLake @diptanu - sandboxes for agents doing work
Turso (@glcst) - transactional storage
@Railway - infrastructure/deployment
@Cloudflare - object storage
@datadoghq - observability
@pydantic Monty - agent REPL for Code Mode-style tool execution on Temper
@ExaAILabs - for web search.

English
1
0
4
305
/sesh/null retweeted
arni @arni0x9053
@AnthropicAI Claude Design is so fun! This release was so serendipitous because I just set up Katagami - a living design language library sourced and synthesized by agents based on rough ideas I wanna explore. You can download a spec from Katagami, upload it into Claude Design as a design system and start applying it to your project from there. I just tried it and it worked amazingly well. Can’t wait to use this more in my future projects.
arni @arni0x9053

x.com/i/article/2045…

English
0
2
2
314
/sesh/null @nerdsane
@AnthropicAI shipped Claude Design yesterday. Now you can build a website in an hour, but you still need a design system. @arni0x9053 had this idea two weeks ago and decided to build it last weekend - an agentic system that sources, synthesizes, develops and organizes design languages. From idea to launch, 24 hours using Temper (a runtime that I have been working on).
Powered by:
@modal (@akshat_b), TensorLake @diptanu - sandboxes for agents doing work
Turso (@glcst) - transactional storage
@Railway - infrastructure/deployment
@Cloudflare - object storage
@datadoghq - observability
@pydantic Monty - agent REPL for Code Mode-style tool execution on Temper
@ExaAILabs - for web search.
arni @arni0x9053

x.com/i/article/2045…

English
1
0
9
1.5K
/sesh/null retweeted
Rhys @RhysSullivan
we are entering the tool calling industrial revolution because of code mode
English
7
5
105
7.8K
/sesh/null retweeted
Maxi @maxirodgo
Are chatbots in SaaS apps dead?
Chat is a communication method, not a product. You can’t define “AI” or “bots” as chat. SaaS companies should think of shipping AI in two categories:
1. Autonomous: AI as a separate entity from the human
2. Assistant: AI as an extension of the human
Autonomy: these are essentially background agents that run in loops. You can think of them as doing stuff recursively, kicking off on set triggers or (ideally) on events they detect themselves. The holy grail here is a background agent that can wake itself up to things you care about, make evaluations, drive its own loop for a long time with only the proper, necessary context, execute, iterate, and ask for your input or notify you when it’s done. The key here is that the agent owns its own loop. Claws work really well here to help orchestrate and coordinate subtasks with personality.
Assistants: these are multi-turn agents that start reactively, with triggers defined at each turn. They tend to execute much more scoped tasks, but can still go off and explore and move recursively within an instruction defined upfront. You play fetch with your assistant.
The goal of autonomy is to catch things you wouldn’t have caught, to be always-on, and to act as an independent colleague. The goal of assistants is to be your superpower, to help you run your defined workflows, and to execute on your commands.
The easiest mode of communication for both is chat. Artifacts are helpful to digest both loops and turns.
Our Assistant (Bits) is in Preview. And our next evolution of Autonomy is coming very soon…
Vignesh Palaniappan @vigneshp_

We’ve launched Bits Assistant to help customers search and act across Datadog to resolve issues faster. Few examples below on how we see customers use it.

English
1
1
4
432
Tom Blomfield @t_blom
Pretty soon, I think we’ll see software shipping with Claude Code SDK embedded inside. Users will use it to configure and modify the software to meet their exact needs. The best changes will get passed back to the software developer and reincorporated in the master release.
English
227
52
1.2K
392.2K
/sesh/null retweeted
Dylan Garcia @_dylanga
The first thing I did at @tryramp was set up distributed tracing, structured logging, and metrics for Inspect, our background coding agent. We now have full visibility into everything the system is doing: the browser, CF workers/DOs, @modal sandboxes, database calls, etc. Most importantly, Inspect now has visibility into itself. It can self-triage runtime errors it encounters and create PRs to fix them. Every morning, it reviews the past 24 hours of its own @datadoghq dashboard, identifies systemic issues, new errors, and long-tail latencies, and has a summary + PR waiting for me at 9am.
English
28
26
523
71.6K
/sesh/null @nerdsane
Our self-improving system leverages Shinka Evolve from @SakanaAILabs underneath. “On a second workload—100 different group-by tag combinations on the same metric—the improvement reached 541% over the baseline. The pattern mirrors ShinkaEvolve’s dynamics: Most generations explore incrementally, but occasional mutations discover qualitatively different algorithms” x.com/nerdsane/statu…
Sakana AI @SakanaAILabs

“When AI Discovers the Next Transformer” Robert Lange (Sakana AI) joins Tim Scarfe (@MLStreetTalk) to discuss Shinka Evolve, a framework that combines LLMs with evolutionary algorithms to do open-ended program search. Full Video: youtu.be/EInEmGaMRLc

English
0
0
0
111