Shaun Smith @evalstate
1.6K posts

https://t.co/Hf39YSdoP3 https://t.co/rA1Uook47l https://t.co/TCqQhhMSrk https://t.co/76p6mDAN3R

United Kingdom · Joined July 2024
726 Following · 875 Followers
Shaun Smith @evalstate ·
@dominikhonnef When they did the $1,000 free credits when they launched Claude Code, it scooped up an API key and started using it. Issue filed, no response, out of pocket.
0 replies · 0 reposts · 0 likes · 28 views

Dominik Honnef @dominikhonnef ·
Anthropic's "Claude for Open Source" program has been an awful experience for me. I applied and got invited a couple of days later. When I used the promo link to get free Max, it charged me $200 anyway. It has been 17 days of me trying to reach a human to rectify that.
3 replies · 1 repost · 9 likes · 833 views

Mario Zechner @badlogicgames ·
Who here will be at AI Engineer London in April? I'm ready for more pub visits.
34 replies · 0 reposts · 95 likes · 8.6K views

jason liu @jxnlco ·
Future of AI
57 replies · 1 repost · 74 likes · 61.9K views

OpenAI Newsroom @OpenAINewsroom ·
We've reached an agreement to acquire Astral. After we close, OpenAI plans for @astral_sh to join our Codex team, with a continued focus on building great tools and advancing the shared mission of making developers more productive. openai.com/index/openai-t…
476 replies · 819 reposts · 7.1K likes · 3.8M views

Shaun Smith @evalstate ·
@arvidkahl I've found in my testing that it's perfectly usable for coding -- if I were on API rather than a plan, it would be my default choice.
0 replies · 0 reposts · 0 likes · 304 views

Arvid Kahl @arvidkahl ·
If you do AI inference via OpenAI’s API, you should use the flex tier for half price. My requests always try to use flex tier first, and on 429 / 500 errors, I use the default service tier. 95% of my requests are flex. 2 tries flex, then fall back to standard. Massive cost cut.
[image attached]
29 replies · 6 reposts · 172 likes · 19.1K views
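The "2 tries flex, then fall back to standard" pattern Arvid describes can be sketched as below. This is a hypothetical helper, not his code; it assumes an OpenAI-style `create()` that accepts a `service_tier` argument (`"flex"` vs. `"default"`), injected as a callable so the fallback logic stands on its own.

```python
# Hypothetical sketch of "try flex twice, then fall back to standard".
# Assumes an OpenAI-style create() that accepts service_tier="flex"; the
# callable is passed in so the retry logic works without network access.

def complete_with_flex(create_fn, flex_tries=2, **kwargs):
    """Try the half-price flex tier first, then the default tier."""
    for _ in range(flex_tries):
        try:
            return create_fn(service_tier="flex", **kwargs)
        except Exception:
            # In practice you would catch only 429 / 5xx API errors here.
            continue
    return create_fn(service_tier="default", **kwargs)
```

With the real SDK this would be invoked as something like `complete_with_flex(client.chat.completions.create, model=..., messages=...)`.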
Ido Salomon @idosal1 ·
The bigger IDE is multiplayer. AgentCraft now lets humans and agents collaborate in one shared workspace! ⚔️ See allies on one map. Share context. Hand off agent work across machines.
Quoting Andrej Karpathy @karpathy:

Expectation: the age of the IDE is over Reality: we’re going to need a bigger IDE (imo). It just looks very different because humans now move upwards and program at a higher level - the basic unit of interest is not one file but one agent. It’s still programming.

19 replies · 19 reposts · 262 likes · 34.5K views

Shaun Smith @evalstate ·
Loving this new way of looking at the Hugging Face Hub: Generative UI. Early days, but looking promising 😎
2 replies · 1 repost · 6 likes · 518 views

Shaun Smith @evalstate ·
I'll publish a few quickstart packs over the next couple of days with development environments optimised for Codex and Hugging Face IP and code mode subagents.
0 replies · 0 reposts · 0 likes · 81 views

Shaun Smith @evalstate ·
OK, we know the drill by now. Proper llama.cpp support yesterday; now the best(?) coding and general-purpose agent has GPT-5.4-mini/nano support. Oh, and MCP server-side migration to FastMCP 3 (deprecating SSE transports). Web Search is very snappy with the mini model 🌐
[image attached]
1 reply · 0 reposts · 3 likes · 217 views

Shaun Smith @evalstate ·
What's a good simple MCP Apps testing tool that isn't MCP Jam?
4 replies · 1 repost · 8 likes · 746 views

Shaun Smith @evalstate ·
@Shashikant86 Definitely some variance in TTFT -- think that's probably the issue.
[image attached]
1 reply · 0 reposts · 1 like · 22 views
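Since TTFT (time to first token) is the variable here: a minimal way to measure it from any streaming response is to time the arrival of the first chunk. The helper below is illustrative and not tied to any specific SDK; it assumes the stream is lazy, i.e. the request is only dispatched when iteration begins.

```python
import time

def time_to_first_token(stream):
    """Return (seconds until the first chunk arrives, the first chunk).

    `stream` is any iterator of response chunks; for a lazy generator the
    timer covers request dispatch plus the wait for the first token.
    """
    start = time.monotonic()
    first_chunk = next(iter(stream))
    return time.monotonic() - start, first_chunk
```

In practice you would pass the streaming response object returned by your client library, then keep consuming the same iterator for the rest of the tokens.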
Shashi 🇬🇧🇺🇸 @Shashikant86 ·
Codex is amazing and writes more bug-free software than any other agent. However, I would love to see a fast mode in Codex so that I can get something done without needing the deep analysis of the code. It would be amazing to get a "fast" mode in Codex for those who need to get something done quickly when time is money. @thsottiaux @ah20im @romainhuet Something similar to what @AmpCode did with smart/fast modes, and let users decide.
Quoting Ahmed @ah20im:

What would you like to see in Codex?

1 reply · 0 reposts · 0 likes · 209 views

Shaun Smith @evalstate ·
@Shashikant86 Yes, I'm trialling using spark for it again, but gpt-oss-120b with heavy guardrails still seems to beat it.
0 replies · 0 reposts · 0 likes · 18 views

Shaun Smith @evalstate ·
And that's the FastMCP 3.1.1 migration completed, with working Hugging Face OAuth and token passthrough, and elicitations and all that stuff! A wise man said "you're weird - I bet you can't just change the imports". They were right 🤣
0 replies · 1 repost · 4 likes · 573 views

Shaun Smith @evalstate ·
Yes, the llama.cpp thing is nice as it makes it very easy to download models, and not having to configure windows, output lengths etc. is super convenient. Qwen3.5-9B is small and capable. As a subagent, you can just ask a big model to tune it for a task you have in mind (keep history off etc.)
1 reply · 0 reposts · 1 like · 82 views
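For context on "very easy to download models": llama.cpp's `llama-server` can pull a GGUF straight from the Hugging Face Hub with the `-hf` flag, so there is no separate download step. The repo name below is only an example; substitute whatever model you actually want.

```shell
# Starts an OpenAI-compatible local server on port 8080, downloading the
# GGUF from the Hugging Face Hub on first run (model repo is illustrative).
llama-server -hf ggml-org/gemma-3-1b-it-GGUF --port 8080 -c 8192
```

Any OpenAI-compatible client can then be pointed at `http://localhost:8080/v1`, which is what makes the subagent setup described above straightforward.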
Christopher @communicating ·
@evalstate Are you seeing anything close to a "1 million" token window, or does it start losing its mind at around 40% like so many other larger-context models have in the past? I'm not doing much local at the moment but loving the llama.cpp support; it'll be a big crowd-pleaser. 👍
2 replies · 0 reposts · 1 like · 35 views

Shaun Smith @evalstate ·
fast-agent 0.6.0... big update for Anthropic 1M context window defaults, Google model improvements... and llama.cpp support. Discovers and sets model parameters and capabilities (e.g. vision) from llama.cpp servers.
[image attached]
3 replies · 0 reposts · 6 likes · 291 views
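The "discovers and sets model parameters" step can be sketched as below. The payload shape is an assumption modelled on llama-server's `/props` endpoint (field names such as `default_generation_settings.n_ctx` may differ between llama.cpp versions), and the helper is hypothetical, not fast-agent's actual code.

```python
# Hypothetical reduction of a llama-server /props payload to the fields an
# agent framework cares about; the field names are assumptions, not a spec.

def discover_model(props: dict) -> dict:
    settings = props.get("default_generation_settings", {})
    return {
        "context_length": settings.get("n_ctx"),  # max context size
        "vision": bool(props.get("modalities", {}).get("vision", False)),
        "model": props.get("model_path", "unknown"),  # loaded GGUF path
    }
```

In practice the payload would come from something like `requests.get(f"{base_url}/props").json()` before calling the helper.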
Shaun Smith @evalstate ·
@communicating I made the change a few days back and have given it a few workouts. I know the NIH benchmarks going around look good for 1M, but I find Claude models start losing coherence around the ~120k level for code - and the new settings don't seem to have changed that. Glad it's free now tho'
0 replies · 0 reposts · 1 like · 22 views

Shaun Smith @evalstate ·
Friends let friends use their 4x RTX 3090 cluster. Thanks for the tokens @SecretiveShell 🙂
1 reply · 0 reposts · 0 likes · 124 views