colloidal scientist

309 posts

@_microgel

Joined April 2021

1.8K Following · 112 Followers

colloidal scientist@_microgel·1d

send this to your girlfriend and say us

fishious@fishquichee

Me and the bad bitch i pulled by creating an elaborate geometric sand pattern

colloidal scientist@_microgel·1d

@jxnlco codex on the go 👀 ?

jason liu@jxnlco·1d

1.4K

colloidal scientist@_microgel·1d

an apology will not bring back the child's life

Pop Crave@PopCrave

Chappell Roan responds to the controversy involving a security guard confronting a young fan.

colloidal scientist@_microgel·1d

@maiamindel lol its been happening for 5+ years now

3.3K

Maia@maiamindel·2d

the cdmxification of buenos aires imminent

180

477.8K

colloidal scientist@_microgel·2d

@ramit have you ever stayed in a capella? so worth 3k a night

Ramit Sethi@ramit·2d

Luxury hotel pricing is utterly bewildering if you aren't the target market. I recently spent ~$2,400 for 5 days at a Japanese property. Breakfast included, $150 property credit, etc. That wouldn't cover a single day at a luxury hotel that I usually stay at (e.g., Aman, Oberoi, Rosewood). What could make a hotel worth that much?

🤝@xPeaceLandBread

Staying at a 5 star hotel for the first time in my life cause the Ritz in Chengdu is $240 a night compared to $975 in Dallas and everything about this experience is totally blowing my mind.

1.2K

657.6K

colloidal scientist@_microgel·2d

@RhysSullivan use gpt5.4 xhigh. make it write a lot of tests before starting refactor

Rhys@RhysSullivan·2d

are any of the models actually good at doing large refactors? i have to spend so much time fighting with them to not take shortcuts and actually make large changes to code

104

157

24.6K

colloidal scientist@_microgel·2d

your crack dealer will be alarmed if you dont do a lot of crack everyday

sunny madra@sundeep

“If your $500K engineer isn’t burning at least $250K in tokens, something is wrong.”

colloidal scientist retweeted

Adam Rackis@AdamRackis·4d

When your drug dealer complains you’re not doing enough drugs

sunny madra@sundeep

“If your $500K engineer isn’t burning at least $250K in tokens, something is wrong.”

151

24K

896.9K

colloidal scientist@_microgel·2d

@gabriel1 interactive brokers mcp?

gabriel@gabriel1·2d

im gonna start investing my money in public market much more when i can hook my ai up to mcp servers that respond to my questions and reallocate money. there is no way im clicking around in the chase bank ui. it's just not worth the suffering to make frequent bets

264

41.1K

colloidal scientist retweeted

Alvin Sng@alvinsng·3d

useEffect: x.com/alvinsng/statu…
Linting: factory.ai/news/using-lin…
Agent readiness: factory.ai/news/agent-rea…
Context compression: factory.ai/news/evaluatin…
Missions: factory.ai/news/missions

Alvin Sng@alvinsng

x.com/i/article/2028…

120

11.8K

colloidal scientist@_microgel·3d

@theo but its from the mistral region in france!!

English

Theo - t3.gg@theo·4d

Since OpenAI dropped gpt-oss-120b, Mistral has released 4 models that are worse than gpt-oss-120b

Artificial Analysis@ArtificialAnlys

Mistral has released Mistral Small 4, an open weights model with hybrid reasoning and image input, scoring 27 on the Artificial Analysis Intelligence Index
@MistralAI's Small 4 is a 119B mixture-of-experts model with 6.5B active parameters per token, supporting both reasoning and non-reasoning modes. In reasoning mode, Mistral Small 4 scores 27 on the Artificial Analysis Intelligence Index, a 12-point improvement from Small 3.2 (15) and now among the most intelligent models Mistral has released, surpassing Mistral Large 3 (23) and matching the proprietary Magistral Medium 1.2 (27). However, it lags open weights peers with similar total parameter counts such as gpt-oss-120B (high, 33), NVIDIA Nemotron 3 Super 120B A12B (Reasoning, 36), and Qwen3.5 122B A10B (Reasoning, 42).
Key takeaways:
➤ Reasoning and non-reasoning modes in a single model: Mistral Small 4 supports configurable hybrid reasoning with reasoning and non-reasoning modes, rather than the separate reasoning variants Mistral has released previously with their Magistral models. In reasoning mode, the model scores 27 on the Artificial Analysis Intelligence Index. In non-reasoning mode, the model scores 19, a 4-point improvement from its predecessor Mistral Small 3.2 (15)
➤ More token efficient than peers of similar size: At ~52M output tokens, Mistral Small 4 (Reasoning) uses fewer tokens to run the Artificial Analysis Intelligence Index compared to reasoning models such as gpt-oss-120B (high, ~78M), NVIDIA Nemotron 3 Super 120B A12B (Reasoning, ~110M), and Qwen3.5 122B A10B (Reasoning, ~91M). In non-reasoning mode, the model uses ~4M output tokens
➤ Native support for image input: Mistral Small 4 is a multimodal model, accepting image input as well as text. On our multimodal evaluation, MMMU-Pro, Mistral Small 4 (Reasoning) scores 57%, ahead of Mistral Large 3 (56%) but behind Qwen3.5 122B A10B (Reasoning, 75%). Neither gpt-oss-120B nor NVIDIA Nemotron 3 Super 120B A12B support image input. All models support text output only
➤ Improvement in real-world agentic tasks: Mistral Small 4 scores an Elo of 871 on GDPval-AA, our evaluation based on OpenAI's GDPval dataset that tests models on real-world tasks across 44 occupations and 9 major industries, with models producing deliverables such as documents, spreadsheets, and diagrams in an agentic loop. This is more than double the Elo of Small 3.2 (339) and close to Mistral Large 3 (880), but behind gpt-oss-120B (high, 962), NVIDIA Nemotron 3 Super 120B A12B (Reasoning, 1021), and Qwen3.5 122B A10B (Reasoning, 1130)
➤ Lower hallucination rate than peer models of similar size: Mistral Small 4 scores -30 on AA-Omniscience, our evaluation of knowledge reliability and hallucination, where scores range from -100 to 100 (higher is better) and a negative score indicates more incorrect than correct answers. Mistral Small 4 scores ahead of gpt-oss-120B (high, -50), Qwen3.5 122B A10B (Reasoning, -40), and NVIDIA Nemotron 3 Super 120B A12B (Reasoning, -42)
Key model details:
➤ Context window: 256K tokens (up from 128K on Small 3.2)
➤ Pricing: $0.15/$0.6 per 1M input/output tokens
➤ Availability: Mistral first-party API only. At native FP8 precision, Mistral Small 4's 119B parameters require ~119GB to self-host the weights (more than the 80GB of HBM3 memory on a single NVIDIA H100)
➤ Modality: Image and text input with text output only
➤ Licensing: Apache 2.0 license

1.9K

148.4K

colloidal scientist@_microgel·3d

@jxnlco Chatington DC

jason liu@jxnlco·4d

ChatGPT is just sparkling codex from the region of -

4.9K

colloidal scientist@_microgel·3d

@gabriel1 marc andreessen fuming somewhere

102

gabriel@gabriel1·3d

if i had high agency i would meditate

500

19.6K

colloidal scientist@_microgel·3d

.@thsottiaux @ajambrosino put ads in my codex but let me keep 2x rate limits all the time after april 🥺👉👈

colloidal scientist@_microgel·3d

@Dimillian lets build AGI 💪 welcome to openai

Thomas Ricouard@Dimillian·4d

Last day at work after 15 years with the same team. 15. I was part of the Glose founding team, and we grew from 3 to 20 people. Then we went on an amazing adventure with Medium and grew much bigger instantly! What a ride! And now I can't wait to start the next chapter!

276

12.5K

colloidal scientist retweeted

Charlie Guo@charlierguo·4d

This is a cute narrative and I get that it's trendy to dunk on OAI but let's be real for a minute about what shipped in the last 7 days: GPT-5.4 mini and nano, Sora Characters API, GPT-5.3 updates, upgrades to Microsoft/Google/Slack connectors, subagents in Codex, longer/higher res/editable Sora API videos, persistent file storage in ChatGPT, and yes, a redesigned model picker.

Aakash Gupta@aakashgupta

Anthropic would have built this in a day and a dev would have tweeted the news. At OpenAI, an exec is telling you about a plan. That gap tells you everything.
In the last 7 days, Anthropic shipped Dispatch, channels, voice mode, /loop, 1M context GA, MCP elicitation, persistent Cowork on mobile, Excel and PowerPoint cross-app context, inline charts, and 64k default output tokens. Felix Rieseberg tweeted "we're shipping Dispatch" and you could control your desktop Claude from your phone that afternoon. Every launch came from an engineering account or a GitHub release.
In the same 7 days, OpenAI shipped GPT-5.4 mini and nano. Redesigned the model picker. Sunset the "Nerdy" personality preset. Announced three acquisitions. To find a comparable volume of shipped product from OpenAI, you have to rewind to December.
This is the most underrated difference in AI right now. Anthropic PMs don't write PRDs. Boris Cherny, head of Claude Code, ships 10 to 30 PRs a day and hasn't written code by hand since November. 60 to 100 internal releases daily. Cowork was built with Claude Code in 10 days. The tools build the next version of the tools. Every cycle compresses the last one. Engineers are empowered to ship and announce. The entire org runs like a product team, not a corporation.
OpenAI has the opposite problem. Fidji Simo is CEO of Applications, a title that exists because engineers aren't empowered to ship without executive approval chains. She joined from Instacart. Before that, a decade at Meta running the Facebook app. Since she arrived, OpenAI has acquired 12 companies for $11 billion in 10 months and announced a "superapp" consolidation through the Wall Street Journal. The exec responsible for shipping it is tweeting about "phases of exploration and refocus" on the product she hasn't shipped yet.
That's what happens when you layer a Meta-style product org on top of an AI lab. Decisions go up. Shipping slows down. Announcements replace releases.
Anthropic's product announcements come from the people who wrote the code. OpenAI's come from the C-suite and the press. One of those loops compounds. The other one meetings.

150

22.1K

colloidal scientist@_microgel·4d

@cursor_ai @shaoruu will this make cursor even more laggy on my mac?

2.1K

Cursor@cursor_ai·4d

We're also sharing an early alpha of our new interface. cursor.com/glass

139

117

740.9K

Cursor@cursor_ai·5d

Composer 2 is now available in Cursor.

629

893

9.7K

5.2M

colloidal scientist retweeted

MiniMax (official)@MiniMax_AI·5d

During the iteration process, we also realized that the model's ability to recursively evolve its harness is equally critical. Our internal harness autonomously collects feedback, builds evaluation sets for internal tasks, and based on this continuously iterates on its own architecture, skills/MCP implementation, and memory mechanisms to complete tasks better and more efficiently.

713

142.4K

colloidal scientist@_microgel·5d

@0xSero have you increased the autocompaction/max token config? different pricing

182

0xSero@0xSero·5d

Don’t use gpt-5.4-xhigh I spent 15% of my weekly usage in 12 hours on one auto research session. It does seem much more accurate on research tasks but it’s unsustainable if you don’t want to be spending 1k+ a month on subs

269

34.6K

colloidal scientist retweeted

david gan@davidgan·5d

tier 0

Karri Saarinen@karrisaarinen

B2B customer proof/logo wall tiers:
Tier 8: "Loved by people who use other unrelated products"
Tier 7: A few users who happen to work at famous companies
Tier 6: One team somewhere in the company uses it
Tier 5: Companies use the product but didn't give their permission to use their logo in marketing
Tier 4: Actual contracted customers with wall to wall instals and with logo permissions acquired
Tier 3: Customer story / video
Tier 2: They build their company operations around your product and call it “the [Company] way”
Tier 1: Their CEO mentions you unprompted in an all-hands, investor update, or earnings calls

311

37.8K
