
ideal
1.1K posts




@ylecun @francoisfleuret The current LLM scaling and optimizations - research or engineering?

Major difference in my mind:
- an engineer, given a problem, invents and tries multiple solutions and stops when the solution is good enough. The goal is product innovation and shipping.
- a scientist asks new questions, proposes various new solutions, compares them (sometimes with old ones), and writes about it. The methodology must be sound or else peers will sneer. The goal is scientific breakthroughs and technological progress.
Both can be called "researchers". Many people can do both: these are activities, not identities.
Importantly, most product innovations are built on scientific breakthroughs and technological innovations that happened 2, 5, 10, or 20 years earlier.

IMO a researcher studies a problem that may not be solvable, while an engineer solves a problem that is considered solvable.
Yacine Mahdid @yacinelearning

Grok 4.3 is still an early beta that will improve almost every day, but try it out!
We will publish release notes as we fix bugs and add functionality.
X Freeze @XFreeze
Grok 4.3 beta is natively multimodal, and the front-end capabilities are insane. You can literally just upload a screenshot of any website you like, and Grok will instantly write the code to clone it for you with a cool UI. You don't even need to write a complex prompt... just upload an image or describe what you want and let it build.

@alexandr_wang What are the expected API prices though? Would be cool if it is an order of magnitude cheaper than the competition.

this is not investment or tax advice… but very cool!
Ravid Shwartz Ziv @ziv_ravid
I took the new Muse Spark to the ultimate test: filing my taxes - 3 different workplaces, consulting, stocks, foreign bank accounts and assets, and kids. One hour later, I had everything done. AGI is here... cc: @alexandr_wang

@iruletheworldmo @jazzplane Most mediocre release notes I have ever seen... Show us some benchmarks!

@iruletheworldmo Haven't seen many reviews. Also, no benchmarks were released. So, how good is it really?

for various reasons people are totally sleeping on grok 4.20
now, it’s not the best model for coding or cute powerpoints.
but. if you’re soundboarding complex ideas and need to think through recent information. it’s incredible.
if you haven’t tried it in depth i’d suggest you spend a few days testing it out.
you’ll be surprised.

@NotATeslaApp @SawyerMerritt Pretty smart marketing to allow people to share their stats!


@bindureddy @theinformation Revenue is easy, profit is hard:
* Open a store
* Sell iPhones 50% off
* Generate a lot of revenue while going broke.

Exclusive: OpenAI just raised its revenue outlook—but now expects to burn $111B more cash by 2030. The AI boom’s economics are getting clearer. thein.fo/46iM5Nw

@CMS_Flash I think if you say "I don't know", you score 0% on that benchmark.

Wow what's the secret sauce of GLM here? It's very rare to see a leaderboard topped by a model not coming out of one of the top four labs.
Lisan al Gaib @scaling01
Massive reduction in hallucination rate for Gemini 3.1 Pro over Gemini 3.0 Pro according to AA-Omniscience Hallucination Rate

@billyuchenlin @grok We want benchmarks. I used it for some queries, and it was pretty good. Not sure how good it is in general.

JUST IN: Grok 4.20 delivers a major performance upgrade, achieving 95% accuracy on MMLU-Pro.
The update enhances step-back reasoning for complex queries, boosts STEM and coding performance, and introduces advanced image and video understanding.
With a streamlined interface and up to 10× faster response times, Grok 4.20 marks a significant leap in both capability and usability.














