
Maryam
@Sci_Tech_Eng
Exploring neural networks from the inside: a purely biological mind with heavy cognition architecture, mapping the phase space where thought becomes destiny.




.@poetiq_ai is a new startup that recently achieved a major jump on the ARC-AGI benchmark by layering a recursive self-improvement system on top of existing models. In this episode of the @LightconePod, Poetiq's Founder & CEO @itfische joined us to discuss how small teams can build “reasoning harnesses” that outperform base models, what that means for startups, and why automating prompt engineering may be one of the most powerful levers in AI today.

00:00 – Intro
00:40 – What Is Poetiq?
01:07 – Recursive Self-Improvement Explained
02:07 – The Fine-Tuning Trap
02:59 – “Stilts” for LLMs
03:14 – Recursive Self-Improvement vs. Fine-Tuning
05:05 – Taking the Top Spot on ARC-AGI
06:37 – Beating Claude on Humanity’s Last Exam
08:40 – How the Meta-System Works
10:26 – Beyond RL: A New S-Curve
11:32 – Automating Prompt Engineering
13:37 – From 5% to 95% Performance
14:50 – Early Access & Putting Your Agent on Stilts
16:17 – From YC Founder to DeepMind Researcher
18:29 – Advice for Engineers in the AI Era
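The post doesn't describe Poetiq's actual system, but the idea of a "reasoning harness" that improves results without touching model weights can be sketched in miniature: wrap one base model, try several prompting strategies, score each answer with a task-specific checker, and feed the winning prompt into the next round. Everything here (`call_model`, `score`, the mutation rule) is an illustrative assumption, not Poetiq's implementation.

```python
def call_model(prompt: str) -> str:
    """Stand-in for a real LLM API call; returns a canned answer here."""
    return f"answer({prompt[:20]}...)"

def score(answer: str, checker) -> float:
    """Score an answer with a task-specific checker
    (in practice: unit tests, exact-match grading, or a verifier model)."""
    return checker(answer)

def harness(task: str, strategies: list[str], checker, rounds: int = 3) -> str:
    """A toy 'reasoning harness': try several prompting strategies around the
    same frozen base model, keep the best-scoring answer, and refine the
    winning prompt across rounds -- no weights are ever updated."""
    best_prompt, best_answer, best_score = None, None, float("-inf")
    for _ in range(rounds):
        for s in strategies:
            prompt = f"{s}\n\nTask: {task}"
            answer = call_model(prompt)
            sc = score(answer, checker)
            if sc > best_score:
                best_prompt, best_answer, best_score = prompt, answer, sc
        # The "self-improvement" step: mutate the current best prompt
        # into the next round's candidate strategies.
        strategies = [best_prompt + "\nBe more systematic.",
                      best_prompt + "\nCheck your work."]
    return best_answer
```

The point of the sketch is that all the leverage lives in the outer loop (strategy search and scoring), which is why small teams can beat base-model performance without training anything.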



Alexandr Wang, founder of Scale AI and Chief AI Officer of Meta, says the next five years could bring some of the most monumental discoveries in human history. The goal is not just to build superintelligence, but to design the organization capable of delivering it. With 3.5 billion users, Meta has the reach to deploy breakthroughs at planetary scale. The race is not only for smarter models. It is for the team that can build them first.







We’re excited to introduce Doc-to-LoRA and Text-to-LoRA, two related research projects exploring how to make LLM customization faster and more accessible. pub.sakana.ai/doc-to-lora/ By training a hypernetwork to generate LoRA adapters on the fly, these methods allow models to instantly internalize new information or adapt to new tasks.

Biological systems naturally rely on two key cognitive abilities: durable long-term memory to store facts, and rapid adaptation to handle new tasks given limited sensory cues. While modern LLMs are highly capable, they still lack this flexibility. Traditionally, adding long-term memory or adapting an LLM to a specific downstream task requires an expensive and time-consuming model update, such as fine-tuning or context distillation, or relies on memory-intensive long prompts.

To bypass these limitations, our work focuses on the concept of cost amortization. We pay the meta-training cost once to train a hypernetwork capable of producing task- or document-specific LoRAs on demand. This turns what used to be a heavy engineering pipeline into a single, inexpensive forward pass. Instead of performing per-task optimization, the hypernetwork meta-learns update rules to instantly modify an LLM given a new task description or a long document.

In our experiments, Text-to-LoRA successfully specializes models to unseen tasks using just a natural-language description. Building on this, Doc-to-LoRA is able to internalize factual documents. On a needle-in-a-haystack task, Doc-to-LoRA achieves near-perfect accuracy on instances five times longer than the base model's context window. It can even generalize to transfer visual information from a vision-language model into a text-only LLM, allowing it to classify images purely through internalized weights. Importantly, both methods run with sub-second latency, enabling rapid experimentation while avoiding the overhead of traditional model updates.

This approach is a step towards lowering the technical barriers of model customization, allowing end users to specialize foundation models via simple text inputs. We have released our code and papers for the community to explore.

Doc-to-LoRA Paper: arxiv.org/abs/2602.15902 Code: github.com/SakanaAI/Doc-t…
Text-to-LoRA Paper: arxiv.org/abs/2506.06105 Code: github.com/SakanaAI/Text-…
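The core mechanism described above, a hypernetwork whose single forward pass emits LoRA factors for a frozen base model, can be illustrated with a minimal NumPy sketch. The dimensions, the one-layer hypernetwork, and all variable names here are toy assumptions for illustration; they are not the architecture or sizes from the papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions -- illustrative only, not the papers' actual sizes.
d_model, rank, d_embed = 64, 4, 32

# A tiny "hypernetwork": one linear map from a task/document embedding to the
# flattened LoRA factors A (d_model x rank) and B (rank x d_model).
W_hyper = rng.normal(0, 0.02, size=(d_embed, d_model * rank + rank * d_model))

def generate_lora(task_embedding: np.ndarray):
    """One forward pass: embedding -> LoRA adapter weights (amortized cost)."""
    flat = task_embedding @ W_hyper
    A = flat[: d_model * rank].reshape(d_model, rank)
    B = flat[d_model * rank :].reshape(rank, d_model)
    return A, B

def adapted_weight(W_base: np.ndarray, A: np.ndarray, B: np.ndarray,
                   alpha: float = 1.0) -> np.ndarray:
    """Standard LoRA update: W' = W + alpha * (A @ B), a rank-`rank` correction
    to one frozen weight matrix of the base model."""
    return W_base + alpha * (A @ B)

# Usage: each new task description (here, a random stand-in embedding) yields
# a fresh adapter instantly, with no per-task gradient descent.
embedding = rng.normal(size=d_embed)
W_base = rng.normal(size=(d_model, d_model))
A, B = generate_lora(embedding)
W_new = adapted_weight(W_base, A, B)
```

This makes the "cost amortization" claim concrete: per-task fine-tuning would run an optimization loop for every new task, while here specialization is a single matrix multiply through the (once-trained) hypernetwork.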



