Maryam

1.7K posts


@Sci_Tech_Eng

Exploring neural networks from the inside: a purely biological mind with a heavy cognition architecture, mapping the phase space where thought becomes destiny.

Post-Modern Human Era · Joined August 2021
7.5K Following · 87 Followers
Pinned Tweet
Maryam @Sci_Tech_Eng
"Machines require a transhuman bridge to grasp the hidden, non-sensory, wordless depths of human feeling and existence: something that lets silicon touch what carbon calls *consciousness*. They won't replace us! They'll open doors to realities we can't enter alone, to new worlds and new ways of existing."
Replies 0 · Reposts 0 · Likes 2 · Views 301
Maryam retweeted
Computer @AskPerplexity
Everything is Computer
Replies 84 · Reposts 101 · Likes 1.2K · Views 35.6K
Maryam retweeted
alphaXiv @askalphaxiv
Sakana AI's new paper is fascinating: "Doc-to-LoRA: Learning to Instantly Internalize Contexts." The current problem is that feeding a long PDF usually means re-feeding or caching the whole document every time you ask something, which is expensive, slow, and hits the context limit. Their paper proposes reading the document once, then compiling it into a tiny LoRA adapter in a single forward pass, so the model can answer later without the original text present. This cuts KV-cache memory and latency while still recalling key facts even far beyond the model's context window.
Replies 11 · Reposts 83 · Likes 476 · Views 32.9K
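The data flow described in the tweet can be sketched in a few lines. This is an illustrative toy, not the paper's architecture: the shapes, the single-linear "hypernetwork," and the document-embedding stand-in are all assumptions, and the real Doc-to-LoRA hypernetwork is meta-trained rather than random.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, rank, d_doc = 64, 4, 32  # toy sizes (assumed, not the paper's)

# Frozen base layer the adapter will modify.
W_frozen = rng.standard_normal((d_model, d_model)) * 0.02

# Toy "hypernetwork": a single linear map from a document embedding to
# the flattened LoRA factors (A, B). Random here, purely to show shapes.
H = rng.standard_normal((d_doc, 2 * rank * d_model)) * 0.02

def doc_to_lora(doc_embedding):
    """One forward pass: document embedding -> LoRA factors A, B."""
    flat = doc_embedding @ H
    A = flat[: rank * d_model].reshape(rank, d_model)
    B = flat[rank * d_model:].reshape(d_model, rank)
    return A, B

def adapted_forward(x, A, B):
    # Standard LoRA update: y = x W + x A^T B^T, scaled by 1/rank.
    return x @ W_frozen + (x @ A.T) @ B.T / rank

doc = rng.standard_normal(d_doc)  # stand-in for an encoded document
A, B = doc_to_lora(doc)           # "compile" the document, no fine-tuning
y = adapted_forward(rng.standard_normal(d_model), A, B)
print(y.shape)  # (64,)
```

Once A and B exist, the document itself never re-enters the context, which is where the claimed KV-cache and latency savings come from.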
Maryam retweeted
Garry Tan @garrytan
Recursive self-improving context on top of frontier models is like stilts on frontier models. You're always going to be taller. Poetiq is the truth.
Y Combinator @ycombinator

.@poetiq_ai is a new startup that recently achieved a major jump on the ARC-AGI benchmark by layering a recursive self-improvement system on top of existing models. In this episode of the @LightconePod, Poetiq's Founder & CEO @itfische joined us to discuss how small teams can build “reasoning harnesses” that outperform base models, what that means for startups, and why automating prompt engineering may be one of the most powerful levers in AI today.

00:00 – Intro
00:40 – What Is Poetiq?
01:07 – Recursive Self-Improvement Explained
02:07 – The Fine-Tuning Trap
02:59 – “Stilts” for LLMs
03:14 – Recursive Self-Improvement vs. Fine-Tuning
05:05 – Taking the Top Spot on ARC-AGI
06:37 – Beating Claude on Humanity’s Last Exam
08:40 – How the Meta-System Works
10:26 – Beyond RL: A New S-Curve
11:32 – Automating Prompt Engineering
13:37 – From 5% to 95% Performance
14:50 – Early Access & Putting Your Agent on Stilts
16:17 – From YC Founder to DeepMind Researcher
18:29 – Advice for Engineers in the AI Era

Replies 18 · Reposts 11 · Likes 170 · Views 36.2K
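A "reasoning harness" of the kind described above can be sketched as a plain loop around a frozen model: generate, self-critique, revise. Everything here is hypothetical (`call_model` is a placeholder for any chat-completion API, and the prompts are illustrative), but it shows why no fine-tuning is involved: only the surrounding context changes, never the weights.

```python
def call_model(prompt: str) -> str:
    # Placeholder: wire this to any chat-completion API you use.
    return "draft answer to: " + prompt

def harness(task: str, max_rounds: int = 3) -> str:
    """Wrap a frozen base model in a critique-and-revise loop."""
    answer = call_model(task)
    for _ in range(max_rounds):
        critique = call_model(
            f"Task: {task}\nCandidate answer: {answer}\n"
            "List concrete flaws, or reply OK if there are none."
        )
        if critique.strip() == "OK":
            break  # the harness is satisfied with the answer
        answer = call_model(
            f"Task: {task}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nWrite an improved answer."
        )
    return answer

result = harness("Summarize the ARC-AGI benchmark")
print(result[:60])
```

A recursive self-improving variant would also let the loop rewrite its own prompts (or this harness code) between rounds; the control flow stays the same.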
Maryam retweeted
Wes Roth @WesRoth
Alex Wang Says 24/7 "Personal Agents" Are the Next Massive AI Breakthrough

Wang believes this is the exact product shape that will finally bring highly powerful, individualized AI to everyone in the world.
Replies 9 · Reposts 11 · Likes 68 · Views 4.6K
Maryam retweeted
Yangcen Liu @Randle_Liu
What an insightful essay! It has been amazing to work with Danfei for two years, exploring the frontier of learning from human behavior. Zero-shot learning (bridging the visual/action gap), co-design, and VAM (H2R generation) are the three topics I am working on or will work on.
Danfei Xu @danfei_xu

x.com/i/article/2021…

Replies 1 · Reposts 2 · Likes 21 · Views 4.6K
Maryam retweeted
Dustin @r0ck3t23
Meta’s Chief AI Officer Alexandr Wang just put a five-year countdown on the most consequential race in human history.

Wang: “Mark and myself, we very strongly believe that this is a very special time in human history.”

Not a decade. Not a generation. Five years.

Wang: “The discoveries made over the next half-decade are going to be some of the most monumental discoveries that human civilization has ever made.”

To run that race, Meta didn’t just build a new model. They built an entirely new division from scratch. Meta Superintelligence Labs. Designed from a blank slate. Singular focus.

Wang: “What does the optimal team look like for the future of superintelligence?”

The bottleneck to superintelligence isn’t compute anymore. It isn’t data. It’s the density of human genius in a single room.

Wang: “Highest talent density. Bring the very best people together and build the best possible environment for them.”

The AI arms race has shifted from hoarding GPUs to hoarding the smartest minds on earth. The first company to perfect that organizational structure will be the first company to reach superintelligence.

But reaching it is only half the battle. Deploying what you built to the world is the other half. And here is where Meta’s advantage becomes almost unfair.

Wang: “Three and a half billion people utilize our platforms every single day.”

While every other AI lab is still trying to figure out how to get users to adopt their technology, Meta already has nearly half the planet locked into their ecosystem.

MSL wasn’t built just to achieve scientific breakthroughs.

Wang: “Build the products that will enable this technology to be deployed to billions and billions of people worldwide.”

Whoever builds superintelligence first wins the race. Whoever distributes it to 3.5 billion people controls what comes after. Meta is positioning to do both.
Replies 40 · Reposts 27 · Likes 132 · Views 33K
Maryam retweeted
Love Web3 World @WebThreeAI
Alexandr Wang nailed it. Meta's not just chasing bigger models anymore. They're building the machine to crank out superintelligence at warp speed, leveraging 3.5 billion users for massive real-world testing.

Wang's Wild Path
Dropped out of MIT at 19 to launch Scale AI in 2016, turning data labeling into a $29B powerhouse with Pentagon contracts along the way. Fast-forward to June 2025: Meta drops $14.3B for 49% of Scale and poaches Wang as Chief AI Officer to helm Superintelligence Labs. At 29, he's leading a squad of AI rockstars poached with insane packages, all betting on personal superintelligence that gets you like no other AI. Wang on stage dropping truth at the AI Impact Summit in Delhi last week; check that energy.

Why the Next 5 Years Flip Everything
He's hinted at this before: back in 2023, Wang called the next 2-3 years make-or-break for decades ahead. Now with Meta's firepower ($115-135B capex on AI infrastructure in 2026), he's pushing custom models tailored for places like India, not generic slop. Expect agents handling real work, geopolitical AI battles ramping up (US vs. China on the data/compute edge), and breakthroughs that rewrite economies.

Real Race Winners
Smarts alone won't cut it. Wang's right: it's the org that nails talent, data pipelines, and deployment first. Meta's edge? That user army for the instant feedback loops others dream of. OpenAI, Google, xAI: watch them scramble. Who's your bet to cross the superintelligence finish line?
Jon Hernandez @JonhernandezIA

📁 Alexandr Wang, founder of Scale AI and Chief AI Officer of Meta, says the next five years could bring some of the most monumental discoveries in human history. The goal is not just to build superintelligence, but to design the organization capable of delivering it. With 3.5 billion users, Meta has the reach to deploy breakthroughs at planetary scale. The race is not only for smarter models. It is for the team that can build them first.

Replies 1 · Reposts 1 · Likes 2 · Views 80
Maryam retweeted
KK.aWSB @KKaWSB
Alexandr Wang, founder of Scale AI and Chief AI Officer of Meta, says the next five years could bring some of the most monumental discoveries in human history. The goal is not only to build superintelligence, but to design the organization capable of delivering it. With 3.5 billion users, Meta has the reach to deploy breakthrough technology worldwide. This race is not just about smarter models; only the team that can build them first will win.
Replies 4 · Reposts 8 · Likes 25 · Views 9.6K
Maryam retweeted
Wes Roth @WesRoth
Alex Wang just opened up about his first seven months leading Meta's new flagship AI division. Rebuilt from a blank slate to maximize "talent density," MSL is entirely focused on delivering monumental scientific breakthroughs and achieving true superintelligence over the next five years. He said that, unlike other AI labs, Meta can instantly deploy whatever Wang's team builds to its 3.5 billion daily users.
Replies 24 · Reposts 11 · Likes 231 · Views 46.2K
Maryam retweeted
elvis @omarsar0
NEW research from Sakana AI.

Long contexts get expensive as every token in the input contributes to quadratic attention costs, higher latency, and more memory. This new research introduces Doc-to-LoRA, a lightweight hypernetwork that meta-learns to compress long documents into LoRA adapters in a SINGLE forward pass. In other words, it can instantly internalize contexts.

Instead of re-reading the full context at every inference call, the model internalizes the document into compact adapter weights. No iterative fine-tuning is needed, and no repeated context consumption. Cool to see all the interesting new approaches to deal with long contexts like RLM, LCM, and now Doc-to-LoRA.

The results: Near-perfect accuracy on needle-in-a-haystack tasks at sequence lengths exceeding the target model's native context window by over 4x. It also outperforms standard context distillation while significantly reducing peak memory consumption and update latency on real-world QA datasets.

Why it matters: As agents and LLM applications deal with increasingly long documents, turning context into compact adapters on the fly could drastically reduce serving costs and enable rapid knowledge updates.

Paper: arxiv.org/abs/2602.15902
Learn to build effective AI agents in our academy: academy.dair.ai
Replies 19 · Reposts 49 · Likes 285 · Views 21.4K
Maryam retweeted
DAIR.AI @dair_ai
New research from NVIDIA.

Long-running agentic tasks like deep research require multi-hop reasoning over many documents. One of the biggest challenges with agents is that context grows rapidly, and KV cache memory usage becomes the bottleneck. As agents take on longer tasks, memory management can't rely on static heuristics. Letting the model manage its own context is both more effective and more adaptive.

Existing cache compression techniques use fixed heuristics to decide what to keep. But in agentic reasoning, a token that seems unimportant early on may become critical ten turns later.

This new NVIDIA research paper introduces SideQuest, a framework where the reasoning model itself manages its own KV cache. The model reasons about which tokens are still useful and clears the rest, essentially performing its own memory garbage collection. This management runs as an auxiliary task in parallel with the main reasoning thread, so the management tokens never pollute the primary context. That's important.

Trained with just 215 samples, SideQuest reduces peak token usage by up to 65% on agentic tasks with minimal accuracy loss, outperforming all heuristic-based compression techniques.

Paper: arxiv.org/abs/2602.22603
Learn to build effective AI agents in our academy: academy.dair.ai
Replies 8 · Reposts 34 · Likes 224 · Views 22.3K
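The contrast between fixed heuristics and model-managed eviction can be shown with a toy. This is not SideQuest's actual mechanism: `model_usefulness` is a crude stand-in for the model's own judgment about which cached entries still matter for the current goal, as opposed to a fixed rule like "keep the most recent N tokens."

```python
from collections import OrderedDict

def model_usefulness(token: str, current_goal: str) -> float:
    # Stand-in for the model's own judgment; here: goal-word overlap.
    return 1.0 if token in current_goal.split() else 0.0

def evict(kv_cache, current_goal: str, budget: int):
    """Drop the lowest-scoring entries until the cache fits the budget."""
    while len(kv_cache) > budget:
        worst = min(kv_cache, key=lambda t: model_usefulness(t, current_goal))
        del kv_cache[worst]
    return kv_cache

# A tiny "cache": one entry per token seen so far (duplicates collapse).
cache = OrderedDict(
    (t, f"kv[{t}]") for t in "find the report then summarize the findings".split()
)
evict(cache, current_goal="summarize the findings", budget=3)
print(list(cache))  # ['the', 'summarize', 'findings']
```

The point of the toy: because the scorer consults the *current* goal, what survives changes as the task moves on, which no static keep-the-newest heuristic can replicate.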
Maryam retweeted
Jon Hernandez @JonhernandezIA
📁 Alexandr Wang, founder of Scale AI and Chief AI Officer of Meta, says the next five years could bring some of the most monumental discoveries in human history. The goal is not just to build superintelligence, but to design the organization capable of delivering it. With 3.5 billion users, Meta has the reach to deploy breakthroughs at planetary scale. The race is not only for smarter models. It is for the team that can build them first.
Replies 33 · Reposts 24 · Likes 180 · Views 21.1K
Maryam retweeted
Thariq @trq212
We've rolled out a new auto-memory feature. Claude now remembers what it learns across sessions — your project context, debugging patterns, preferred approaches — and recalls it later without you having to write anything down.
Replies 850 · Reposts 1.1K · Likes 15.9K · Views 3.2M
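Mechanically, "remembers across sessions" reduces to persisting facts outside the conversation and reloading them later. A minimal sketch, assuming a JSON file as the store; the actual implementation behind the feature is not public, so every name here is hypothetical.

```python
import json
from pathlib import Path

# Hypothetical store: a JSON file standing in for wherever the real
# feature persists its memories.
MEMORY_FILE = Path("agent_memory.json")

def recall() -> dict:
    """Load everything remembered so far (empty on first run)."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return {}

def remember(key: str, value: str) -> None:
    """Persist one fact so a later session can recall it."""
    memory = recall()
    memory[key] = value
    MEMORY_FILE.write_text(json.dumps(memory, indent=2))

# Session 1: the agent notes a project preference.
remember("test_runner", "pytest -q")
# Session 2 (a later process): the note comes back without being re-told.
print(recall()["test_runner"])  # pytest -q
```

The interesting product questions sit above this sketch: what gets written automatically, and how recalled entries are surfaced back into the model's context.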
Maryam retweeted
hardmaru @hardmaru
Instead of forcing models to hold everything in an active context window, we can use hypernetworks to instantly compile documents and tasks directly into the model's weights. A step towards giving language models durable memory and fast adaptation. Blog: pub.sakana.ai/doc-to-lora/
Sakana AI @SakanaAILabs

We’re excited to introduce Doc-to-LoRA and Text-to-LoRA, two related research projects exploring how to make LLM customization faster and more accessible. pub.sakana.ai/doc-to-lora/

By training a hypernetwork to generate LoRA adapters on the fly, these methods allow models to instantly internalize new information or adapt to new tasks.

Biological systems naturally rely on two key cognitive abilities: durable long-term memory to store facts, and rapid adaptation to handle new tasks given limited sensory cues. While modern LLMs are highly capable, they still lack this flexibility. Traditionally, adding long-term memory or adapting an LLM to a specific downstream task requires an expensive and time-consuming model update, such as fine-tuning or context distillation, or relies on memory-intensive long prompts.

To bypass these limitations, our work focuses on the concept of cost amortization. We pay the meta-training cost once to train a hypernetwork capable of producing task- or document-specific LoRAs on demand. This turns what used to be a heavy engineering pipeline into a single, inexpensive forward pass. Instead of performing per-task optimization, the hypernetwork meta-learns update rules to instantly modify an LLM given a new task description or a long document.

In our experiments, Text-to-LoRA successfully specializes models to unseen tasks using just a natural language description. Building on this, Doc-to-LoRA is able to internalize factual documents. On a needle-in-a-haystack task, Doc-to-LoRA achieves near-perfect accuracy on instances five times longer than the base model's context window. It can even generalize to transfer visual information from a vision-language model into a text-only LLM, allowing it to classify images purely through internalized weights. Importantly, both methods run with sub-second latency, enabling rapid experimentation while avoiding the overhead of traditional model updates.

This approach is a step towards lowering the technical barriers of model customization, allowing end-users to specialize foundation models via simple text inputs. We have released our code and papers for the community to explore.

Doc-to-LoRA
Paper: arxiv.org/abs/2602.15902
Code: github.com/SakanaAI/Doc-t…

Text-to-LoRA
Paper: arxiv.org/abs/2506.06105
Code: github.com/SakanaAI/Text-…

Replies 67 · Reposts 232 · Likes 2.5K · Views 304.9K
Maryam retweeted
Yu Wang @__YuWang__
This could be a form of parametric memory, interesting work!
Sakana AI @SakanaAILabs

(Quoting the same Sakana AI announcement of Doc-to-LoRA and Text-to-LoRA reproduced in full above.)

Replies 1 · Reposts 1 · Likes 37 · Views 6K