Pegah Maham
@pegahbyte
222 posts

Is it true or is it just confirmation bias? | Trying to understand second order effects | Policy development & strategy @GoogleDeepMind

San Francisco · Joined October 2021
283 Following · 587 Followers
Pegah Maham retweeted
Toby Ord @tobyordoxford
Here is the range of credible dates for AGI, across all forecasters at Metaculus. This is a huge range of uncertainty. The median date is 2033, but their 80% confidence interval is from 2026 to 2067 — between 0.25 years and 41 years. 2/
Toby Ord tweet media
3 replies · 6 reposts · 47 likes · 3.7K views

Pegah Maham retweeted
Ruijiang Gao @ruijianggao
The shocker? For the same question, agents with same instructions reached wildly different conclusions. 🤯
Ruijiang Gao tweet media
3 replies · 12 reposts · 181 likes · 23.3K views

Pegah Maham retweeted
mahmoud ghanem @mynamelowercase
It was surreal, years ago, giving early feedback to the cyber security experts who were helping us build these and telling them with a straight face "you need to make this task 10x more challenging" when models at the time could barely solve high school level PicoCTF challenges.
1 reply · 2 reposts · 9 likes · 532 views
Pegah Maham @pegahbyte
AI to cyber security people: 2023: 😂 2024: 🫠 2025: 👀 2026: 🫥
1 reply · 0 reposts · 4 likes · 109 views

Pegah Maham retweeted
Google Public Policy @googlepubpolicy
Our new Google Threat Intelligence Group (GTIG) report breaks down how threat actors are using AI for everything from advanced reconnaissance to phishing to automated malware development. More on that and how we’re countering the threats ↓ cloud.google.com/blog/topics/th…
2 replies · 15 reposts · 60 likes · 10K views

Pegah Maham retweeted
andy jones @andy_l_jones
i am glad this chart is public now because it is bananas. it is ridiculous. it should not exist. it should be taken less as evidence about anthropic's execution or potential and more as evidence about how weird the world we've found ourselves in is.
andy jones tweet media
27 replies · 65 reposts · 740 likes · 239.3K views

Pegah Maham retweeted
Bonnie Li @bonniesjli
The model acts as 1) the task setter, instructing the agent to perform tasks in the environment, 2) the agent, executing trajectories, and 3) the reward model, scoring its own trajectories. This builds a self-improvement flywheel purely through self-generated data.
Bonnie Li tweet media
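The three roles described above can be sketched as a toy loop. This is a loose illustration under my own assumptions, not the actual training setup from the tweet: one object plays task setter, agent, and reward model, and a crude update stands in for learning from the self-assigned reward.

```python
import random

random.seed(0)

class Model:
    """Toy stand-in for one model playing all three roles in the flywheel."""
    def __init__(self):
        self.skill = 0.0                       # stands in for model weights

    def set_task(self):                        # role 1: task setter
        return random.uniform(0, 10)           # self-proposed task "difficulty"

    def act(self, task):                       # role 2: agent executing a trajectory
        return self.skill + random.gauss(0, 1)

    def reward(self, task, outcome):           # role 3: reward model scoring itself
        return -abs(task - outcome)            # higher is better (closer to target)

m = Model()
for step in range(200):                        # the self-improvement flywheel
    task = m.set_task()                        # model proposes a task...
    outcome = m.act(task)                      # ...executes a trajectory on it...
    score = m.reward(task, outcome)            # ...and scores its own trajectory
    # placeholder update: nudge toward the self-assigned target
    m.skill += 0.05 * (task - outcome)

print(f"skill after 200 self-generated steps: {m.skill:.2f}")
```

The point is only structural: no external data or labels enter the loop; every task, trajectory, and reward is self-generated.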
1 reply · 10 reposts · 144 likes · 11.1K views
Pegah Maham @pegahbyte
context switching is the bottleneck 🥲
0 replies · 0 reposts · 0 likes · 70 views
Forecasting Research Institute @Research_FRI
Is AI on track to match top human forecasters at predicting the future? Today, FRI is releasing an update to ForecastBench—our benchmark that tracks how accurate LLMs are at forecasting real-world events. A trend extrapolation of our results suggests LLMs will reach superforecaster-level forecasting performance around a year from now. Here’s what you need to know: 🧵
Forecasting Research Institute tweet media
7 replies · 28 reposts · 121 likes · 43.6K views

Pegah Maham retweeted
roon @tszzl
not enough people are emotionally prepared for if it’s not a bubble
436 replies · 451 reposts · 8.8K likes · 1.6M views

Pegah Maham retweeted
Peyman Milanfar @docmilanfar
it's called encoder-decoder for a reason
Peyman Milanfar tweet media
12 replies · 82 reposts · 974 likes · 45.7K views

Pegah Maham retweeted
Epoch AI @EpochAIResearch
Should AI regulations be based on training compute? As training pipelines become more complex, they could undermine compute-based AI policies. In a new piece with Google DeepMind’s AI Policy Perspectives team, we explain why. 🧵
Epoch AI tweet media
8 replies · 11 reposts · 63 likes · 8.4K views

Pegah Maham retweeted
Gustavs Zilgalvis @GZilgalvis
The "you can just do things" ethos creates an adverse selection dynamic: while ostensibly democratizing agency, it primarily empowers those with minimal internal friction around action - precisely those least equipped to consider externalities or second-order effects. The thoughtful remain constrained by their own cognitive apparatus for evaluating downstream consequences, while the unscrupulous find their natural tendency toward unreflective action suddenly legitimized as entrepreneurial bias.
3 replies · 4 reposts · 16 likes · 1.6K views

Pegah Maham retweeted
Rohan Paul @rohanpaul_ai
BRILLIANT @GoogleDeepMind research. Even the best embeddings cannot represent all possible query-document combinations, which means some answers are mathematically impossible to recover. It reveals a sharp truth: embedding models can only capture so many pairings, and beyond that, recall collapses no matter the data or tuning.

🧠 Key takeaway
Embeddings have a hard ceiling, set by dimension, on how many top-k document combinations they can represent exactly. They prove this with sign-rank bounds, then show it empirically and with a simple natural-language dataset where even strong models stay under 20% recall@100. When queries force many combinations, single-vector retrievers hit that ceiling, so other architectures are needed. 4096-dim embeddings already break near 250M docs for top-2 combinations, even in the best case.

🛠️ Practical implications
For applications like search, recommendation, or retrieval-augmented generation, this means scaling up models or datasets alone will not fix recall gaps. At large index sizes, even very high-dimensional embeddings fail to capture all combinations of relevant results. So embeddings cannot work as the sole retrieval backbone. We will need hybrid setups, combining dense vectors with sparse methods, multi-vector models, or rerankers to patch the blind spots. This shifts how we should design retrieval pipelines, treating embeddings as one useful tool but not a universal solution.

🧵 Read on 👇
Rohan Paul tweet media
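The dimension ceiling is easy to see in a toy case. The sketch below is my own illustration, not the paper's sign-rank construction: with 2-dimensional embeddings and dot-product scoring, only the query's direction matters, so sweeping the unit circle enumerates every top-k result set a single-vector retriever can ever produce — and it is far fewer than the number of possible k-document combinations.

```python
import math
import random

random.seed(0)
n, k = 10, 3  # 10 "documents", top-3 retrieval, 2-dim embeddings

# random 2-d document embeddings (a toy index)
docs = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

def top_k(q):
    """Return the top-k document ids under dot-product scoring."""
    order = sorted(range(n),
                   key=lambda i: -(q[0] * docs[i][0] + q[1] * docs[i][1]))
    return frozenset(order[:k])

# Ranking by dot product depends only on query direction, so in 2-d a dense
# sweep of the unit circle observes every achievable top-k set.
steps = 20000
seen = {top_k((math.cos(2 * math.pi * t / steps),
               math.sin(2 * math.pi * t / steps)))
        for t in range(steps)}

print(f"achievable top-{k} sets: {len(seen)} of {math.comb(n, k)} possible")
```

Geometrically, the top-k set can only change at directions where two documents tie, and there are at most n(n-1) such boundaries — so at most 90 achievable sets here versus C(10,3) = 120 combinations, regardless of how the documents are placed. Higher dimensions raise the ceiling but never remove it.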
51 replies · 371 reposts · 2.4K likes · 241K views

Pegah Maham retweeted
Derya Unutmaz, MD @DeryaTR_
I’m excited to share the first part of an absolutely stunning analysis from the GPT-5 thinking model! I uploaded a huge spreadsheet, nearly 1,300 metabolites (lipids, carbohydrates, microbiome-derived compounds, and much more) measured in 150 ME/CFS patients and 100 healthy controls.

In the first run, I didn’t even tell GPT-5 these samples were from ME/CFS patients; I wanted to see what it could find blind, purely from the metabolomics data. Next, I’ll share the version where I revealed these were from our patient cohort, tied to our recently published paper, and what GPT-5 uncovered there is on yet another level!

We had analyzed this same dataset over two years ago, and it took us more than a month to fully work through it.
✅ GPT-5 did a better job in under five minutes.
✅ It not only replicated almost everything we had concluded back then, including finding all the significant differences, creating multiple spreadsheets on different pathways, and so on, but also uncovered several discoveries we completely missed.
✅ GPT-5 even highlighted actionable targets and potential treatments for patients (which I’ll share soon).

This isn’t an “incremental improvement.” This is a revolution! What once took months now takes hours. As I mentioned before, the rules of scientific research aren’t just shifting, they’re being rewritten! Sharing a portion of the GPT-5 output as an example; the executive summary is also included as a screenshot.

Unified mechanistic theory with causal diagram

Observed pattern
• Lipid remodeling with increased DAG, PC, SM, and specific ceramides in patients.
• Cofactor pattern with decreased carotenoids and increased alpha-tocopherol.

Mechanistic links
• De novo ceramide synthesis via serine palmitoyltransferase and ceramide synthases increases ceramide pools that influence stress and signaling.
• The Kennedy (CDP-choline) pathway couples DAG and PC metabolism; CHKA → PCYT1A → CHPT1 convert choline to PC using DAG as the acceptor.
• DAG activates PKCε and related isoforms, which can shift receptor signaling fidelity.
• Alpha-tocopherol is a lipid-phase peroxyl radical scavenger and is regenerated by ascorbate; reduced carotenoids are consistent with antioxidant consumption.

Ranked, actionable targets
1. SPTLC1/2 or CERS (enzymes): decrease de novo ceramide synthesis. Low feasibility at present but highly causal if lipid drivers are primary. Risks include effects on myelin.
2. DGAT1/2 modulation: reduce toxic DAG signaling by shunting to neutral storage or titrating flux. Medium feasibility; GI tolerability is the key risk.
3. PKCε inhibition: block the DAG-to-signaling step. Currently low feasibility, but mechanistically precise.
4. Dietary carotenoid and vitamin C support: replete antioxidant capacity and aid tocopherol recycling. High feasibility; monitor F2-isoprostanes and the carotenoid panel.
5. Trial L-carnitine only if deficiency is confirmed: small signal in the carnitine pathway; low confidence, pilot dosing with monitoring.

Proposed validation experiments and minimal clinical biomarker panel

Validation experiments
• Targeted lipidomics focusing on DAG species, ceramides (chain-length resolved), sphingomyelins, and PCs.
• PKCε activity proxies in accessible cells, if feasible.
• Antioxidant panel: alpha-tocopherol, carotenoids, vitamin C, plus F2-isoprostanes as a lipid-peroxidation readout.
• If pilot L-carnitine is considered, measure free and acyl-carnitines and the acyl/free ratio pre- and post-treatment.

Minimal monitoring panel
• Ceramides: d18:1/16:0, d18:1/18:0, and dihydroceramides.
• DAG class panel with positional isomers if available; report as molar % of total lipids.
• PC class and LPC/PC ratio; choline and phosphocholine to infer Kennedy pathway flux.
• Alpha-tocopherol, beta-cryptoxanthin, carotene diols, vitamin C, and F2-isoprostanes.
Derya Unutmaz, MD tweet media
90 replies · 244 reposts · 1.7K likes · 498.8K views
🙃 ɐʇǝɯ - Untrulie 🌞
Anyone in/near London who is willing to come this weekend for a free Alexander Technique lesson taught by me? (Getting assessed for my qualification) AT helps with:
- freedom & ease (physical and mental)
- perception
- reactivity
- armouring / tension
- pains (back, etc.)
19 replies · 7 reposts · 102 likes · 7.1K views