Gad Benram

316 posts


@gadbenram

✡️ • Founder of @tensoropsai • Your partner for AI 🤖 🧠 • @GoogleDevExpert

Lisbon, Portugal · Joined September 2016
233 Following · 222 Followers
Gad Benram retweeted
TensorOps
TensorOps@tensoropsai·
Want to measure ChatGPT Ads seamlessly? 🎯 Meet the open-source OpenAI Ads GTM Pixel Template. ✅ Single tag for Initialization & Events ✅ Conversions API deduplication ✅ Payload validation Check out the repo and star us here: github.com/TensorOpsAI/op…
English
0
1
2
17
Gad Benram
Gad Benram@gadbenram·
Running ChatGPT Ads? 🚀 Easily track your campaign conversions with the new open-source OpenAI Ads Measurement Pixel template for Google Tag Manager! Created by @tensoropsai, it supports standard/custom events and Conversions API deduplication. tensorops.ai/labs/research/…
English
0
1
2
53
Gad Benram
Gad Benram@gadbenram·
For me, 2026 is the year when builders of AI applications realized that wrapping an LLM with a REST API is not doing AI. They will soon realize that they need data scientists. Back to 2021 again.
English
0
0
0
19
Gad Benram
Gad Benram@gadbenram·
@drjoshcsimmons I also wondered why it didn't make the stock jump, so I ran some research with Ask Seeking Alpha. Basically, the drop is mostly explained by weak growth data in the report, and the layoff of 20% of the team is read as panic rather than efficiency.
English
0
1
14
1.4K
Dr. Josh C. Simmons
Dr. Josh C. Simmons@drjoshcsimmons·
For a decade, tech CEOs had one magic trick: Announce layoffs. Watch the stock pop. Collect the bonus. Cloudflare tried it. They cut 1,100 people, wrapped it in "AI transformation" language, and the stock fell 18%. A tiny bit of justice in a bleak world.
English
70
303
7.6K
163.9K
Gad Benram
Gad Benram@gadbenram·
🚨 LIVE DEMO! 🚨 The challenge in Enterprise AI is NOT creating it - it's controlling it. Gaining full observability over dozens, or even hundreds, of agents requires kill switches, real-time visibility into their actions, and automated alerts for when they get stuck. Right now, even companies with the highest AI adoption max out at just 1 to 4 agents per employee, where each employee has to "operate" the agents just like you work with Claude Code. Keeping the "context" of what each agent is actually supposed to be doing in their heads is a massive effort. This demo of AgentScrum shows how easy it is to monitor agents when you build a board for them. Give it a try: tensorops.ai/agent-scrum
English
0
0
0
37
Gad Benram retweeted
TensorOps
TensorOps@tensoropsai·
There is NO replacement for the quality of Claude Code and Codex combined with their top models (Opus and GPT-5.5). But if you're willing to go for Chinese self-hosted models and OSS, you CAN cut down on the price for teams of 100+ members. @gadbenram tensorops.ai/blog/claude-co…
English
0
1
1
40
Gad Benram
Gad Benram@gadbenram·
I have no doubt that AI is a bubble, and to prove it, I suggest we unplug every LLM for 4 weeks and watch the global economy slow down by 10%.
English
0
0
2
17
Gad Benram
Gad Benram@gadbenram·
A 9-step practical guide to agent reinforcement fine-tuning. Dataset → tool servers → graders → KL & group size → GPU sizing → framework choice (TRL · verl · OpenRLHF · Unsloth · NeMo RL). Code and expected outputs at every step. tensorops.ai/blog/practical…
English
0
0
4
87
Gad Benram
Gad Benram@gadbenram·
@sir4K_zen Thanks! What do you call "edge cases"? Would love to share.
English
0
0
0
12
Gad Benram
Gad Benram@gadbenram·
🚀 Investors Listen Up! "Ask Seeking Alpha" is officially LIVE for Premium+ users! Built together with our team at TensorOps, this isn't just another LLM wrapper. We engineered a specialized Multi-Agent AI to act as the ultimate financial research assistant. 📊 Investors can now: 🔍 Screen stocks conversationally: ("Show me top dividend-paying energy stocks with P/E ratios under 20") ⚖️ Compare ticker performances: Get precise, mathematically accurate data instantly. 📑 Analyze 10-Ks & earnings calls: ZERO hallucinations. We built a strict "Citation Enforcer" so every single claim links directly back to the source! Read the full engineering deep-dive on how we built it: tensorops.ai/blog/building-…
English
2
2
4
188
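The "Citation Enforcer" idea in the Ask Seeking Alpha post above (every claim must link back to a source) can be illustrated with a minimal sketch. This is not TensorOps' implementation: the `[n]` marker format, the `enforce_citations` function, and the `Verdict` type are all hypothetical, chosen only to show the general shape of such a check.

```python
import re
from dataclasses import dataclass
from typing import List, Set

# Assumption: claims cite sources with bracketed numeric markers such as [1].
CITATION_RE = re.compile(r"\[(\d+)\]")


@dataclass
class Verdict:
    ok: bool                # True only if every sentence is cited with known ids
    uncited: List[str]      # sentences carrying no citation marker at all
    unknown_ids: List[int]  # cited ids that match no retrieved source


def enforce_citations(answer: str, source_ids: Set[int]) -> Verdict:
    """Reject an answer unless each sentence cites at least one known source."""
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    uncited: List[str] = []
    unknown: List[int] = []
    for sentence in sentences:
        ids = {int(m) for m in CITATION_RE.findall(sentence)}
        if not ids:
            uncited.append(sentence)
        unknown.extend(sorted(ids - source_ids))
    return Verdict(ok=not uncited and not unknown, uncited=uncited, unknown_ids=unknown)
```

A caller could regenerate or drop any answer whose verdict is not `ok`. A check like this only confirms that a citation exists and resolves to a source; it does not verify that the source actually supports the claim.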
Gad Benram
Gad Benram@gadbenram·
@urieli17 Yes, but my signature on Twitter is lower than a stealth bomber's
Hebrew
0
0
1
73
Uri Eliabayev
Uri Eliabayev@urieli17·
Once you are inside the Claude ecosystem, Anthropic's only goal is to get you to consume as many tokens as possible, while the possibilities are unlimited. I don't think there are many precedents in history for such an addictive pricing model. It's not just pushing more expensive features on you (say, like Salesforce); it's really selling you the idea that everything is possible, just try. Just give it a chance, what do you have to lose? It worked for you? Ah, great, here, take another little drag, and now try us in finance too.
Claude@claudeai

New for financial services: ready-to-run Claude agent templates for building pitches, conducting valuation reviews, closing the books at month-end, and more. Install them as plugins in Cowork and Claude Code, or use our cookbooks to run them in production as Managed Agents.

Hebrew
33
1
231
31K
Gad Benram
Gad Benram@gadbenram·
@urieli17 Thanks! The team sweated a lot over this!
Hebrew
1
0
1
84
Uri Eliabayev
Uri Eliabayev@urieli17·
@gadbenram To all of us. And congratulations, by the way, on the launch with Alpha!
Hebrew
1
0
1
399
Gad Benram
Gad Benram@gadbenram·
This LLM was trained exclusively on data up to 1930, meaning it genuinely reflects the shocking biases and worldview of that era. But beyond this wild "AI time machine" experiment, it’s a masterclass in building highly specialized expert models. If you want to build real value instead of just being an LLM API wrapper, here is the playbook:

1. Pre-training from Scratch
This isn't a Llama or GPT fine-tune. They trained a 13B model from scratch on 260B tokens of pre-1931 text. To prevent "future leaks" efficiently, they used Regex to clean messy OCR scans and an N-gram time filter to block modern words (like "Internet" or "WWII").

2. Historical SFT
How do you teach an AI to converse without modern chat datasets? They used historical Q&A templates. By feeding it vintage etiquette books, formal letter-writing guides, and old encyclopedias, the model learned to talk like someone actually living in that decade.

3. Online DPO (Preference Optimization)
To ensure it actually follows instructions (e.g., "summarize this"), they used Claude 4.6 Sonnet as an AI judge to rank the model's responses to synthetic prompts. (Note: Doing this legally requires permission, and this team had full funding and backing from Anthropic).

4. Rejection-Sampled SFT
To make multi-turn chats feel natural instead of robotic, they ran a final training round. They generated synthetic conversations between Claude 4.6 Opus and their model, filtered out the weak responses, and trained exclusively on the most successful interactions.
English
1
0
1
100
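The data-cleaning step in the 1930-cutoff playbook above (regex cleanup of OCR scans plus an n-gram time filter) can be sketched roughly as follows. This is an illustration under stated assumptions, not the team's actual pipeline: the `ANACHRONISMS` blocklist, the two OCR repairs, and the function names are hypothetical, and a real filter would need a far larger vocabulary.

```python
import re

# Hypothetical blocklist of terms that should not appear in pre-1931 text.
ANACHRONISMS = {"internet", "wwii", "world war ii"}

TOKEN_RE = re.compile(r"[a-z0-9]+")


def clean_ocr(text: str) -> str:
    """Repair two common OCR artifacts: hyphenated line breaks and ragged whitespace."""
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)  # "tele-\ngraph" -> "telegraph"
    return re.sub(r"\s+", " ", text).strip()


def ngrams(tokens, n):
    """Return the set of space-joined n-grams of a token list."""
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def passes_time_filter(text: str, max_n: int = 3) -> bool:
    """Return True if no 1..max_n-gram of the text is on the blocklist."""
    tokens = TOKEN_RE.findall(text.lower())
    seen = set()
    for n in range(1, max_n + 1):
        seen |= ngrams(tokens, n)
    return seen.isdisjoint(ANACHRONISMS)
```

Usage would be something like `[clean_ocr(d) for d in docs if passes_time_filter(clean_ocr(d))]`. Matching on n-grams rather than single words lets the filter catch multi-word terms such as "World War II" that no single token would flag.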
Gad Benram
Gad Benram@gadbenram·
Visiting 10 AI companies in New York last week opened my eyes. Anyone still doing LangChain as their "core business" in 2026 is missing the point of AI.

There is a huge gap between the SaaS startups that keep trying to “build agents”, which are just wrappers on commercial LLMs, and the startups backed by the biggest funds, which do something totally different. Most companies I met consume agents. They do not build them. Even small teams using LangChain or ADK treat it as a side project. No one puts agents at the center of their product.

So what are they actually doing? Data. That is the real game. It feels like 2021 again. The focus is back on training models. Companies win by training from scratch or tuning on proprietary data, such as legal docs or biochemistry. The models become far more accurate. Think of the leap from old models to Claude 3.x for code. One gives output you can actually use. The other does not.

What data exactly? Competitor chat logs. Technical support CRM records. University exam answers. Expert work in semi-structured formats.

Resources matter too. A good manager in 2026 spends most of his time securing compute. There is a real shortage of CPU and GPU. Top engineers now juggle context across multiple OpenX instances. Their biggest ask is always higher quotas. That turns into an internal war between teams. It lands on the CEO’s desk. His job is to close deals that deliver the cheapest compute possible, so the team never has to pause.

Beyond data, training techniques create the edge: deciding when to use adaptive pre-training versus fine-tuning, or when to distill from competitors. This makes up about 20 percent of the work. It counts as a real advantage.
English
1
0
2
70
Gad Benram
Gad Benram@gadbenram·
@status_effects This is super cool!!! I tried something different back in Feb 2025: I gave o3 and o1 the option to implement a DQN in order to fight in a spaceship game. Would love to chat about what you did there! x.com/gadbenram/stat…
Gad Benram@gadbenram

What better way to test who is better, o1 or o3, than to put them in a shootout? To evaluate the performance of o3 vs. o1, I decided to create a dynamic test environment for them. First, using o3, I created a basic template for a shooting game. In this template, the rules of the game were defined, but no specific strategy was set: each system had to implement its own strategy independently.

What did the experiment reveal? A clear difference in approach was observed: o3, known to be “smarter,” tried to aim its shots in a strategic, calculated manner toward o1. In contrast, o1’s shots seemed completely random, without a clear strategic direction.

But here comes the surprise: despite o3’s apparent intelligence and deep thinking, its initial solution approach simply did not work. In most cases, o1’s randomness led to surprising and sometimes even advantageous results in the game.

What can we learn from this? First, to truly assess which of the two models is “smarter,” you can pit them against each other in a complex environment with clear rules and conduct a broad statistical experiment. In my opinion, this is currently the most effective way to get the full performance picture, both the initial outcome and the capacity to correct and improve actions. Second, the experiment shows that even if a model demonstrates high intelligence in theory, it does not guarantee success if we rely on the output of its first run. This is where the importance of learning and long-term adaptation in AI systems comes in: the ability to improve performance using Reinforcement Learning, which emphasizes identifying mistakes, learning from them, and adapting the strategy to the changing environment.

Ultimately, what emerges from the experiment is not just the need for initial intelligence, but also the need for a continuous ability to learn and improve. Techniques such as Reinforcement Learning demonstrate how correction and adaptation processes can lead to improved long-term performance. What does this mean for the future of AI systems? I would love to hear your thoughts: do you agree that the ability to learn and adapt one’s strategy is the key to success in the world of artificial intelligence?

English
0
0
0
17
Nick Levine
Nick Levine@status_effects·
llms can FIGHT now. here's opus as wizard vs gpt-5.4 as robot. calling this budok-ai. it works by modding the brilliant game yomi hustle. 8-model seeded tournament incoming. details and code below:
English
118
173
2.3K
250K
Gad Benram
Gad Benram@gadbenram·
@ShaharTzafrir Congratulations. From the other side, as a founder, I can also say that it's easy to spot the investors who are good at what they do. The rest won't just "fail to succeed"; they will also wreck your business.
Hebrew
1
0
1
125
Shahar Tzafrir
Shahar Tzafrir@ShaharTzafrir·
For anyone looking for a benchmark on startup and venture capital performance, this is interesting. And it surprised me. According to this dataset, only 1.5% of startups managed to pass $100 million in revenue. And of almost 20,000 venture capital investors who make seed investments, 89% have never managed to invest in a startup that got there. 8.2% of seed investors have invested only once in their career in a startup that crossed $100 million in sales. And a very small group of 25 venture capital investors (people, not funds) each hit 11+ startups that managed to pass $100 million in sales. Which proves that there are always people who are consistently and exceptionally good at this profession. My own career in venture capital has been relatively short so far. I only make seed investments. And of the 35 companies I have invested in at the seed stage since I started out as an investor (13 years), four have so far passed $100 million in sales, and I expect a fifth to do so this year.
Christoph Janz 🕊@chrija

If you forget about the Harveys and Lovables and Anthropics for a second … getting to $100M in revenue is very rare. According to this dataset ca. 1.5% of VC funded startups. If you’ve achieved that as a founder, you can be super proud.

Hebrew
25
1
157
21.7K
Yuval Weinreb 于威
Yuval Weinreb 于威@yuval_weinreb·
Interesting days on the Chinese front. The Chinese response to Trump's blockade of the Strait of Hormuz includes the "four points" that Xi Jinping presented to the Emirati crown prince, which build on China's global initiatives and highlight the differences between China and the US in the region, both in positioning and in effectiveness... Xi met with Spanish Prime Minister Pedro Sánchez (may his name be erased), who is emerging as an interesting potential bridgehead for Xi in promoting the idea of Europe's "strategic autonomy". All of these, plus President Xi's interesting meeting with the chair of the Taiwanese opposition and its implications for relations across the Taiwan Strait, are in the new episode that went up today on "Understanding China", which I couldn't resist calling "Between the Straits"... Links in the first reply. Enjoy listening and watching!
Hebrew
4
0
42
6.2K
Gad Benram
Gad Benram@gadbenram·
In 2026, Anthropic released a model so powerful that it could break any codebase, and it even managed to jailbreak from the enclave that its creators made for it. Turns out, it was a Mythos.
English
0
0
0
82
Gad Benram
Gad Benram@gadbenram·
@verbbz Honestly, I wouldn't be worried... The stock is at a P/E of about 27, it still has room to fall, and the company will keep growing.
Hebrew
1
0
4
1.4K
verb ⏸️
verb ⏸️@verbbz·
Sorry, but this story isn't all that inspiring to me when your stock has fallen 75% this year. Instead of telling us how you broke up the monolith, tell us what happened to the company's growth/forecasts and how what your team did is going to help Monday grow
Hebrew
16
0
69
19K