Shivi Bhatia

8K posts

@Shivipmp

Senior Solution Architect, Generative AI. Improving search with stats, rerankers, and agents; not an MCP fan; baking traditional maths into agents.

Woodinville, WA · Joined March 2013
762 Following · 462 Followers
Shivi Bhatia
Shivi Bhatia@Shivipmp·
@emollick In my personal capacity I am working on cases like patient admission, especially around the opioid crisis, suicidal thoughts and ideation, and depression, where LLMs are great but the result doesn't depend only on internal maths. It also takes human touch, clinical experience, nuances, and 2M-token inputs in many cases.
English
0
0
0
343
Ethan Mollick
Ethan Mollick@emollick·
Math is easy* because it has verifiable outputs and few messy judgement choices to make. Which AI labs have the guts to make advancing social science a priority? It may actually do more for human flourishing to unlock sociology, econ & psych research. * For AIs, not for humans
English
74
28
455
66.8K
Shivi Bhatia
Shivi Bhatia@Shivipmp·
@levie What I find fascinating about LLMs and AI is that everyone comes up with fancy words, whereas there is nothing new if you have actually worked on the ground. This FDE role is identical to what master black belts used to do with value stream mapping, theory of constraints, etc.
English
0
0
0
140
Aaron Levie
Aaron Levie@levie·
Great post on FDEs. Everyone should read it if you’re interested in this job category. This is a job that is going to be around as long as AI keeps changing rapidly, which it inevitably will. People often wonder why isn’t this like just deploying other forms of technology in the past, like cloud. Because something like cloud adoption affected a fairly concentrated set of users (developers and IT), and generally didn’t require a fundamental change to the workflows of employees to get the benefits of the new service being delivered on the cloud. At best you went to one training session and you were done. With agents, the work to implement them is not only highly technical, but they directly impact the underlying workflows that people participate in. This means there’s a ton of technical work and change management that comes with it. Further, the pace of change of cloud wasn’t nearly as quick, so there was a lot more time for best practices to propagate. Now, every model change means either something new can be done that wasn’t possible before, or some piece of scaffolding is now redundant or holding you back. This is why it’s commonly easier for a vendor or partner that’s seen the implementation hundreds or thousands of times help do the work, even with internal support from the customer. So, this job isn’t going away any time soon, and will be a great path for a lot of technical talent, especially early career.
vas@vasuman

x.com/i/article/2057…

English
60
147
1.4K
451.2K
Shivi Bhatia reposted
Tom Elliott
Tom Elliott@tomselliott·
Jeff Bezos explains to @AOC how billionaires are created: providing at least a billion dollars in value to society -- the opposite of exploitation. Bezos: "Let me give you a simple example. Let’s say you start a burger joint, and you have 10 employees, and you make a little bit of money.” SORKIN: “Right.” Bezos: “Until you have — this is — this just one — one outlet. And by the way, these are the most delicious burgers in the world. People love your burgers, Andrew. And so then, you open a second outlet —” SORKIN: “Right.” Bezos: “— and now you’re making a little bit more money, and you have 20 employees. And you open a third outlet. By the time you’ve opened a thousand outlets, you are a billionaire.” SORKIN: “Right.” Bezos: “And by the way, this is a real life story, it happens all the time, it’s In-N-Out Burger, it’s Raising Cane’s Chicken. At what point did that money all of a sudden become unethical, or it didn’t? There was one outlet, and then there were two, and then there were three. What you’re doing — the way — the way you make a billion dollars, or a hundred million dollars, or 10 million dollars, or anything, is you create a service that people love. And if millions of people choose your service, you’re going to end up with a billion dollars.” SORKIN: “Right.” Bezos: “And you can, you know, just try it with a chicken franchise.” SORKIN: “Do you think though —” Bezos: “But your chicken has to be good.”
English
384
1.3K
12.5K
981.1K
Shivi Bhatia
Shivi Bhatia@Shivipmp·
@theo @OpenAIDevs Hands down, Codex and OpenAI models are class leading. In all my research on TTC, heap search, Bayesian optimization, HAT trees, and many more, OpenAI worked where Claude got stuck.
English
0
0
0
397
Theo - t3.gg
Theo - t3.gg@theo·
Honestly I'm still really impressed with the Codex app. It works reliably. It adds useful features consistently. It has taste. The mobile integration is awesome. The git integration is solid. If you haven't used it yet, I highly recommend it.
English
219
103
4.1K
792.9K
Shivi Bhatia
Shivi Bhatia@Shivipmp·
The more you use Opus 4.7, the more you realize how good OpenAI models are. Opus and the current-generation models are very, very poor. Example here:
Shivi Bhatia tweet media
English
0
0
1
101
Shivi Bhatia
Shivi Bhatia@Shivipmp·
Lol, I switched from Sonnet to Opus in the app and not a single piece of code was generated. The session had produced 4-5 code outputs on Sonnet 4.6, and Opus did not generate even one. Happy I moved from $100 to $20 and switched to OpenAI at $200: better, faster, and more intelligent. Claude sucks. #claude #openai
Shivi Bhatia tweet media
English
0
0
0
49
Shivi Bhatia reposted
Ari Hoffman
Ari Hoffman@thehoffather·
BREAKING: The DOJ has notified Gov Bob Ferguson of a federal investigation into the state’s practice of housing men in women’s prison & whether WA engages in violating the constitutional rights of female prisoners at the WA Corrections Center for Women in Gig Harbor.
English
117
733
3.6K
53.8K
Shivi Bhatia
Shivi Bhatia@Shivipmp·
@AdityaRajKaul This is not an attack. She simply asked Modi to answer a question. Why is that wrong?
English
0
0
0
124
Aditya Raj Kaul
Aditya Raj Kaul@AdityaRajKaul·
Prime Minister of Norway snubs controversial journalist Helle Lyng, asks her to respect Indian democracy. Schools her about India’s one and a half billion people from different cultures, religions, history and experiences. Will she still attack India?
English
442
5.6K
23.2K
470.4K
Dimitris Papailiopoulos
Dimitris Papailiopoulos@DimitrisPapail·
@tszzl It's the golden age of asking questions. Perhaps the best of times for research too.
English
1
0
68
3.8K
roon
roon@tszzl·
ironically think it’ll be a sad time for ai researchers this year. they are first in the hotpath of RSI and probably the market for them will shrink or at least their pricing power will be reduced as this generation of models commoditizes the skills that made them rare
English
167
64
2K
228.2K
LuxAlgo
LuxAlgo@LuxAlgo·
A jury has unanimously rejected Elon Musk's claims OpenAI 'stole a charity.' Ruling says the statute of limitations against CEO Sam Altman had expired and clears the way for OpenAI to IPO.
English
4
1
29
4.2K
Shivi Bhatia
Shivi Bhatia@Shivipmp·
@HarveenChadha There are ways to do it, and it's not that hard: PR and code-commit activity, plus many others such as the time taken to complete a Jira, SFDC, or other ticket before and after AI implementation. Never do it based on input or output tokens.
English
0
0
0
639
Harveen Singh Chadha
Harveen Singh Chadha@HarveenChadha·
was talking to a friend who is a senior leader. top management had given him a target: reduce staff by 15% and show equivalent gains through AI agents. since company attrition was 14-18%, it was an achievable target, as he assumed they would not hire backfills. but attrition was only 2%, as the job market is bad and very few are finding good offers outside. he now has to fire people and asked me: on what parameters should I judge the AI-ability of a person? Is it LOC? token consumption? turnaround time? I had no answer
English
62
21
591
83.2K
Leading Report
Leading Report@LeadingReport·
Alcohol consumption among U.S. adults has fallen to the lowest level recorded in Gallup’s nearly 90-year history.
English
431
336
4.3K
4.8M
Velina Tchakarova
Velina Tchakarova@vtchakarova·
Timing is of the absolute essence for intelligent investors, businesses and states. In this particular example, I drafted the Global System Rupture by mid-March, outlining the metrics of the greatest systemic risk-driven global crisis. I had a two-month advantage to prepare accordingly!
Aravind@aravind

Singapore PM, on May 1st, said their country needs to prepare for food, fertilizer, and fuel shortages. PM Modi too, just a few days back, said India needs to prepare for difficult times similar to Covid-19. Responsible countries and their responsible leaders prepare their countries and citizens like this. I would be more worried if PM Modi or GoI acted all cool and as if India's all set for political posturing.

English
3
15
135
17.7K
Shivi Bhatia
Shivi Bhatia@Shivipmp·
@Yuchenj_UW Frontend is still not a deal breaker. But it wins hands down in any complex situation, every time.
English
0
0
0
67
Yuchen Jin
Yuchen Jin@Yuchenj_UW·
Claude Opus 4.7 is over-trained on the Anthropic website. Every HTML page it designs has that unmistakable Anthropic flavor. GPT-5.5 is still weirdly weak at frontend. It designs frontend like it learned CSS from a backend engineer. OpenAI urgently needs an MTS with taste.
English
84
24
992
86.7K
Shivi Bhatia
Shivi Bhatia@Shivipmp·
@sama Long context: beyond 200k the model hallucinates and doesn't remember the context. Up to 200k it is the best model, hands down.
English
0
0
0
7
Sam Altman
Sam Altman@sama·
what would you most like to see improve in our next model?
English
8.3K
305
9K
1.4M
Bindu Reddy
Bindu Reddy@bindureddy·
Google I/O Predictions: new video model, Veo 3.5; Nano Banana 3; Flash 3.2; Gemini Pro 3.5. Gemini beats GPT 5.5 at coding
English
76
29
814
42.1K
Shivi Bhatia
Shivi Bhatia@Shivipmp·
@diegohaz It's always the case for me as well. Codex is a killer product.
English
0
0
0
34
Haz
Haz@diegohaz·
I gave the same task to GPT-5.5 xhigh and Opus 4.7 max. – GPT took ~30 minutes. – Opus took ~2 hours. Then I asked them to review each other's work and give an honest verdict on which was better. – GPT said its own code was better. – Opus said there was no obvious winner. Then I asked them to learn from each other and apply the best parts they had learned to their own code. In the end, I asked them to review each other again and give an honest verdict after they had both improved. – GPT kept saying its own code was better. – Opus said GPT's code was better. What a journey.
English
186
77
3.1K
346.2K
Shivi Bhatia
Shivi Bhatia@Shivipmp·
@adamghowiba LLM-as-a-judge is suboptimal. If any org is using only this for quality assurance, they are heading for a bad surprise.
English
0
0
1
1.5K
Adam Ghowiba
Adam Ghowiba@adamghowiba·
JP Morgan's investment research team just shared exactly how they built their multi-agent system "Ask David", and it's the same architecture pattern showing up everywhere: - supervisor agent orchestrates - specialized subagents handle retrieval, structured data, analytics - LLM-as-judge reflection node before the answer ships - human-in-the-loop for the last accuracy gap worth watching for anyone building:
English
134
676
7K
2M