Pratik Patel

1.9K posts


@pratikpatel

Data Scientist in Life Sciences and Healthcare domains. Developing intelligent systems and processes using deep learning.

Joined March 2008

1.1K Following, 271 Followers

Pinned Tweet

Pratik Patel @pratikpatel · Dec 7

People are underestimating the importance of this news. Whatever we have seen until now is the same old Robots-are-going-to-kill-us reporting. This is different. It's a step closer to general-purpose AI. It is not a question of sentience, but of solving all of humanity's problems. twitter.com/TheStalwart/st…

Joe Weisenthal @TheStalwart

1997: World's best human chess player gets destroyed by computer. 2017: World's best chess computer gets destroyed by Artificial Intelligence program that had only learned about chess a few hours earlier. twitter.com/bramcohen/stat…


Pratik Patel retweeted

Andrej Karpathy @karpathy · Mar 28

- Drafted a blog post
- Used an LLM to meticulously improve the argument over 4 hours.
- Wow, feeling great, it’s so convincing!
- Fun idea let’s ask it to argue the opposite.
- LLM demolishes the entire argument and convinces me that the opposite is in fact true.
- lol

The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask different directions and be careful with the sycophancy.


Pratik Patel retweeted

Andrew Rousso @AndrewRousso · Feb 9

the forbidden way to order food


Pratik Patel @pratikpatel · Nov 30

@IndiGo6E what’s up with your baggage services at Ahmedabad airport?

*Overworked Staff* They don’t have time to answer phone calls. A delayed-bag receipt that should take 2 mins to generate took 15 mins per customer because staff were getting calls continuously. Staff really want to help, but they are burdened by a thoughtless process (or the lack of one).

*Convoluted Process* Delayed baggage seems to be a very common occurrence, judging by the number of passengers I saw in line at your baggage service desk. You should have a streamlined process that reduces the burden on your staff.

*Delivery Vendor* They want to finish the delivery asap, which means they call the customer at 2AM to see if they can deliver. I understand they’d like to get this done, but it should be the customer’s choice whether they want the delivery at all costs or at a reasonable hour.

This post is only to highlight a company’s failure and not individuals’. Everyone I encountered was very courteous and always wanted to do the right thing. @MoCA_GoI @DGCAIndia


Pratik Patel retweeted

Andrej Karpathy @karpathy · Oct 13

Excited to release new repo: nanochat! (it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.

It weighs ~8,000 lines of imo quite clean code to:

- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Efficient inference the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI.
- Write a single markdown report card, summarizing and gamifying the whole thing.

Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc.

My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed).

I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved. Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.
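Editor's note: the two cost figures in the post above (~$100 for ~4 hours and ~$1000 for ~41.6 hours on an 8XH100 node) imply an hourly node price. The sketch below works that out. It assumes cost scales linearly with training time, which the post does not state, and the helper names `hourly_rate` and `estimated_cost` are made up for illustration.

```python
# Figures quoted in the post: ~$100 for ~4 h and ~$1000 for ~41.6 h on one 8XH100 node.
# Assumption (not from the post): cost is linear in wall-clock training time.
GPUS_PER_NODE = 8


def hourly_rate(total_cost_usd: float, hours: float) -> float:
    """Implied price of the whole node per hour."""
    return total_cost_usd / hours


speedrun_rate = hourly_rate(100, 4)     # rate implied by the ~$100 run
scaled_rate = hourly_rate(1000, 41.6)   # rate implied by the ~$1000 run


def estimated_cost(hours: float, rate: float = scaled_rate) -> float:
    """Rough cost of a run of the given length at a fixed node rate."""
    return hours * rate


if __name__ == "__main__":
    print(f"node rate (speedrun): ${speedrun_rate:.2f}/h")
    print(f"node rate (scaled):   ${scaled_rate:.2f}/h")
    print(f"per-GPU rate:         ${scaled_rate / GPUS_PER_NODE:.2f}/h")
    # Run lengths mentioned in the post; the dollar values are estimates only.
    for hours in (12, 24):
        print(f"{hours} h run: about ${estimated_cost(hours):.0f}")
```

Both figures point to roughly $24 to $25 per node-hour, so the 12-hour and 24-hour runs mentioned in the post would land somewhere between the two quoted price points under this assumption.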


Pratik Patel retweeted

Simon Willison @simonw · Sep 9

I got GPT-5 with ChatGPT code interpreter to hunt down the US Census numbers used by the recent Apollo AI adoption rate chart and then recreate that chart from a screenshot and the raw data using matplotlib in Python simonwillison.net/2025/Sep/9/apo…


Pratik Patel retweeted

lisatomic @lisatomic5 · Apr 29

the problem with chatGPT being a sycophant is downstream of ppls' default "thinking" mode being geared toward positive evaluations from other people


Pratik Patel retweeted

Tim Urban @waitbutwhy · Apr 29

This is why it's important to surround yourself with people who can see the real you. If not, you might find yourself morphing your personality to match people's (incorrect) model of you, subconsciously hiding your real self away.

taoki @justalexoki

this is fucking me up


Pratik Patel retweeted

Pearl @ppearlman · Nov 6

The election is over. If your side won, congrats. If your side lost, you’ll get ‘em next go round. Now it’s time to get back to making life better for you & your people. Back to basics… Move your body daily. Feed it real food. Prioritize rest. Love your people. Let’s go!!


Pratik Patel retweeted

FreeBSD Frau @freebsdfrau · Mar 3

In 2002, I was working for a nation-wide retailer. About 6 months after building a new store, my car broke down. I lived so far away from the store that I was only able to get two rides home from different colleagues. I ultimately took to walking 23 miles to work (each way) 🧵


Pratik Patel retweeted

Sebastian Raschka @rasbt · Oct 13

I ran hundreds if not thousands of LoRA & QLoRA experiments to finetune open-source LLMs, and here’s what I learned:

1. Despite the inherent randomness of LLM training (or when training models on GPUs in general), the outcomes remain remarkably consistent across multiple runs.
2. QLoRA presents a trade-off that might be worthwhile if you're constrained by GPU memory. It offers 33% memory savings at the cost of a 33% increase in runtime.
3. When finetuning LLMs, the choice of optimizer shouldn't be a major concern. While SGD on its own is suboptimal, there's minimal variation in outcomes whether you employ AdamW, SGD with a scheduler, or AdamW with a scheduler.
4. While Adam is often labeled a memory-intensive optimizer due to its introduction of two new parameters for every model parameter, this doesn't significantly affect the peak memory demands of the LLM. This is because the majority of the memory is allocated for large matrix multiplications rather than retaining extra parameters.
5. For static datasets, iterating multiple times as done in multi-epoch training might not be beneficial. It often deteriorates the results, probably due to overfitting.
6. If you're incorporating LoRA, ensure it's applied across all layers, not just to the Key and Value matrices, to maximize model performance.
7. Adjusting the LoRA rank is essential, and so is selecting an apt alpha value. A good heuristic is setting alpha at twice the rank's value.
8. 7B models can be finetuned efficiently within a few hours on a single GPU possessing 14 Gb of RAM.

With a static dataset, optimizing an LLM to excel across all benchmark tasks is unattainable. Addressing this requires diverse data sources, or perhaps LoRA might not be the ideal tool.

Lightning AI ⚡️ @LightningAI

After hundreds of experiments, @rasbt has figured out how to get the most out of LoRA finetuning 👉 lightning.ai/pages/communit… #LLMs #GenAI #DeepLearning
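Editor's note: point 7 in the thread above (alpha set to twice the rank) is easier to picture with a toy layer. This is a minimal pure-Python sketch of a LoRA-style linear layer, not the code used in the experiments. It assumes the usual LoRA formulation, where the frozen weight `W` is augmented by a low-rank update `B @ A` scaled by `alpha / rank` and `B` starts at zero. The class name `LoRALinear`, the helper `matvec`, and the 0.01 init scale are illustrative choices.

```python
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Toy LoRA layer: y = W x + (alpha / rank) * B (A x), with W frozen."""

    def __init__(self, W, rank, alpha=None, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W                      # frozen pretrained weight, never updated
        self.rank = rank
        # Heuristic from the thread: alpha defaults to twice the rank.
        self.alpha = 2 * rank if alpha is None else alpha
        # A gets a small random init, B starts at zero, so the adapter
        # contributes nothing until training changes B.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    @property
    def scale(self):
        return self.alpha / self.rank

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        """Only A and B are trained: rank * (d_in + d_out) values."""
        return self.rank * (len(self.W[0]) + len(self.W))


if __name__ == "__main__":
    layer = LoRALinear(W=[[1.0, 0.0], [0.0, 1.0]], rank=1)
    print(layer.scale)                  # alpha / rank
    print(layer.forward([3.0, 4.0]))    # equals W x while B is still zero
    print(layer.trainable_params())
```

With alpha fixed at twice the rank, the scale factor stays at 2 whatever rank is chosen, which is one way to read the heuristic: changing the rank changes adapter capacity without also changing the magnitude of the update.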


Pratik Patel retweeted

Nirant @NirantK · Aug 4

Alibaba releases QwenLM-7B and Chat variant

Self-reported results:
- Beats Llama2 by a mile on math (GSM8K), code (HumanEval), Question Answering and QA (MMLU) 🤯
- ~2x better than GPT4 on a variant of tool selection (e.g. AutoGPT): 8.5% False Positive Rate, compared to GPT4's 15%

Commercially licensed if below 100M users

huggingface.co/Qwen


Pratik Patel retweeted

Austen Allred @Austen · Oct 25

It’s remarkable how people like @joerogan and @lexfridman are building absolute media empires by simply listening to people for once instead of cutting them off and consistently trying to shape a cohesive narrative


Pratik Patel @pratikpatel · Oct 4

@caseyliss Took me some time to find this tweet, but I think I nailed how it’ll be implemented 5 years ago.


Casey Liss @caseyliss · May 23

@pratikpatel 🍻

QME

Pratik Patel retweeted

Ana Navarro-Cárdenas @ananavarro · Jun 24

If you’re against abortion, don’t get one. If you’re against contraception, don’t take any. If you’re against same-sex relationships, don’t have one. If you’re against same-sex marriage, don’t marry someone of same gender. Do not impose your beliefs & religion on all Americans.


Pratik Patel retweeted

Alec Stapp @AlecStapp · Nov 14

Fix your labor shortage with this one weird trick


Pratik Patel retweeted

Inspired by Iceland @iceland · Nov 11

Some said an open-world experience this immersive wasn’t possible. But it’s already here. And you don’t even need silly VR headsets. Introducing, ✨Icelandverse✨ #icelandverse


Pratik Patel retweeted

Marques Brownlee @MKBHD · Apr 21

Great now do the whole system


Pratik Patel @pratikpatel · Apr 3

@eightsleep what is going on with your order delays? I ordered almost 6 weeks ago and was originally told it’d be delivered in 3-4 weeks. Now I’m being told it’ll ship on April 15th. This is unacceptable. Every time, there is a different excuse.

Westford, MA 🇺🇸
