Josh Patterson

9.5K posts

@datametrician

VP Solution Architecture and Engineering @NVIDIA; @RAPIDSai; former @PIFgov (#44). Building bridges not walls. Accelerating Data Science.

Charleston, SC · Joined September 2008
978 Following · 4.4K Followers
Pinned Tweet
Josh Patterson@datametrician·
10 years of "it can't be done..." 7 @nvidia GPU architectures... 5 years of @RAPIDSai... 3 years of @VoltronData... finally a petabyte-scale GPU-native engine that DOESN'T require you to change your data pipelines. Same code, same data formats, just modular, interoperable, composable, extensible... and of course ACCELERATED! Theseus is the Scalable Performant And Compute Efficient engine🔥🔥🔥 Check out our benchmarks and new webpage... and reach out if you're struggling with queries above 30TBs. voltrondata.com/benchmarks
3 replies · 30 retweets · 111 likes · 16.9K views
Josh Patterson retweeted
PVLDB@pvldb·
Vol:19 No:2 → Terabyte-Scale Analytics in the Blink of an Eye vldb.org/pvldb/vol19/p1…
1 reply · 23 retweets · 195 likes · 22.7K views
Marlene Mhangami@marlene_zw·
Some days I’m tempted to move just to avoid timezone issues 😂 When is AI fixing timezone issues? Where’s the MCP server for that?
3 replies · 0 retweets · 14 likes · 1.2K views
Josh Patterson@datametrician·
@marlene_zw If you ever did a startup, I’d give you all the free advice you wanted… or just root you on from the sidelines… whatever to help you succeed.
1 reply · 0 retweets · 2 likes · 90 views
Marlene Mhangami@marlene_zw·
I used to think I wanted to run a startup, but tbh I don’t think I have the level of grit it takes😂 Truly respect founders building stuff 🙏🏾
13 replies · 6 retweets · 81 likes · 3.8K views
Josh Patterson retweeted
Sumanth@Sumanth_077·
NVIDIA just open-sourced a high-throughput, low-latency inference framework for serving reasoning models like DeepSeek-R1! Introducing Dynamo, a framework designed for serving generative AI and reasoning models in multi-node distributed environments. 100% Open Source
12 replies · 98 retweets · 660 likes · 45.7K views
Josh Patterson retweeted
Colaboratory@GoogleColab·
🚀 The Colab team collaborated closely with @nvidia to deliver day 1 compatibility for NVIDIA cuML's Zero Code Change ML Acceleration. Now, you can experience significant speedups in your machine learning workflows in Colab with no code modifications! Example notebook below 👇 youtu.be/cIJsVq8CPys?fe…
9 replies · 76 retweets · 577 likes · 45.9K views
Josh Patterson retweeted
Bryan Catanzaro@ctnzr·
DLSS 4 is the biggest DLSS yet:
8X more efficient graphics for 4K 240Hz rendering
15/16 pixels generated by AI
A new transformer-based neural network dramatically upgrades image quality for Ray Reconstruction and Super Resolution. youtu.be/qQn3bsPNTyI
32 replies · 37 retweets · 317 likes · 36.2K views
Josh Patterson retweeted
Chip Huyen@chipro·
My 8000-word note on agents: huyenchip.com//2025/01/07/ag…
Covering:
1. An overview of agents
2. How the capability of an AI-powered agent is determined by the set of tools it has access to and its capability for planning
3. How to select the best set of tools for your agent
4. Whether LLMs can plan and how to augment a model’s capability for planning
5. Agent’s failure modes
AI-powered agents are an emerging field with no established theoretical frameworks for defining, developing, and evaluating them. This post is a best-effort attempt to build a framework from the existing literature, but it will evolve as the field does. As always, feedback is much appreciated!
52 replies · 449 retweets · 2.8K likes · 365.2K views
Josh Patterson retweeted
Chip Huyen@chipro·
During the process of writing AI Engineering, I went through so many papers, case studies, blog posts, repos, tools, etc. This repo contains ~100 resources that really helped me understand various aspects of building with foundation models. github.com/chiphuyen/aie-…
Here are the highlights:
1. Anthropic’s Prompt Engineering Interactive Tutorial
The Google Sheets-based interactive exercises make it easy to experiment with different prompts and see immediately what works and what doesn’t. I’m surprised other model providers don’t have similar interactive guides: docs.google.com/spreadsheets/d…
2. OpenAI’s best practices for finetuning
While this guide focuses on GPT-3, many techniques are applicable to full finetuning in general. It explains how finetuning works, how to prepare training data, how to pick training hyperparameters, and common finetuning mistakes: docs.google.com/document/d/1rq…
3. Llama 3 paper
The section on post-training data is a gold mine as it details different techniques they used to generate 2.7 million examples for supervised finetuning. It also covers a crucial but less talked about topic: data verification, how to evaluate the quality of synthetic data: arxiv.org/abs/2407.21783
4. Efficiently Scaling Transformer Inference (Pope et al., 2022)
An amazing paper co-authored by Jeff Dean about inference optimization for transformers models. It covers not only different optimization techniques and their tradeoffs, but also provides a guideline for what to do if you want to optimize for different aspects, e.g. lowest possible latency, highest possible throughput, or longest context length: arxiv.org/abs/2211.05102
5. Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models (Lu et al., 2023)
My favorite study on LLM planners, how they use tools, and their failure modes. An interesting finding is that different LLMs have different tool preferences: arxiv.org/abs/2304.09842
6. AI Incident Database
For those interested in seeing how AI can go wrong, this contains over 3000 reports of AI harms: incidentdatabase.ai
7. I find case studies from teams that have successfully deployed AI applications extremely educational. Here are some of my favorite enterprise case studies. I'll add more case studies soon!
- LinkedIn: linkedin.com/blog/engineeri…
- Pinterest's Text-to-SQL: medium.com/pinterest-engi…
- Gmail’s Smart Compose (2019): arxiv.org/abs/1906.00080
- Grab: engineering.grab.com/llm-powered-da…
31 replies · 232 retweets · 1.5K likes · 102.7K views
Josh Patterson retweeted
Dewey Dunnington@paleolimbot·
First blog post in a long time! I started writing a post ~2 years ago on adventures counting 130M U.S. buildings by zipcode and finally circled back to write it up. Everybody is a winner really, but @duckdb, @IbisData, @ApacheArrow, and @GeoParquet were essential throughout!
2 replies · 11 retweets · 53 likes · 3.7K views
Josh Patterson retweeted
Naty Clementi@ncclementi·
Hi y'all, I'll be talking at #DuckCon in January 2025. I'll be sharing how to leverage the power of @duckdb's geospatial capabilities while staying within the Python ecosystem using @IbisData. I’ll show you how to work with GeoParquet data and create nice maps on your laptop.
1 reply · 5 retweets · 28 likes · 2.2K views
Josh Patterson retweeted
Chip Huyen@chipro·
It’s done! 150,000 words, 200+ illustrations, 250 footnotes, and over 1200 reference links. My editor just told me the manuscript has been sent to the printers.
- The ebook will be coming out later this week.
- Paperback copies should be available in a few weeks (hopefully before the end of the year). Preorder: amzn.to/49j1cGS
- The full manuscript is also accessible on the O'Reilly platform: oreillymedia.pxf.io/c/5719111/2146…
This wouldn’t have been possible without the help of so many people who reviewed the early drafts, answered my thousands of questions, introduced me to fascinating use cases, or helped me see the beauty of overlooked techniques. Thank you everyone for making this happen!
173 replies · 585 retweets · 5.8K likes · 355.3K views
Josh Patterson retweeted
Alex Miller@AlexMillerDB·
New blog post on the fun new hardware advancements which databases can leverage for great gains, and why the cloud means it doesn't matter that they exist. 🫠 transactional.blog/blog/2024-mode…
7 replies · 48 retweets · 264 likes · 22.3K views
Josh Patterson retweeted
Colaboratory@GoogleColab·
We've increased the size of our NVIDIA A100 fleet for paid users by around 2x, and for the last several days we've seen 100% success rate for users requesting A100s.
7 replies · 25 retweets · 235 likes · 21.4K views
👩‍💻 Paige Bailey@DynamicWebPaige·
my church trusts my baking skills enough to ask me to make bread for sunday morning mass! 😂🍞
4 replies · 0 retweets · 35 likes · 4.2K views
Josh Patterson retweeted
marimo@marimo_io·
no code transformations using marimo's mo.ui.dataframe and @IbisData! Pass in any Ibis dataframe to mo.ui.dataframe to display a UI for different filters/transformations and get the filtered result back in Python. Plus, you can see the SQL statement generated by Ibis!
0 replies · 3 retweets · 16 likes · 1.4K views
Josh Patterson retweeted
Voltron Data@VoltronData·
🔊 Just Aired! Our CEO @datametrician's interview on @Bloomberg 🚀
In <10 min, Josh unpacks how @VoltronData harnesses @NVIDIA GPUs to accelerate compute, helping businesses “do more with less.”
🎧 Listen now, starting at 29:39: bloomberg.com/news/audio/202…
💡 Highlights include...
- How to explain accelerated compute to Josh’s grandma
- Scale 200 servers down to just 2
- Slash query times from 30 min to 30 sec
- Lower energy consumption & operational costs
- Keep your code with BYO-API & open-source best practices
#AcceleratedCompute #ComposableDataSystems #SustainableTech #OpenStandards #Composability #OpenSource #GPUs #DataAnalytics
0 replies · 1 retweet · 4 likes · 1.4K views
Josh Patterson retweeted
Mim@mim_djo·
ibis solved the sql dialect difference problem, as i can't write the same sql that both engines can understand
1 reply · 1 retweet · 6 likes · 764 views