Josh Patterson

9.5K posts

@datametrician

VP Solution Architecture and Engineering @NVIDIA; @RAPIDSai; former @PIFgov (#44). Building bridges not walls. Accelerating Data Science.

Charleston, SC · Joined September 2008

978 Following · 4.4K Followers

Pinned Tweet

Josh Patterson@datametrician·19 Mar

10 years of "it can't be done..." 7 @nvidia GPU architectures... 5 years of @RAPIDSai... 3 years of @VoltronData... finally a petabyte-scale GPU-native engine that DOESN'T require you to change your data pipelines. Same code, same data formats, just modular, interoperable, composable, extensible... and of course ACCELERATED! Theseus is the Scalable Performant And Compute Efficient engine🔥🔥🔥 Check out our benchmarks and new webpage... and reach out if you're struggling with queries above 30TBs. voltrondata.com/benchmarks

111

16.9K

Josh Patterson retweeted

PVLDB@pvldb·18 Jan

Vol:19 No:2 → Terabyte-Scale Analytics in the Blink of an Eye vldb.org/pvldb/vol19/p1…

195

22.7K

Josh Patterson@datametrician·5 Jun

@marlene_zw @pawjast I hear the US east coast is cool.

Marlene Mhangami@marlene_zw·5 Jun

@pawjast Good question. At this very moment idk 😂

Marlene Mhangami@marlene_zw·5 Jun

Some days I’m tempted to move just to avoid timezone issues 😂 When is AI fixing timezone issues? Where’s the MCP server for that?

1.2K

Josh Patterson@datametrician·23 Apr

@marlene_zw If you ever did a startup, I’d give you all the free advice you wanted… or just root you on from the sidelines… whatever to help you succeed.

Marlene Mhangami@marlene_zw·22 Apr

I used to think I wanted to run a startup, but tbh I don’t think I have the level of grit it takes😂 Truly respect founders building stuff 🙏🏾

3.8K

Josh Patterson retweeted

Sumanth@Sumanth_077·19 Mar

NVIDIA just open-sourced a high-throughput, low-latency inference framework for serving reasoning models like DeepSeek-R1! Introducing Dynamo, a framework designed for serving generative AI and reasoning models in multi-node distributed environments. 100% Open Source

660

45.7K

Josh Patterson retweeted

Colaboratory@GoogleColab·19 Mar

🚀 The Colab team collaborated closely with @nvidia to deliver day 1 compatibility for NVIDIA cuML's Zero Code Change ML Acceleration. Now, you can experience significant speedups in your machine learning workflows in Colab with no code modifications! Example notebook below 👇 youtu.be/cIJsVq8CPys?fe…

577

45.9K

Josh Patterson retweeted

Bryan Catanzaro@ctnzr·7 Jan

DLSS 4 is the biggest DLSS yet:
8X more efficient graphics for 4K 240Hz rendering
15/16 pixels generated by AI
A new transformer based neural network dramatically upgrades image quality for Ray Reconstruction and Super Resolution. youtu.be/qQn3bsPNTyI

317

36.2K

Josh Patterson retweeted

Chip Huyen@chipro·7 Jan

My 8000-word note on agents: huyenchip.com//2025/01/07/ag…
Covering:
1. An overview of agents
2. How the capability of an AI-powered agent is determined by the set of tools it has access to and its capability for planning
3. How to select the best set of tools for your agent
4. Whether LLMs can plan and how to augment a model’s capability for planning
5. Agent’s failure modes
AI-powered agents are an emerging field with no established theoretical frameworks for defining, developing, and evaluating them. This post is a best-effort attempt to build a framework from the existing literature, but it will evolve as the field does. As always, feedback is much appreciated!

449

2.8K

365.2K

Josh Patterson retweeted

Chip Huyen@chipro·13 Dec

During the process of writing AI Engineering, I went through so many papers, case studies, blog posts, repos, tools, etc. This repo contains ~100 resources that really helped me understand various aspects of building with foundation models. github.com/chiphuyen/aie-…
Here are the highlights:
1. Anthropic’s Prompt Engineering Interactive Tutorial
The Google Sheets-based interactive exercises make it easy to experiment with different prompts and see immediately what works and what doesn’t. I’m surprised other model providers don’t have similar interactive guides: docs.google.com/spreadsheets/d…
2. OpenAI’s best practices for finetuning
While this guide focuses on GPT-3, many techniques are applicable to full finetuning in general. It explains how finetuning works, how to prepare training data, how to pick training hyperparameters, and common finetuning mistakes: docs.google.com/document/d/1rq…
3. Llama 3 paper
The section on post-training data is a gold mine as it details different techniques they used to generate 2.7 million examples for supervised finetuning. It also covers a crucial but less talked about topic: data verification, how to evaluate the quality of synthetic data: arxiv.org/abs/2407.21783
4. Efficiently Scaling Transformer Inference (Pope et al., 2022)
An amazing paper co-authored by Jeff Dean about inference optimization for transformers models. It covers not only different optimization techniques and their tradeoffs, but also provides a guideline for what to do if you want to optimize for different aspects, e.g. lowest possible latency, highest possible throughput, or longest context length: arxiv.org/abs/2211.05102
5. Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models (Lu et al., 2023)
My favorite study on LLM planners, how they use tools, and their failure modes. An interesting finding is that different LLMs have different tool preferences: arxiv.org/abs/2304.09842
6. AI Incident Database
For those interested in seeing how AI can go wrong, this contains over 3000 reports of AI harms: incidentdatabase.ai
7. I find case studies from teams that have successfully deployed AI applications extremely educational. Here are some of my favorite enterprise case studies. I'll add more case studies soon!
- LinkedIn: linkedin.com/blog/engineeri…
- Pinterest's Text-to-SQL: medium.com/pinterest-engi…
- Gmail’s Smart Compose (2019): arxiv.org/abs/1906.00080
- Grab: engineering.grab.com/llm-powered-da…

232

1.5K

102.7K

Josh Patterson retweeted

Dewey Dunnington@paleolimbot·2 Dec

First blog post in a long time! I started writing a post ~2 years ago on adventures counting 130M U.S. buildings by zipcode and finally circled back to write it up. Everybody is a winner really, but @duckdb @IbisData , @ApacheArrow, and @GeoParquet were essential throughout!

3.7K

Josh Patterson retweeted

Naty Clementi@ncclementi·4 Dec

Hi y'all, I'll be talking at #DuckCon in January 2025. I'll be sharing how to leverage the power of @duckdb's geospatial capabilities while staying within the Python ecosystem using @IbisData. I’ll show you how to work with GeoParquet data and create nice maps on your laptop.

2.2K

Josh Patterson retweeted

Chip Huyen@chipro·4 Dec

It’s done! 150,000 words, 200+ illustrations, 250 footnotes, and over 1200 reference links. My editor just told me the manuscript has been sent to the printers.
- The ebook will be coming out later this week.
- Paperback copies should be available in a few weeks (hopefully before the end of the year). Preorder: amzn.to/49j1cGS
- The full manuscript is also accessible on O'Reilly platform: oreillymedia.pxf.io/c/5719111/2146…
This wouldn’t have been possible without the help of so many people who reviewed the early drafts, answered my thousands of questions, introduced me to fascinating use cases, or helped me see the beauty of overlooked techniques. Thank you everyone for making this happen!

173

585

5.8K

355.3K

Josh Patterson retweeted

Alex Miller@AlexMillerDB·20 Nov

New blog post on the fun new hardware advancements which databases can leverage for great gains, and why the cloud means it doesn't matter that they exist. 🫠 transactional.blog/blog/2024-mode…

264

22.3K

Josh Patterson retweeted

Colaboratory@GoogleColab·24 Sep

We've increased the size of our NVIDIA A100 fleet for paid users by around 2x, and for the last several days we've seen 100% success rate for users requesting A100s.

235

21.4K

Josh Patterson@datametrician·15 Sep

@DynamicWebPaige Your baking skills are the yeast of their worries…

👩‍💻 Paige Bailey@DynamicWebPaige·15 Sep

my church trusts my baking skills enough to ask me to make bread for sunday morning mass! 😂🍞

4.2K

Josh Patterson retweeted

Voltron Data@VoltronData·9 Sep

🚀 Heading to @Oracle #CloudWorld 2024? Book a meeting with our execs @datametrician, @rodaramburu & @darrenhaas on Sept 10-11. Spots are limited! airtable.com/appWDPBDUhIP0v… 📍 Visit us at Booth #115 for a deep dive into accelerated compute with @NVIDIA #GPUs. #OCW2024

2.4K

Josh Patterson retweeted

marimo@marimo_io·9 Sep

no code transformations using marimo's mo.ui.dataframe and @IbisData! Pass in any Ibis dataframe to mo.ui.dataframe to display a UI for different filters/transformations and get the filtered result back in Python. Plus, you can see the SQL statement generated by Ibis!

1.4K

Josh Patterson retweeted

Voltron Data@VoltronData·6 Sep

🔊 Just Aired! Our CEO @datametrician's interview on @Bloomberg 🚀 In <10 min, Josh unpacks how @VoltronData harness @NVIDIA GPUs to accelerate compute, helping businesses “do more with less.”
🎧 Listen now, starting at 29:39: bloomberg.com/news/audio/202…
💡 Highlights include...
- How to explain accelerated compute to Josh’s grandma
- Scale 200 servers down to just 2
- Slash query times from 30 min to 30 sec
- Lower energy consumption & operational costs
- Keep your code with BYO-API & open-source best practices
#AcceleratedCompute #ComposableDataSystems #SustainableTech #OpenStandards #Composability #OpenSource #GPUs #DataAnalytics

1.4K

Josh Patterson retweeted

Andrea B. Kalmans@akalmans·5 Sep

Incredible to hear @datametrician of @VoltronData at minute 30sh on @Bloomberg discussing how we make #ai data centers massively more energy efficient and scalable with @nvidia GPUs. omny.fm/shows/bloomber…

1.1K

Josh Patterson retweeted

Ibis@IbisData·4 Sep

It's never been a better time to get involved with Ibis! If you need Python dataframes that work on the best local engines (@duckdb, @ApacheDataFusio, @DataPolars) and distributed/cloud platforms (@ApacheSpark, @SnowflakeDB, @ClickHouseDB, @trinodb), give us a try!

5.3K

Josh Patterson retweeted

Mim@mim_djo·31 Aug

ibis solved the sql dialect difference problem, as i can't write the same sql that both engines can understand

764

Discover

@marlene_zw @pawjast @nvidia @duckdb @IbisData @ApacheArrow @GeoParquet @DynamicWebPaige