Refuel

Refuel@RefuelAI·15 May

4/ To our customers: thank you for trusting us to solve your critical data problems and helping shape the journey. And to the Refuel team: you made this possible — every late night, every launch, every hard technical choice. We're proud of what we’ve built. Onwards!

2

264

Refuel@RefuelAI·15 May

3/ By joining @togethercompute, we will bring Refuel's team, technology and mission to Together’s AI platform, and help accelerate the AI adoption journey of the next generation of developers and enterprises

0

2

333

Refuel@RefuelAI·15 May

We have some big news to share today - @RefuelAI is joining @togethercompute to help accelerate the future of open source and enterprise AI! together.ai/blog/together-…

4

21

1.8K

Refuel retweeted

Together AI@togethercompute·15 May

🚀 Big news: Together AI has acquired @RefuelAI! Refuel specializes in models and tools that turn messy, unstructured data into clean, structured input—exactly what teams need to build high-quality, production-grade AI applications. Details below 👇

2

4

29

2.8K

Refuel@RefuelAI·14 Mar

@modal_labs @andonlabs @phonic_co @CleanlabAI @sethkimmel3 Thank you for hosting, @modal_labs!

1

47

Modal@modal·14 Mar

From blazing fast speech-to-speech to domain-specific agent evals, we learned a ton from the community! @RefuelAI @andonlabs @phonic_co @CleanlabAI @sethkimmel3 Who should we team up with for our next SF event? 💞🌉

3

10

2.1K

Modal@modal·14 Mar

Full house for our open-source LLM demo night with @MistralAI last week!

0

22

2.2K

Refuel retweeted

Nihit Desai@nihit_desai·8 Jan

Data intelligence too cheap to meter
RefuelLLM-2-mini (75.02%), our latest 1.5B param SLM, outperforms all comparable models including Phi-3.5 (65.3%), Qwen2.5 (67.62%), Gemma2 (64.52%), Llama3-3B (55.8%) and Llama3-1B (39.92%) across our benchmark of data processing tasks such as labeling, enrichment and structure extraction.
RefuelLLM-2-mini is a Qwen2-1.5B base model, trained on a corpus of 2750+ datasets spanning tasks such as classification, reading comprehension, structured attribute extraction and entity resolution, using the same recipe as other models in the Refuel-LLM family. It's fast!
We’re open sourcing the model weights, available on @huggingface - huggingface.co/refuelai/Qwen-…
If you'd like to access models, along with fine tuning support, DM me or reach out to us: refuel.ai/get-started
Grateful to our early customers for their partnership, and the entire @RefuelAI team for their hard work 🚀

Thrilled to introduce RefuelLLM-2, our latest family of LLMs built for data labeling and enrichment tasks.
RefuelLLM-2 (83.82%) outperforms GPT-4-Turbo (80.88%), Claude-3-Opus (79.19%), Llama3-70B (78.2%) and Gemini-1.5-Pro (74.59%) on a benchmark of ~30 data labeling tasks.
RefuelLLM-2-small (79.67%), aka Llama-3-Refueled, outperforms all comparable LLMs including Claude3-Sonnet (70.99%), Haiku (69.23%) and GPT-3.5-Turbo (68.13%). We’re open sourcing the model: huggingface.co/refuelai/Llama…
You can try out the models here and give us some feedback! labs.refuel.ai/playground
The code and data used for benchmarking the LLMs are available in our Autolabel library: github.com/refuel-ai/auto…
One more thing: the RefuelLLM-2 family of models outputs much better calibrated confidence scores - a useful lever to reject, retry or ensemble low confidence outputs.

3

15

2.2K
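The quoted announcement above describes confidence scores as "a useful lever to reject, retry or ensemble low confidence outputs". A minimal sketch of what that could look like in practice follows. The function names `route` and `ensemble` and both threshold values are hypothetical choices for illustration, not part of any Refuel or Autolabel API.

```python
from collections import defaultdict


def route(confidence, accept_at=0.9, retry_at=0.6):
    """Decide what to do with one model output based on its confidence.

    The thresholds are illustrative assumptions; in practice they would be
    tuned on a labeled validation set.
    """
    if confidence >= accept_at:
        return "accept"
    if confidence >= retry_at:
        return "retry"   # e.g. re-run with a stronger model or more examples
    return "reject"      # e.g. send to human review


def ensemble(outputs):
    """Confidence-weighted vote over (label, confidence) pairs."""
    totals = defaultdict(float)
    for label, confidence in outputs:
        totals[label] += confidence
    return max(totals, key=totals.get)


if __name__ == "__main__":
    print(route(0.95), route(0.7), route(0.3))
    print(ensemble([("A", 0.6), ("B", 0.5), ("B", 0.4)]))
```

This only pays off when the scores are well calibrated, which is the property the announcement claims for the RefuelLLM-2 family.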

Refuel@RefuelAI·27 Aug

(6/6) - While not every marketplace looks like Netflix, recommendations drive revenue and high-quality data drives good recommendations. If you're building a recommendations system and thinking about data quality and the role LLMs can play, we should chat!

438

Refuel@RefuelAI·27 Aug

(5/6) - These observations led to Netflix eventually switching to a thumbs up and thumbs down system. The byproduct? An almost 200% increase in ratings!

2

0

493

Refuel@RefuelAI·27 Aug

In 2017, Netflix got rid of its “5 star” rating system in favor of a simple thumbs up and thumbs down approach. This decision fundamentally transformed their business. A 🧵- (1/6)

2

0

2

808

Refuel@RefuelAI·21 Jun

@varunjain @Shpigford Thanks @varunjain 👋 @Shpigford would love to chat!

1

18

Varun Jain@varunjain·21 Jun

We are running classifications of products into categories too - though not as many categories as you have - but here's what I think would work well for you at scale.
Use @RefuelAI's cloud offering, or their open-source Autolabel repo. I know you mentioned no additional systems, but there's a simple API call at the end of this.
The main idea behind it is passing examples dynamically that are relevant to the current query. This significantly improves accuracy and confidence level.
Here's how it would work:
- You upload a test dataset with your prompt (you provide it the correct categories for these, or you run it through the best model, GPT4, and correct any mistakes)
- This gets added to your set of examples that is available for each subsequent call
- For the next set of examples, you can now switch out to a lower cost LLM (like Haiku, 3.5, etc) - should drastically bring down costs
- It will now draw upon all the examples from the first dataset and significantly improve accuracy
- Then you sort by Confidence, and clear out any low confidence scores
- Then you just click a button and get an API call to use for your categorization whenever you want - you don't need to maintain that platform going forward, unless you want to come and tweak categories, etc
(Disclaimer - not affiliated with Refuel - we're just paying customers, and I feel it should work for your use case)
If this doesn't work for you, I would still experiment with: dynamic few shot prompting (based on embedding match to the current query) + moving to a cheaper LLM

0

4

163
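The reply above suggests dynamic few-shot prompting: pick the labeled examples most similar to the current query, put them in the prompt, then drop low-confidence results. A minimal sketch of that idea is below. It does not use Refuel's or Autolabel's API. The `embed` function is a toy bag-of-words stand-in for a real embedding model, and the example data, function names and the 0.8 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_examples(query, labeled_examples, k=2):
    """Return the k labeled examples most similar to the query."""
    q = embed(query)
    ranked = sorted(
        labeled_examples,
        key=lambda ex: cosine(q, embed(ex["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, labeled_examples, k=2):
    """Assemble a few-shot prompt from the most relevant examples."""
    parts = ["Categorize the tool into one category."]
    for ex in select_examples(query, labeled_examples, k):
        parts.append(f"Tool: {ex['text']}\nCategory: {ex['label']}")
    parts.append(f"Tool: {query}\nCategory:")
    return "\n\n".join(parts)


def keep_confident(predictions, threshold=0.8):
    """Sort by confidence (highest first) and drop rows below the threshold."""
    ranked = sorted(predictions, key=lambda p: p["confidence"], reverse=True)
    return [p for p in ranked if p["confidence"] >= threshold]


# Hypothetical seed examples, standing in for the corrected test dataset.
EXAMPLES = [
    {"text": "invoice and billing software for freelancers", "label": "Invoicing"},
    {"text": "email newsletter platform for creators", "label": "Email Marketing"},
    {"text": "time tracking app for remote teams", "label": "Time Tracking"},
]

if __name__ == "__main__":
    print(build_prompt("billing and invoice tool for agencies", EXAMPLES))
```

The resulting prompt can be sent to whichever lower-cost model is being evaluated. With ~1400 categories, selecting examples per query keeps the prompt short, which is where most of the cost saving would come from.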

Josh Pigford@Shpigford·20 Jun

I have a nerdy AI challenge: I need to find the cheapest way to reliably categorize tools from an existing list of ~1400 categories. I'll send $500 to whoever comes up with the subjectively best option.
I'm currently using gpt-4o and it's costing me roughly $0.09 *per categorization*, but it works *very* reliably.
I'd like to find a way to reduce that cost while keeping in mind that maintaining additional systems is NOT something I want to do. Perfect world is I just fire off what I need to an API endpoint and call it a day.
At any rate, here's a gist with all the categories as well as a bunch of examples of tools and how they should be categorized. gist.github.com/Shpigford/711a…

85

5

105

85.3K

Refuel retweeted

Nihit Desai@nihit_desai·13 Jun

Thank you @databricks @DbrxMosaicAI for the keynote shoutout! Always great connecting with new and old friends at the @Data_AI_Summit

2

20

1.3K

Refuel retweeted

dennylee@dennylee·7 Jun

We're kicking off the Data+AI Summit with the #MosaicX #Meetup: San Francisco Edition on Monday, June 10th. We're at the #Moscone Center South, 2nd floor, with over 1500 registrants and 39 speakers across four tracks.
It's a "slightly" packed agenda with:
✅ Discussion panels on #Hardware, Build & Risks, Data Panel, and #VC Panel with a special session on #OLMo
✅ #Research track on importance of high quality #data, common challenges in #RAG development, #diffusion models, and more
✅ A use cases track on composable #CDP, building models, #multimodal, #agents, and more
✅ In the building LLMs track, we discuss the challenges, tools/techniques to build them, fast #LLM inference, and building #GenAI apps.
While we are fully packed, if you are already registered for #DataAISummit, we will have a waitlist at the door.
We have speakers from @databricks @DbrxMosaicAI @LaminiAI @Oracle @VoltronData @Replit @AiSquared_ @robusthq @gretel_ai @superannotate @EssenceVenture @AmplifyPartners @llama_index @QuotientAI @ActionIQinc @RefuelAI @OrbyAI @yousearchengine @NumbersStnAI @lancedb @huggingface
mosaicx.events/events/june-10…

7

14

53

37.2K

Refuel@RefuelAI·14 May

@DbrxMosaicAI @databricks @Dynamoai @allen_ai We're big fans of @DbrxMosaicAI's training platform! Great speed, reliability and great user experience 🚀 Thanks for the shoutout

3

176

Databricks AI Research@DbrxMosaicAI·14 May

Our #DBRX open source #LLM was built using @databricks Mosaic AI Training - now our engineering team is sharing the core capabilities that made it possible for us (and others like @Dynamoai, @allen_ai, @RefuelAI) to seamlessly train large-scale #GenAI models. Learn more: databricks.com/blog/mosaic-ai…

5

6

30

9.4K

Refuel@RefuelAI·13 May

@ThinkInSysDev @Suhail 👋 hey @Suhail, you might like what we're building :) twitter.com/nihit_desai/st… - label, clean, generate data at scale

110

thinkinsysdev@ThinkInSysDev·13 May

@Suhail @RefuelAI is probably going to be the shovel for this.

0

6

204

Suhail@Suhail·13 May

It seems like software's hardest problems will likely be solved via brute force: just massively scaling huge models with the most diverse high quality data you can find. This begs the question: will the market for data be larger than (training) compute? Internet data seems like it will tap out soon.

24

9

151

85.7K

Refuel retweeted

Nihit Desai@nihit_desai·9 May

We're trending on @huggingface! Check out: huggingface.co/refuelai/Llama… Try out the model here: labs.refuel.ai/playground

5

21

14.3K

Refuel@RefuelAI·8 May

We're thrilled to introduce RefuelLLM-2. Outperforms every single LLM available (GPT-4-Turbo, Claude Opus, Llama 3-70B, Gemini 1.5 Pro) on our benchmark of data labeling tasks. * Launch: refuel.ai/blog-posts/ann… * Playground: labs.refuel.ai/playground