Refuel
@RefuelAI
50 posts

Solve enterprise data tasks at superhuman accuracy. Acquired by @togethercompute

San Francisco, CA · Joined May 2021
42 Following · 567 Followers
Refuel @RefuelAI
4/ To our customers: thank you for trusting us to solve your critical data problems and helping shape the journey. And to the Refuel team: you made this possible — every late night, every launch, every hard technical choice. We're proud of what we’ve built. Onwards!
Refuel @RefuelAI
3/ By joining @togethercompute, we will bring Refuel's team, technology and mission to Together's AI platform, and help accelerate the AI adoption journey of the next generation of developers and enterprises.
Refuel retweeted
Together AI @togethercompute
🚀 Big news: Together AI has acquired @RefuelAI! Refuel specializes in models and tools that turn messy, unstructured data into clean, structured input—exactly what teams need to build high-quality, production-grade AI applications. Details below 👇
[image attached]
Modal @modal
Full house for our open-source LLM demo night with @MistralAI last week!
[4 images attached]
Refuel retweeted
Nihit Desai @nihit_desai
Data intelligence too cheap to meter.

RefuelLLM-2-mini (75.02%), our latest 1.5B param SLM, outperforms all comparable models including Phi-3.5 (65.3%), Qwen2.5 (67.62%), Gemma2 (64.52%), Llama3-3B (55.8%) and Llama3-1B (39.92%) across our benchmark of data processing tasks such as labeling, enrichment and structured extraction.

RefuelLLM-2-mini is a Qwen2-1.5B base model, trained on a corpus of 2750+ datasets spanning tasks such as classification, reading comprehension, structured attribute extraction and entity resolution, using the same recipe as the other models in the Refuel-LLM family. It's fast!

We're open-sourcing the model weights, available on @huggingface: huggingface.co/refuelai/Qwen-…

If you'd like access to the models, along with fine-tuning support, DM me or reach out to us: refuel.ai/get-started

Grateful to our early customers for their partnership, and to the entire @RefuelAI team for their hard work 🚀
[2 images attached]
Nihit Desai @nihit_desai:

Thrilled to introduce RefuelLLM-2, our latest family of LLMs built for data labeling and enrichment tasks.

RefuelLLM-2 (83.82%) outperforms GPT-4-Turbo (80.88%), Claude-3-Opus (79.19%), Llama3-70B (78.2%) and Gemini-1.5-Pro (74.59%) on a benchmark of ~30 data labeling tasks. RefuelLLM-2-small (79.67%), aka Llama-3-Refueled, outperforms all comparable LLMs including Claude3-Sonnet (70.99%), Haiku (69.23%) and GPT-3.5-Turbo (68.13%).

We're open-sourcing the model: huggingface.co/refuelai/Llama…

You can try out the models and give us feedback here: labs.refuel.ai/playground. The code and data used for benchmarking the LLMs are available in our Autolabel library: github.com/refuel-ai/auto…

One more thing: the RefuelLLM-2 family of models outputs much better-calibrated confidence scores - a useful lever to reject, retry or ensemble low-confidence outputs.

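The quoted thread calls out calibrated confidence scores as a lever to reject, retry or ensemble low-confidence outputs. A minimal sketch of that routing pattern, where `label_fast`, `label_strong` and the threshold are hypothetical stand-ins for illustration, not Refuel's actual API:

```python
# Sketch: route low-confidence predictions from a small model to a larger
# fallback model, using calibrated confidence scores as the gate.
# `label_fast` / `label_strong` are placeholders for real model calls.

CONFIDENCE_THRESHOLD = 0.9

def label_fast(text: str) -> tuple[str, float]:
    # Placeholder for a small-model call (e.g. an SLM like RefuelLLM-2-mini).
    return ("Software", 0.62)

def label_strong(text: str) -> tuple[str, float]:
    # Placeholder for a larger, more expensive fallback model.
    return ("Software", 0.97)

def label_with_fallback(text: str) -> dict:
    label, conf = label_fast(text)
    if conf >= CONFIDENCE_THRESHOLD:
        return {"label": label, "confidence": conf, "model": "fast"}
    # Low confidence: retry with the stronger model instead of accepting.
    label, conf = label_strong(text)
    return {"label": label, "confidence": conf, "model": "strong"}

result = label_with_fallback("Slack - team messaging app")
print(result)
```

The same gate generalizes to ensembling (query several models and keep the highest-confidence answer) or outright rejection for human review.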
Refuel @RefuelAI
(6/6) - While not every marketplace looks like Netflix, recommendations drive revenue and high-quality data drives good recommendations. If you're building a recommendations system and thinking about data quality and the role LLMs can play, we should chat!
Refuel @RefuelAI
(5/6) - These observations led to Netflix eventually switching to a thumbs up and thumbs down system. The byproduct? An almost 200% increase in ratings!
Refuel @RefuelAI
In 2017, Netflix got rid of its “5 star” rating system in favor of a simple thumbs up and thumbs down approach. This decision fundamentally transformed their business. A 🧵- (1/6)
[image attached]
Varun Jain @varunjain
We are running classification of products into categories too - not with the number of categories you have, but here's what I think would work well for you at scale. Use @RefuelAI's cloud offering, or their open-source Autolabel repo. I know you mentioned no additional systems, but there's a simple API call at the end of this.

The main idea behind it is passing examples dynamically that are relevant to the current query. This significantly improves accuracy and confidence. Here's how it would work:

- You upload a test dataset with your prompt (you provide the correct categories for these, or you run it through the best model, GPT-4, and correct any mistakes)
- This gets added to your set of examples that is available for each subsequent call
- For the next set of examples, you can now switch to a lower-cost LLM (like Haiku, GPT-3.5, etc.) - this should drastically bring down costs
- It will now draw upon all the examples from the first dataset and significantly improve accuracy
- Then you sort by confidence and clear out any low-confidence rows
- Then you just click a button and get an API call to use for your categorization whenever you want - you don't need to maintain that platform going forward, unless you want to come back and tweak categories, etc.

(Disclaimer: not affiliated with Refuel - we're just paying customers, and I feel it should work for your use case.)

If this doesn't work for you, I would still experiment with: dynamic few-shot prompting (based on embedding match to the current query) + moving to a cheaper LLM.
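The "dynamic few-shot prompting (based on embedding match to the current query)" idea above can be sketched in a few lines. Here `embed` is a toy bag-of-words stand-in for a real embedding model, and all names are illustrative, not Autolabel's actual API:

```python
# Sketch: dynamic few-shot selection - pick the labeled seed examples most
# similar to the incoming query (by cosine similarity of embeddings) and
# splice only those into the prompt for the cheaper LLM.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; swap in a real embedding model in practice.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_examples(query: str, examples: list[dict], k: int = 2) -> list[dict]:
    q = embed(query)
    ranked = sorted(examples, key=lambda ex: cosine(q, embed(ex["text"])), reverse=True)
    return ranked[:k]

seed = [
    {"text": "Figma collaborative design tool", "label": "Design"},
    {"text": "Stripe payments API", "label": "Payments"},
    {"text": "Sketch vector design app", "label": "Design"},
]
shots = select_examples("Canva online design platform", seed, k=2)
print([ex["label"] for ex in shots])  # the two design tools rank highest
```

The selected examples then become the few-shot block of the categorization prompt, which is what lets a cheaper model approach the accuracy of a larger one.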
Josh Pigford @Shpigford
I have a nerdy AI challenge: I need to find the cheapest way to reliably categorize tools from an existing list of ~1400 categories. I'll send $500 to whoever comes up with the subjectively best option.

I'm currently using gpt-4o and it's costing me roughly $0.09 *per categorization*, but it works *very* reliably. I'd like to find a way to reduce that cost while keeping in mind that maintaining additional systems is NOT something I want to do. Perfect world is I just fire off what I need to an API endpoint and call it a day.

At any rate, here's a gist with all the categories as well as a bunch of examples of tools and how they should be categorized: gist.github.com/Shpigford/711a…
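The $0.09-per-categorization figure above makes the savings easy to model. A back-of-envelope comparison, where the cheaper-model price is a purely illustrative assumption, not a quote for any real model:

```python
# Back-of-envelope: cost of categorizing n_items at a flat per-call price.
# $0.09/call is the figure from the thread; $0.005/call is an assumed
# illustrative price for a cheaper model with few-shot prompting.
def batch_cost(n_items: int, price_per_call: float) -> float:
    return n_items * price_per_call

n = 10_000
gpt4o = batch_cost(n, 0.09)    # current setup
cheap = batch_cost(n, 0.005)   # assumed cheaper model
print(f"gpt-4o: ${gpt4o:,.2f}  cheaper: ${cheap:,.2f}  savings: ${gpt4o - cheap:,.2f}")
```

Even a modest per-call reduction compounds quickly at volume, which is why the thread's replies focus on routing work to smaller models.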
Refuel retweeted
dennylee @dennylee
We're kicking off the Data+AI Summit with the #MosaicX #Meetup: San Francisco Edition on Monday, June 10th. We're at the #Moscone Center South, 2nd floor, with over 1500 registrants and 39 speakers across four tracks. It's a "slightly" packed agenda with:

✅ Discussion panels on #Hardware, Build & Risks, a Data Panel, and a #VC Panel with a special session on #OLMo
✅ A #Research track on the importance of high-quality #data, common challenges in #RAG development, #diffusion models, and more
✅ A use cases track on composable #CDP, building models, #multimodal, #agents, and more
✅ In the building-LLMs track, we discuss the challenges, tools/techniques to build them, fast #LLM inference, and building #GenAI apps

While we are fully packed, if you are already registered for #DataAISummit, we will have a waitlist at the door.

We have speakers from @databricks @DbrxMosaicAI @LaminiAI @Oracle @VoltronData @Replit @AiSquared_ @robusthq @gretel_ai @superannotate @EssenceVenture @AmplifyPartners @llama_index @QuotientAI @ActionIQinc @RefuelAI @OrbyAI @yousearchengine @NumbersStnAI @lancedb @huggingface

mosaicx.events/events/june-10…
[image attached]
Refuel @RefuelAI
@ThinkInSysDev @Suhail 👋 hey @Suhail, you might like what we're building :) twitter.com/nihit_desai/st… - label, clean, generate data at scale
[Quoted tweet: Nihit Desai @nihit_desai introducing RefuelLLM-2 - full text above]
Suhail @Suhail
It seems like software's hardest problems will likely be solved via brute force: just massively scaling huge models with the most diverse, high-quality data you can find. This raises the question: will the market for data be larger than the market for (training) compute? Internet data seems like it will tap out soon.
Refuel retweeted
Refuel @RefuelAI
We're thrilled to introduce RefuelLLM-2. It outperforms every leading LLM we benchmarked (GPT-4-Turbo, Claude Opus, Llama 3-70B, Gemini 1.5 Pro) on our benchmark of data labeling tasks.
* Launch: refuel.ai/blog-posts/ann…
* Playground: labs.refuel.ai/playground
[Quoted tweet: Nihit Desai @nihit_desai introducing RefuelLLM-2 - full text above]