Toran Billups @toranb

8.4K posts

Decision Making With Feedforward Multilayer Perceptrons

Des Moines, IA · Joined September 2008
101 Following · 1.9K Followers
Toran Billups retweeted
Upen @upen946
Don’t build a product, build value. Don’t market your SaaS, market how it solves a problem. People care about results, not your product. If it’s valuable and you show them the benefits, they’ll happily buy it.
Toran Billups retweeted
Jo Kristian Bergum @jobergum
On AI in enterprises: models come and go; the true competitive advantage lies not in which frontier models you use but in how effectively you can connect those models to your organization's knowledge.
Toran Billups retweeted
Jeremy Howard @jeremyphoward
ModernBERT is available as a slot-in replacement for any BERT-like model, with both 139M param and 395M param sizes. It has an 8192-token sequence length, is extremely efficient, is uniquely great at analyzing code, and much more. Read this for details: huggingface.co/blog/modernbert
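A minimal sketch of that slot-in claim, assuming the answerdotai/ModernBERT-base checkpoint named in the linked post and a transformers release recent enough to support the architecture:

```python
# Minimal sketch: ModernBERT as a drop-in masked-LM via the standard
# transformers fill-mask pipeline. Assumes the answerdotai/ModernBERT-base
# checkpoint from the linked post and a recent transformers release.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

# Same masked-language-modeling interface as any BERT-style model.
for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```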
Toran Billups @toranb
@jobergum what language is the talk in? I'm planning to translate it from mp3 so I can listen this weekend
Jo Kristian Bergum @jobergum
Great presentation on how Taboola uses Vespa at scale to power their real-time ad recommendation system. Very interesting use case as there are many filter constraints + complex ranking phases. m.youtube.com/watch?v=iJfVWo…
Toran Billups @toranb
This blog post from the team @bitcrowd is an outstanding resource for those who want to leverage SOTA embeddings with bumblebee. Easily the highest value resource I've seen on the subject yet. This post in particular covers the path from zero to Jina v2 bitcrowd.dev/how-to-run-jin…
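The linked post works in Elixir with Bumblebee; for comparison, a rough Python equivalent for the same Jina v2 model, sketched with sentence-transformers:

```python
# Rough Python counterpart to the Elixir/Bumblebee walkthrough linked above,
# sketched with sentence-transformers. trust_remote_code=True is needed
# because the Jina v2 checkpoints ship custom modeling code.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jinaai/jina-embeddings-v2-base-en", trust_remote_code=True)
embeddings = model.encode(["How do I run Jina v2 embeddings?"])
print(embeddings.shape)  # (1, 768) for the base model
```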
Toran Billups retweeted
Gary Bernhardt @garybernhardt
Me at 25: Tests should be 5ish lines! One assert per test!
Me at 40: This test is 56 lines long with 11 asserts. If I broke it up, it would be 11 separate tests, ~5x as much code, multiple helper functions and `beforeEach`s to avoid duplication, and more difficult to read.
Toran Billups @toranb
@jobergum haha, just when I hoped you /would/ get started on e-comm search 😆
Philipp Schmid @_philschmid
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
- LLMs learn facts by encountering them multiple times during training (different sources).
- LLMs forget faster with exact data repetitions; using deduplicated data helps retain knowledge.
- Adding more data doesn't significantly improve how well LLMs learn facts.
- Using larger batches of data during training helps LLMs remember facts better.
- Experiments on 1B and 7B models show that larger models remember and generalize facts better.
Toran Billups @toranb
@_philschmid This list is awesome! I recently did a talk on my adventures with synthetic data and I would add that for generating DPO datasets you can derive a synthetic prompt from a good response and then use that synthetic prompt to generate the rejected response youtube.com/watch?v=R0VJIW…
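A sketch of the trick described in that reply; generate() is a hypothetical stand-in for whatever inference call you use, and the model names are placeholders:

```python
# Sketch of the DPO data trick above: derive a synthetic prompt from a
# known-good response, then answer that prompt with a weaker model to get
# the "rejected" side. generate() is a hypothetical stand-in for your
# inference backend (llama.cpp, vLLM, a hosted API, etc.).
def generate(model: str, prompt: str) -> str:
    raise NotImplementedError("wire this up to your inference backend")

def build_dpo_row(good_response: str) -> dict:
    # 1. Reverse-engineer a plausible instruction from the good response.
    synthetic_prompt = generate(
        "strong-model",
        f"Write the instruction that this response answers:\n\n{good_response}",
    )
    # 2. Sample a lower-quality answer to the same prompt as the rejected side.
    rejected = generate("weak-model", synthetic_prompt)
    return {"prompt": synthetic_prompt, "chosen": good_response, "rejected": rejected}
```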
Philipp Schmid @_philschmid
Creating a Pipeline for Generating Synthetic Data for Fine-Tuning Custom Embedding Models. 👀
Step 1 Create a Knowledge Base: Start by preparing your domain-specific knowledge base, such as PDFs or other documents containing information. Convert the content of these documents into a plain text format.
Step 2 Chunk the Data: Divide your text data into manageable chunks of approximately 256 tokens each (the chunk size used in RAG later).
Step 3 Generate Questions Using an LLM: Use a language model to generate K questions for each chunk of text. The questions should be answerable based on the content within the chunk. Example prompt: "Generate five questions that can be answered using the following text: [insert chunk here]."
Step 4 Optionally Generate Hard Negative Examples: Create hard negatives by generating questions that are similar to the correct questions but have answers that are incorrect or misleading. Alternatively, use random other samples from the batch as negative examples during training (in-batch negatives).
Step 5 Deduplicate and Filter Pairs: Remove duplicate question-context pairs to ensure uniqueness. Use the LLM to judge and filter out lower-quality pairs by defining custom rubrics for quality assessment.
Step 6 Fine-Tune Embedding Models: Use the prepared data to fine-tune your embedding models with Sentence Transformers 3.0.
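Condensed into code, the pipeline might look roughly like this; ask_llm() is a hypothetical stand-in for the LLM call, the chunking is a word-count approximation of the 256-token chunks, and the base model id is only an example:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses

def ask_llm(prompt: str) -> list[str]:
    # Hypothetical: return K generated questions for the given prompt.
    raise NotImplementedError("call your LLM of choice here")

def chunk(text: str, size: int = 256) -> list[str]:
    # Word-count approximation of the ~256-token chunks from step 2.
    words = text.split()
    return [" ".join(words[i : i + size]) for i in range(0, len(words), size)]

# Steps 2-3: chunk the knowledge base and generate questions per chunk.
pairs = []
for passage in chunk(open("knowledge_base.txt").read()):
    prompt = f"Generate five questions that can be answered using the following text: {passage}"
    pairs.extend({"anchor": q, "positive": passage} for q in ask_llm(prompt))

# Steps 4-6: MultipleNegativesRankingLoss treats other samples in the batch
# as negatives (in-batch negatives); dedup/filtering of `pairs` is elided.
model = SentenceTransformer("BAAI/bge-base-en-v1.5")
trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=Dataset.from_list(pairs),
    loss=losses.MultipleNegativesRankingLoss(model),
)
trainer.train()
```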
Toran Billups retweeted
Philipp Schmid @_philschmid
Data is all we need! 💎 @Alignment Lab AI just released Buzz, an instruction dataset with 3.13 million rows and a total of 85 million conversations in single- and multi-turns. 🤯 It comes in 3 configurations: Buzz (SFT), RLSTACK (RLHF), and Select Stack (filtered SFT).
TL;DR:
💥 Curated, deduplicated, extended, and regenerated from 435 datasets
🧠 Training Llama 3 on it with Buzz-8b-Large
🌍 85 million conversational turns, including new and augmented data
⚖️ RLSTACK contains 1 million samples of DPO preference pairs
🥇 Select Stack contains 1.5 million samples of the top-scoring response
🔄 The team intends to update and improve the dataset
🔓 Released under cc-by-4.0
🤗 Available on @huggingface
Kudos to the team at @alignment_lab and @HIVEDigitalTech for this release! I am looking forward to reading and learning more about the creation process! 🤗
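If you want to poke at it, a loading sketch; the repo id and split name here are assumptions, so check the Alignment Lab page on Hugging Face for the real ones:

```python
# Loading sketch only: the repo id and split name are assumptions; confirm
# them on the Alignment Lab Hugging Face page before relying on this.
from datasets import load_dataset

buzz = load_dataset("H-D-T/Buzz", split="train", streaming=True)
print(next(iter(buzz)))  # inspect one conversation row
```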
Toran Billups @toranb
@yevkurtov I showed at the end of the video that you can use the f16 or quantized model from the command line. Are you asking about a specific inference platform perhaps?
Toran Billups @toranb
I had trouble converting Mistral 8B Pro to GGUF format recently so I recorded a short how-to for llama-cpp n00bs like myself. Check it out! 👇
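The rough shape of that HF-to-GGUF flow, scripted with subprocess; script and binary names vary across llama.cpp versions, so treat the exact names and flags here as assumptions and check your checkout:

```python
# Rough HF -> GGUF outline. NOTE: script/binary names and flags vary across
# llama.cpp versions (convert-hf-to-gguf.py vs convert_hf_to_gguf.py,
# quantize vs llama-quantize), so verify against your checkout first.
import subprocess

# 1. Convert the Hugging Face checkpoint to an f16 GGUF file.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", "path/to/hf-model",
     "--outfile", "model-f16.gguf", "--outtype", "f16"],
    check=True,
)

# 2. Optionally quantize it down; q4_k_m is a common size/quality tradeoff.
subprocess.run(
    ["./llama-quantize", "model-f16.gguf", "model-q4_k_m.gguf", "q4_k_m"],
    check=True,
)
```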
Toran Billups @toranb
The next version of bumblebee is out and it's working great with Mistral 7B from HF using bf16 OOTB. It's great to see the platform moving forward with loads of improvements!
Toran Billups retweeted
Charlie Holtz @charlieholtz
Introducing YouTune — fine tune image models on YouTube videos.
> python tune.py <youtube-url>
• downloads video
• screenshots every 50 frames
• removes near duplicates
• fine tunes SDXL for you
github.com/cbh123/youtune
Toran Billups retweeted
José Valim @josevalim
Tomorrow marks 13 years since the first commit to the Elixir repo. And today we celebrate by announcing that Elixir is, officially, a gradually typed language: