Travis Biehn

130 posts

@tbiehn

Travis Biehn is lost in a single pane of glass fun-house.

Mobile · Joined April 2011
0 Following · 127 Followers
Travis Biehn (@tbiehn)
@gavin_gee @carrigmat Nebius looks like <$24/hr for 8xH100 - 640 GB of graphics RAM 🤔 Modest or selective quant on that and you’re cooking with gas.
0 replies · 0 reposts · 5 likes · 435 views
GG (@gavin_gee)
@carrigmat Would an x2idn.32xlarge on AWS work too? 128 vCPUs and 2048 GB RAM. It’s $13 an hour.
3 replies · 2 reposts · 68 likes · 20.6K views
Matthew Carrigan (@carrigmat)
Complete hardware + software setup for running Deepseek-R1 locally. The actual model, no distillations, and Q8 quantization for full quality. Total cost, $6,000. All download and part links below:
712 replies · 3.5K reposts · 27.6K likes · 5.5M views
Travis Biehn (@tbiehn)
@positiveblue2 @simonw R1 has tool calling tokens in its vocab & template. Ollama doesn’t have a generic template replacer for DeepSeek R1 yet, but the models likely support func calling. FIM tokens are in there too.
1 reply · 0 reposts · 2 likes · 98 views
positiveblue ⚡️🍠 (@positiveblue2)
@simonw Yes, that's what I did with Phi 4. Llama 3.2B interacts with the user but uses Phi 4 (with ctx about tools) for reasoning, and from the Phi 4 prompt response it infers what tools to call. However, it would be nice to have it built in so you can run 3B/8B models on the edge.
1 reply · 0 reposts · 3 likes · 133 views
Simon Willison (@simonw)
DeepSeek released a whole family of inference-scaling / "reasoning" models today, including distilled variants based on Llama and Qwen. Here are my notes on the new models, plus how I ran DeepSeek-R1-Distill-Llama-8B on my Mac using Ollama and LLM simonwillison.net/2025/Jan/20/de…
12 replies · 52 reposts · 468 likes · 48.7K views
Travis Biehn (@tbiehn)
So much ‘ad absurdum’ is now only compute bound. So, do we?
0 replies · 0 reposts · 0 likes · 98 views
Travis Biehn (@tbiehn)
@SwannMarcus89 He said, chuckling, downing his sixth whiskey at the company Christmas party…
0 replies · 0 reposts · 0 likes · 8 views
Travis Biehn (@tbiehn)
@Saboo_Shubham_ PassGAN is over 4 years old at this point - the original paper compared the technique with other SOTA approaches - PassGAN is one of the worst generative strategies available when compared to JTR & HashCat dict+rules. The password strength reality is worse with those tools.
0 replies · 0 reposts · 0 likes · 67 views
Shubham Saboo (@Saboo_Shubham_)
AI can crack 51% of passwords in less than 1 min 🤯 Meet PassGAN, a Generative Adversarial Network (GAN) that can autonomously learn the distribution of real passwords from actual password leaks. It can crack any kind of 7-character password in <6 mins even if it contains symbols.
83 replies · 217 reposts · 937 likes · 459.5K views
Travis Biehn (@tbiehn)
@jerryjliu0 I do this - for specific tasks I see that, for example, 1 of 4 strategies usually works best; however, there are always a few cases where other ones produce the best answers.
0 replies · 0 reposts · 0 likes · 5 views
Jerry Liu (@jerryjliu0)
There are too many options for building information retrieval:
- Chunk size
- Query strategy (top-k, hybrid, MMR)
Idea: What if we ensembled *all of the options* + let an LLM prune the pooled results? 👇
✅ More general retriever (though more 💰)
✅ Benchmark diff strategies
8 replies · 25 reposts · 171 likes · 31.6K views
elvis (@omarsar0)
Skeleton-of-Thought: LLMs can do parallel decoding. Interesting prompting strategy which first generates an answer skeleton and then performs parallel API calls to generate the content of each skeleton point. Reports quality improvements in addition to speed-up of up to 2.39x. Big deal given how costly in terms of latency some tasks are. This is a great paper to rethink the necessity of sequential decoding of current LLMs. arxiv.org/abs/2307.15337
10 replies · 130 reposts · 583 likes · 131.8K views
Travis Biehn (@tbiehn)
@simonw I'm generating and retrieving embeddings using a few different strategies, then using GPT-4 to compare and rank those responses. Doing a 'vibe check' on those strategies is important; getting ranked evals gives you prelim data to work from.
0 replies · 0 reposts · 1 like · 81 views
Simon Willison (@simonw)
Search engineers have spent decades figuring out good ways to evaluate if their search relevance algorithms are returning good results or not; we need to be adapting similar strategies
3 replies · 2 reposts · 7 likes · 5K views
Simon Willison (@simonw)
With respect to retrieval augmented generation for answering user questions, what's the current accepted best practice on how best to chunk up text for indexing in a vector database? (If this question makes no sense to you, my post here might help: simonwillison.net/2023/Jan/13/se…)
31 replies · 47 reposts · 313 likes · 110.6K views
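The thread above describes running several retrieval strategies side by side and having GPT-4 compare and rank what they return. A minimal sketch of that evaluation loop might look like the following. It is not the author's code: `rank_strategies`, the two toy retrievers, and `toy_judge` are hypothetical names, and the judge is a placeholder for wherever an LLM comparison call would go.

```python
from collections import Counter

def rank_strategies(queries, strategies, judge):
    """Count how often the judge prefers each retrieval strategy."""
    wins = Counter({name: 0 for name in strategies})
    for query in queries:
        # Run every strategy on the same query.
        candidates = {name: retrieve(query) for name, retrieve in strategies.items()}
        # The judge returns the name of the strategy it prefers.
        wins[judge(query, candidates)] += 1
    return wins.most_common()

# Toy corpus and retrievers, purely illustrative.
docs = [
    "chunking text for a vector index",
    "password cracking rules",
    "vector search evaluation",
]

def first_match(query):
    # Hypothetical strategy: first document containing the first query word.
    return [d for d in docs if query.split()[0] in d][:1]

def any_word(query):
    # Hypothetical strategy: every document containing any query word.
    words = query.split()
    return [d for d in docs if any(w in d for w in words)]

def toy_judge(query, candidates):
    # Placeholder for an LLM comparison: prefers the strategy that
    # returned the most documents (ties go to the alphabetically first name).
    return max(sorted(candidates), key=lambda name: len(candidates[name]))

result = rank_strategies(
    ["vector chunking", "password rules"],
    {"first_match": first_match, "any_word": any_word},
    toy_judge,
)
print(result)
```

Keeping the judge as a plain function means the tally logic can be checked with a deterministic stub before any model is wired in.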
Travis Biehn (@tbiehn)
I've released a complementary tool to ThoughtLoom that lets you do k-nearest-neighbors embeddings search from your CLI. I've used it for all sorts of nonsense. github.com/tbiehn/embedme…
0 replies · 0 reposts · 0 likes · 117 views
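For readers unfamiliar with k-nearest-neighbors search over embeddings, here is a minimal sketch of the idea using cosine similarity. It does not show how the linked tool works internally; the function names, the toy three-dimensional vectors, and the labels are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def k_nearest(query, index, k=3):
    """Return the labels of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), label) for label, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:k]]

# Toy "embeddings"; real ones would come from an embedding model.
index = {
    "cat": [1.0, 0.0, 0.1],
    "dog": [0.9, 0.1, 0.2],
    "car": [0.0, 1.0, 0.0],
}

print(k_nearest([1.0, 0.0, 0.0], index, k=2))
```

This brute-force scan is fine for small collections; larger ones usually call for an approximate nearest-neighbor index.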
Travis Biehn (@tbiehn)
Getting formatted data out of your LLM is a PITA. Specify JSONSpec functions for the new OpenAI API function support in the latest version of ThoughtLoom. Tell the model to use one, and voila - structured, escaped, typed emissions. github.com/tbiehn/thought…
0 replies · 0 reposts · 0 likes · 93 views
Travis Biehn (@tbiehn)
LLM-powered program exploration will be another leap for dynamic application security testing - at least as big as concolic fuzzing. Here's some great work showing how LLMs can make their way through arbitrary workflows and arbitrary user interfaces: osu-nlp-group.github.io/Mind2Web/
0 replies · 0 reposts · 0 likes · 106 views
Travis Biehn (@tbiehn)
@_atilla1 Good UX. I wonder how they'll make it not suck to use physically? Try mashing your fingers into a table, or wiggling them in the air for an extended period of time. Could definitely see us spending more time and doing less trivial work using a paired physical keyboard.
0 replies · 0 reposts · 1 like · 189 views
Atilla (@_atilla1)
Attention to detail is crucial, especially when it comes to interactions. 👇🏼 Here's a little breakdown of the keyboard interaction and visual feedback in visionOS.
1. Look at how the keys get highlighted when hovering with the fingers over them. ❤️
2. Pressing a key pushes it downwards on the Z axis.
3. Additionally, a little circular pulse expands outwards for visual confirmation.
Just looking at this is so satisfying, but typing in Vision Pro must be another level of satisfaction.
81 replies · 191 reposts · 2.2K likes · 893.5K views
Travis Biehn (@tbiehn)
Experimenting with LLMs? Love CLI? Always thought ideal CLI IPC was JSON? You’re one of the 5 people I wrote github.com/tbiehn/thought… for! Comes with a bunch of examples - including (cyber)security; writing reports from scan results, & generating fix patches from semgrep results.
0 replies · 0 reposts · 0 likes · 79 views
Travis Biehn reposted
Clint Gibler (@clintgibler)
🔮 Harnessing the Hive Mind: How Semgrep and @pdiscoveryio's Nuclei Are Shaping the Future of Security Engineering 🔥 overview of the benefits of open source security tooling, modern security engineering, and where things are headed dualuse.io/blog/harnessin…
0 replies · 10 reposts · 23 likes · 5.7K views