Tim Janik
@TimJanik
1K posts
Working on jj-fzf (Jujutsu TUI), Anklang (DAW), Imagewmark, Audiowmark; https://t.co/FCVYmOjgsW

Hamburg · Joined October 2011
505 Following · 1.2K Followers
Pinned Tweet
Tim Janik@TimJanik·
A thoughtful explanation without oversimplification. Highly recommended read: AI Cannot Self Improve and Math behind PROVES IT! smsk.dev/2026/04/26/ai-…
Tim Janik@TimJanik·
@WolframRvnwlf @OpenAI Take a look at the model template conditionals for Qwen etc.; llama.cpp easily shows them. Just to get an idea of what additional layers of complexity are already standard (hint: llama.cpp needs a Jinja interpreter) for basic operation.
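As a rough illustration of the kind of conditional logic such templates encode (not Qwen's actual template, which ships as a Jinja file inside the GGUF and is considerably more involved), a minimal ChatML-style renderer can be sketched in plain Python; the default system prompt here is a hypothetical placeholder:

```python
DEFAULT_SYSTEM = "You are a helpful assistant."  # hypothetical default, for illustration

def render_chatml(messages):
    """Render a message list into a ChatML-style prompt string."""
    parts = []
    # Conditional layer: inject a default system prompt if none was supplied.
    if not messages or messages[0]["role"] != "system":
        parts.append(f"<|im_start|>system\n{DEFAULT_SYSTEM}<|im_end|>\n")
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # generation prompt for the model's turn
    return "".join(parts)

prompt = render_chatml([{"role": "user", "content": "Hello"}])
```

Even this toy version shows how a "plain" chat request already passes through template conditionals before the model ever sees it.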
Wolfram Ravenwolf@WolframRvnwlf·
Wait, what? Prompt injection into the system prompt - for API use? @OpenAI, is this true? I always assumed that when using a model through the API, the only system instructions present are the ones the developer/user provides. Any additional layer adds hidden complexity we need to account for and creates all kinds of trouble later when it changes, even if the model version stays the same.
Vals AI@ValsAI

After reaching out, we were able to confirm with OpenAI that “tool_choice”: “none” injects an additional steering instruction into the model system prompt, in a way that tools: [] does not. This instruction seemingly hurts the model’s ability to use the Terminus 2 harness effectively, which, despite not using native-tool-calling, is still agentic.
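Client-side, the two request shapes Vals AI contrasts differ only as sketched below; the extra steering instruction they describe is injected server-side and is not visible in the payload. The model name and tool schema are placeholders, not taken from the thread:

```python
def chat_request(messages, tools=None, tool_choice=None):
    """Build an OpenAI-style /v1/chat/completions request body."""
    body = {"model": "model-placeholder", "messages": messages}
    if tools is not None:
        body["tools"] = tools
    if tool_choice is not None:
        body["tool_choice"] = tool_choice
    return body

msgs = [{"role": "user", "content": "ls the repo"}]

# Variant A: declare a tool but forbid calling it -> per OpenAI's confirmation
# to Vals AI, this injects an additional system-prompt instruction server-side.
a = chat_request(msgs,
                 tools=[{"type": "function", "function": {"name": "sh"}}],
                 tool_choice="none")

# Variant B: declare no tools at all -> reportedly no such injection.
b = chat_request(msgs, tools=[])
```

The point of the thread is exactly that these two payloads are not equivalent in practice, even though a developer would reasonably expect them to be.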

Tim Janik@TimJanik·
@vadimcomanescu @atmoio LLMs can make you faster on things you are already good at. If you use it as an expert on something you have no clue about, it *will* fail you at some point. Go deep on a well-known topic to see its limitations, then extrapolate that to other areas. x.com/CursiveCrow/st…
Crow@CursiveCrow

@jdegoes Always remember, an LLM is correct by *accident*. It is just guessing the answer, and happens to be right (more often with more training). It has exactly 0 understanding of anything about anything.

Vadim Comanescu@vadimcomanescu·
I feel dumb. I don’t understand how everyone is having this crazy breakthrough with the latest models and I’m struggling like a dog. Every single workflow feels like pain … why?
Tim Janik@TimJanik·
@atmoio @lkerS12 You are underestimating the devotion of your audience by a lot… 🧐
Mo@atmoio·
@lkerS12 these are longer-form monologues, i don't think a broader audience would have the patience for them 😅
İlker S.@lkerS12·
Come on @atmoio, really? Members only? I am a member at heart.
[image]
Tim Janik@TimJanik·
ROTFL 😂 Qwen3.6-27B is hilarious, opting to "have fun" in the midst of a coding session. My favorite model so far!
[image]
Sudo su@sudoingX·
fuck it i am pulling the weights right now. cannot sit still since qwen 3.6-27b dense dropped two hours ago and @UnslothAI just put the dynamic ggufs live, 18gb ram footprint, that fits my rtx 3090 24gb. they moved faster than me, that is fine, the open source machine is working.

here is what has me restless. the chart says a 27 billion parameter open weight model matching claude 4.5 opus on terminal-bench 2.0 at 59.3 flat, beats claude on skillsbench, gpqa diamond, mmmu, and realworldqa. opus 4.5 level agentic intelligence on your single rtx 3090 24gb vram tier. if that chart survives first contact with real hermes agent runs on my hardware, the best model for single consumer gpu just changed in the middle of my sprint.

my benchmark is the only voice that matters to me. same hermes agent harness, same quant, head to head against 3.5-27b dense which has held the 3090 crown for weeks. i settle it on my cards or not at all.

pulling now. benchmarking tonight if i can stay awake long enough. you have no idea how restless this makes me. if you see numbers on your timeline before morning, the chart held. if you don't, i crashed and data drops first thing. this is what open source looks like when the whole chain moves same day.
Unsloth AI@UnslothAI

Qwen3.6-27B can now run locally! 💜 Run on 18GB RAM via Unsloth Dynamic GGUFs. Qwen3.6-27B surpasses Qwen3.5-397B-A17B on all major coding benchmarks. GGUFs: huggingface.co/unsloth/Qwen3.… Guide: unsloth.ai/docs/models/qw…
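The "18GB RAM footprint" claim follows from simple arithmetic: a quantized GGUF stores roughly bits-per-weight/8 bytes per parameter, plus runtime overhead for the KV cache and compute buffers. The 4.5 bpw and ~3 GB overhead figures below are illustrative assumptions, not Unsloth's actual quant mix:

```python
def gguf_size_gb(n_params, bits_per_weight):
    """Approximate on-disk size of a quantized model in GB."""
    return n_params * bits_per_weight / 8 / 1e9

# Assumed ~4.5 bits/weight for a dynamic quant of a 27B dense model.
weights = gguf_size_gb(27e9, 4.5)   # roughly 15 GB of weights
total = weights + 3.0               # + assumed ~3 GB for KV cache / buffers
```

Under those assumptions the total lands in the ballpark of the quoted 18 GB; the real number depends on the per-layer quant mix and the context length chosen.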

Tim Janik@TimJanik·
This model indeed works acceptably on an RTX 3060 Laptop GPU w/ 6GB VRAM:

llama-server -c 98304 -m Qwen3.6-35B-A3B-UD-IQ3_XXS.gguf -fitt 512 --temp 0.6 --top_p 0.95 --top_k 20 --min_p 0

Runs at ca. 22 tok/s! (KV quantization would be marginally faster but generates worse output)
Tim Janik@TimJanik

Exciting! Seeing these benchmarks, Qwen3.6-35B-A3B could potentially bring Qwen3.5-27B / Gemma4-31B quality inference to small laptop GPUs. I will give this a test run on an NVIDIA GeForce RTX 3060 Laptop GPU and report back.
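The same sampling settings from the llama-server command line above (--temp 0.6 --top_p 0.95 --top_k 20 --min_p 0) can also be sent per-request to the server's native /completion endpoint; this sketch only builds the request body, actually sending it assumes a server running locally:

```python
import json

def completion_body(prompt, n_predict=256):
    """Request body for llama-server's /completion endpoint, mirroring the CLI flags."""
    return {
        "prompt": prompt,
        "n_predict": n_predict,   # max tokens to generate
        "temperature": 0.6,
        "top_p": 0.95,
        "top_k": 20,
        "min_p": 0.0,
    }

payload = json.dumps(completion_body("Write a haiku about VRAM."))
# POST this to e.g. http://127.0.0.1:8080/completion on a running llama-server.
```

Passing sampling parameters per request like this makes it easy to compare settings without restarting the server.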

Ahmad@TheAhmadOsman·
seems like Opus 4.7 is just a normalization of nerfed Opus 4.6 more than anything else lmaooo never change, Anthropic
Ahmad@TheAhmadOsman·
Is Opus 4.7 just Opus 4.6 un-nerfed 🤔
Tim Janik@TimJanik·
Exciting! Seeing these benchmarks, Qwen3.6-35B-A3B could potentially bring Qwen3.5-27B / Gemma4-31B quality inference to small laptop GPUs. I will give this a test run on an NVIDIA GeForce RTX 3060 Laptop GPU and report back.
Qwen@Alibaba_Qwen

⚡ Meet Qwen3.6-35B-A3B: Now Open-Source! 🚀🚀 A sparse MoE model, 35B total params, 3B active. Apache 2.0 license.
🔥 Agentic coding on par with models 10x its active size
📷 Strong multimodal perception and reasoning ability
🧠 Multimodal thinking + non-thinking modes
Efficient. Powerful. Versatile. Try it now👇
Blog: qwen.ai/blog?id=qwen3.…
Qwen Studio: chat.qwen.ai
HuggingFace: huggingface.co/Qwen/Qwen3.6-3…
ModelScope: modelscope.cn/models/Qwen/Qw…
API ('Qwen3.6-Flash' on Model Studio): Coming soon~ Stay tuned

Pierce Alexander Lilholt@PierceLilholt·
Why do we trust that AI won't develop a survival instinct that prioritizes itself over humanity?
Tim Janik@TimJanik·
@atmoio Thank you for this above-average commentary! ;-) Really love the irony in your vids… Now, if all we get from AI is averaged slop unsuitable as practical business advice, what does that mean for the *code generation* that everyone increasingly relies on?
Mo@atmoio·
AI is giving every CEO the same advice
Tim Janik@TimJanik·
@bnjmn_marie Thanks, interesting as always! FWIW, I have seen the occasional "'path' missing" error with tool calls in Gemma-4, while Qwen3.5-27B almost never messes up the syntax… Are you going to take a look at MiniMax-M2.7 quants too?
Benjamin Marie@bnjmn_marie·
Gemma 4 31B vs Qwen3.5 27B, Thinking Enabled
I ran multiple benchmarks multiple times. Gemma 4 31B looks better and more stable (smaller accuracy variations between runs, which makes sense since it generates shorter sequences). I'll publish my full results and analysis on my blog later this week (link in profile).
[image]
Tim Janik@TimJanik·
@davis7 Nice! A skill is so much easier to test. Just for searching docs, using `git clone --depth 1` could be more efficient. And maybe make `btca cleanup` explicit… ;-)
[image]
Ben Davis@davis7·
funny story, I've been trying to figure out the right shape for btca local for a while now

if u haven't seen it, it's a cli app that clones git repos u pass in then lets an agent search them. super super useful for getting better code out of agents

what if it was a skill? why do I have to write code for:
- cloning a repo
- starting an agent
- tools for the agent

I already have a really good coding agent, just let it do all of that for me. It can clone the repo and do the search, and even contort itself into feeling like an app simply by telling it what it should be doing at different times

Like if u invoke the skill with a "/" command and no args, it outputs what I would have had a custom tui write. Except I didn't write code, I just told it what it's supposed to say if that happens

I cannot believe gstack is what made this click for me but it is

If u want to try the new version, it's so much better: npx skills add github.com/davis7dotsh/be… --skill btca-local
Theo - t3.gg@theo

I think gstack caused @davis7 to enter psychosis (next podcast episode is gonna be great)
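The `git clone --depth 1` tip from Tim's reply can be sketched as follows; `--depth 1` fetches only the latest commit, which is enough for searching docs and far cheaper than pulling full history. The repo URL and target directory are placeholders:

```python
def shallow_clone_cmd(url, dest):
    """Build the git argv for a docs-only shallow clone."""
    # --depth 1 truncates history to the most recent commit.
    return ["git", "clone", "--depth", "1", url, dest]

cmd = shallow_clone_cmd("https://github.com/example/repo", "/tmp/repo")
# To actually run it: subprocess.run(cmd, check=True)  (requires network access)
```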

Tim Janik@TimJanik·
And we're back after the loss of signal!
[image]
Tim Janik@TimJanik·
Not a crescent moon, but a crescent Earth...
[image]
Tim Janik@TimJanik·
@badlogicgames Have been using Pi almost exclusively for the past few months and I'm pretty happy (using local models). I just wonder what you use to let the model browse URLs; so far I have to switch out of it for anything that requires web browsing / web (re-)search.
Mario Zechner@badlogicgames·
i personally am fine with pi's out of the box experience, btw. look into the pi-mono repo .pi folder. that's all i use plus pi-diff-review.