FlyingIkki

7.7K posts

@FlyingIkki

Joined April 2018
2.5K Following · 336 Followers
FlyingIkki retweeted
Alex Prompter
Alex Prompter@alex_prompter·
Anthropic just dropped a paper on 400,000 Claude Code sessions, and the headline finding flips a year of assumptions: domain expertise, not coding skill, is what makes an AI agent succeed.
235,000 people. Seven months. October 2025 to April 2026.
The division of labor in a typical session:
→ You make about 70% of the planning decisions (what to build)
→ The agent makes about 80% of the execution decisions (how to build it)
The agent runs the keyboard and you, the judgment.
The gap widens from there. The more you understand the domain, the more the model does per instruction. A novice prompt sets off around 5 actions and 600 words of output. An expert prompt sets off 12 actions and 3,200 words. Same model. Same subscription. Five times the output. The only variable is the person typing.
The success numbers track the same line:
→ Expert-rated sessions reach verified success more than twice as often as novice ones
→ When a session goes wrong, novices abandon it 19% of the time. Everyone else, 5 to 7%.
And a coding background barely matters. Lawyers, analysts, marketers, and managers all succeeded within a few points of software engineers. Managers landed at the top.
The gains come from competence, not mastery. A working grasp of the problem captures most of the benefit. Deep specialization adds a little on top.
The bottleneck was never the syntax. It was always how well you understand the thing you're trying to build. LLMs don't think, you do.
English
7
11
59
4.4K
FlyingIkki retweeted
Ulf Poschardt
Ulf Poschardt@ulfposh·
how little the left-wing feminists care about the abused girls in Nuremberg, the UK and Afghanistan.
German
313
1.9K
12.5K
143.9K
FlyingIkki retweeted
Matthew Berman
Matthew Berman@MatthewBerman·
One of my new favorite loops from Peter Steinberger (@steipete): “Refactor until you are happy with the architecture. After each significant step, live-test the system, run autoreview, and commit. Track progress in /tmp/refactor-{projectname}.md.” signals.forwardfuture.ai/loop-library/l…
Matthew Berman@MatthewBerman

Just launched Loop Library - a curated list of agent loops you can use right now. Find loops, submit your own, tokenmaxx!! signals.forwardfuture.ai/loop-library/

English
32
31
951
99.5K
FlyingIkki retweeted
Ning
Ning@totheagi·
We're the first to make the full GLM-5.2 (FP8) run on RTX 4090s. GLM-5.2 is the new 753B SOTA open-weights model, and it officially ships for datacenter GPUs only: H100, H200, B200. We ported its sparse-attention kernel stack to consumer hardware. A frontier open model, off the scarce GPUs and onto the abundant kind. github.com/renning22/glm-… @Zai_org
English
36
55
535
66K
FlyingIkki retweeted
Australischer Austauschstudent
They are now seriously going through with this in cold blood. The ÖRR (Germany's public broadcasters) will not report on Rupert Lowe's Rape Gang Inquiry Report. A 219-page report that lays bare the rape of more than 250,000 British girls over several decades has, in the eyes of our objective quality journalists, no "supra-regional relevance". Here on X, 43 million people learned of it within a single day. This platform is literally the last bastion of free speech.
German
70
758
4.6K
32.2K
FlyingIkki retweeted
Oliver Gorus
Oliver Gorus@olivergorus·
The chairman of the supervisory board of the company behind "W Social", the EU's little social media pet, is the very same Ingmar Rentzhog who staged Greta Thunberg in 2018 with his agency "We Don't Have Time". The agency also holds a stake in W Social. Just saying.
German
37
338
999
9.8K
FlyingIkki retweeted
elvis
elvis@omarsar0·
Cool paper on Skill routing for LLM agents. Real tasks rarely map to a single skill. They need several composed together, but most skill routing still treats the problem as picking one tool from a library. This work formalizes Compositional Skill Routing, decomposes a complex query into atomic sub-tasks, retrieves the right skill for each, and then composes an executable plan. The system, SkillWeaver, pairs an LLM decomposer with a bi-encoder FAISS retriever and a dependency-aware DAG planner. It comes with CompSkillBench, 300 compositional queries over 2,209 real skills, so the multi-skill case gets measured directly. Why does it matter? As skill libraries grow, single-skill retrieval quietly caps what an agent can do. The DAG planner turns retrieved skills into an ordered, dependency-respecting plan. Paper: arxiv.org/abs/2606.18051 Learn to build effective AI agents in our academy: academy.dair.ai
English
12
44
194
13.9K
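The "ordered, dependency-respecting plan" in elvis's post above can be illustrated with a topological sort. This is a minimal sketch, not SkillWeaver's actual planner: the post does not describe the algorithm, and the sub-task names and the `plan_order` helper are hypothetical. It relies only on `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter


def plan_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return an order in which every sub-task runs after its prerequisites.

    `dependencies` maps each sub-task to the set of sub-tasks it depends on.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical sub-tasks from a decomposed query; each would be matched
# to a retrieved skill before planning.
deps = {
    "load_data": set(),
    "summarize": {"load_data"},
    "plot": {"load_data"},
    "write_report": {"summarize", "plot"},
}

order = plan_order(deps)
print(order)
```

`summarize` and `plot` do not depend on each other, so either may come first; only their position relative to `load_data` and `write_report` is guaranteed.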
FlyingIkki retweeted
Sebastian Caliri
Sebastian Caliri@SebastianCaliri·
I just tested my hand in a mini version of this scanner. Images that are higher quality than MRI, whole body captured in <1 minute, virtually free to run. This is going to change medicine. Things get even crazier when you consider the possibility of using the same tank to focus ultrasound to ablate tissue, stimulate nerves, etc. The FDA is not in the slightest ready for this. People will also complain about incidental findings but they are wrong and don’t understand how quickly software can improve and how inexpensive a time series of scans will be to generate.
Midjourney@midjourney

A technical dive inside our new "Midjourney Scanner"

English
224
613
9.6K
849K
FlyingIkki retweeted
Chainlink
Chainlink@chainlink·
🔒 SOC 2 Type 2 🌐 ISO 27001 🔗 LINK 2026
English
27
204
727
16.2K
FlyingIkki retweeted
h100envy
h100envy@h100envy·
Daniel Han wrote Unsloth, the reason half of open-source can fine-tune a model on one GPU instead of a cluster. He didn't optimize the math. He rewrote the kernels by hand, found bugs in everyone else's code, and made training 2 to 3 times faster with zero accuracy loss. Millions of fine-tunes run through his code every month. Most people training a model locally are standing on it without knowing. Everyone talks about who has the most GPUs. He made yours enough.
h100envy@h100envy

x.com/i/article/2065…

English
13
30
300
28.2K
FlyingIkki retweeted
Unreal Engine
Unreal Engine@UnrealEngine·
Unreal Engine 5.8 ships today with experimental MCP server support: Your sources, your pipeline and your workflow—simply configure the MCP plugin and connect to any agent. Get familiar with the MCP server and the PCG Primitive Plugin today and see what teams can build together: epic.gm/ue-5-8-blog
English
228
874
7.2K
2.8M
FlyingIkki retweeted
Chainlink
Chainlink@chainlink·
NEW: Top-10 crypto exchange with 120M+ users, @okx, adopts Chainlink to unlock the $80 trillion tokenized RWA opportunity on X Layer. Chainlink enables devs to create advanced apps, bringing the agentic economy & high-speed DeFi to Chainlink Scale member @XLayerOfficial.
English
63
434
1.2K
119.6K
FlyingIkki retweeted
Michaël van de Poppe
Michaël van de Poppe@CryptoMichNL·
Slowly, but surely, $LINK is growing and growing. Another massive partner added to the ecosystem, it's @okx. Very happy to see this news between the two parties, and very bullish for the ecosystem as a whole.
Chainlink@chainlink

NEW: Top-10 crypto exchange with 120M+ users, @okx, adopts Chainlink to unlock the $80 trillion tokenized RWA opportunity on X Layer. Chainlink enables devs to create advanced apps, bringing the agentic economy & high-speed DeFi to Chainlink Scale member @XLayerOfficial.

English
39
58
505
62.1K
FlyingIkki retweeted
ÖRR Blog.
ÖRR Blog.@OERRBlog·
Elon Musk is an idiot and a fascist, Tesla drivers are assholes, a comparison of Tesla shareholders with the incest criminal Josef Fritzl. Jan Böhmermann works for ZDF. #OerrBlog
German
740
1.2K
6.8K
164.8K
FlyingIkki retweeted
vittorio
vittorio@IterIntellectus·
this is actually incredible
a full body ultrasound scanner that takes 60 seconds instead of spending an hour in an MRI tube, without radiation, hospitals or a $2000 bill
soon you’ll just walk into a health spa, order a coffee, step into the pod, and walk out with a 3D map of your body
the future is finally starting to look like the future
Midjourney@midjourney

A technical dive inside our new "Midjourney Scanner"

English
243
1.2K
15.3K
1.3M
FlyingIkki retweeted
Hassan
Hassan@nutlope·
This model is insane at design. I asked GLM 5.2 (left) and Opus 4.8 (right) to build me a landing page and you can't even tell the difference. GLM cost $0.06 while opus cost $0.49. More than 6x cheaper while being faster + more token efficient. Another win for open source AI.
Z.ai@Zai_org

Introducing GLM-5.2: Frontier Intelligence, Open Weights - Significant improvements in coding and agentic tasks - Strong long-horizon capabilities with a 1M context window - Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency - MIT-licensed open weights - Same API pricing as GLM-5.1 Tech Blog: z.ai/blog/glm-5.2 Weights: huggingface.co/zai-org/GLM-5.2 API: docs.z.ai/guides/llm/glm… Coding Plan: z.ai/subscribe Chat: chat.z.ai

English
298
483
7.4K
1.2M
FlyingIkki retweeted
Chubby♨️
Chubby♨️@kimmonismus·
The Midjourney medical thing is genuinely strange and I kind of love it. The plan is a spa. Hot tubs, saunas, cold plunges, open 24/7, first location in San Francisco in 2027. You step into a shallow pool of water, sink slowly through a ring of half a million tiny ultrasonic sensors, and in about 60 seconds you walk out with a 3D map of your insides down to a fraction of a millimeter. No magnets, no radiation, no contrast, just sound waves and warm water. Compare that to how we do this now: They say it's close to 100x faster than an MRI ("60 seconds"). For context, a normal MRI in the US averages around $1,300 and the scan alone can take over an hour inside a loud metal tube. A full-body scan from Prenuvo runs about $2,500 for roughly the same hour. Midjourney wants to flip the whole feeling of it. Build a place you'd want to visit even if there were no scanner, then collect the health data as a side effect. I have no idea yet if the tech delivers what they claim. But the framing is smart. The hardest problem in preventive health has always been getting people to actually show up, and a spa solves that better than a hospital ever will.
Midjourney@midjourney

Announcing a new division of Midjourney called "Midjourney Medical"

English
67
109
1.5K
223.5K
FlyingIkki retweeted
Alok
Alok@analogalok·
Google's Gemma 4 26B A4B QAT hits 25+ tokens/sec decode and 320+ tokens/sec prefill on 8 GB VRAM (RTX 4060) + 16 GB RAM using TurboQuant.
Prefill just went from 200 → 320+ tok/s on the same 8 GB card. 1.6x, no new hardware, no new quant, just a KV cache trick stacked on top of the Gemma 4 26B MoE setup from a few days ago.
A few days ago I posted Gemma 4 26B A4B hitting 28 tok/s decode on 8 GB VRAM using native MTP. Prefill was stuck around 200 tok/s. Fair callout by the community.
So today I tested something I'd already been meaning to try: TheTom/llama-cpp-turboquant, the TurboQuant KV cache fork by Tom Turney (@no_stp_on_snek). (GitHub link in the comments.) Thanks to him, the fork just got resynced to mainline, so MTP + TurboQuant now run together cleanly (I didn't see any meaningful gains from using MTP with this setup, but you can try).
The flags (no MTP):
-m gemma-4-26B-A4B-it-qat-UD-Q4_K_XL.gguf -cnv -c 64000 --cache-type-k q8_0 --cache-type-v turbo3
Results on the same RTX 4060 8 GB, tested with a 27k-token prompt at 64k context loaded:
Prefill: 200 tok/s → 320+ tok/s
Decode: stayed above 25 tok/s (without MTP)
Why it works: TurboQuant uses Walsh-Hadamard rotation + polar quantization on the KV cache. Keys are sensitive to compression, values much less so, so it splits the difference: K stays at q8_0, V drops to turbo3 (~3 bits).
Bonus from the memory savings: the same 8 GB card can now stretch to 100-120k context with minimal decode penalty. It should now be snappier with any agent harness, such as Hermes Agent, without compromising on intelligence.
If you're already running Gemma 4 on a small card, this stacks on top for free. Try --cache-type-k q8_0 --cache-type-v turbo3 on your setup and report back what your prefill/decode split looks like.
Unsloth model GGUF and llama.cpp TurboQuant fork links in the comments. What's your prefill number before vs after?
Alok@analogalok

Run Gemma 4 26b MTP on 8 GB VRAM GPUs at 25+ tokens/second. Flags included!
The local LLM space is moving at terminal velocity. Only 3 days ago Google released Gemma 4 26B A4B QAT quants: more efficient than before, they ran on 8 GB VRAM at 20 tok/sec. And now, just a few hours ago, mainline llama.cpp merged a massive update and we just shattered our own record. Decode throughput went up 25-40% on the same 8 GB VRAM setup!
Before MTP: 20 tps → After MTP: 28 tps!
llama.cpp just officially merged PR #23398 ("add Gemma4 MTP"), bringing native Multi-Token Prediction (MTP) support to Gemma 4 models. By running speculative drafting on the same 8 GB VRAM RTX 4060 setup, my decode throughput on a 64k context instantly leaped to a blistering 25–27 tokens/sec. That's a 25-30% increase on the same hardware.
Here is the architectural catch you need to know: unlike the Qwen 3.5 and 3.6 series, which bake the MTP heads directly into the base GGUF, the Gemma 4 MTP head is not built in. You must download a separate, specialized MTP drafter GGUF (the assistant model) to act as the speculator. (I've dropped the download link in the replies.)
Copy and try the exact flags:
-m gemma-4-26B-A4B-it-qat-UD-Q4_K_XL.gguf --spec-type draft-mtp --spec-draft-n-max 6 --spec-draft-p-min 0.7 --spec-draft-model gemma-4-26b-A4B-it-assistant-Q4_0.gguf -c 64000 -v
n-max 4 and p-min 0.7 is also worth checking out. Benchmark on your setup and workflow.
If you have a single 8 GB VRAM NVIDIA RTX 4060, 3060, 3070, 2080, or 2070, grab the MTP drafter GGUF link in the comments and try it yourself. Check it out even if you have a smaller or a larger GPU, such as a single RTX 3090, 4090, 3060, or 2060. MTP works for all Gemma 4 sizes, such as Gemma 4 12B, Gemma 4 31B, etc., but remember to grab the correct MTP draft assistant model for each.
What are you benchmarking today?

English
24
56
523
73K
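The throughput figures quoted in Alok's two posts above can be turned into wait times and percentage gains with a little arithmetic. This is a rough sketch under two assumptions that are not in the posts: the "27k token prompt" is taken as exactly 27,000 tokens, and throughput is treated as constant across the whole prompt. The helper names are illustrative only.

```python
def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Time to process `tokens` at a constant rate."""
    return tokens / tokens_per_second


def percent_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100


PROMPT_TOKENS = 27_000  # assumption: "27k" read as exactly 27,000 tokens

# Prefill: 200 tok/s before, 320 tok/s after (the post says "320+").
prefill_before_s = seconds_for(PROMPT_TOKENS, 200)
prefill_after_s = seconds_for(PROMPT_TOKENS, 320)
prefill_gain = percent_gain(200, 320)

# Decode: 20 tok/s before MTP, 28 tok/s after MTP.
decode_gain = percent_gain(20, 28)

print(f"prefill wait: {prefill_before_s:.0f} s -> {prefill_after_s:.0f} s")
print(f"prefill gain: {prefill_gain:.0f}%")
print(f"decode gain:  {decode_gain:.0f}%")
```

Under these assumptions the prompt's prefill wait falls from 135 seconds to roughly 84 seconds. Since the post reports "320+" tok/s, the second figure is an upper bound on the wait rather than a measurement.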