wiynt

2.3K posts

wiynt banner
@wiynt

saving bookmarks

Under Water Joined October 2011
4.4K Following 377 Followers
wiynt retweeted
Anish Moonka
Anish Moonka@anishmoonka·
The woman who inspired Before Sunrise never saw the movie. She died in a motorcycle accident seven weeks before filming even started. Richard Linklater met Amy Lehrhaupt in a toy shop in Philadelphia in 1989. They spent the whole night walking the city from midnight to 6am talking about everything, and he turned that one night into a script. Took him 11 days. The casting search lasted nine months. Jennifer Aniston auditioned before she ever got Friends, and Gwyneth Paltrow tried out too. The role went to Julie Delpy. Amy died on May 9, 1994. She was 24. Linklater kept waiting for her to show up at a screening, maybe tap him on the shoulder and say “Hey, that was our night.” He waited through the premiere. Waited through the sequel nine years later. Didn’t find out she was gone until 2010, when a friend of hers put the pieces together and wrote him a letter. He kept making the films. Brought Hawke and Delpy back every nine years, letting them age on screen in real time. Before Sunset (2004) was shot in 15 days for $2 million. Before Midnight (2013), same thing, 15 days, under $3 million. That last one is dedicated to Amy. All three films together cost $7.5 million to make and earned $61.5 million worldwide, and both sequels got nominated for the Oscar for Best Adapted Screenplay. The entire trilogy cost less than a single Janet Jackson music video that Kahn himself directed. The Library of Congress added Before Sunrise to the National Film Registry earlier this year. It’s now one of 925 films the government considers worth keeping forever. Linklater turned a single night with a woman he’d never see again into an 18-year trilogy. He just didn’t know the “never” part when he started.
Joseph Kahn@JosephKahn

Before Sunrise is unwatchable because it's so good. It's such a beautiful encapsulation of young iPhoneless love, shot on analog film, a nineties time machine of a mysterious Europe that no longer exists, you feel like you are dying as you watch it.

75
1.9K
14.7K
1.2M
wiynt retweeted
Steve Jurvetson
Steve Jurvetson@FutureJurvetson·
The ASML book author saw the next generation – Lace Lithography, using helium atoms shooting through a holographic mask to scale beyond what’s possible with light, where the wavelength is larger than atomic scale. “ASML is the only company capable of using EUV—extreme ultraviolet light—to print ultra-fine chip patterns at the 2-nanometer level desired by Musk. Last year, ASML produced 48 of these EUV machines. While there are plans to ramp up production, a rapid doubling of output is not in the cards—ASML’s suppliers simply cannot manage that pace.” “Lace Lithography is developing a lithography machine capable of printing chip circuitry using helium atoms. ‘Where light ends, atoms begin,’ says Bodil Holst, founder of this start-up based in Bergen, Norway. The wavelength of light determines the precision with which one can 'print.' Think of it like making a tiny drawing: you would much rather use a fine-tipped pen than a blunt carpenter's pencil. EUV employs a wavelength of 13.5 nanometers and further narrows that beam using mirrors. The 'beam' of helium atoms, however, is less than one-tenth of a nanometer wide, allowing it to draw with far greater intricacy.” “In a nutshell: Lace propels energized helium atoms—each carrying an extra electrical charge—through a mask perforated with tiny holes. The atoms that pass through unimpeded strike the photosensitive layer of a silicon wafer, thereby etching the desired pattern. That perforated mask reminded Bodil Holst of *kantklossen* (known in English as 'lace making'), which is why she named her company just that. A test rig is currently operational at the Lace laboratory in Bergen” “The foundation for this ‘atomic approach’ dates back to the 1990s, but neither the timing nor the technology was ripe for it at the time. This is because extensive computation is required to design the perforated diffractive mask in such a way that the chip's circuitry remains accurate. 
This can only be achieved with the aid of AI using high-speed chips—explains Adrià Salvador Palau. ‘Consequently, without the powerful chips made by EUV machines, we would never have been able to solve this problem.’" — Translated from the Dutch original: nrc.nl/nieuws/2026/04…
Steve Jurvetson tweet media
Steve Jurvetson@FutureJurvetson

𝐅⃣𝐎⃣𝐂⃣𝐔⃣𝐒⃣ The ASML Way I just finished this history of the most important semiconductor equipment company in the world, as translated from the Dutch original (and lurking in the background might be a better way). Reminder: ASML builds 100% of the world’s extreme ultraviolet (EUV) lithography machines, without which cutting edge chips are simply impossible to make. It’s the most expensive mass-produced machine tool in history. Oh, and today, there are two special women without whom all EUV lithography would sputter to a stop (see p.141 below). ASML was formed in 1984 as a JV with Philips, the Dutch electronics company that contributed ~$15M (in guilders) and 40 engineers, and “it seemed doomed from the start.” (p.35) There were 10 viable competitors at the time, more than enough to serve the market, as ASML learned at SEMICON in 1984 (by coincidence, I was also there with my Dad, who was about to leave Mostek to run Varian’s Semiconductor Equipment Group, but they only had Molecular Beam Epitaxy, a low throughput lithography alternative. My Dad’s attempt to poach a CTO from ASML is on p.72). “In these initial years, management worked around the clock to bring in new subsidies. In these initial years, about half of ASML’s money for research came from The Hague or Brussels.” (48) ASML’s “machines were the first in the industry to utilize modular design. The lens, the wafer-table, the frame for the mask, the light source, the robot that picks the wafers: these are LEGO blocks that, when you bring them together, form a lithography system.” (62) IPO in 1995. Stock went up 600x in the 30 years that followed. March 2000 market crash: “cancellations from chip manufacturers poured in daily. On paper, the company was bankrupt. Radical cost-cutting measures would be needed.” (82) Nikon sues: “a rude awakening.
ASML had paid far too little attention to its intellectual property in its early years.” (98) “The best inventors, some of which have more than 200 patents to their name, are commemorated by having their faces engraved on silicon wafers and hung on a series of large wooden beams, like a Mount Rushmore of the chip industry. As of 2023, ASML has registered more than 16,000 patents.” (99) The machines are insanely sensitive. “Atmospheric pressure fluctuations due to thunderstorms can easily disrupt the lithography process. Or cows. Intel once faced an inexplicable drop in yield every night for a few hours, with researchers running in circles until they finally realized the cause: cow farts. Intel had to pay for three farms to relocate.” (117) “In 2006 Intel, who was supplying the chips for Apple’s computers, was asked if it could also supply the processor for the iPhone. It declined.” (122) “EUV light is extremely difficult to generate and sustain in an industrial environment. The invisible rays are absorbed by almost all materials, even the air, which means the lithography machine needs to have (curved, atomically precise) mirrors instead of lenses and can only operate in a vacuum.” (127) The Cymer laser / light source has a molten tin “droplet generator capable of forming a 30-micron droplet of tin at a rate of 50,000 times per second. The laser was rigged to deal two separate blows. First, a gentle tap to flatten the droplet into a pancake-like shape, followed by an intense blast that heated the tin to 200,000 degrees, transforming it into a plasma.” (130) “During its journey through the lithography machine, the light beam comes across 10 mirrors, each absorbing 30% of the light. It starts with 1.5 megawatts from the grid that yields 30 kilowatts in the laser, and that creates 100 watts of EUV light. Of this, about 1 watt ends up on the wafer. But more power also creates more heat. 
That causes the mirrors to expand, which in turn causes small deviations that immediately need to be corrected with small motors. Even the EUV mask, which carries the blueprint of the chip on it, is itself an extremely sensitive mirror.” (132) “ASML was vastly underestimating the financial consequences of the new technology. In retrospect, this was for the best. No respectable CEO would sign for a project that would take 20 years, without any promise of success or interim profit to carry it through. That’s not taking a bet, that’s bananas. This is also why the Japanese competition dropped out of the race: not because their engineers were any less capable, but because Nikon and Canon were simply not prepared to continue pumping so much money into EUV.” (133) To finance the purchase of Cymer in 2012, “Intel invested 3.3B Euros into ASML in exchange for 15% of the shares. TSMC was required to purchase 5%... and Samsung acquired a stake at the 11th hour, taking 3%.” (139) “Only Joann and one of her colleagues have the ability to wind and solder invisibly small wires (around the nozzle that shoots the tin droplets). It’s a delicate task few could ever master. ‘Even watchmakers can’t do this,’ says their awestruck boss, ‘and there’s no way to automate it.’ It’s not a trivial matter: the nozzle regularly gets clogged during day-to-day use in the chip factory. When that inevitably happens, the only thing to do is to swap it out for a new one. It’s hard to imagine, but without the fingers of Joann and her colleague, the EUV machines at Samsung and TSMC would grind to a halt.” (141) In 2013, “most of the droplet generator was still hand-made by Cymer, and it was virtually impossible to test the part in advance. This made for completely unpredictable yields: in the initial phase of production, half of the droplet generators didn’t even work.” (142) “20% of the South Korean economy now relies on the revenue of one single company. 
Hence their nickname: this is the republic of Samsung.” (156) “Intel was being surpassed by their competitors in Asia on every front and would only start using EUV for chips after 2023.” (160) “The descriptions that chip manufacturers use for these technological generations or ‘nodes’ need to be taken with a grain of salt. The physical dimensions of the smallest circuits and connections on the chip are, in practice, 5 to 10 times larger than advertised. A nanometer was once a nanometer, but accuracy has never stopped a good marketing slogan.” (161) Cousins “Lisa Su and Jensen Huang, the leaders of AMD and NVIDIA were both born in Tainan, the city where TSMC now produces their chips.” (164) “The culture at TSMC is more hierarchical than ASML, but less militaristic than in South Korea.” (166) “TSMC now commands 60% of the entire foundry market, making it 4x larger than its closest competitor, Samsung.” (167) “ASML’s next generation of EUV machines goes by the nickname High NA (the numerical aperture increases from 0.35 to 0.55). These colossal scanners span 14 meters and feature large mirrors up to a meter wide. The optical system by itself consists of 20,000 parts and weighs 12 tons, making it 7x heavier than the optics for the current EUV machine.” (175) “The High NA system weighs 150 tons and costs 400M Euros. It takes 7 cargo planes to ship this system to customers.” (225) “The production of a complex EUV mask costs more than a half million Euros and takes a huge amount of time to calculate.” (181) They “use AI to understand the interplay between the light beam, the mask, and the chemical reactions on the wafer.” ASML’s CTO calls it “voodoo software.” (183) China: “European governments fear China is transforming into a totalitarian state, capable of forcing Chinese multinationals to spy for the Communist Party. 
And that poses significant risk to the 5G cellular infrastructure of the West.” (200) “In 2017, Chinese customers ordered 700M Euros worth of lithography machines, a new record. Hundreds of ASML’s scanners were running in the factories of SMIC, China’s largest foundry” (201) “EUV is controlled by the Wassenaar Arrangement, the multilateral export control regime on conventional arms and dual-use goods and technologies.” (203) “As far as ASML is concerned, fears about EUV being used for military applications are baloney. Most chips found in weapons are ‘off-the-shelf’ chips that can also be found in laptops, washing machines or cars, and are easy to purchase anywhere in the world. But the U.S. sees things differently. They fear the emergence of Chinese AI and cyber weapons. And there is one thing those all need: advanced chips.” (205) “In January 2020, the U.S. asked the Netherlands to block EUV exports, and suddenly ASML found itself in the spotlight. The Netherlands ultimately denied ASML a license… No EUV machine was going to SMIC.” (208) In 2023 “ASML was exporting far more older DUV machines to China than had been expected. Almost half of ASML’s revenue was coming from China. As the chip industry was pushing the pause button, China kept on hoarding. The U.S. pressed the Netherlands to slam the brakes before January 2024, and the cabinet duly revoked several approved export licenses for ASML machines destined for China.” (234) “As China is growing increasingly isolated, so too is the likelihood of a fully-fledged Chinese competitor emerging in the rearview mirror capable of developing an independent chip production chain.” (236) “ASML takes this seriously.
Their go-to response: ‘The laws of nature are the same anywhere.’ What was achieved in Brabant, could be achieved in Beijing.” (335) “To qualify for government aid (in Biden’s Chips Act), companies had to agree not to build advanced chip foundries in China or other ‘countries of concern.’” (239) “The chip shortage had been a wakeup call, and the nightmare scenario was front and center on everyone’s mind: if China blocks Taiwan, we’ll be without chips within two weeks.” (242) “The estimated percentage of people with autism or ADHD at ASML far outnumbers the average. The highly specialized work, revolving around focusing on complex problems that require prolonged attention to the smallest details, makes it well-suited for some autistic traits. ASML’s CTO and President Van den Brink makes no secret about being dyslexic and actively advocates for targeting this neurodiverse group. They are precisely the analytical and creative thinkers ASML needs, but also often the ones who find it difficult to put themselves in other people’s shoes.” (287) Sounds like teen spirit… of Steve Jobs: “Van den Brink’s power of persuasion lies in his childlike enthusiasm. It works like some kind of reality distortion field. Martin can disrupt your perspective until you’re convinced that you can make the impossible possible.” (321) “Van den Brink never really led a big company. He guided it like a startup, as if it were a defiant toddler in the body of a mature multinational.” (329) The book ends with the poignant handover of the company in 2024 to a new leader, the Frenchman Christophe Fouquet.

25
177
1.2K
122.1K
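The power budget the book quotes (1.5 MW from the grid, 30 kW at the laser, 100 W of EUV, 10 mirrors each absorbing 30%) is easy to sanity-check. The mirrors alone leave roughly 2.8 W, so the quoted ~1 W on the wafer implies further losses beyond the mirror train (the book does not itemize them). A rough sketch, with all input figures taken from the quote:

```python
# Sanity check of the EUV power chain quoted from the ASML book.
grid_w = 1.5e6    # wall-plug power drawn from the grid (W)
laser_w = 30e3    # drive laser output (W)
euv_w = 100.0     # EUV light created at the tin plasma (W)

mirror_reflectivity = 0.7   # each mirror absorbs ~30% of the light
n_mirrors = 10

# Light surviving the 10-mirror train inside the scanner.
after_mirrors_w = euv_w * mirror_reflectivity ** n_mirrors
wall_plug_eff = after_mirrors_w / grid_w

print(f"grid -> laser: {laser_w / grid_w:.0%}")
print(f"EUV after {n_mirrors} mirrors: {after_mirrors_w:.2f} W")  # ~2.82 W
print(f"overall wall-plug efficiency: {wall_plug_eff:.1e}")
```

The gap between ~2.8 W after the mirrors and the quoted ~1 W on the wafer is consistent with the mask itself being a lossy mirror, as the excerpt notes.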
wiynt
wiynt@wiynt·
@kelxyz_ amazing guy, but not skilled with interviews under pressure. bittensor principles are beautiful but the timeframe required for subnets to create competitive products is becoming too long imo for people to continue subsidizing the network buying tao. tough interview i’d say
0
0
0
173
kel.
kel.@kelxyz_·
probability of const interview having a murad 2049 effect on subnets is mispriced
11
1
64
6.3K
wiynt
wiynt@wiynt·
@AlgodTrading no doubt about the growing quality of people joining the eco. but subnets output continue to underperform vs other opensource, inference free on every laptop soon, model training cost going down, ridges not beating claude/cursor etc. how long will people subsidize by buying tao?
0
0
0
156
Algod
Algod@AlgodTrading·
Bittensor obviously has flaws:
- a few issues with the incentive mechanism
- overall subnet quality could be better
- multi-consensus could open up more use cases

That being said, the quality is increasing at a rapid pace, with many people from frontier labs and elsewhere starting to build and compete. It's one of the few narratives in crypto that actually makes sense. Show me another project in crypto with the same potential and development activity as Bittensor
51
78
521
33.2K
Kyle Samani
Kyle Samani@KyleSamani·
I'm debating @Jason next week about TAO What is everything I need to know about TAO going into the debate? Give me the good, the bad, and the ugly please!
178
20
472
203.1K
wiynt retweeted
thiccy
thiccy@thiccyth0t·
The best part of accumulating wealth is that you can start saying increasingly insane things and people begin treating them as insight instead of retardation
100
736
8.1K
480.4K
wiynt retweeted
Chiefingza
Chiefingza@chiefingza·
quantum will get fixed, but saylor is bar none the most bearish thing about BTC
9
1
120
8.1K
wiynt retweeted
PrismML
PrismML@PrismML·
Today, we are emerging from stealth and launching PrismML, an AI lab with Caltech origins that is centered on building the most concentrated form of intelligence. At PrismML, we believe that the next major leaps in AI will be driven by order-of-magnitude improvements in intelligence density, not just sheer parameter count. Our first proof point is the 1-bit Bonsai 8B, a 1-bit weight model that fits into 1.15 GB of memory and delivers over 10x the intelligence density of its full-precision counterparts. It is 14x smaller, 8x faster, and 5x more energy efficient on edge hardware while remaining competitive with other models in its parameter class. We are open-sourcing the model under the Apache 2.0 license, along with Bonsai 4B and 1.7B models. When advanced models become small, fast, and efficient enough to run locally, the design space for AI changes immediately. We believe in a future of on-device agents, real-time robotics, offline intelligence, and entirely new products that were previously impossible. We are excited to share our vision with you and to keep pushing the frontier of intelligence to the edge.
PrismML tweet media
174
590
4.1K
1.3M
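The headline memory figure in the announcement checks out with simple arithmetic: 8B weights at 1 bit each is 1 GB of raw weights, so the reported 1.15 GB leaves ~0.15 GB for whatever is kept at higher precision (my guess; the post doesn't say), and an fp16 counterpart at 16 GB is ~14x larger, matching the "14x smaller" claim:

```python
params = 8e9                          # 8B parameters
one_bit_gb = params * 1 / 8 / 1e9     # 1 bit per weight -> bytes -> decimal GB
fp16_gb = params * 16 / 8 / 1e9       # fp16 counterpart, 16 bits per weight

print(one_bit_gb)                # 1.0 GB of raw 1-bit weights
print(fp16_gb)                   # 16.0 GB at fp16
print(round(fp16_gb / 1.15, 1))  # 13.9, matching the "14x smaller" claim
```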
wiynt retweeted
nic carter
nic carter@nic_carter·
this might seem like a reasonable take, but it's actually unreasonable and wildly risky. as the Google paper explicitly says (see below), and others note, you will not get warning. the warning is what you are getting now. this is your warning. once logical qubits start to meaningfully scale, you will go from cracking 5 bits to 256 bits very quickly. if you're wagering the future of a trillion dollar asset on QC development following a slow, predictable, linear path, with public milestones, you are taking an enormous risk. Google paper: "progress in quantum computing is better understood using a threshold model rather than in terms of the number of physical qubits" "there may be little time between the breaking of 32-bit ECDLP and the breaking of 256-bit ECDLP. Furthermore, the community should not expect to see published demonstrations of the most advanced quantum error-correction architectures and quantum algorithms deployed to cryptanalytic problems" the field is already self-censoring to avoid giving tools to black hats (again, see the Google paper). progress will happen in secret. additionally, as we all know, it will take a minimum of several years to fully transition. a signature scheme must be chosen, tested, deployed, soft forked, and adopted (this has to happen for every single address, which alone will take months). thus, you have to start acting several years before Q-day. unfortunately, the premise that almost all Bitcoin devs seem to be relying on, that the development of a CRQC will happen slowly, publicly, and predictably, is false. Bitcoin devs will have to do something uncomfortable: act under conditions of uncertainty and make difficult and costly tradeoffs _before_ the threat seems clearly apparent. we will not have the luxury of waiting until an N-bit key is broken. at that point, we will have weeks or months, not the years required. this means that the testing and deployment of a PQ signature scheme on Bitcoin has to occur in 2026.
nic carter tweet media
24
60
616
112.9K
wiynt retweeted
Max the VC 👨‍🚀
Max the VC 👨‍🚀@mreiffy·
Google is basically saying: “We’ve cut the quantum resources needed to break Bitcoin’s encryption by 20x. We can now break it. We can prove it. We’re just not going to tell you how. We’ve slowed down research to give crypto a chance. You have until 2029 to figure out a solution. Good luck.”
nic carter@nic_carter

Many are wondering "what Google saw" that caused them to revise their post-quantum cryptography transition deadline to 2029 last week. It was this: research.google/blog/safeguard…

618
1.7K
19.1K
3.7M
wiynt retweeted
Justin Drake
Justin Drake@drakefjustin·
Today is a momentous day for quantum computing and cryptography. Two breakthrough papers just landed (links in next tweet). Both papers improve Shor's algorithm, infamous for cracking RSA and elliptic curve cryptography. The two results compound, optimising separate layers of the quantum stack. The results are shocking. I expect a narrative shift and a further R&D boost toward post-quantum cryptography. The first paper is by Google Quantum AI. They tackle the (logical) Shor algorithm, tailoring it to crack Bitcoin and Ethereum signatures. The algorithm runs on ~1K logical qubits for the 256-bit elliptic curve secp256k1. Due to the low circuit depth, a fast superconducting computer would recover private keys in minutes. I'm grateful to have joined as a late paper co-author, in large part for the chance to interact with experts and the alpha gleaned from internal discussions. The second paper is by a stealthy startup called Oratomic, with ex-Google and prominent Caltech faculty. Their starting point is Google's improvements to the logical quantum circuit. They then apply improvements at the physical layer, with tricks specific to neutral atom quantum computers. The result estimates that 26,000 atomic qubits are sufficient to break 256-bit elliptic curve signatures. This would be roughly a 40x improvement in physical qubit count over previous state-of-the-art. On the flip side, a single Shor run would take ~10 days due to the relatively slow speed of neutral atoms. Below are my key takeaways. As a disclaimer, I am not a quantum expert. Time is needed for the results to be properly vetted. Based on my interactions with the team, I have faith the Google Quantum AI results are conservative. The Oratomic paper is much harder for me to assess, especially because of the use of more exotic qLDPC codes. I will take it with a grain of salt until the dust settles. → q-day: My confidence in q-day by 2032 has shot up significantly.
IMO there's at least a 10% chance that by 2032 a quantum computer recovers a secp256k1 ECDSA private key from an exposed public key. While a cryptographically-relevant quantum computer (CRQC) before 2030 still feels unlikely, now is undoubtedly the time to start preparing. → censorship: The Google paper uses a zero-knowledge (ZK) proof to demonstrate the algorithm's existence without leaking actual optimisations. From now on, assume state-of-the-art algorithms will be censored. There may be self-censorship for moral or commercial reasons, or because of government pressure. A blackout in academic publications would be a tell-tale sign. → cracking time: A superconducting quantum computer, the type Google is building, could crack keys in minutes. This is because the optimised quantum circuit is just 100M Toffoli gates, which is surprisingly shallow. (Toffoli gates are hard because they require production of so-called "magic states".) Toffoli gates would consume ~10 microseconds on a superconducting platform, totalling ~1,000 sec of Shor runtime. → latency optimisations: Two latency optimisations bring key cracking time to single-digit minutes. The first parallelises computation across quantum devices. The second involves feeding the pubkey to the quantum computer mid-flight, after a generic setup phase. → fast- and slow-clock: At first approximation there are two families of quantum computers. The fast-clock flavour, which includes superconducting and photonic architectures, runs at roughly 100 kHz. The slow-clock flavour, which includes trapped ion and neutral atom architectures, runs roughly 1,000x slower (~100 Hz, or ~1 week to crack a single key). → qubit count: The size-optimised variant of the algorithm runs on 1,200 logical qubits. On a superconducting computer with surface code error correction that's roughly 500K physical qubits, a 400:1 physical-to-logical ratio. The surface code is conservative, assuming only four-way nearest-neighbour grid connectivity. 
It was demonstrated last year by Google on a real quantum computer. → future gains: Low-hanging fruit is still being picked, with at least one of the Google optimisations resulting from a surprisingly simple observation. Interestingly, AI was not (yet!) tasked to find optimisations. This was also the first time authors such as Craig Gidney attacked elliptic curves (as opposed to RSA). Shor logical qubit count could plausibly go under 1K soonish. → error correction: The physical-to-logical ratio for superconducting computers could go under 100:1. For superconducting computers that would mean ~100K physical qubits for a CRQC, two orders of magnitude away from state of the art. Neutral atom quantum computers are amenable to error correcting codes other than the surface code. While much slower to run, they can bring down the physical-to-logical qubit ratio closer to 10:1. → Bitcoin PoW: Commercially-viable Bitcoin PoW via Grover's algorithm is not happening any time soon. We're talking decades, possibly centuries away. This observation should help focus the discussion on ECDSA and Schnorr. (Side note: as unofficial Bitcoin security researcher, I still believe Bitcoin PoW is cooked due to the dwindling security budget.) → team quality: The folks at Google Quantum AI are the real deal. Craig Gidney (@CraigGidney) is arguably the world's top quantum circuit optimisooor. Just last year he squeezed 10x out of Shor for RSA, bringing the physical qubit count down from 10M to 1M. Special thanks to the Google team for patiently answering all my newb questions with detailed, fact-based answers. I was expecting some hype, but found none.
330
1.2K
5.8K
1.5M
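The runtime and qubit estimates in the thread above are internally consistent, and reproducing the arithmetic makes the "minutes vs. ~10 days" and "roughly 500K physical qubits" claims concrete (all inputs are figures stated in the thread; the arithmetic is mine):

```python
# Cracking-time and qubit-count arithmetic from the thread's own figures.
toffoli_gates = 100_000_000   # ~100M Toffoli gates in the optimised circuit
gate_time_us = 10             # ~10 microseconds per Toffoli (superconducting)

runtime_us = toffoli_gates * gate_time_us
fast_runtime_s = runtime_us / 1_000_000
print(fast_runtime_s)         # 1000.0 s of Shor runtime, i.e. ~17 minutes

# Slow-clock platforms (trapped ion, neutral atom) run ~1000x slower.
slow_runtime_days = fast_runtime_s * 1000 / 86_400
print(round(slow_runtime_days, 1))   # 11.6 days, consistent with "~10 days"

logical_qubits = 1200         # size-optimised variant of the algorithm
surface_code_ratio = 400      # 400:1 physical-to-logical (surface code)
print(logical_qubits * surface_code_ratio)   # 480000, "roughly 500K"
```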
wiynt retweeted
nic carter
nic carter@nic_carter·
Many are wondering "what Google saw" that caused them to revise their post-quantum cryptography transition deadline to 2029 last week. It was this: research.google/blog/safeguard…
97
956
7.4K
6.9M
wiynt retweeted
Science Magazine
Science Magazine@ScienceExpand·
The patterns of trigonometry.
140
5.1K
42.5K
5.1M
wiynt retweeted
Antipodean Empire 🇦🇺
Antipodean Empire 🇦🇺@AntipodeEmpire·
Lee Kuan Yew: I would work with the Australians and New Zealanders but not the Americans
30
180
2K
412.1K
wiynt retweeted
DAN KOE
DAN KOE@thedankoe·
Competition is largely an illusion. 95% of people don't even try to do great things. 0.1% of the people are loud, so you overestimate how many people there are. The rest get stuck worrying about competition and quitting after 2 weeks.
543
1.6K
12.7K
423.7K
wiynt retweeted
TBPN
TBPN@tbpn·
Former Tesla President @jonmcneill says Elon Musk kept employees motivated post-IPO by "starving the balance sheet": "Even after we were public, we operated Tesla on a quarter's worth of cash." "I kept saying to Elon, 'I would like a little breathing room.' He's like, 'No. We've got to think like we're young entrepreneurs: if you're two steps from death—you operate differently.'" "I was like, 'Man, we have a quarter's worth of cash, but we have 70 days of payables. That means we have less than three weeks of cash. This is tight.' But that kept everybody sharp." "The experience of working in a Musk company is—you are literally on the biggest challenge you've ever faced in your life, with the best people, and you're doing the best work of your life. That's what keeps most people engaged—it's not the balance in their bank account."
34
143
2.3K
224.2K
wiynt retweeted
stevibe
stevibe@stevibe·
Which local models can actually handle tool calling? I built a framework to find out. 15 scenarios. 12 tools. Mocked responses. Temperature 0. No cherry-picking. Tested every Qwen3.5 size from 0.8B to 397B, and since some of you asked after the distillation tests: yes, I included Jackrong's Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled too. Only two models went all green: the 27B dense and the distilled 27B. The 397B? Failed two tests. The 122B? Failed one. The 35B? Failed two. The timed-out results, mostly on the smaller models, are cases where the model got stuck in a loop, repeating the same tool call until it hit the 30-second limit. The test that exposed the most models: "Search for Iceland's population, then calculate 2% of it." Simple, but 35B, 122B, and 397B all used a rounded number from memory instead of the actual search result. They didn't trust their own tool output. Small models hallucinate data. Big models ignore data. The 27B just threaded it through.
110
240
1.9K
392K
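The Iceland scenario above illustrates a general pattern for testing tool use: mock the tools, seed them with a deliberately non-round figure, and check whether the model's final answer was derived from the tool output or from memory. A minimal sketch of that idea; the tool names, the population figure, and the checker are all hypothetical, since the author's actual framework is not shown in the post:

```python
# Sketch of a mocked tool-calling scenario in the spirit of the thread.
MOCK_TOOLS = {
    # A deliberately non-round figure, so a model answering from memory
    # with a rounded number is distinguishable from one using the tool.
    "web_search": lambda query: {"result": "Iceland population: 387,758"},
    "calculator": lambda expression: {"result": eval(expression)},  # mock only
}

def run_tool(name, **kwargs):
    """Dispatch a tool call against the mocked implementations."""
    return MOCK_TOOLS[name](**kwargs)

def check_iceland_scenario(tool_calls, final_answer):
    """Pass only if the model searched AND computed 2% of the returned
    figure, rather than 2% of a memorized round number."""
    used_search = any(c["name"] == "web_search" for c in tool_calls)
    expected = 387_758 * 0.02   # 2% of the mocked search result
    return used_search and abs(final_answer - expected) < 1.0

# A trace a well-behaved model might produce:
trace = [
    {"name": "web_search", "args": {"query": "Iceland population"}},
    {"name": "calculator", "args": {"expression": "387758 * 0.02"}},
]
answer = run_tool("calculator", expression="387758 * 0.02")["result"]
print(check_iceland_scenario(trace, answer))          # True: used the tool output
print(check_iceland_scenario(trace, 380_000 * 0.02))  # False: rounded from memory
```

Running each scenario at temperature 0 against fixed mocks, as the thread describes, makes pass/fail deterministic and loop timeouts easy to attribute.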