Tactics/os
@Tacticsos

3.4K posts
Joined March 2017
96 Following · 467 Followers
Tactics/os retweeted
ChrisO_wiki
ChrisO_wiki@ChrisO_wiki·
1/ The world faces a catastrophic cliff-edge shortage of oil due to the Strait of Hormuz blockade in the next four weeks, analysts warn. This will cause a deep recession, fuel rationing, the shutdown of entire industries, and oil prices potentially as high as $370 per barrel. ⬇️
111
1.1K
3.5K
572.2K
Tactics/os retweeted
Qalaat Al Mudiq
Qalaat Al Mudiq@QalaatAlMudiq·
According to the Jerusalem Post, Israel is concerned that #Syria has begun rebuilding its military capabilities, incl. air defenses. jpost.com/israel-news/de…
Qalaat Al Mudiq@QalaatAlMudiq

#Syria: the radar recently supplied by Turkey to #Damascus International Airport is an ASELSAN’s "HTRS‑100". This advanced air traffic control system, capable of detecting aircraft at distances of over 150 km, has already raised concerns in Israel. jpost.com/middle-east/ar…

11
60
272
118.4K
Tactics/os
Tactics/os@Tacticsos·
@Dorialexander I'm still waiting for an actual Roman Empire model. What's the point of figuring out synth if we can't use it to make a Roman LLM?
1
0
1
56
Alexander Doria
Alexander Doria@Dorialexander·
Still my roman empire (more to come, maybe)
3
1
24
820
Tactics/os retweeted
Daniel Litt
Daniel Litt@littmath·
Still underrated how uneven frontier models are within math, IMO. I’ve recently been reading through some of the more interesting solutions to Erdős problems and quite enjoying them—here the models are reliably executing nontrivial ideas, combining known techniques, etc. But…
22
26
599
79.8K
Tactics/os retweeted
Yevardiaղ
Yevardiaղ@haravayin_hogh·
By 1994-5, the collapse of the 🇷🇺 economy had nearly bottomed out (after 10 years of decline). Chernomyrdin managed to finally end shortages. But as average wages continued to fall, this was little consolation to most people. Stabilisation was uneven, as rural regions had no respite.
1
1
10
740
Tactics/os retweeted
Yevardiaղ
Yevardiaղ@haravayin_hogh·
Echoing recent USian election rhetoric, Russian liberals, despairing of nationalist & communist popularity, opined that "democracy" had to be destroyed in order to save it. Yeltsin's core base: ex-Soviet bureaucrats, whose administration had miraculously expanded even as the economy shriveled.
1
1
11
663
Tactics/os retweeted
Yevardiaղ
Yevardiaղ@haravayin_hogh·
End of 🧵: post and thoughts.

Incredibly vague as the referendum's phrasing was, the overwhelming majority of all republics' populations who participated voted to keep the Soviet Union. Gorbachev's newly-ratified Union treaty left huge ambiguities, whilst delegating most real sovereign power to the Republics. From this point on, the USSR was effectively dead. This juncture is in fact only a bit under 2/3rds through Zubok's book.

Communist hardliners found the Union's terms unacceptable, eventually staging a (nearly) bloodless and horribly botched coup that suddenly killed the USSR for good. Not to say there isn't much of great interest in the remainder of Zubok's book, but the events of 1991 feel more like an epilogue, or more accurately, the troubled birth of the Russian Federation.

I may post a few isolated excerpts. Of particular interest is how gauche & unlikeable both American & European national leaders found Yeltsin. Bush, Mitterrand, John Major and others withdrew their support of Gorbachev only reluctantly, as his powerlessness became impossible to hide. Towards the end of Zubok's book, the manner in which the Americans repeatedly humiliated Russia with one-sided treaties & denial of credit even drew criticism from worried Western European leaders, but to no avail.

Though he strains to keep his language neutral, Zubok clearly views the collapse of the USSR as a catastrophe, and despises Gorbachev's weakness as a leader. Most strikingly, Ukrainian independence was forced upon an elite and population that did not want it, with leaders like Kravchuk fearing the future wars that would come out of its artificial borders. Wars and ethnic cleansing in the Caucasus continue to this year, whilst much of Central Asia fell under personalist dictatorships. Gorbachev was a man with moral qualities that were admirable in an individual, but those same traits made him disastrous as a national leader.

End /🧵 @kaptenblu @chernayakoshka @Miyhnea
1
3
29
2.1K
Tactics/os retweeted
Yevardiaղ
Yevardiaղ@haravayin_hogh·
As Gorbachev continued to haemorrhage power to local authorities acting in the name of the Russian SSR & other republics, he worked hard on a constitution for a new, "voluntary" Soviet federation. Gorbachev read Lenin "in search of clues"; his aides warned that Yeltsin would take everything.
1
1
14
1.9K
Tactics/os
Tactics/os@Tacticsos·
@MazMHussain "And if 20% of the world's oil production suffers long-term damage, well, not our problem..."
0
0
9
120
Tactics/os
Tactics/os@Tacticsos·
@MazMHussain "Oil shut-in in Iran will cause catastrophic damage, but oil shut-in in Iraq, Kuwait, Qatar, Bahrain, UAE, and Saudi Arabia is not something we need to worry about."
1
1
33
1.1K
Alexander Doria
Alexander Doria@Dorialexander·
People probably need to update on the fact that synthetic pretraining lets you reliably memorize anything. Beyond 1T, it’s hardly signal.
Bojie Li@bojie_li

Closed labs hide model sizes. They can't hide what their models know, and what a model knows is an indicator of how big it is. Reasoning compresses; factual knowledge doesn't. So you can size a frontier model from black-box API calls alone, and across releases you can literally watch a single fact arrive in the parameters over time.

For three years, my friends Jiyan He and Zihan Zheng have been asking frontier LLMs the same question: "what do you know about USTC Hackergame?", a CTF contest. May 2024: GPT-4o invented fake titles. Feb 2025: Claude 3.7 Sonnet listed 19 verified 2023 challenges. By April 2026, frontier models recall specific challenges across consecutive years.

After DeepSeek-V4 dropped, I instructed my agent to spend four days autonomously turning that habit into Incompressible Knowledge Probes (IKP): 1,400 questions, 7 tiers of obscurity, 188 models, 27 vendors. Three findings:

1/ You can approximately size any black-box LLM from factual accuracy alone. Penalized accuracy is log-linear in log(params), R² = 0.917 on 89 open-weight models from 135M to 1.6T params. Project closed APIs onto the curve → GPT-5.5 ~9T, Claude Opus 4.7 ~4T, GPT-5.4 ~2.2T, Claude Sonnet 4.6 ~1.7T, Gemini 2.5 Pro ~1.2T (90% CI: 0.3-3x size).

2/ Citation count and h-index don't predict whether a frontier model recognizes a researcher. Two researchers with similar citation profiles get very different responses. Models memorize impact: work that shaped a field, not many incremental papers.

3/ Factual capacity doesn't compress over time. Across 96 open-weight models over 3 years, the IKP time coefficient is statistically zero, rejecting the Densing-Law prediction of +0.0117/month at p<10⁻¹⁵. Reasoning benchmarks saturate; factual capacity keeps scaling with parameters.

Website: 01.me/research/ikp/
Paper: arxiv.org/pdf/2604.24827
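The sizing method in finding 1/ (fit penalized accuracy against log-parameters on open-weight models, then invert the fit for a closed API's score) can be sketched roughly as follows. The data points and the 0.60 query score are invented placeholders for illustration, not the paper's numbers:

```python
import math

# Hypothetical (params_in_billions, penalized_accuracy) pairs for open-weight
# models. Illustrative values only, NOT the IKP paper's data.
open_models = [
    (0.135, 0.05), (1.5, 0.12), (7, 0.22),
    (70, 0.35), (405, 0.45), (1600, 0.53),
]

def fit_loglinear(points):
    """Least-squares fit of accuracy = a * log10(params) + b."""
    xs = [math.log10(p) for p, _ in points]
    ys = [acc for _, acc in points]
    n = len(points)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

def project_size(accuracy, a, b):
    """Invert the fit: estimate parameter count (billions) from accuracy."""
    return 10 ** ((accuracy - b) / a)

a, b = fit_loglinear(open_models)
# Project a hypothetical closed model scoring 0.60 onto the curve:
print(round(project_size(0.60, a, b)), "B params (illustrative)")
```

Because the fit is inverted through an exponential, small errors in the slope become multiplicative errors in the size estimate, which is one reason the reported confidence intervals span roughly 0.3-3x.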

10
8
164
20.8K
Tactics/os
Tactics/os@Tacticsos·
@Dorialexander That method of estimating model sizes produces drastically different results for models we know share the same base (i.e. 4o and o3), has massive confidence intervals, and would suggest that 3.1 flash is bigger than GPT-5.5.
0
0
3
526
Tactics/os
Tactics/os@Tacticsos·
@kalomaze DeepSeek's whole thing is that they're almost irrationally obsessed with finding more efficient architectures, which is something that distilling wouldn't help with at all.
0
0
3
55
kalomaze
kalomaze@kalomaze·
anyways, deepseek & kimi are fairly overwhelmingly not using heavy API-output-derivative synthetics in pretraining. it's fairly true for qwen, though
1
2
61
4.5K
Tactics/os retweeted
Gergely Orosz
Gergely Orosz@GergelyOrosz·
In the last month, Anthropic:
- Quietly nerfed their flagship model harness (Claude Code) without telling anyone
- Banned corporate customers of Claude
- Silently changed plans for customers with certain files in their repo

All evidence that closed models are *massive* risks.
124
222
3.2K
135.7K
Tactics/os
Tactics/os@Tacticsos·
@kalomaze That shouldn't be that surprising? And if you really cared and had an infinite amount of money to digitize and transcribe, there's an endless amount of unpublished stuff sitting in archives.
0
0
10
564
kalomaze
kalomaze@kalomaze·
what i find most interesting about this release is that you can approach ~1T raw tokens from *before WWII*, before synthetic augmentation or rephrasers or whatever. that's the floor. and a post-WWII model could have the colloquial talk radio archives too, if transcribed...
David Duvenaud@DavidDuvenaud

Announcing Talkie: a new, open-weight historical LLM! We trained and finetuned a 13B model on a newly-curated dataset of only pre-1930 data. Try it below! with @AlecRad and @status_effects 🧵

7
4
205
18K
Tactics/os retweeted
Omar El Fares 🇵🇸 عُمَر الفارس
Ukraine is imposing sanctions on Israel before the EU. Let that sink in.
Volodymyr Zelenskyy / Володимир Зеленський@ZelenskyyUa

In any normal country, purchasing stolen goods is an act that entails legal liability. This applies, in particular, to grain stolen by Russia. Another vessel carrying such grain has arrived at a port in Israel and is preparing to unload. This is not – and cannot be – legitimate business. The Israeli authorities cannot be unaware of which ships are arriving at the country’s ports and what cargo they are carrying.

Russia is systematically seizing grain on temporarily occupied Ukrainian land and organizing its export through individuals linked to the occupiers. Such schemes violate the laws of the State of Israel itself.

Ukraine has taken all necessary steps through diplomatic channels to prevent such incidents. However, we see that yet another such vessel has not been stopped. I have instructed the Ministry of Foreign Affairs of Ukraine to inform all partners of our state about the situation.

Based on information from our intelligence services, Ukraine is preparing a relevant sanctions package that will cover both those directly transporting this grain and the individuals and legal entities attempting to profit from this criminal scheme. We will also coordinate with European partners to ensure that the relevant individuals are included in European sanctions regimes.

Ukraine counts on partnership and mutual respect with every state. We are genuinely working to enhance security, particularly in the Middle East region. We expect that the Israeli authorities will respect Ukraine and refrain from actions that undermine our bilateral relations.

58
784
7.3K
250.2K
Tactics/os retweeted
BURKOV
BURKOV@burkov·
If you don't understand this, you will not understand why LLM-based agents are irreparably failing at general-purpose problem solving.

An agent, to be useful, must be rational (by the way, this was the topic of my PhD 20 years ago). Being rational means always preferring the outcome that yields the maximal expected utility for its master/user. Say an agent has two actions it can execute in an environment: a_1 and a_2. If the agent can predict that a_1 gives its user an expected utility of 10, and a_2 an expected utility of -100, then a rational agent must choose a_1 even if a_2 seems like the better option when explained in words. The numbers 10 and -100 are obtained by summing, over all possible outcomes of each action, the product of each outcome's utility and its likelihood.

Now here is the problem with LLM-based agents. The LLM is not optimizing expected utility in the environment. It is optimizing the next token, conditioned on a prompt, a context window, and a training distribution full of examples of what helpful answers are supposed to look like. Those are not the same objective.

So when we wrap an LLM in a loop and call it an “agent,” we have not created a rational decision-maker. We have created a text generator that can imitate the surface form of deliberation. It may say things like: “I should compare the expected outcomes.” “The best action is probably a_1.” “I will now execute the optimal plan.” But the internal mechanism is not selecting actions by maximizing the user’s expected utility. It is generating a continuation that is statistically appropriate given the prompt and prior context.

This distinction matters enormously. For narrow tasks, the imitation can be good enough. If the environment is constrained, the actions are simple, and the success criteria are close to patterns seen in training, the system can appear agentic. But for general-purpose problem solving, the gap becomes fatal.

A rational agent needs stable preferences, calibrated beliefs, causal models of the world, the ability to evaluate consequences, and the discipline to choose the action with maximal expected utility even when that action is boring, non-linguistic, or unlike the examples in its training data. An LLM-based agent has none of that by default. It has fluency. It has pattern completion. It has a remarkable ability to compress and recombine human text. But fluency is not rationality, and a plausible plan is not an expected-utility calculation.

This is why these systems so often fail in strange, brittle, and irreparable ways when given open-ended responsibility. They are not failing because the prompts are insufficiently clever. They are failing because we are asking a simulator of rational agency to be a rational agent.
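The expected-utility rule in the post can be sketched in a few lines. The (probability, utility) breakdowns below are hypothetical, chosen only so the two actions land on the post's expected utilities of 10 and -100:

```python
# Each action maps to hypothetical (probability, utility) outcome pairs.
actions = {
    "a_1": [(0.5, 40), (0.5, -20)],   # 0.5*40 + 0.5*(-20) = 10
    "a_2": [(0.9, -120), (0.1, 80)],  # 0.9*(-120) + 0.1*80 = -100
}

def expected_utility(outcomes):
    """Sum of probability * utility over all outcomes of one action."""
    return sum(p * u for p, u in outcomes)

def rational_choice(actions):
    """A rational agent picks the action with maximal expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))

print(rational_choice(actions))  # prints "a_1"
```

The point of the post is that this argmax over expected utilities is exactly the computation an LLM-based agent does not perform; it only generates text that resembles the deliberation.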
175
272
1.6K
197.4K
Tactics/os
Tactics/os@Tacticsos·
@provisionalidea "We spent years completely overhyping what our products can do but, trust us, they're actually useful this time."
0
0
4
82
James Rosen-Birch ⚖️🕊️
James Rosen-Birch ⚖️🕊️@provisionalidea·
the problem with the former complaining about the latter is that the latter are the people who are supposed to buy your product, and failing to understand their concerns and clearly communicate value to them is not *their* fault
Matt Turck@mattturck

Silicon Valley: AI is self-accelerating, agents run everything, old people are dumb

Global 2000: I spent a fortune on your AI chat 2 years ago and got zero productivity; my engineers like the coding AI thing but no one else cares; agents are scary

Never seen a gap this huge

2
2
23
2.7K