nikoster

91 posts

nikoster

@nikosters

https://t.co/yaT0AlhjWB

Saint Petersburg, Russia · Joined October 2021

109 Following · 6 Followers

nikoster@nikosters·31s

@atorixa00 I would have agreed with this claim a couple of years ago, but not now. Maybe it's still true for the average user, but for autists like me, who can spend a ton of time configuring every pixel on the display, desktop Linux is the best PC experience there is

atorixa🏳️‍⚧️@atorixa00·13h

linux for the desktop is the worst thing in existence

nikoster reposted

Ben Davis@davis7·13h

This benchmark is the first one I've seen that maps 1:1 to my experience
Almost to a degree where I'm scared to fully trust it since it so tightly maps to my existing opinions
I feel like I'm missing something

Datacurve@datacurve

Opus 4.8 is now on DeepSWE. On the default high thinking effort, it scores 6% higher than Opus 4.7 xhigh, while also lowering average cost per task.

nikoster reposted

Theo - t3.gg@theo·2h

I am thankful that OpenAI trained their models to be helpful assistants

HSVSphere@HSVSphere

What are they baking into claude, why did it respond like that ❓ Is this how people "fall for" the LLM? if so, sad It glazing me like that accomplishes nothing other than make me become skeptical & trust it less. I wanted to see how it responds to that message, knowing what ChatGPT models do, and I'm no longer surprised that the most retarded posters on here *love* Claude.

nikoster reposted

чип из багажника 🇷🇺@vibecod3r·1d

🏴‍☠️waverly🦇💋@methylene333

my classmates: listen to Morgen, get drunk, have sex, play Dota
me: study libertarianism, go to church, save myself for a decent wife
how am I supposed to survive in a society like this?

nikoster reposted

Osinachi@sin4ch·2d

I dream about having @theo's Excalidraw presentation skills.

nikoster@nikosters·1d

A new era of PC. 25.0528, 121.5990

nikoster@nikosters·1d

@uwukko we need to make a fork and call it huism

wukko@uwukko·1d

i named it helium prism in the end if anyone's curious

wukko@uwukko

coming up with a name for web components library is so hard

nikoster reposted

Theo - t3.gg@theo·2d

Struggling to pick what agent, model, and effort levels to use? Miss the "slot machine" feel of Claude Code when using other tools? `npx slotslop "[prompt]"`

nikoster reposted

ThePrimeagen@ThePrimeagen·2d

benchmarks are stupid for models
Just ignore them

nikoster reposted

wukko@uwukko·3d

nikoster reposted

scryo@scryocat·3d

discord has given us the e-cuck chair for even more immersive e-sex

Discord Previews@DiscordPreviews

Discord is adding Spatial Audio support for voice channels, so you can hear your friends as if you were talking next to each other!

nikoster reposted

Loskoron@Loskoron·3d

255.255.255.0

Дуккха@PublicDukkha

The most unsuitable mask for sexual role play?

nikoster reposted

maria@maria_rcks·2d

@adonis_singh i hope everyone drops swe bench pro and just starts using deepswe, swe bench pro is a joke atp

nikoster reposted

Haider.@haider1·3d

the reason why anthropic is still keeping "mythos" locked in the lab:
user: hey
mythos spends 3 minutes deciding whether "hey" could mean urgency, affection, or a threat
mythos: hey :)
api cost: $200

nikoster@nikosters·2d

3) it's cool that the computer use benchmark shows higher numbers, but I still won't use it until claude code becomes a proper harness. I won't comment on the other benchmarks, since kimi 2.6 is enough for my ordinary everyday tasks

nikoster@nikosters·2d

2) for some reason anthropic still hasn't built a proper harness for working in the terminal, as if coding and terminal work were the only use cases for frontier models

nikoster@nikosters·2d

okay, what can we take away from these benchmarks: 1) swe bench pro shows nothing, since gemini 3.1 scores 4 percent lower than gpt 5.5 (the difference in output quality between these models is not 4 percent, it's a full 50)

Claude@claudeai

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors. Available today at the same price.

nikoster@nikosters·2d

Can't wait to see cost of benchmark runs

Artificial Analysis@ArtificialAnlys

Anthropic just launched Claude Opus 4.8, and it is the new leader on our GDPval-AA benchmark for agentic real-world work tasks Opus 4.8 scored 1890 on GDPval-AA at launch with its 'max' effort setting, +137 points from Opus 4.7 and +121 points ahead of the next-best model, GPT-5.5 xhigh. Compared head-to-head on the GDPval task set, this implies a ~67% win rate against GPT-5.5 xhigh. @AnthropicAI shared access with us ahead of the public release to benchmark this model and we’re glad to see our benchmarks referenced in today’s launch. The rest of the Artificial Analysis Intelligence Index is in progress - we’ll share final results soon!
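
A quick sanity check of the "~67% win rate" figure quoted above. This is a minimal sketch that assumes GDPval-AA scores behave like Elo ratings on the standard 400-point logistic scale; the post does not state the formula, and the function name `elo_win_rate` is purely illustrative.

```python
def elo_win_rate(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win probability for the higher-rated side under the
    standard Elo logistic curve (an assumption, not confirmed by the post)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


# Gap quoted in the post: Opus 4.8 is +121 points ahead of GPT-5.5 xhigh.
win_rate = elo_win_rate(121)
print(f"{win_rate:.0%}")  # rounds to 67%
```

Under that assumption the +121 gap lands on roughly two wins out of three, which agrees with the number in the post.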

nikoster@nikosters·2d

@michalmalewicz I heard mandarin burns even less. Also Chinese models seem to work better with prompts in mandarin

Michal Malewicz@michalmalewicz·2d

Polish language burns the least tokens. Learn polish.

nikoster reposted

wukko@uwukko·2d

A fresh Brave install in 2026:
sponsored ad wallpapers on new tab page by default (opt out).
Brave VPN, News, Talk, Leo (AI), Rewards and other revenue-milking bloat is advertised/pinned by default.
Analytics and "phoning home" by default.
Google as default search engine in most regions by default.
Sponsored search engines like Russian Yandex in CIS countries by default: github.com/brave/brave-co…
Brave has an ad branch that handles advertising within the browser: brave.com/ads.
Brave does on-device ad targeting based on cohorts and interests, just like what Chrome used to do and what Google was largely hated for (remember FLoC?). This applies to additional (opt-in) rewarded ads, shipped as part of Brave.
Brave has injected referral IDs to crypto-related URLs entered into the omnibox in the past, intentionally, by design: x.com/CR1337/status/… github.com/brave/brave-br… reddit.com/r/privacytools…
Brave also uses dark patterns to drive users away from turning off ads in their browser. For example, an article linked from the "opt out" button in the browser has a wall of text making excuses for ads before the actual steps needed to be taken to disable them: support.brave.app/hc/en-us/artic…
kind of hypocritical for brave to judge firefox for lesser bullshit, don't you think?
