DrSinister

4.6K posts

@prof_sinister

AI, Art. nonArtificial Intelligence at @exploreKeros 🧠 at day @rudzinskimaciej

Joined February 2022
437 Following · 897 Followers
Pinned Tweet
DrSinister@prof_sinister·
I'm young on the scene in comparison, as I started in spring 2020 😉 That's my favourite from spring 2021, I guess. Modifying how VQGAN works became my post-COVID therapy 🤷 as that was the only thing I was able to do: one modification, then testing x.com/bit__flip/stat…
Visual Synthesizer - Daniel Joy Grimes@visual_synth

@prof_sinister @nvnot_ Not sure how far back we want to go, though. My first text-to-image was AttnGAN, back in 2018 or so. It was great for inspiration but a bit limited. Sleep well, look forward to your story tomorrow :-) "Adversarial Examples" 2018

DrSinister@prof_sinister·
Banana used to refurbish an old Polish painting by Witold Wojtkiewicz
[image]
DrSinister retweeted
proxima centauri b@proximasan·
was she the first case of LLM psychosis?
[image]
DrSinister@prof_sinister·
@sureailabs BBC? Nature, The Economist. It's getting worse every year.
sure, ai@sureailabs·
What are your most credible mainstream news outlets right now? Bonus points if they are Canadian and/or not US-based.
DrSinister@prof_sinister·
@ArtificialAnlys Where's Qwen 3 Max? That's currently the flagship SOTA Chinese model - it would be good to know where it stands.
Artificial Analysis@ArtificialAnlys·
DeepSeek’s updated V3.1 Terminus ties with gpt-oss-120b (high) as the most intelligent open weights model and offers increased instruction following and long context reasoning capabilities 🧠

Our benchmarking results indicate DeepSeek V3.1 Terminus shows a greater intelligence uplift over DeepSeek V3.1 in reasoning mode compared to non-reasoning mode:
➤ DeepSeek V3.1 Terminus scores 58 in reasoning mode on the Artificial Analysis Intelligence Index, up from V3.1’s score of 54 in reasoning mode. The largest improvements are seen in instruction following (up 15 percentage points on IFBench), long context performance (up 12 p.p. on AA-LCR) and agentic coding & terminal use (up 4 p.p. on Terminal-Bench Hard).
➤ In non-reasoning mode, DeepSeek V3.1 Terminus achieves a score of 46, a slight increase over the earlier V3.1 score of 45.

🤖 Other benchmarking takeaways:
➤ Function calling / tool use: Similar to DeepSeek V3.1, V3.1 Terminus does not support function calling when in reasoning mode. This is likely to substantially limit its ability to support agentic workflows with intelligence requirements, including in coding agents.
➤ Token usage: DeepSeek V3.1 Terminus scores higher in reasoning mode than V3.1, and uses more tokens across the evals in the Artificial Analysis Intelligence Index (67M for V3.1 Terminus in reasoning mode vs. 63M for V3.1 in reasoning mode). In non-reasoning mode, V3.1 Terminus uses fewer tokens than V3.1 (11M and 14M respectively). Both V3.1 Terminus and V3.1 use fewer tokens in reasoning mode than DeepSeek’s earlier R1 and R1 0528 reasoning models.
➤ Availability: DeepSeek’s first-party API now serves the new DeepSeek V3.1 Terminus model on both their chat and reasoning endpoints.
➤ Architecture: DeepSeek V3.1 Terminus is architecturally identical to prior V3 and R1 models, with 671B total parameters and 37B active parameters.
➤ Providers: Select third-party providers are hosting this model, such as @DeepInfra (FP4 quantized) and @novita_labs (FP8 quantized).
[image]
DrSinister@prof_sinister·
@sureailabs @solarapparition Hi hi, check my named account - a bit of magic, but we might have solved it - you just need a few-hour EEG session 🤷
sure, ai@sureailabs·
@prof_sinister @solarapparition The current archetype of pretty much all assistant AI is to cater to the user's questions rather than asking the user questions to understand what course of action is the best one. While you can easily do this with a good system prompt, no big model is doing that well rn, imo.
sure, ai@sureailabs·
Story from the customer-facing side of the AI startup I work at: a company reaching the 12,000-column limit on PostgreSQL. It's one big table. Thousands of columns of null values. They hope AI can save them, but the solution is just to be a little more thoughtful.
Pawel Szczesny@PawelPSzczesny·
People doubt it, or claim the codebase must be in bad shape, instead of just checking on their own whether the tech is capable of really speeding things up. Out of curiosity I tested the HMMER package (in C), developed for 20 years and pretty optimized. Codex found a way to shave off another few percentage points within 30 minutes (it was told to find one file to optimize, so I didn't wait hours). Measurement was done with the official benchmarks from the package. This technology is getting better and better.
[image]
Sauers@Sauers_

I asked Codex to speed up my code by 20x, thinking it would be very difficult. Codex instead sped up my code (already written in Rust) by 250,000x

DrSinister@prof_sinister·
@sureailabs @solarapparition We all have, and probably a better use of AI would be to show us the blind spots and what experts see than to automate - don't listen to our questions, just say what the correct choice is.
DrSinister@prof_sinister·
I've added an instruction for how to get the "old genAI" vibe with current diffusion models in the comments, if anybody is interested.
nvnot@nvnot_

stuff like this

DrSinister@prof_sinister·
@nvnot_ Run it once more as img2img with normal or even high CFG and low steps; you should get a very similar result. That's a hack, as it could be done in one go with scheduler shenanigans.
DrSinister@prof_sinister·
@nvnot_ Smallest possible CFG for a particular step count (higher steps -> lower CFG; there is a match between them and the prompt, so you need to find it). When you have the image, rescale it to make it larger, e.g. 25-50% of the largest size the model will be able to use, and ...
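The recipe above (find the smallest workable CFG for a given step count, then upscale to a fraction of the model's maximum size) can be sketched as plain helper functions. The inverse steps-to-CFG relation and the constants `k`, `floor`, `model_max`, and `frac` are illustrative assumptions, not values from the thread - in practice the author says you search for the (steps, CFG, prompt) match yourself.

```python
def min_cfg_for_steps(steps: int, k: float = 60.0, floor: float = 1.0) -> float:
    """Illustrative inverse relation: more sampling steps -> lower
    workable CFG. k and floor are made-up starting-guess constants;
    the real match also depends on the prompt."""
    return max(floor, k / steps)

def upscale_target(width: int, height: int,
                   model_max: int = 2048, frac: float = 0.4) -> tuple[int, int]:
    """Rescale so the longer side is ~25-50% of the largest size the
    model can handle (frac = 0.4 here), keeping the aspect ratio.
    Integer arithmetic avoids float rounding on the long side."""
    target_long = int(model_max * frac)
    long_side = max(width, height)
    return (width * target_long // long_side, height * target_long // long_side)
```

A usage pass might be: sample at `min_cfg_for_steps(steps)`, then feed the result back as img2img at the size given by `upscale_target`.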
DrSinister@prof_sinister·
@nvnot_ do it myself :/ I'd do the algos, but implementation is a different skill
DrSinister@prof_sinister·
@nvnot_ They still can; you just need to "manually" change how the CFG and noise schedule go. If you know of a person willing to read some weird, unusual, mostly signal-analysis code for schedulers, then nearly any diffusion model can be remade this way - I just lack the time/skill to...
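A minimal sketch of what "manually changing how the CFG goes" could mean, assuming the standard classifier-free guidance combination; the linear per-step schedule and its endpoint values are my own illustrative choices, not the author's code.

```python
import numpy as np

def cfg_combine(uncond, cond, scale: float):
    """Standard classifier-free guidance: push the conditional noise
    prediction away from the unconditional one by `scale`."""
    return uncond + scale * (cond - uncond)

def cfg_schedule(step: int, total_steps: int,
                 start: float = 7.5, end: float = 1.5) -> float:
    """Hypothetical per-step guidance schedule: decay CFG linearly
    across the run instead of keeping it constant, which is the kind
    of knob the thread suggests turning by hand."""
    t = step / max(total_steps - 1, 1)
    return start + (end - start) * t
```

In a sampling loop you would call `cfg_combine(uncond_pred, cond_pred, cfg_schedule(i, steps))` at each step `i` instead of using one fixed guidance scale.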
DrSinister@prof_sinister·
@torchcompiled 🤔 We should talk about that someday. I know a guy working on BCI integration with LLMs, based on psychology and philosophy.
Ethan@torchcompiled·
New post! I believe we can think of ourselves through two different lenses: an exact point of experience, and the history of our patterns of behavior. Though the two are deeply interconnected.
[image]
DrSinister@prof_sinister·
@proximasan You might want to read about the cognitive complexity trait - you will like what you find.
proxima centauri b@proximasan·
weird that dichotomous thinking is supposed to be an ASD trait, but I often find myself wondering why others are "collapsing spectra into binaries"