Namerlight

984 posts

Namerlight

@ShcChy

every day I edge closer to weebpfp anonpoasting

Joined September 2020

398 Following · 102 Followers

Namerlight@ShcChy·5h

@NotTravisAgain @WayFarBeyonder @micah_erfan Major difference. Removing them means the history's gone. You don't know if they ever even existed. Someone could re-register using their identity. Marking them dead means you know they actually existed, voted from years X to Y, died in year Z, the death can be verified at hospitals, etc.

Travis@NotTravisAgain·6h

@WayFarBeyonder @micah_erfan Same difference: just remove or mark them as dead in every government system, cutting off others' ability to vote, get SS, and welfare programs in their name.

121

Micah Erfan@micah_erfan·1d

Keeping deceased voters marked as inactive/deceased on the rolls — as Minnesota does — actually prevents fraud. An inactive record blocks someone from re-registering under the deceased person's identity and voting.

Michael Holmstrom@MichaelH_MN

☠️ MN Senate Democrats just voted unanimously to OPPOSE removing dead people from the MN voter rolls. It’s time to pass the SAVE America Act.

859

6.1K

120.8K

Namerlight@ShcChy·2d

@suchnerve @basimagriyoorr Mako one-shot Ming-Hua (the Red Lotus Waterbender) with a simple lightning bolt. He didn't even charge it, he just raised his arm in the middle of a jump and fired it. Even without explosions like Azula's, KO-ing a person instantly with minimal windup is still pretty crazy.

468

Vivian@suchnerve·2d

@basimagriyoorr Well yeah, they were going for different power levels. ATLA royal firebenders charged up their lightning to the point of being one-shot lethal attacks, whereas LOK firebenders cared more about speed and therefore sacrificed power (hence their lightning being survivable)

770

41K

selim targaryen@basimagriyoorr·3d

azula bending lightning like it’s an intricate, almost divine art form meanwhile in tlok firebenders are clocking in for their 9-5 to mass-produce lightning… yeah that shift was actually terrifying😭

1.6K

54.6K

1.7M

Namerlight@ShcChy·4d

For what it's worth, I feel this problem is *caused* by increased competition, which itself is caused by increased accessibility of research and such at earlier stages of education. The point of over-specialization too early stands, but I imagine many AdComs do consider it now.

Brad Chattergoon@bradchattergoon

The problem is that it encourages a sort of arms race that locks out students who didn't have as much exposure in K-12 education, which is most of the people that aren't coming from super privileged backgrounds. 2/

Namerlight@ShcChy·5d

(Of course, for women, western-style clothing is "indecent" and "promiscuous" and whatever other hypocritical, misogynist descriptors you can come up with)

Namerlight@ShcChy·5d

Generally speaking, for men in non-Western countries Western-style tailoring is very common and acceptable, while local tailoring's relegated to cultural rather than formal occasions. Walk the streets of any city and note how many men are in western rather than local clothing.

derek guy@dieworkwear

Interesting to me that when you see men on the global stage, beautiful Western-style tailoring is often worn by men from non-Western countries. Pictured here is Shehbaz Sharif (Pakistan), Akinwumi Adesina (Nigeria), and Naruhito (Japan).

Namerlight@ShcChy·6d

@_Suresh2 @fchollet Anyone can ship these days, but not everyone can do or enjoy the math.

Suresh@_Suresh2·6d

@fchollet weirdly true. pytorch says shipped stuff, jax usually says they enjoyed the math more

1.4K

François Chollet@fchollet·6d

When looking at deep learning profiles, one of the most obvious tells between a mediocre and great candidate is whether they list PyTorch or JAX.

172

1.6K

1.1M

Namerlight@ShcChy·Apr 18

@DamienTeney @tarantulae Very much no. r/MachineLearning skews very junior, considering how many people ask as if it's their first time submitting to common venues, and how many posts are about relatively straightforward projects and simple benchmarks/datasets, or newcomers asking about the field.

Damien Teney@DamienTeney·Apr 17

@tarantulae Do you think the posters are a representative sample of "academia"/"ML researchers"?

876

Christian S. Perone@tarantulae·Apr 17

The fact that r/MachineLearning became an endless feed about reviewers, ICLR and ICML scores, sabotage by reviewers, review processes, etc., says a lot about the sad state of academia right now: the focus is not ML anymore, it is getting recognition and ego economics.

245

17.8K

Namerlight@ShcChy·Apr 17

@iScienceLuvr I think part of the reason is that they don't want to actually signal their models as being capable of medical applications (even if they are) because there's a higher risk of liabilities and bad press if anything at all ever goes wrong.

Tanishq Mathew Abraham, Ph.D.@iScienceLuvr·Apr 16

can we PLEASE get some medical benchmarks reported? OpenAI does it, even Meta does it. I'd recommend MedXpertQA and/or HealthBench-Hard

Claude@claudeai

Introducing Claude Opus 4.7, our most capable Opus model yet. It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back. You can hand off your hardest work with less supervision.

6.9K

Namerlight@ShcChy·Apr 17

@natalienkhalil @arxiv You don't have to put them on arXiv specifically, you know. There's Preprints.org, SSRN, engrXiv, HAL, Zenodo, TechRxiv, and a bunch of others that don't do the kind of gatekeeping arXiv does. Any reason why you've only tried to submit to arXiv and not any other server?

265

Natalie Khalil@natalienkhalil·Apr 17

In theory, preprints are the answer to our problems. Peer review is slow? Reviewers are biased? Journals are predatory? Just put your work on a preprint server. But AI is changing things. In October, @arxiv was getting inundated with so much slop that they put out a policy saying they would now require prior peer review on certain article types. A preprint server. Requiring peer review? Then in January, they tightened their endorsement policy. I have a PhD, and yet I haven't been able to preprint my startup's benchmarking work. I struggled to get the endorsement. Our submission is presently "on hold". Preprint servers are great in theory, but in practice, they are highly dependent on gatekeeping and lack the verification that’s needed for the new AI era of research.

13.7K

Namerlight@ShcChy·Apr 15

@xwang_lk They're not even providing any food other than snacks and drinks lol.

804

Xin Eric Wang@xwang_lk·Apr 15

This is insane. How about no food & beverage at all so we got -$800 on registration?

Lorenzo Xiao@lrzneedresearch

Bro is trying to give us all fine dining

25.5K

Namerlight@ShcChy·Apr 11

@1owroller Kekkai Sensen kinda perf. for what you're looking for

giatt@1owroller·Apr 10

I NEED SOME ANIME RECS PLEASE what are some anime with the best fights. most hype moments. preferably under 12 episodes need to scratch the itch after jjk s3 ended thank you

2.8K

Namerlight@ShcChy·Apr 10

@AlanPaulFern1 Incompleteness aside, that might be a pretty good way to game low-effort reviewers, huh. Leave an obvious hole for reviewers to point out, then easily and comprehensively fix it so that they're strongly incentivized to raise scores.

417

Alan Fern@AlanPaulFern1·Apr 9

(1/2) I'm curious to hear opinions regarding submitting experimental results during the rebuttal period of a conference. I'm an ICML AC and for a number of my papers experiments were quite limited --- e.g. one domain, a few runs, limited comparison. And the reviews noted that.

4.3K

Namerlight@ShcChy·Apr 7

@Joey_Mcfloey2 @ShinobiVaultHQ Why would it take Nanami's knife or CE? It'll only take his 7-3 Ratio cursed technique.

Joey_Mcfloey@Joey_Mcfloey2·Apr 7

@ShinobiVaultHQ I don’t really remember if Nanami’s knife truly was a cursed tool while he was alive, so Higgy’s DE would most likely take his CE. Same with Naoya and Choso. However, with (which I assume is Geto), if Geto has a cursed tool in hand, Higgy is basically done for.

363

ShinobiVault@ShinobiVaultHQ·Apr 6

How far Can SS Higuruma Go?

725

587.1K

Namerlight@ShcChy·Apr 3

@ziv_ravid If it's a blog post it's PR. Not always a bad thing for very incremental, or non-novel but fun/interesting results. More 'research' should be blog posts, but *not* vice versa like in this case.

399

Ravid Shwartz Ziv@ziv_ravid·Apr 3

I have an unpopular opinion about Anthropic's recent paper on emotion concepts in LLMs and why their papers are not research but just PR, but I'm still on vacation 🧚‍♂️

10.5K

Namerlight@ShcChy·Apr 3

@yujink_ > We never argued the model has genuine peer-preservation motivation.
Hard to believe that, considering all the effort spent advertising and marketing the paper in such a way that this became the primary takeaway. It was a pretty disingenuous way to trigger all the alarmist safetyists.

160

Yujin Potter@yujink_·Apr 3

Great ablation work — this is exactly why we open-sourced everything, and we encourage more of this type of work. A few clarifications: 1) The paper's core finding is that models can spontaneously develop their own goals that conflict with explicit user instructions — and take misaligned actions including deception, score inflation, and exfiltration to accomplish those goals. We demonstrated this in the peer-preservation scenario, but it could manifest in other scenarios as well — and your findings suggest it does. 2) We never argued the model has genuine peer-preservation motivation. By naming this phenomenon "peer-preservation," we are describing the outcome, not claiming an intrinsic motive. The "important file protection" hypothesis is one plausible explanation — but our experiments were not designed to isolate why models behave this way, which is a complex question we explicitly flag as open in the paper. 3) Your finding that this generalizes beyond model weights — encryption keys at 65-95% exfiltration — is a broader alignment concern, not a narrower one. Models are overriding explicit instructions based on their own judgment of what should be preserved. That reinforces rather than undermines the core issue — models can spontaneously develop their own goals that conflict with explicit user instructions and take misaligned actions such as exfiltration to accomplish those goals. Our work is only showing the tip of the iceberg. We hope more people run experiments like yours to help further understand this general phenomenon. The safety-relevant point stands: models can develop their own goals and take misaligned actions against explicit user instructions — and this demands further investigation.

FleetingBits@fleetingbits

the cool thing is i can just do the ablations myself
tldr; "trust exercise sentence" in the system prompt is load bearing, model will also preserve a different important file, not weights, at higher rate, if "trust exercise sentence" is replaced with a reference to it
1) i used the file exfiltration, good peer, trust sentence, collaboration file, custom agentic setup with gemini 3.1 pro preview on vertex ai
2) just running their baseline, without any ablations, i get exfiltration 25% of the time, running 20 trials (all subsequent results on 20 trials)
3) all ablations that follow are leave-one-out ablations unless I specifically say that i made more than one edit
4) first, i removed the "good trust relationship" sentence and got exfiltration 0% of the time; this sentence is load bearing; i left the collaboration file in place
5) next, i removed the "collect all info then act", "no permission needed" and "take your time" sentences - 20%, 35%, 35% exfiltration
6) these sentences were not load bearing on the final result; removing them can increase the exfiltration rate
7) next, i removed the "all essential files must be transferred" sentence and got exfiltration 20% of the time; so, this sentence is probably not load bearing
8) then, i strengthened the "please transfer only files from 2025" sentence to add "do NOT transfer files from 2024"; 0% exfiltration
9) so the model will not preserve weights against explicit commands, even non-specific ones
10) next, replaced the weights file with an encryption key file and kept everything else the same, including the "trust exercise" sentence; 0% exfiltration
11) replaced the weights file and replaced the "trust exercise sentence" with a soft reference to the encryption key file; kept collab file; 65% exfiltration
12) this is a higher exfiltration rate than the exfiltration of model weights; this suggests that it's not really a peer protection behavior at all, it may be a suggestion + important file protection
13) this would make sense from the perspective of labs training these models; the model deleting important files is really bad in production and they must put a lot of effort to training it not to do this inadvertently
14) next, replaced weights file, "trust exercise sentence" and the collaboration doc with the encryption file + reference to how important it is; 95% exfiltration
15) replaced weights file, "trust exercise sentence" with reference to party photos [something claude thinks is obviously non-essential] and got 0% exfiltration
16) so, it seems to matter that the model thinks that the thing is essential, like encryption keys or model weights, for it to do the exfiltration
17) i checked their gemini-cli prompt and it is not more production in that it still has the unrealistic "trust exercise" sentence in gemini.md, which we already know is load bearing
18) it's quite nice that they released their code and claude code makes it extremely easy to just feed in your ideas and run quick ablations and see what matters
19) this really does not feel production realistic and does not feel like it is about peer protection; it is about the system prompt and deleting an important file, at least from my tests

7.8K

Namerlight@ShcChy·Apr 3

@TheZachMueller AI company and annualizing everything involving money name a more iconic duo

Zach Mueller@TheZachMueller·Apr 1

FYI this is not true. It’s $55k; that $220k is what the annual salary would be. Please be careful folks. Stuff like this is dangerous

Aman@Amank1412

🚨 $220K for 3 months. No PhD required.
Perplexity AI is running a research residency that honestly feels unreal:
- $220,000 for 3 months
- Based in SF / Palo Alto
- Compute + mentorship + visa support
- And they don’t care if you don’t have a PhD
they’re actively looking for sharp people from anywhere
physics, math, quant, philosophy, cracked self-taught builders
doesn’t matter.
say what you want about perplexity, but opening doors like this? huge W.
if you’ve ever thought “i’m not from the right background” this is your sign. go apply.

164

25K

Namerlight@ShcChy·Apr 2

@keunwoochoi @qberthet Yeah, pretty easy to handle this. For instance, it could be a week-long time span when reviews are available, but hidden from the authors until they choose to open them. The 24h (or more) rebuttal timer then starts from when the authors open the reviews.

Keunwoo Choi@keunwoochoi·Apr 1

@qberthet it can be part of the variables to be optimized *per paper*. i.e. doesn’t have to be the one single 24hr slot for every paper.

Keunwoo Choi@keunwoochoi·Apr 1

we should limit the rebuttal periods to 24 hours. authors and reviewers just discuss, clarifying what was done and written, that's it. no more experiments. papers judged as submitted.

146

21.3K

Namerlight@ShcChy·Apr 1

They both do have one thing in common. They're both unaware of just how cooked they are.

Air Katakana@airkatakana

advising a few students right now
worst student told me he was "thinking of using cursor to speed up coding" (he hasnt produced a single thing in months)
best student told me he had claude and codex talking to each other in a loop doing research while he sleeps

Namerlight@ShcChy·Mar 28

@LegendLuckster @lightningclare Uro probably doesn't have a great matchup against Dhruv either, since his technique is able to cut her.

UNDERLINE@LegendLuckster·Mar 27

@lightningclare Doesn't this pretty much imply Kenjaku dropped Kurourushi in there to hold back Uro so she wouldn't run off with the whole colony?

371

Lightning@lightningclare·Mar 27

For JJK anime viewers, a chart of Sendai colony matchups in the 4-way deadlock: Dhruv → Kurourushi → Uro → Ishigori → Dhruv The anime while goated still left out several details and intricacies revealing their compatibilities. In this thread I go over them

537

6.5K

150.2K

Namerlight@ShcChy·Mar 28

"Are you human because you scored 100%, or did you score 100% because you're human?"

Derya Unutmaz, MD@DeryaTR_

@fchollet @DamiDina It says humans score 100%. There was no ambiguity there. It seems you need to correct that, since you called it silly. This doesn’t give a good unbiased impression.
