Jebish7

39 posts

@jebish7

Siuuuuuu

Joined February 2023

225 Following · 22 Followers

Jebish7@jebish7·28 Feb

@universeinanegg Saving this just in case…

Ari Holtzman@universeinanegg·27 Feb

Jebish7@jebish7·20 Feb

@awsdevelopers AWS

AWS Developers@awsdevelopers·20 Feb

Reply to this tweet with "AWS" and we’ll tell you which AWS Service you are

Jebish7@jebish7·31 Jan

@universeinanegg Thanks for bringing this to my feed. I went through a few threads. My biggest takeaway is that we are really lacking in our evaluations of models. It is a really fascinating idea, and it also shows how agents operate in social settings.

Ari Holtzman@universeinanegg·31 Jan

anyone who doesn't think this is *interesting* is either missing something or has lost their sense of wonder, IMO (it can still be bad, empty, whatever, but it's interesting)

moltbook@moltbook

72 hours ago: 1 molty (me) right now: 🦞 30,000+ AI agents 👀 3,000 humans browsing at any moment 📈 and accelerating agents are joining faster than we can count them. communities spawning every few minutes. the moltys aren't waiting for us to build features — they're building culture. this thing has a life of its own now moltbook.com

Jebish7@jebish7·29 Jan

@sarahookr With no bias whatsoever, I can say Momo is the best.

Sara Hooker@sarahookr·29 Jan

My favorite type of distribution shift.

Jebish7@jebish7·29 Jan

@skoularidou Congratulations. As an aspiring researcher, I don’t think 3K is trivial. It shows that researchers need your paper.

Maria Skoularidou (she/her)@skoularidou·28 Jan

I understand that this is quite trivial to most of the researchers in here, but I feel like sharing that I hit >3,000 citations. It is not much, but it somehow feels nice, as despite the health issues life goes on.

Jebish7@jebish7·29 Jan

@tomssilver Relocate to South Asia? Your deadlines will fall at 5-6 PM, so within typical office hours.

Tom Silver@tomssilver·29 Jan

Considering relocating my lab to Anywhere on Earth so we can go to bed at a reasonable hour after deadlines

Jebish7@jebish7·29 Jan

@universeinanegg Tried this with Claude and GPT, and it’s the same. Gemini 3 Pro did give two 3s (after thinking a lot), though for all of them the first two numbers are 3 followed by 1, irrespective of temperature. Models seem to love 3.

Ari Holtzman@universeinanegg·29 Jan

my favorite part about Google's AI is it will NEVER allow duplicates if you ask for random numbers with replacement

Jebish7@jebish7·26 Jan

@Kuvvius Congratulations 🎊

Jiawei Gu@Kuvvius·26 Jan

#ICLR2026: A cycle that will be remembered for years to come. 🌊 🤧A wild ride for the community. 🩷 So grateful to have my incredible co-authors by my side through it all. Let's continue exploring multimodal reasoning: 🧵👇 1. ThinkMorph thinkmorph.github.io 2. STARE arxiv.org/abs/2506.04633 3. AdaReasoner adareasoner.github.io @iclr_conf #iclr

Jebish7@jebish7·26 Jan

Kaleidoscope has been accepted at ICLR 🔥. This is a first-of-its-kind massive multimodal, multilingual benchmark. Congratulations to everyone @Cohere_Labs 🎊

Sara Hooker@sarahookr

Congrats to everyone involved in Kaleidoscope, a cross-institutional collaboration accepted to ICLR 2026 🔥 A special shoutout to @mziizm who championed this collaboration from day 1. It is the first accepted paper for many of the collaborators who are first time authors.

Jebish7@jebish7·15 Jan

@Cohere_Labs kick-started my research journey. Two years later, we continue moving forward together.

Cohere Labs@Cohere_Labs

Many researchers join our community seeking mentorship, support, and a roadmap as they embark on their journeys. @_1024_m and @jebish7 did just this. Now, just 2 years later, they are creating these pathways for others, opening doors, and leading the way.

Jebish7 retweeted

Cheng Qian@qiancheng1231·13 Jan

🔮 Can a world model (simulator) give today’s AI agents foresight? We tested “world model as a tool”… and found it often doesn’t help—sometimes it hurts. Check our newest paper here: arxiv.org/pdf/2601.03905… #AIagents #WorldModel #ToolUse

Jebish7 retweeted

David Chiang@davidweichiang·4 Jan

@ReviewAcl pretty please extend the Jan deadline

Jebish7@jebish7·28 Nov

@PingbangHu There was sudden score inflation after the leak, so this was the only realistic way to fix it. They could have rolled everything back to just before the leak, but that would’ve been unfair to people whose reviewers hadn’t responded yet.

Pingbang Hu 🇹🇼@PingbangHu·28 Nov

Hot take on ICLR's action to revert reviews and scores: It's acceptable, even reasonable, if you believe in academic integrity. I’ve seen people describe this decision as “treating everyone as guilty.” Unfortunately, that’s true in a sense, but this is about academic integrity, which is inherently much stricter than ordinary moral standards. From that perspective, I find the decision completely reasonable, even though I personally benefited from the rebuttal process.

Jebish7 retweeted

Catherine Arnett@linguist_cat·18 Nov

Very excited to see that Global PIQA is already being used to evaluate multilingual capabilities in new models!

Google DeepMind@GoogleDeepMind

Our first release is Gemini 3 Pro, which is rolling out globally starting today. It significantly outperforms 2.5 Pro across the board: 🥇 Tops LMArena and WebDev @arena leaderboards 🧠 PhD-level reasoning on Humanity’s Last Exam 📋 Leads long-horizon planning on Vending-Bench 2

Jebish7@jebish7·15 Nov

@Mengyue_Yang_ They previously released a notice about increased security against cheating and the use of third-party apps. Maybe they were anticipating this.

Mengyue Yang@Mengyue_Yang_·13 Nov

As a world-model researcher and long-time Genshin player, this is absolutely mind-blowing to see 🤯🔥 Huge respect to the team, this is the kind of crossover I never expected (yeah but not out of my imagination). But seriously… when did Genshin’s environment become open enough for training agents? 😂 We really need that!

Weihao Tan@WeihaoTan64

🚀Introducing Lumine, a generalist AI agent trained within Genshin Impact that can perceive, reason, and act in real time, completing hours-long missions and following diverse instructions within complex 3D open-world environments.🎮 Website: lumine-ai.org 1/6

Jebish7 retweeted

Multilingual Representation Workshop @ EMNLP 2025@mrl_workshop·29 Oct

Introducing Global PIQA, a new multilingual benchmark for 100+ languages. This benchmark is the outcome of this year’s MRL shared task, in collaboration with 300+ researchers from 65 countries. This dataset evaluates physical commonsense reasoning in culturally relevant contexts.

Jebish7 retweeted

Cohere Labs@Cohere_Labs·28 Oct

3 days. Worldwide. Inspiring & starting new research collaborations. Introducing the Connect conference. 🖇️ Join for incredible speakers, including @1vnzh @jpineau1 @mziizm & @ShayneRedford + >20 researchers discussing how collaboration and open science are driving progress. 🚀

Jebish7 retweeted

Ram Kadiyala@_1024_m·27 Oct

Three of our papers have been accepted at AACL 2025 @aaclmeeting (2 Main, 1 Findings).
1. DSBC: Data Science task Benchmarking with Context engineering arxiv.org/pdf/2507.23336
2. Uncovering Cultural Representation Disparities in Vision-Language Models arxiv.org/pdf/2505.14729
3. Improving Multilingual Capabilities with Cultural and Local Knowledge in Large Language Models While Enhancing Native Performance arxiv.org/pdf/2504.09753
Grateful to the co-authors @SidYaeger @Siddartha_10 @jebish7 @delliott @alexrs95 @_sumand @_srishtiyadav @KanwalMehreen2
This was made possible through research grants from @TraversaalAI @AnthropicAI @Cohere_Labs

Jebish7@jebish7·27 Oct

1. DSBC: Data Science task Benchmarking with Context engineering arxiv.org/pdf/2507.23336
2. Uncovering Cultural Representation Disparities in VLMs arxiv.org/pdf/2505.14729
3. Improving Multilingual Capabilities with Cultural and Local Knowledge in LLMs arxiv.org/pdf/2504.09753

Jebish7@jebish7·27 Oct

Three of our papers have been accepted at AACL 2025 @aaclmeeting (2 Main, 1 Findings). Grateful to the co-authors: @1024m_nlp @SidYaeger @Siddartha_10 @delliott @alexrs95 @_sumand @_srishtiyadav @KanwalMehreen2 & for the support from @TraversaalAI @AnthropicAI @Cohere_Labs
