Perouz Taslakian

109 posts

@PerouzT

I am an AI Research Scientist at ServiceNow. I am also an Adjunct Professor at McGill University and an industry member at Mila.

Montreal, QC · Joined September 2009

231 Following · 264 Followers

Pinned Tweet

Perouz Taslakian@PerouzT·15 Oct

🚀 Internship Opportunities in AI Agents! At @ServiceNowRSRCH, we have 4 internship positions on AI Agents — exploring robustness, privacy & collaboration. 🧠 Applicants must be registered students at a Canadian university. 👇 Thread with details & links to apply:

Perouz Taslakian@PerouzT·24 Mar

🔬👁️ Looking for an AI intern to work on eye disease detection! Build models that combine high-res images with clinical data to automate diagnosis. 🇨🇦 Mitacs internship (8 months, full-time, ASAP) (Must be enrolled at a Canadian university) 📩 lnkd.in/enPiEsuJ #McGill

Perouz Taslakian retweeted

Alexandre Drouin@alexandredrouin·24 Mar

The NeurIPS Datasets & Benchmarks Track is now the Evaluations & Datasets (ED) Track. It now treats evaluation as a scientific object of study in its own right. Datasets/benchmarks still fully in scope. 👉Details: blog.neurips.cc/2026/03/23/int… We look forward to your submissions!

NeurIPS Conference@NeurIPSConf

The Datasets & Benchmarks track is now "Evaluation and Datasets", with an expanded scope for NeurIPS 2026! Read the call for papers neurips.cc/Conferences/20…, and learn more about the changes in our blog post: blog.neurips.cc/2026/03/23/int…

Perouz Taslakian retweeted

Siva Reddy@sivareddyg·15 Jan

The Thoughtology paper, a study of the reasoning chains of thinking models, is now published at TMLR. Since we wrote the paper, a lot has changed: many more models have been released with open weights. 1. These models no longer think verbosely; GPT-OSS has crisper thoughts than Qwen3/R1. 2. GPT-OSS almost never self-verifies or tries alternate solutions. 3. Qwen3 has a larger bloom step (initial solution) than R1. Among the commonalities: 4. All of them still have a problem-specific sweet spot (i.e., overthinking doesn't help). 5. Incorrectly answered problems still have longer chains. On another note, thanks to @TmlrOrg for allowing us to submit a ridiculously long paper :) (135 pages in total). We thank the reviewers and the AE for their time. This is the first paper to which every member of the group contributed! Special thanks to @saraveramarjano and @arkil_patel. @CBCNews and @binhanv filmed a documentary about it; hopefully you will get to see it one day. Thanks to @SimonsInstitute for letting us work on this during their LLM2 semester program, to @IVADO_Qc for the funding, and to @Mila_Quebec members for the feedback. Full paper: openreview.net/forum?id=BZwKs…

Sara Vera Marjanović@saraveramarjano

🚨Thoughtology is now accepted to #TMLR! We've added some new analyses, most notably: 🌟 We quantify rumination; repetitive thoughts are associated with incorrect responses 🌟 We add 2 LRMs: gpt-oss and Qwen3. Both show a reasoning 'sweet spot' See 📃 : openreview.net/pdf?id=BZwKsiR…

Perouz Taslakian retweeted

Alexandre Drouin@alexandredrouin·22 Oct

Excited to speak at the AAAI-26 Workshop on Agentic AI Benchmarks & Enterprise Tasks (Jan 26, Singapore) 🇸🇬 As agents are rapidly productized, realistic enterprise benchmarks for capabilities and reliability are essential! Submit: openreview.net/group?id=AAAI.… 🗓️ Oct 29 cc @gneubig

Perouz Taslakian@PerouzT·15 Oct

4️⃣ Agent Memory Poisoning Attacks Detecting and defending against corrupted memories. Apply 👉 tinyurl.com/SNOWMemPoisoni…

Perouz Taslakian@PerouzT·15 Oct

3️⃣ BlackBox Whisperer Tuning smaller open models to collaborate effectively with large black-box LLMs. Apply👉 tinyurl.com/SNOWBlackBox

Perouz Taslakian retweeted

Christos Tsirigotis@tsirigoc·7 Oct

A big shoutout to amazing collaborators @vaibhav_adlakha @joaomonteirof @AaronCourville @PerouzT Come find us at poster 55, 11am-1pm, on Tuesday to learn more! arxiv.org/abs/2508.06781 #COLM2025 #informationretrieval #denseencoders

Perouz Taslakian retweeted

ServiceNow AI Research@ServiceNowRSRCH·30 Sep

SLAM Labs presents Apriel-1.5-15B-Thinker 🚀 An open-weights multimodal reasoning model that hits frontier-level performance with just a fraction of the compute.

Perouz Taslakian retweeted

Massimo Caccia@MassCaccia·18 Sep

See you in San Diego 🚀 #NeurIPS2025

Massimo Caccia@MassCaccia

🎉 Our paper “How to Train Your LLM Web Agent: A Statistical Diagnosis” got an oral at next week’s ICML Workshop on Computer Use Agents! 🖥️🧠 We present the first large-scale study of compute trade-offs between pure SFT, pure RL, and hybrid SFT+RL for multi-step agents. SFT ➡️ RL pushes the Pareto front, and it's the only strategy that closes the gap with closed models! 👇🧵

Perouz Taslakian@PerouzT·18 Sep

🎉Congratulations to all the authors for this great work, especially to @Ahmed_Masry97 for his perseverance through the highs and lows of this project 😀 Excited to see AlignVLM accepted to #NeurIPS2025! @ServiceNowRSRCH

Ahmed Masry@Ahmed_Masry97

Excited to announce that AlignVLM got accepted to NeurIPS! 🎉🥳 We’ll be releasing the code and sharing an updated version of the paper with reviewer feedback soon. #NeurIPS2025

Perouz Taslakian retweeted

Rabiul Awal@_rabiulawal·16 Sep

🚨Exciting news! Our paper “WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation” is accepted for an oral presentation at EMNLP 2025! 🎉 WebMMU addresses a critical gap in AI evaluation: how well can models understand and build websites? 🧵1/n

Perouz Taslakian retweeted

Ahmed Masry@Ahmed_Masry97·21 Aug

UI-Vision vs GPT-5: Still holding the crown 👑 and far from saturation. GPT-5 has strengths in coding and reasoning, but when it comes to computer-use tasks, it’s still awkward to rely on it alone. And our team's UI-Vision (ICML 2025) remains a key and still unbeaten multimodal eval framework for screen understanding and grounding. What we continue to see: focused training is essential to beat our evals, and this is exactly where open-source models have been shining. A big thanks to research teams at Microsoft, OpenCUA, and UI-Tars for actively using UI-Vision to push the limits of visual screen understanding. If you are working on VLMs or screen grounding applications for ICLR submissions, UI-Vision is the place to measure and improve your systems. And we are only getting started: our next, UI-Vision-Grounding, is on the way🚀. It brings a larger dataset that the community can make use of, harder grounding tasks, and new training recipes to help models level up in grounding abilities. 🔗uivision.github.io 📜arxiv.org/abs/2503.15661 Big kudos to all our partners and collaborators who made this possible! @ServiceNowRSRCH, @turingcom, @Mila_Quebec, @PShravannayak, @EdwardJian2, @aarashfeizi, @gspandana, @PerouzT, Qinghong Lin, @chrisjpal, @_rabiulawal, @dvazquezcv, @joanrod_ai, @RajeswarSai

Perouz Taslakian@PerouzT·10 Jun

🚀 We just released the final test split of #RepLiQA, our dataset for evaluating QA on truly unseen content! 📚 Dataset: huggingface.co/datasets/Servi… 📝 NeurIPS ’24: neurips.cc/virtual/2024/p… Big thanks to my amazing co-authors at @ServiceNowRSRCH! 🙌 #RAG #LLMs #NLP #QA

Perouz Taslakian retweeted

NewInML @ NeurIPS 2025@NewInML·9 Jun

New to ML research? Never published at ICML? Don't miss this! Check out the New in ML workshop at ICML 2025 — no rejections, detailed feedback, awards, and ICML tickets for selected authors. Deadline: June 10 (AoE) Submit: openreview.net/group?id=ICML.… Info: newinml.github.io

Perouz Taslakian retweeted

Joan Rodriguez@joanrod_ai·28 May

Thanks @_akhaliq for sharing our work! Excited to present our next generation of SVG models, now using Reinforcement Learning from Rendering Feedback (RLRF). 🧠 We think we cracked SVG generalization with this one. Go read the paper! arxiv.org/abs/2505.20793 More details on the demo, code, and models coming soon! Stay tuned 💫

AK@_akhaliq

Rendering-Aware Reinforcement Learning for Vector Graphics Generation: RLRF significantly outperforms supervised fine-tuning, addressing common failure modes and enabling precise, high-quality SVG generation with strong structural understanding and generalization.

Perouz Taslakian retweeted

Patrice Bechard@patricebechard·28 May

🚀 New paper from our team at @ServiceNowRSRCH! 💫 StarFlow: Generating Structured Workflow Outputs From Sketch Images. We use VLMs to turn hand-drawn sketches and diagrams into executable workflows. 🖍️→⚙️ 🔗arxiv.org/abs/2503.21889 📝tinyurl.com/3utdbn97 #Sketch2Flow #AI #VLM

Perouz Taslakian@PerouzT·15 May

Our team has released the UI-Vision benchmark (accepted at #ICML2025) for testing GUI agent visual grounding and action prediction! 🚀🚀🚀 🤗 Dataset: huggingface.co/datasets/Servi… Special thanks to the students who led this effort, @PShravannayak and @EdwardJian2 @ServiceNowRSRCH

P Shravan Nayak@PShravannayak

🚀 Excited to share that UI-Vision has been accepted at ICML 2025! 🎉 We have also released the UI-Vision grounding datasets. Test your agents on it now! 🚀 🤗 Dataset: huggingface.co/datasets/Servi… #ICML2025 #AI #DatasetRelease #Agents
