HKUST NLP
@hkustNLP

HKUST Natural Language Processing Research #NLProc

31 posts
Joined October 2024
113 Following · 262 Followers
HKUST NLP retweeted
Zhitao He @ZhouHE777
Excited to present our work, MMBoundary, at #ACL2025! Come chat with us at our poster session! 📍 Hall 4/5, Session 12: Poster Session 4 🗓️ Wednesday, July 30 ⏰ 11:00-12:30
[image]
3 replies · 9 reposts · 19 likes · 1.7K views
HKUST NLP retweeted
Zhitao He @ZhouHE777
🤯 Multimodal LLMs can be confidently wrong. A single early mistake in perception can lead to a completely incorrect answer. 🚀 Introducing our work, MMBoundary, a new framework to make MLLMs aware of their own knowledge boundaries! 🧵 Paper: arxiv.org/abs/2505.23224 #ACL2025
[image]
0 replies · 4 reposts · 12 likes · 1.3K views
HKUST NLP retweeted
Jianshu Zhang @SterZhang
Find us at the poster booth ✨Wed 7/30 11am Hall 4/5 ✨
Quoting Yi R. (May) Fung @May_F1_
0 replies · 5 reposts · 10 likes · 872 views
HKUST NLP retweeted
Yi R. (May) Fung @May_F1_
[1/n] "𝘔𝘢𝘵𝘤𝘩𝘪𝘯𝘨 𝘤𝘶𝘦𝘴 𝘧𝘰𝘳 𝘪𝘥𝘦𝘯𝘵𝘪𝘤𝘢𝘭 𝘰𝘣𝘫𝘦𝘤𝘵𝘴, 𝘥𝘪𝘴𝘵𝘪𝘯𝘤𝘵 𝘢𝘵𝘵𝘳𝘪𝘣𝘶𝘵𝘦𝘴 𝘧𝘰𝘳 𝘶𝘯𝘪𝘲𝘶𝘦 𝘰𝘯𝘦𝘴." Such 𝙘𝙧𝙤𝙨𝙨-𝙘𝙤𝙣𝙩𝙚𝙭𝙩 𝙫𝙞𝙨𝙪𝙖𝙡 𝙧𝙚𝙖𝙨𝙤𝙣𝙞𝙣𝙜 is simple and straightforward for human cognition, but 𝗾𝘂𝗶𝘁𝗲 𝗰𝗵𝗮𝗹𝗹𝗲𝗻𝗴𝗶𝗻𝗴 𝗳𝗼𝗿 𝗰𝘂𝗿𝗿𝗲𝗻𝘁 𝗹𝗮𝗿𝗴𝗲 𝘃𝗶𝘀𝗶𝗼𝗻 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗺𝗼𝗱𝗲𝗹𝘀 (𝗟𝗩𝗟𝗠𝘀), especially across multiple images and videos ‼️

𝙒𝙝𝙮, and 𝙝𝙤𝙬 𝙘𝙖𝙣 𝙬𝙚 𝙞𝙢𝙥𝙧𝙤𝙫𝙚 upon this major shortcoming in future research? 🔍

🔥🚀 Check out our latest release, "𝗩𝗟𝗠𝟮-𝗕𝗲𝗻𝗰𝗵: 𝗔 𝗖𝗹𝗼𝘀𝗲𝗿 𝗟𝗼𝗼𝗸 𝗮𝘁 𝗛𝗼𝘄 𝗪𝗲𝗹𝗹 𝗩𝗟𝗠𝘀 𝗜𝗺𝗽𝗹𝗶𝗰𝗶𝘁𝗹𝘆 𝗟𝗶𝗻𝗸 𝗘𝘅𝗽𝗹𝗶𝗰𝗶𝘁 𝗠𝗮𝘁𝗰𝗵𝗶𝗻𝗴 𝗩𝗶𝘀𝘂𝗮𝗹 𝗖𝘂𝗲𝘀"! 🚀🔥
🔗 Project page: vlm2-bench.github.io
📑 Paper: arxiv.org/abs/2502.12084
💻 Github: github.com/vlm2-bench/VLM…
🤗 Huggingface: huggingface.co/datasets/Sterz…

Work led by our amazing team of students Jianshu, Dongyu, and Renjie at The Hong Kong University of Science and Technology. Find us at the poster booth ✨ Wed 7/30 11am Hall 4/5 ✨
[image]
0 replies · 4 reposts · 8 likes · 1.4K views
HKUST NLP retweeted
Yi R. (May) Fung @May_F1_
Heading out to #ACL2025 in Vienna with six main/Findings papers to present! 🇦🇹✈️🤩 Would love to chat about research on multimodal model reasoning and agents, as well as opportunities in my group @hkustNLP. Please DM if you'd like to meet!
[image]
1 reply · 7 reposts · 35 likes · 2.7K views
HKUST NLP retweeted
Zeyu Qin @ZeyuQin_alan
#COLM2025 Our work has been accepted to COLM 2025 😊 Looking forward to discussing Scalable Oversight and Synthetic Data with old and new friends in Montréal @COLM_conf
Quoting Yi R. (May) Fung @May_F1_
🚀 Data to pre-train LLMs on are reaching a critical bottleneck. 𝘿𝙤𝙚𝙨 𝙢𝙤𝙙𝙚𝙡-𝙜𝙚𝙣𝙚𝙧𝙖𝙩𝙚𝙙 𝙨𝙮𝙣𝙩𝙝𝙚𝙩𝙞𝙘 𝙙𝙖𝙩𝙖 𝙬𝙤𝙧𝙠 𝙨𝙞𝙢𝙞𝙡𝙖𝙧𝙡𝙮 𝙬𝙚𝙡𝙡 𝙛𝙤𝙧 𝙨𝙘𝙖𝙡𝙞𝙣𝙜 𝙥𝙧𝙚-𝙩𝙧𝙖𝙞𝙣𝙞𝙣𝙜 𝙛𝙪𝙧𝙩𝙝𝙚𝙧? Let's dive into the "𝗦𝗰𝗮𝗹𝗶𝗻𝗴 𝗟𝗮𝘄𝘀 𝗼𝗳 𝗦𝘆𝗻𝘁𝗵𝗲𝘁𝗶𝗰 𝗗𝗮𝘁𝗮 𝗳𝗼𝗿 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗠𝗼𝗱𝗲𝗹𝘀" together! Several key findings:
⭐️ 𝗚𝗿𝗮𝗽𝗵-𝗯𝗮𝘀𝗲𝗱 𝗲𝘅𝘁𝗿𝗮𝗰𝘁𝗶𝗼𝗻 & 𝗿𝗲𝗰𝗼𝗺𝗯𝗶𝗻𝗮𝘁𝗶𝗼𝗻 of concepts across docs transforms pre-training corpora into 𝘥𝘪𝘷𝘦𝘳𝘴𝘦, 𝘩𝘪𝘨𝘩-𝘲𝘶𝘢𝘭𝘪𝘵𝘺 𝘴𝘺𝘯𝘵𝘩𝘦𝘵𝘪𝘤 𝘥𝘢𝘵𝘢𝘴𝘦𝘵𝘴;
⭐️ 𝗣𝗿𝗲𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴 𝘄𝗶𝘁𝗵 𝘀𝘆𝗻𝘁𝗵𝗲𝘀𝗶𝘇𝗲𝗱 𝗱𝗮𝘁𝗮 reliably follows a 𝘳𝘦𝘤𝘵𝘪𝘧𝘪𝘦𝘥 𝘴𝘤𝘢𝘭𝘪𝘯𝘨 𝘭𝘢𝘸 across various model sizes;
⭐️ 𝗟𝗮𝗿𝗴𝗲𝗿 𝗺𝗼𝗱𝗲𝗹𝘀 approach optimal performance with 𝙛𝙚𝙬𝙚𝙧 𝙩𝙧𝙖𝙞𝙣𝙞𝙣𝙜 𝙩𝙤𝙠𝙚𝙣𝙨;
⭐️ 𝘿𝙞𝙢𝙞𝙣𝙞𝙨𝙝𝙞𝙣𝙜 𝙧𝙚𝙩𝙪𝙧𝙣𝙨 start to kick in near 300B tokens.
👇 Check the preprint for further details~
🔗 arxiv.org/html/2503.1955…

0 replies · 6 reposts · 25 likes · 2.2K views
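The "diminishing returns near 300B tokens" finding can be pictured with a toy saturating curve. This is only a minimal sketch: the functional form below (irreducible loss plus a power-law term) and all constants are made-up illustrations, not the rectified scaling law actually fitted in the paper.

```python
def synthetic_loss(tokens_B: float, E: float = 1.8, A: float = 2.5,
                   alpha: float = 0.35) -> float:
    """Toy saturating scaling curve: irreducible loss E plus a power-law
    term that shrinks with synthetic-token count (in billions of tokens).
    The form and constants are illustrative, not the paper's fitted law."""
    return E + A / tokens_B ** alpha

# Marginal gain from 10B extra tokens shrinks as the corpus grows,
# mirroring the diminishing returns reported near 300B tokens:
gains = [synthetic_loss(d) - synthetic_loss(d + 10) for d in (10, 100, 300)]
```

Under any curve of this shape, each additional 10B tokens buys less loss reduction than the last, which is the qualitative behavior the tweet summarizes.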
HKUST NLP retweeted
Zhaochen Su @SuZhaochen0110
Excited to share our new survey on the reasoning paradigm shift from "Think with Text" to "Think with Image"! 🧠🖼️ Our work offers a roadmap for more powerful & aligned AI. 🚀 📜 Paper: arxiv.org/pdf/2506.23918 ⭐ GitHub (400+🌟): github.com/zhaochen0110/A…
[4 images]
7 replies · 64 reposts · 159 likes · 16.1K views
HKUST NLP retweeted
Yangqiu Song @yqsong
Thrilled to share a major milestone: the culmination of a 15-month project, ATLAS, a new benchmark in event graphs and conceptualization! This journey began with Probase in 2012, evolved through ASER (2019), AbstractATOMIC (2022), and AbsPyramid (2023), and now realizes a decade-long dream of building a web-scale knowledge graph integrating entities, events, and a conceptualization framework. It's also a tribute to Marvin Minsky's K-Lines Theory.

I still vividly recall 2012, when Haixun, our intern Fangting Xia, and I spent months exploring event semantic primitives and conceptualization, only to pivot due to limitations. Fast forward to the era of large language models, and we've finally brought this vision to life!

Some have argued that knowledge graphs are obsolete in the age of large models. We disagree: it's about having high-quality graphs that scale with these models. That's why we invested HK$1.5M to process Wikipedia, academic abstracts, and 3% of web data from Dolma, creating a 1B-node graph, the best we could achieve. While entities cover 70% of factual queries, adding event semantic primitives boosts coverage to 90%. This is why text decomposition is gaining traction. Our extracted knowledge graph excels on complex factual queries like HotpotQA, MMLU, and FELM, making it the largest open-source GraphRAG dataset available.

This work is the result of 10+ students and nearly 10 collaborators pouring their hearts into it. I'm deeply grateful for their contributions to this meaningful milestone, and I hope they've gained as much from the journey as I have.

Big shoutout to anyone curious about pushing this further! Parsing the entire web's data would cost ~HK$40M, and I'd love to collaborate with bold thinkers to make it happen.

Paper: arxiv.org/abs/2505.23628
Code & Resources: github.com/HKUST-KnowComp…
Project: pypi.org/project/atlas-…

Feedback and critiques are welcome! Let's keep advancing the field together. #KnowledgeGraphs #ATLAS #GraphRAG
[image]
2 replies · 7 reposts · 41 likes · 4.3K views
HKUST NLP retweeted
Junxian He @junxian_he
We studied both rule-based and model-based verifiers and found that each has unique limitations. Rule-based verifiers are often unreliable, even in math, and are unavailable in many domains. Model-based verifiers can be easily hacked. In our paper, we construct simple adversarial patterns to successfully crack several open model-based verifiers. 🧐Interestingly, as the community shifts toward generative verifiers, we found them to be far more vulnerable than discriminative ones. This suggests that discriminative verifiers might be more robust against reward hacking in RLVR.
Quoting Yuzhen Huang @yuzhenh17
🔍 Are Verifiers Trustworthy in RLVR? Our paper, Pitfalls of Rule- and Model-based Verifiers, exposes the critical flaws in reinforcement learning verification for mathematical reasoning.
🔑 Key findings:
1️⃣ Rule-based verifiers miss correct answers, especially when presented in different formats.
2️⃣ Model-based verifiers are susceptible to reward hacking, undermining RL outcomes.
3️⃣ A probe study reveals that most model-based verifiers, particularly generative ones (like those using chain-of-thought reasoning), are highly vulnerable to adversarial attacks.
📄 Paper: arxiv.org/abs/2505.22203
💻 Code & Model & Dataset: github.com/hkust-nlp/RL-V…
Special thanks to my amazing collaborators @AndrewZeng17, Xingshan Zeng, and Qi Zhu, and to our advisor @junxian_he! 🥰

2 replies · 17 reposts · 137 likes · 15.6K views
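Finding 1 (rule-based verifiers missing correct answers in different formats) is easy to reproduce with a toy example. The sketch below is a deliberately naive stand-in, not the verifiers studied in the paper: an exact-string-match checker marks a numerically correct answer wrong, while a value-aware variant recovers it.

```python
from fractions import Fraction

def rule_based_verify(prediction: str, gold: str) -> bool:
    """Naive rule-based verifier: exact string match after trimming.
    A simple stand-in for illustration, not the paper's implementation."""
    return prediction.strip() == gold.strip()

def value_based_verify(prediction: str, gold: str) -> bool:
    """Slightly more robust variant: compare numeric values when both
    strings parse as numbers, falling back to string match otherwise."""
    try:
        return Fraction(prediction.strip()) == Fraction(gold.strip())
    except (ValueError, ZeroDivisionError):
        return prediction.strip() == gold.strip()

# The same correct answer in a different surface form:
rule_based_verify("0.5", "1/2")    # False: exact match misses equivalence
value_based_verify("0.5", "1/2")   # True: numeric comparison recovers it
```

In an RL loop, that false negative becomes a zero reward for a correct rollout, which is one way brittle verifiers distort training signal.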
HKUST NLP retweeted
Yi R. (May) Fung @May_F1_
Great to see the wonderful series of work that @WangCarrey has been leading at UIUC. We also had a fun recent collaboration together with my incoming PhD student Shijue. Check out our latest release 𝘈𝘥𝘢𝘊𝘵𝘳𝘭: 𝘈𝘥𝘢𝘱𝘵𝘪𝘷𝘦 𝘢𝘯𝘥 𝘊𝘰𝘯𝘵𝘳𝘰𝘭𝘭𝘢𝘣𝘭𝘦 𝘙𝘦𝘢𝘴𝘰𝘯𝘪𝘯𝘨 𝘷𝘪𝘢 𝘋𝘪𝘧𝘧𝘪𝘤𝘶𝘭𝘵𝘺-𝘈𝘸𝘢𝘳𝘦 𝘉𝘶𝘥𝘨𝘦𝘵𝘪𝘯𝘨 🔥🔥🔥 arxiv.org/pdf/2505.18822
[image]
Quoting Hongru Wang @HongruWang007
Time flies. My visit to UIUC is coming to an end. I'm sincerely grateful to Prof. Heng Ji @hengjinlp for the invaluable opportunity to join such a warm, collaborative, and world-leading research group. I've learned so much from Prof. Heng and the many amazing students here @qiancheng1231 @xiusi_chen @emrecanacikgoz @BowenJin13 @zhenhailongW @Yuji_Zhang_NLP @Glaciohound.

This experience will always be a defining chapter of my Ph.D. life, filled with friendships, collaborations, and unforgettable moments. Looking back, we truly did something together for the world, from SMART to ToolRL, OTC, RM-R1, ModelingAgent, and DecisionFlow, with even more impactful work on the way. I'll always be cheering for BlenderLab and looking forward to what's next!

2 replies · 8 reposts · 22 likes · 8.3K views
HKUST NLP retweeted
Wenhao Yu @wyu_nd
🚀 We release MMLongBench: a benchmark for evaluating long-context VLMs.
📊 13,331 examples across 5 tasks:
– Visual RAG
– Many-shot ICL
– Needle-in-a-haystack
– VL Summarization
– Long-document VQA
📏 Lengths: 8 / 16 / 32 / 64 / 128K
🔍 Benchmarking both thoroughly & effectively!
[image]
4 replies · 31 reposts · 138 likes · 23.1K views
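Of the five tasks above, needle-in-a-haystack has the simplest construction and is worth sketching. The text-only generator below is a generic illustration of the protocol (bury one key fact at a random depth in distractor text, then probe for it); the filler text, passcode framing, and lengths are all hypothetical, not MMLongBench's actual data pipeline.

```python
import random

def make_niah_example(needle: str, filler_sentences: list[str],
                      target_len: int, seed: int = 0) -> tuple[str, str]:
    """Build a needle-in-a-haystack probe: insert one key fact (the
    "needle") at a random depth inside distractor text, then ask the
    model to retrieve it from the long context."""
    rng = random.Random(seed)
    haystack: list[str] = []
    while sum(len(s) for s in haystack) < target_len:
        haystack.append(rng.choice(filler_sentences))
    depth = rng.randrange(len(haystack) + 1)  # random insertion point
    haystack.insert(depth, needle)
    context = " ".join(haystack)
    question = "What is the secret passcode mentioned in the text?"
    return context, question

ctx, q = make_niah_example(
    needle="The secret passcode is 7316.",
    filler_sentences=["The sky was clear that morning.",
                      "Markets closed slightly higher."],
    target_len=2000,
)
```

Sweeping `target_len` over the benchmark's 8K to 128K range and the insertion depth over the context is what turns this single probe into a retrieval-vs-length curve.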