Haixun Wang
@haixunwang

221 posts

VP Engineering & Head of AI @ EvenUp, ex Instacart, Amazon, Facebook, Google, Microsoft, IBM

USA · Joined April 2011
366 Following · 434 Followers
Haixun Wang@haixunwang·
A little weekend fun during the lobster craze. medium.com/@haixun/one-billion-lobsters-and-a-file-called-soul-md-438d757fe833
Haixun Wang@haixunwang·
The final pages of three books accompanied me into the holidays - 'The Road', 'All the Light We Cannot See', and 'Atonement' (round two, 12 years later). None of them easy. All of them necessary. Weird thing is, reading about humanity's darkest moments while surrounded by twinkling lights and warm cookies didn't feel wrong. It felt right. Not because it made me appreciate the good stuff more, but for how they softly opened my eyes to life's deeper currents. These books might deal with wars, desolation, and betrayal, but strip away the drama and you've got us - all our missed connections, lonely nights, and choices we can't take back. That's the gut punch these writers deliver - they take the stuff we're too numb to notice and make it impossible to ignore. They remind us that even our most ordinary days are anything but. And maybe that's exactly what we need sometimes - a wake-up call to the epic story we're all living, even when it feels like nothing's happening at all.
Yann LeCun@ylecun·
Elon's weird sense of priority:
- The most important thing for humanity is getting to Mars ASAP.
- Trump promised me he'll help (like he promised me he cared about climate change in 2016).
- So I'm supporting him. Nothing else matters.
- I bought Twitter so I could amplify the propaganda. I overpaid for it, and I'm losing tons of money, but nothing is more important than going to Mars.
- Yeah, I'm spewing nonsense all the time, and I bribe folks with $1M checks illegally, but nothing is more important than going to Mars ASAP.
- Democracy be damned.
Elon Musk@elonmusk

True. I think @realDonaldTrump winning makes a big difference in humanity getting to Mars and making life multiplanetary. This might one day save life as we know it.

Haixun Wang@haixunwang·
Thrilled to share our latest blog on e-commerce search & discovery, showcasing how LLMs are reshaping the landscape. Huge thanks to the team for the outstanding effort! Right now, the real value of LLMs isn’t just in responding to user queries in natural language or conversation. It's about using LLMs to enhance existing data, generate new insights, and create richer signals. The biggest value comes from combining the vast world knowledge within LLMs with the proprietary data of e-commerce companies. The next step is to use LLMs to create bespoke presentations of results, based on the query, user intent, and context. But the true impact of this new wave of AI on search and discovery will be a novel system - a generative information retrieval system - that fundamentally addresses the weaknesses of existing IR systems (e.g., the limited number of results we can effectively process, and the high development costs of building, maintaining, improving, and integrating multiple ML models). It will be an exciting ride! tech.instacart.com/supercharging-…
Haixun Wang@haixunwang·
At KDD 2024 keynote speech "AI for Nature: From Science to Impact" by Tanya Berger-Wolf. Wonderful & inspiring talk. It served as a powerful reminder that the true purpose of science goes beyond the scale of foundational models and debates about AGI.
Haixun Wang@haixunwang·
Lately, there’s been a growing chorus claiming Generative AI is the next overhyped tech darling. Critics are lining up to call it just another bubble waiting to burst. Sure, Fortune 500s are struggling to move Gen-AI from the whiteboard to the real world—accuracy, liability, and security are giving them headaches. But let’s get real. Today, AI can write essays that dazzle professors, ace the Math Olympiad, and design novel drugs and vaccines. Two years ago, that was pure sci-fi. So if you, Fortune 500s, are shouting “overhyped” because you can’t make it work, my first question is: Have you truly tried?

Remember Alexa? Yeah, I’ve got five. The real breakthrough was speech-to-text, but what made it a universal product was Amazon putting in the grunt work—hiring armies of people to manually connect actions to those transcriptions. That massive effort laid the foundation for the smart home tech we now take for granted. Comparing the speech-to-text breakthrough back then to the Gen-AI breakthrough today is, if I may over-exaggerate, like comparing driving to the supermarket to landing on the moon. So, if Amazon put in that much effort for Alexa, shouldn’t you, Fortune 500s, be pushing even harder to bring Gen-AI to life?

Now, here’s the second question: Are you even trying the right way? The cold, hard truth: The bubble isn’t Gen-AI itself; it’s the wave of people treating this tech like a cure-all. This isn’t enthusiasm—it’s blind speculation. For example, they think they can skip decades of research on dialogue systems and just throw everything into a large language model, expecting it to magically handle complex conversations. They also believe English can replace SQL for querying databases. Sure, that works for simple queries. But in the real world, SQL statements can run hundreds of lines long. Have you ever bothered to express a query that complex in English? A human language that can naturally convey such intricate commands doesn’t exist.
And yet, we’ve got startups popping up left and right, convinced they can crack this problem without grasping the challenge. As the joke goes, the fastest animals on Earth aren’t cheetahs anymore—they’re the people racing to label themselves AI experts. But they’re also the first to cry foul when Gen-AI doesn’t meet their naive expectations. Here’s the truth: Tech takes time. You don’t master math or stats in three months, and you certainly don't master AI that quickly either. fortune.com/2024/08/06/gen…
Haixun Wang@haixunwang·
"... and the pairing has already been evaluated by all leading LLMs and LVMs ..."
Haixun Wang retweeted
Jeff Dean@JeffDean·
Gemma 2 is now available in 9B and 27B sizes (great for practical deployments) 🎉 These open-source models offer best-in-class performance (often better than significantly larger models), run at incredible speed across different hardware, and easily integrate with other AI tools. Download the weights on @Kaggle and @HuggingFace, or access the models in Google AI Studio. Both sizes also do very well compared to other proprietary and open models on lmsys (although error bars are still largish due to limited votes so far). See more on the blog: blog.google/technology/dev… Tech report: goo.gle/gemma2report Congrats to all the people who have been working hard on this new release!
Haixun Wang@haixunwang·
A succinct summary of a crucial idea and practical approach in marketing optimization, balancing technical depth with clear insights.
Haixun Wang@haixunwang·
Your post, and especially that quote from "The Big Book of Concepts," brought back great memories! When I started the Probase project, I was also intrigued by another book, "Women, Fire, and Dangerous Things" by George Lakoff. I love the title because it shows how seemingly unrelated concepts can form strong relationships in certain contexts. Many psychology books provide mind-blowing ideas and revelations. In addition to those mentioned, "The Language Instinct" by @sapinker and "Metaphors We Live By" by George Lakoff and Mark Johnson significantly influenced the Probase project as well. However, psychological ideas and hypotheses often aren't quantifiable and rely on human-subject tests. One of Probase's goals was to make concepts and conceptualization "computable." For example, we'd like to answer the following questions through inferencing: a) Why are car seats considered chairs, and chairs considered furniture, but car seats aren't considered furniture? b) Why is a robin considered a typical bird, but a penguin isn't? Probase collects concepts and the relationships among them (especially the isA relationship) from web corpora and assigns weights to quantify their strength, which allows us to create inferencing mechanisms that can answer such questions in a quantifiable way. With the rise of deep learning and implicit representations like embeddings, language models have made a huge impact. However, explicit representations still excel in common-sense reasoning and interpretability. The fusion of implicit and explicit representations is a promising direction going forward. It's great to see the impressive and practical progress Yangqiu Song has made in conceptualization! Way to go!!
Yangqiu Song@yqsong

“Concepts are the glue that holds our mental world together.” – Murphy (2004) I started to work on conceptualization in 2010 when I joined #MSRA to work on #Probase (haixun.github.io/probase.html) with Haixun Wang @haixunwang. At that time, we leveraged Probase to conceptualize things in the world through contextualization and composition. We started to wonder: what if we could also conceptualize events? I worked on this with my intern Fangting Xia; however, we ran into many difficulties. Later, when I joined #HKUST, I started my project, #ASER (Activities, States, Events, and their Relations, github.com/HKUST-KnowComp…), with my students Hongming Zhang @hongming110, Xin Liu, and Haojie Pan. We used IE to find a large number of discourse relations for events, and we showed that we can transfer such knowledge to commonsense knowledge, as in #TransOMCS (github.com/HKUST-KnowComp…) and #DISCOS (github.com/HKUST-KnowComp…). In ASER, we started to build on our idea of conceptualization. Then we initiated a new project, #AbstractATOMIC (github.com/HKUST-KnowComp…), in which we conceptualize both entities and events in the #ATOMIC knowledge base. Now, I am so happy that the paper has finally been accepted by Artificial Intelligence (doi.org/10.1016/j.arti…). Big thanks to my students Mutian He, Tianqing Fang @TFang229, and Weiqi Wang @MightyWeaver2. In fact, the review took 2 years, but we kept working on conceptualization without waiting for the outcome.
Here is the list of what we have been doing: On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation (github.com/HKUST-KnowComp…) AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph (github.com/HKUST-KnowComp…) 🕯️CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning (github.com/HKUST-KnowComp…) 🚗CAR: Conceptualization-Augmented Reasoner for Zero-Shot Commonsense Question Answering (github.com/HKUST-KnowComp…) 🐈CAT: A Contextualized Conceptualization and Instantiation Framework for Commonsense Reasoning (github.com/HKUST-KnowComp…) Moving forward, we are now working on something beyond social and physical knowledge reasoning, which we call metaphysical reasoning (github.com/HKUST-KnowComp…). The essential building block of such reasoning is again, conceptualization. I am really excited about the development we have done. It's something like your 10+ year dream coming true. I really appreciate all my collaborators, especially my genius students working on related stuff, trusting me and pushing this direction to be even more interesting!
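The weighted isA inference described in the posts above can be sketched in a few lines of Python. The weights below are hypothetical illustrations of the idea, not actual Probase data:

```python
# Minimal sketch of weighted isA inference in the spirit of Probase.
# All edge weights are hypothetical, chosen only to illustrate the idea.

# isA edges mined from web corpora: (instance, concept) -> strength
ISA = {
    ("robin", "bird"): 0.9,
    ("penguin", "bird"): 0.2,         # rarely cited as an example of "bird"
    ("car seat", "chair"): 0.6,
    ("chair", "furniture"): 0.9,
    ("car seat", "furniture"): 0.05,  # direct corpus evidence is very weak
}

def typicality(instance: str, concept: str) -> float:
    """Direct weighted isA strength (0 if no observed edge)."""
    return ISA.get((instance, concept), 0.0)

def chained(instance: str, mid: str, concept: str) -> float:
    """Two-hop isA strength, discounted by multiplying edge weights."""
    return typicality(instance, mid) * typicality(mid, concept)

# a) "car seat isA chair" and "chair isA furniture" hold, but the chained
#    score is discounted and the direct edge is weaker still, so isA does
#    not blindly transit: car seats aren't furniture.
# b) robin scores far higher than penguin under "bird", so robin is the
#    typical bird.
```

Because every edge carries a corpus-derived weight, questions like "is a penguin a typical bird?" become comparisons of scores rather than yes/no lookups, which is what makes the inference quantifiable.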

Haixun Wang@haixunwang·
No. It won't work. Real intelligence & creativity hinge on creating surprises. A good story captivates because it delivers believable surprises. Does the story follow logic? Yes; otherwise, it wouldn't be believable. However, by blindly following logic, you would have to produce a million examples to land one good surprise. As a result, your new model, trained on an abundance of mediocre data, won't be able to deliver real creativity.
Bindu Reddy@bindureddy

Here is how to build super intelligence in one straight shot:
Step 1 - Get a GPU super cluster and train a set of foundation models.
Step 2 - Have these foundation models curate, clean, and generate a bunch of datasets based on logic constraints and heuristics.
Step 3 - Build the next generation of LLMs based on these datasets.
Step 4 - Use the new LLMs to generate datasets that address harder and more complex problems and the steps required to create them. You also need to set up mechanisms to validate these datasets, including using humans when it makes sense.
Step 5 - Repeat 3 and 4 until you get to super intelligence.

Haixun Wang@haixunwang·
The dominance of AI generated text is akin to a culinary landscape where there are fewer surprising dishes created by imaginative, eccentric, or accidentally inspired chefs, and more uniform outlets like Starbucks or McDonald’s. haixun.medium.com/the-homogeniza…
Haixun Wang@haixunwang·
Highly unlikely. Real-life SQL queries are often too complex to convert into English, let alone convert from English. Practically speaking, English doesn't have the expressive power of SQL when it comes to complex data manipulation; that's why we invented programming languages.
Bindu Reddy@bindureddy

SQL Has Been Replaced by English You don't need to learn SQL anymore; you can simply learn English 🤣 Our platform users have been using this feature for the last 18 months.

Haixun Wang@haixunwang·
We are thrilled to announce a guest lecture by Professor Tao Yu from the University of Hong Kong, focusing on the forefront of AI technology—Multimodality in AI agents and their evaluation. Date: Wednesday, April 10, 2024 Time: 4:00–5:00 pm PT tech.instacart.com/instacart-dist…
Haixun Wang@haixunwang·
@bindureddy > After that point the research papers around LLMs started to have less and less detail. To the point that some have become mere announcements or advertisements rather than research papers.
Bindu Reddy@bindureddy·
Just a year ago, SOTA AI research was shared and published freely! This culture led to the invention of LLMs. Google shared their breakthrough around transformers freely and openly with the rest of the world. OpenAI wouldn’t have invented the GPT architecture and LLMs without transformers, and the rest is history! Now the closed AI labs have become abnormally tight-lipped and secretive about their research, are stifling innovation, and seem focused on concentrating power! 😿 The sudden shift in culture was initiated by OpenAI when they realized how powerful and lucrative AI can be with GPT 3.0. After that point, the research papers around LLMs started to have less and less detail. Our only hope is the small but passionate open-source community, along with Meta and now Musk! Hopefully open source will catch up to closed, and we will restore the culture of innovation and work towards building safe AGI. Few people understand how important this is to ensuring a bright future for humanity🤞
Haixun Wang@haixunwang·
Indeed, humans are the funny part in the rise of AI. But so long as they are in the loop, machines are serving humanity. More amusing still is AI agents using human languages for communication, turning a medium often criticized for its ambiguity into a mechanism of universality.
Bindu Reddy@bindureddy

It’s kinda funny that humans take all this trouble to create complex PDFs with all kinds of figures and tables in them… just so the PDFs can be fed to LLMs that deconstruct these figures and tables back into plain English 🤷‍♀️ Soon AI models will create these PDFs and other AI models will deconstruct them - completely eliminating the human in the loop

Haixun Wang@haixunwang·
An interesting read on @timberners_lee amidst the government's attempt to ban TikTok, for the wrong reason. medium.com/@timberners_lee/marking-the-webs-35th-birthday-an-open-letter-ebb410cc7d42