Dimitris Bertsimas

53 posts

Dimitris Bertsimas
@dbertsim

MIT professor, analytics, optimizer, machine learner, entrepreneur, philatelist

Belmont, MA · Joined August 2017
92 Following · 2.9K Followers
StopAntisemitism
StopAntisemitism@StopAntisemites·
What's it like being an MIT Jewish student since 10/7? Get ready to have your jaw drop.
Oct. 8: Not even 24 hours after the worst attack on Jews since the Holocaust, MIT's Coalition Against Apartheid (CAA) blames the massacre on ... Israel. No one is reprimanded.
Oct. 13: CAA pushes for a "global day of Jihad". No one is reprimanded.
Oct. 19: Students are filmed chanting "One Solution: Intifada Revolution" and "From the river to the sea, Palestine will be free." No one is reprimanded.
Oct. 23: Protesters disrupt classes, waving Palestinian flags and accusing MIT, Israel, and the U.S. of "genocide". No one is reprimanded.
Nov. 2: CAA members with a bullhorn and a drum roam through campus shouting anti-Israel slogans and barge into the President's office; they are escorted by the MIT police. No one is reprimanded.
Nov. 7: A DEI staff member tells a Jewish student they are "not a protected class". DEI staff member Sophia Hasenfus even helped organize CAA rallies and openly endorsed statements justifying Hamas's terror attack.
Nov. 9: CAA blockades Lobby 7, a major thoroughfare through campus. No one is reprimanded.
Nov. 12: An even larger group of protesters marches across the Mass. Ave. bridge and gathers at the Institute's main entrance. No one is reprimanded.
Nov. 13: MIT President Sally Kornbluth refuses to adopt the IHRA Definition of Antisemitism.
Nov. 16: The "Planetary Health: Indigenous Land, Peoples and Bodies" event takes place, led by an interfaith chaplain who states that Palestinians are being "wrongfully subjugated and oppressed by racist white European colonizers." No one is reprimanded.
Dec. 5: MIT President Sally Kornbluth testifies miserably in front of Congress.
Dec. 6: A man harasses people at the Hillel building and urinates on it. No one is reprimanded.
Dec. 13: Algorithms lecturer Mauricio Karchmer resigns over MIT's failure of its Jewish students.
Dec. 21: CAA offers support for Yemeni Houthi terrorists in a social media post. No one is reprimanded.
Jan. 24: A campus-wide initiative on hate comes out stressing Islamophobia.
Feb. 12: CAA again protests on campus and FINALLY gets a temporary suspension, 4 months after terrorizing Jews on campus.
Feb. 18: CAA crashes an MLK event. No one is reprimanded.
Dimitris Bertsimas retweeted
Wes Gurnee
Wes Gurnee@wesg52·
Precision and recall can also be helpful guides, and remind us that it should not be assumed a model will learn to represent features in an ontology convenient or familiar to humans.
Wes Gurnee tweet media
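As a toy illustration of that point (my own sketch, not from the paper): if "activation above a threshold" is treated as a classifier for a feature, the neuron can be scored with precision and recall directly. The "French" neuron and all numbers below are hypothetical.

```python
def neuron_precision_recall(activations, feature_labels, threshold):
    """Score 'activation > threshold' as a detector for the feature."""
    fired = [a > threshold for a in activations]
    tp = sum(f and y for f, y in zip(fired, feature_labels))        # fired, feature present
    fp = sum(f and not y for f, y in zip(fired, feature_labels))    # fired, feature absent
    fn = sum(not f and y for f, y in zip(fired, feature_labels))    # silent, feature present
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical "French" neuron: fires on 3 of 4 French tokens plus 1 English token.
acts = [2.1, 0.1, 1.8, 2.4, 0.3, 1.2]
is_french = [True, True, True, True, False, False]
p, r = neuron_precision_recall(acts, is_french, threshold=1.0)
# p == 0.75, r == 0.75
```

A neuron with high precision but low recall detects a narrower concept than the label; high recall but low precision suggests it fires for a broader, possibly human-unfamiliar, feature.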
Dimitris Bertsimas retweeted
Wes Gurnee
Wes Gurnee@wesg52·
While we found tons of interesting neurons with sparse probing, it requires careful follow up analysis to draw more rigorous conclusions. E.g., athlete neurons turn out to be more general sport neurons when analyzing max average activating tokens.
Wes Gurnee tweet media
Dimitris Bertsimas retweeted
Wes Gurnee
Wes Gurnee@wesg52·
What happens with scale? We find representational sparsity increases on average, but different features obey different scaling dynamics. In particular, quantization and neuron splitting: features both emerge and split into finer grained features.
Wes Gurnee tweet media
Dimitris Bertsimas retweeted
Wes Gurnee
Wes Gurnee@wesg52·
Results in toy models from @AnthropicAI and @ch402 suggest a potential mechanistic fingerprint of superposition: large MLP weight norms and negative biases. We find a striking drop in early layers in the Pythia models from @AiEleuther and @BlancheMinerva.
Wes Gurnee tweet media
Dimitris Bertsimas retweeted
Wes Gurnee
Wes Gurnee@wesg52·
Early layers seem to use sparse combinations of neurons to represent many features in superposition. That is, using the activations of multiple polysemantic neurons to boost the signal of the true feature over all interfering features (here “social security” vs. adjacent bigrams)
Wes Gurnee tweet media
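A numerical sketch of that picture (mine, not from the thread): pack many more feature directions than neurons into the activation space, then check that reading out along the matched direction recovers the true feature over the interference from all the others. The dimensions and feature index are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)

n_neurons, n_features = 32, 256  # far more features than neurons
# One random, nearly-orthogonal unit direction per feature.
dirs = rng.normal(size=(n_features, n_neurons))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

true_feature = 5
act = dirs[true_feature]  # activation when only feature 5 is present

# Dot the activation against every feature direction: the matched readout
# is exactly 1.0, while interfering features contribute only
# O(1/sqrt(n_neurons)) each, so the true feature wins the argmax.
readouts = dirs @ act
recovered = int(np.argmax(readouts))
```

The same readout run on a single neuron's activation instead of the full vector would be swamped by interference, which is one way to see why individual neurons in such a regime look polysemantic.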
Dimitris Bertsimas retweeted
Wes Gurnee
Wes Gurnee@wesg52·
But what if there are more features than there are neurons? This results in polysemantic neurons which fire for a large set of unrelated features. Here we show a single early layer neuron which activates for a large collection of unrelated n-grams.
Wes Gurnee tweet media
Dimitris Bertsimas retweeted
Wes Gurnee
Wes Gurnee@wesg52·
Neural nets are often thought of as feature extractors. But what features are neurons in LLMs actually extracting? In our new paper, we leverage sparse probing to find out arxiv.org/abs/2305.01610. A 🧵:
Wes Gurnee tweet media
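A rough illustration of the idea, not the paper's actual method: a sparse probe predicts whether a feature is present while being restricted to a small number k of neurons. Below is a minimal sketch on synthetic activations, where the neuron-scoring heuristic (mean activation difference) and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 1000 tokens x 64 "neurons"; neuron 7 noisily encodes the feature.
n_tokens, n_neurons = 1000, 64
acts = rng.normal(size=(n_tokens, n_neurons))
labels = (acts[:, 7] + 0.5 * rng.normal(size=n_tokens)) > 0.8

def sparse_probe(acts, labels, k):
    """Select the k most feature-aligned neurons, then probe with only those."""
    # Score each neuron by its mean activation gap between positive and
    # negative tokens, and keep the top k.
    scores = np.abs(acts[labels].mean(axis=0) - acts[~labels].mean(axis=0))
    top_k = np.argsort(scores)[-k:]
    # Simple probe: threshold the summed activation of the selected neurons
    # so the predicted positive rate matches the base rate.
    signal = acts[:, top_k].sum(axis=1)
    preds = signal > np.quantile(signal, 1 - labels.mean())
    return top_k, (preds == labels).mean()

neurons, acc = sparse_probe(acts, labels, k=1)  # recovers neuron 7
```

With k = 1 this tests whether any single neuron suffices; sweeping k up from 1 probes how distributed the feature's representation is.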
Dimitris Bertsimas retweeted
Wes Gurnee
Wes Gurnee@wesg52·
One large family of neurons we find are “context” neurons, which activate only for tokens in a particular context (French, Python code, US patent documents, etc). When deleting these neurons the loss increases in the relevant context while leaving other contexts unaffected!
Wes Gurnee tweet media
Dimitris Bertsimas
Dimitris Bertsimas@dbertsim·
As part of HIAS, and together with Professor Georgios Stamou from NTUA, Greece, we are offering a course on Universal AI (in English, free of charge) aicourse2023.hellenic-ias.org on July 3-5, 2023 in Athens, Greece. Prospective participants can declare their interest on the website.
Dimitris Bertsimas retweeted
Ryan Cory-Wright
Ryan Cory-Wright@RyanCoryWright·
Delighted to share that our paper "A new perspective on low-rank optimization" has just been accepted for publication by Math Programming! Valid and often strong lower bounds on low-rank problems via a generalization of the perspective reformulation from mixed-integer optimization.
Ryan Cory-Wright@RyanCoryWright

Excited to share a new paper with @dbertsim and Jean Pauphilet on a matrix perspective reformulation technique for strong relaxations of low-rank problems. Applications in reduced-rank regression and D-optimal experimental design: optimization-online.org/DB_HTML/2021/0…

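For background (this is the standard scalar construction from mixed-integer optimization, not a claim about the paper's matrix generalization): the perspective reformulation tightens relaxations of on/off terms, where a binary indicator z forces x = 0 whenever z = 0, by replacing f with its perspective.

```latex
% On/off set: S = \{ (x, z, t) : z \in \{0,1\},\ f(x) \le t \text{ if } z = 1,\ x = 0 \text{ if } z = 0 \}
% Its convex hull is described by the perspective of f:
z\, f\!\left(\frac{x}{z}\right) \le t, \qquad 0 \le z \le 1.
% For example, with f(x) = x^2 this becomes the rotated second-order cone constraint
x^2 \le z\, t .
```

The payoff is that the continuous relaxation (0 ≤ z ≤ 1) of the perspective formulation is much tighter than simply relaxing z in the naive big-M or z·f(x) formulation.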
Dimitris Bertsimas retweeted
Ryan Cory-Wright
Ryan Cory-Wright@RyanCoryWright·
📢New preprint alert! arxiv.org/abs/2303.07695 We use sampling schemes and clustering to improve the scalability of deterministic Benders decomposition on data-driven network design problems, while maintaining optimality. w/ @dbertsim, Jean Pauphilet, and Periklis Petridis
Dimitris Bertsimas
Dimitris Bertsimas@dbertsim·
The paper presents a novel holistic deep learning framework that improves accuracy, robustness, sparsity, and stability over standard deep learning models, as demonstrated by extensive experiments on both tabular and image data sets. arxiv.org/abs/2110.15829
Dimitris Bertsimas
Dimitris Bertsimas@dbertsim·
My book with David Gamarnik, "Queueing Theory: Classical and Modern Methods", was published. It was a long journey that lasted two decades, but both of us are delighted with the journey's completion. For more details see dynamic-ideas.com/books/quueing-…
Dimitris Bertsimas tweet media
Dimitris Bertsimas retweeted