Mohammad Rifat Arefin

41 posts

@mo_rifat

CS PhD Student at @utarlington | Software Engineering | Program Analysis

Arlington, TX, Joined June 2022
470 Following, 65 Followers
Pinned Tweet
Mohammad Rifat Arefin
Mohammad Rifat Arefin@mo_rifat·
#icse2024 Introducing TreeVada, another tool in the line of grammar inference built on top of Arvada (ASE '21). In TreeVada we exploit the bracket-guided nesting of programs as a hint of the target grammar. We propose a deterministic approach based on Arvada's bubble and merge grammar generalization strategy which turns out to capture better grammar in a shorter time in empirical comparison.
2
1
4
463
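To make "bracket-guided nesting" concrete, here is a minimal sketch of how the bracket structure of an input can be turned into a nested tree. It is an illustration only, not TreeVada's implementation: the function name `bracket_tree`, the bracket set in `PAIRS`, and the character-level granularity are all assumptions made for this example.

```python
# Hypothetical illustration: group the characters of an input into nested
# lists that follow its bracket structure. Not taken from TreeVada itself.
PAIRS = {"(": ")", "[": "]", "{": "}"}  # assumed bracket set


def bracket_tree(text):
    """Return a nested-list tree whose nesting mirrors the brackets in text."""
    root = []
    stack = [root]   # stack[-1] is the node currently being filled
    expected = []    # closing brackets we are still waiting for
    for ch in text:
        if ch in PAIRS:
            # An opening bracket starts a new child node.
            node = [ch]
            stack[-1].append(node)
            stack.append(node)
            expected.append(PAIRS[ch])
        elif expected and ch == expected[-1]:
            # The matching closer ends the current node.
            stack[-1].append(ch)
            stack.pop()
            expected.pop()
        else:
            # Ordinary characters (and unmatched closers) stay as leaves.
            stack[-1].append(ch)
    return root
```

For example, `bracket_tree("a(b[c])d")` returns `['a', ['(', 'b', ['[', 'c', ']'], ')'], 'd']`. Each inner list is a candidate subtree that a grammar inference tool could treat as a hint for a nonterminal.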
Mohammad Rifat Arefin retweeted
ICML Conference
ICML Conference@icmlconf·
Get feedback on your #ICML2026 submission from Google's Gemini-based paper assistant tool, an advanced LLM tool specialized to assist with ICML submissions. Live now! More details in the blog post below 👇
6
40
322
73.1K
Mohammad Rifat Arefin retweeted
Wes Roth
Wes Roth@WesRoth·
Geoffrey Hinton explains that large language models are nothing like traditional software written line by line. Instead of explicit instructions, they rely on code that teaches them how to learn from data. What actually emerges is billions or trillions of learned connection strengths that no one can directly interpret. This makes their internal workings fundamentally opaque, much like the human brain.
44
114
722
49.6K
Mohammad Rifat Arefin retweeted
Rohan Paul
Rohan Paul@rohanpaul_ai·
Fei-Fei Li (@drfeifei) on limitations of LLMs. "There's no language out there in nature. You don't go out in nature and there's words written in the sky for you.. There is a 3D world that follows laws of physics." Language is purely generated signal.
Rohan Paul@rohanpaul_ai

Columbia CS Prof explains why LLMs can’t generate new scientific ideas. Bcz LLMs learn a structured “map”, Bayesian manifold, of known data and work well within it, but fail outside it. But true discovery means creating new maps, which LLMs cannot do.

178
550
3.7K
1.2M
Mohammad Rifat Arefin retweeted
Percy Liang
Percy Liang@percyliang·
You spend $1B training a model A. Someone on your team leaves and launches their own model API B. You're suspicious. Was B derived (e.g., fine-tuned) from A? But you only have blackbox access to B... With our paper, you can still tell with strong statistical guarantees (p-values < 1e-8). Idea: test for independence of A's training data order with likelihoods under B. There are crazy amounts of metadata about training process baked into the model that can't be washed out, like a palimpsest...
Sally Zhu@SallyHZhu

🔎Did someone steal your language model? We can tell you, as long as you shuffled your training data🔀. All we need is some text from their model! Concretely, suppose Alice trains an open-weight model and Bob uses it to produce text. Can Alice prove Bob used her model?🚨

54
210
2.4K
380.8K
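The idea in the tweet above, testing whether A's training data order is independent of likelihoods under B, can be sketched as a rank-correlation permutation test. This is a rough illustration under stated assumptions, not the paper's procedure: the tweet does not say which statistic is used, so Spearman correlation, the permutation scheme, and the function names `ranks`, `spearman`, and `permutation_p_value` are all choices made for this example. Ties are not handled.

```python
import random


def ranks(values):
    """Rank each value (0 = smallest). Assumes no ties, for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order):
        result[index] = float(rank)
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(train_order, loglik_under_b, trials=2000, seed=0):
    """Estimate how often shuffled likelihoods correlate with the training
    order at least as strongly as the observed ones do.

    train_order:     position of each example in A's training run (assumed known)
    loglik_under_b:  log-likelihood of the same examples under model B (assumed queryable)
    """
    rng = random.Random(seed)
    observed = abs(spearman(train_order, loglik_under_b))
    shuffled = list(loglik_under_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(spearman(train_order, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)
```

A small p-value suggests that B's likelihoods carry information about A's training order, which would be unexpected if B were trained independently. A permutation estimate cannot go below 1 / (trials + 1), so p-values as small as the 1e-8 quoted in the tweet would need an analytic or asymptotic test rather than this sketch.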
Mohammad Rifat Arefin retweeted
Ruben Hassid
Ruben Hassid@rubenhassid·
→ 70% of AI agents' tasks are repetitive, narrow operations like parsing commands, formatting outputs, and routing requests. Using 175 billion parameter models for these tasks is like hiring a PhD to flip burgers.
1
2
30
4.6K
Mohammad Rifat Arefin retweeted
Daniel Kang
Daniel Kang@ddkang·
The prevailing wisdom is that compute is the most important factor for frontier AI training. We think this is wrong: data is the most costly and important component of AI training. We collected estimates of revenue for major data labeling companies and compared them with the marginal compute cost for training top models in 2024. Our estimates show that data labeling costs are ~3x higher than the marginal training compute cost. 1/8
51
162
1.1K
154.2K
Mohammad Rifat Arefin retweeted
Jürgen Schmidhuber
Jürgen Schmidhuber@SchmidhuberAI·
Who invented convolutional neural networks (CNNs)?

1969: Fukushima had CNN-relevant ReLUs [2].
1979: Fukushima had the basic CNN architecture with convolution layers and downsampling layers [1]. Compute was 100 x more costly than in 1989, and a billion x more costly than today.
1987: Waibel applied Linnainmaa's 1970 backpropagation [3] to weight-sharing TDNNs with 1-dimensional convolutions [4].
1988: Wei Zhang et al. applied "modern" backprop-trained 2-dimensional CNNs to character recognition [5].
All of the above was published in Japan 1979-1988.
1989: LeCun et al. applied CNNs again to character recognition (zip codes) [6,10].
1990-93: Fukushima's downsampling based on spatial averaging [1] was replaced by max-pooling for 1-D TDNNs (Yamaguchi et al.) [7] and 2-D CNNs (Weng et al.) [8].
2011: Much later, my team with Dan Ciresan made max-pooling CNNs really fast on NVIDIA GPUs. In 2011, DanNet achieved the first superhuman pattern recognition result [9]. For a while, it enjoyed a monopoly: from May 2011 to Sept 2012, DanNet won every image recognition challenge it entered, 4 of them in a row. Admittedly, however, this was mostly about engineering & scaling up the basic insights from the previous millennium, profiting from much faster hardware.

Some "AI experts" claim that "making CNNs work" (e.g., [5,6,9]) was as important as inventing them. But "making them work" largely depended on whether your lab was rich enough to buy the latest computers required to scale up the original work. It's the same as today. Basic research vs engineering/development - the R vs the D in R&D.

REFERENCES
[1] K. Fukushima (1979). Neural network model for a mechanism of pattern recognition unaffected by shift in position - Neocognitron. Trans. IECE, vol. J62-A, no. 10, pp. 658-665, 1979.
[2] K. Fukushima (1969). Visual feature extraction by a multilayered network of analog threshold elements. IEEE Transactions on Systems Science and Cybernetics. 5 (4): 322-333. This work introduced rectified linear units (ReLUs), now used in many CNNs.
[3] S. Linnainmaa (1970). Master's Thesis, Univ. Helsinki, 1970. The first publication on "modern" backpropagation, also known as the reverse mode of automatic differentiation. (See Schmidhuber's well-known backpropagation overview: "Who Invented Backpropagation?")
[4] A. Waibel. Phoneme Recognition Using Time-Delay Neural Networks. Meeting of IEICE, Tokyo, Japan, 1987. Backpropagation for a weight-sharing TDNN with 1-dimensional convolutions.
[5] W. Zhang, J. Tanida, K. Itoh, Y. Ichioka. Shift-invariant pattern recognition neural network and its optical architecture. Proc. Annual Conference of the Japan Society of Applied Physics, 1988. First backpropagation-trained 2-dimensional CNN, with applications to English character recognition.
[6] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel: Backpropagation Applied to Handwritten Zip Code Recognition, Neural Computation, 1(4):541-551, 1989. See also Sec. 3 of [10].
[7] K. Yamaguchi, K. Sakamoto, A. Kenji, T. Akabane, Y. Fujimoto. A Neural Network for Speaker-Independent Isolated Word Recognition. First International Conference on Spoken Language Processing (ICSLP 90), Kobe, Japan, Nov 1990. A 1-dimensional convolutional TDNN using Max-Pooling instead of Fukushima's Spatial Averaging [1].
[8] Weng, J., Ahuja, N., and Huang, T. S. (1993). Learning recognition and segmentation of 3-D objects from 2-D images. Proc. 4th Intl. Conf. Computer Vision, Berlin, pp. 121-128. A 2-dimensional CNN whose downsampling layers use Max-Pooling (which has become very popular) instead of Fukushima's Spatial Averaging [1].
[9] In 2011, the fast and deep GPU-based CNN called DanNet (7+ layers) achieved the first superhuman performance in a computer vision contest. See overview: "2011: DanNet triggers deep CNN revolution."
[10] How 3 Turing awardees republished key methods and ideas whose creators they failed to credit. Technical Report IDSIA-23-23, Swiss AI Lab IDSIA, 14 Dec 2023. See also the YouTube video for the Bower Award Ceremony 2021: J. Schmidhuber lauds Kunihiko Fukushima.
86
408
2.4K
616.3K
Mohammad Rifat Arefin retweeted
Alex Vacca
Alex Vacca@itsalexvacca·
Before AWS existed, one company ran the servers for Twitter, LinkedIn, and Facebook's entire app ecosystem. They owned Node.js, invented containers 8 years before Docker, and Peter Thiel even backed them. Then something happened...
238
2.3K
20.1K
3.3M
Mohammad Rifat Arefin retweeted
Marc Brooker
Marc Brooker@MarcJBrooker·
In this month's ACM Queue, @ankushpd and I write about some of the methods and tools we apply to systems correctness at AWS: from testing, to simulation, to fault injection, to formal proofs.
7
59
324
43.1K
Mohammad Rifat Arefin retweeted
Dilly Hussain
Dilly Hussain@DillyHussain88·
Saiful Azam is a Bengali fighter pilot, formerly of the Pakistan Air Force, who holds the record of shooting down the most Israeli warplanes in history (4) during the 1967 war. Azam also shot down one Indian warplane in the 1965 Indo-Pakistani War. x.com/RyanRozbiani/s…
81
493
2.7K
91.6K
Mohammad Rifat Arefin retweeted
Shalev
Shalev@Shalev_lif·
Best poster moment at #NeurIPS2024
28
720
10.5K
376.6K
Mohammad Rifat Arefin retweeted
Archit Sharma
Archit Sharma@archit_sharma97·
don’t email me unless you have this kind of ambition
126
67
3.4K
352.3K
Mohammad Rifat Arefin retweeted
Joelle Pineau
Joelle Pineau@jpineau1·
I’m excited to share a new AI Coding Competition from Meta and Microsoft Research building on Meta’s annual Hacker Cup! The most capable LLMs to date will be challenged to solve these questions. We invite major players in the code generation space to join.
Soumith Chintala@soumithchintala

Hacker Cup – one of the preeminent coding competitions started an AI track w/ Meta & Microsoft
problems are hardddd – only a handful of engineers reliably solve them – requires deep algorithmic knowledge, reasoning, planning and fast execution – to solve 5 problems in 30 minutes, multiple times over 5 rounds.
As of today, AI fails miserably. We hope this competition can be a big venue that rallies academic labs, industrial researchers and students to meaningfully advance AI coding capabilities.
Would love to see more major players in the code generation space join the Hacker Cup AI code competition – @openAI, @anthropicAI, @googledeepmind, @cognition_labs @cursor_ai @replit @sourcegraphCody @codeiumdev.
Register for the practice round Starting 20th: facebook.com/codingcompetit…
The Discord community has grown to >2k folks. Partnering with Autogen, Langchain and SWE-agent – there are multiple starter kits showcasing baseline accuracy of modern models on Hacker Cup problems.
Join the Discord community for further rules and details discord.gg/gSupVz7bMc

4
37
240
60.3K
Mohammad Rifat Arefin retweeted
Rep. Alexandria Ocasio-Cortez
Dozens of protestors have been killed by Bangladeshi authorities in recent days. My constituents cannot reach their loved ones due to a government implemented communications blackout. I call for an end to the blackout and de-escalation of violence against protestors.
Reuters@Reuters

Telecommunications were widely disrupted in Bangladesh amid violent student protests against quotas for government jobs in which nearly two dozen people have been killed this week reut.rs/3WvJQlD

219
297
654
81.5K