Ali

1.8K posts

@ms802x

Passionate Electrical Engineer; AI Engineer @humainai

Ahsa, Kingdom of Saudi Arabia · Joined May 2014
971 Following · 124 Followers
Ali retweeted
Tareq Amin @TareqAmin_
Extremely proud of the team behind @HUMAIN’s #ALLAM 34B, our Arabic-first large language model (LLM), which has ranked #2 just behind GPT-5.2 on the BALSAM Leaderboard, a benchmark measuring global Arabic AI maturity.
SPA Science @SPA_sci

The King Salman Global Academy for the Arabic Language issues the second report of the "BALSAM" index for evaluating Arabic-language AI technologies. spa.gov.sa/ar/w2513529 #واس

Ali retweeted
KAUST @KAUST_News
KAUST PhD student Kaja Gruntkowska has been awarded a @Google PhD Fellowship, becoming the first-ever recipient from the GCC countries. Recognized for her work in Algorithms and Optimization, her research advances both the theory and practice of optimization for machine learning, making AI training faster, more cost-effective, and more resource-efficient.
Ali retweeted
Akshit @akshitwt
This is a great read by Jack Morris; it gives such a fresh perspective on what really matters. TL;DR: with every "new" architecture we unlocked a new source of data to use at scale. It was the large amount of data we unlocked that boosted performance, not the architecture itself, and video is the next big thing to harness.

2012: AlexNet unlocked the entire ImageNet dataset
2017: Transformers unlocked the entire internet (as text)
2022: RLHF unlocked learning from humans
2024: reasoning unlocked learning from verifiers
2026: ???

When you look at progress this way, it becomes very clear what the next pillar to unlock is: video, or more specifically, YouTube. YouTube stores an insane amount of video data: people upload 720,000 hours of video to the platform every single day. That's 4.3 petabytes of new data that need to be stored every day. For comparison, current models are trained on a few terabytes of text, which means the data uploaded to YouTube daily is 1000x the data used to train a typical LLM. Once we come up with an architecture that can harness video at scale, we will see the next big jump in our quest for AGI.
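The tweet's numbers can be sanity-checked with a quick back-of-envelope script. The hours-per-day and petabytes-per-day figures below are the tweet's claims, and the ~4 TB text-corpus size is an assumption standing in for "a few terabytes":

```python
# Back-of-envelope check of the tweet's figures (claims, not measurements).
HOURS_PER_DAY = 720_000          # claimed daily upload volume
DAILY_BYTES = 4.3e15             # claimed 4.3 PB of new storage per day

seconds = HOURS_PER_DAY * 3600
implied_bitrate_mbps = DAILY_BYTES * 8 / seconds / 1e6
print(f"implied average bitrate: {implied_bitrate_mbps:.1f} Mbit/s")

# Assumption: ~4 TB of training text for a typical LLM corpus.
TEXT_CORPUS_BYTES = 4e12
print(f"daily video / text corpus: {DAILY_BYTES / TEXT_CORPUS_BYTES:.0f}x")
```

The implied average bitrate (~13 Mbit/s) is in the plausible range for HD uploads, so the two claims are at least internally consistent, and the ~1000x ratio against a few-terabyte text corpus checks out.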
Ali @ms802x
@xichen_pan Nice work! Would it be possible to share the code for inference and training?
Xichen Pan @xichen_pan
We find that training unified multimodal understanding and generation models is so easy that you do not need to tune MLLMs at all. The MLLM's knowledge, reasoning, and in-context learning can be transferred from multimodal understanding (text output) to generation (pixel output) even when it is FROZEN!
Ali retweeted
Ostris @ostrisai
I do not get "vibe coding". Maybe people are just doing much more LLM-friendly tasks than I am, but no matter what LLM I use, 99% of the time I spend 100x more time fighting with it to keep it from doing stupid things, and then eventually resort to just doing it myself.
Ali retweeted
ℏεsam @Hesamation
The only way you can learn how to work SMART is by working really HARD.
Ali retweeted
يوسف القوس @y_algoos
The story of Yousef is the story of a nation 🇸🇦, one founded on camel-back 🐪 that has become a homeland the whole world dreams of living in. In celebration of #Year_of_the_Camel_2024, I told my story, from the desert to the world of #semiconductors, in the graduates' speech 🎓 @KAUST_NewsAR. Yesterday's dream has become today's reality, by the grace of God and through our nation's investment in the minds of its sons to drive Vision 2030 🇸🇦.
Ali retweeted
SDAIA @SDAIA_SA
We ask God Almighty, Lord of the Mighty Throne, to grant the #Custodian_of_the_Two_Holy_Mosques continued health and wellness.
Ali retweeted
Sebastian Raschka @rasbt
The Llama 3.2 1B and 3B models are my favorite LLMs -- small but very capable. If you want to understand what the architectures look like under the hood, I implemented them from scratch (one of the best ways to learn): github.com/rasbt/LLMs-fro…
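As a small taste of what implementing these architectures from scratch involves, here is a minimal NumPy sketch of RMSNorm, the normalization layer used throughout Llama-family models. This is an illustrative re-derivation, not code from the linked repository:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-5):
    """RMSNorm as used in Llama-style models: divide by the root-mean-square
    of the activations, then apply a learned per-channel gain."""
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

# Toy usage: normalize a batch of 2 vectors of hidden size 4.
x = np.array([[1.0, 2.0, 3.0, 4.0],
              [0.5, 0.5, 0.5, 0.5]])
weight = np.ones(4)          # learned gain; identity here for illustration
out = rms_norm(x, weight)
print(np.sqrt(np.mean(out * out, axis=-1)))  # ~1.0 for each row
```

Unlike LayerNorm, there is no mean subtraction and no bias, which is one of the small simplifications that make these architectures pleasant to reimplement.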
Ali retweeted
Andrej Karpathy @karpathy
Huge congrats to @AIatMeta on the Llama 3.1 release! A few notes:

Today, with the 405B model release, is the first time that a frontier-capability LLM is available to everyone to work with and build on. The model appears to be GPT-4 / Claude 3.5 Sonnet grade, and the weights are open and permissively licensed, including commercial use, synthetic data generation, distillation, and finetuning. This is an actual, open, frontier-capability LLM release from Meta.

The release includes a lot more, e.g. a 92-page PDF with a lot of detail about the model: ai.meta.com/research/publi…

The philosophy underlying this release is in this longread from Zuck, well worth reading as it nicely covers all the major points and arguments in favor of the open AI ecosystem worldview: "Open Source AI is the Path Forward" facebook.com/4/posts/101157…

I like to say that it is still very early days, that we are back in the ~1980s of computing all over again, that LLMs are the next major computing paradigm, and Meta is clearly positioning itself to be the open ecosystem leader of it.
- People will prompt and RAG the models.
- People will finetune the models.
- People will distill them into smaller expert models for narrow tasks and applications.
- People will study, benchmark, and optimize.

Open ecosystems also self-organize in modular ways into products, apps, and services, where each party can contribute its own unique expertise. One example from this morning is @GroqInc, who built a new chip that inferences LLMs *really fast*. They've already integrated the Llama 3.1 models and appear to be able to inference the 8B model ~instantly: x.com/karpathy/statu… And (I can't seem to try it due to server pressure) the 405B running on Groq is probably the highest-capability, fastest LLM today (?).

Early model evaluations look good: ai.meta.com/blog/meta-llam… x.com/alexandr_wang/… Still pending is the "vibe check"; look out for that on X / r/LocalLlama over the next few days (hours?).

I expect the closed model players (which imo have a role in the ecosystem too) to give chase soon, and I'm looking forward to that. There's a lot to like on the technical side too, w.r.t. multilingual, context lengths, function calling, multimodal, etc. I'll post some of the technical notes a bit later, once I make it through all 92 pages of the paper :)
Ali @ms802x
@XQ55 May God magnify your reward and grant you solace; may God forgive your father and admit him to His spacious Paradise.
XQ55 @XQ55
Indeed we belong to God, and to Him we shall return. With believing hearts, content with God's decree, my father, Saad Al-Anazi, has passed into the mercy of God Almighty. O God, forgive him, have mercy on him, grant him the highest level of Paradise, give him a home better than his own, and make his grave a garden from the gardens of Paradise. Your prayers of mercy for him are appreciated. The burial is today, Tuesday, at Sulaibikhat Cemetery after the Isha prayer.
Ali retweeted
Edward Beeching @edwardbeeching
Imitation Learning support has been added to Godot RL Agents: you can now learn complex behaviours from player demonstrations and then fine-tune with RL. Check out the trained agent (a neural network) from our example game.
Ali retweeted
AK @_akhaliq
ViTAR: Vision Transformer with Any Resolution

This paper tackles a significant challenge faced by Vision Transformers (ViTs): their constrained scalability across different image resolutions. Typically, ViTs experience a performance decline when processing resolutions different from those seen during training.
Ali retweeted
maker.io @MakerIO
How do fiber optic cables work? Let's talk about it!
Ali retweeted
Barsee 🐶 @heyBarsee
NVIDIA just announced GR00T. It will enable robots to understand multimodal instructions like language, video, and motion. Very soon we will see them cooking, preparing coffee, working in supermarkets, changing tires, etc.
Ali retweeted
AK @_akhaliq
MusicHiFi: Fast High-Fidelity Stereo Vocoding

Diffusion-based audio and music generation models commonly generate music by constructing an image representation of audio (e.g., a mel-spectrogram) and then converting it to audio using a phase reconstruction model or vocoder.
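The "image representation of audio" the abstract refers to can be illustrated with a minimal magnitude-spectrogram sketch. This is a plain-NumPy short-time Fourier transform for illustration only; a real pipeline would apply a mel filterbank and a learned vocoder, and none of this is MusicHiFi's actual code:

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Slice the signal into overlapping windowed frames and take the
    magnitude of each frame's FFT, yielding a 2-D time-frequency 'image'."""
    window = np.hanning(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))  # non-negative freqs only
    return np.stack(frames, axis=1)  # shape: (freq_bins, time_frames)

# Toy usage: a 440 Hz tone sampled at 8 kHz for 0.5 s.
sr, dur, f0 = 8000, 0.5, 440.0
t = np.arange(int(sr * dur)) / sr
spec = magnitude_spectrogram(np.sin(2 * np.pi * f0 * t))
print(spec.shape)  # (129, 30): the 2-D grid a diffusion model would generate
```

A vocoder's job is the inverse of this step: the FFT magnitudes discard phase, so recovering a waveform from such an image requires either phase reconstruction or a learned model, which is exactly the stage the paper targets.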