Daniel D'souza 

1.7K posts

@mrdanieldsouza

Research Engineer @Cohere_Labs💙 | @UMichECE Alum 〽️ | 🇮🇳✖️🇺🇸 💫"The Universe Works in Mysterious Ways"💫

Ann Arbor, MI · Joined November 2016

1K Following · 991 Followers

Pinned Tweet

Daniel D'souza @mrdanieldsouza·14 Ağu

“Best Paper Award” @ ACL 2024 🪄What an incredible culmination of perseverance to connect and represent languages around the 🗺️! 🪄 🤗 Huge thanks to the @aclmeeting committee for recognizing the massive effort behind Project Aya @CohereForAI 💙 #ACL2024

Ahmet Üstün@ahmetustun89

I'm incredibly proud that Aya received #ACL2024 Best Paper Award 🥹. Huge congratulations to the Aya team and @CohereForAI community who make this possible by extending frontiers of LLMs to multilingual, building Aya Model and Aya Dataset 🌿🌏

Daniel D'souza retweeted

MichiganAI@michigan_AI·2d

We’re excited to announce our next #AI Seminar! Marzieh Fadaee @mziizm (Head of @Cohere_Labs) will join us virtually for a talk on "Data Strategies for Language Models: Insights from Multilingual Research:" 📅 Thursday, April 9 | 1:30 pm ET 🔗cse.engin.umich.edu/event/data-str…

1.4K

Daniel D'souza retweeted

Ammar Khairi@ammar__khairi·2d

So exciting to see our work 𝑭𝒖𝒔𝒊𝒐𝑵 : Making, not Taking the Best-of-N in action @OpenRouter 🔥 I will be presenting this work from @Cohere_Labs at @iclr_conf 🇧🇷 in 2 weeks, come and find us if you are there !

OpenRouter@OpenRouter

New public experiment: Model Fusion Use multiple models, analyze outputs, and fuse the results for a response that every Deep Research agent preferred to its own, in our testing. No subscription needed at all.

3.4K

Daniel D'souza @mrdanieldsouza·23 Şub

🚨There is always exciting work that comes from The Expedition at @Cohere_Labs that goes on to become highly-technical blogspots or even full-blown research papers! 🪄 Sign up for this years edition where we work to build BIG ideas using 🤏 "Tiny" Aya 🌎🌍🌏

Cohere Labs@Cohere_Labs

We’re officially kicking off Expedition Tiny Aya, a global, mentor-supported open build challenge built on the Tiny Aya model — and we’d love to see you there.

1.8K

Daniel D'souza retweeted

Sebastian Raschka@rasbt·Feb 20

Tiny Aya reimplementation From Scratch!

Have been reading through the technical reports of the recent wave of open-weight LLM releases (more on that soon). Tiny Aya (2 days ago) was a bit under the radar. Looks like a nice, small 3.35B model with strongest multilingual support of that size class. Great for on-device translation tasks.

Just did a from-scratch implementation here: github.com/rasbt/LLMs-fro…

Architecture-wise, Tiny Aya is a classic decoder-style transformer with a few noteworthy modifications (besides the obvious ones like SwiGLU and Grouped Query Attention):

1. Parallel transformer blocks. A parallel transformer block computes attention and MLP from the same normalized input, then adds both to the residual in one step. I assume this is to reduce serial dependencies inside a layer to improve computational throughput.

2. Sliding window attention. Specifically, it uses a 3:1 local:global ratio similar to Arcee Trinity and Olmo 3. The window size is also 4096. Also, similar to Arcee, the sliding window layers use RoPE whereas the full attention layers use NoPE.

3. LayerNorm. Most architectures moved to RMSNorm as it's computationally a bit cheaper and performs well. Tiny Aya is keeping it more classic with a modified version of LayerNorm (the implementation here is like standard LayerNorm but without shift, i.e., bias, parameter).
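
For readers who want to see the three modifications in the post above concretely, here is a minimal pure-Python sketch. It is not taken from the linked repository or from the Tiny Aya code. The function names, the placement of the global layer as every fourth layer, and the exact window boundary convention are all assumptions made for illustration, and RoPE/NoPE handling is left out.

```python
import math


def layer_norm_no_bias(x, weight, eps=1e-5):
    """LayerNorm with a learned scale but no shift (bias) term."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [w * (v - mean) * inv_std for v, w in zip(x, weight)]


def parallel_block(x, attn_fn, mlp_fn, weight):
    """Attention and MLP read the same normalized input; both outputs
    are added to the residual stream in a single step."""
    h = layer_norm_no_bias(x, weight)
    a = attn_fn(h)  # hypothetical attention sub-layer
    m = mlp_fn(h)   # hypothetical MLP sub-layer
    return [xi + ai + mi for xi, ai, mi in zip(x, a, m)]


def layer_kinds(num_layers, local_per_global=3):
    """3:1 local:global pattern. Assumption: the global (full attention)
    layer is the last one in each group of four."""
    group = local_per_global + 1
    return ["global" if (i + 1) % group == 0 else "local"
            for i in range(num_layers)]


def can_attend(query_pos, key_pos, kind, window=4096):
    """Causal mask; local layers also limit attention to the most recent
    `window` positions (counting the query position itself)."""
    if key_pos > query_pos:
        return False
    if kind == "local":
        return query_pos - key_pos < window
    return True
```

In a sequential block the MLP would instead consume the output of the attention sub-layer, so the two sub-layers could not be computed independently. In the parallel form both depend only on `h`.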

160

1.1K

67.6K

Daniel D'souza @mrdanieldsouza·18 Şub

@sarahookr I’m loving this journey of discovery. It’s been a long time coming :)

101

Sara Hooker@sarahookr·Feb 18

The most consumed biscuit in the entire world. G stands for genius.

884

30.5K

Daniel D'souza @mrdanieldsouza·17 Şub

🤏

Nick Frosst@nickfrosst

actual image of the compute we used to train this model.

168

Daniel D'souza @mrdanieldsouza·17 Şub

@sarahookr @Cohere_Labs 💗

Sara Hooker@sarahookr·Feb 17

Huge congrats to the @Cohere_Labs team and community! Super special to see tiny Aya in the open. Congrats to everyone involved, and the large amount of care required for this type of launch.

Cohere Labs@Cohere_Labs

Introducing ✨Tiny Aya✨, a family of massively multilingual small language models built to run where people actually are. Tiny Aya delivers strong multilingual performance in 70+ global languages in a 3.35B parameter model, efficient enough to run locally, even on a phone.

126

8.7K

Daniel D'souza @mrdanieldsouza·17 Şub

Extremely proud to present ✨Tiny Aya ✨ Tiny 🤏 but mighty 💪3.35B parameter models, massively multilingual from the ground up 🌎🌍🌏, built with immense care w.r.t language representation🤗 We had a blast building this! 💗 Have at it! 🎆

Cohere Labs@Cohere_Labs

5.5K

Daniel D'souza @mrdanieldsouza·4 Şub

@sarahookr @adaptionlabs Congratulations @sarahookr ! 🚀

Sara Hooker@sarahookr·Feb 4

Beginnings are very special. Today is an important day for @adaptionlabs. Today a handful of one-size-fits-all-models are optimized for the average use case. Averages erase the exceptional. Everything intelligent adapts. So should AI.

840

220.5K

Daniel D'souza retweeted

Beyza Ermiş@beyzaermis·Jan 15

Proud to share this work led by @oliverjbolton 🎉 We introduce SimMerge, a practical way to make model merging more reliable at scale. Paper and results in the thread 👇

Cohere Labs@Cohere_Labs

🧩In modern LLM development, we often end up with many specialized checkpoints. Merging into one model is attractive, but the results depend a lot on the selected merge method & order. Our paper introduces SimMerge: a simple way to choose the merge configuration automatically.

5.3K

Daniel D'souza retweeted

Cohere Labs@Cohere_Labs·Jan 15

17.9K

Daniel D'souza @mrdanieldsouza·3 Oca

@sarahookr “without an agenda” is such a key part of this

Sara Hooker@sarahookr·Jan 3

I always ask a trusted group for feedback at the beginning of an endeavor. critical to find 1-2 calibrated skeptics without an agenda. Take their view over time seriously since they can be swayed w evidence. If their view remains the same, you are not making progress.

5.3K

Daniel D'souza @mrdanieldsouza·31 Ara

“You can just do things” energy in 2026 ✨

Tim Urban@waitbutwhy

Good day to remember just how big that green tree is

130

Daniel D'souza retweeted

Cohere Labs@Cohere_Labs·Dec 29

Sr Research Scientist, Julia Kreutzer: Treasure Hunt paper. 🗺️ This work introduces a method to improve model performance by adding markers to tokens of the pretraining data, enabling real-time targeting of the long tail using training-time markers. youtu.be/K3BUpKag_nA

YouTube

1.1K

Daniel D'souza @mrdanieldsouza·27 Ara

@championswimmer Hafiz is the bomb ❤️ try their pomegranate turkish delight specifically 😍

834

Arnav Gupta@championswimmer·Dec 26

Went to the famous Hafiz Mustafa 1864 baklava place in Istanbul and had the Humbara. By far one of the best desserts I've had in recent memory. The clotted cream is out of the world. 10/10 highly recommend. They have branches in London and Dubai too.

432

104.5K

Daniel D'souza retweeted

Sara Hooker@sarahookr·Dec 23

Careers are long. My first director gave me something that has guided my entire career: treat people with respect because you will meet them again and again. Interactions are rarely one off, and what matters in the long run is that you are fair and have integrity.

1.2K

6.6K

455.8K

Daniel D'souza retweeted

SAIL Media@readsail·Dec 22

Doing More with Less: "Treasure Hunt" & The Future of Multilingual AI ft @mrdanieldsouza

497

Daniel D'souza @mrdanieldsouza·23 Ara

Such a fun chat with @readsail at #NeurIPS2025 ! 🤗 Got to chat about our recent work "TreasureHunt" (arxiv.org/abs/2506.14702…) and play a game of Guess Who? 👀 Kudos to @natolambert for setting this in motion! 🔥

SAIL Media@readsail

"AI should be built for the world, not just for the people who speak English." 🌍 @mrdanieldsouza from @Cohere_Labs discusses the critical need for multilingual models at #NeurIPS2025. Unlocking knowledge across languages is the next frontier. 🚀 Full video now on Youtube.

2.9K

Daniel D'souza @mrdanieldsouza·17 Ara

@rao2z @anilananth Airlines 💙

128

Subbarao Kambhampati (కంభంపాటి సుబ్బారావు)@rao2z·Dec 17

Chatting #AI with @anilananth over ravva dosa/poori/idlee/kapi at Tivoli garden in B'lore.. 😎

5.9K

Daniel D'souza retweeted

Convai_rg@convAI2024·Dec 16

📢 Join the final Conversational AI Reading Group meeting of 2025! 📅 Thursday, Dec 18th | 11 AM - 12 EST 🎙 Speaker: Daniel D'souza (@mrdanieldsouza) - @cohere Labs 📖 Topic: "Data as Leverage: Improving Foundation Models Beyond Scaling." 🔗 Details: poonehmousavi.github.io/rg.html

359
