Emre

2.6K posts

@etunch

42. The answer is 42. But what is the question? That's the ultimate question. Living to be the person my dog thinks I am. People @AppSamurai

Hammersmith, London · Joined August 2009

845 Following · 741 Followers

Emre@etunch·19 Mar

AGI won't come from better LLMs and models, it'll come from better harnesses. So given the right harness, it's already here?

Rohit@rohit4verse

x.com/i/article/2028…

English

Emre@etunch·31 Dec

@emrefa we love that feeling!!

English

Emre Fadillioglu@emrefa·31 Dec

x.com/i/article/2006…

256

Emre retweeted

212.vc@212vc·14 Aug

Happy to see so many of our portfolio companies listed in @FastCompanyT's Startup 100 List this year! 🎉

English

670

Emre retweeted

Owain Evans@OwainEvans_UK·22 Jul

New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵

English

282

1.1K

8.4K

Emre retweeted

Zafer@ZaferElcik·18 Apr

Hello, CrayonClub, which we have been working on for the past year, is finally live! 🎉 We look forward to your experiences, ⭐️ ratings, and reviews. Many thanks in advance for your support! 👉 App Store: apps.apple.com/us/app/crayon-… 👉 Play Store: play.google.com/store/apps/det…

Turkish

494

Emre retweeted

Grant Sanderson@3blue1brown·8 Feb

I just put up a new video, which was a collaboration with Terence Tao about the cosmic distance ladder. You can find the full video on YouTube, and here's a bit of extra footage that didn't make it into the final.

English

584

5.7K

305.6K

Emre retweeted

Chris Lattner@clattner_llvm·17 Oct

@deedydas I’m glad I didn’t take this compiler class, I would have also gotten 0/100. No wonder people think compilers are scary, they shouldn’t be taught this way! It’s also flawed in many ways (and old) but I think this is more approachable llvm.org/docs/tutorial/

English

364

6.2K

963.3K

Emre retweeted

andrew chen@andrewchen·14 Sep

this stat always surprises me: >50% of consumer in-app spend on iOS and Android is on mobile games 🤯

That's right, for iOS:
- $25.2B total spend (that's up +13.1%)
- $12.85B come from gaming
- Android is even more tilted towards gaming

the number is huge bc so much of the social media apps that take our time monetize through advertising, where you are the product, as opposed to letting you pay for the product!

English

298

45.1K

Emre@etunch·7 Sep

➕

J.R. Holmsted@JHolmsted

Every. Damn. Time.

Emre@etunch·25 Aug

🔥

Brian Roemmele@BrianRoemmele

Meet OPEN SOURCE AND FREE SakanaAI/ The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. I have been running a lot of tests on this for quite a bit. Enjoy uncensored SCIENCE. github.com/SakanaAI/AI-Sc…

Emre@etunch·13 Aug

Brilliant

Tivadar Danka@TivadarDanka

The single most undervalued fact of linear algebra: matrices are graphs, and graphs are matrices. Encoding matrices as graphs is a cheat code, making complex behavior simple to study. Let me show you how!

English

Emre@etunch·8 Aug

❤️

212.vc@212vc

🎉 Congratulations to our portfolio companies, @AppSamurai, @boltinsightcom, @fazla_tr , @getmobil, @Insider, @Mall_IQ and @TrioMobil, for making the Startup 100 List by @FastCompanyT! Kudos to @B2Metric and @PhiTech_Bioinfo from @SimyaVC's portfolio for being listed 👏

Emre retweeted

Andrej Karpathy@karpathy·11 Jul

In 2019, OpenAI announced GPT-2 with this post: openai.com/index/better-l…

Today (~5 years later) you can train your own for ~$672, running on one 8XH100 GPU node for 24 hours. Our latest llm.c post gives the walkthrough in some detail: github.com/karpathy/llm.c…

Incredibly, the costs have come down dramatically over the last 5 years due to improvements in compute hardware (H100 GPUs), software (CUDA, cuBLAS, cuDNN, FlashAttention) and data quality (e.g. the FineWeb-Edu dataset). For this exercise, the algorithm was kept fixed and follows the GPT-2/3 papers.

Because llm.c is a direct implementation of GPT training in C/CUDA, the requirements are minimal - there is no need for conda environments, Python interpreters, pip installs, etc. You spin up a cloud GPU node (e.g. on Lambda), optionally install NVIDIA cuDNN, NCCL/MPI, download the .bin data shards, compile and run, and you're stepping in minutes. You then wait 24 hours and enjoy samples about English-speaking Unicorns in the Andes.

For me, this is a very nice checkpoint to get to because the entire llm.c project started with me thinking about reproducing GPT-2 for an educational video, getting stuck with some PyTorch things, then rage quitting to just write the whole thing from scratch in C/CUDA. That set me on a longer journey than I anticipated, but it was quite fun, I learned more CUDA, I made friends along the way, and llm.c is really nice now. It's ~5,000 lines of code, it compiles and steps very fast so there is very little waiting around, it has constant memory footprint, it trains in mixed precision, distributed across multi-node with NCCL, it is bitwise deterministic, and hovers around ~50% MFU. So it's quite cute.

llm.c couldn't have gotten here without a great group of devs who assembled from the internet, and helped get things to this point, especially ademeure, ngc92, @gordic_aleksa, and rosslwheeler. And thank you to @LambdaAPI for the GPU cycles support.

There's still a lot of work left to do. I'm still not 100% happy with the current runs - the evals should be better, the training should be more stable especially at larger model sizes for longer runs. There's a lot of interesting new directions too: fp8 (imminent!), inference, finetuning, multimodal (VQVAE etc.), more modern architectures (Llama/Gemma). The goal of llm.c remains to have a simple, minimal, clean training stack for a full-featured LLM agent, in direct C/CUDA, and companion educational materials to bring many people up to speed in this awesome field.

Eye candy: my much longer 400B token GPT-2 run (up from 33B tokens), which went great until 330B (reaching 61% HellaSwag, way above GPT-2 and GPT-3 of this size) and then exploded shortly after this plot, which I am looking into now :)

English

124

749

6.3K

724K

Emre retweeted

Jim Rogers Nebraska@JimRogers_Nebr·7 May

@akarlin Sorry-- here's the link: pnas.org/doi/10.1073/pn…

English

2.5K

Emre retweeted

Jeff Barr ☁️@jeffbarr·30 Apr

Thank you to everyone who brought this article to our attention. We agree that customers should not have to pay for unauthorized requests that they did not initiate. We’ll have more to share on exactly how we’ll help prevent these charges shortly. #AWS #S3 How an empty S3 bucket can make your AWS bill explode - medium.com/@maciej.pocwie…

English

542

3.4K

1.3M

Emre retweeted

nano@nanulled·30 Apr

My speculation: GPT2 is an advanced multi-transformer architecture that combines two transformers (Find and Replace). The results speak for themselves. This is from a paper that was published by anonymous authors.

English

197

34.6K

Emre retweeted

dr. jack morris@jxmnop·29 Apr

one of the most important things I know about deep learning I learned from this paper: "Pretraining Without Attention"

this is what I found so surprising: these people developed an architecture very different from Transformers called BiGS, spent months and months optimizing it and training different configurations, only to discover that at the same parameter count, a wildly different architecture produces identical performance to transformers

this may imply that as long as there are enough parameters, and things are reasonably well-conditioned (i.e. a decent number of nonlinearities and connections between the pieces) then it really doesn't matter how you arrange them, i.e. any sufficiently good architecture works just fine

i feel there's something really deep here, and we may be already very close to the upper bound of how well we can approximate a given function given a certain amount of compute. so we should spend more time thinking about other questions, such as what that function should actually look like (what data? which objective function?) and how to make it more efficient

English

408

3.1K

489.2K

Emre retweeted

Ian Johnson 🔬🤖@enjalot·16 Feb

Where do dads keep all of their jokes? In a dad-a-base! But what does a dadabase look like when you try to retrieve a joke? Introducing Latent Scope: a new open source instrument for visualizing unstructured data

English

5.6K

Emre retweeted

Jason Citron@jasoncitron·18 Mar

Big news for developers today on Discord. We’ve opened up the developer preview for user installable apps as well as HTML5 experiences for apps. This dramatically changes what’s possible to build on Discord. I can’t wait to see what y’all come up with! discord.com/developers/doc…

English

666

265.5K

Emre retweeted

Robert Lukoszko@Karmedge·19 Feb

If you look deeper, @GroqInc, the company behind the 500 tokens/sec Mixtral tech, was founded by the same person who created the TPU (Tensor Processing Unit) for @GoogleAI, the core thing Google's AI servers most likely rely on. Whatever Jonathan Ross is about to do is about to change the AI chip industry. It's already been 9 years since Groq was founded. The beast is about to be unleashed.

English

377

104.1K
