Botty Dimanov

107 posts

@botty_dimanov

CEO & Co-founder | PhD Visual AI @ Cambridge | YCombinator | Forbes 30U30

Cambridge, England · Joined March 2019
452 Following · 155 Followers
Botty Dimanov
Botty Dimanov@botty_dimanov·
@om_patel5 Yes, this is real, and small businesses can use this tech already. DM me if you are a business owner who wants to increase your operational visibility. We promise you’ll see a 10-20% profit boost within 9 months, or you get this and next year for free.
0
0
0
5
Om Patel
Om Patel@om_patel5·
this is insane a coffee shop is using AI to monitor every employee and customer in real time
> how many cups each barista has made
> how long every customer has been waiting
> who's working fast and who's falling behind
walmart already does this across all their stores. it's only a matter of time before every small business has access to this kind of tech.
625
950
5.2K
1.4M
Botty Dimanov
Botty Dimanov@botty_dimanov·
This is somewhat false. OAI’s API gross margin sits at a very healthy 75%. H1 2025 revenue: $4.3B, net loss: $13.5B. However, the loss is due to research and frontier model training. They could stop innovating tomorrow and run a very profitable business. Investors give them capital because they know the best model will win a bigger market share.
0
0
0
39
Geoff Lewis
Geoff Lewis@GeoffLewisOrg·
It’s time.
507
119
1.7K
3.7M
Botty Dimanov
Botty Dimanov@botty_dimanov·
Excited to see our new product featured by Capital — in an excellent piece by Yoan Bondakov, opening in true Arthur Conan Doyle style. 🕵️‍♂️ We just launched Tenyks Visual Intelligence — turning any existing camera into a VideoGPT. 🎥🤖
5
0
0
98
Botty Dimanov
Botty Dimanov@botty_dimanov·
And this is just the beginning. We’re building a future where every visual source becomes an intelligent agent:
• CCTV
• Smart glasses
• Drones
• Satellites
• Even humanoid robots 🤖
0
0
0
36
Botty Dimanov
Botty Dimanov@botty_dimanov·
Why it matters:
📈 +1 second in speed-of-service = $30K/year per restaurant
📊 +0.1% in retention = tens of millions for large franchisors
Tiny optimizations. Massive value.
0
0
0
30
Botty Dimanov
Botty Dimanov@botty_dimanov·
In people-heavy sectors like restaurants, retail, and logistics, Vision AI = clarity at scale. Operators finally gain visibility into how teams move, serve, and decide — minute by minute.
0
0
0
28
Botty Dimanov
Botty Dimanov@botty_dimanov·
Most cameras today are blind archives. Tenyks unlocks real-time, actionable insight from footage that’s been sitting unused for years.
0
0
0
24
Botty Dimanov
Botty Dimanov@botty_dimanov·
Deeply honoured to be recognised as an AI Expert among the 50 Leading Bulgarians in the UK Tech Industry. It's a true privilege to be part of such an incredible cohort of founders, investors, and innovators, many of whom I've had the pleasure of meeting.
The list features:
– "Founders of companies generating millions of pounds in revenue"
– "Investors from Atomico, HV Capital, and Octopus Ventures"
– "Innovators at DeepMind, Palantir, and Oracle"
Sincere gratitude to the Bulgarian Angels Club, BEX, British Bulgarian Chamber of Commerce (BBCC), and the Embassy of the Republic of @Bulgaria in London. Looking forward to pushing the boundaries of our collective excellence together!
0
0
0
87
Botty Dimanov retweeted
Andrej Karpathy
Andrej Karpathy@karpathy·
Congrats to @AIatMeta on Llama 3 release!! 🎉 ai.meta.com/blog/meta-llam…

Notes:

Releasing 8B and 70B (both base and finetuned) models, strong-performing in their model class (but we'll see when the rankings come in @ @lmsysorg :))

400B is still training, but already encroaching GPT-4 territory (e.g. 84.8 MMLU vs. 86.5 4Turbo).

Tokenizer: number of tokens was 4X'd from 32K (Llama 2) -> 128K (Llama 3). With more tokens you can compress sequences more in length, cites 15% fewer tokens, and see better downstream performance.

Architecture: no major changes from the Llama 2. In Llama 2 only the bigger models used Grouped Query Attention (GQA), but now all models do, including the smallest 8B model. This is a parameter sharing scheme for the keys/values in the Attention, which reduces the size of the KV cache during inference. This is a good, welcome, complexity reducing fix and optimization.

Sequence length: the maximum number of tokens in the context window was bumped up to 8192 from 4096 (Llama 2) and 2048 (Llama 1). This bump is welcome, but quite small w.r.t. modern standards (e.g. GPT-4 is 128K) and I think many people were hoping for more on this axis. May come as a finetune later (?).

Training data. Llama 2 was trained on 2 trillion tokens, Llama 3 was bumped to 15T training dataset, including a lot of attention that went to quality, 4X more code tokens, and 5% non-en tokens over 30 languages. (5% is fairly low w.r.t. non-en:en mix, so certainly this is a mostly English model, but it's quite nice that it is > 0).

Scaling laws. Very notably, 15T is a very very large dataset to train with for a model as "small" as 8B parameters, and this is not normally done and is new and very welcome. The Chinchilla "compute optimal" point for an 8B model would be train it for ~200B tokens. (if you were only interested to get the most "bang-for-the-buck" w.r.t. model performance at that size). So this is training ~75X beyond that point, which is unusual but personally, I think extremely welcome. Because we all get a very capable model that is very small, easy to work with and inference. Meta mentions that even at this point, the model doesn't seem to be "converging" in a standard sense. In other words, the LLMs we work with all the time are significantly undertrained by a factor of maybe 100-1000X or more, nowhere near their point of convergence. Actually, I really hope people carry forward the trend and start training and releasing even more long-trained, even smaller models.

Systems. Llama 3 is cited as trained with 16K GPUs at observed throughput of 400 TFLOPS. It's not mentioned but I'm assuming these are H100s at fp16, which clock in at 1,979 TFLOPS in NVIDIA marketing materials. But we all know their tiny asterisk (*with sparsity) is doing a lot of work, and really you want to divide this number by 2 to get the real TFLOPS of ~990. Why is sparsity counting as FLOPS? Anyway, focus Andrej. So 400/990 ~= 40% utilization, not too bad at all across that many GPUs! A lot of really solid engineering is required to get here at that scale.

TLDR: Super welcome, Llama 3 is a very capable looking model release from Meta. Sticking to fundamentals, spending a lot of quality time on solid systems and data work, exploring the limits of long-training models. Also very excited for the 400B model, which could be the first GPT-4 grade open source release. I think many people will ask for more context length.

Personal ask: I think I'm not alone to say that I'd also love much smaller models than 8B, for educational work, and for (unit) testing, and maybe for embedded applications etc. Ideally at ~100M and ~1B scale.

Talk to it at meta.ai
Integration with github.com/pytorch/torcht…
138
992
7.6K
885.7K
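The two back-of-envelope figures in the Llama 3 notes above (the ~75X overtraining factor and the ~40% GPU utilization) can be reproduced with a few lines of Python. This is only a sketch of the arithmetic as stated in the post: every input number is taken from the post itself, the function names are hypothetical, and the halving of the marketed TFLOPS figure follows the post's own "divide by 2" reasoning about the sparsity asterisk.

```python
def overtraining_factor(tokens_trained: float, compute_optimal_tokens: float) -> float:
    """How many times past the quoted compute-optimal token budget the run went."""
    return tokens_trained / compute_optimal_tokens


def utilization(observed_tflops: float, marketed_tflops: float,
                sparsity_factor: float = 2.0) -> float:
    """Observed throughput as a fraction of dense peak throughput.

    The marketed figure is assumed (as in the post) to include sparsity,
    so it is divided by sparsity_factor to estimate the dense peak.
    """
    dense_peak = marketed_tflops / sparsity_factor
    return observed_tflops / dense_peak


# Figures quoted in the post: 15T training tokens, ~200B compute-optimal
# tokens for an 8B model, 400 TFLOPS observed, 1,979 TFLOPS marketed.
factor = overtraining_factor(15e12, 200e9)
util = utilization(400, 1979)

print(f"overtraining factor: ~{factor:.0f}X")
print(f"utilization: ~{util:.0%}")
```

Keeping the sparsity divisor as a parameter makes the assumption explicit: with a different peak figure or precision, the utilization estimate changes accordingly.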
Botty Dimanov
Botty Dimanov@botty_dimanov·
@nealkhosla +1: just try to get the sum of estimated time for nested tasks... Yet we all still use it 😀 what does that tell us?
0
0
0
315
Botty Dimanov
Botty Dimanov@botty_dimanov·
"de-biasing a large and diverse dataset may be prohibitively expensive" @aleks_madry, Tenyks and data curation platforms solve exactly that problem. Would love to give Madry Lab full access to see how much your team can push extrapolation and robustness.
Aleksander Madry@aleks_madry

Models often fail under distribution shifts—can pre-training on a large and diverse dataset and then fine-tuning on a task-specific dataset help? W/ @bcohenwang, @josh_vendrow we show that this depends on the specific failure mode. In particular, pre-training can help with extrapolation, but does not address failures that stem from dataset biases.

0
0
1
176
Botty Dimanov
Botty Dimanov@botty_dimanov·
Great to see data curation starting to take a central place in the research literature. For everyone fine-tuning models, you must first curate your dataset to balance the underlying features. Otherwise, fine-tuning doesn't really help.
Aleksander Madry@aleks_madry

Models often fail under distribution shifts—can pre-training on a large and diverse dataset and then fine-tuning on a task-specific dataset help? W/ @bcohenwang, @josh_vendrow we show that this depends on the specific failure mode. In particular, pre-training can help with extrapolation, but does not address failures that stem from dataset biases.

0
0
0
103
Botty Dimanov
Botty Dimanov@botty_dimanov·
5/5: 👉 SIGN HERE: forms.gle/S7tn1LYqcsF4C7… & read the full letter in the comments below to foster a thriving ecosystem of innovation and entrepreneurship. Let's unite for equitable guidelines and protect the entrepreneurial spirit in academia!
0
0
0
37
Botty Dimanov
Botty Dimanov@botty_dimanov·
4/5: Your Role: Support the open letter to TenU. Your signature can make a difference in shaping a future that values innovation and fair play. Act Now: We need 100 signatures! 7 days left to prevent this calamity.
0
0
0
46