Ville🤖

1.9K posts

@VilleKuosmanen

gentleman scientist 🤖 @voyagerobotics

London, UK · Joined May 2020
784 Following · 3K Followers
Pinned Tweet
Ville🤖@VilleKuosmanen·
A new AI model achieved my highest repeatability so far, at around an 85% success rate. If the objects are at the centre and easily visible and reachable, this increases further
19 replies · 15 reposts · 221 likes · 26.1K views
Ville🤖@VilleKuosmanen·
ok so we're gonna need a bigger disk
[image]
1 reply · 0 reposts · 3 likes · 290 views
Ville🤖@VilleKuosmanen·
@ivanburazin @DrataHQ the previous startup I worked at had the same journey: a dedicated expert worked on SOC 2 for over a year. I was surprised how many startups had it so quickly, now we know!
0 replies · 0 reposts · 0 likes · 1.7K views
Ivan Burazin@ivanburazin·
It took our compliance officer close to a year to get SOC 2 done. In the meantime, I saw even 3-month-old startups getting it with ease. After a point I got frustrated: “Why the hell are we using @DrataHQ? Let's try these guys and get it done!” He kept saying “It's just not possible. Leave me alone, I'll get it done.” I was still furious but let him do it his way. It seems he was right after all.
[image]
34 replies · 21 reposts · 581 likes · 65.7K views
Ville🤖@VilleKuosmanen·
@JulianSaks my repo was a bit behind but they did release a few commits recently
0 replies · 0 reposts · 1 like · 71 views
JulianSaks@JulianSaks·
@VilleKuosmanen Didn’t see any big changes in recent commits? Am I missing something? 😅
1 reply · 0 reposts · 0 likes · 98 views
Ville🤖@VilleKuosmanen·
I really want Cursor and other teams building models not controlled by the big labs to succeed and drive competition on API pricing! Cursor just started with their own models, so it's probably better to judge the gradient of model improvement rather than absolute performance
0 replies · 0 reposts · 0 likes · 111 views
Ville🤖@VilleKuosmanen·
would love to say good things about Composer 2 Flash but 1st impression is... not good 🫠
2 replies · 0 reposts · 1 like · 348 views
Ville🤖@VilleKuosmanen·
testing the official TOPReward implementation against the one I vibe-coded in an evening, looks like fixing the few discrepancies produces a slightly better and more robust reward signal 🦾 @DJiafei
[two images]
0 replies · 0 reposts · 7 likes · 917 views
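The tweet above describes cross-checking a hand-rolled reward implementation against the official one. That workflow can be sketched generically: evaluate both implementations on the same batch of inputs and measure their largest disagreement. The real TOPReward code is not shown in the thread, so `reference_reward`, `reimplemented_reward`, and their inputs below are purely hypothetical stand-ins.

```python
import random

# Hypothetical stand-ins: the real TOPReward code is not shown in the
# thread, so both reward functions below are illustrative only.
def reference_reward(progress: float, velocity: float) -> float:
    # Toy reward: progress toward the goal, lightly penalising speed.
    return progress - 0.1 * velocity ** 2

def reimplemented_reward(progress: float, velocity: float) -> float:
    # A re-implementation with a deliberate bug in the velocity penalty.
    return progress - 0.2 * velocity ** 2

def max_discrepancy(ref, impl, n=1000, seed=0):
    """Evaluate both implementations on the same random inputs and
    return the largest absolute disagreement between them."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n):
        p, v = rng.uniform(0.0, 1.0), rng.uniform(-1.0, 1.0)
        worst = max(worst, abs(ref(p, v) - impl(p, v)))
    return worst

print(max_discrepancy(reference_reward, reimplemented_reward))
```

Driving this discrepancy to zero on a shared batch of inputs is a cheap way to confirm two implementations agree before trusting either as a training signal.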
Ville🤖@VilleKuosmanen·
@chetan_ If I strap a GoPro on my cat will you buy the data? Useful for training robot dogs maybe 🤔
1 reply · 0 reposts · 1 like · 52 views
Ville🤖@VilleKuosmanen·
has anyone implemented an open-source version of @ToyotaResearch's LBMs? Would love to see more tests comparing LBM-style scaled-up diffusion policies against pre-trained foundation models like pi 🤔
[image]
3 replies · 0 reposts · 16 likes · 1.7K views
Ville🤖@VilleKuosmanen·
@chetan_ also wash and clean stations for grippers and hands
1 reply · 0 reposts · 1 like · 95 views
Chetan@chetan_·
someone needs to come out with windshield wipers for robot cameras
5 replies · 0 reposts · 11 likes · 601 views
Ville🤖@VilleKuosmanen·
me, heads down building, watching everyone else have fun at GTC 🥲
[image]
2 replies · 0 reposts · 8 likes · 418 views
Ville🤖@VilleKuosmanen·
@animesh_garg @bznotes I like the custom screwdriver end effector integrated into the policy! The policy needs to adapt to different end-effectors, not just grippers, and tolerate changes in the wrist camera angle (the camera is farther away and points at a sharper angle than on the gripper)
0 replies · 0 reposts · 0 likes · 143 views
Animesh Garg@animesh_garg·
looking at this video, this is mostly position-guided control (the object was dropped rather than placed!). I believe the state of the art is a little beyond this, but to their credit this was a live running demo at a conference, so they likely picked an example with higher reliability.
1 reply · 0 reposts · 10 likes · 1.8K views
Ville🤖@VilleKuosmanen·
@ClementDelangue For the Hub, ways to pay for higher rate limits without upgrading to Enterprise would be great (e.g. an optional paid add-on to Pro). Building on top of LeRobotDatasets can require downloading lots of data in spikes, which burns through resolver requests quickly 😅
[image]
0 replies · 0 reposts · 0 likes · 59 views
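For context on the rate-limit pain described above, the standard client-side mitigation is retrying with exponential backoff and jitter. A minimal sketch: `RateLimitError` and `fake_download` are hypothetical stand-ins for whatever a real Hub client raises and does, not part of any actual API.

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for whatever error a client raises on HTTP 429."""

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Retry fn() with exponential backoff plus jitter, the usual
    client-side answer to hitting a rate limit during download spikes."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the last attempt
            # Delay doubles each attempt; jitter spreads out retries.
            time.sleep(base_delay * 2 ** attempt * (1 + random.random()))

# Demo: a fake download that is rate-limited twice before succeeding.
calls = {"n": 0}
def fake_download():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError
    return "dataset shard"

print(with_backoff(fake_download, base_delay=0.01))  # dataset shard
```

Backoff only smooths over short spikes; a paid higher rate limit, as the tweet asks for, is the real fix for sustained heavy downloads.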
clem 🤗@ClementDelangue·
5. Anything else you want me to know, challenge, or pay attention to?
4 replies · 0 reposts · 10 likes · 4.6K views
clem 🤗@ClementDelangue·
Just sent these questions to the HF team after paternity leave - would love the community's take too 👇
8 replies · 4 reposts · 120 likes · 40.2K views
Ville🤖@VilleKuosmanen·
@mstockton One correction: LLMs are getting good at coding thanks to RL post-training, not because they have access to lots of code tokens during pre-training. Coding has some of the easiest verifiable success/fail conditions (along with maths), which makes it ideal for RL post-training
2 replies · 0 reposts · 4 likes · 910 views
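The point about verifiability can be made concrete: for a coding task, the reward is simply "do the unit tests pass?". A minimal sketch of such a binary reward; the function name and harness here are my own illustration, not any lab's actual RL pipeline.

```python
import subprocess
import sys
import tempfile

def verifiable_code_reward(solution: str, tests: str, timeout: float = 10.0) -> float:
    """Binary reward for a model-written solution: 1.0 if its unit
    tests pass in a subprocess, 0.0 on any failure, crash, or timeout."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(solution + "\n\n" + tests + "\n")
        path = f.name
    try:
        result = subprocess.run([sys.executable, path],
                                capture_output=True, timeout=timeout)
        return 1.0 if result.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        return 0.0

# A model-generated candidate and the tests that define success:
solution = "def add(a, b):\n    return a + b\n"
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"
print(verifiable_code_reward(solution, tests))  # 1.0: the tests pass
```

Maths works the same way (check the final answer), which is why both domains dominate RL post-training: the success signal is cheap, objective, and automatable.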
Matt Stockton@mstockton·
This is also very bitter-lesson-pilled. My stream of consciousness here:
- We found out that LLMs are *extremely* good at writing code. There is just so much code in the training data, and it turns out that matters a lot for a thing that generates next tokens.
- Code can solve lots of problems. Maybe almost all of the ones we have, if the code is assembled correctly.
- Based on the two things above, we can now solve lots of problems, and generating code is the highest-utility thing a model can do right now.
- But people don't want to look at code, and in fact putting code in their view causes friction. Code is the wrong level of abstraction for most people.
- Foundation model companies know this, and are actively working to find the right abstraction layer that is still code but not presented as code to the user.
- Thus, Claude Cowork, which improves the value of the abstraction layer for *most* users while compromising the utility *just* a little bit. It improves the reach of these tools without giving up that much value.
- This has all happened in a period of only 3 months, because we are at the exponential.
- Thus, it's just going to happen again, in a shorter timeframe.
So basically, yes and no. Some of this stuff is 'wasted learning' because things won't be the same in 2 months. But you can't not do it, because it's learning that feeds directly into the next thing. It's very weird that you both *need* to learn the new thing to take advantage of the new new thing, and yet the new thing is only going to get you 3 months. Welcome to the exponential, I guess.
Brett Caughran@FundamentEdge

Having spent a lot of the last two weeks experimenting with different user interfaces for Claude Code (including via terminal, Cursor, and VS Code), one of my primary observations is how frustrating it can be to get up to a baseline level of operability with these tools. Just the basics have taken me many hours, many YouTube videos, and numerous times cursing into an LLM-as-tutor, “I said explain it to me like a 12-year-old”. Maybe I'm dense; I'm certainly not very technically proficient, and my brain goes haywire staring at any coding language, to be quite honest. But perhaps I'm a litmus test of the average end user of these tools (a user base that needs someone to help them build their Bloomberg launchpad, after all), and in that sense I doubt rolling out VS Code across many users in an investment organization is a viable plan to drive consistent adoption of AI.

The good news is how fast the user experience is evolving. I contrast that maddening experience with the much more user-friendly experience of using Claude Co-Work, and even Claude Code directly in the desktop app, and particularly the incredible usefulness of agentic work platforms like Perplexity Computer as a delightful and powerful user experience (not sponsored), and wonder if that is indicative of what's to come: multi-LLM agentic workflow tools tied to your select MCPs/APIs, your internal data and notes, and trained via natural language to create a Skills architecture that works the way you work (much like investors set up their Bloomberg launchpads right now, i.e. in a heterogeneous fashion unique to their specific workflow). Do I lose something by not being close to the code? I'm not sure I even know what that means (but all the arrogant YouTubers tell me it's important). But I know I'm shocked by the usefulness of the outputs I'm getting out of Claude Co-Work and Perplexity Computer, and that's what matters to me.

For that reason, I wonder if learning VS Code for investors is akin to learning to construct prompts by hand in summer 2025, i.e. mostly wasted learning that will quickly be subsumed by the next iteration of technology, at least for the vast majority of users (i.e. investors in a front-office seat). It seems like we are very close to a level of intuitive operability with agentic work systems.

13 replies · 12 reposts · 292 likes · 80.1K views
Ville🤖@VilleKuosmanen·
Defence tech funding per capita in the Nordics (via Yle). 🇫🇮 Finland is one of the few Western countries maintaining conscription, and it has a historically strong industrial base and hardware successes. A natural fit for defence tech!
[image]
0 replies · 1 repost · 7 likes · 347 views