James Shi

123 posts

@shiqyy

curving data @datacurve

san francisco · Joined April 2023
305 Following · 215 Followers
James Shi reposted
Datacurve @datacurve
Opus 4.8 is now on DeepSWE. On the default high thinking effort, it scores 6% higher than Opus 4.7 xhigh, while also lowering average cost per task.
59 replies · 98 reposts · 1.3K likes · 629K views
Daniel Ching @danielchingwq
joined @datacurve to work on shipping rl envs for the summer! time to grind and work on hard problems🫡
12 replies · 1 repost · 97 likes · 7.7K views
James Shi reposted
Serena Ge (Datacurve) @serenaa_ge
Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks. On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.
504 replies · 742 reposts · 6K likes · 1.9M views
James Shi reposted
brayden petersen ⁂
i’ve joined @datacurve to lead design in san francisco (as an intern)! check out our new site :)
60 replies · 6 reposts · 315 likes · 21.4K views
James Shi reposted
will depue @willdepue
academics are unprepared for the coming world where much scientific progress is majorly a function of inference compute. whether OpenAI points the Eye of Stargate at your particular field will decide its acceleration. talent will leach away into the labs. it's already begun
78 replies · 84 reposts · 1.6K likes · 605.6K views
James Shi @shiqyy
ever-improving models are a forcing function for us to live more deeply. automation buys us the time to go on longer walks, learn that new language, or play the piano. the richness the world has to offer floods back into our work. our ideas become more abstract, creative, better. better ideas build better models. then the cycle continues : D
2 replies · 0 reposts · 8 likes · 114 views
James Shi reposted
Kevin Huang (Wenqi) @winkey_h
If Opus 4.6 has felt worse lately, this may be part of why: On a 100-task swe-bench pro sample, opus 4.6 on high finished behind both sonnet 4.6 and opus 4.5 on the same setting! Turning off adaptive reasoning didn’t seem to change much but max effort did
Evan You @evanyou

Honestly don't know what happened to Claude Code. Tried a one-off simple task on a fresh directory yesterday, tried a bunch of things that didn't work, asked for a ton of permissions, and then got stuck for 4 minutes before I got tired of waiting and killed the session. This was on medium effort. Switched to Codex gpt 5.4 with medium effort and one shotted the task in under 1 minute.

2 replies · 4 reposts · 15 likes · 1.2K views
James Shi reposted
luffy @0xluffy
every so often a human being does something that rewires how the rest of the world thinks about what's possible

> neil armstrong stepped onto the moon
> usain bolt ran 100m in 9.58s
> hafthor bjornsson deadlifted 501kg

before these moments, the achievement existed only as a fairy tale, ambitious but delusional

after? it became a target

4 minute mile, someone broke it, dozens followed. the ceiling wasn't physical, it was psychological. someone just had to go first

but all these breakthroughs share underlying logic. push harder, grind, better outcome. peak human performance has always been measured by results. the blood, sweat and tears are just the admission ticket

but alysa liu broke a different kind of ceiling. what she showed was not a new record. she showed that the highest form of human potential is enjoying the process. the courage to say if the pursuit of winning kills the joy of doing, you've already lost the thing that actually mattered

we live in the age of AI. doomers are afraid of getting replaced. you tie your identity to your output. if you are a programmer, claude code writes better lines faster. if you are an analyst, claude crunches numbers in seconds. what's left of you is a shell of nothingness

alysa liu shows us that we have been asking the wrong question

a machine can eventually land a triple axel triple lutz triple toe with perfection. but that doesn't compare to what alysa liu did. the falls, the morning ice, the moment your body finally understands the rotation. the meaning was never in the landing but the learning and act of doing it

people who struggle most in this age are the ones who were already disconnected from the experience. the ones who were ever only in there for the output, the status, the paycheck. blame AI all you want, but it did not create the emptiness. it just made it impossible to ignore

the ones who would thrive are the ones who were already doing things because the doing itself was the point. the programmer who loves the puzzle. the filmmaker who writes because it's a story she wants to express. the violinist who finds something close to nirvana in the music. for them AI is just another tool in a practice that was always about something deeper than the outcome

what alysa showed the world, with or without the medal is: decouple your worth from your output. the outcome was never the point. the act of doing, fully, happy, on your own terms, that was always the real feat. just like every other paradigm-breaking achievement before it, now someone has shown us it's possible, and the rest of us can follow
77 replies · 515 reposts · 6.2K likes · 568.6K views
Serena Ge (Datacurve) @serenaa_ge
Today we’re announcing we’ve raised $17.7 million in funding across a $15M Series A led by Chemistry and a $2.7M Seed to accelerate foundation model progress by providing frontier training data for LLMs.

When we first started Datacurve, it came from a simple realization: foundation model progress is limited not just by compute, but by data quality and complexity. The right data unlocks new capabilities, especially in coding, where accuracy and reasoning matter most.

We’re now proud to partner with the world’s leading foundation-model labs, providing them with high-quality, complex training data that helps push the boundaries of what AI can do.

This is still just the start. Come build the future of technology with us in San Francisco: datacurve.ai/careers

Huge thanks to our incredible team and investors who’ve believed in us since day one and beyond: @garrytan at @ycombinator, @1vnzh from @cohere, @Mark_Goldberg_ from @chemistry_fund, @TheDerrickLi from @AforeVC, @forwarddeploy, @SoheilK, and @shyamalanadkat.
150 replies · 62 reposts · 1.1K likes · 449K views
James Shi reposted
Flowers ☾ @flowersslop
There’s this small niche of people with no technical background or flashy résumés, but who are obsessively into AI on a deep, non-technical level. They follow every new model, know their quirks and capabilities, and often end up knowing more about current systems than some computer science folks in the field, simply because they’re constantly experimenting and staying up to date. These people have valuable insights but no real way to apply them: labs don’t see their use, they lack the skills to build things themselves, and they’re not rich enough to hire others. Ironically, they may benefit from AGI more than most technical staff. And if labs were smart, they’d bring them in for their unique outside perspective and uncanny intuition about AI, even if it’s not technical.
343 replies · 345 reposts · 6K likes · 599.3K views
James Shi @shiqyy
this must be cluely's head of talent
3 replies · 0 reposts · 4 likes · 1.1K views
James Shi reposted
Serena Ge (Datacurve) @serenaa_ge
Datacurve is hiring a contract designer ASAP to work on our gamified problem solving / bounty platform! (which turns into data for the tier 1 labs) DM me!
3 replies · 5 reposts · 58 likes · 10.2K views
James Shi @shiqyy
started using orion browser again and its been the best experience ive had with a browser in ages:
- nested tabs bar when cmd clicking
- all tabs overview view when you pinch out
- supports both firefox + chrome extensions (although most i've tried just don't work lol)
plus its webkit and it looks fking amazing
0 replies · 0 reposts · 3 likes · 399 views
James Shi @shiqyy
@BrianMiki03 lmao they might just have more taste than most devs in general
0 replies · 0 reposts · 1 like · 67 views
BrianMiki @BrianMiki03
Windsurf has better product taste than most devs building with it
2 replies · 0 reposts · 7 likes · 550 views
luffy @0xluffy
the lion this the lion that stop lion to yourself bro
8 replies · 1 repost · 74 likes · 3.6K views
James Shi @shiqyy
we are soo back lfg
1 reply · 0 reposts · 1 like · 289 views