mark

2.6K posts

@thisisnotmark_

AI/ML Research Lead @ NASA | Founder @ Mosaic Voice AAC. Views my own not my employer’s.

Joined February 2015
1.5K Following, 545 Followers
Pinned Tweet
mark
mark@thisisnotmark_·
will be presenting my paper “Life, Machine Learning, and the Search for Habitability: Predicting Biosignature Fluxes for the Habitable Worlds Observatory” tomorrow at #AAAI. if anyone at AAAI wants to meet for coffee let me know! arxiv.org/abs/2601.12557
0
0
3
518
mark retweeted
NASA Science
NASA Science@NASAScience_·
If you’re seeing this… this is your sign. Take the next giant leap in your future by applying for an internship, where students work on real NASA projects, build technical and professional skills, and learn directly from NASA mentors: go.nasa.gov/4tSQNdU
NASA Science tweet media
119
360
1.6K
371.8K
mark
mark@thisisnotmark_·
anyone at OpenAI, PLEASE take the fallback hell and the code bloat the model produces as seriously as you probably are with UI/UX. it’s a real problem and it’s single-handedly the biggest annoyance and time sink of working with codex today. it needs to know when to delete and remove stuff instead of always adding and adding and adding
1
0
2
34
mark
mark@thisisnotmark_·
I found this old pic my friends and coworkers and I used to pass back and forth when I was an intern (2017!!!) as a joke. I think about this often. the most ironic outcome truly is the best indicator of what’s to come
mark tweet media
0
0
1
27
mark
mark@thisisnotmark_·
@oprydai any research you know about you can point me to on this topic?
0
0
0
17
Mustafa
Mustafa@oprydai·
hot take: world models will be more impactful in biology than in robotics
21
8
128
7.2K
mark
mark@thisisnotmark_·
thank heavens for this small bash trampoline
mark tweet media
0
0
0
30
mark
mark@thisisnotmark_·
do I just accept it and stop fighting this uphill battle against enshittification of my code base? I can’t keep manually fixing code when it gets overwritten on a whim by these agents
1
0
0
20
mark
mark@thisisnotmark_·
how do you get codex to stop producing so much code bloat and slop? I swear it has no sense of good software practices and just keeps adding patch after patch and fallback after fallback no matter how hard I try
1
0
0
37
mark
mark@thisisnotmark_·
I have several research projects I’m working on I think you may be interested in. would be cool to see if we can logistically collaborate, if you’re interested! in AI for science, particularly projects within earth science, astrobiology, heliophysics, and autonomous decision making
0
0
2
303
0xSero
0xSero@0xSero·
For any researchers in my network: I want to take research more seriously to produce useful info, I have no academic background. Beyond prompting what resources and practices would you recommend?
75
17
611
64.8K
mark
mark@thisisnotmark_·
I see things going this way but it is an increasingly scary prospect. when I do dip down into the code, oftentimes I see such dirty and hacky workarounds. but often they don’t actually hurt performance in that moment, just human readability and my propensity for writing clean code. but do I fight against that, or accept that this is how things are and let it do its thing, while only steering for functionality, system architecture, overall objective, instead of ALSO clean and aesthetically pleasing code?
François Chollet@fchollet

Agentic coding is a form of machine learning. Generated code is best treated as a blackbox artifact whose behavior and generalization should be managed via empirical evaluation, like with any ML model.

0
0
0
49
mark
mark@thisisnotmark_·
this behavior from the models is the WORST. bane of my existence
Taelin@VictorTaelin

Aaand 3 hours later, it failed (obviously)
I mean that post was a joke but I honestly don't know how I could possibly solve this problem. Here's the story:
While implementing an app in Bend, GPT-5.5 found a bug: some memory was being reclaimed even though there were still pointers to it. This crashed the app, so, it went for a fix.
The solution: GPT added a *marker* for objects that *might have been wrongly reclaimed* so that it can rollback the operation later on.
This is horrible. It is just sad. There is no scenario on which this would ever be a good idea. Even by READING the idea you can tell it is stupid. This behavior is the only thing preventing me from being outside, playing with the cat, instead of babysitting agents all day.
The real issue, ofc, is that a primitive function was cloning pointers without marking them as "shared". But what makes me the most upset is HOW I made the AI realize this. It went more or less like this:
"Hey, do you understand how Bend reclaims memory?"
"Yes, ."
"Okay, and what was the bug?"
"Yes, ."
"Is that possible given your explanation?"
"No, that should never happen."
"Do you see the problem now?"
"Yes. Let me fix it."
Like. I didn't spell out the solution. I just told him that it did a bad thing, and asked it to pay more attention. It then figured and implemented the correct solution all on its own. So, basically:
- GPT 5.5 was smart enough to find the bug
- GPT 5.5 was smart enough to understand the bug
- GPT 5.5 was smart enough to understand the system
- GPT 5.5 was smart enough to fix the bug
But it didn't by default. It just duct taped a terrible patch that would leave the project in a permanently broken state and eventually explode. Until I asked it to "do it right". And then it did.
But why. Why do I need to be here watching. Why can't it do that on its own. So close. Yet, so far...

0
0
0
55
mark
mark@thisisnotmark_·
I think I’m converging on the same thing. I think there’s too much cognitive load reading every intermediate step between prompts, especially when you add iterating over the markdown plan in the first place. it becomes too much to read, and then you’re trying to manage several agents at once. I like to sit at the prompting level until the feature is working superficially and then dig deep into the code and figure out how specifically I’d like it to refactor. I usually catch a few things where I hate how the agent went about it. then I dig really deep once the PR is made. if the feature isn’t working well after a few iterations though, I will step in because usually that’s when the agents start patching things in hacky workaround ways and then all hell breaks loose if you don’t cut that at the head
0
0
0
14
49 Agents IDE - IDE for Agentic Coding
@thisisnotmark_ good question. i found the middle ground works best - review at feature completion, not every prompt. the agent needs room to iterate but you catch issues before it's too deep. i checkpoint more frequently on new components, less on refactors. what worked for you?
1
0
1
26
mark
mark@thisisnotmark_·
at what checkpoints of the agentic coding loop do you actually sit and review your code? after every prompt? PR level? do you do it at all or just review prompts?
1
1
1
43
mark
mark@thisisnotmark_·
@thsottiaux did the 10x usage increase from the party or promotion get removed somehow? I noticed my rate limit got cut in half, and others have as well (see here: reddit.com/r/codex/commen…) my account page also doesn't mention the temp 10x increase
0
0
0
27
mark retweeted
porter robinson
porter robinson@porterrobinson·
_,.-^'*'^RT if daft punk is ur favorite dubstep band^'*'^-.,_
49
1K
2K
0
mark
mark@thisisnotmark_·
codex team, how did you even know what I was up to lol
mark tweet media
0
0
1
33
mark
mark@thisisnotmark_·
do I get a mac mini solely for always on mac-related dev or get a new macbook pro m5 pro when it comes out and keep my old m1 macbook pro to be always on? I feel weird about having a thing with a screen rigged to be always on
0
0
1
255
mark
mark@thisisnotmark_·
@AllanatrixQ you in DC? been wanting to start/attend an AI for Science meetup out here. there are probably at least like 8 of us here
1
0
1
55
Allan
Allan@AllanatrixQ·
Honestly I won’t shut up about AI4Science I’m obsessed
1
0
2
146