Arkin Terli

341 posts

@arkinterli

Technologist. Generalist. Believer that true art lies in ultimate simplification. Personal opinions. Formerly at ArgoAI, Apple, IMG, AMD, and Microsoft 🇺🇸🗽

San Francisco, CA · Joined February 2010

475 Following · 185 Followers

Arkin Terli@arkinterli·7h

@simplifyinAI Very cool!

167

Simplifying AI@simplifyinAI·12h

Microsoft just solved the context window problem.

Right now, every AI suffers from a fatal flaw: the "context window problem." When an AI reasons through a complex problem, it generates a massive chain-of-thought. But there is a catch. It has to keep every single token of that thought in its active memory. The technical term is the "KV Cache." The longer the AI thinks, the heavier it gets. It slows down. It gets expensive. Eventually, it runs out of space.

We thought the only fix was renting bigger, more expensive cloud GPUs to hold all that context. Microsoft just proved us wrong. They published a paper called "MEMENTO." Instead of giving the AI a bigger memory, they taught it how to forget.

Here is how it works: Instead of generating one endless stream of consciousness, a Memento-trained model breaks its reasoning into small blocks. After it finishes a block, it writes a dense, highly compressed summary of its own logic: a "memento." Then, it does something unprecedented. It physically deletes the entire previous reasoning block from its memory cache. It only carries the memento forward. The model reasons, extracts the core logic, and instantly drops the dead weight.

The results rewrite the economics of running AI.
• Context length compressed by 6x.
• Active memory usage (KV cache) reduced by 2.5x.
• Zero loss in math, science, or coding accuracy.

And here is the real implication. Big tech has been charging you by the token for massive context windows you don't actually need. With this architecture, small businesses and solo operators can run complex, multi-step autonomous agents entirely locally. You don't need an enterprise cloud setup. A standard machine running an open-source model can now reason indefinitely without overflowing its memory. No API fees. Complete privacy.

We spent the last two years trying to give AI an infinite memory. It turns out, the secret to smarter AI isn't remembering everything. It's knowing exactly what to forget.
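The reason-summarize-drop loop described in the post can be sketched as a toy token-count simulation. This is only an illustration under stated assumptions, not the paper's implementation: `BlockwiseContext`, `keep_first`, and `run_demo` are hypothetical names, a plain Python list stands in for the KV cache, and the "summary" simply keeps the first few tokens of a block, whereas a trained model would write its own memento.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BlockwiseContext:
    """Toy stand-in for a KV cache: the tokens the model still attends to."""
    prompt: List[str]
    mementos: List[str] = field(default_factory=list)
    block: List[str] = field(default_factory=list)
    peak_tokens: int = 0       # largest number of tokens held at once
    total_generated: int = 0   # every token ever produced, kept or dropped

    def active_tokens(self) -> int:
        return len(self.prompt) + len(self.mementos) + len(self.block)

    def append(self, token: str) -> None:
        """Generate one reasoning token inside the current block."""
        self.block.append(token)
        self.total_generated += 1
        self.peak_tokens = max(self.peak_tokens, self.active_tokens())

    def close_block(self, summarize: Callable[[List[str]], List[str]]) -> None:
        """Write a memento for the block, then drop the block itself."""
        memento = summarize(self.block)
        self.mementos.extend(memento)
        self.total_generated += len(memento)
        # The block is still cached while its memento is being written.
        self.peak_tokens = max(self.peak_tokens, self.active_tokens())
        self.block.clear()


def keep_first(k: int) -> Callable[[List[str]], List[str]]:
    """Placeholder summarizer (assumption): keep the first k tokens."""
    return lambda block: block[:k]


def run_demo(n_blocks: int = 5, block_len: int = 100,
             memento_len: int = 10) -> BlockwiseContext:
    ctx = BlockwiseContext(prompt=["p"] * 10)
    summarize = keep_first(memento_len)
    for b in range(n_blocks):
        for i in range(block_len):
            ctx.append(f"b{b}t{i}")
        ctx.close_block(summarize)
    return ctx


ctx = run_demo()
full_history = len(ctx.prompt) + ctx.total_generated
print(full_history, ctx.peak_tokens, ctx.active_tokens())
```

With these made-up sizes, an append-only context would hold 560 tokens by the end, while the blockwise version peaks at 160 and carries 60 forward. The ratios depend entirely on the chosen block and memento lengths and say nothing about the figures quoted in the post.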

263

15.2K

Arkin Terli@arkinterli·8 Apr

@Zai_org Great model, but it breaks down above 100k context.

4.7K

Z.ai@Zai_org·7 Apr

Introducing GLM-5.1: The Next Level of Open Source
- Top-Tier Performance: #1 in open source and #3 globally across SWE-Bench Pro, Terminal-Bench, and NL2Repo.
- Built for Long-Horizon Tasks: Runs autonomously for 8 hours, refining strategies through thousands of iterations.
Blog: z.ai/blog/glm-5.1
Weights: huggingface.co/zai-org/GLM-5.1
API: docs.z.ai/guides/llm/glm…
Coding Plan: z.ai/subscribe
Coming to chat.z.ai in the next few days.

536

1.3K

10.8K

4.2M

Arkin Terli@arkinterli·30 Mar

@brockpierson yes

⭕ Brock Pierson@brockpierson·29 Mar

Last night I played 4 hours of the original Command & Conquer Red Alert with a friend. I absolutely loved this game. One of the first PC games I truly obsessed over. Released in 1996. Still an incredible game. Did you play it?

850

238

7.3K

540.6K

Arkin Terli@arkinterli·29 Mar

@CuriosityonX Wow

Curiosity@CuriosityonX·28 Mar

NEWS🚨: Western Australia’s sky turned an eerie shade of red as dust filled the air ahead of Tropical Cyclone Narelle.

175

1.3K

82.1K

Arkin Terli@arkinterli·7 Mar

@awnihannun I believe it will take less than 2 years.

259

Awni Hannun@awnihannun·6 Mar

According to benchmarks Qwen3.5 4B is as good as GPT 4o. GPT 4o came out ~2 years ago (May 2024). Qwen 3.5 4B runs easily on modern mobile devices. So the gap between frontier intelligence in a datacenter and running a model of equal quality on your iPhone could be 2-3 years. (Probably closer to 3 assuming Qwen3.5 4B is more benchmaxxed than 4o) I don't expect the trend of increasing intelligence-per-watt to change. So in 2-3 years it's plausible we will be running GPT 5.x quality models on an iPhone. Pretty wild.

123

147

198.9K

Arkin Terli@arkinterli·6 Mar

@thdxr No.

122

dax@thdxr·6 Mar

ugh this is gonna beat the tui isn't it

Kit Langton@kitlangton

stupid sexy composer

128

2.3K

202.8K

Arkin Terli@arkinterli·4 Mar

@XorDev @wookash_podcast well said!

Xor@XorDev·3 Mar

@wookash_podcast It seems to me that LLMs actually raise the bar for excellence. The generalized work is not as important, but the advanced knowledge, concepts and nuance are even more important now. There's no shortcuts to those high level skills. It requires starting from the bottom

2.5K

Łukasz | Wookash Podcast@wookash_podcast·3 Mar

I don't get "knowledge is worth nothing now" LLM crowd. Have you ever tried to build something? If you want to build something there are thousands of small decisions, assumptions, tests to make, validate and run. There is a reason why even though you have a calculator, everyone learns in school what's 2+3. You don't want (or can't) in real world reach for calculator for 2+3. So how are you going to make something with zero knowledge? "Hey I want to go to Mars, how do I do that?" "Here are five easy steps to get to Mars in no time!"

423

17.8K

Arkin Terli@arkinterli·11 Feb

@XorDev As always, magnificent.

Xor@XorDev·11 Feb

Orchard vec3 p,v=normalize(FC.rgb*2.-r.xyx),c=v/v.y;c.z+=.5*t;for(float z,i,b,g,m;i++<5e1;z+=.8*max(b=length((p.y-m)/1e2/(abs(sin(c.xz/.1))-.05/v.y)),min(4.-m,g=length(sin(p.xz)+1.-.1*(1.+sin(p.y-p.zx*.5))*m))-b),o.rgb+=(.7-v)/(g+b))p=z*v+1.,p.z-=t,m=abs(++p.y);o=tanh(o/5e2);

491

12.3K

Arkin Terli@arkinterli·11 Feb

30+ years in the 21–24 BMI range. Current: 6’2” | 179 lbs | BMI 23 Secret: Eat healthy.

Arkin Terli@arkinterli·17 Jan

@tomwarren @zerohedge Wow

Tom Warren@tomwarren·17 Jan

"I kind of think of ads as like a last resort for us as a business model," - Sam Altman, October 2024

Sam Altman@sama

We are starting to test ads in ChatGPT free and Go (new $8/month option) tiers. Here are our principles. Most importantly, we will not accept money to influence the answer ChatGPT gives you, and we keep your conversations private from advertisers. It is clear to us that a lot of people want to use a lot of AI and don't want to pay, so we are hopeful a business model like this can work. (An example of ads I like are on Instagram, where I've found stuff I like that I otherwise never would have. We will try to make ads ever more useful to users.)

774

6.6K

77.4K

5.4M

Arkin Terli@arkinterli·14 Jan

@ScottAdamsSays RIP

Scott Adams@ScottAdamsSays·13 Jan

A Final Message From Scott Adams

13.2K

32K

191.6K

43M

Arkin Terli@arkinterli·25 Dec

❄️🎄 Merry Christmas 🎄❄️

Arkin Terli@arkinterli·13 Jul

@XorDev You are about to get into the demoscene world.

102

Xor@XorDev·12 Jul

"Twist" for(float i,z,d,s;i++<1e2;){vec3 c=vec3(1,3,5)+s/50.,p=z*normalize(FC.rgb*2.-r.xyy),a=normalize(cos(c));p.z+=30.;a=a*dot(a,p)-cross(a,p),a.xy*=mat2(cos(t+vec4(0,33,11,0))),a=abs(a);z+=d=.05+.1*abs(abs(a.z-20.)-cos(s=a.x+a.y));o.rgb+=(cos(.1*i-c)+1.)/d/d;}o=tanh(o/6e3);

240

7.3K

Arkin Terli@arkinterli·16 May

@flockaroo @XorDev

Florian Berger@flockaroo·14 May

vec3 q=vec3(0,0,1e4),v=FC.gbr-r.yxx*.3,p;for(float i,s,d;i++<53.;d=.2*t){for(p=q,s=6e3;4.<s;p=p.zxy-s*sin(p/s*6.3)*.05,s*=d=.8)p.yz*=rotate2D(d);d=length(vec2(length(p.yz)-1e4,p.x))-3e3;v.x+=v.x*step(7e2,q.x)*(1.1-mod(FC.y,2.)*2.);q+=sin(i)+v/r.x*d;o+=exp(-d*d/vec4(3,2,1,1))/i;}

155

1.9K

57.6K

Arkin Terli@arkinterli·11 May

Happy Mother’s Day! ✨💐🌎 #MothersDay

268

Arkin Terli@arkinterli·6 May

@_trish_xD C++ can be as easy as any other language if you have enough experience. It’s your experience, not the language itself, that makes things complicated.

trish@_trish_xD·6 May

The Happy Periodic Family of Programming Languages! It’s not just chemistry that has a periodic family anymore — now developers have one too!

218

11.6K

Arkin Terli@arkinterli·5 May

@XorDev YES!

262

Xor@XorDev·5 May

Allman indentation is the only acceptable style

Arkin Terli@arkinterli·29 Apr

@levelsio Good times!

@levelsio@levelsio·28 Apr

Power still down in Spain and Portugal
All internet is gone too cause the 4G masts ran out of battery power, at least in Portugal, don't know Spain
Phone calls don't even work anymore!
Also I heard many gas stations in Portugal can't pump gasoline cause their pumps work on electricity
Only internet we have is in the Continente supermarket which I'm typing this on 😂
We went to buy an analog battery radio at Radio Popular and they're all sold out already! 📻

@levelsio@levelsio

Power is STILL down
That makes it I think the worst blackout in Europe since 2006 and maybe longer
We're in Portugal and went to supermarket to get water and food in case power doesn't come back (small chance but already hours now)
Absolute anarchy ala COVID 2020 there
Internet progressively went from 5G to 4G then 3G then 2G then nothing at all progressively, it seems the telecom masts have a battery for just a few hours
Anyway everyone's buying water, food and toilet paper
ATMs are all empty of cash money
Funny we drove past some guys working on an electricity mast, maybe they didn't get the news it's entire Spain and Portugal? 😂

259

2.5K

772.2K

Arkin Terli@arkinterli·28 Apr

@lauriewired I love it!

631

LaurieWired@lauriewired·27 Apr

Major new QEMU update released. The coolest part? Paravirtualized Apple GPUs. You can now spin up disposable macOS VMs *with* hardware acceleration. macOS guests now expose a thin vGPU (apple-gfx-mmio). Very useful for CI, reverse engineering, gfx research, etc.