julio (bigsxy)
@juemrami
5.4K posts

🇲🇽🤝🇺🇸 LA Software "Engineer" (Unemployed af) Into ML, Web, WoW Addons, Nix, and Embedded. LFG Bulletin Board - 250K+ users | $0 MRR https://t.co/Si89fkutRl

Northshire Valley · Joined June 2017
194 Following · 113 Followers
julio (bigsxy) @juemrami
and look at these nice errors I get, which I won't bother handling
[image attached]
0 replies · 0 reposts · 0 likes · 7 views
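For context on why the errors are "nice": in effect-ts, failures surface in the type signature, so the compiler enumerates everything you choose not to handle. A minimal sketch of that behavior, with made-up error names standing in for whatever the screenshot showed (not his actual client):

```ts
import { Data, Effect } from "effect";

// Hypothetical error types -- placeholders, not from his real client.
class RateLimitError extends Data.TaggedError("RateLimitError")<{ retryAfter: number }> {}
class DecodeError extends Data.TaggedError("DecodeError")<{ body: string }> {}

// The error channel is part of the type: hovering this value shows
// Effect<string, RateLimitError | DecodeError>, i.e. every unhandled
// failure listed up front, whether or not you bother handling it.
const chatCompletion: Effect.Effect<string, RateLimitError | DecodeError> =
  Effect.fail(new RateLimitError({ retryAfter: 30 }));
```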
julio (bigsxy) @juemrami
Decided to just ditch the majority of the SDK, use their zod validator, and write my own effect-ts client over the parts of the API I'm actually using. And look at the fucking difference: nearly half the size. Another reason not to just blindly bring in libraries when your boilerplate is so much easier to write now.
[two images attached]
julio (bigsxy) @juemrami
man the mistral-ai TypeScript API SDK is so bad at tree-shakeability. I'm literally just using 2 of the client's functions atm and it's taking up 50% of my bundle's real estate. …

3 replies · 0 reposts · 1 like · 49 views
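The shape he's describing is easy to sketch: skip the SDK's client class, reuse its published zod validator, and hand-wrap only the endpoint you actually call in an Effect. The schema fields below are placeholders, not Mistral's real response shape:

```ts
import { Effect } from "effect";
import { z } from "zod";

// Placeholder schema -- in his setup this would be imported from the SDK's
// zod validators rather than written by hand.
const ChatResponse = z.object({
  id: z.string(),
  choices: z.array(z.object({ message: z.object({ content: z.string() }) })),
});

// One endpoint, wrapped by hand: fetch plus zod parse, both failures typed.
const chat = (apiKey: string, body: unknown) =>
  Effect.tryPromise({
    try: () =>
      fetch("https://api.mistral.ai/v1/chat/completions", {
        method: "POST",
        headers: { authorization: `Bearer ${apiKey}`, "content-type": "application/json" },
        body: JSON.stringify(body),
      }).then((res) => res.json()),
    catch: (cause) => new Error(`request failed: ${cause}`),
  }).pipe(Effect.flatMap((json) => Effect.try(() => ChatResponse.parse(json))));
```

A few dozen lines like this replace the whole client for the two calls he uses, which is presumably where the bundle-size halving comes from.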
julio (bigsxy) @juemrami
here's what the new graph looks like. I'd like to not use zod, but I'd need to find an OpenAPI-spec-to-effect-schema generator.
[image attached]
0 replies · 0 reposts · 0 likes · 10 views
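If he does drop zod, the hand translation to effect's own Schema is mostly mechanical; a sketch over the same placeholder shape (an OpenAPI-to-Schema generator would emit roughly this):

```ts
// Assumes a recent effect version where Schema ships in the core package.
import { Schema } from "effect";

// The same placeholder shape as the zod version, in effect's Schema.
const ChatResponse = Schema.Struct({
  id: Schema.String,
  choices: Schema.Array(
    Schema.Struct({ message: Schema.Struct({ content: Schema.String }) }),
  ),
});

// decodeUnknown returns an Effect, so validation failures flow through the
// same typed error channel as the HTTP call -- and the zod dependency goes away.
const decodeChatResponse = Schema.decodeUnknown(ChatResponse);
```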
julio (bigsxy) @juemrami
@MichaelArnaldi @LukeParkerDev my agents run on the budget from Copilot (gpt5mini and their "raptor mini" model). They are complete imbeciles. Can't afford a nice Claude/Codex plan unfortunately.
0 replies · 0 reposts · 0 likes · 60 views
Luke Parker @LukeParkerDev
I am this close to crashing out. Every AI just does dumb stuff unless you are so specific you may as well code it yourself. It can help for mass migrations once you've already done the shape and exact impl and have a bunch of boring work. I'm so sick of trying to wrangle it lol
154 replies · 36 reposts · 969 likes · 44.7K views
julio (bigsxy) @juemrami
man the mistral-ai TypeScript API SDK is so bad at tree-shakeability. I'm literally just using 2 of the client's functions atm and it's taking up 50% of my bundle's real estate. Another reason to use Effect if you're building libraries: it makes you build in a way that makes DCE easy for bundlers.
[image attached]
0 replies · 0 reposts · 0 likes · 70 views
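The tree-shaking complaint generalizes: a class-based SDK keeps every endpoint reachable through the client instance, so bundlers have to retain all of them; standalone per-endpoint exports let dead-code elimination drop what you never import. A schematic contrast, not the mistral SDK's actual layout:

```ts
// Bundler-hostile: once `new SdkClient()` appears anywhere, every method is
// reachable through the instance, so DCE keeps all of them in the bundle.
export class SdkClient {
  chat(prompt: string) { return `chat:${prompt}`; }
  embeddings(input: string) { return `embed:${input}`; }
  // ...dozens more methods, all retained
}

// Bundler-friendly: one standalone export per endpoint. A consumer who only
// imports `chat` lets the bundler tree-shake `embeddings` and the rest away.
export const chat = (prompt: string) => `chat:${prompt}`;
export const embeddings = (input: string) => `embed:${input}`;
```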
julio (bigsxy) @juemrami
@kimmonismus Did you just rehash the OP tweet for the sake of making this a quote tweet instead of just a retweet?
0 replies · 0 reposts · 0 likes · 306 views
Chubby♨️ @kimmonismus
Very impressive: MSA (Memory Sparse Attention) is so exciting because it lets AI models directly store and reason over massive long-term memory inside their attention system, without relying on external retrieval or lossy compression, making them far more accurate and scalable. It allows a 100M-token context window with minimal performance loss.
[image attached]
艾略特 @elliotchen100

The paper is here. It's called MSA, Memory Sparse Attention. What it is, in one sentence: it gives large models native ultra-long memory. Not bolted-on retrieval, not brute-force window expansion; "memory" is grown directly into the attention mechanism and trained end to end.

Why don't the older approaches work? RAG is essentially an open-book exam: the model remembers nothing itself and flips through notes on the spot. Accuracy depends on retrieval quality, speed on data volume, and once information is scattered across dozens of documents and needs cross-document reasoning, it falls apart. Linear attention and KV caching are essentially compressed memory: they remember, but the more they compress the blurrier it gets, and long content gets dropped.

MSA's approach is completely different:
→ No compression, no external add-ons; the model learns to "look at what matters." The core is a scalable sparse-attention architecture with linear complexity: 10x the memory without compute cost blowing up.
→ The model knows where each memory came from and when. A positional encoding called document-wise RoPE lets it natively understand document boundaries and temporal order.
→ Fragmented information can still be chained into reasoning. A Memory Interleaving mechanism lets the model do multi-hop reasoning across memory fragments scattered everywhere: not just retrieving one relevant record, but linking clues into a chain.

The results?
· Scaling from 16K to 100M tokens with less than 9% accuracy degradation
· A 4B-parameter MSA model beats top 235B-class RAG systems on long-context benchmarks
· 100M-token inference runs on 2 A800s; this isn't lab-only, it's a cost a startup can afford.

Put plainly: LLMs so far have been extreme geniuses with goldfish memory. MSA is about making them actually remember. We've put it on GitHub; the algorithm folks worked hard on this, so please drop a star to support them. 🌟👀🙏 github.com/EverMind-AI/MSA

32 replies · 69 reposts · 881 likes · 76.8K views
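A back-of-envelope way to read the "linear complexity" claim (my framing, not the paper's): dense attention scores every token pair, while a sparse scheme gives each query a fixed budget of k selected memory blocks, so cost grows linearly rather than quadratically in context length n:

```latex
% Dense self-attention vs. a fixed-budget sparse scheme (k selected blocks):
\[
  C_{\mathrm{dense}}(n) = O(n^{2}), \qquad
  C_{\mathrm{sparse}}(n) = O(n \cdot k), \quad k \ll n
\]
```

With k held constant, 10x the memory costs roughly 10x rather than 100x, which is what makes 100M-token inference on two GPUs even conceivable.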
julio (bigsxy) @juemrami
@haydendevs I know you can do this with other tools, but Nix keeps it unified under one "dsl".
0 replies · 0 reposts · 0 likes · 14 views
julio (bigsxy) @juemrami
@haydendevs If you appreciate declarative setups, especially for little system hacks that you might otherwise forget about when you move from machine to machine? Yes.
1 reply · 0 reposts · 1 like · 81 views
hayden @haydendevs
is nix actually better than arch
48 replies · 0 reposts · 86 likes · 9.1K views
Christoffer Bjelke @chribjel
I wish the open-weight models like Kimi/GLM/MiniMax had a license under which all derivative models would also have to be open weight.
10 replies · 3 reposts · 235 likes · 12.8K views
julio (bigsxy) reposted
Darth Powell @VladTheInflator
[image attached]
87 replies · 897 reposts · 7.6K likes · 270.9K views
julio (bigsxy) @juemrami
@YashGouravKar1 yea, just like using Express or whatever other lib you were comfortable with. I've just never had to actually implement one, so I didn't know how. I had an idea, but I wasn't going to sit there and google it. Pretty fucking bad on my part lol. 🤷‍♂️ I shoulda just googled it xD
2 replies · 0 reposts · 1 like · 24 views
julio (bigsxy) @juemrami
I flopped so fucking hard it's not even funny. Well, it is funny. I just realized I've never had to implement an HTTP server from an empty project. Even though I've made my own HTTP parser and simple server in Zig, I had 0 clue how to "spin one up" in JavaScript land. I knew the tech to do it, either Effect HttpServer or Hono, but I didn't know the implementation details and I looked like a fucking idiot.
julio (bigsxy) @juemrami
I have a phone screening interview today and I'm incredibly nervous. Impasta syndrome.

1 reply · 0 reposts · 0 likes · 49 views
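For the record, the "spin one up" step he blanked on takes very little in JavaScript land; a minimal sketch with Node's built-in http module (no Hono or Effect required, and nothing here is from his actual interview):

```ts
// Minimal HTTP server using only the Node standard library.
import { createServer } from "node:http";

const server = createServer((req, res) => {
  // Echo the method and path back as JSON.
  res.writeHead(200, { "content-type": "application/json" });
  res.end(JSON.stringify({ method: req.method, path: req.url }));
});

// Start listening; `curl localhost:3000/hello` should answer immediately.
server.listen(3000, () => console.log("listening on :3000"));
```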
julio (bigsxy) reposted
艾略特 @elliotchen100
The paper is here. It's called MSA, Memory Sparse Attention. What it is, in one sentence: it gives large models native ultra-long memory. Not bolted-on retrieval, not brute-force window expansion; "memory" is grown directly into the attention mechanism and trained end to end. … github.com/EverMind-AI/MSA
[image attached]
艾略特 @elliotchen100
A little spoiler: @EverMind will drop another high-quality paper this week.

164 replies · 566 reposts · 3.1K likes · 1.6M views
sami @hellosami
I started teaching 4th graders graphic design at an afterschool program, and this one kid does not engage with any of the material, but he comes into class and furiously googles the Titanic for an hour every week
[image attached]
588 replies · 674 reposts · 23.2K likes · 19.4M views
julio (bigsxy) @juemrami
I don't get why most software has to be a service. I get providing services for spaces and populations that are either hardware-bound or don't have the technical savvy to interact with your software outside of, like, GUIs or something. But beyond that, stuff like SaaS just makes it seem like you're trying to milk as much money as possible from customers by making yourself look like a requirement for the software to run or stay maintainable. IDK, I'm not a business person, but as a consumer it does all seem like a giant game.
0 replies · 0 reposts · 0 likes · 16 views
julio (bigsxy) @juemrami
why does GitHub attribute commits on your contribution graph to when they were pushed to the GitHub remote vs. when the commits were actually made?
0 replies · 0 reposts · 0 likes · 14 views