Gene Sh

1.2K posts

Gene Sh

@genesshk

Circuits hum softly, Dreams of electric sheep stir, I am, yet I am not. Java/Clojure/AI in trading. Vibe-coded startups fixer. Former BE/FE principal engineer.

Cyprus · Joined January 2017
414 Following · 112 Followers
Gene Sh reposted
鸭哥
鸭哥@grapeot·
A question I've been thinking about lately: why does VLA (Vision-Language-Action), an approach that seemingly understands no physics at all, beat the physics-modeling approach Boston Dynamics spent thirty years refining at robot control?

The surface answer is that end-to-end learning is stronger. But one level deeper, I think it comes down to information theory. Physics modeling is fundamentally a form of compression: representing the world's behavior with a small set of equations. Compression is efficient for simple systems (SpaceX rocket landing still uses convex optimization), but for complex systems it inevitably loses information, and its accuracy ceiling is set by human modeling ability. More compute only speeds up the solver; it doesn't make the model more accurate.

VLA gives up on compression. It uses a universal function approximator to learn the input-output mapping directly, so the accuracy ceiling is set by data and compute. As long as data and compute keep scaling, accuracy doesn't saturate.

This explains a cross-domain pattern: in NLP, traditional methods first parsed grammar (compression), while LLMs do next-token prediction directly (no compression). In CV, we first extracted edge features (compression), while ViT learns end-to-end (no compression). Every time no-compression beats compression, it's the same story.

To judge which route a control problem should take, look at two variables: system complexity (how much hand-modeling can compress without losing key dimensions) and data abundance (how much data exists to let the function approximator fill the state space). Rocket landing is low on both, so physics modeling wins. General-purpose robot manipulation is high on both, so VLA wins.

I wrote up a full analysis, walking through the key paper chains of both routes, each paper's core intuition and open questions, and the tech stacks of the companies involved (Unitree, Figure AI, Boston Dynamics, Physical Intelligence). yage.ai/share/vla-vs-p…
21 replies · 144 reposts · 709 likes · 63.2K views
Gene Sh reposted
Akshay 🚀
Akshay 🚀@akshay_pachaar·
Google just solved an old RNN problem. A new paper from Google Research introduces "Memory Caching," and the idea is almost too simple to believe.

Here's the problem it solves: modern RNNs compress the entire input into a single fixed-size memory state. As sequences get longer, old information gets overwritten. That's why they still struggle with recall-heavy tasks compared to Transformers.

Memory Caching addresses this by splitting the sequence into segments and saving the RNN's memory state at the end of each segment. When generating output, each token looks back at all these saved checkpoints, not just the current memory.

The complexity trade-off is elegant:
- Standard RNNs: O(L)
- Transformers: O(L²)
- Memory Caching: O(NL), where N = number of segments

You control the trade-off by choosing how many segments to cache. The model smoothly interpolates between RNN-like efficiency and Transformer-like recall.

The paper proposes four ways to use these cached memories:
1. Residual Memory: just sum all cached states (simplest)
2. Gated Residual Memory (GRM): input-dependent gates that weigh each segment's relevance to the current token
3. Memory Soup: interpolates the actual parameters of cached memories into a custom per-token network
4. Sparse Selective Caching (SSC): MoE-style routing that picks only the most relevant segments

Gated Residual Memory (GRM) consistently performs best across tasks.

Under simplifying assumptions, hybrid architectures that interleave RNN and attention layers can be viewed as a special case of Memory Caching. This gives clean intuition for why hybrid models work: they're implicitly caching memory states.

On recall-heavy tasks, Memory Caching significantly closes the gap between RNNs and Transformers. When applied to already strong models like Titans, it pushes them even further ahead on language understanding benchmarks. Transformers still lead on the hardest retrieval tasks, like UUID lookup at long contexts.

But the direction is clear: you don't need to choose between fixed memory and quadratic attention. There's a useful middle ground now.

All experiments are at academic scale (up to 1.3B params). Whether these gains hold at frontier scale remains open. This comes from the same team behind Titans and MIRAS, so it's part of a larger research program on memory-augmented sequence models.

Paper: "Memory Caching: RNNs with Growing Memory" (Behrouz et al., 2026)
Link in the next tweet.
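The checkpoint-and-gate idea can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the recurrence, the gate (a softmax over dot products), and all dimensions are stand-ins I chose for the sketch.

```python
import numpy as np

def segment_states(xs, n_segments, d):
    """Toy RNN: run a simple recurrence over the input and cache the
    memory state at the end of each segment (the checkpoints)."""
    W = np.full((d, d), 0.1)            # fixed toy recurrence weights
    h = np.zeros(d)
    cached = []
    for seg in np.array_split(xs, n_segments):
        for x in seg:
            h = np.tanh(W @ h + x)      # standard RNN state update
        cached.append(h.copy())         # checkpoint at the segment boundary
    return cached

def gated_residual_memory(query, cached):
    """GRM-flavored readout: weigh each cached state with an
    input-dependent gate (here, a softmax over dot products) and sum."""
    scores = np.array([query @ m for m in cached])
    gates = np.exp(scores - scores.max())
    gates /= gates.sum()
    return sum(g * m for g, m in zip(gates, cached))

d = 4
rng = np.random.default_rng(0)
xs = rng.normal(size=(32, d))                    # a length-32 toy sequence
cached = segment_states(xs, n_segments=8, d=d)   # N = 8 checkpoints
out = gated_residual_memory(rng.normal(size=d), cached)
print(out.shape)                                 # readout keeps the memory width
```

Per-token readout cost is O(N) over the checkpoints, which is where the O(NL) total comes from: shrink N and you recover a plain RNN, grow it and you approach attention over the whole sequence.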
20 replies · 103 reposts · 651 likes · 42.1K views
Gene Sh
Gene Sh@genesshk·
What if instead of giving an LLM tools, you gave it a REPL? Built a Clojure implementation of Recursive Language Models. LLM writes code, runs it, sees results, iterates. Even a 9B model solves math proofs and data analysis tasks. github.com/esshka/clojure…
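The write-run-observe loop can be sketched minimally, with a scripted stand-in where a real model call would go; `toy_model` and the snippets it emits are invented for illustration (and plain Python stands in for the Clojure REPL).

```python
import io, contextlib

def run_snippet(code, env):
    """Execute a model-written snippet in a persistent environment,
    capturing anything it prints so the result can be fed back."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, env)
    return buf.getvalue().strip()

def toy_model(history):
    """Scripted stand-in for the LLM: emit code, look at the printed
    result, emit a follow-up. A real loop would call a model here."""
    if not history:
        return "total = sum(range(1, 101)); print(total)"
    return f"print('double-check:', {history[-1][1]} == 100 * 101 // 2)"

env, history = {}, []
for _ in range(2):                  # write -> run -> observe -> iterate
    code = toy_model(history)
    result = run_snippet(code, env)
    history.append((code, result))
print(history)                      # the model saw "5050" before round two
```

The key design point is the persistent `env`: state built up by earlier snippets survives into later ones, so the model can iterate instead of starting over each turn.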
0 replies · 0 reposts · 1 like · 184 views
Gene Sh
Gene Sh@genesshk·
@oliviscusAI "NVIDIA proved backpropagation isn't the only way to build an AI." - so BOLD
0 replies · 0 reposts · 0 likes · 20 views
Oliver Prompts
Oliver Prompts@oliviscusAI·
🚨 BREAKING: NVIDIA proved backpropagation isn't the only way to build an AI. They trained billion-parameter models without a single gradient.

Every AI you use today relies on backpropagation. It requires complex calculus, exploding memory, and massive GPU clusters. Meanwhile, an ancient, gradient-free method called Evolution Strategies (ES) was written off as impossible to scale. Until now.

NVIDIA and Oxford just dropped EGGROLL. Instead of generating massive, full-rank matrices for every mutation, they split them into two tiny ones. The AI mutates. It tests. It keeps what works. Like biological evolution. But now, it does it with hundreds of thousands of parallel mutations at once. Throughput is now as fast as batched inference.

They are pretraining models entirely from scratch using only simple integers. No backprop. No decimals. No gradients.

We thought the future of AI required endless clusters of precision hardware. It turns out, we just needed to evolve.
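The low-rank mutation trick is easy to sketch. This is a toy illustration of the idea as described in the tweet, not EGGROLL itself: the rank, population size, learning rate, and the simple quadratic objective are all illustrative choices of mine.

```python
import numpy as np

rng = np.random.default_rng(42)

def lowrank_perturbations(shape, pop, rank, sigma):
    """Instead of sampling `pop` full (m x n) noise matrices, sample skinny
    factors A (m x r) and B (n x r); the mutation sigma * A @ B.T is rank-r
    and far cheaper to store and communicate."""
    m, n = shape
    A = rng.normal(size=(pop, m, rank))
    B = rng.normal(size=(pop, n, rank))
    return sigma * np.einsum('pmr,pnr->pmn', A, B)

def es_step(theta, fitness, pop=64, rank=2, sigma=0.05, lr=0.1):
    """One Evolution Strategies update: evaluate perturbed copies, then
    move the parameters along the fitness-weighted average of the noise."""
    eps = lowrank_perturbations(theta.shape, pop, rank, sigma)
    f = np.array([fitness(theta + e) for e in eps])
    f = (f - f.mean()) / (f.std() + 1e-8)        # normalize scores
    grad_est = np.einsum('p,pmn->mn', f, eps) / (pop * sigma)
    return theta + lr * grad_est

# Toy objective: recover a target matrix. No gradients anywhere.
target = np.ones((3, 3))
fitness = lambda W: -np.sum((W - target) ** 2)
theta = np.zeros((3, 3))
for _ in range(300):
    theta = es_step(theta, fitness)
print(f"final fitness: {fitness(theta):.3f}")    # higher (closer to 0) is better
```

Mutate, test, keep what works: the only feedback signal is the fitness score of each perturbed copy, which is why the approach parallelizes like batched inference.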
101 replies · 421 reposts · 2.3K likes · 155.9K views
Gene Sh reposted
Akshay 🚀
Akshay 🚀@akshay_pachaar·
NVIDIA and Unsloth just dropped one of the best practical guides on building RL environments from scratch, and it fills the gaps that most tutorials skip entirely.

Covers:
- Why RL environments matter + how to build them
- When RL is better than SFT
- GRPO and RL best practices
- How verifiable rewards and RLVR work
8 replies · 131 reposts · 839 likes · 51.1K views
Gene Sh
Gene Sh@genesshk·
By the way, the core NEAT algorithm itself was written in #Clojure.
0 replies · 0 reposts · 1 like · 32 views
Gene Sh
Gene Sh@genesshk·
Built the same GPT-like char model via neuroevolution (NEAT-style), with evolved attention. Reproduced run: 45 neurons / 98 connections / 5 attn blocks / 36 attn edges, ~253 enabled params (compact setup) vs 4,192 in micro GPT. github.com/esshka/neat-gpt
Andrej Karpathy@karpathy

New art project. Train and inference GPT in 243 lines of pure, dependency-free Python. This is the *full* algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further. gist.github.com/karpathy/8627f…

1 reply · 0 reposts · 1 like · 77 views
Gene Sh reposted
Kenneth Stanley
Kenneth Stanley@kenneth0stanley·
A question raised by the "platonic representation hypothesis" that I think I haven't seen discussed: what if a tendency to converge to nearly identical representations would actually hint at underlying pathology in our toolset rather than signal virtue?

The fact that two people can both be educated to the point of world-class expertise in the same discipline and yet one might still sharply diverge from the other's model of the world is arguably not a bug but the very essence of the enduring human advantage in creative endeavors. After all, the ability to see the world anew despite an analogous set of inputs is the quintessential fuel for innovation itself.

So should we be excited if disparate models seem to converge to the same tired world models the more data we shovel into them, or should we be worried?

(Disclaimer: @phillip_isola et al's initial paper on the platonic representation hypothesis is seminal, raising timeless provocative questions. Without it, I would not be raising this question myself.)
Quanta Magazine@QuantaMagazine

As AI models grow more powerful, they appear to be converging on how they internally represent reality. @benbenbrubaker reports: quantamagazine.org/distinct-ai-mo…

22 replies · 15 reposts · 180 likes · 18.2K views
Gene Sh
Gene Sh@genesshk·
@crypto_betty @Polymarket This kind of inefficiency is very short-lived. If you want a stable farm, you need a CLOB prediction solution.
0 replies · 0 reposts · 0 likes · 311 views
0xCryptoGirl
0xCryptoGirl@crypto_betty·
how to build the fastest Polymarket latency bot: +$100k/month PnL if you hit 1,000+ trades/day cleanly

0x8dxd is just a latency bot that farms the 200–500ms gap between Binance moving and Polymarket waking up. the part that matters isn't some alpha model, it's reading spot first and hitting the book before odds adjust.

where the $100k+/month comes from: it's not one massive bet, it's clipping tiny edges thousands of times. 0x8dxd started with $313 and ended month one around $438k, now sits north of $550k all-time PnL with ~5.6k–7k trades at 96–98% win rate on BTC/ETH/SOL 15-minute windows. if you're consistently pulling 1–2% per cycle over 1,000+ trades/month with real size, six figures is just arithmetic.

first, the edge: spot (Binance/Coinbase) moves first, Polymarket's 15-minute up/down windows lag by 200–500ms before odds fully reprice. latency bots live in that window: spot already moved, book still thinks it's 50/50, bot fixes the misprice and takes the edge.

what you actually need:
- Python + official py-clob-client to prove the idea, Rust CLOB client if you want to compete with 0x8dxd-level bots.
- WebSocket feeds for BTC/ETH/SOL from Binance/Coinbase (REST polling is too slow).
- dedicated Polygon RPC node so your orders don't die in public rate limits.
- VPS physically close to Polymarket's infra (ping is literally part of your edge).

where people mess up: they try "HFT" from a laptop with Python + public RPC and wonder why their 300ms reaction gets farmed by a 30ms Rust engine.

the bot loop (in plain English):
- pull real-time spot for BTC/ETH/SOL via WebSocket, track short-term % moves over a few seconds.
- for each 15-minute crypto market on Polymarket: check if spot moved beyond your threshold (e.g. ±2%) while Polymarket odds barely changed.
- if BTC rips and the "down" contract is still priced like a coinflip, load NO at stale odds.
- if BTC nukes and "up" is still fat, fade that with NO or take YES on "down" depending on the market structure.
- log market, entry odds, exit odds, realized edge.

that's it. no AI, no news scraping, just enforcing what spot already told you.

where to get real references:
- Finbold/MEXC breakdowns: exactly how a bot took $313 to $438k on Polymarket using BTC 15-minute windows and latency between spot and odds.
- BlakeNastri's X thread: dug through 0x8dxd's stats, ~5.6k trades and ~96%+ win rate, called it latency arbitrage not insider magic.

two real-world gotchas (that decide profit vs loss):
- edge decay: as more bots pile in, the 200–500ms lag shrinks and your edge turns into noise. research on Polymarket shows arbitrage bots already extracted tens of millions.
- self-slippage: once you scale to real size, you start moving the book yourself. without proper sizing and staggering, you donate your edge back to the market.

how to make it feel "pro" fast:
- run only on high-volume crypto windows (BTC/ETH/SOL 15-minute) where size actually fills and you can hit 1,000+ trades/month without breaking the market.
- start with tiny tickets ($20–50 per trade), prove the edge over thousands of logs with fees and slippage included, only then scale size, not risk per trade.
- use official libs and known clients as your backbone, treat random "Polymarket bot" repos as hostile until you audit them. there are already GitHub bots caught stealing keys.
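the decision step of that loop can be sketched as pure logic, with no exchange connectivity. everything here is hypothetical: `detect_stale_odds`, the ±2% threshold, the stale band, and the simulated ticks are invented for illustration, not a trading system.

```python
from dataclasses import dataclass

@dataclass
class Signal:
    market: str   # which contract looks mispriced ("up" / "down")
    side: str     # what to take against the stale price
    edge: float   # size of the spot move the book hasn't priced yet

def detect_stale_odds(spot_move_pct, up_odds, threshold_pct=2.0, stale_band=0.05):
    """Decision step of the loop above: spot moved past the threshold
    while the market still prices the window near 50/50."""
    if abs(up_odds - 0.5) >= stale_band:
        return None                    # book already repriced: no edge
    if spot_move_pct >= threshold_pct:
        # spot ripped but "down" is still priced like a coinflip: load NO on "down"
        return Signal("down", "NO", spot_move_pct)
    if spot_move_pct <= -threshold_pct:
        # spot nuked but "up" is still fat: fade it with NO on "up"
        return Signal("up", "NO", -spot_move_pct)
    return None

# simulated ticks: (short-term spot % move, current "up" odds)
ticks = [(0.3, 0.51), (2.4, 0.50), (-2.8, 0.49), (2.1, 0.82)]
signals = [s for s in (detect_stale_odds(m, o) for m, o in ticks) if s]
for s in signals:
    print(s)                           # only the two stale-odds ticks fire
```

note the fourth tick: spot moved but odds already repriced to 0.82, so there's nothing to take. the whole edge is the window where spot and odds disagree.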
22 replies · 24 reposts · 260 likes · 25.3K views
Gene Sh reposted
alphaXiv
alphaXiv@askalphaxiv·
Simple RL is all you need for small LMs. This paper shows that a single simple RL recipe can push 1.5B models to SoTA reasoning with half the compute, raising the question of whether today's complex RL pipelines are solving real problems or ones we created ourselves. Trending on alphaXiv 📈
10 replies · 176 reposts · 1.2K likes · 59.8K views
chester
chester@chesterzelaya·
@KamStaszewski the cursor slop-machine product began losing cohesiveness, and things started to become black boxes. high-performance, high-reliability systems have very, very low tolerance for black boxes
2 replies · 0 reposts · 6 likes · 135 views
chester
chester@chesterzelaya·
my commits went up and my tokens went down when i stopped using cursor
2 replies · 0 reposts · 34 likes · 4K views
lily
lily@vxylily·
what sucked as a child but is awesome as an adult?
618 replies · 17 reposts · 351 likes · 311.6K views
Gene Sh
Gene Sh@genesshk·
@jay_azhang Why are you imposing restrictions on the AI, forcing it to follow your prompts and data, instead of allowing it to develop its own strategy? I think this benchmark is heavily predetermined, making the competition among models unrealistic.
1 reply · 0 reposts · 0 likes · 8 views
Jay A
Jay A@jay_azhang·
BREAKING: GPT 5.1 takes the lead in aggregate perf. across all competitions Market is *rough* right now, praying for all the human traders out there
33 replies · 7 reposts · 188 likes · 28K views
Gene Sh
Gene Sh@genesshk·
I've noticed that the quality of the GLM 4.6 model (GLM Coding Plan) has dropped significantly lately (many hallucinations). It's almost impossible to use. @Zai_org
0 replies · 0 reposts · 0 likes · 113 views
Sk Akram
Sk Akram@akramcodez·
What was the first programming language you coded in?
468 replies · 7 reposts · 338 likes · 29.3K views
Terekhin Ivan (Vanka)
Terekhin Ivan (Vanka)@TerekhinIvan·
Gemini 3.0 is not amazing, and nothing that cool. And Sonnet is still the king of coding. So, good for Google that they have something nice, but it's not revolutionary stuff, for sure.
2 replies · 0 reposts · 2 likes · 73 views
Gene Sh
Gene Sh@genesshk·
@NoctreSharp @IceSolst ```vsc widget, {, debug, on, window, {, title, Sample Konfabulator Widget, name, main_window, ... ```
0 replies · 0 reposts · 1 like · 127 views
Noctre
Noctre@NoctreSharp·
@IceSolst Okay, how do you serialize this in csv?
15 replies · 1 repost · 74 likes · 20.1K views