grape_ape

497 posts

@tastic_ape

Joined April 2021
2.5K Following 55 Followers
grape_ape
grape_ape@tastic_ape·
banger @alt_w_v_g . A more fleshed out and humorous, but directionally similar, sequence of events here vs when I'm talking to my cousins who are preparing to drop generational wealth on dubious educational credentials. The credentialing always wins so far tho
Ethan Brooks@alt_w_v_g

My wife mentioned a nice private school over dinner this week
She said the campus was beautiful
I asked what's the tuition
She said we should look at it as an investment in him not a cost
I made a note
She said don't make a note
I said I always make notes
She said this isn't a deal
I said everything is a deal
She closed her eyes
She said we'd discuss it Saturday
I agreed
Saturday 7:02am
She came downstairs in her Saturday robe
Coffee in hand
I had my cargo shorts on
The dining room had been cleared
The projector was on
The analyst was at the head of the table
Quarter zip on, three iced coffees, a legal pad, and two laptops
He had been there since 6:44am
I texted him at 11:14pm Friday
The text said dining room 6:45am bring the model
He sent a thumbs up
My wife stopped in the doorway
She said what is this
I said you said you wanted to discuss it
She said this is not a discussion
I did not respond
She sat down anyway
The analyst stood
He said good morning ma'am
She did not respond
He sat back down
A printed deck in front of each seat
A fourth copy in case
Slide 1
Tuition Schedule
$38,500 per year
Thirteen years
$500,500 nominal
Before escalators
The school has raised tuition 4.2% per year for a decade
With escalators $648,000
My wife said okay
I said I'm not done
Slide 2
Opportunity Cost
Even before escalators
$38,500 invested annually
10% nominal return
S&P long-run average since 1928
By his eighteenth birthday $944,000
My wife said we can afford it
I said I know that's not the slide
Slide 3
Terminal Value at Age 65
$83 million
She was quiet
The analyst slid the sensitivity tables across the table
8% return $31 million
10% return $83 million
12% return $222 million
She did not look
She said this isn't about money
I said it's always about money
She said no it isn't
I said then what is it about
She did not answer
She said you can't put a dollar value on his teachers his classmates his environment
I said I can the analyst already did slide 6
He flipped to slide 6
She did not look
She said the school is the best in the city
I said best is a feeling
She said it produces the best students
I said the students were already the best before they got there
She said our son deserves it
I said our son deserves $83 million
My son walked in
He is five
Dinosaur pajamas
He looked at the projector
He looked at the open deck on the table
He looked at slide 3
He said are we modeling pre-tax or after-tax
The analyst opened a new tab
My wife looked at the ceiling
He said what's the discount rate
The analyst set down his pen
She closed her eyes
He said is this the same return assumption from the 529 conversation
The analyst stopped typing
He looked at me
I did not say anything
She stood up
Sat back down
He said dad can I help
I said yes
He pulled up a chair
The analyst handed him a printout
He started reading
My wife watched him read
She watched him for a long time
She said his name
He looked up
She said do you like school
He said the work is too easy and the kids don't ask questions
She did not respond
She looked at the ceiling
She walked out of the room
The analyst started packing up
He said should I follow up Monday sir
I said no follow up needed
He'll be fine
Sent from my iPhone

English
0
0
0
13
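The slide figures in the quoted post can be checked with a few lines of arithmetic. This is a minimal sketch, not the analyst's model: it assumes the first year's tuition is unescalated, that each $38,500 contribution is invested at the end of its year (an ordinary annuity), and that the balance at eighteen then compounds untouched for 47 years to age 65. The function and variable names are made up for illustration. Under those assumptions the results land within rounding of the numbers quoted on slides 1 to 3.

```python
# Hypothetical helper names; assumptions are noted on each function.

def escalated_total(tuition, years, growth):
    """Total tuition paid when the fee rises by `growth` each year.
    Assumes year 1 is charged at the base rate (exponent starts at 0)."""
    return sum(tuition * (1 + growth) ** k for k in range(years))

def annuity_future_value(payment, years, rate):
    """Future value of equal year-end contributions (ordinary annuity)."""
    return payment * ((1 + rate) ** years - 1) / rate

def terminal_value(payment, years_paying, years_growing, rate):
    """Annuity balance left to compound with no further contributions."""
    balance = annuity_future_value(payment, years_paying, rate)
    return balance * (1 + rate) ** years_growing

TUITION = 38_500          # per year, from slide 1
YEARS = 13                # thirteen school years
YEARS_TO_65 = 65 - 18     # compounding period after the eighteenth birthday

nominal = TUITION * YEARS
with_escalators = escalated_total(TUITION, YEARS, 0.042)
at_18 = annuity_future_value(TUITION, YEARS, 0.10)
sensitivity = {
    rate: terminal_value(TUITION, YEARS, YEARS_TO_65, rate)
    for rate in (0.08, 0.10, 0.12)
}

print(f"Nominal tuition:      ${nominal:,.0f}")
print(f"With 4.2% escalators: ${with_escalators:,.0f}")
print(f"Invested, at age 18:  ${at_18:,.0f}")
for rate, value in sensitivity.items():
    print(f"Age 65 at {rate:.0%}: ${value / 1e6:,.1f} million")
```

The spread in the sensitivity table comes almost entirely from the 47-year compounding factor, which is why a two-point change in the return assumption moves the terminal value by a multiple rather than a few percent.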
grape_ape
grape_ape@tastic_ape·
@Nat87a @thulynnn @notyetadegen Agree directionally but SG > HK >> Seoul in terms of openness to foreigners from all over which means SG property market continues to uponly more than most
English
1
0
1
43
Linxiang Zhao
Linxiang Zhao@Nat87a·
@thulynnn @notyetadegen It's different. It's a geographic issue. Beijing/Shanghai/New York/London sit on plains, so when people become rich, the city can expand. HK/Singapore/Seoul are not on plains and can't expand, so as people get richer, housing prices just go up.
English
3
0
0
184
old school degen
old school degen@notyetadegen·
It’s Michael Burry type stuff in SG right now and when the bubble finally pops it’s going to be generational trauma
Local wages can’t support local house prices
The region can’t support local wages
Higher SGD makes both worse
Houses aren’t for living
SGD isn’t for spending
old school degen tweet media (2 images)
English
29
20
249
36.2K
grape_ape
grape_ape@tastic_ape·
@notyetadegen Keen to understand if you really believe it will end. SG property market has been up only since ~2006? From what I see, no other RE market has been this exuberant, which suggests it will continue barring a grey swan event like what's happening to Dubai currently.
English
0
0
1
753
Pop Base
Pop Base@PopBase·
Spotify has released personalised user data for the app’s 20th anniversary. Users can now view:
• Which day they joined Spotify
• Their total amount of songs streamed
• The first song they streamed
• Their most-streamed artist
• A playlist of their most-streamed songs
Pop Base tweet media
English
537
2.5K
61.2K
6.2M
Blueprintsmb
Blueprintsmb@blueprintsmb22·
ARKK investors since COVID
English
1
1
43
6.2K
grape_ape
grape_ape@tastic_ape·
@GitGuardian proactively alerted me of a throwaway API key that had accidentally been exposed in a public repo. Deactivated immediately. Thanks @GitGuardian
English
0
0
0
9
Th0r
Th0r@Thzer0r·
the best negotiator I’ve ever known taught me to be absolutely shameless when trying to get the best deal. those (uncomfortable) lessons have permanently conditioned me to be outrageous from the outset.
English
31
173
5.1K
240.9K
grape_ape
grape_ape@tastic_ape·
@chontang @dawiddrzala Thanks for sharing! How do you mitigate concerns around giving an autonomous agent access to your emails - at the very least it means one more avenue for your PII to be leaked?
English
1
0
0
83
Chon Tang
Chon Tang@chontang·
Right now just automating my obvious daily tasks:
- filter emails, with intelligent routing to whatever preferred messaging platform based on content / sender (synced from Airtable),
- bots to summarize and scan CRM / sales opportunities across a team, basically as a replacement for the CRM's SaaS front end,
- manage a separate wiki for each portfolio company, where email updates / call transcripts are used to update the wiki,
- and then just a bunch of cron tasks to make sure all the balls I'm juggling (across LPs, companies, other VCs) are handled correctly.
I'm not building a product for external users or trying to drive revenue, just trying to improve my efficiency. But for the first time, I fully appreciate that I might not need to pay for any SaaS tools (except a very basic system of record) very soon.
Sanger, CA 🇺🇸 English
1
0
9
1.2K
Lotto
Lotto@LottoLabs·
Update on Opencode Go It’s great value for $5/month, there’s really no reason not to do the first month. At $10/month it’s still good value and gets you access to all sota OS models. You can’t daily drive it without hitting limits on the big models but w/ Kimi x3 you won’t hit limits unless you’re insane. Overall highly recommend the first month, then make your own decision.
Lotto tweet media
English
80
57
1.4K
463.3K
grape_ape
grape_ape@tastic_ape·
@Dorialexander Thanks for sharing - concise, impactful, and incorporating other relevant innovations. Helps me understand this rapidly evolving space better!
English
0
0
1
558
Alexander Doria
Alexander Doria@Dorialexander·
So DeepSeek-V4: finally took me the week. Overall the paper is attempting many things at once, not easy to disentangle as it's all surprisingly connected.
It's first a serious attempt at bridging the gap between closed and open LLM architecture. It is generally rumored that Opus and [largest model bundled in GPT-5] belong to an entirely different category of models: very large, very sparse mixture of experts, able to hold an unprecedentedly wide search space while still being servable. Simply put, current hardware cannot hold a model on one node, so you have to play with the interconnect and various levels of quantization, for different layers, at different stages of training. An important focus of DsV4 is on communication latency, showing it can be hidden through effective management of the interconnect (roughly, you slide communication time inside the computation). Overall, you cannot simply enter this game without the capability to rewrite kernels from scratch, and the model report relentlessly comes back to this. Because this is the frontier game.
It's then a radical, but very successful attempt at making long context simultaneously more efficient and more affordable. Long context is literally a "context" problem: what exactly is worth attending to? An obvious fix is to prioritize the most recent tokens. This might be sufficient for basic search but not for the new demands of agentic pipelines that require accurate recall of distant yet strategic content. V4's clever approach is to rely on two different axes of memorization by allocating layers to two different attention compression schemes. As the name suggests, Heavily Compressed Attention is the brute force method, collapsing each sequence of 128 tokens to a unique entry and taking care of the fuzzy yet global context. Compressed Sparse Attention relies on a "lightning indexer" to bring in the relevant local blocks for a query, even when they can be thousands of tokens away. Everything here is optimized for end inference: there is a very large head_dim (512), which is costlier for training but allows for an even more compressed kv cache, which is your actual bottleneck at inference time, especially in prefill mode. End result is a very classical DeepSeek play, introducing a new radical disruption of inference economics after DSA. I predict hybrid CSA/HCA (or similar counterparts) will be essentially part of the mainstream arch by the end of this year.
Now we come to the more ambitious but also more unfinished part: an attempt at redefining model architecture and the learning signal. The most prominent parts are mHC and hybrid CSA/HCA, but it's actually a long list of less documented innovations: swapping softmax for sqrt(softplus) or using a hybrid two-stage scheme with non-standard values for Muon. Yet the interconnection of all these new components is still unknown and likely to account for the significant training instabilities: typically "mHC involves a matrix multiplication with an output dimension of only 24", which introduces non-determinism. Even one of the best AI labs in the world will run into ablation combinatorial explosion here, so the association of all these choices is likely non-tractable and would require a more consistent theory, which the conclusion gestures at but does not solve ("In future iterations, we will carry out more comprehensive and principled investigations to distill the architecture down to its most essential designs").
The more limited experiments in post-training are maybe more promising. Significantly, the one lab that popularized the standard RL+reasoning recipe is rethinking the recipe. For now it's a two-stage design (RL on specialized model, then on-policy distillation): ever since Self-Principled Critique Tuning, DeepSeek has been concerned with expanding the reasoning training signal beyond the final sparse reward. I'm not sure this is the final say: in this domain everything is a bit in flux, and you could even argue the type of verified pipeline we designed for SYNTH is a form of extreme offline RL-like training.
There is an even longer term plan (here >3-5 years), which is about redefining hardware. For now it's a way of transforming a constraint into an opportunity: as the leading Chinese lab, DeepSeek was very incentivized to make training work on Ascend and contribute to the national effort for chip autonomy. Very unusually, the report itself includes a lengthy wishlist for future hardware. As several experts noted, many of these recommendations don't really hold up for Nvidia but make perfect sense for a newcomer in the GPU hardware business. DeepSeek seems to be anticipating a world where labs have to secure a close hardware partner to retroactively fit the chips to the particular demands of model design or inference.
Now there is what DeepSeek did not do yet. The paper hardly mentions anything about synthetic pipelines, rephrasing, or simulated environments. Training data size (32T tokens) likely involves a significant share of generated data, as this is more quality tokens than the web and other digitized sources could hold, so maybe similar synthetic proportions as Trinity (roughly half) or Kimi. Still, it's pretty clear that all their attention was focused on the infra, architecture and scaling side, leaving a proper extensive retraining for later. This is likely not that dissimilar to how Anthropic or OpenAI proceeded: the fact we're still in the middle of the same model series even though significant parts of the model have changed (the tokenizer with Opus 4.7) suggests that a model lifecycle involves multiple rounds of training, potentially as large as a pretraining run a few years ago. The fact DeepSeek took on multiple Moonshot innovations (and Moonshot in turn has been hugely reliant on DeepSeek) suggests we might also have an ecosystem dynamic here. Maybe DeepSeek can exclusively focus on hard infrastructure problems and expect some of the axes of development to be sorted out later.
English
25
103
796
74.1K
grape_ape
grape_ape@tastic_ape·
@mr_r0b0t Thanks for sharing, I realise now there is a discount through to the end of the month. Anyway, I still blew through 5c quite quickly today, but ig we're doing different tasks and mine are more coding intensive. Good luck with what you're building!
grape_ape tweet media
English
1
0
1
20
mr-r0b0t
mr-r0b0t@mr_r0b0t·
@tastic_ape I used it via openrouter at first and it doesn’t apply the discount. I haven’t tried loading my key and using it BYOK, the discount should apply at that point!
English
1
0
1
1.7K
mr-r0b0t
mr-r0b0t@mr_r0b0t·
Be me: Sign up for DeepSeek API because everyone is saying how reasonably priced it is. Load up $50, let’s see. 30 minutes in, here’s where we’re at using V4-Pro
mr-r0b0t tweet media
English
207
108
4.8K
530.5K
Graeme
Graeme@gkisokay·
I can see there's quite a large demand for my research agent setup. I am away this weekend but I will post it in the coming week
English
3
0
38
3.1K
Graeme
Graeme@gkisokay·
There’s one Hermes use case for everyone, and if you're not using it, you're already behind. Do yourself a favour and build a research agent as I outline below; it will change the way you work.
Mine researches my topics of interest and cuts through the noise to find what actually matters. Every day, it watches the AI/agent space, picks out useful signals, writes research briefs, suggests content angles, tracks what I ignore, and Hermes keeps improving parts of its own workflow.
The basic version is almost free:
1. Pick a domain: AI, crypto, startups, sales leads, competitors, papers, jobs, whatever.
2. Give it sources: X lists, RSS feeds, blogs, GitHub repos, docs, newsletters, YouTube transcripts.
3. Define signal: What should it care about? New tools, benchmarks, launches, funding, tutorials, strange patterns, useful claims.
4. Save the evidence: Links, dates, summaries, claims, and why it matters in a vault.
5. Deliver a daily brief: Discord, Slack, Notion, email, Obsidian, and local markdown.
6. Give feedback: “More like this. This source is noisy. This is useful. This is mid.”
That is enough for the loop to start. Once you have a research agent, everything gets easier:
- Content agents need research
- Trading agents need market context
- Sales agents need account intel
- Coding agents need docs and changelogs
- Strategy agents need a fresh signal
With a daily stream of inputs, generating ideas for outputs becomes much easier. If you want it, I’ll share the full research agent setup I use.
Nous Research@NousResearch

Hermes Agent v0.12.0 - “The Curator Release”

English
49
73
1.2K
117.1K
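The six-step loop in the post above (sources, signal definition, evidence vault, daily brief, feedback) can be sketched in plain Python. This is an illustrative toy under stated assumptions, not Graeme's setup and not anything from Hermes Agent: `Item`, `ResearchLoop`, the keyword-weight scoring, and the feedback multipliers are all hypothetical stand-ins, and fetching from real sources is left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class Item:
    """One fetched piece of content (step 2). Fetching itself is out of scope."""
    source: str
    title: str
    url: str
    published: date


@dataclass
class ResearchLoop:
    # Step 3: a hypothetical "signal" definition as keyword -> weight.
    keywords: dict[str, float]
    # Step 6: per-source trust, adjusted by feedback (default weight 1.0).
    source_weight: dict[str, float] = field(default_factory=dict)
    # Step 4: the evidence vault, kept in memory for this sketch.
    vault: list[dict] = field(default_factory=list)

    def score(self, item: Item) -> float:
        text = item.title.lower()
        hits = sum(w for kw, w in self.keywords.items() if kw in text)
        return hits * self.source_weight.get(item.source, 1.0)

    def run(self, items: list[Item], threshold: float = 1.0) -> str:
        """Filter, save evidence, and return a markdown brief (step 5)."""
        kept = sorted(
            (i for i in items if self.score(i) >= threshold),
            key=self.score,
            reverse=True,
        )
        for i in kept:
            self.vault.append(
                {"url": i.url, "date": i.published.isoformat(), "title": i.title}
            )
        return "\n".join(f"- {i.title} ({i.url})" for i in kept)

    def feedback(self, source: str, useful: bool) -> None:
        """Step 6: 'this is useful' boosts a source, 'this is noisy' halves it."""
        current = self.source_weight.get(source, 1.0)
        self.source_weight[source] = current * (1.25 if useful else 0.5)


loop = ResearchLoop(keywords={"benchmark": 1.0, "launch": 1.0})
items = [
    Item("blog_a", "New benchmark results", "https://example.com/a", date(2025, 1, 1)),
    Item("blog_b", "Launch week recap", "https://example.com/b", date(2025, 1, 2)),
    Item("blog_b", "Lunch photos", "https://example.com/c", date(2025, 1, 3)),
]
first_brief = loop.run(items)
loop.feedback("blog_b", useful=False)   # "This source is noisy."
second_brief = loop.run(items)
print(first_brief)
print("---")
print(second_brief)
```

In a real setup the keyword scorer would presumably be replaced by a model call and the vault by files or a database; the point of the sketch is only that feedback changes what the next brief contains.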
grape_ape
grape_ape@tastic_ape·
@quxiaoyin Thanks for sharing. Given Hermes Agent and CC have their own harnesses/skills.md files, would a more accurate comparison be to run 2 Hermes Agent instances (same version), one calling DeepSeek and the other calling the Claude API? Thanks
English
0
0
2
1.7K
Xiaoyin Qu
Xiaoyin Qu@quxiaoyin·
Claude Code 4.7 vs. DeepSeek V4 + Hermes. Same coding task, side-by-side comparison.
English
53
91
1.2K
87.5K
grape_ape
grape_ape@tastic_ape·
@quxiaoyin Deepseek is great but $5 will barely get you a toy dashboard
English
0
0
0
154
Xiaoyin Qu
Xiaoyin Qu@quxiaoyin·
I can’t believe I stopped using Claude Code Max and now use DeepSeek and Hermes entirely. It’s so fast, so so fast, 3x faster for the same task. So cheap. I spent $5 last week and never need to worry about being rate limited or hitting usage limits every two hours. For most tasks it’s perfect enough.
English
245
172
3.3K
265K
Afshin Samadi
Afshin Samadi@ashsamadi·
Credit swaps between a currency pair are created ONLY under circumstances where one currency is facing a massive exit against the other, i.e. AED against USD. In order to stop a massive sale on the market, and thus devaluation, central banks agree to a swap of currency, off market. This slows down the inevitable eventual massive devaluation. As an investor you are better off doing it early. That's Ghalibaf's advice.
English
2
2
49
1.3K