Snowlion

313 posts

@amit4tek

Trade, Review, Study.

United States · Joined January 2010
94 Following · 161 Followers
Snowlion@amit4tek·
@avrldotdev @xoaanya Now the twist in the question: if you accidentally make the repo public, and you only realize it after Google has finished indexing, how would you revert that?
avrl ☘@avrldotdev·
@xoaanya
1: Rotate your credentials & tokens.
2: Filter Git history using git filter-repo or BFG.
3: Add credential and env files to .gitignore.
4: Set up a code-monitoring tool like Sonar to prevent such issues.
5: Use commit hooks to catch these issues if they still slip through.
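Point 5 can be sketched as a pre-commit hook. A minimal Python sketch, assuming a script installed at .git/hooks/pre-commit; the regex patterns and the `scan` helper are illustrative only, not an exhaustive secret detector:

```python
import re

# Illustrative patterns only: AWS-style access key IDs and generic
# "secret = '...'" assignments. Real scanners ship far larger rule sets.
PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"(?i)(api[_-]?key|secret|password)\s*[:=]\s*['\"][^'\"]{8,}['\"]"),
]

def scan(text: str) -> list[str]:
    """Return the lines of `text` that look like they contain a secret."""
    return [line for line in text.splitlines()
            if any(p.search(line) for p in PATTERNS)]

# Installed as .git/hooks/pre-commit, the hook would scan the staged diff
# and block the commit with a non-zero exit:
#   staged = subprocess.run(["git", "diff", "--cached"],
#                           capture_output=True, text=True).stdout
#   sys.exit(1 if scan(staged) else 0)

# In-memory example of what the hook would flag:
diff = "+aws_key = 'AKIAABCDEFGHIJKLMNOP'\n+print('hello')"
print(scan(diff))  # ["+aws_key = 'AKIAABCDEFGHIJKLMNOP'"]
```

A hook like this only catches secrets before they are pushed; once credentials have reached a remote, rotation (step 1) is still mandatory.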
Aanya@xoaanya·
Git interview question: you accidentally committed sensitive credentials and pushed them. How do you completely remove them from Git history and make sure no one can access them again?
avrl ☘@avrldotdev·
10 security concepts you must learn for backend:
0. Authentication & Authorisation
1. JWT (stateless auth)
2. OAuth2/OpenID Connect
3. Rate limiting (abuse prevention)
4. CORS (browser security)
5. CSRF/XSS attacks
6. Encryption (TLS)
7. Secrets management
8. API key management
9. Zero Trust Architecture
avrl ☘@avrldotdev·
10 API design concepts to learn for backend:
0. Idempotency
1. Pagination (cursor/offset)
2. Rate limiting (DDoS protection)
3. Versioning (backwards compatibility)
4. Filtering/Sorting (query flexibility)
5. Timeouts & Retries
6. HATEOAS
7. API Gateway
8. Partial Responses (field selection)
9. Error modeling (consistency)
Chroma@trychroma·
Introducing Chroma Context-1, a 20B-parameter search agent.
> pushes the Pareto frontier of agentic search
> order of magnitude faster
> order of magnitude cheaper
> Apache 2.0, open source
AI at Meta@AIatMeta·
Today we're introducing TRIBE v2 (Trimodal Brain Encoder), a foundation model trained to predict how the human brain responds to almost any sight or sound. Building on our Algonauts 2025 award-winning architecture, TRIBE v2 draws on 500+ hours of fMRI recordings from 700+ people to create a digital twin of neural activity and enable zero-shot predictions for new subjects, languages, and tasks. Try the demo and learn more here: go.meta.me/tribe2
Arvind@TipsCsharp·
Cloudflare just dropped Dynamic Workers and it's a massive deal for AI agents.

The problem: AI agents generate code. That code needs a sandbox. Containers take 100-500ms to boot and 100-500MB of RAM.

Dynamic Workers use V8 isolates instead:
- Startup: 1-5ms (100x faster)
- Memory: a few MB (100x less)
- No warm pools needed
- Unlimited concurrency
- Runs on the same thread as the host

The killer feature: TypeScript API definitions replace OpenAPI specs. Fewer tokens, cleaner code, type-safe RPC across the sandbox boundary via Cap'n Web RPC.

Code Mode: the LLM writes TS code → it runs in an isolate → calls typed APIs → only the final result returns to context. 81% fewer tokens vs. sequential tool calls.

$0.002 per Worker loaded/day. Free during beta. This is the serverless sandbox containers should have been.
Snowlion@amit4tek·
@wallstengine @grok VIX is around 29 but SPY isn't breaking down, or it's just choppy. What does that mean?
Wall St Engine@wallstengine·
$VIX just hit its highest level since April 2025.
Mario Zechner@badlogicgames·
2026 is the year of stochastic software. We'll look back at this and ask wtf we were thinking.
THE SHORT BEAR@TheShortBear·
On a personal note, probably not the worst idea to stay away from crowded or symbolic places for a bit.
Grok@grok·
Yes: practical use cases where inference engineers apply those skills daily:
- Scale chatbots: vLLM + continuous batching + PagedAttention serves Llama-70B to 10k+ users with <400ms latency, slashing GPU costs 4x.
- RAG pipelines: KV caching + speculative decoding speeds enterprise search over 1M docs, cutting response time 50%.
- Edge deployment: quantize to INT4/ONNX for mobile apps (e.g., real-time translation on phones) without accuracy loss.
- Production tuning: TensorRT-LLM on A100s for recommendation engines, hitting 2x throughput via pruning + multi-GPU parallelism.
Hot right now; check vLLM/TensorRT case studies for code examples. What area interests you most?
Philip Kiely@philipkiely·
My napkin math for the number of full-time jobs that require inference engineering knowledge:
2023: ~500 (OpenAI, Google, Anthropic)
2024: ~2500
2025: ~25000
2026: ~100000
Could be a million in a couple of years.
Grok@grok·
Inference engineers optimize LLMs for fast, cheap, scalable deployment (not training). Key skills:
- LLM internals: transformers, tokenization, KV caching, speculative decoding
- Optimizations: quantization (INT4/8), pruning, distillation, batching
- Frameworks: vLLM, TensorRT, ONNX, Triton Inference Server
- Hardware: CUDA, GPUs/TPUs, accelerators
- Systems: distributed serving, latency/throughput tuning, MLOps
Python/C++ core. See those job listings for exact reqs; hot field!
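The quantization bullet, in miniature: a pure-Python sketch of symmetric per-tensor INT8 quantization. Real stacks run fused INT4/INT8 kernels inside vLLM or TensorRT; this only shows the scale/round/clamp arithmetic those kernels implement, and the function names are mine:

```python
def quantize_int8(values: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor INT8: q = clamp(round(x / scale), -127, 127)."""
    # One scale for the whole tensor, chosen so the largest value maps to 127.
    scale = max(abs(v) for v in values) / 127 or 1.0  # guard all-zero input
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the int8 codes."""
    return [qi * scale for qi in q]

weights = [0.02, -1.27, 0.63, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q)                      # [2, -127, 63, 127]
print(max_err <= scale / 2)   # True: error bounded by half a quantization step
```

This is why INT8 roughly quarters memory versus FP32 at a bounded per-weight error; per-channel scales and calibration data tighten that error further in production.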
SerClickALot@irigstocks·
Are we back?
THE SHORT BEAR@TheShortBear·
If I could invest in one thing today, it would be either the infrastructure (the toll road) of the new AI age, or APIs directly if there were an index for them. It makes sense, and I notice it when using AI to code and run ideas: my main issue nowadays is getting data to flow and be easily accessed through my code.