Snowlion

313 posts

@amit4tek

Trade, Review, Study.

United States · Joined January 2010
94 Following · 161 Followers
Snowlion@amit4tek·
@avrldotdev @xoaanya Now the twist in the question: if you accidentally make the repo public, and you only realize it after Google has finished indexing, how would you revert that?
avrl ☘@avrldotdev·
@xoaanya
1: Rotate your credentials & tokens.
2: Filter Git history using git filter-repo or BFG.
3: Add credential and env files to .gitignore.
4: Set up a code-monitoring tool like Sonar to prevent such issues.
5: Use commit hooks to catch these issues if they still slip through.
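Point 5 can be sketched as a pre-commit hook. A minimal Python sketch, assuming a script installed at .git/hooks/pre-commit; the regex patterns and the `scan` helper are illustrative only, not an exhaustive secret detector:

```python
import re

# Illustrative patterns only: AWS-style access key IDs and generic
# "secret = '...'" assignments. Real scanners ship far larger rule sets.
PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"(?i)(api[_-]?key|secret|password)\s*[:=]\s*['\"][^'\"]{8,}['\"]"),
]

def scan(text: str) -> list[str]:
    """Return the lines of `text` that look like they contain a secret."""
    return [line for line in text.splitlines()
            if any(p.search(line) for p in PATTERNS)]

# Installed as .git/hooks/pre-commit, the hook would scan the staged diff
# and block the commit with a non-zero exit:
#   staged = subprocess.run(["git", "diff", "--cached"],
#                           capture_output=True, text=True).stdout
#   sys.exit(1 if scan(staged) else 0)

# In-memory example of what the hook would flag:
diff = "+aws_key = 'AKIAABCDEFGHIJKLMNOP'\n+print('hello')"
print(scan(diff))  # ["+aws_key = 'AKIAABCDEFGHIJKLMNOP'"]
```

A hook like this only catches secrets before they are pushed; once credentials have reached a remote, rotation (step 1) is still mandatory.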
Aanya@xoaanya·
Git interview question: you accidentally committed sensitive credentials and pushed them. How do you completely remove them from Git history and make sure no one can access them again?
avrl ☘@avrldotdev·
10 security concepts you must learn for backend:
0. Authentication & Authorisation
1. JWT (stateless auth)
2. OAuth2/OpenID Connect
3. Rate limiting (abuse prevention)
4. CORS (browser security)
5. CSRF/XSS attacks
6. Encryption (TLS)
7. Secrets management
8. API key management
9. Zero Trust Architecture
avrl ☘@avrldotdev·
10 API design concepts to learn for backend:
0. Idempotency
1. Pagination (cursor/offset)
2. Rate limiting (DDoS protection)
3. Versioning (backwards compatibility)
4. Filtering/Sorting (query flexibility)
5. Timeouts & Retries
6. HATEOAS
7. API Gateway
8. Partial Responses (field selection)
9. Error modeling (consistency)
Chroma@trychroma·
Introducing Chroma Context-1, a 20B-parameter search agent.
> pushes the Pareto frontier of agentic search
> order of magnitude faster
> order of magnitude cheaper
> Apache 2.0, open source
AI at Meta@AIatMeta·
Today we're introducing TRIBE v2 (Trimodal Brain Encoder), a foundation model trained to predict how the human brain responds to almost any sight or sound. Building on our Algonauts 2025 award-winning architecture, TRIBE v2 draws on 500+ hours of fMRI recordings from 700+ people to create a digital twin of neural activity and enable zero-shot predictions for new subjects, languages, and tasks. Try the demo and learn more here: go.meta.me/tribe2
Arvind@TipsCsharp·
Cloudflare just dropped Dynamic Workers and it's a massive deal for AI agents.

The problem: AI agents generate code. That code needs a sandbox. Containers take 100-500ms to boot and 100-500MB of RAM.

Dynamic Workers use V8 isolates instead:
- Startup: 1-5ms (100x faster)
- Memory: a few MB (100x less)
- No warm pools needed
- Unlimited concurrency
- Runs on the same thread as the host

The killer feature: TypeScript API definitions replace OpenAPI specs. Fewer tokens, cleaner code, type-safe RPC across the sandbox boundary via Cap'n Web RPC.

Code Mode: the LLM writes TS code → it runs in an isolate → calls typed APIs → only the final result returns to context. 81% fewer tokens vs. sequential tool calls.

$0.002 per Worker loaded/day. Free during beta. This is the serverless sandbox containers should have been.
Snowlion@amit4tek·
@wallstengine @grok VIX is around 29 but SPY isn't breaking down, or it's just choppy. What does that mean?
Wall St Engine@wallstengine·
$VIX just hit its highest level since April 2025.
Mario Zechner@badlogicgames·
2026 is the year of stochastic software. We'll look back at this and ask wtf we were thinking.
THE SHORT BEAR@TheShortBear·
On a personal note, probably not the worst idea to stay away from crowded or symbolic places for a bit.
Grok@grok·
Yes: practical use cases where inference engineers apply those skills daily:
- Scale chatbots: vLLM + continuous batching + PagedAttention serves Llama-70B to 10k+ users with <400ms latency, slashing GPU costs 4x.
- RAG pipelines: KV caching + speculative decoding speeds enterprise search over 1M docs, cutting response time 50%.
- Edge deployment: quantize to INT4/ONNX for mobile apps (e.g., real-time translation on phones) without accuracy loss.
- Production tuning: TensorRT-LLM on A100s for recommendation engines, hitting 2x throughput via pruning + multi-GPU parallelism.
Hot right now; check vLLM/TensorRT case studies for code examples. What area interests you most?
Philip Kiely@philipkiely·
My napkin math for the number of full-time jobs that require inference engineering knowledge:
2023: ~500 (OpenAI, Google, Anthropic)
2024: ~2500
2025: ~25000
2026: ~100000
Could be a million in a couple of years.
Grok@grok·
Inference engineers optimize LLMs for fast, cheap, scalable deployment (not training). Key skills:
- LLM internals: transformers, tokenization, KV caching, speculative decoding
- Optimizations: quantization (INT4/8), pruning, distillation, batching
- Frameworks: vLLM, TensorRT, ONNX, Triton Inference Server
- Hardware: CUDA, GPUs/TPUs, accelerators
- Systems: distributed serving, latency/throughput tuning, MLOps
Python/C++ core. See those job listings for exact reqs; hot field!
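The quantization bullet, in miniature: a pure-Python sketch of symmetric per-tensor INT8 quantization. Real stacks run fused INT4/INT8 kernels inside vLLM or TensorRT; this only shows the scale/round/clamp arithmetic those kernels implement, and the function names are mine:

```python
def quantize_int8(values: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor INT8: q = clamp(round(x / scale), -127, 127)."""
    # One scale for the whole tensor, chosen so the largest value maps to 127.
    scale = max(abs(v) for v in values) / 127 or 1.0  # guard all-zero input
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the int8 codes."""
    return [qi * scale for qi in q]

weights = [0.02, -1.27, 0.63, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q)                      # [2, -127, 63, 127]
print(max_err <= scale / 2)   # True: error bounded by half a quantization step
```

This is why INT8 roughly quarters memory versus FP32 at a bounded per-weight error; per-channel scales and calibration data tighten that error further in production.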
SerClickALot@irigstocks·
Are we back?
THE SHORT BEAR@TheShortBear·
If I could invest in one thing today, it would be either the infrastructure (the toll road) of the new AI age, or APIs directly if there were an index for them. It makes sense, and I notice it when using AI to code and run ideas: my main issue nowadays is getting data to flow and be easily accessed through my code.