Krish Modi

211 posts

@krishmodi404

@uwaterloo se, dev @palantirtech, prev @bloomberg, @Huawei, ISEF

Sarnia, ON · Joined February 2022
446 Following · 682 Followers
Pinned Tweet
Krish Modi @krishmodi404
I made AgentIR: a scheduler for distributed LLM serving that makes agent workloads run much faster. 41.3% lower E2E latency, and up to 70% higher throughput
13 replies · 24 reposts · 125 likes · 12.5K views
Rishi Shah @Rishi_Shah99
I took OSDN, a brand-new linear-attention model that learns to tune its own memory updates as it reads (think AdaGrad for the architectures trying to replace the transformer), rebuilt it from scratch in pure C++ with my own autograd engine, and ran it on a $4 microcontroller to predict hypoglycemia 60 minutes before it hits. No PyTorch. No JAX. No TensorFlow. No ML library at all. Straight C++ standard library.
6 replies · 4 reposts · 13 likes · 1.1K views
Krish Modi @krishmodi404
launching AgentIR Blackbox (agentir.dev), an llm request router for agent systems
Blackbox finds which llm calls are on your workflow’s critical path, sends them to faster providers, and routes less urgent calls to cheaper ones to maintain your selected cost-latency constraint
it uses your workflow stats and real-time provider latency profiles to reroute before throttling or slowdowns hit the full workflow
setup is simple too. connect your app, and blackbox handles the workflow annotations for you
use it for free!
16 replies · 14 reposts · 63 likes · 8.2K views
brayden petersen ⁂ @bmptrsn
i’ve joined @datacurve to lead design in san francisco (as an intern)! check out our new site :)
61 replies · 6 reposts · 316 likes · 21.7K views
ian @IKorovinsky
@krishmodi404 can confirm this is goated
1 reply · 0 reposts · 1 like · 195 views
rajan agarwal @_rajanagarwal
ive witnessed krish work on agentIR for the past several months! it’s really magical to use, try it out!
Quoting Krish Modi @krishmodi404:
launching AgentIR Blackbox (agentir.dev), an llm request router for agent systems
Blackbox finds which llm calls are on your workflow’s critical path, sends them to faster providers, and routes less urgent calls to cheaper ones to maintain your selected cost-latency constraint
it uses your workflow stats and real-time provider latency profiles to reroute before throttling or slowdowns hit the full workflow
setup is simple too. connect your app, and blackbox handles the workflow annotations for you
use it for free!

1 reply · 0 reposts · 22 likes · 4K views