Nikola P. Borisov

62 posts

@nikolaborisof

CEO, Co-founder @DeepInfra, ex @imoim

Los Altos, CA · Joined March 2011

93 Following · 179 Followers

Nikola P. Borisov@nikolaborisof·1d

Had a great time chatting with Roland Siebelink @cyberroland on the Scaling Without Breaking podcast about building @DeepInfra, the team behind it, and how to stay focused when everything's shifting. YouTube: youtube.com/watch?v=9siruL… Apple: podcasts.apple.com/us/podcast/sca…

Nikola P. Borisov@nikolaborisof·1d

Had a great time digging into this with @realmtbman. The supply chain behind inference matters more than most enterprises realize. youtu.be/DS2-iheW6pI

Yohann Calpu@realmtbman

Enterprises ask "is your AI compliant?" The better question: who actually runs the inference? Nikola Borisov, co-founder of @DeepInfra ($107M Series B raise - including NVIDIA) on @palebluenexus: "You want to make sure you're not giving it to someone that will give it to someone that will give it to someone. And maybe the final inference happens in China."

Nikola P. Borisov@nikolaborisof·Jul 15

@DeepInfra The latest model on DeepInfra

150

DeepInfra@DeepInfra·Jul 15

Moonshot AI's Kimi K2 is now live on DeepInfra, as always at the best price of $0.55/$2.20, with full tool call and context support. Best open source non-reasoning model available according to multiple benchmarks. Running on Nvidia Blackwell🇺🇸.

158

15.3K

Nikola P. Borisov retweeted

DeepInfra@DeepInfra·Jul 1

Get Nvidia B200 GPUs for $1.99/h on demand until the end of July. Why not?

3.1K

Nikola P. Borisov@nikolaborisof·Jun 22

@nixcraft Nice

nixCraft 🐧@nixcraft·Jun 21

Linus Torvalds & Bill Gates just met each other for the first time

383

1.3K

14.5K

Nikola P. Borisov retweeted

Supermicro@Supermicro·Jun 10

Introducing Supermicro DLC-2: Superior liquid cooling that reduces power, water usage, noise, and space. The new liquid-cooled 4U NVIDIA HGX B200 8-GPU system doubles cooling capacity with advanced cold plates and a 250kW CDU.

379

2.1M

Nikola P. Borisov@nikolaborisof·Jun 13

GCP going down today was kind of crazy. Gather.town didn't work, and Google Meet was kind of broken. Someone on the team had to join audio on WA. DeepInfra was not affected, but our GPUs cooled down a bit because some of our clients were using GCP.

220

Nikola P. Borisov retweeted

DeepInfra@DeepInfra·Aug 24

Our execs came back from vacation and decided to have a pricing meeting. I guess llama3-70b is 35c now.

3.7K

Nikola P. Borisov retweeted

DeepInfra@DeepInfra·May 8

We just launched a TURBO version of the popular MythoMax model. You can get up to 120 tokens per second. Same price of 0.13 USD per 1M tokens.

1.3K

Nikola P. Borisov retweeted

DeepInfra@DeepInfra·Apr 18

Official Mixtral-8x22b-Instruct model just got released and is now on @DeepInfra. This is the best open LLM and we are hosting at the best price of $0.65 / 1M tokens. deepinfra.com/mistralai/Mixt…

1.2K

Nikola P. Borisov retweeted

DeepInfra@DeepInfra·Apr 18

Also we just dropped pricing on most of our 7b, 13b and 70b models to $0.10, $0.18, $0.64 per million input tokens. We will always have the best prices. deepinfra.com/pricing

1.2K

Nikola P. Borisov retweeted

DeepInfra@DeepInfra·Apr 11

We set the bar when we launched the first mixtral, and we're going to do it again! The new mixtral, with 65K context, at ... 65c / 1 million tokens! This new model is almost 3 times larger. deepinfra.com/mistralai/Mixt…

2.6K

Nikola P. Borisov retweeted

DeepInfra@DeepInfra·Apr 10

databricks/dbrx-instruct is now on @deepinfra. $0.6 per 1M tokens - the best price of all providers. Also up to 130 tps. Try it out here: deepinfra.com/databricks/dbr…

2.5K

Nikola P. Borisov retweeted

DeepInfra@DeepInfra·Apr 2

You can now host your custom LLMs at DeepInfra. It's a managed LLM hosting service. Pay per GPU-hour: $2/A100, $4/H100. It's super simple. Read more here: deepinfra.com/blog/custom-ll…

1.9K

Nikola P. Borisov@nikolaborisof·Mar 27

github.com/placeholderkv/…

Nikola P. Borisov retweeted

Hao AI Lab@haoailab·Mar 19

Still optimizing throughput for LLM Serving? Think again: Goodput might be a better choice!
Splitting prefill from decode to different GPUs yields
- up to 4.48x goodput
- up to 10.2x stricter latency criteria
Blog: hao-ai-lab.github.io/blogs/distserv…
Paper: arxiv.org/abs/2401.09670

179

78.2K
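
The throughput versus goodput distinction in the post above can be illustrated with a small sketch. It assumes a simplified reading in which goodput counts only requests that meet both latency targets: time to first token (TTFT, driven by prefill) and time per output token (TPOT, driven by decode). This is not the linked paper's exact definition, and the class name, function name, thresholds, and sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RequestStats:
    ttft_s: float  # time to first token (prefill latency), in seconds
    tpot_s: float  # average time per output token (decode latency), in seconds


def throughput_and_goodput(requests, window_s, ttft_slo_s, tpot_slo_s):
    """Return (throughput, goodput) in requests per second over window_s."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    # A request counts toward goodput only if it meets BOTH latency targets.
    within_slo = sum(
        1 for r in requests
        if r.ttft_s <= ttft_slo_s and r.tpot_s <= tpot_slo_s
    )
    return len(requests) / window_s, within_slo / window_s


# Hypothetical measurements: 4 requests completed in a 2-second window.
sample = [
    RequestStats(ttft_s=0.20, tpot_s=0.04),
    RequestStats(ttft_s=0.90, tpot_s=0.03),  # misses the TTFT target
    RequestStats(ttft_s=0.30, tpot_s=0.12),  # misses the TPOT target
    RequestStats(ttft_s=0.25, tpot_s=0.05),
]
throughput, goodput = throughput_and_goodput(
    sample, window_s=2.0, ttft_slo_s=0.5, tpot_slo_s=0.1
)
print(f"throughput={throughput} req/s, goodput={goodput} req/s")
```

In this toy data, half of the completed requests miss a latency target, so raw throughput overstates the useful work by a factor of two.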

Nikola P. Borisov retweeted

DeepInfra@DeepInfra·Mar 12

Guided JSON response is now available in the @DeepInfra API. Read more about it here: deepinfra.com/blog/json-mode. Restricting the output to JSON has almost no performance penalty and is FREE.

1.1K

Nikola P. Borisov retweeted

DeepInfra@DeepInfra·Jan 27

We just shipped function calling. With great power comes great responsibility. If the LLM tells you to `shutil.rmtree("/")` maybe don't do it. deepinfra.com/blog/function-…

2.5K
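
The warning in the post above suggests never executing a model-proposed call blindly. One common guard is an allowlist, sketched below. The tool-call shape (a dict with a `name` and a JSON-encoded `arguments` string) is an assumption made for illustration rather than a documented DeepInfra response format, and `get_weather` and `run_tool_call` are hypothetical names.

```python
import json


def get_weather(city: str) -> str:
    # Hypothetical harmless tool, used only for illustration.
    return f"Weather lookup requested for {city}"


# Only functions registered here may ever be executed.
ALLOWED_TOOLS = {"get_weather": get_weather}


def run_tool_call(tool_call: dict) -> str:
    """Execute a model-proposed call only if it names an allowlisted tool."""
    name = tool_call.get("name")
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is not allowlisted")
    # Assumed shape: arguments arrive as a JSON-encoded object.
    args = json.loads(tool_call.get("arguments", "{}"))
    if not isinstance(args, dict):
        raise ValueError("arguments must decode to a JSON object")
    return ALLOWED_TOOLS[name](**args)


print(run_tool_call({"name": "get_weather", "arguments": '{"city": "Sofia"}'}))

try:
    run_tool_call({"name": "shutil.rmtree", "arguments": '{"path": "/"}'})
except PermissionError as err:
    print("refused:", err)
```

The lookup happens before any argument parsing, so a call to an unregistered function is refused without touching its payload.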

Nikola P. Borisov@nikolaborisof·Dec 9

We just added Mixtral from @MistralAI. Demo and API here: deepinfra.com/deepinfra/mixt…

Nikola P. Borisov@nikolaborisof·Nov 29

Lol
