ollama

7.7K posts

@ollama

https://t.co/1JpLwJ93nX

California, USA · Joined August 2023
10 Following · 136K Followers
Pinned Tweet
ollama
ollama@ollama·
Ollama is now updated to run the fastest on Apple silicon, powered by MLX, Apple's machine learning framework. This change unlocks much faster performance to accelerate demanding work on macOS:
- Personal assistants like OpenClaw
- Coding agents like Claude Code, OpenCode, or Codex
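For readers wiring this up, a minimal sketch of handing a local model to a coding agent, using the launch subcommand and the qwen3.5 nvfp4 model tag that appear in the Qwen 3.5 tweet further down this thread (both are assumptions borrowed from that tweet, not independently verified):

# fetch the model locally, then start Claude Code against it
ollama pull qwen3.5:35b-a3b-coding-nvfp4
ollama launch claude --model qwen3.5:35b-a3b-coding-nvfp4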
ollama
ollama@ollama·
@D_Twitt3r Need to wait for now. Sorry! We are getting it in shape.
D.
D.@D_Twitt3r·
@ollama Will this updated Ollama support other mlx and/or nvfp4 models downloaded from Hugging Face? Or do we need to wait for you to make further adjustments and post them in your own catalog?
ollama
ollama@ollama·
@crashev Will be accelerated across all Apple silicon devices.
Pawel
Pawel@crashev·
@ollama What about the M4 Pro?
ollama
ollama@ollama·
This results in a large speedup of Ollama on all Apple silicon devices. On Apple's M5, M5 Pro and M5 Max chips, Ollama leverages the new GPU Neural Accelerators to accelerate both time to first token (TTFT) and generation speed (tokens per second).
Note: the test was conducted using Alibaba's Qwen3.5-35B-A3B model quantized to nvfp4, compared against Ollama's previous implementation quantized to q4_K_M, on Ollama 0.18. Ollama 0.19 will see even higher performance (1851 tokens/s prefill and 134 tokens/s decode when running with int4).
[attached media]
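A rough way to check TTFT and generation speed on your own machine is the CLI's verbose timing output, which reports prompt eval (prefill) and eval (decode) rates; a minimal sketch, assuming the nvfp4 tag from the tweet above is available locally (the prompt is illustrative):

# --verbose prints load time, prompt eval rate (prefill), and eval rate (decode)
ollama run qwen3.5:35b-a3b-coding-nvfp4 "Summarize MLX in two sentences." --verbose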
ollama
ollama@ollama·
@John7Istheman Try Ollama's cloud! Water can't touch it. ❤️
Jonathan Rudderham
Jonathan Rudderham@codeRunnerUK·
@ivanfioravanti @ollama I stopped using Ollama because it wouldn’t download models. It would do a few %, then drop back down, do a few %, then drop back down, rinse and repeat. At least with LM Studio I can just manually download the models and point it at the folder.
ollama
ollama@ollama·
@tinyblue_dev I don't know your specific usage patterns, but Ollama cloud's $100 plan offers significantly more usage than Anthropic's 20x Max plan.
nick
nick@tinyblue_dev·
@ollama You missed the question. The free plan is going to hit the usage limit in 10 minutes for my workload. I have 2x Anthropic's 20x Max plan. For your $100/month plan, IF I never hit your usage limit, I will transfer those funds directly into your pocket and cancel Anthropic.
nick
nick@tinyblue_dev·
Hey @ollama - give me a 1-day trial of your max plan, the $100.00 a month plan. If I never hit a usage limit today, I will switch all of my subscriptions to you.
ollama
ollama@ollama·
@RamanduLight So sorry! We are working to make the MiniMax experience good.
Radu
Radu@RamanduLight·
Trying to use MiniMax cloud @ollama - I'm running into lots of errors.
[attached media]
Ziwen
Ziwen@ziwenxu_·
Peak hour limits in Claude are brutal now. Used to push 2 hours straight. Now I'm tapped out in under 1. Sonnet blocked. Opus blocked. What's the play here? Only move left is running Codex to survive those 3-4 peak hours daily.
[attached media]
ollama
ollama@ollama·
Qwen 3.5 35B will be great! Works well for:
- coding (Claude Code, Codex, VS Code, etc.)
- building assistants / agents (Pi for Excel, OpenClaw, etc.)
- general chat, with docs
We are seeing developers building their own integrations for Ollama (over 50k) now.
Claude Code: ollama launch claude --model qwen3.5:35b-a3b-coding-nvfp4
OpenClaw: ollama launch openclaw --model qwen3.5:35b-a3b-coding-nvfp4
Chat with the model: ollama run qwen3.5:35b-a3b-coding-nvfp4
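For the integrations mentioned above, the same model is also reachable through Ollama's local HTTP API; a minimal sketch against the default endpoint on localhost:11434, reusing the model tag from the tweet (the prompt is illustrative):

curl http://localhost:11434/api/generate -d '{
  "model": "qwen3.5:35b-a3b-coding-nvfp4",
  "prompt": "Write a function that reverses a linked list.",
  "stream": false
}'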
Kweku Amoah
Kweku Amoah@KwekuOnX·
@ollama what models can I run on an M5 Pro with 48GB, and what are some use cases?
Ivan Fioravanti ᯅ
Ivan Fioravanti ᯅ@ivanfioravanti·
If you are not yet following @Prince_Canuma, do it now! He is the man behind many of the engines powering local AI on your Apple silicon, leveraging Apple's MLX framework. 🚀
ollama
ollama@ollama·
@purea1go Okay, there is only a single model for MLX right now while we add more! ❤️
Mostafa Adel
Mostafa Adel@purea1go·
@ollama That's good to know. I plan to benchmark inference speeds in Ollama as well.
ollama
ollama@ollama·
@purea1go This is super cool! Thank you for sharing. Ollama is built on top of MLX, and doesn't use MLX-LM.
ollama retweeted
John O'Reilly
John O'Reilly@joreilly·
Just tried out the new qwen3.5:4b-nvfp4 @ollama model on an M1 Max here (in a project where it's used with the Koog AI agent)... 38% faster than qwen3.5:4b (averaged over 5 runs of the agent).
ollama
ollama@ollama·
@keter_slater Ollama also offers hosted models via Ollama’s cloud. It’s the best place to use open models. Give it a try!!
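As a sketch of what trying the hosted models looks like from the same CLI, assuming the account sign-in flow and a cloud-suffixed model tag (the exact tag here is illustrative, not taken from Ollama's catalog):

# sign in once, then run a cloud-hosted model by its tag
ollama signin
ollama run qwen3.5:35b-a3b-cloud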
Keter Slater
Keter Slater@keter_slater·
solid update but real talk the Apple silicon speed gap between ollama and cloud APIs is still massive for anything beyond casual use. "fastest on Mac" is a different benchmark than production inference at scale. this is great for tinkering tho not quite replacing hosted APIs yet fr
ollama
ollama@ollama·
@harrycblum No. It means Ollama didn't detect it as installed on your computer.
ollama
ollama@ollama·
@urieli17 Yes, technically across the board. We still need to enable more model architectures to run.
Uri Eliabayev
Uri Eliabayev@urieli17·
@ollama Wait, what about Neo? You can run 4B models on it. Would this support A18 Pro?
marcelo
marcelo@zidszopers·
@ollama why are you ignoring hermes-agent?