ollama

7.7K posts


@ollama

https://t.co/1JpLwJ93nX

California, USA · Joined August 2023
10 Following · 134.4K Followers
Pinned Tweet
ollama @ollama:
Ollama is now updated to run at its fastest on Apple silicon, powered by MLX, Apple's machine learning framework. This change unlocks much faster performance for demanding work on macOS:
- Personal assistants like OpenClaw
- Coding agents like Claude Code, OpenCode, or Codex
122 replies · 251 reposts · 2K likes · 162.7K views
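A quick way to sanity-check the faster macOS backend is to call a local model through Ollama's Python client; a minimal sketch (the model name and prompt here are illustrative, any pulled model works):

```python
# Minimal smoke test of local generation via the official ollama Python
# package (pip install ollama); assumes the Ollama app is running locally.
import ollama

response = ollama.chat(
    model="qwen3.5:35b-a3b-coding-nvfp4",  # illustrative: use any local model
    messages=[{"role": "user", "content": "Write a one-line hello world in Go."}],
)
print(response["message"]["content"])
```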
marcelo @zidszopers:
@ollama why are you ignoring hermes-agent?
1 reply · 0 reposts · 0 likes · 372 views
ollama retweeted
Cheng @zcbenz:
We have been expecting this since ollama's first pull request to MLX. It is just the beginning: the CUDA & CPU backends are still improving, and hopefully we will have one framework unifying inference & training across all platforms.
[quoting @ollama's pinned post above]
1 reply · 2 reposts · 17 likes · 2.1K views
ollama @ollama:
@zzddfge @JustinLin610 if you have the weights already, no. But it's only available via ollama pull qwen3.5:35b-a3b-coding-nvfp4 and the int4 variation.
0 replies · 2 reposts · 2 likes · 1.3K views
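For reference, the same pull can also be scripted with the Python client; a sketch (the exact tag of the int4 variation isn't given in the thread, so only the named tag is pulled):

```python
# Sketch: pull the NVFP4 build through the ollama Python package,
# equivalent to `ollama pull qwen3.5:35b-a3b-coding-nvfp4` on the CLI.
import ollama

ollama.pull("qwen3.5:35b-a3b-coding-nvfp4")
print(ollama.list())  # the new tag should now appear among local models
```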
ollama @ollama:
@Sc0ttTheRobot it'll work (just have to watch out for other apps taking away available memory)
1 reply · 0 reposts · 0 likes · 58 views
RexMonte @Sc0ttTheRobot:
@ollama More than that? Why exactly, if I have a 32GB Mac mini?
1 reply · 0 reposts · 1 like · 113 views
ollama @ollama:
Improved caching for more responsiveness

Ollama's cache has been upgraded to make coding and agentic tasks more efficient.
- Lower memory utilization: Ollama now reuses its cache across conversations, meaning less memory use and more cache hits when branching from a shared system prompt with tools like Claude Code.
- Intelligent checkpoints: Ollama now stores snapshots of its cache at strategic locations in the prompt, resulting in less prompt processing and faster responses.
- Smarter eviction: shared prefixes survive longer even when older branches are dropped.
1 reply · 2 reposts · 32 likes · 7.6K views
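To make the reuse/eviction behavior concrete, a toy sketch of a prefix cache whose eviction favors heavily shared prefixes (illustrative only, not Ollama's actual implementation):

```python
# Toy model of the ideas above: reuse across branches, checkpoints at
# prompt positions, and eviction that keeps heavily shared prefixes alive.
from collections import OrderedDict

class PrefixCache:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.snapshots = OrderedDict()  # prefix tuple -> hit count

    def checkpoint(self, tokens):
        """Store a (pretend) KV snapshot at this point in the prompt."""
        key = tuple(tokens)
        self.snapshots.setdefault(key, 0)
        self.snapshots.move_to_end(key)
        while len(self.snapshots) > self.capacity:
            victim = min(self.snapshots, key=self.snapshots.get)  # least shared
            del self.snapshots[victim]

    def longest_prefix(self, tokens):
        """Find the longest cached prefix of `tokens` -- the reuse step."""
        best = ()
        for prefix in self.snapshots:
            if len(prefix) > len(best) and tuple(tokens[: len(prefix)]) == prefix:
                best = prefix
        if best:
            self.snapshots[best] += 1        # a hit makes the prefix "more shared"
            self.snapshots.move_to_end(best)
        return best

cache = PrefixCache()
system = ["<system>", "you", "are", "a", "coding", "agent"]
cache.checkpoint(system)                      # checkpoint after the shared prompt
for branch in (system + ["fix", "the", "bug"], system + ["write", "tests"]):
    print(len(cache.longest_prefix(branch)))  # 6 both times: prefix is reused
```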
Ryo Lu @ryolu_:
when software had a soul

there was a moment around 2005 when using a Mac felt like touching something alive. the dock bounced. the genie effect swooped. exposé scattered your windows like cards on a table. none of it was strictly necessary. all of it felt like someone cared – not about metrics, but about the feeling of using a machine.

software back then had texture. it had a philosophy. you could feel the person behind it. someone made a decision to make that icon beautiful, to animate that transition just so, to write that error message with a little warmth. apps had personalities. some were weird. some were over-designed in ways that would make a modern PM flinch. but they were alive.

the web was the same. personal sites were genuinely personal. blogs felt like letters. forums had regulars. you knew who made what. the internet had neighborhoods, and each one felt different. nothing was optimized for scale. things were made by people who loved what they were making.

somewhere along the way, we traded all of that for growth. A/B tests flattened the edges. design systems standardized the personality out. everything got faster, smoother, more consistent – and somehow less interesting. the quirks were removed because they didn't test well. the warmth got cut because it wasn't measurable. we optimized our way into a world of things that work perfectly and feel like nothing.

now every app looks the same. every interface follows the same patterns. every product speaks in the same calm, frictionless voice, siloed in their own little islands. the humanity got rounded off.

and then came AI agents. and the speed got inhuman. now you can generate an entire product in an afternoon. ship a feature before lunch. spin up ten variations before anyone's had their coffee. the gap from idea to code is basically zero. which sounds incredible. and it is. but there's a catch.

when making things is too easy, the slop comes for free too. mediocre things don't look obviously bad – they look fine. they work. they ship. they pass review. and now there are infinite of them. the internet is filling up with software that functions but means nothing. interfaces that are correct but feel dead. products made by agents, reviewed by no one, shipped into the void.

this is the thing that keeps me up at night. not that AI will replace people who care. but that it will drown them out.

here's what I still believe: the best things are made by people who couldn't help themselves. someone who lost sleep over an icon. who rewrote the same line of copy twelve times. who added an animation nobody asked for because it made the thing feel right. that obsession – that's not inefficiency. that's the whole point.

AI doesn't make that irrelevant. it actually makes it rarer and more valuable. taste is not a markdown skill. caring is not a parameter. the weird, specific, "soul" thing you put into something – that can't be programmed into existence.

the path forward isn't to make more slop faster. it's to finally give people with real vision the tools to make the thing they always imagined but couldn't build alone. the designer who had the idea but couldn't code. the kid who saw something nobody else saw. the person who cared too much about something most people wouldn't notice.

if we get this right, we don't get a faster factory. we get a renaissance. more strange, personal, opinionated software made by teams of people who care and mean it. that's still possible.

but only if the people who care get the space and tools to actually express themselves – and don't just hand the wheel to the agent and walk away.
52 replies · 88 reposts · 672 likes · 37.6K views
ollama @ollama:
@SMT_Solvers ❤️❤️❤️ super happy! We are all in it together to drive open model adoption.
0 replies · 1 repost · 13 likes · 2.4K views
ollama retweeted
ollama @ollama:
Future models

We are actively working to support future models. For users with custom models fine-tuned on supported architectures, we will introduce an easier way to import models into Ollama. In the meantime, we will expand the list of supported architectures.
2 replies · 0 reposts · 30 likes · 6.9K views
ollama @ollama:
NVFP4 support: higher quality responses and production parity

Ollama now leverages NVIDIA's NVFP4 format to maintain model accuracy while reducing memory bandwidth and storage requirements for inference workloads. As more inference providers scale inference using the NVFP4 format, Ollama users get the same results they would see in a production environment. It also opens up the ability for Ollama to run models optimized by NVIDIA's Model Optimizer. Other precisions will be made available based on the design and usage intent of Ollama's research and hardware partners.
1 reply · 0 reposts · 51 likes · 8.3K views
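As a rough illustration of what block-scaled 4-bit floating point looks like, a numpy sketch in the spirit of NVFP4 (FP4/E2M1 values with one scale per 16-element block; the real format also stores FP8 block scales and is decoded in hardware, so this is schematic only):

```python
# Schematic numpy sketch of block-scaled FP4 (E2M1) quantization.
# Not NVIDIA's implementation; real NVFP4 also carries FP8 block scales.
import numpy as np

E2M1 = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # FP4 magnitudes

def quantize_block(x):
    """Quantize one 16-value block to signed E2M1 codes plus a scale."""
    scale = float(np.abs(x).max()) / 6.0      # map the block range onto E2M1
    if scale == 0.0:
        scale = 1.0
    mags = np.abs(x) / scale
    idx = np.abs(mags[:, None] - E2M1[None, :]).argmin(axis=1)  # nearest code
    return np.sign(x) * E2M1[idx], scale

def dequantize(codes, scale):
    return codes * scale

block = np.random.default_rng(0).normal(size=16)
codes, scale = quantize_block(block)
print("max abs error:", np.abs(dequantize(codes, scale) - block).max())
```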
ollama @ollama:
This results in a large speedup of Ollama on all Apple silicon devices. On Apple's M5, M5 Pro and M5 Max chips, Ollama leverages the new GPU Neural Accelerators to accelerate both time to first token (TTFT) and generation speed (tokens per second).

note: the test was conducted using Alibaba's Qwen3.5-35B-A3B model quantized to NVFP4, against Ollama's previous implementation quantized to q4_K_M on Ollama 0.18. Ollama 0.19 will see even higher performance (1851 tokens/s prefill and 134 tokens/s decode when running with int4).
3 replies · 11 reposts · 153 likes · 12.1K views
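A back-of-envelope reading of those figures (the prompt and output sizes below are assumptions; the throughput numbers are from the post):

```python
# What 1851 tok/s prefill and 134 tok/s decode mean for end-to-end latency.
prompt_tokens, output_tokens = 4000, 500   # assumed workload for illustration
prefill_tps, decode_tps = 1851, 134        # Ollama 0.19 int4 figures above

ttft = prompt_tokens / prefill_tps         # prefill dominates time to first token
gen = output_tokens / decode_tps
print(f"TTFT ~ {ttft:.1f}s, generation ~ {gen:.1f}s, total ~ {ttft + gen:.1f}s")
```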
ollama @ollama:
@MrRemKing The moment the GLM team launches it open-source! ❤️❤️❤️
1 reply · 0 reposts · 2 likes · 32 views
ollama @ollama:
@k_k_kaundal That you'd have to ask Docker; it's not our product. You can use Ollama directly, or run Ollama inside Docker.
1 reply · 0 reposts · 1 like · 86 views
K. K. 💫 @k_k_kaundal:
@ollama Most of the time I use these models with an Ollama-based config, so can I use Docker models in VS Code with this update?
1 reply · 0 reposts · 0 likes · 81 views
ollama @ollama:
Visual Studio Code now integrates with Ollama via GitHub Copilot. If you have Ollama installed, any local or cloud model from Ollama can be selected for use within Visual Studio Code.
114 replies · 495 reposts · 4.5K likes · 322.9K views
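One way to confirm Ollama is reachable (and see which local models VS Code will be able to pick up) is the documented REST endpoint for listing models; a minimal sketch:

```python
# List locally available Ollama models via the documented REST API.
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])  # these should be selectable in Copilot's model picker
```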