
Yannick T.
@jylls35
Groupe Rocher, HIP, #Devops, #IOT, #observability, #API, #GreenIT, GreenAPI, API Thinking, VTT/MTB, Innovation IT

On the way to 1 MW racks in the OCP (Open Compute Project) format at the hyperscalers. The article reports that AI racks should reach 500 kW before 2030, then progress to 1 MW. datacenterdynamics.com/en/news/hypers…

Experimentation day with the @sdis85 #VG2024 #SacSOS #resilience

Ollama 0.2 is here! Concurrency is now enabled by default. ollama.com/download

This unlocks 2 major features:

Parallel requests
Ollama can now serve multiple requests at the same time, using only a little bit of additional memory for each request. This enables use cases such as:
- Handling multiple chat sessions at the same time
- Hosting code completion LLMs for your team
- Processing different parts of a document simultaneously
- Running multiple agents at the same time

Run multiple models
Ollama now supports loading different models at the same time. This improves several use cases:
- Retrieval Augmented Generation (RAG): both the embedding and text completion models can be loaded into memory simultaneously
- Agents: multiple versions of an agent can now run simultaneously
- Running large and small models side-by-side

Models are automatically loaded and unloaded based on requests and how much GPU memory is available.
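A minimal sketch of what the parallel-request feature enables from the client side: fanning several prompts out to Ollama's local HTTP API (`/api/generate` on the default port 11434) with a thread pool, so the server handles them concurrently. The model name and the injectable `post` hook are illustrative assumptions, not part of Ollama itself.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Default local Ollama endpoint (assumption: a local install on the standard port).
OLLAMA_URL = "http://localhost:11434/api/generate"

def generate(prompt, model="llama3", url=OLLAMA_URL, post=None):
    """Send one non-streaming generate request.

    `post` is a hypothetical hook so the fan-out logic can be exercised
    without a running server; by default we call the real endpoint.
    """
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    if post is not None:
        return post(payload)
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

def generate_parallel(prompts, **kwargs):
    """Issue all prompts at once; with Ollama >= 0.2 they are served concurrently."""
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        return list(pool.map(lambda p: generate(p, **kwargs), prompts))
```

With a 0.2 server running, `generate_parallel(["Summarize X", "Translate Y"])` returns both completions without the second request queuing behind the first, which is the behavior change the release announces.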