Roberto Paredes

1.4K posts

Roberto Paredes

@RobertoParPal

Full Professor DSIC-UPV. Former Director of PRHLT Research Center. CTO Solver Machine Learning. European Distributed Deep Learning Library-EDDL Lead Developer.

Valencia, España เข้าร่วม Ekim 2012

113 กำลังติดตาม294 ผู้ติดตาม

Roberto Paredes@RobertoParPal·5 Mar

@Glennalbert @Staphylo_ailus

GIF

QME

Alberto@Glennalbert·5 Mar

A ver te explico El carnicero básicamente estableció un precio por debajo del precio de equilibrio, es decir, estaba vendiendo muy barato. La consecuencia de un precio por debajo del precio potencial de mercado, produce una demanda excesiva para los bienes provistos, lo que hará que la demanda aumente, vaciando rápidamente su inventario. Pero si además la inflación incrementa el precio de reposición entonces el carnicero no podrá aprovisionarse nuevamente se inventario, quedando en la quiebra. Básicamente liquidó su negocio a voluntad.

Español

124

193

5.3K

211.3K

Ailu@Staphylo_ailus·4 Mar

Pero como? Si el vendía más barato no se suponía que todos le iban a comprar a él y los otros negocios iban a tener que bajar los precios obligadamente? No funciona así el mercado y la oferta-demanda?

Tendencias en Argentina@porqueTTarg

“Carnicero” Porque el carnicero que se negaba a subir los precios ahora cierra su negocio: “La inflación me ganó”. Sergio se negaba a aumentar los valores de sus productos porque los jubilados que viven en su barrio no podían pagarlos.

Español

492

2.4K

24.2K

Roberto Paredes@RobertoParPal·20 Şub

@jimkxa Yes this is why ROCm (misery) is soooo good…

English

302

Jim Keller@jimkxa·17 Şub

Cuda’s a swamp, not a moat. x86 was a swamp too

English

178

1.3K

403.2K

Roberto Paredes@RobertoParPal·16 Şub

Ale, pues ya está…

OpenAI@OpenAI

Introducing Sora, our text-to-video model. Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. openai.com/sora Prompt: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.”

Español

229

Roberto Paredes@RobertoParPal·5 Ara

@antor Desiré = de ver ** 🤣

Español

Roberto Paredes@RobertoParPal·5 Ara

@antor Puede ser interesante pero Desirés de ver esto… no sé x.com/gregkamradt/st…

Greg Kamradt@GregKamradt

Pressure Testing GPT-4-128K With Long Context Recall 128K tokens of context is awesome - but what's performance like? I wanted to find out so I did a “needle in a haystack” analysis Some expected (and unexpected) results Here's what I found: Findings: * GPT-4’s recall performance started to degrade above 73K tokens * Low recall performance was correlated when the fact to be recalled was placed between at 7%-50% document depth * If the fact was at the beginning of the document, it was recalled regardless of context length So what: * No Guarantees - Your facts are not guaranteed to be retrieved. Don’t bake the assumption they will into your applications * Less context = more accuracy - This is well know, but when possible reduce the amount of context you send to GPT-4 to increase its ability to recall * Position matters - Also well know, but facts placed at the very beginning and 2nd half of the document seem to be recalled better Overview of the process: * Use Paul Graham essays as ‘background’ tokens. With 218 essays it’s easy to get up to 128K tokens * Place a random statement within the document at various depths. Fact used: “The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.” * Ask GPT-4 to answer this question only using the context provided * Evaluate GPT-4s answer with another model (gpt-4 again) using @langchain evals * Rinse and repeat for 15x document depths between 0% (top of document) and 100% (bottom of document) and 15x context lengths (1K Tokens > 128K Tokens) Next Steps To Take This Further: * Iterations of this analysis were evenly distributed, it’s been suggested that doing a sigmoid distribution would be better (it would tease out more nuanced at the start and end of the document) * For rigor, one should do a key:value retrieval step. However for relatability I did a San Francisco line within PGs essays. Notes: * While I think this will be directionally correct, more testing is needed to get a firmer grip on GPT4s abilities * Switching up prompt with vary results * 2x tests were run at large context lengths to tease out more performance * This test cost ~$200 for API calls (a single call at 128K input tokens costs $1.28) * Thank you to @charles_irl for being a sounding board and providing great next steps

Español

Andrés Miguel Torrubia Sáez@antor·5 Ara

¿Contextos de 128k+ en una sola GPU en local? Sí con: github.com/state-spaces/m… Aún no hay modelos RLHF (Chat) y los que hay son "solo" de 2.8b sin embargo posible punto de inflexión en LLMs si esta arquitectura escala.

Español

3.2K

Roberto Paredes@RobertoParPal·5 Ara

@antor Mira pues LINCE Mistral 7B bastante más barato…

Español

Andrés Miguel Torrubia Sáez@antor·5 Ara

@RobertoParPal Estos modelos tienen caducidad anticipada, pero Falcon-180b es de los pocos que se manejan medianamente en castellano, de ahi (principalmente) el interés. Estoy bajando ahora deepseek en fp16 con 67B y haré unas pruebas.

Español

122

Andrés Miguel Torrubia Sáez@antor·4 Ara

Falcon 180B en local con ollama (supongo que estará cuantizado a 4 bits)

Español

2.7K

Roberto Paredes รีทวีตแล้ว

Worten@WortenES·15 Kas

Pues el anuncio para el Black Friday nos ha quedado precioso. Esperamos que no te importe que hayamos retocado un poco tu vídeo @lladosfitness 😘

Español

726

3.9K

22K

3.8M

Roberto Paredes@RobertoParPal·18 Eki

@antor Esto de qué coche es?

Español

169

Andrés Miguel Torrubia Sáez@antor·18 Eki

Dos bicharracos de estos (masivos, lo de abajo es una fuente de PC) para refrigerar de forma relativamente silenciosa (espero) el sucesor de xataka.com/ordenadores/me…

Español

6.5K

Roberto Paredes@RobertoParPal·23 Eyl

@karpathy @OwainEvans_UK Where is the problem?

English

Andrej Karpathy@karpathy·22 Eyl

LLM knowledge is a lot more "patchy" than you'd expect. I still don't have great intuition for it. They learn any thing in the specific "direction" of the context window of that occurrence and may not generalize when asked in other directions. It's a weird partial generalization. The "reversal curse" (cool name) is imo a special case of this.

Owain Evans@OwainEvans_UK

Does a language model trained on “A is B” generalize to “B is A”? E.g. When trained only on “George Washington was the first US president”, can models automatically answer “Who was the first US president?” Our new paper shows they cannot!

English

157

325

875.5K

Roberto Paredes รีทวีตแล้ว

Elon Musk@elonmusk·26 Ağu

twitter.com/i/broadcasts/1…

ZXX

14.4K

23.7K

154.7K

48.9M

Roberto Paredes@RobertoParPal·25 Ağu

Lo ha clavado instagram.com/p/CwXpWg0t1xL/…

Español

Roberto Paredes@RobertoParPal·12 Ağu

@saivenkataraju @abhi1thakur Yeah barely one token per second

English

Saivenkataraj@saivenkataraju·11 Ağu

@RobertoParPal @abhi1thakur Yeah. But not production level speed I think

English

abhishek@abhi1thakur·11 Ağu

The EASIEST way to finetune LLAMA-v2 (or any other LLM) on local machine!

English

170

1.1K

164.1K

Roberto Paredes@RobertoParPal·11 Ağu

@saivenkataraju @abhi1thakur It is slow but yes you can. Check for instance also the llama2.c project of Karpathy.

English

Saivenkataraj@saivenkataraju·11 Ağu

@abhi1thakur Can we atleats inference on CPU’s using existing models

English

409

Roberto Paredes@RobertoParPal·11 Ağu

@harumambaru @abhi1thakur Not sure but I think the it uses int4 and llama 7b so probably could fit but not sure

English

Igor Kasianenko@harumambaru·11 Ağu

@abhi1thakur Hi, thanks for your video! I wonder what is GPU that you are using, will it be possible to train on 1 GPU with 8GB memory? Or to the matter of fact on m1 pro macbook as they use shared memory.

English

470

Roberto Paredes@RobertoParPal·24 Haz

@DesatranqueJaen @RaulHernandezL

QME

200

Desatranques Jaén@DesatranqueJaen·23 Haz

Dime un disco que, en tu opinión, sea un 10/10.

Español

680

115

1.7K

501.6K

Roberto Paredes@RobertoParPal·17 May

@francoisfleuret Temperature of the room

English

398

François Fleuret@francoisfleuret·16 May

People who train enormous models (10s billions parameters and several weeks), what are the quantities you monitor, and what manual interventions do you do?

English

143

65.8K

Roberto Paredes รีทวีตแล้ว

Solver Intelligent Analytics@iasolver·5 May

La semana pasada celebramos la 2ª edición de nuestro ‘Almuerzo con Inteligencia Artificial’, un encuentro organizado en Madrid que contó nuevamente con Jordi Mansanet, Roberto Paredes y Victoria Corral de Solver como anfitriones y que conecta con empresas que apuestan por la IA.

Solver Intelligent Analytics tweet media

Español

311

Roberto Paredes@RobertoParPal·4 May

@elonmusk @clownworld @CommunityNotes No, it is the border between Spain and Morocco. That happens in a particular days when Spain a Morocco relationships were bad… so clearly Morocco can definitively avoid this but they use that as a weapon.

English

Elon Musk@elonmusk·4 May

@clownworld Is this real? @CommunityNotes

English

2.2K

15.8K

Roberto Paredes@RobertoParPal·2 May

@DrBPChamberlain @tk_rusch @mmbronstein (Pay) Attention is all you need

English

Ben Chamberlain@DrBPChamberlain·1 May

And the prize for the worst poster at #ICLR2023 goes to poster 65, gradient gating for deep multirate graphs (thanks @tk_rusch!!) @mmbronstein and I are here for the next 2h.

English

101

20.9K

ค้นพบ

@Glennalbert @Staphylo_ailus @jimkxa @antor @lladosfitness @karpathy @OwainEvans_UK @saivenkataraju