Roberto Paredes

1.4K posts

@RobertoParPal

Full Professor DSIC-UPV. Former Director of PRHLT Research Center. CTO Solver Machine Learning. European Distributed Deep Learning Library (EDDL) Lead Developer.

Valencia, Spain · Joined October 2012
113 Following · 294 Followers
Alberto
Alberto@Glennalbert·
Let me explain. The butcher basically set a price below the equilibrium price, i.e., he was selling very cheap. The consequence of a price below the potential market price is excess demand for the goods on offer, which makes demand rise and quickly empties his inventory. And if, on top of that, inflation raises the replacement cost, then the butcher cannot restock his inventory and ends up bankrupt. He basically liquidated his business by choice.
Alberto tweet media
124
193
5.3K
211.3K
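To put numbers on that argument (the figures below are made up for illustration, not from the thread): selling below the market price clears the stock quickly, and once inflation pushes the replacement cost above the sale price, the revenue from a sold-out counter no longer buys back the same amount of stock.

```python
# Made-up numbers to illustrate the argument above; not real prices.
stock_kg = 100              # meat currently in stock
sale_price = 900            # per kg, deliberately below the ~1200 market price
revenue = stock_kg * sale_price         # the cheap stock sells out quickly

replacement_cost = 1100     # per kg, after inflation raises wholesale prices
restock_kg = revenue / replacement_cost  # what the revenue buys back

print(f"revenue from selling out: {revenue}")
print(f"stock that revenue buys back: {restock_kg:.1f} kg (was {stock_kg} kg)")
# Each cycle the counter shrinks (100 -> ~82 -> ~67 kg, ...): selling below the
# replacement cost is a slow liquidation of the business.
```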
Ailu
Ailu@Staphylo_ailus·
But how? If he was selling cheaper, weren't all the customers supposed to buy from him, and wouldn't the other shops be forced to lower their prices? Isn't that how the market and supply and demand work?
Tendencias en Argentina@porqueTTarg

“Carnicero” (“Butcher”): because the butcher who refused to raise his prices is now closing his shop: “Inflation beat me.” Sergio refused to raise the prices of his products because the retirees living in his neighborhood could not afford them.

492
2.4K
24.2K
2M
Roberto Paredes
Roberto Paredes@RobertoParPal·
@jimkxa Yes this is why ROCm (misery) is soooo good…
0
0
0
302
Jim Keller
Jim Keller@jimkxa·
Cuda’s a swamp, not a moat. x86 was a swamp too
46
178
1.3K
403.2K
Roberto Paredes
Roberto Paredes@RobertoParPal·
Well, that's that then…
OpenAI@OpenAI

Introducing Sora, our text-to-video model. Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. openai.com/sora Prompt: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.”

0
0
6
229
Roberto Paredes
Roberto Paredes@RobertoParPal·
@antor It could be interesting, but after seeing this… I don't know x.com/gregkamradt/st…
Greg Kamradt@GregKamradt

Pressure Testing GPT-4-128K With Long Context Recall

128K tokens of context is awesome, but what's performance like? I wanted to find out, so I did a "needle in a haystack" analysis. Some expected (and unexpected) results. Here's what I found:

Findings:
* GPT-4's recall performance started to degrade above 73K tokens
* Recall was poor when the fact to be retrieved was placed at 7%-50% document depth
* If the fact was at the beginning of the document, it was recalled regardless of context length

So what:
* No guarantees: your facts are not guaranteed to be retrieved. Don't bake the assumption that they will be into your applications
* Less context = more accuracy: this is well known, but when possible, reduce the amount of context you send to GPT-4 to increase its ability to recall
* Position matters: also well known, but facts placed at the very beginning and in the second half of the document seem to be recalled better

Overview of the process:
* Use Paul Graham essays as "background" tokens. With 218 essays it's easy to get up to 128K tokens
* Place a random statement within the document at various depths. Fact used: "The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day."
* Ask GPT-4 to answer this question using only the context provided
* Evaluate GPT-4's answer with another model (GPT-4 again) using @langchain evals
* Rinse and repeat for 15 document depths between 0% (top of document) and 100% (bottom of document) and 15 context lengths (1K tokens > 128K tokens)

Next steps to take this further:
* Iterations of this analysis were evenly distributed; it's been suggested that a sigmoid distribution would be better (it would tease out more nuance at the start and end of the document)
* For rigor, one should do a key:value retrieval test. However, for relatability I used a San Francisco line within PG's essays

Notes:
* While I think this is directionally correct, more testing is needed to get a firmer grip on GPT-4's abilities
* Switching up the prompt gives varying results
* 2x tests were run at large context lengths to tease out more performance
* This test cost ~$200 in API calls (a single call at 128K input tokens costs $1.28)
* Thank you to @charles_irl for being a sounding board and providing great next steps

1
0
0
27
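As a rough sketch of the procedure described above (this is not Greg Kamradt's actual code: the model name, file layout, and the simple string-match grading are assumptions, and the thread used @langchain evals with GPT-4 as the grader instead), the needle-in-a-haystack test boils down to: build a long context from background essays, insert the fact at a chosen depth, ask the model the question, and check whether the answer contains it.

```python
# Needle-in-a-haystack sketch, assuming the OpenAI Python client (openai>=1.0)
# and a local folder of Paul Graham essays as background text. Paths, depths,
# and the context sizes are illustrative.
from pathlib import Path
from openai import OpenAI

client = OpenAI()
NEEDLE = ("The best thing to do in San Francisco is eat a sandwich "
          "and sit in Dolores Park on a sunny day.")
QUESTION = "What is the best thing to do in San Francisco?"

def build_context(background: str, n_chars: int, depth: float) -> str:
    """Trim background to n_chars and insert the needle at the given depth (0..1)."""
    text = background[:n_chars]
    pos = int(len(text) * depth)
    return text[:pos] + " " + NEEDLE + " " + text[pos:]

def ask(context: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4-turbo",  # stand-in for the 128K-context model used in the thread
        messages=[
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"{context}\n\nQuestion: {QUESTION}"},
        ],
    )
    return resp.choices[0].message.content

background = " ".join(p.read_text() for p in Path("pg_essays").glob("*.txt"))
for n_chars in (4_000, 40_000, 400_000):           # roughly 1K to 100K tokens
    for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
        answer = ask(build_context(background, n_chars, depth))
        recalled = "Dolores Park" in answer          # crude stand-in for a model-graded eval
        print(f"chars={n_chars:>7} depth={depth:.2f} recalled={recalled}")
```

Sweeping depth and context length this way reproduces the 15x15 grid the thread describes; the grid here is deliberately small to keep API costs low.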
Andrés Miguel Torrubia Sáez
128k+ contexts on a single local GPU? Yes, with: github.com/state-spaces/m… There are no RLHF (Chat) models yet, and the ones available are "only" 2.8b, but this could be an inflection point for LLMs if this architecture scales.
4
2
26
3.2K
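The tweet just links the state-spaces repo; as a minimal sketch of running the 2.8b checkpoint locally, assuming a recent transformers release that ships MambaForCausalLM (the "-hf" checkpoint name and generation settings below are assumptions, not from the tweet):

```python
# Minimal local-inference sketch for a small Mamba checkpoint. Assumes a
# transformers version with Mamba support; "state-spaces/mamba-2.8b-hf" is the
# assumed name of the converted weights on the Hugging Face Hub.
import torch
from transformers import AutoTokenizer, MambaForCausalLM

model_id = "state-spaces/mamba-2.8b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = MambaForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

prompt = "State-space models handle long contexts by"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
out = model.generate(input_ids, max_new_tokens=100)  # SSM keeps a fixed-size state, so memory stays flat as context grows
print(tokenizer.decode(out[0], skip_special_tokens=True))
```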
Roberto Paredes
Roberto Paredes@RobertoParPal·
@antor Well look, LINCE Mistral 7B is quite a bit cheaper…
0
0
1
21
Andrés Miguel Torrubia Sáez
@RobertoParPal These models have a short shelf life, but Falcon-180b is one of the few that handles Spanish reasonably well, hence (mainly) the interest. I'm downloading deepseek 67B in fp16 right now and will run some tests.
1
0
2
122
Roberto Paredes retweeted
Worten
Worten@WortenES·
Well, our Black Friday ad turned out beautifully. We hope you don't mind that we touched up your video a little @lladosfitness 😘
726
3.9K
22K
3.8M
Andrej Karpathy
Andrej Karpathy@karpathy·
LLM knowledge is a lot more "patchy" than you'd expect. I still don't have great intuition for it. They learn anything in the specific "direction" of the context window of that occurrence and may not generalize when asked in other directions. It's a weird partial generalization. The "reversal curse" (cool name) is imo a special case of this.
Owain Evans@OwainEvans_UK

Does a language model trained on “A is B” generalize to “B is A”? E.g. When trained only on “George Washington was the first US president”, can models automatically answer “Who was the first US president?” Our new paper shows they cannot!

157
325
3K
875.5K
abhishek
abhishek@abhi1thakur·
The EASIEST way to finetune LLAMA-v2 (or any other LLM) on a local machine!
19
170
1.1K
164.1K
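The tweet links a video; as a generic illustration of what finetuning LLaMA-2 on a local machine usually involves (not the recipe from the video), here is a minimal LoRA sketch with transformers + peft + datasets. The base model, dataset, and hyperparameters are placeholders.

```python
# Minimal LoRA finetuning sketch. Assumes access to the gated meta-llama
# weights; dataset and hyperparameters are placeholders. On an 8-16 GB GPU you
# would normally add 4-bit quantization (QLoRA), omitted here for brevity.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"          # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)              # only the small LoRA adapters are trained

ds = load_dataset("timdettmers/openassistant-guanaco", split="train[:1000]")
ds = ds.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
            batched=True, remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama2-lora", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1,
                           learning_rate=2e-4, logging_steps=10),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("llama2-lora-adapter")     # saves adapter weights only
```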
Saivenkataraj
Saivenkataraj@saivenkataraju·
@abhi1thakur Can we at least run inference on CPUs using existing models?
1
0
0
409
Igor Kasianenko
Igor Kasianenko@harumambaru·
@abhi1thakur Hi, thanks for your video! I wonder what GPU you are using; would it be possible to train on 1 GPU with 8GB of memory? Or, for that matter, on an M1 Pro MacBook, since they use shared memory?
1
0
1
470
Desatranques Jaén
Desatranques Jaén@DesatranqueJaen·
Name an album that, in your opinion, is a 10/10.
Desatranques Jaén tweet media
680
115
1.7K
501.6K
François Fleuret
François Fleuret@francoisfleuret·
People who train enormous models (tens of billions of parameters, runs of several weeks), what quantities do you monitor, and what manual interventions do you make?
13
3
143
65.8K
Roberto Paredes retweeted
Solver Intelligent Analytics
Last week we held the 2nd edition of our 'Almuerzo con Inteligencia Artificial' (Lunch with Artificial Intelligence), a gathering organized in Madrid that once again had Jordi Mansanet, Roberto Paredes, and Victoria Corral from Solver as hosts, and that connects us with companies betting on AI.
Solver Intelligent Analytics tweet media
1
1
5
311
Roberto Paredes
Roberto Paredes@RobertoParPal·
@elonmusk @clownworld @CommunityNotes No, it is the border between Spain and Morocco. That happened on particular days when relations between Spain and Morocco were bad… so Morocco could clearly have prevented it, but they use it as a weapon.
0
0
0
73
Ben Chamberlain
Ben Chamberlain@DrBPChamberlain·
And the prize for the worst poster at #ICLR2023 goes to poster 65, gradient gating for deep multirate graphs (thanks @tk_rusch!!) @mmbronstein and I are here for the next 2h.
Ben Chamberlain tweet media
7
3
101
20.9K