Juan Carlos Olano

123 posts

@jcolano

Entrepreneur in M2M, location/mobility, and Mobile Enterprise Application Platforms, with an emphasis on real-time and context-aware data solutions; culinary and life hacks.

Miami · Joined September 2009
367 Following · 140 Followers
Juan Carlos Olano@jcolano·
@flySFO This is not what happened. AA2045 was already at the gate, and most passengers for the next flight to MIA were already onboard when smoke was detected.
2 replies · 0 reposts · 1 like · 1.3K views
San Francisco International Airport (SFO) ✈️
American Airlines Flight 2045 from Miami was taxiing to the gate when the crew reported smoke in the cabin. The aircraft was evacuated and SFFD responded. Passengers are being transported to the terminal. 3 minor injuries were reported from the evacuation, none needing medical transport.
4 replies · 18 reposts · 77 likes · 22K views
Dagny T@needanespresso·
@miyadavid My daughter’s friend was on that flight. Supposedly a passenger had firecrackers in their backpack!
2 replies · 0 reposts · 0 likes · 207 views
Emilia David@miyadavid·
A plane here at SFO en route to Miami apparently filled with smoke as boarding finished. There’s a bunch of really angry, confused and understandably scared passengers here right now
7 replies · 4 reposts · 8 likes · 2.4K views
Mat@Maticesss·
@FlightEmergency Flight AA2045 SFO to Miami. The plane caught fire just before departure and we had to escape through some ramps. Everybody is OK apparently.
Mat tweet media
23 replies · 17 reposts · 84 likes · 26.3K views
Juan Carlos Olano@jcolano·
@PalmettoFord1 My leased 2021 Ford Explorer broke down on 11/19. On the 21st I dropped it off at the Palmetto Ford of Miami dealership. It has been almost a month since I left the car there; I have been calling or visiting every day, and every day they say "today or tomorrow". Is this normal?
0 replies · 0 reposts · 0 likes · 19 views
Juan Carlos Olano@jcolano·
@bindureddy I think: Had Google known their 2017 paper could have such potential, it would have been kept secret.
0 replies · 0 reposts · 0 likes · 104 views
Bindu Reddy@bindureddy·
Until recently, industry AI research labs were open and transparent. Imagine if Google hadn't chosen to publish "Attention Is All You Need" and had kept Transformers secret - OpenAI wouldn't even have invented GPTs! We lost that spirit of openness and collaboration when people realized they could make $$ and wield immense power by being secretive. The only way to go back to an open, innovative, and transparent AI ecosystem is for open source to catch up to GPT-4. The good news is we are so close that you can almost feel it :) 🤞🤞
39 replies · 62 reposts · 466 likes · 80.5K views
Juan Carlos Olano@jcolano·
I recommend exploring different explanations - and that's what this course offers. My hope is that you will find explanations that resonate with you and that make everything click. 🧠💡
0 replies · 0 reposts · 1 like · 65 views
Juan Carlos Olano@jcolano·
🔍 Why my course? I believe in learning from multiple viewpoints. Each explanation sheds new light on a concept, and that's what I bring to you – diverse perspectives in one comprehensive course.
1 reply · 0 reposts · 1 like · 121 views
Juan Carlos Olano@jcolano·
That is exactly why I've created this course. While creating it, I tried to explain each topic using examples of similar concepts to make each component easier to understand.
1 reply · 0 reposts · 1 like · 42 views
Juan Carlos Olano@jcolano·
🚀 I created [one more] course on Transformers: "The Transformer Layer By Layer"! 🎓 In my process of learning artificial intelligence, I've realized that mastering the Transformer model is both fascinating and challenging.
1 reply · 0 reposts · 1 like · 59 views
Juan Carlos Olano@jcolano·
To build an intuition for the different parts of the transformer, I've seen each concept explained numerous times, each time from a new perspective. And I know firsthand that it's not easy to grasp these complex ideas at first glance.
1 reply · 0 reposts · 1 like · 43 views
Juan Carlos Olano@jcolano·
I finished training 3 tiny models of 13.5MM parameters each, using specialized datasets of 5MM tokens each. Each dataset is very focused on one specific topic: Biology, Philosophy, and Greek classics. Their output is incredible.
1 reply · 0 reposts · 3 likes · 63 views
Juan Carlos Olano@jcolano·
@GregKamradt I will now repeat your experiment using your code base, again switching from LangChain to the Claude API. Will share results shortly.
0 replies · 0 reposts · 0 likes · 59 views
Greg Kamradt@GregKamradt·
Claude 2.1 (200K Tokens) - Pressure Testing Long Context Recall

We all love increasing context lengths - but what's performance like? Anthropic reached out with early access to Claude 2.1, so I repeated the "needle in a haystack" analysis I did on GPT-4. Here's what I found:

Findings:
* At 200K tokens (nearly 470 pages), Claude 2.1 was able to recall facts at some document depths
* Facts at the very top and very bottom of the document were recalled with nearly 100% accuracy
* Facts positioned at the top of the document were recalled with lower performance than the bottom (similar to GPT-4)
* Starting at ~90K tokens, recall performance at the bottom of the document got increasingly worse
* Performance at low context lengths was not guaranteed

So what:
* Prompt Engineering Matters - It's worth tinkering with your prompt and running A/B tests to measure retrieval accuracy
* No Guarantees - Your facts are not guaranteed to be retrieved. Don't bake the assumption that they will into your applications
* Less context = more accuracy - This is well known, but when possible reduce the amount of context you send to the models to increase their ability to recall
* Position Matters - Also well known, but facts placed at the very beginning and in the 2nd half of the document seem to be recalled better

Why run this test?:
* I'm a big fan of Anthropic! They are helping to push the bounds on LLM performance and creating powerful tools for the world
* As a practitioner of LLMs, it's important to build an intuition for how they work, where they excel, and what their limits are
* Tests like these, while not bulletproof, help showcase real-world examples and give a feeling for how the models work. The goal is to transfer this knowledge to productive use cases

Overview of the process:
* Use Paul Graham essays as 'background' tokens. With 218 essays it's easy to get up to 200K tokens (repeating essays when necessary)
* Place a random statement within the document at various depths. Fact used: "The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day."
* Ask Claude 2.1 to answer this question only using the context provided
* Evaluate Claude 2.1's answer with GPT-4 using @langchain evals
* Rinse and repeat for 35x document depths between 0% (top of document) and 100% (bottom of document) (sigmoid distribution) and 35x context lengths (1K tokens > 200K tokens)

Next Steps To Take This Further:
* For rigor, one should do a key:value retrieval step. However, for relatability I used a San Francisco line within PG's essays for clarity and practical relevance
* Repeat the test multiple times for increased statistical significance

Notes:
* Amount Of Recall Matters - The model's performance is hypothesized to diminish when tasked with multiple fact retrievals or when engaging in synthetic reasoning steps
* Changing your prompt, question, fact to be retrieved, and background context will impact performance
* The Anthropic team reached out and offered credits to repeat this test. They also offered prompt advice to maximize performance. It's important to clarify that their involvement was strictly logistical. The integrity and independence of the results were maintained, ensuring that the findings reflect my unbiased evaluation and are not influenced by their support.
* This test cost ~$1,016 for API calls ($8 per million tokens)
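The needle-placement step of the process above can be sketched in a few lines of Python. This is a minimal illustration under my own naming (`insert_needle` is not from Greg's code base), which drops the fact at a chosen depth and snaps to a sentence boundary so it reads naturally:

```python
def insert_needle(background: str, needle: str, depth_pct: float) -> str:
    """Place `needle` at roughly depth_pct (0.0 = top, 1.0 = bottom)
    of the background text, snapping to the next sentence boundary."""
    pos = int(len(background) * depth_pct)
    cut = background.find(". ", pos)
    cut = len(background) if cut == -1 else cut + 2
    return background[:cut] + needle + " " + background[cut:]

NEEDLE = ("The best thing to do in San Francisco is eat a sandwich "
          "and sit in Dolores Park on a sunny day.")

# The real test uses Paul Graham essays trimmed to the target token count;
# a toy string stands in here.
haystack = insert_needle("Background sentence. " * 100, NEEDLE, 0.5)
```

The haystack is then sent to the model with an instruction to answer the San Francisco question using only the provided context, and a second model grades the answer.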
Greg Kamradt tweet media
158 replies · 541 reposts · 3K likes · 1.2M views
Juan Carlos Olano@jcolano·
@GregKamradt In your previous experiment with GPT-4 you got low performance at certain depths and context sizes. I repeated your experiment using your exact code and dataset, but instead of using LangChain I used the API directly, and got a 100% retrieval rate at all levels of context and depth.
0 replies · 0 reposts · 0 likes · 97 views
Jerry Liu@jerryjliu0·
How well do long-context LLMs (gpt-4-turbo, claude-2) recall specifics in BIG documents? (>= 250k tokens)

Inspired by @GregKamradt's work on stress-testing gpt-4 128k, we extended this by stress-testing gpt-4/Claude on even bigger documents that overflow the context window, *without* retrieval. This is especially important for summarization tasks, which inherently have to ingest large context, as opposed to simple QA.

Methodology: We hid "Jerry likes Hot Cheetos" 🍟🌶️ at different positions in the 2021 Uber 10-K (290k tokens). We used response synthesis strategies in @llama_index to synthesize over large context (create and refine, tree summarize aka map-reduce). We used our @llama_index evals to decide whether the answer was correct.

Core discoveries 🧑‍🔬:
💡 claude-2 doesn't do well with long response synthesis in general (it ran into rate-limit errors for tree summarize).
💡 gpt-4-turbo does decently with "create and refine" if the context is at the beginning or end (of the document, not the context window!). But it fails in the middle.
💡 Neither model seems to do well with tree-summarization / map-reduce style strategies.

The main finding here is that large-scale summarization/analysis with current long-context LLMs is still a work in progress. There are still many issues with the LLM dropping context, and it may require prompt engineering to get right. Check out our guide and the diagram below for an illustration: github.com/run-llama/llam…
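The "tree summarize" (map-reduce) strategy mentioned above can be sketched generically, without @llama_index. This is a hedged illustration under my own naming; in practice `summarize` would wrap an actual LLM call:

```python
def tree_summarize(chunks, summarize, fanout=4):
    """Map-reduce response synthesis: summarize groups of chunks,
    then summarize the summaries, until one answer remains."""
    while len(chunks) > 1:
        chunks = [
            summarize("\n".join(chunks[i:i + fanout]))
            for i in range(0, len(chunks), fanout)
        ]
    return chunks[0]
```

Because each reduce step re-compresses earlier summaries, a fact buried mid-document can be dropped at any level of the tree - consistent with the failures observed above. The "create and refine" alternative instead feeds chunks sequentially, asking the model to refine a running answer.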
Jerry Liu tweet media
19 replies · 76 reposts · 498 likes · 140K views
Greg Kamradt@GregKamradt·
Pressure Testing GPT-4-128K With Long Context Recall

128K tokens of context is awesome - but what's performance like? I wanted to find out, so I did a "needle in a haystack" analysis. Some expected (and unexpected) results. Here's what I found:

Findings:
* GPT-4's recall performance started to degrade above 73K tokens
* Low recall performance was correlated with the fact to be recalled being placed at 7%-50% document depth
* If the fact was at the beginning of the document, it was recalled regardless of context length

So what:
* No Guarantees - Your facts are not guaranteed to be retrieved. Don't bake the assumption that they will into your applications
* Less context = more accuracy - This is well known, but when possible reduce the amount of context you send to GPT-4 to increase its ability to recall
* Position matters - Also well known, but facts placed at the very beginning and in the 2nd half of the document seem to be recalled better

Overview of the process:
* Use Paul Graham essays as 'background' tokens. With 218 essays it's easy to get up to 128K tokens
* Place a random statement within the document at various depths. Fact used: "The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day."
* Ask GPT-4 to answer this question only using the context provided
* Evaluate GPT-4's answer with another model (gpt-4 again) using @langchain evals
* Rinse and repeat for 15x document depths between 0% (top of document) and 100% (bottom of document) and 15x context lengths (1K tokens > 128K tokens)

Next Steps To Take This Further:
* Iterations of this analysis were evenly distributed; it's been suggested that a sigmoid distribution would be better (it would tease out more nuance at the start and end of the document)
* For rigor, one should do a key:value retrieval step. However, for relatability I used a San Francisco line within PG's essays.

Notes:
* While I think this will be directionally correct, more testing is needed to get a firmer grip on GPT-4's abilities
* Switching up the prompt yields varying results
* 2x tests were run at large context lengths to tease out more performance
* This test cost ~$200 for API calls (a single call at 128K input tokens costs $1.28)
* Thank you to @charles_irl for being a sounding board and providing great next steps
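The sigmoid depth distribution suggested above (and used in the later Claude 2.1 run) can be generated like this - a sketch under my own naming, not Greg's actual code. Applying a logistic curve to an even grid compresses points toward 0 and 1, so recall is probed more densely at the start and end of the document:

```python
import math

def sigmoid_depths(n: int, steepness: float = 6.0) -> list[float]:
    """Return n document-depth fractions in [0, 1], clustered near the
    ends, rescaled so the first is exactly 0 and the last exactly 1."""
    xs = [i / (n - 1) for i in range(n)]
    raw = [1 / (1 + math.exp(-steepness * (x - 0.5))) for x in xs]
    lo, hi = raw[0], raw[-1]
    return [(r - lo) / (hi - lo) for r in raw]

depths = sigmoid_depths(35)  # 35x depths, as in the Claude 2.1 test
```

Each depth is then paired with each tested context length, giving the full grid of (depth, length) runs.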
Greg Kamradt tweet media
202 replies · 608 reposts · 3.8K likes · 1.5M views