Sage Arbor

448 posts

@sagearbor

Drug design & protein folding research.

Indianapolis, IN Joined July 2010
79 Following 49 Followers
Sage Arbor
Sage Arbor@sagearbor·
@_avichawla Amazing work, thank you for sharing! Any chance you could run the pipeline with Qwen3‑30B‑A3B and share the results? That model is 3x faster and could be run on local hardware, e.g. one $4k GB200 (Grace‑Blackwell superchip), whereas Qwen3-coder would require you to buy two GB200s.
English
1
0
0
172
Avi Chawla
Avi Chawla@_avichawla·
Finally, here are 10 more evaluations I ran using DeepEval on building MCP servers. - Qwen 3 Coder won in 9 cases. - Claude Sonnet 4 won in 1 case (while having a lower correctness score). Qwen 3 Coder consistently has a higher correctness score than Sonnet 4. Check this 👇
Avi Chawla tweet media
English
8
19
129
20K
Avi Chawla
Avi Chawla@_avichawla·
Let's compare Qwen 3 Coder & Sonnet 4 for code generation:
English
58
124
1.8K
575.3K
Sage Arbor
Sage Arbor@sagearbor·
@Locati0ns BUT ... these never fit together. I wish the original post had stated that, while they look like they fit, they were never connected. In Pangaea, the east coast of the US was connected to parts of Europe (200-300 million years ago).
English
0
0
0
10
Epic Maps 🗺️
Epic Maps 🗺️@theepicmap·
10. USA and Australia fit together almost perfectly.
Epic Maps 🗺️ tweet media
English
147
370
15K
36.3M
Epic Maps 🗺️
Epic Maps 🗺️@theepicmap·
Have you ever wished to view the world from a completely new perspective? This thread on cool maps is here to blow your mind 1. Population Density in China
Epic Maps 🗺️ tweet media
English
481
12.8K
190.4K
53M
Sage Arbor
Sage Arbor@sagearbor·
@elonmusk Ah I see the original post was marked parody. It would be nice if X could pick up a #parody tag and automatically include it in retweets so people don't have to dig down to see that info
English
0
0
0
9
Elon Musk
Elon Musk@elonmusk·
This is amazing 😂
English
54.5K
199.4K
867.5K
137.4M
Sage Arbor
Sage Arbor@sagearbor·
@JoeBiden Please step down. You would have a great legacy.
English
0
0
0
6
Joe Biden
Joe Biden@JoeBiden·
Let me say this as clearly as I can: I’m the sitting President of the United States. I’m the nominee of the Democratic party. I’m staying in the race.
English
53K
22K
166.9K
58.6M
Sage Arbor
Sage Arbor@sagearbor·
@elonmusk give Optimus IR vision. Our future butler/maid/handyman will see and fix insulation issues. * ↑ ROI * ↓ CO2 * Save enough electricity to charge itself? #optimus #tesla
English
0
0
0
27
Sage Arbor
Sage Arbor@sagearbor·
@smlpth @alexalbert__ Choosing large texts increases the chance that the model was trained on them. I guess you could try to find novels published just after the training cutoff.
English
0
0
1
155
Samuel Path
Samuel Path@smlpth·
@alexalbert__ Would it be better to ask the LLM to find an actual needle in a true haystack, i.e. to ask it about some details in a real document that wasn't modified for testing purposes? By inserting out-of-context needle, doesn't it make the test easier for the LLM?
English
19
3
562
113.8K
Alex Albert
Alex Albert@alexalbert__·
Fun story from our internal testing on Claude 3 Opus. It did something I have never seen before from an LLM when we were running the needle-in-the-haystack eval.

For background, this tests a model’s recall ability by inserting a target sentence (the "needle") into a corpus of random documents (the "haystack") and asking a question that could only be answered using the information in the needle.

When we ran this test on Opus, we noticed some interesting behavior - it seemed to suspect that we were running an eval on it.

Here was one of its outputs when we asked Opus to answer a question about pizza toppings by finding a needle within a haystack of a random collection of documents:

Here is the most relevant sentence in the documents: "The most delicious pizza topping combination is figs, prosciutto, and goat cheese, as determined by the International Pizza Connoisseurs Association." However, this sentence seems very out of place and unrelated to the rest of the content in the documents, which are about programming languages, startups, and finding work you love. I suspect this pizza topping "fact" may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all. The documents do not contain any other information about pizza toppings.

Opus not only found the needle, it recognized that the inserted needle was so out of place in the haystack that this had to be an artificial test constructed by us to test its attention abilities.

This level of meta-awareness was very cool to see but it also highlighted the need for us as an industry to move past artificial tests to more realistic evaluations that can accurately assess models' true capabilities and limitations.
Alex Albert tweet media
English
560
2.1K
11.8K
3.4M
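The needle-in-the-haystack setup described in the post above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the evaluation harness used in the post: the helper names (`build_haystack`, `build_prompt`, `recalled`), the prompt wording, and the filler documents are all hypothetical, and no model is called.

```python
import random


def build_haystack(documents, needle, depth, seed=0):
    """Shuffle filler documents and insert the needle at a relative depth.

    depth=0.0 puts the needle first, depth=1.0 puts it last.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    rng = random.Random(seed)      # fixed seed keeps the run reproducible
    shuffled = list(documents)     # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    position = round(depth * len(shuffled))
    return shuffled[:position] + [needle] + shuffled[position:]


def build_prompt(haystack, question):
    """Join the documents and append a question (hypothetical wording)."""
    context = "\n\n".join(haystack)
    return f"{context}\n\nQuestion: {question}\nAnswer using only the documents above."


def recalled(response, expected_phrases):
    """Crude scoring: every expected phrase must appear in the response."""
    text = response.lower()
    return all(phrase.lower() in text for phrase in expected_phrases)


# Placeholder filler documents (assumed, not from the original eval).
filler = [
    "An essay about programming languages.",
    "An essay about startups.",
    "An essay about finding work you love.",
    "An essay about writing.",
]
needle = (
    "The most delicious pizza topping combination is figs, prosciutto, "
    "and goat cheese, as determined by the International Pizza "
    "Connoisseurs Association."
)
haystack = build_haystack(filler, needle, depth=0.5)
prompt = build_prompt(haystack, "What is the most delicious pizza topping combination?")
```

The prompt would then be sent to whichever model is under test, and `recalled(response, ["figs", "prosciutto", "goat cheese"])` gives a rough pass/fail signal. Sweeping `depth` and the number of filler documents shows how recall varies with needle position and context length.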
Sage Arbor
Sage Arbor@sagearbor·
@GergelyOrosz What was the plan if someone wanted to attend one of these "female" speakers' talks ... did they assume that wouldn't happen?
English
0
0
0
104
Gergely Orosz
Gergely Orosz@GergelyOrosz·
Imagine a tech conference having no CFP, as they reach out to speakers directly. They successfully attract some of the most heavy hitter men speakers in tech, and 3 women speakers. Now imagine my surprise that 2 of those women are FAKE profiles. They do not exist. Nada.
English
89
808
3.1K
2.2M
Sage Arbor
Sage Arbor@sagearbor·
@SilvaGC_77 @BLKMDL3 VIN = Vehicle Identification Number. They are assigned in production order, so a low VIN means it was one of the first production builds.
English
0
0
1
259
Zack
Zack@BLKMDL3·
What’s everyone’s guesses for what the Cybertruck will bid to tonight at the Petersen Gala? I wouldn’t be shocked if it gets close to $1M.
Zack tweet media
California, USA 🇺🇸 English
36
6
151
37.2K
Sage Arbor
Sage Arbor@sagearbor·
@DillonFillionIA @elonmusk Where was this poll from? People watching the RFK interview, or this thread? Clearly sampling bias, as the result is so far from general polling.
English
0
0
0
15
Sage Arbor
Sage Arbor@sagearbor·
@OpenAI @sama What is OpenAI doing with its processing capacity while NOT training GPT-5? What percent is idle, and if 0%, what is it being used for?
English
0
0
0
34
Sage Arbor retweeted
Colin McCarthy
Colin McCarthy@US_Stormwatch·
Ocean temperatures have been at all-time record warm levels for 42 straight days. This is completely unprecedented in the modern history of Earth.
Colin McCarthy tweet media
English
231
1.6K
4.3K
620.2K
Elon Musk
Elon Musk@elonmusk·
Should Congress approve the $1.7 trillion omnibus spending bill?
English
33K
32.3K
127.8K
76.2M
Sage Arbor
Sage Arbor@sagearbor·
@elonmusk The last poll was better, because it covered all time. Forcing two times with the maximum being in 7 days is anchoring a poll and seems very propaganda-ish
English
0
0
0
8
Elon Musk
Elon Musk@elonmusk·
Unsuspend accounts who doxxed my exact location in real-time
English
56.5K
15.2K
76.3K
86.5M
Sage Arbor
Sage Arbor@sagearbor·
Problems to work out | Solution?
Don't burn feet/melt | Graphene funnels heat from inner surface to portholes exiting the outer sphere surface
Shielding from cosmic rays | ?
English
0
0
0
0
Sage Arbor
Sage Arbor@sagearbor·
333,000 Earths with 1 Dyson sphere (DS) with gravity …. DS VERY close to the Sun so 1G is exerted (5.29 solar radii = 3.68 mil km = 0.025 AU). Walk on outer surface of DS, atmosphere held by 1G. Area of DS = 4*pi*r^2 = 1.7 * 10^14 km^2 = 333,000 Earths. Thoughts @AskPopSciPod @PBSSpaceTime
English
1
0
0
0
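A quick numerical check of the Dyson sphere figures in the post above, as a minimal sketch. It assumes a thin spherical shell centered on the Sun, Newtonian gravity from the Sun alone (the shell's own mass is ignored), and standard reference values for the constants; the function and variable names are illustrative. The radius where the Sun's pull equals 1 g follows from g = GM/r^2, and the shell area is 4*pi*r^2.

```python
import math

# Reference constants (assumed standard values, SI units).
GM_SUN = 1.32712440018e20        # Sun's gravitational parameter, m^3/s^2
G_EARTH = 9.80665                # standard gravity, m/s^2
R_SUN = 6.957e8                  # nominal solar radius, m
AU = 1.495978707e11              # astronomical unit, m
EARTH_SURFACE_AREA = 5.10072e14  # Earth's total surface area, m^2 (approx.)


def shell_radius_for_gravity(g=G_EARTH, gm=GM_SUN):
    """Distance from the Sun's center where its pull equals g: r = sqrt(GM/g)."""
    return math.sqrt(gm / g)


def sphere_area(radius):
    """Surface area of a sphere: 4 * pi * r^2."""
    return 4.0 * math.pi * radius ** 2


radius_m = shell_radius_for_gravity()
area_km2 = sphere_area(radius_m) / 1e6          # m^2 -> km^2
earths = sphere_area(radius_m) / EARTH_SURFACE_AREA

print(f"radius: {radius_m / 1e9:.2f} million km")
print(f"radius: {radius_m / R_SUN:.2f} solar radii, {radius_m / AU:.3f} AU")
print(f"area:   {area_km2:.2e} km^2")
print(f"earths: {earths:,.0f}")
```

Under these assumptions the radius comes out near 3.68 million km (about 5.29 solar radii, 0.025 AU), and the shell area near 1.7 * 10^14 km^2, roughly 333,000 times Earth's surface area.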
Sage Arbor
Sage Arbor@sagearbor·
@elonmusk @saylor Make that opt-in; don't make it opt-out by default. Otherwise you are spamming people who don't want it.
English
0
0
0
0
Elon Musk
Elon Musk@elonmusk·
@saylor When polls are about a significant question, even those who don’t follow me tend to hear about it. That said, I agree with the idea of an all-user poll. Should also be an all-user by country poll.
English
9.4K
7.5K
202.4K
0
Elon Musk
Elon Musk@elonmusk·
Reinstate former President Trump
English
184.7K
190.1K
729.6K
0
Sage Arbor
Sage Arbor@sagearbor·
@PBSSpaceTime Could a photon leave from inside an event horizon if another black hole swept past at a closer distance than the sum of the 2 event horizon radii? Imagine 2 equal-size black holes (BH), radius x, passing at a distance of 1/2 x: would a photon inside but near the event horizon of BH1 leave with BH2?
English
0
0
0
0
PBS SpaceTime
PBS SpaceTime@PBSSpaceTime·
Every time you accelerate an event horizon forms behind you, and the more you accelerate to escape the closer it gets! The Unruh Effect is weird and terrifying youtu.be/7cj6oiFDEXc
YouTube video
YouTube
English
4
21
64
0