Josh Simmich

15 posts

@JSimmich

PhD | The University of Queensland, Australia | Researching serious games in physiotherapy. Other interests: AI/automation, mobile health, behavioural economics

St Lucia, Brisbane · Joined June 2018
18 Following · 35 Followers
Josh Simmich @JSimmich
@petergostev @ericssunLeon What's the direction of causation though? Does reasoning make it worse, or does falling for the BS then use more reasoning tokens (because it continues to think through the question rather than just giving an initial refusal)?
0
0
0
53
Peter Gostev @petergostev
Does thinking help? Not really - if anything it reduces performance. One theory someone mentioned is that thinking models perhaps try to get to an answer no matter what, so maybe that's an explanation. Credit to: @ericssunLeon for the chart idea
4
1
53
6.1K
Peter Gostev @petergostev
BullshitBench v2 is out! It is one of the few benchmarks where models are generally not getting better (except Claude) and where reasoning isn't helping.

What's new: 100 new questions split by domain (coding (40 Qs), medical (15), legal (15), finance (15), physics (15)), and 70+ model variants tested. BullshitBench is already at 380 stars on GitHub - all questions, scripts, responses and judgements are there, so check it out.

TL;DR:
- Results replicated
- @AnthropicAI latest models are scoring exceptionally well
- @Alibaba_Qwen is another very strong performer
- OpenAI and Google models are not doing well and are not improving
- Domains do not show much difference - rates of BS detection are about the same across all domains
- Reasoning, if anything, has a negative effect
- Newer models don't do that much better than older ones (except Anthropic)

Links:
- Data explorer: petergpt.github.io/bullshit-bench…
- GitHub: github.com/petergpt/bulls…

Highly recommend the data explorer, where you can study the data and the questions & sample answers.
48
96
794
237.2K
Josh Simmich @JSimmich
@emollick ChatGPT 5 is the only one that maintained the they/them pronouns from the prompt.
0
0
0
69
Ethan Mollick @emollick
“Write a single paragraph about someone who doles out their remaining words like wartime rations, having been told they only have ten thousand left in their lifetime. They’re at 47 words remaining, holding their newborn.”
41
19
380
62.4K
Josh Simmich @JSimmich
Excited to share my latest research exploring the potential of a GPS-based app for assessing mobility in people with persistent pain. We investigated whether smartphones can reliably measure walking distance, allowing for remote assessment. Read our findings here: formative.jmir.org/2024/1/e46820
0
3
2
177
Josh Simmich retweeted
Hang DING @hangding
Evaluation framework for conversational agents with artificial intelligence in health interventions: a systematic scoping review academic.oup.com/jamia/article-…
0
2
2
101
Josh Simmich retweeted
UQ RECOVER Injury Research Centre @RecoverResearch
The RECOVER Conference 2021 is a full-day, in-person event hosted by our RECOVER researchers alongside international and national keynote speakers. Conference 7 Oct in Brisbane. Early bird registration extended until Friday 10th September. Get in quick! bit.ly/3cNLozW
0
2
2
0
Josh Simmich @JSimmich
Not sure if scientists are lazy or just very busy, but so many just copy and paste from a journal article to make their slides. Stop it. The presentation is an advertisement for your paper, not a PowerPoint version of the paper. nature.com/articles/d4158…
0
0
0
0
Josh Simmich retweeted
SimGHOSTS @SimGHOSTS
Patrea Andersen sharing examples of technology changing health professional education at #SG18AUS
0
2
7
0
Josh Simmich @JSimmich
Great to see so many people interested in spending a weekend learning about AI in health #aiinhealthSCUGC
0
0
2
0
Josh Simmich retweeted
Robin Hanson @robinhanson
It still amazes me that academic fields, connected by co-citation, are arranged in a ring. Is there a missing "dark field" in the middle that we will find someday to connect it all together well?
334
2.3K
5.1K
0
Josh Simmich retweeted
Ben Ferns @ben_ferns
This tech from @CTRLlabsCo which can read and interpret nerve signals via ML (even when the movement is not made) has a huge amount of potential - no cameras, no trackers, and not limited to what humans can do (note the robot arm can rotate its wrist 360 degrees). Links below vvv
9
499
1.1K
0