Cor-Paul Bezemer (@corpaul) - Twitter Profile

Very cool! A preprint of our work is available at asgaard.ece.ualberta.ca/papers/Journal…

The Game Accessibility Conference@GA_Conf

And one more! The finalists for Best Academic Research are Vishnu Nair, Ian Gauk, Kathrin Gerling, Anna Chen, and Jesse J Martinez. This category sits outside the usual voting process, but you can find links to their papers at gaconf.com/gaconf-awards-… #gamedev #accessibility

Cor-Paul Bezemer retweeted

taesiri@taesiri·27 Feb

Happy to announce that GlitchBench has been accepted to #CVPR2024🎉 twitter.com/_akhaliq/statu…

AK@_akhaliq

GlitchBench: Can large multimodal models detect video game glitches? paper page: huggingface.co/papers/2312.05… Large multimodal models (LMMs) have evolved from large language models (LLMs) to integrate multiple input modalities, such as visual inputs. This integration augments the capacity of LLMs for tasks requiring visual comprehension and reasoning. However, the extent and limitations of their enhanced abilities are not fully understood, especially when it comes to real-world tasks. To address this gap, we introduce GlitchBench, a novel benchmark derived from video game quality assurance tasks, to test and evaluate the reasoning capabilities of LMMs. Our benchmark is curated from a variety of unusual and glitched scenarios from video games and aims to challenge both the visual and linguistic reasoning powers of LMMs in detecting and interpreting out-of-the-ordinary events. We evaluate multiple state-of-the-art LMMs, and we show that GlitchBench presents a new challenge for these models.

Cor-Paul Bezemer retweeted

AK@_akhaliq·12 Dec

GlitchBench: Can large multimodal models detect video game glitches? paper page: huggingface.co/papers/2312.05… Large multimodal models (LMMs) have evolved from large language models (LLMs) to integrate multiple input modalities, such as visual inputs. This integration augments the capacity of LLMs for tasks requiring visual comprehension and reasoning. However, the extent and limitations of their enhanced abilities are not fully understood, especially when it comes to real-world tasks. To address this gap, we introduce GlitchBench, a novel benchmark derived from video game quality assurance tasks, to test and evaluate the reasoning capabilities of LMMs. Our benchmark is curated from a variety of unusual and glitched scenarios from video games and aims to challenge both the visual and linguistic reasoning powers of LMMs in detecting and interpreting out-of-the-ordinary events. We evaluate multiple state-of-the-art LMMs, and we show that GlitchBench presents a new challenge for these models.

Cor-Paul Bezemer retweeted

taesiri@taesiri·12 Dec

Excited to share GlitchBench! 🚀 It is a new benchmark designed specifically for large multimodal models. GlitchBench sets a new standard by incorporating tasks from actual game quality assurance scenarios 🎮, bringing real-world challenges into focus. #AI #MachineLearning #GameDev ArXiv: arxiv.org/abs/2312.05291 Project Website: glitchbench.github.io Hugging Face 🤗 Dataset: huggingface.co/datasets/glitc… Leaderboard 🏆: huggingface.co/spaces/glitchb…

Cor-Paul Bezemer retweeted

Anh Totti Nguyen@anh_ng8·18 Jun

How to score > 90% on ImageNet? Our new study on the spatial biases of ImageNet and relevant ImageNet-scale, OOD benchmarks reveals that all common image classifiers tested can score > 90%, if the model looks at the correct crop, i.e., ⭐️ Zoom 🔎 is all you need! ⭐️ 1/n

Cor-Paul Bezemer retweeted

The ASGAARD Lab@asgaard_lab·29 Aug

Super congrats to @finlaymacklon, @taesiri, Stefan and @viggiato who had their paper "Automatically Detecting Visual Bugs in HTML5 <canvas> Games" accepted at @ASE_conf! Preprint available at asgaard.ece.ualberta.ca/automatically-… (collaboration with @ProdigyGame)

Cor-Paul Bezemer retweeted

Philipp Leitner (@xLeitix@discuss.systems)@xLeitix·4 May

Ad for our teaching professor position is now out: web103.reachmee.com/ext/I005/1035/… 100%, permanent from start, min. 25% of worktime reserved for research (can be increased with grants). Hit me up if you want to know more, and RTs appreciated ;)

Philipp Leitner (@xLeitix@discuss.systems)@xLeitix

My division at @cse_gbg is hiring 1-2 Assistant / Associate Teaching Professors in Software Engineering and/or Interaction Design. Positions will be 100% & permanent from start.

Cor-Paul Bezemer retweeted

The ASGAARD Lab@asgaard_lab·22 Apr

@viggiato's paper “Identifying Similar Test Cases That Are Specified in Natural Language” was accepted for publication in the Transactions on Software Engineering journal! Preprint available at asgaard.ece.ualberta.ca/identifying-si… (with @ProdigyGame)

Cor-Paul Bezemer@corpaul·12 Apr

Happening today in less than one hour!

David Daly@DavidDaly44

Tomorrow brings day 2 of @ICPEconf #ICPE2022 and a session I've been looking forward to: The #DataChallenge! Organized by @corpaul, @swy351, and myself, and using a dataset from @MongoDB, we invited participants to do something cool with the dataset. And they have! /1

Cor-Paul Bezemer retweeted

Philipp Leitner (@xLeitix@discuss.systems)@xLeitix·31 Mar

Give yourself an early Easter present and join us at ICPE! It’s free of charge and we have great speakers (keynote and otherwise :) )!

SPEC@spec_perf

Very excited to see the #ICPE2022 keynote from @Google's John Wilkes on "Building Warehouse-Scale Computers," taking place at the 4/9-4/13 virtual conference. Learn more about this year's keynotes at: ow.ly/U0Nc50IrGb0 Don't forget to register - it's free!

Cor-Paul Bezemer retweeted

Simon Eismann@simon_eismann·31 Mar

Ever wondered how the performance of #Serverless applications changes WITHOUT code changes? Over 10 months, we observed significant changes on #AWS in our @JSSoftware paper "A case study on the stability of performance tests for serverless applications" bit.ly/38kG5ZH

Cor-Paul Bezemer retweeted

Simon Eismann@simon_eismann·31 Mar

This paper is the result of a collaboration within the @spec_perf research group 'Devops Performance', together with Lizhi Liao, @DiegoEliasCosta, @andrevanhoorn, @corpaul, @swy351, and @skounev

Cor-Paul Bezemer retweeted

The ASGAARD Lab@asgaard_lab·24 Mar

@taesiri and Finlay's paper "CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning" was accepted at @msrconf ! Preprint available at asgaard.ece.ualberta.ca/clip-meets-gam…

Cor-Paul Bezemer retweeted

AK@_akhaliq·22 Mar

CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning abs: arxiv.org/abs/2203.11096 project page: asgaardlab.github.io/CLIPxGamePhysi…

Cor-Paul Bezemer retweeted

Karim Ali (كريم علي)@karimhamdanali·15 Feb

Game developers! We are running an anonymous research survey on the current practices, goals, and needs for quality assurance in #gamedev and would love your input. Over $4,000 in random draw prizes. Please spread the word! surveymonkey.com/r/gamedevtesti…

Cor-Paul Bezemer retweeted

The ASGAARD Lab@asgaard_lab·15 Feb

Finally, @HaoLi24342250's paper "An Empirical Study of Yanked Releases in the Rust Package Registry" was accepted in the @computersociety TSE journal! Preprint available @ asgaard.ece.ualberta.ca/an-empirical-s…

Cor-Paul Bezemer retweeted

The ASGAARD Lab@asgaard_lab·15 Feb

Also, (Twitter-anonymous) Mikael's paper "Studying the Performance Risks of Upgrading Docker Hub Images: A Case Study of WordPress" was accepted at ICPE 2022! Preprint available @ asgaard.ece.ualberta.ca/studying-the-p…

Cor-Paul Bezemer retweeted

The ASGAARD Lab@asgaard_lab·15 Feb

Some great news from the @asgaard_lab ! @viggiato's paper “Using Natural Language Processing Techniques to Improve Manual Test Case Descriptions” was accepted in ICSE-SEIP 2022! Preprint available @ asgaard.ece.ualberta.ca/using-natural-…

Cor-Paul Bezemer retweeted

Diego Elias Costa@DiegoEliasCosta·27 Oct

Great to see @gvwilson's summary of our work on bad practices in Java benchmarking! Work done in collaboration with @xLeitix, @corpaul, and Artur Andrzejak. :)

Cor-Paul Bezemer retweeted

The ASGAARD Lab@asgaard_lab·31 Oct

Mikael and Chloe's systematic literature survey on Applications of Generative Adversarial Networks in Anomaly Detection is available now on arXiv: arxiv.org/abs/2110.12076
