Pinned Tweet
Patrick
9K posts


@monster0x0000 @0necontroller There are. There is Vosk, etc., and these days with OpenAI Whisper and AI tools you can get really good speech recognition. That said, I feel like it's an open problem, especially for the Swahili language... and even more so over poor networks. That might require custom models.

@Kirembu @0necontroller I'm not into ML/AI/Mobile, but I am thinking this should be relatively straightforward automatic speech recognition, and there should be some free libraries online.

@monster0x0000 @0necontroller Lol, good point. I'm not implementing the whole microphone array thing... Was hoping to learn how speech detection works. In some fintech areas saying the word "Paybill" should be as loud as a gun going off 😅

@Kirembu @0necontroller Why would a virtual IT department app need a microphone array?

@akinyi__wendy @CollinceBecky Initially used Grok, then switched to Claude 4.6 to verify and fix what I missed. Here is the corrected Colab notebook. Let me know if it makes sense or if there's something I've missed. I'm not an ML Expert so feedback is welcome!
colab.research.google.com/drive/1VhFmtjv…

@CollinceBecky @Kirembu I don't know what AI has generated for you. I can't comment on what I don't know.

@monster0x0000 @0necontroller That's my next rabbit hole actually. I'm implementing this on my Android App play.google.com/store/apps/det…

@0necontroller @Kirembu For any sensible triangulation to be done, you would need a tonne of phones close to where the shots are fired, with their microphones always on (not optimal). Most acoustic detection companies in the US rely on large arrays of microphones mounted on street lights.

@CollinceBecky @akinyi__wendy Agreed on the shared kernel issue, fixed in the corrected implementation. Each channel now has independent parameters so gradients can specialise them. I'm not an ML Expert so I had some help here.
All addressed here: colab.research.google.com/drive/1VhFmtjv…
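
To make the shared-versus-independent kernel point concrete, here is a minimal pure-Python sketch. It is not the notebook's code: `depthwise_conv2d`, `random_grid`, `count_params`, and the 4-channel 8x8 input are all hypothetical names and sizes chosen for illustration, and a plain 3x3 kernel stands in for whatever spiral parameterisation the notebook uses.

```python
import random

def depthwise_conv2d(x, kernels):
    """Valid-mode depthwise cross-correlation on nested lists.

    x:       C channels, each an H x W grid
    kernels: either C separate k x k kernels (independent parameters)
             or a single k x k kernel in a one-element list (shared)
    """
    channels = len(x)
    if len(kernels) == 1:
        # Shared case: every channel reuses the same parameters.
        kernels = kernels * channels
    k = len(kernels[0])
    height, width = len(x[0]), len(x[0][0])
    out = []
    for c in range(channels):
        rows = []
        for i in range(height - k + 1):
            row = []
            for j in range(width - k + 1):
                acc = 0.0
                for a in range(k):
                    for b in range(k):
                        acc += x[c][i + a][j + b] * kernels[c][a][b]
                row.append(acc)
            rows.append(row)
        out.append(rows)
    return out

def random_grid(rng, rows, cols):
    """Hypothetical helper: a rows x cols grid of Gaussian samples."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def count_params(kernels):
    """Number of scalar weights held by a list of kernels."""
    return sum(len(row) for kernel in kernels for row in kernel)

rng = random.Random(0)
x = [random_grid(rng, 8, 8) for _ in range(4)]            # 4 channels, 8x8 each
shared = [random_grid(rng, 3, 3)]                         # one kernel for all channels
independent = [random_grid(rng, 3, 3) for _ in range(4)]  # one kernel per channel

y_shared = depthwise_conv2d(x, shared)
y_independent = depthwise_conv2d(x, independent)

print(count_params(shared), count_params(independent))  # prints "9 36"
```

With a shared kernel, the gradient reaching those 9 weights is the sum of the contributions from all channels, so the channels cannot learn different filters. With one kernel per channel, each set of weights only receives the gradient from its own channel, which is what lets them specialise.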

@StephenR96828 Agree the results were disappointing, but this open discourse is useful. Critical reproductions add value when aimed at improving research and science. Thanks for sharing your experience.

@Kirembu Btw, good work on the notebook. I actually have her model 😂 drama central. Your hypothesis about the accuracy being low is correct. It does trigger on random sounds and labels them as a bomb or AK-47 when it's just a video of a Glock. Would write an article, but that would be mean.

@StephenR96828 Yep, fair point. Reinventing (or independently rediscovering) ideas happens all the time in research. The parametric spiral kernel is still a nice touch. For Science! 🚀

@Kirembu There's a SpiralNet++ algorithm library, so it's not plagiarism, just reinventing the wheel.

@TiskTusk Updated with a deeper MobileNetV2-style version using spiral kernels.
Colab: colab.research.google.com/drive/1aDEZN_C…
Results (1000 CIFAR-10 images, 40 epochs):
• SKCNN-Deep: 288k params → 14% acc (unstable)
• MobileNetV2: 2.24M params → 25% (peaked 28.5%)
Feedback welcome!

This paper is absolute comedy once you stare at it for a bit. She mentions the architecture having fewer parameters than a ResNet and then proceeds to mention that it is based on a MobileNet model (so there shouldn't be any bragging about efficiency here; you use MobileNets for constrained compute). Then move on to this section with big claims and see this joke of an "architecture design" section. You can probably brute-force your way through it, though, using intelligent guesses.

Samaritan @0_samaritan
1/n Hi. I am assuming you are roughly my age. I have read quite a number of research papers on ML and DL. I am also quite familiar with how research works and what it takes to invent a new theory or architecture in ML. Given the things you have listed to have accomplished


