Oshan Ivantha

797 posts


@_ivantha

Researcher | Lead ML Engineer @saluslabs | RL, ML, AI, CV, Game Theory | ex @wso2 | @ucsc_lk

Sri Lanka · Joined June 2011
324 Following · 137 Followers
Oshan Ivantha (@_ivantha)
Going with snake case everywhere.
0
0
0
17
Oshan Ivantha retweeted
Andrew Ng (@AndrewYNg)
Releasing a new "Agentic Reviewer" for research papers. I started coding this as a weekend project, and @jyx_su made it much better.

I was inspired by a student who had a paper rejected 6 times over 3 years. Their feedback loop -- waiting ~6 months for feedback each time -- was painfully slow. We wanted to see if an agentic workflow can help researchers iterate faster.

When we trained the system on ICLR 2025 reviews and measured Spearman correlation (higher is better) on the test set:
- Correlation between two human reviewers: 0.41
- Correlation between AI and a human reviewer: 0.42

This suggests agentic reviewing is approaching human-level performance.

The agent grounds its feedback by searching arXiv, so it works best in fields like AI where research is freely published there.

It’s an experimental tool, but I hope it helps you with your research. Check it out here: paperreview.ai
Andrew Ng tweet media
248
1.1K
6.3K
1.1M
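For readers unfamiliar with the metric quoted in the post above, here is a minimal sketch of how a Spearman rank correlation between two sets of review scores can be computed. It is not the evaluation code behind the reported 0.41 and 0.42 figures; the function names and the score lists are hypothetical and exist only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores that two reviewers gave to the same five papers.
reviewer_a = [6, 3, 8, 5, 5]
reviewer_b = [5, 4, 8, 3, 6]
print(spearman(reviewer_a, reviewer_b))
```

Because the measure only compares rankings, two reviewers who order the papers identically score 1.0 even if one is consistently harsher than the other.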
Oshan Ivantha retweeted
Numbers.lk (@numberslka)
🇱🇰 3 Months, No Policy: E-Commerce Still in Limbo

⭕ It has been over 3 months since the government supposedly began working on a new framework for taxing and releasing e-commerce goods.
⭕ As usual with the Department of Trade and Investment Policies and the Department of Customs, nothing has materialized.
⭕ Meanwhile, 🇨🇳 AliExpress continues to impose arbitrary taxes on goods such as electronic parts, circuit components, and educational items that cannot be sourced elsewhere.
⭕ The worst part is that there is no cross-check mechanism between what customers pay as “taxes” and what intermediaries declare to customs, meaning much of what is charged as "taxes" never reaches government coffers.

#SriLanka #Ecommerce
Numbers.lk tweet media
Numbers.lk (@numberslka)

Sri Lanka 🇱🇰 is set to implement a new tariff mechanism for e-commerce (AliExpress, Temu) trade. The question: how long will it take?

⭕ The President has instructed the Finance Ministry to implement a tariff framework to speed up clearance and facilitate B2C e-commerce trade.
⭕ The Ministry is reportedly weighing an intermediate system between the older informal per-kg methodology and the current HS-based system, possibly a de minimis threshold plus a flat rate.
⭕ The Trade and Investment Policies (TIP) department of the Finance Ministry has been tasked with implementing this new mechanism.
⭕ Historically, TIP has been one of Sri Lanka’s most lethargic departments: hundreds of policy proposals from industry and other government ministries and departments sit unreviewed, and it has never found the right balance between facilitation and tariff barriers.
⭕ Decades-old concessionary rates on certain HS codes, as well as exorbitant rates on others, have rarely been reviewed or adjusted based on whether the intended outcomes have been achieved.
⭕ For example, shoes carry a Rs 2,000 cess duty on every imported pair, a rate unchanged for decades, yet we have not seen measurable growth in the local industry; instead, smugglers profit.
⭕ Industries such as electric appliances are lobbying for separate HS codes for items requiring SLSI approval versus those that do not, to avoid unnecessary clearances and approval delays caused by grouping all products under the same code. This has not happened yet.
⭕ The solar industry wants to lower the lithium-ion battery tax to foster growth in the battery-backed solar space. This has not happened yet.
⭕ Even government ministries find it extremely difficult to get things done by TIP. For example, the Industries Minister wants to increase the tax on cheap lighter imports to protect the failing matchbox industry, but that change has not yet occurred.
⭕ The saddest part is that TIP could be a great tool for the country, like the Central Bank, for driving significant economic progress by analyzing data and setting rates while bypassing bureaucracy, yet the department has rarely acted.
⭕ The administration wants a viable tariff mechanism for e-commerce within days to weeks; however, if TIP follows its old pattern, it may take not months or years, but decades before a new e-commerce tariff framework materializes.

This would be a good KPI for the TIP (Department of Trade and Investment Policies) under this government. Let’s see how long it will take to implement a new mechanism...

#SriLanka #Ecommerce

8
18
126
15.3K
Oshan Ivantha retweeted
Mathieu (@miniapeur)
Mathieu tweet media
7
178
1.5K
69.8K
Oshan Ivantha retweeted
SightBringer (@_The_Prophet__)
The single most destructive falsehood taught in school wasn’t a fact, it was a frame: “The system is neutral.”

We were taught:
• History is objective
• The economy is fair
• Science is apolitical
• Authority is earned
• Success is linear
• Money follows merit
• The future will reward obedience

All of it designed to manufacture compliance, not understanding.

The truth is:
• History is written by power.
• The economy is shaped by leverage, not labor.
• Science is real, but funding decides the questions.
• Authority is often inherited or engineered.
• Success follows strategic asymmetry.
• Money follows network, not virtue.
• The system rewards signal, not submission.

We weren’t taught how the world works. We were taught how to function inside the world they already built.

That’s why so many wake up at 30, 40, or 50 and feel betrayed. The curriculum was never designed to liberate you. It was designed to format you.

The deepest falsehood wasn’t any one fact. It was the illusion that the system was built for truth.
179
883
7.6K
265.8K
Oshan Ivantha retweeted
Reads with Ravi (@readswithravi)
Reads with Ravi tweet media
34
772
5.9K
284.8K
Oshan Ivantha (@_ivantha)
The bitter sting of getting a paper rejection email. Ouch!
0
0
0
19
Oshan Ivantha retweeted
Mathieu (@miniapeur)
Mathieu tweet media
7
151
1.6K
54.6K
Oshan Ivantha retweeted
Yann LeCun (@ylecun)
To people who think "China is surpassing the US in AI" the correct thought is "Open source models are surpassing closed ones" See ⬇️⬇️⬇️
484
1.2K
11.9K
1.2M
Oshan Ivantha retweeted
Elon Musk (@elonmusk)
Elon Musk tweet media
3.8K
9.8K
176.1K
65.5M
Oshan Ivantha retweeted
Wolf of X (@WolfofX)
Imagine a land so vast, so rich, it feels extraterrestrial. Africa is truly like another planet 🧵

1. Magnificent visual of the river basins of Africa
Wolf of X tweet media
766
11.9K
106.8K
16.8M
Oshan Ivantha (@_ivantha)
Deadpool & Wolverine. A 7.5/10.
0
0
0
19
Oshan Ivantha retweeted
Nuwan I. Senaratna (@nuuuwan)
For-profit education & for-profit healthcare are both highly efficient & highly profitable for shareholders. The only problem is that students & patients are not the shareholders. #CommonSense
BladeoftheSun@BladeoftheS

In 1968 Finland banned for-profit education; the few private schools that exist in Finland have to reinvest any profit they make or pay it back to parents. It has been in the Top 3 in education for the last 20 years. There should be no profit in education or healthcare.

1
4
14
751
Oshan Ivantha retweeted
Yann LeCun (@ylecun)
Yet another opportunity to point out that reasoning abilities and common sense should not be confused with an ability to store and approximately retrieve many facts.
Rohan Paul@rohanpaul_ai

📌 This paper investigates the dramatic breakdown of state-of-the-art LLMs' reasoning capabilities when confronted with a simple common sense problem called the "Alice In Wonderland (AIW) problem". This is despite their strong performance on standardized reasoning benchmarks. The key conclusion is that current LLMs lack basic reasoning skills, and existing benchmarks fail to properly detect these deficiencies.

'Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models'

📌 The AIW problem is a concise natural language task that asks: "Alice has N brothers and she also has M sisters. How many sisters does Alice's brother have?" While easily solvable by humans using common sense reasoning (the correct answer is M+1), most tested LLMs, including GPT-3.5/4, Claude, Gemini, LLaMA, Mistral, and others, show a severe collapse in performance, often providing nonsensical answers and reasoning.

📌 Notably, even when LLMs occasionally provide correct answers, they often express strong overconfidence in their wrong solutions and generate confabulations (persuasive but nonsensical explanations) to justify their incorrect responses. Standard interventions like enhanced prompting or asking models to re-evaluate their answers fail to improve performance.

📌 The authors introduce a harder variation called AIW+, which causes an even stronger performance collapse across all tested models, including GPT-4 and Claude 3 Opus, which performed relatively better on the original AIW problem.

📌 This study highlights a striking discrepancy between LLMs' high scores on standardized reasoning benchmarks (e.g., MMLU, ARC, Hellaswag) and their poor performance on the AIW problem, suggesting that current benchmarks do not adequately reflect models' true reasoning capabilities and weaknesses.

📌 The authors emphasize the need for the ML community to develop new reasoning benchmarks that can properly detect such deficits and guide the improvement of LLMs' reasoning skills. They also stress the importance of fully open and reproducible training pipelines, including dataset composition, to enable proper analysis and progress in this area.

76
260
1.6K
328.2K
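The M+1 answer to the AIW question quoted above can be checked by modelling the family explicitly. The sketch below is only an illustration and is not taken from the paper; the function name and sibling labels are hypothetical, and it assumes Alice is female and that all siblings share the same parents.

```python
def sisters_of_alices_brother(n_brothers, m_sisters):
    """Count the sisters of any one of Alice's brothers.

    Assumptions (not stated in the quoted summary): Alice is female and
    every sibling belongs to the same family.
    """
    if n_brothers < 1:
        raise ValueError("Alice needs at least one brother for the question to apply")
    if m_sisters < 0:
        raise ValueError("the number of sisters cannot be negative")

    # The girls in the family are Alice herself plus her M sisters.
    girls = ["Alice"] + [f"sister_{i}" for i in range(1, m_sisters + 1)]
    boys = [f"brother_{i}" for i in range(1, n_brothers + 1)]

    # Pick any brother; each girl in the family is his sister.
    brother = boys[0]
    return sum(1 for sibling in girls if sibling != brother)


# Alice has 3 brothers and 2 sisters: her brother has 2 + 1 = 3 sisters.
print(sisters_of_alices_brother(3, 2))
```

The number of brothers never enters the count. The step that is easy to miss is that Alice herself is one of her brother's sisters, which is where the extra 1 comes from.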