Energised

3.4K posts

@g_theuri

Love the entrepreneurial landscape....

Joined January 2013
1.2K Following · 192 Followers
Energised @g_theuri
Interesting. Do we even have crop yield modeling advising policy in Kenya?
Yohan @yohaniddawela

A pre-trained model with no feature engineering, no hyperparameter tuning, and no domain expertise just matched 14 days of computation on a 500-node CPU cluster to forecast crop yields. And it did it in 2 hours on a single GPU.

A team at the European Commission's Joint Research Centre spent 14 days running 500 CPU nodes to forecast South African maize yields. A different model, given the same data, did the job in 360 seconds on 4 CPUs. The accuracy gap was 2 percentage points.

The 14-day pipeline is the standard machine learning workflow for crop forecasting. You start with raw satellite and weather data:
• dekadal time series of FPAR (a measure of green biomass)
• soil moisture
• rainfall
• temperature
• solar radiation

Then you engineer features: monthly averages, monthly maxima, monthly sums, different windows over the growing season. Then you test 14 different feature sets across 6 model types (XGBoost, GBR, Random Forest, LASSO, GPR, SVR), with optional principal component analysis, optional MRMR feature selection, and one-hot encoding of region. That's 96 pipeline configurations per model, each with its own hyperparameters to tune, all wrapped in nested leave-one-year-out cross validation to avoid leaking information from the test year. 14 days on a high-throughput cluster with hundreds of nodes.

The alternative is TabPFN, a transformer pretrained on millions of synthetic tabular datasets. You hand it the raw features. No selection, no reduction, no tuning, no engineered aggregates beyond what you've already computed. One forward pass. Done.

For maize, the best ML pipeline (Gaussian Process Regression with reduced remote sensing and soil moisture features, PCA, yield trend, and one-hot encoded region) hit 6.8% rRMSE with R² of 0.91 at the national level. TabPFN hit 8.8% with R² of 0.86. ANOVA found no statistically significant difference between them. Both beat the trend baseline (12.9%) and the peak FPAR baseline (14.8%). For soybeans, the gap was even tighter: 13.51% vs 15.1%. For sunflowers, no significant difference between any of the models tested.

The data setup tells us why this is important. South Africa has 23 years of yield statistics across 5 to 8 provinces. That's 184 labelled observations for maize, 138 for soybeans, 115 for sunflowers. This is obviously small data territory, where deep learning traditionally fails. TabPFN's pretraining on synthetic data lets it sidestep the small-sample problem because it's not really learning the task from your data. It's pattern-matching against everything it's already seen.

The 2024 operational test was the real validation. Both models forecast yields in early April, at 75% of the growing season. Both tracked the official Crop Estimates Committee figures within roughly 10% on maize and 22% on soybeans across 8 provinces. Both flagged the same anomaly in North West province, where they predicted higher yields than CEC, with environmental indicators supporting the model view. TabPFN also produced 95% confidence intervals natively, something the ML pipeline doesn't give you without extra work.

The cost asymmetry is what changes the picture. A government statistical office in Mozambique or Zambia can't justify 14 days on 500 CPUs to fit a maize model. They can run TabPFN on a laptop in 6 minutes. The accuracy penalty is 2 percentage points of rRMSE on a forecast that already sits well inside the noise of the official CEC trend-and-survey methodology. For most operational purposes, that's a free upgrade from no forecast to a usable forecast.

There's a broader pattern here that goes beyond crop yields. Foundation models for tabular data are doing for small structured datasets what large language models did for text. The expensive, bespoke, expert-tuned pipeline used to be the only path to good performance. Now a generic pretrained model gets you 90% of the way for 0.05% of the compute.

The remaining gap between TabPFN and the 14-day pipeline is the value an experienced ML engineer adds. That's still positive. It's also small enough that for most users in most settings, it isn't worth paying for.

The authors are now scaling the approach across multiple African countries. If TabPFN holds up in Ethiopia, Kenya, Burkina Faso, the implication is that operational subnational yield forecasting just stopped being a specialist service. It became a default capability anyone with a laptop and the public ASAP environmental data feed can run.

Link to paper: nature.com/articles/s4159…
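
For readers unfamiliar with the two evaluation ideas the post leans on, here is a minimal sketch of leave-one-year-out cross validation scored with rRMSE. It is not the paper's code. It assumes rRMSE means RMSE divided by the mean observed yield, expressed as a percentage, and it uses a plain least-squares yield trend as the predictor, roughly in the spirit of the "trend baseline" the post mentions. The function names and the toy records are hypothetical, and only the outer (non-nested) loop is shown.

```python
from math import sqrt
from statistics import mean


def fit_trend(years, yields):
    """Least-squares line through (year, yield): returns (intercept, slope)."""
    year_bar, yield_bar = mean(years), mean(yields)
    sxx = sum((y - year_bar) ** 2 for y in years)
    sxy = sum((y - year_bar) * (v - yield_bar) for y, v in zip(years, yields))
    slope = sxy / sxx
    return yield_bar - slope * year_bar, slope


def leave_one_year_out(records):
    """records: list of (year, observed_yield) tuples.

    For each distinct year, fit the trend on all other years and predict
    the held-out year, so the test year never leaks into training.
    Returns a list of (observed, predicted) pairs.
    """
    pairs = []
    for held_out in sorted({year for year, _ in records}):
        train = [(y, v) for y, v in records if y != held_out]
        intercept, slope = fit_trend([y for y, _ in train], [v for _, v in train])
        for y, v in records:
            if y == held_out:
                pairs.append((v, intercept + slope * y))
    return pairs


def rrmse(pairs):
    """Relative RMSE in percent (assumed: RMSE / mean observed yield * 100)."""
    rmse = sqrt(mean((obs - pred) ** 2 for obs, pred in pairs))
    return 100 * rmse / mean(obs for obs, _ in pairs)


if __name__ == "__main__":
    # Hypothetical toy yields (t/ha), not data from the paper.
    toy = [(2015, 3.1), (2016, 2.4), (2017, 3.8), (2018, 3.5), (2019, 3.3), (2020, 4.0)]
    print(f"trend baseline rRMSE: {rrmse(leave_one_year_out(toy)):.1f}%")
```

Any other model can be swapped in for `fit_trend` as long as it is refit inside the loop on the training years only. The nested version described in the post would add a second leave-one-year-out loop inside each training fold for tuning.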

0 replies · 0 reposts · 0 likes · 10 views
Energised reposted
-valar morghulis- @eldivine
"One gallon of liquid fuel contains ten times as much usable energy as a lithium-ion battery of the same weight." Hydrocarbons are not going anywhere.
21 replies · 84 reposts · 358 likes · 23.6K views
Energised @g_theuri
@georgediano It looks like the budget line for transport refunds is high.....
0 replies · 0 reposts · 0 likes · 6 views
George T. Diano @georgediano
This is not the first delegation to visit State House. Several such visits are held every week. Such an event, with everyone walking away with an envelope, will cost nothing less than 100 million. After these delegate visits, he'll go back to their counties in the name of a “development” rally just to ask them the same thing he was asking them today, “Watu ya Kirinyaga, nyinyi mnasemaje?” (“People of Kirinyaga, what do you say?”), just to hear “Tutam” and satisfy his ego. At those rallies, he'll spend not less than 50 million on mobilization and theatrics. Every single day we are wasting hundreds of millions just for optics and fake impressions of popularity. No sane president would waste this kind of money when a cancer machine at KNH has broken down and remained non-functional for months. In education, schools are bleeding as a result of reduced capitation. CBE is on its knees because public junior schools don't have laboratories, ICT labs or even sports facilities. When will the president work for the people? His predecessors never wasted money like him. This is one of the reasons why State House is requesting billions every single time. It doesn't matter if you're pro-government or anti-government, this man needs to be told the truth. This country is bleeding and Kenyans are suffering. Why can't he work for Kenyans and stop wasting money on BULLSHIT!!
97 replies · 506 reposts · 1.7K likes · 63.5K views
Energised @g_theuri
@SangKip4 That Kamchele garage should franchise and service F1 vehicles, not just miraa vehicles
0 replies · 0 reposts · 5 likes · 713 views
CHOBOS™ @SangKip4
The biggest challenge for miraa pilots is the Alto guys 😂😂😂😂
19 replies · 85 reposts · 428 likes · 8.7K views
Energised @g_theuri
What about livestock? A low-hanging fruit. Will review other sectors later. Wonderful evening, tweeps. With X, what do we call people nowadays? Been a while since I posted...
0 replies · 0 reposts · 0 likes · 5 views
Energised @g_theuri
Public transport is a whole topic that needs its own thread. How many man-hours do we lose in the cities? Do we even factor in mental health when people are stuck in a 3-hour jam each way? You wake up at 4:00 AM to cover 20 km and arrive by 8:30 AM. If you leave at 7:00 AM, you're finished, Thika Road peeps.
1 reply · 0 reposts · 0 likes · 41 views
Anjeyo E. Ananda @anj_116_
Basically, what Kenya Railways is doing with this car transport thingy is giving car yards and business people an option; otherwise the "Boss, ingia WhatsApp kidogo" ("Boss, hop onto WhatsApp for a moment") moments will continue.
11 replies · 33 reposts · 209 likes · 29.9K views