固定されたツイート
Kwok
1.6K posts

Kwok
@kodemojo
Senior engineer and AI enthusiast. I mentor developers to build, ship, and operate production-grade systems.
参加日 Mart 2024
712 フォロー中868 フォロワー

7 benefits of canary releases for junior devs:
1. Smaller release bets
2. Faster feedback from real users
3. Earlier bug detection
4. Easier rollback decisions
5. Less pressure during deployment
6. Better habits around monitoring
7. A clearer sense of production ownership
Small rollout first.
Big rollout later.
English

Legacy code is not always the enemy. Sometimes it is just old code that still pays the bills.
The real mistake is turning modernization into a heroic rewrite. That creates years of risk before users see value.
Better architecture is patient.
Small slice.
Real traffic.
Safe rollback.
Then the next slice.
English

Config drift detection matters because:
1. It catches silent differences
2. It keeps staging aligned with production
3. It makes incidents easier to explain
4. It exposes manual fixes that were never cleaned up
5. It stops one server from becoming special
6. It turns guessing into checking
Compare them regularly.
Alert while the differences are still small.
English

You need better percentiles.
Average latency tells you what usually happens.
P99 tells you where users start waiting, refreshing, complaining, and sending screenshots.
Averages are for reports.
P99 is for product reality.
6 steps to improve P99:
1. Measure end to end
2. Split the request into timed sections
3. Look at distribution
4. Group by endpoint, tenant, region, device, cache hit
5. Remove the biggest tail source
6. Alert on P99 after deploys
Do this before adding more servers.
English

Flaky tests seem harmless at first.
One red build.
One rerun.
One “probably nothing.”
But the real cost is not the extra five minutes in CI.
The real cost is what happens after enough false alarms.
Engineers stop trusting the signal.
And once that happens, the test suite becomes decoration with opinions.
For software architects, flaky tests are not just a testing issue. They are a design smell.
They often expose problems with time, state, data, boundaries, and ownership.
The better question is not:
"Can we rerun it?"
It is:
"What is this test trying to tell us?"
English

1/5
The teams that handle production well are usually not better at guessing.
They are better at logging.
That matters more than most developers think.
Because when production breaks, the real question is not:
"Who is the smartest person in the room?"
It is:
"Who can find the facts fastest?"
English
Kwok がリツイート

