Mark Qian

90 posts

@MarkQian16

Building https://t.co/WWzLRVQL0G - the very first system design practice platform

Joined November 2021
94 Following 134 Followers
Mark Qian
Mark Qian@MarkQian16·
Hey software engineers, I am building codemia.io, the very first platform for practicing system design problems. What are you working on? Let's connect!
English
4
0
4
156
Mark Qian
Mark Qian@MarkQian16·
Promote yourself. What are you working on? I am building codemia.io - the very first system design practice platform. I'd like to connect!
English
30
0
19
631
Sabbir Ahmed
Sabbir Ahmed@Sabbir_Ahmd_·
@MarkQian16 Launched LeadLeap (leadleap.app): capture website leads automatically and instantly follow up without per-email costs.
English
1
1
1
15
Mark Qian
Mark Qian@MarkQian16·
Kafka transactions don't make data disappear; they just make it invisible until it's safe. How does this lead to atomicity and exactly-once semantics (EOS)?

KAFKA TRANSACTIONS
• Transactional producers write records with a transactional ID
• Records append to the log immediately, not visible yet
• Transaction coordinator writes COMMIT/ABORT markers later
• Uncommitted records exist; visibility controlled by markers

COMMIT MARKER
• Signals all records in a transaction are visible
• Applied across all partitions for consistency
• read_committed consumers see complete transactions only
• Prevents partial state visibility

ABORT MARKER
• Records are skipped, never seen by read_committed consumers
• Ensures consistency by hiding aborted transactions
• Reduces risk of processing half-complete transactions
• Protects data integrity by avoiding corrupted states

Kafka's approach to transactions centers on visibility, not rollback. By managing the visibility of records via COMMIT and ABORT markers, Kafka ensures atomicity at the read level. Either all records in a transaction become visible, or none do, maintaining a consistent view for consumers. This mechanism supports exactly-once semantics by ensuring each record's effect is applied exactly once, even across failures.

Mistakes and insights:
1. Misunderstanding visibility control: Developers confuse rollback with visibility. Kafka doesn't erase records; it controls when they're seen, maintaining a consistent state.
2. Ignoring read_committed: Some developers overlook setting consumers to read_committed. This oversight can lead to processing uncommitted records, which breaks atomicity.
3. Neglecting idempotence: Without idempotent producers, retries can lead to duplicate writes. Idempotency is vital for exactly-once processing.
4. Misconfiguring transaction boundaries: Incomplete transaction boundaries can cause inconsistencies. Properly manage your transaction scopes.
5. Forgetting about offset commits: Stream processors should commit offsets within the same transaction to align data processing and state.

Save this post for your next deep dive into Kafka EOS, especially when transactional guarantees get murky.
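The "visibility, not rollback" idea above can be sketched with a toy in-memory log. This is not the Kafka client API; `Entry`, `read_committed`, and `read_uncommitted` are hypothetical names, and each transaction gets its own label here, which is a simplification. The sketch assumes a read_committed reader stops at the first record of a still-open transaction.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    offset: int
    txn: Optional[str]            # hypothetical per-transaction label; None = non-transactional
    value: Optional[str] = None   # payload; None for control markers
    marker: Optional[str] = None  # "COMMIT" or "ABORT" for control markers


def read_uncommitted(log: List[Entry]) -> List[str]:
    """Every data record is returned, whatever its transaction's outcome."""
    return [e.value for e in log if e.marker is None]


def read_committed(log: List[Entry]) -> List[str]:
    """Only records of committed transactions (and non-transactional records)."""
    # Outcome of each finished transaction, taken from its control marker.
    outcome = {e.txn: e.marker for e in log if e.marker is not None}
    visible = []
    for e in log:
        if e.marker is not None:
            continue  # control markers are never handed to the application
        if e.txn is None or outcome.get(e.txn) == "COMMIT":
            visible.append(e.value)
        elif e.txn not in outcome:
            break  # open transaction: nothing at or past this point is readable yet
        # otherwise the transaction aborted: skip the record, it stays in the log
    return visible


log = [
    Entry(0, "t1", "a"),
    Entry(1, "t2", "x"),
    Entry(2, "t1", "b"),
    Entry(3, "t1", marker="COMMIT"),
    Entry(4, "t2", marker="ABORT"),
    Entry(5, "t3", "p"),   # t3 has no marker yet, so it is still open
    Entry(6, None, "n"),
]
print(read_uncommitted(log))  # ['a', 'x', 'b', 'p', 'n']
print(read_committed(log))    # ['a', 'b']
```

Nothing is deleted from `log` in either mode; only the reader's filter differs, which is the point of the post.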
GIF
English
0
0
1
22
shantanu bhatt
shantanu bhatt@auditormusic19·
@MarkQian16 Building iGrow, the AI roleplay app that helps you practice difficult conversations before they happen in the most personalized way, check it out here: i-grow.co
English
1
0
1
17
Karim Zitouni
Karim Zitouni@kzitouni1·
I'm tired of seeing the same faces on my feed. I need brothers in tech, AI, startups, marketing, distribution, vibecoding, bootstrapped, in SF, NYC or elsewhere to build along. Like, drop your startup, let's make some friends.
English
78
2
83
3K
Meghanjana Nag
Meghanjana Nag@megh_anjana·
@MarkQian16 Your first instinct about a person - co-founder, hire, investor - is usually right. The second instinct is just you talking yourself into something.
English
1
0
1
13
Mark Qian
Mark Qian@MarkQian16·
What’s the best lesson you’ve learned as a founder?
English
1
0
1
43
Jeroen van Welsenes
Jeroen van Welsenes@Guronnimo·
My feed is full of builders right now and I love it 😍
If you’re:
• building products
• sharing in public
• experimenting with AI
Drop what you’re working on 👇
Let’s connect
English
89
0
60
2.7K
Mark Qian
Mark Qian@MarkQian16·
@1001binary Hi Jimmy, these look cool. The expense app seems like something I would use. Are you planning to create an Android version?
English
1
0
1
12
Jimmy Lee
Jimmy Lee@1001binary·
Hey, let’s connect, man 👋 I’ve shipped 2 apps recently and would genuinely love your feedback:
ZapExpense: a super fast expense tracker built to make logging spending basically frictionless (~1 second per entry, everything else updates automatically). apps.apple.com/app/zapexpense…
MiniScheduler: a simple calendar tool that helps you use small 10–30 min gaps in your day for quick tasks instead of letting them go to waste (auto-detects free slots + one-tap scheduling). apps.apple.com/app/minischedu…
Curious what you’re building right now and what stage you’re at 👀
Jimmy Lee tweet media
English
1
0
0
7
Umair Shaikh
Umair Shaikh@1Umairshaikh·
Founders, make connections in the comments. Enjoy the internet leverage.
English
65
1
50
3.5K
Mark Qian
Mark Qian@MarkQian16·
Think Kafka loses data randomly? It only loses what you permit it to.

HIGH WATERMARK (HW)
• The highest offset all ISR replicas have
• Offsets ≤ HW are committed and durable
• Consumers read only up to HW
• It's the line of truth in Kafka

COMMITTED VS UNCOMMITTED DATA
• Committed data: offsets ≤ HW, won't be lost in clean election
• Uncommitted data: offsets > HW, may reside only on leader
• Clean election keeps committed data safe
• Unclean election risks losing committed data

Kafka's design choices boil down to balancing safety and availability. A clean election prioritizes data durability, choosing leaders only from ISR, preserving committed data. An unclean election, however, prioritizes availability, electing any active replica, even if it's out-of-sync, risking data loss.

Where Kafka setups go astray:
1. Allowing unclean election by default. This increases availability but risks committed data loss. Evaluate the tradeoff based on your resilience needs.
2. Misconfiguring ISR thresholds. Too strict, and replicas fall out of ISR; too lenient, and they risk lagging significantly. Both hurt reliability.
3. Ignoring follower lag. Significant lag can lead to a reduced ISR, increasing the chance of unclean elections. Regularly monitor and adjust follower replication settings.
4. Over-relying on a single leader. If a leader carries uncommitted loads and fails, the subsequent election type dictates data safety. Balance load across brokers.
5. Assuming all data is safe post-election. Always verify replica synchronization to HW post-election, especially after unclean elections.

Your choice in leader election type fundamentally shapes Kafka's behavior. Will you favor data safety or system availability? The decision will dictate what you can afford to lose.

Save this for when Kafka's tradeoffs keep you up at night.
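A rough way to see the clean vs unclean tradeoff above is a toy model where each replica is just the last offset it holds. This follows the post's convention (offsets ≤ HW are committed), which is a simplification of Kafka's own bookkeeping. `high_watermark` and `elect_leader` are hypothetical helpers, and picking the most caught-up replica in the unclean case is an assumption made for illustration.

```python
from typing import Dict, Optional, Set, Tuple


def high_watermark(last_offset: Dict[str, int], isr: Set[str]) -> int:
    """Highest offset that every in-sync replica already holds."""
    return min(last_offset[r] for r in isr)


def elect_leader(
    last_offset: Dict[str, int],
    isr: Set[str],
    failed: str,
    allow_unclean: bool,
) -> Tuple[Optional[str], int]:
    """Return (new_leader, committed_offsets_lost) after `failed` goes down."""
    hw = high_watermark(last_offset, isr)
    in_sync = sorted(r for r in isr if r != failed)
    if in_sync:
        leader = in_sync[0]  # clean election: candidate comes from the ISR
    elif allow_unclean:
        others = [r for r in last_offset if r != failed]
        if not others:
            return None, 0
        # Assumption for this sketch: take the most caught-up survivor.
        leader = max(others, key=lambda r: last_offset[r])
    else:
        return None, 0  # no in-sync candidate: partition stays unavailable
    # Committed offsets (<= HW) that the new leader never received.
    return leader, max(0, hw - last_offset[leader])


# Healthy ISR: B has everything up to HW, so a clean election loses nothing committed.
print(elect_leader({"A": 10, "B": 10, "C": 4}, {"A", "B"}, "A", False))  # ('B', 0)

# ISR shrank to the leader alone, then the leader failed.
lagging = {"A": 10, "B": 7, "C": 4}
print(elect_leader(lagging, {"A"}, "A", False))  # (None, 0)
print(elect_leader(lagging, {"A"}, "A", True))   # ('B', 3)
```

The last two calls are the same failure under the two settings: one keeps the data and gives up availability, the other stays available and drops three committed offsets.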
GIF
English
0
0
1
30
Mark Qian
Mark Qian@MarkQian16·
I am building codemia.io. Founders/builders/vibe coders, let me know what you are working on! Let's connect!
English
97
0
61
2.9K
Mark Qian
Mark Qian@MarkQian16·
Kafka’s storage model? It’s not one-size-fits-all. Confusing Kafka’s retention and compaction models is how bugs sneak into production.

RETENTION MODEL
• Treat this as event history, not permanent storage
• Producers append records continuously
• Data kept for a limited time or size window
• Expired segments get purged, reducing storage overhead

COMPACTION MODEL
• Think of this as state management
• Producers write updates using consistent keys
• Periodic compaction pass removes older key-value pairs
• Latest value per key is retained, preserving state

These models address different needs. Retention provides a temporal snapshot, ideal for scenarios where recent data suffices but historical data isn’t required. Compaction, on the other hand, ensures that the most current state is always available, critical for applications where outdated information leads to inconsistency or errors.

Common pitfalls to watch out for:
1. Using retention for stateful data. This leads to data loss as old records expire, leaving consumers with incomplete information, causing eventual consistency issues.
2. Misconfiguring compaction intervals. If compaction runs too infrequently, outdated data sticks around, potentially leading to incorrect reads.
3. Ignoring key distribution in compaction. Uneven key distribution can cause some keys to compact less frequently, skewing state accuracy.
4. Forgetting about the async nature of compaction. Temporary duplicates of old values may mislead consumers expecting immediate cleanup.
5. Overlooking retention settings in analytics pipelines. Critical log events may disappear before being processed, leading to gaps in analysis.

Understanding Kafka isn't just about treating it as a queue. Configure it wrong, and you might as well be feeding bugs into your system. Retention for temporal data, compaction for state: align your storage model to your system’s needs.

Bookmark this when setting up your next Kafka cluster.
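The difference between the two models above shows up clearly on a three-record toy log. This is an illustrative sketch, not broker code: `Record`, `apply_retention`, and `apply_compaction` are hypothetical names, and real brokers work on whole segments and run compaction in the background.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    timestamp: int  # seconds, arbitrary epoch
    key: str
    value: str


def apply_retention(log: List[Record], now: int, retention_seconds: int) -> List[Record]:
    """Time-based retention: anything older than the window is dropped, whatever its key."""
    return [r for r in log if now - r.timestamp <= retention_seconds]


def apply_compaction(log: List[Record]) -> List[Record]:
    """Compaction: keep only the latest record per key, preserving log order."""
    latest_index = {}
    for i, r in enumerate(log):
        latest_index[r.key] = i  # later records overwrite earlier ones
    return [r for i, r in enumerate(log) if latest_index[r.key] == i]


log = [
    Record(0, "user1", "a"),
    Record(10, "user2", "b"),
    Record(20, "user1", "c"),
]

# Retention with a 10-second window at t=25: user2's only record expires.
print(apply_retention(log, now=25, retention_seconds=10))

# Compaction: one record per key survives, so user2's state is still there.
print(apply_compaction(log))
```

The retention call is pitfall 1 in miniature: a key that has not been updated recently disappears entirely, while compaction keeps its last known value.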
GIF
English
0
0
1
60
Mark Qian
Mark Qian@MarkQian16·
"Retries don't cause bugs. Side effects do."

So why are retries feared as bug generators in distributed systems? They aren't the enemy. The real culprit is unintended side effects when idempotency isn't enforced. Whether charging a credit card or updating inventory, repeat actions without safeguards can wreak havoc.

IDEMPOTENCY KEYS
• Unique key sent with each request to the API
• Ensures one-time effect of operations, even if retried
• Checked against an idempotency store for duplicates
• No key match? Proceed. Key match? Return original result, avoid duplicate writes

RETRIES
• Retries expected in a flaky network environment
• Timeouts common; retries mitigate transient failures
• Work in tandem with idempotency to prevent side effects
• Without safety net, retries risk replaying operations

The dual-pronged approach of retries plus idempotency is non-negotiable in distributed systems. Retries handle transient network issues, while idempotency ensures operations don’t have unintended side effects. Together, they transform what could be catastrophic into something routine.

Where teams falter:
1. Skipping idempotency keys for non-critical operations. Every write operation is critical. Even a log entry can cause cascading issues if duplicated.
2. Mismanaging idempotency key storage. Should be persistent and reliable. Use a database or a distributed cache, not ephemeral in-memory storage.
3. Assuming network failures imply operation failures. Just because a response is lost doesn't mean the action failed. Validation via idempotency is needed to check what actually happened.
4. Overcomplicating the idempotency logic. Keep it simple: generate, store, validate. Complex logic introduces more failure points.
5. Ignoring eventual consistency. Keys must remain valid long enough to account for network latency and retries, but not indefinitely, to avoid stale entries.

Idempotency converts retries from a hazard to a safety feature. Without it, you risk operational chaos: double charges, over-reserved inventory, lost customer trust. If your API changes state and supports retries, it must be idempotent.

Bookmark this for when a network timeout leaves you guessing what just happened.
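The "generate, store, validate" flow above can be sketched in a few lines. `PaymentService` and `call_with_retries` are hypothetical, the lost responses are simulated, and the in-memory dict only stands in for the persistent idempotency store the post recommends. The sketch is single-threaded and leaves out key expiry.

```python
import uuid
from typing import Dict, List, Tuple


class PaymentService:
    def __init__(self) -> None:
        # Stand-in for a persistent idempotency store (database or distributed cache).
        self._results: Dict[str, dict] = {}
        self.charges: List[Tuple[str, int]] = []  # the real side effects

    def charge(self, idempotency_key: str, customer: str, amount_cents: int) -> dict:
        # Key match: return the original result, perform no second write.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        # No key match: perform the side effect once and remember the outcome.
        self.charges.append((customer, amount_cents))
        result = {"status": "charged", "charge_id": len(self.charges)}
        self._results[idempotency_key] = result
        return result


def call_with_retries(
    service: PaymentService,
    key: str,
    customer: str,
    amount_cents: int,
    lost_responses: int,
    max_attempts: int = 3,
) -> dict:
    """Client that retries with the SAME key when a response goes missing."""
    for attempt in range(max_attempts):
        result = service.charge(key, customer, amount_cents)
        if attempt < lost_responses:
            continue  # simulate: the server did the work, the reply never arrived
        return result
    raise TimeoutError("no response after retries")


service = PaymentService()
key = str(uuid.uuid4())  # generated once per logical operation, reused on retry
result = call_with_retries(service, key, "cust-1", 500, lost_responses=2)
print(result)                # {'status': 'charged', 'charge_id': 1}
print(len(service.charges))  # 1
```

The server handled three requests but charged once, because the client reused one key across all attempts; generating a fresh key per attempt would defeat the check.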
GIF
English
0
0
2
80