Building a real-time geospatial data platform (QuackNet) for crowdsourced network intelligence, I faced a common solo-dev dilemma: how to handle high-volume event ingestion without ops complexity?

My first instinct was to skip Kafka entirely—Redis Streams or NATS would be simpler. But after stress-testing the architecture, I realized I needed 100+ concurrent writers, fault-tolerant persistence, and partitioned parallelism. That's Kafka's job. The problem: traditional Kafka with ZooKeeper is overkill for one developer.

Enter KRaft mode: Kafka's new consensus protocol that eliminates ZooKeeper. Single-process broker+controller, one configuration file, and it just works. Here's what I learned shipping it.

Why Kafka at All?

The QuackNet pipeline is simple in theory:

Without Kafka, I'd either poll PostgreSQL constantly (wasteful) or have the API block until ClickHouse writes complete (slow, defeats the purpose of async).

Kafka lets the API fire-and-forget: write to Kafka in milliseconds, then let the consumer batch-insert into ClickHouse in parallel. If the consumer crashes, the events are still in Kafka's topic partitions, and the consumer resumes from its last committed offset after a restart, so no events are lost.
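To make the "batch inserts" half of that concrete, here is a minimal sketch of a size-or-time batcher. It is not QuackNet's actual consumer: a Go channel stands in for the Kafka topic, and Event, batchEvents, and collectBatchSizes are hypothetical names invented for this illustration. The flush callback is where a real consumer would run its ClickHouse insert.

```go
package main

import (
	"fmt"
	"time"
)

// Event is a hypothetical stand-in for a decoded scan event.
type Event struct {
	ID int
}

// batchEvents drains in and calls flush whenever maxBatch events have
// accumulated or the maxWait ticker fires with a non-empty batch.
// When in is closed it flushes any remainder and returns.
func batchEvents(in <-chan Event, maxBatch int, maxWait time.Duration, flush func([]Event)) {
	batch := make([]Event, 0, maxBatch)
	ticker := time.NewTicker(maxWait)
	defer ticker.Stop()

	emit := func() {
		if len(batch) == 0 {
			return
		}
		// Copy so the callback may keep the slice while the buffer is reused.
		out := make([]Event, len(batch))
		copy(out, batch)
		flush(out)
		batch = batch[:0]
	}

	for {
		select {
		case ev, ok := <-in:
			if !ok {
				emit()
				return
			}
			batch = append(batch, ev)
			if len(batch) >= maxBatch {
				emit()
			}
		case <-ticker.C:
			emit()
		}
	}
}

// collectBatchSizes pushes n events through batchEvents and records
// the size of every flushed batch.
func collectBatchSizes(n, maxBatch int) []int {
	in := make(chan Event, n)
	for i := 0; i < n; i++ {
		in <- Event{ID: i}
	}
	close(in)

	var sizes []int
	batchEvents(in, maxBatch, time.Hour, func(b []Event) {
		sizes = append(sizes, len(b))
	})
	return sizes
}

func main() {
	fmt.Println(collectBatchSizes(7, 3)) // [3 3 1]
}
```

The time-based flush matters as much as the size-based one: without it, a quiet period would leave a partial batch sitting in memory indefinitely.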

Also: multiple consumers. The same scan event feeds analytics, fraud detection, and push alerts. Each consumer group reads the topic independently and tracks its own offsets, so every subscriber sees every event. No code duplication.
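The reason one event can feed several consumers is that Kafka keeps a read position per consumer group rather than deleting a record once it has been read. The toy model below illustrates just that idea with an in-memory slice. Log, Append, and Poll are hypothetical names for this sketch and are not part of any Kafka client API.

```go
package main

import "fmt"

// Log is a toy in-memory stand-in for a single topic partition.
type Log struct {
	records []string
	offsets map[string]int // next offset to read, keyed by consumer group
}

// NewLog returns an empty log with no registered groups.
func NewLog() *Log {
	return &Log{offsets: make(map[string]int)}
}

// Append adds a record to the end of the log.
func (l *Log) Append(rec string) {
	l.records = append(l.records, rec)
}

// Poll returns every record the group has not seen yet and advances
// only that group's offset. Other groups are unaffected.
func (l *Log) Poll(group string) []string {
	start := l.offsets[group]
	out := append([]string(nil), l.records[start:]...)
	l.offsets[group] = len(l.records)
	return out
}

func main() {
	topic := NewLog()
	topic.Append("scan-1")
	topic.Append("scan-2")
	fmt.Println(topic.Poll("analytics")) // [scan-1 scan-2]

	topic.Append("scan-3")
	fmt.Println(topic.Poll("analytics")) // [scan-3]
	fmt.Println(topic.Poll("fraud"))     // [scan-1 scan-2 scan-3]
}
```

A group that starts late, like "fraud" here, still receives the full history. That same property is what makes replay possible.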

The KRaft Trade-Off

Traditional Kafka setup:

KRaft mode:

Trade-off: KRaft is newer (production-ready for new clusters since Kafka 3.3), fewer operators know it well, and edge cases may exist. For a solo project? Perfect. For Airbnb's infrastructure? Maybe stick with ZooKeeper.

Docker Compose Setup (What Actually Works)

I tried bitnami/kafka first. Broke immediately. The bitnami image uses confusing env var prefixes and old Kafka versions. After 2 hours of debugging, I switched to apache/kafka:3.8.1 (official). That worked.

version: '3.8'
services:
  kafka:
    image: apache/kafka:3.8.1
    ports:
      - "9092:9092"  # broker listener (internal, advertised as kafka:9092)
      - "9094:9094"  # external listener (advertised as localhost:9094)
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: "broker,controller"
      # KRaft needs a dedicated controller listener on its own port
      KAFKA_LISTENERS: "PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:9094"
      KAFKA_CONTROLLER_LISTENER_NAMES: "CONTROLLER"
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: "CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,EXTERNAL:PLAINTEXT"
      KAFKA_INTER_BROKER_LISTENER_NAME: "PLAINTEXT"
      KAFKA_ADVERTISED_LISTENERS: "PLAINTEXT://kafka:9092,EXTERNAL://localhost:9094"
      KAFKA_CONTROLLER_QUORUM_VOTERS: "1@kafka:9093"
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
      KAFKA_LOG_RETENTION_HOURS: 168
    volumes:
      - kafka-data:/var/kafka-logs
volumes:
  kafka-data:

Key settings:

Spin it up:

docker compose up -d kafka
docker compose exec kafka /opt/kafka/bin/kafka-topics.sh --create \
  --topic netintel.scan.events \
  --partitions 6 \
  --replication-factor 1 \
  --bootstrap-server localhost:9092

Done. Two commands, one container. No ZooKeeper cluster to manage.
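The six partitions are what give the consumer side its parallelism: a keyed message always lands in the same partition, so per-key ordering is preserved while different keys spread across partitions. The sketch below shows the general hash-mod-N idea only. It assumes FNV-1a for brevity, whereas Kafka's Java client uses murmur2 by default, so real partition numbers will differ. partitionFor and the device-style keys are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a message key to one of numPartitions partitions.
// Assumption: FNV-1a is used for illustration; it is not the hash that
// Kafka's default partitioner applies.
func partitionFor(key string, numPartitions uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return h.Sum32() % numPartitions
}

func main() {
	// Hypothetical keys; the repeated key must map to the same partition.
	for _, key := range []string{"device-a", "device-b", "device-a"} {
		fmt.Printf("%s -> partition %d\n", key, partitionFor(key, 6))
	}
}
```

One consequence worth keeping in mind: changing the partition count later changes the key-to-partition mapping, so ordering guarantees only hold for events produced under the same count.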

The context.Background() Bug

Here's where I got burned. In my Go API handler, I was writing scans to Kafka asynchronously:

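func (h *Handler) SubmitScan(c *gin.Context) {
	// scanJSON is the serialized scan payload (request parsing omitted).
	go func() {
		// The error is discarded, so a failed write is never logged.
		_ = h.kafkaProducer.Produce(
			c.Request.Context(), // ← WRONG! Cancelled once the handler returns
			"netintel.scan.events",
			scanJSON,
		)
	}()
	c.JSON(200, gin.H{"status": "accepted"})
}

The problem: c.Request.Context() is cancelled as soon as the handler returns, which happens right after the response is written. By the time the goroutine tries to write to Kafka, the context is usually already dead. Because the goroutine never checked the returned error, the failure was silent: no error log, the event was just dropped.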

The fix:

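func (h *Handler) SubmitScan(c *gin.Context) {
	// scanJSON is the serialized scan payload (request parsing omitted).
	go func() {
		err := h.kafkaProducer.Produce(
			context.Background(), // ← Detached context
			"netintel.scan.events",
			scanJSON,
		)
		if err != nil {
			// Requires the standard "log" package.
			log.Printf("kafka produce failed: %v", err)
		}
	}()
	c.JSON(200, gin.H{"status": "accepted"})
}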

context.Background() is never cancelled. It's the root context. Goroutines spawned with it will complete their writes to Kafka even after the HTTP response goes out.

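Golden rule: For async background work that must outlive the request (Kafka, email, logging), use a context derived from context.Background(), ideally with a timeout so a stuck broker can't leak goroutines. For handler-scoped work (database queries), use c.Request.Context().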

When to Use Kafka vs Alternatives

Kafka is powerful but adds complexity. Here's my mental model:

| Use Case | Kafka | Redis/NATS |
| --- | --- | --- |
| Durability (persist to disk?) | Yes, auto | Optional, costly |
| Replay old events? | Yes | Limited (Redis Streams, NATS JetStream) |
| Partitioning / scaling? | Native | Manual sharding |
| Ops complexity? | Higher (KRaft helps) | Much simpler |
| Best for | Multi-consumer analytics pipelines | Task queues, caching |

For QuackNet: we need durability (scan data is valuable), replay capability (re-process events if our ClickHouse pipeline breaks), and multiple consumers (analytics, fraud detection, push alerts). Kafka wins.

If I were just building a job queue? Redis Streams or Bull would be faster to ship.

The Scaling Trap

One gotcha: don't run Kafka in development on your laptop if you're doing serious testing. Kafka's memory footprint is modest, but every event is written to disk and stays there until retention deletes it, so millions of events add up fast. I once filled my 256GB SSD with Kafka logs because I didn't set KAFKA_LOG_RETENTION_HOURS correctly.
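A back-of-the-envelope estimate shows how quickly this happens. The sketch below multiplies write rate by retention window. retainedBytes and the load numbers are hypothetical, and the result ignores compression, index files, and segment rounding, so treat it as an order-of-magnitude check rather than a capacity plan.

```go
package main

import "fmt"

// retainedBytes estimates steady-state disk usage for a topic:
// everything written during the retention window stays on disk,
// once per replica.
func retainedBytes(eventsPerSec, avgEventBytes, retentionHours, replicationFactor int64) int64 {
	return eventsPerSec * avgEventBytes * retentionHours * 3600 * replicationFactor
}

func main() {
	// Hypothetical load test: 1,000 events/s of 500 bytes each,
	// 168 hours of retention, replication factor 1.
	b := retainedBytes(1000, 500, 168, 1)
	fmt.Printf("%.1f GB\n", float64(b)/1e9) // 302.4 GB
}
```

At that rate a week of retention already exceeds a 256GB drive. Besides shortening the time window, Kafka's log.retention.bytes setting caps retained size per partition, which gives a hard ceiling that a time-based limit alone does not.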

Solution:

Is KRaft Production-Ready?

Yes, but with caveats.

The Apache Kafka project marked KRaft production-ready for new clusters in Kafka 3.3 (October 2022). It's been battle-tested by mid-tier companies for about two years, but the operator community is still smaller than the one around ZooKeeper-based Kafka.

For a solo founder with a real-time pipeline? Go for it. For a 100-person SaaS company running 50 Kafka clusters with SLA guarantees? Maybe hire someone who knows ZooKeeper inside-out.

I'm shipping QuackNet with KRaft. When we hit scaling issues (and we will), I'll have time to migrate. For now, it's one less thing to think about.

Conclusion

Kafka KRaft lets solo developers use a battle-tested, production-grade event streaming platform without ops overhead. No ZooKeeper cluster. One Docker image. A handful of environment variables.

The gotchas are real (context cancellation, image selection, config typos), but they're learnable. If you're building an analytics pipeline, fraud detection layer, or real-time dashboard, Kafka is worth the effort.

KRaft makes it accessible. That's a win for indie builders.