I ship with two databases: PostgreSQL (OLTP) + ClickHouse (OLAP). This sounds insane for a solo founder. Most people pick one. But for QuackNet's architecture—real-time scan ingestion, geospatial analysis, and multi-tenant analytics—running both is the right call.

The question I wrestled with: should the analytics layer be ClickHouse or TimescaleDB (PostgreSQL's time-series extension)?

Here's my decision framework, with real numbers.

The Dual-Database Problem

Traditional wisdom: one database per microservice. But here's what breaks:

PostgreSQL (OLTP) strengths:

PostgreSQL (OLTP) weaknesses:

ClickHouse (OLAP) strengths:

ClickHouse (OLAP) weaknesses:

So the pattern: PostgreSQL for user-facing transactional data, ClickHouse for analytics append-only data. Kafka bridges them: API writes to PostgreSQL, Kafka pipeline writes to ClickHouse asynchronously.

Why Not Just TimescaleDB?

TimescaleDB is PostgreSQL + time-series superpowers. Hypertables (auto-partitioned by time), compression, continuous aggregates. If I could pick one database, it would be appealing.

Here's why ClickHouse won:

1. Geospatial Functions (H3 Native)

QuackNet segments the world into H3 hexagons (Uber's geospatial indexing). I query: "how many scans in this hex?"

ClickHouse:

SELECT h3ToGeo(hex_id)[1] as lat, h3ToGeo(hex_id)[2] as lng, COUNT(*) as scan_count FROM scan_events WHERE timestamp > now() - INTERVAL 24 HOUR GROUP BY hex_id

TimescaleDB/PostgreSQL:

SELECT ST_X(ST_GeomFromText(...)) as lat, ST_Y(ST_GeomFromText(...)) as lng, COUNT(*) as scan_count FROM scan_events WHERE time > NOW() - INTERVAL 24 HOUR GROUP BY -- (custom H3 function needed)

ClickHouse has H3 built-in. PostgreSQL requires PostGIS, which is powerful but external. For a startup, built-in wins.

2. Compression & Storage

QuackNet collects 500K scans/day. After 6 months = 90M rows. Let's measure:

Database Raw Size Compressed Ratio
PostgreSQL (row) ~450 GB ~180 GB 2.5x
TimescaleDB (compressed hypertable) ~450 GB ~60 GB 7.5x
ClickHouse (columnar) ~450 GB ~8 GB 56x

ClickHouse's columnar format is dramatically more efficient. Each column has the same type, so compression is brutal. Same data in ClickHouse takes 1/20th the space.

Cost: Cloud SQL charges ~$0.17/GB/month. At 60GB (TimescaleDB), that's $10/month. At 8GB (ClickHouse), it's $1.36/month. The difference adds up.

3. Materialized Views & Auto-Aggregation

I need three pre-computed views:

Coverage by hex (for the map):

CREATE MATERIALIZED VIEW coverage_by_hex AS SELECT hex_id, h3ToGeo(hex_id) as centroid, COUNT(*) as scan_count, AVG(signal_strength) as avg_signal, MAX(timestamp) as latest_scan FROM scan_events GROUP BY hex_id

In ClickHouse, this view updates automatically on INSERT (background merge). In TimescaleDB, you'd need triggers or a separate ETL job.

ClickHouse wins again: auto-aggregation is built-in.

4. Query Performance

Same query on 90M rows: "Get scan counts per hex, last 7 days"

Database Query Time Index Size
PostgreSQL (B-tree on hex_id) 4,200 ms ~50 GB
TimescaleDB (compressed, indexed) 580 ms ~15 GB
ClickHouse (sparse primary index) 45 ms ~200 MB

ClickHouse is 100x faster here. It doesn't need to maintain huge B-tree indexes; its sparse primary index (stores block boundaries, not every row) is minuscule.

Bottom line on performance: ClickHouse excels at analytical queries (sum, avg, group by). TimescaleDB is better for point queries ("get row with id=X"). Different tools, different jobs.

When to Choose TimescaleDB

TimescaleDB wins if:

Use case: "We store IoT sensor data, but we need to update/delete old readings." TimescaleDB's mutation support is better than ClickHouse.

When to Choose ClickHouse

ClickHouse wins if:

Use case: QuackNet (which is why I'm here).

The Kafka Bridge

Here's my actual pipeline:

// 1. API receives scan app.POST("/v1/scans", func(c *gin.Context) { kafka.Produce(context.Background(), "netintel.scan.events", scan) c.JSON(200, "accepted") }) // 2. Kafka consumer reads events consumer := kafka.NewReader(...) for { msg := consumer.ReadMessage() rows = append(rows, parseEvent(msg)) // 3. Batch insert to ClickHouse if len(rows) >= 1000 || timeout { clickhouse.Insert("scan_events", rows) rows = [] } }

Benefits:

The Real Cost

Running two databases is complexity. You need:

For a solo founder, this is... a lot. But the payoff is clear:

If I used only PostgreSQL + TimescaleDB, I'd have one slow database trying to do two jobs. If I used only ClickHouse, I'd have no ACID, no transactions, no referential integrity for user data.

The dual-database pattern is overkill for MVP. But once you're aggregating 90M scans and need real-time analytics? It pays for itself in latency and stability.

Conclusion

ClickHouse and TimescaleDB are not competitors; they solve different problems.

ClickHouse is the OLAP warehouse: massive scale, extreme compression, lightning-fast aggregations, geospatial functions.

TimescaleDB is PostgreSQL's time-series superpower: great if you're already in the PG ecosystem, good compression, familiar transaction semantics.

My choice was ClickHouse because QuackNet's queries are heavily aggregational (coverage by hex, scan counts, network stats) and geospatial (H3 hexagons). For your project, run the numbers and pick based on your actual query patterns.

But don't force one database to do both jobs. It'll resent you.