B-Tree vs LSM-Tree

Duas estruturas fundamentais de storage engines: trade-offs entre read e write performance

B-Tree (PostgreSQL, MySQL, InnoDB)🌳Root Node[30 | 70][10|20]Internal[50|60]InternalLeaf [5,8,10]4-16KB pageLeaf [12,15,20]4-16KB pageRead PathO(log_B n)1B rows, B=500 → ~3 readsREAD OPTIMIZEDWrite PathIn-place update + WALRandom I/O (page rewrite completa)WAL + RANDOM I/OB-Tree Trade-offs✓ Reads O(log_B n) — muito rápidos✓ Range scans eficientes (folhas encadeadas)✗ Random I/O para writes (WAL + page rewrite)✗ Fragmentação com muitos deletesLSM-Tree (RocksDB, Cassandra, LevelDB)🧠MemtableIn-memory (Red-Black Tree)~64MBL0 SSTablesUnsorted between📝WALAppend-onlyflushWALL110x size, no overlapL2100x sizeLn10^n × sizecompactLEVELEDRead PathMemtable → L0 → L1 → ... → LnBloom filter: skip ~99% SSTablesWorst case: O(levels × log n)LSM-Tree Trade-offs✓ Writes 10-100x mais rápidos (sequential I/O)✓ Sem fragmentação (immutable SSTables)✗ Reads mais lentos (múltiplos levels)✗ Compaction CPU spikes + space amplificationQuando usar cada um?B-Tree → OLTP read-heavy: PostgreSQL, MySQL, SQLite (transacional, baixa latência de read)LSM → Write-heavy: Cassandra, RocksDB, LevelDB (logs, time-series, IoT, analytics ingest)Híbrido → TiDB (B-Tree para OLTP + LSM para OLAP), CockroachDB (Pebble/LSM com SQL)

B-Tree: estrutura balanceada com nós internos (ponteiros) e folhas (dados). Usado em PostgreSQL, MySQL

0/9