Skill

Estimate System Capacity & Scale Requirements

Calculate RPS, storage, bandwidth, and infrastructure needs from user counts using proven formulas and technology benchmarks.

Works with postgresqlmysqlmongodbcassandradynamodb

57
Spark score
out of 100
Updated 8 days ago
Version 1.0.0

Add to Favorites

Why it matters

Engineers hire this skill to quickly calculate infrastructure capacity, throughput, and cost requirements when designing or scaling distributed systems. It provides battle-tested formulas and benchmarks to translate user counts into concrete resource specifications.

Outcomes

What it gets done

01

Convert daily active users into requests-per-second with peak multipliers by application type

02

Calculate storage requirements including replication factors and cache hit ratios

03

Look up throughput limits for databases, queues, and caches to determine when to shard or cluster

04

Estimate infrastructure costs and architecture needs across user scale tiers from 1K to 10M DAU

Install

Add it to your toolbox

Run in your project directory:

curl -fsSL https://spark.entire.vc/get/ag-scale-benchmarks | bash

Capabilities

What this skill does

Generate code

Writes source code or scripts from a description.

Overview

MONOPOLY - Scale Benchmarks & Estimation Formulas

What it does

Scale estimation reference providing formulas for RPS/storage/bandwidth calculation and capacity benchmarks for common technologies across DAU tiers

How it connects

When sizing infrastructure, estimating costs, or determining when to shard/cluster databases and queues based on load projections

Source README

MONOPOLY - Scale Benchmarks & Estimation Formulas

Quick Estimation Formulas

User → RPS Conversion

Requests per second (avg) = DAU × avg_requests_per_user_per_day / 86400
Requests per second (peak) = avg_RPS × peak_multiplier

Peak multipliers by app type:
  Social media:      5-10×
  E-commerce:        3-5× (higher during sales)
  News / media:      10-20× (breaking news spike)
  B2B SaaS:          2-3× (business hours spike)
  Gaming:            5-15× (event-driven)

Storage Estimation

Storage per day    = requests_per_day × avg_payload_size
Storage per year   = storage_per_day × 365
With replication   = storage_per_year × replication_factor (3× typical)
With CDN/cache     = reduce by cache_hit_ratio (80% hit = 20% origin load)

Common payload sizes:
  Tweet / short text:    500B
  Social post with text: 2KB
  Profile data:          5KB
  Image (compressed):    200KB-2MB
  Video (per minute):    50MB (720p), 150MB (1080p)
  API JSON response:     1-20KB

Bandwidth Estimation

Inbound bandwidth  = avg_request_size × RPS
Outbound bandwidth = avg_response_size × RPS

Convert: 1 Gbps = 125 MB/s
         10 Gbps = 1.25 GB/s

Known Scale Limits of Common Technologies

Databases

Technology Single Node Writes Reads (with replicas) Recommended Shard/Cluster Trigger
PostgreSQL ~5K-20K writes/s ~50K-200K reads/s >5TB data or >20K writes/s
MySQL ~10K-25K writes/s ~60K-250K reads/s >5TB or >25K writes/s
MongoDB ~20K-50K writes/s ~50K-100K reads/s >100GB or >50K writes/s
Cassandra ~200K-1M writes/s ~200K-500K reads/s Almost never needs explicit sharding
DynamoDB Unlimited (managed) Unlimited (managed) Use provisioned capacity mode
Redis ~500K-1M ops/s Same >50GB data or cluster needed
Elasticsearch ~10K-50K docs/s ~1K-10K queries/s >100M documents per index

Queues / Streams

Technology Max Throughput Max Consumers Retention
Kafka 1M+ msgs/s per cluster Unlimited consumer groups Configurable (days-forever)
RabbitMQ ~50K-100K msgs/s Limited by connections Until consumed
SQS Standard Unlimited (AWS-managed) Unlimited 14 days
SQS FIFO 3K msgs/s per queue Per group 14 days
Redis Pub/Sub ~1M msgs/s Limited by subscribers None (fire-and-forget)

Caching

Technology Max Memory (single) Max Throughput Latency
Redis ~1TB RAM ~1M ops/s <1ms
Memcached ~64GB RAM ~1M ops/s <1ms
In-process (Caffeine/Guava) JVM heap Unlimited (local) <0.1ms

Capacity Planning by User Scale

1K DAU

Avg RPS:       ~1-5 RPS
Peak RPS:      ~10-50 RPS
DB size/year:  ~10-50GB
Infra needed:  Single server, managed DB (RDS t3.medium), basic CDN
Monthly cost:  $50-200

10K DAU

Avg RPS:       ~10-50 RPS
Peak RPS:      ~100-500 RPS
DB size/year:  ~100-500GB
Infra needed:  2-4 app servers, RDS r5.large, Redis t3.medium, CDN
Monthly cost:  $300-800

100K DAU

Avg RPS:       ~100-500 RPS
Peak RPS:      ~1K-5K RPS
DB size/year:  ~1-5TB
Infra needed:  ASG (5-10 app servers), RDS r5.xlarge + 2 replicas, Redis cluster, CDN, ALB
Monthly cost:  $2K-8K

1M DAU

Avg RPS:       ~1K-5K RPS
Peak RPS:      ~10K-50K RPS
DB size/year:  ~10-50TB
Infra needed:  ASG (20-50 servers), DB sharding or Aurora, Redis cluster, Kafka, CDN, WAF
Monthly cost:  $20K-80K

10M DAU

Avg RPS:       ~10K-50K RPS
Peak RPS:      ~100K-500K RPS
DB size/year:  ~100-500TB
Infra needed:  Multi-region, microservices, distributed DB (Cassandra/CockroachDB), full CDN, dedicated SRE
Monthly cost:  $200K-2M+

Common SLO Targets

Tier Availability Monthly Downtime Allowed
99% Basic 7.2 hours/month
99.9% (three nines) Standard production 43.8 minutes/month
99.95% Important services 21.9 minutes/month
99.99% (four nines) Critical services 4.38 minutes/month
99.999% (five nines) Telecom / payments 26 seconds/month

Achieving four nines requires: Multi-AZ deployment, automated failover, zero-downtime deploys, chaos engineering, 24/7 on-call.


Latency Budget Guidelines

User perceived latency targets:
  < 100ms  → Feels instant
  100-300ms → Acceptable for most interactions
  300ms-1s → Noticeable; optimize if possible
  > 1s     → Frustrating; unacceptable for critical paths

Network latency by distance (approximate):
  Same datacenter:    0.5ms
  Same region (AZ):   1-2ms
  Cross-region US:    30-60ms
  US to Europe:       80-120ms
  US to Asia:         150-250ms

Database query targets:
  Simple key-value:   < 1ms (cache)
  Simple DB query:    < 5ms
  Complex query:      < 50ms
  Reporting query:    < 500ms (async if > 1s)

Limitations

  • This is a reference document and may not cover all edge cases. Always verify architectures before production.

Discussion

Questions & comments · 0

Sign In Sign in to leave a comment.