S3 scales to petabytes a second on top of slow HDDs

Source: Hacker News

Published:
TL;DR (AI Generated)

Amazon Web Services' S3 storage service operates at massive scale, handling peaks of 1 petabyte per second of traffic and 150 million requests per second on commodity hard drives. Although HDDs have far higher random-access latency than SSDs because of their mechanical seek times, S3 exploits their low cost per byte and strong sequential throughput. Through techniques like erasure coding and massive parallelism across many drives, S3 achieves high aggregate throughput and tolerable latency even under random access patterns. To distribute data effectively and avoid hot spots, the system employs load-balancing strategies such as shuffle sharding and the Power of Two Random Choices. As S3 continues to grow, it focuses on rebalancing data, optimizing parallelism, and benefiting from workload decorrelation across tenants to enhance performance and scalability.
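The Power of Two Random Choices mentioned above is a well-known load-balancing technique: rather than scanning every target to find the least loaded one, the balancer samples just two at random and picks the less loaded of the pair, which already eliminates most of the imbalance of purely random placement. A minimal sketch in Python, illustrating the idea only; the shard model and function names here are hypothetical, not S3's actual implementation:

```python
import random


def power_of_two_choices(loads, rng):
    """Pick a shard index by sampling two distinct shards at random
    and choosing the one with the smaller current load."""
    a, b = rng.sample(range(len(loads)), 2)
    return a if loads[a] <= loads[b] else b


def simulate(n_shards, n_requests, seed=0):
    """Place n_requests onto n_shards using two-choice balancing
    and return the resulting per-shard load counts."""
    rng = random.Random(seed)
    loads = [0] * n_shards
    for _ in range(n_requests):
        i = power_of_two_choices(loads, rng)
        loads[i] += 1
    return loads
```

With two choices, the gap between the most and least loaded shard stays on the order of log log n, compared to a much wider spread for single random placement; that is what makes the technique attractive at S3-like request rates.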