AWS’ Jeff Barr announced yesterday that their S3 service:
holds more than 449 billion objects and processes up to 290,000 requests per second for them at peak times
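Just for a sense of scale, here's a back-of-envelope sketch (my own arithmetic, using only the two figures quoted above): even at the peak request rate, it would take weeks to touch every stored object once.

```python
# Back-of-envelope scale check using the figures from Jeff Barr's announcement.
objects = 449e9    # objects held in S3
peak_rps = 290e3   # peak requests per second

seconds_per_pass = objects / peak_rps   # seconds to issue one request per object
days_per_pass = seconds_per_pass / 86400

print(f"{days_per_pass:.1f} days to touch every object once at peak rate")
# roughly 18 days
```

In other words, a full sweep of the store at peak throughput is an eighteen-day job, which hints at how little of the corpus is actually "hot" at any moment.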
I’m a very happy user of S3 and much of the rest of AWS, but seeing figures like this forces me into risk-assessment mode: at what point does S3 (or similar services) break?
What I’m wondering is: where do S3 and its peers start hitting fundamental limits that amplify problems from “oops” to “ohhh shit”?
S3 is presumably the largest system of its kind ever built; it’s not clear to me that anyone really knows what its failure modes, thresholds, or weak links might be as it continues to grow. Anything from a breakdown in the Dynamo architecture to hard infrastructure limits to failures in operations and management strikes me as plausible. What will we see first: data loss, rising latency, repeated catastrophic outages, or “other”?