Notes: The Good Parts of AWS
Please buy the book by Daniel Vassallo; these are just my notes.
"Features that have passed the test of time by being at the backbone of most things on the internet."
DynamoDB
Serverless, NoSQL, fully managed database with single-digit millisecond latency.
- A highly durable B-tree data structure in the cloud.
- Similar to Redis but immediately consistent.
- No ability to aggregate or run complex queries on the database side; that work has to happen at the application level.
- 1TB in Dynamo costs around $256 / month.
- 1TB in S3 costs around $23.55 / month.
- Start with on-demand pricing and switch to provisioned once you can predict capacity. Provisioned capacity assumes uniform access patterns, so include headroom.
- Use global secondary indexes; local secondary indexes are old school and restrictive.
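Since aggregation cannot happen on the database side, totals and group-bys end up in application code. A minimal sketch, assuming the items have already been fetched and arrive as plain dicts; the `customer` and `amount` attribute names and the sample orders are hypothetical, and `Decimal` is used because that is how boto3 hands back DynamoDB numbers:

```python
from collections import defaultdict
from decimal import Decimal


def total_by_customer(items):
    """Sum the 'amount' attribute per 'customer' in application code."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for item in items:
        totals[item["customer"]] += item["amount"]
    return dict(totals)


# Hypothetical items, shaped like the result of a query that was already run.
orders = [
    {"customer": "alice", "amount": Decimal("12.50")},
    {"customer": "bob", "amount": Decimal("3.00")},
    {"customer": "alice", "amount": Decimal("7.25")},
]

totals = total_by_customer(orders)
print(totals["alice"])  # 19.75
```

Every item being aggregated has to be read first, so this kind of work is paid for in read capacity as well as application time.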
S3 (Simple Storage Service)
Object storage.
- Infinite bandwidth.
- Infinite storage.
- Zero capacity management.
- Think of it as a giant hash table / key-value store.
- Start with S3 as a database if you can.
- Key can be any string up to 1 KB.
- Value any blob from 1 byte to 5 TB.
- 14ms average read latency, same region.
- 42ms average write latency (overwrites), same region.
- Scale based on key prefix.
- Streams at a rate of 90 MB/s.
- Unlimited parallel uploads and downloads.
- Reduced redundancy is legacy, don't use it.
- Use standard storage class.
- Access costs are fine at human request rates; at machine request rates they can get expensive.
- Can't append to objects, you have to periodically flush chunks to S3.
- Can stream to SQS or Kinesis Streams to use as a durable buffer.
- Not so great for static sites: the website endpoints have no HTTPS support, though you can put CloudFront in front.
- Bucket names have to be globally unique, your name could be taken.
- Check the bucket owner before interacting with your bucket, especially uploading.
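Because objects cannot be appended to, log-style data is usually buffered and flushed as a series of separate objects. A minimal sketch of that idea, not a definitive implementation: `ChunkedWriter`, its `put_object` callable, and the `chunk-NNNNNNNN` key scheme are all hypothetical names, and a dict stands in for the bucket so the example runs without AWS:

```python
class ChunkedWriter:
    """Buffer appended records in memory and flush each full chunk as a new object."""

    def __init__(self, put_object, prefix, max_bytes=5 * 1024 * 1024):
        self._put_object = put_object  # hypothetical callable: (key, body_bytes) -> None
        self._prefix = prefix
        self._max_bytes = max_bytes
        self._buffer = bytearray()
        self._sequence = 0

    def append(self, record: bytes):
        self._buffer += record
        if len(self._buffer) >= self._max_bytes:
            self.flush()

    def flush(self):
        if not self._buffer:
            return  # nothing buffered, so no empty object is written
        key = f"{self._prefix}/chunk-{self._sequence:08d}"
        self._put_object(key, bytes(self._buffer))
        self._buffer.clear()
        self._sequence += 1


# Stand-in for S3: a dict keyed by object key.
fake_bucket = {}
writer = ChunkedWriter(fake_bucket.__setitem__, prefix="logs", max_bytes=10)
for line in [b"hello\n", b"world\n", b"again\n"]:
    writer.append(line)
writer.flush()  # flush the partial tail chunk
print(sorted(fake_bucket))
```

Anything still in the in-memory buffer is lost if the process dies, which is where a durable buffer such as SQS or Kinesis comes in.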
EC2 (Elastic Compute Cloud)
Rent VM instances.
- Per second billing.
- Use savings plans instead of reserved instances.
- Use on-demand until you need to optimise costs.
- A security group is an individual firewall per instance.
- A VPC network ACL is the network-level firewall.
AWS Auto Scaling
Monitors your applications and automatically adjusts capacity.
- No real use for the auto part of Auto Scaling unless your access patterns are predictable and grow gradually rather than arriving in bursts.
- If you can afford 30% headroom, you may as well stick with it rather than scale up and down.
- Decide how many instances you need, ensure you have plenty of headroom.
- Can auto replace unhealthy instances.
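Sizing a fixed fleet with headroom is simple arithmetic. A rough sketch, assuming "30% headroom" means 30% of total capacity stays unused at peak; the function name, the requests-per-second units, and the sample numbers are made up for illustration:

```python
def ceil_div(numerator, denominator):
    """Integer ceiling division, avoiding floating-point rounding surprises."""
    return -(-numerator // denominator)


def instances_needed(peak_rps, rps_per_instance, headroom_percent=30):
    """Fixed fleet size that keeps `headroom_percent` of capacity free at peak."""
    # Each instance may only be loaded to (100 - headroom)% of its capacity.
    usable_hundredths = rps_per_instance * (100 - headroom_percent)
    return ceil_div(peak_rps * 100, usable_hundredths)


print(instances_needed(peak_rps=700, rps_per_instance=100))  # 10
print(instances_needed(peak_rps=710, rps_per_instance=100))  # 11
```

With the fleet size fixed, the service is still useful for replacing unhealthy instances.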
VPC (Virtual Private Cloud)
A network firewall / subnet.
- Wraps your AWS resources ensuring networking between those resources is secure.
- You control what traffic is allowed in / out.
ELB (Elastic Load Balancer)
Proxy load balancer.
- 2 main variants, ALB (application), NLB (network).
- ALB is an HTTP reverse proxy, allows things like authentication, sticky sessions, routing based on HTTP headers etc.
- NLB is a network packet router.
- Runs inside a VPC the same as the application instance. VPC ensures network safety against other AWS clients.
- ALBs terminate SSL/TLS, whereas NLBs with TCP passthrough enabled allow end-to-end encryption between client and application.
- NLBs scale faster than ALBs, though they work at a lower network layer.
CloudFront
Global CDN.
- Similar to Cloudflare from the CDN aspect.
- Set up caching rules for how long content should be held in the CDN before hitting the origin again.
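The effect of a caching rule can be pictured as a time-to-live on each cached path. A toy sketch of that behaviour, not how CloudFront is implemented: `EdgeCache`, the `origin` function, and the fake clock are all hypothetical and exist only to make the hit/miss pattern visible:

```python
import time


class EdgeCache:
    """Toy CDN edge: serve from cache until the TTL expires, then refetch from origin."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch_from_origin  # hypothetical callable: path -> content
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # path -> (expires_at, content)

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]  # cache hit, origin not contacted
        content = self._fetch(path)
        self._entries[path] = (now + self._ttl, content)
        return content


origin_hits = []


def origin(path):
    origin_hits.append(path)
    return f"content of {path}"


fake_now = [0.0]  # controllable clock so the example needs no sleeping
cache = EdgeCache(origin, ttl_seconds=60, clock=lambda: fake_now[0])
cache.get("/index.html")  # miss: goes to origin
cache.get("/index.html")  # hit: served from cache
fake_now[0] = 61.0
cache.get("/index.html")  # TTL expired: goes to origin again
print(len(origin_hits))  # 2
```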
Lambda (Serverless)
Run code without thinking about servers or clusters.
- Great for what it is, a place to run a small bit of code that doesn't change often.
- Fitting a normal application into Lambda can be tough, and require compromises.
- Lambda functions are better as infrastructure, rather than application code.
- Useful for adding small bits of functionality to things like S3 (processing image conversions), CloudFront (adding request rewrites for A/B testing) etc.
- Cold starts and network performance may improve over time.
- Assumes each request is stateless, no natural persistence. Mixing with Dynamo for persistence could get pricey.
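A small, stateless function reacting to an S3 upload might look roughly like the sketch below. The bucket name, the `thumbnails/` output path, and the trimmed-down sample event are hypothetical, and the actual image conversion is left as a placeholder; only the `Records[].s3.bucket.name` and `Records[].s3.object.key` fields are taken from the S3 notification format:

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Stateless handler: everything it needs arrives in the event."""
    converted = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
        # Placeholder for the real work (e.g. downloading and converting the image).
        converted.append(f"s3://{bucket}/thumbnails/{key}")
    return {"converted": converted}


# Hypothetical, trimmed-down S3 notification event for a local dry run.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-photos"}, "object": {"key": "cats/tom+1.jpg"}}}
    ]
}
result = handler(sample_event, context=None)
print(result)
```

Nothing is kept between invocations; any state the function needs has to come in with the event or live in an external store.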
CloudFormation (CF)
Infrastructure as code.
- Similarish to Terraform, but AWS specific.
- Be careful when mixing CF with manual changes in the console: the live state shown in the console is the single source of truth, and templates can drift out of sync with it.
- Use CF for more static things like VPC config, Security groups, load balancers, pipelines, IAM roles.
Aurora DB
MySQL and PostgreSQL compatible relational database. Part of their RDS (Relational Database Service) family.
- SQL relational database.
- Fully managed and hosted.
- Has a serverless variant "Aurora Serverless".
Route 53
DNS Name Server.
- Basic DNS server.
- Integrates well with CF (above) and ELB.
SQS (Simple Queue Service)
Fully managed message queuing for microservices, distributed systems, and serverless applications.
- In-cloud queue, generally FIFO (first in, first out), but ordering is not strictly guaranteed.
- Guarantees at-least-once delivery.
- Duplicate messages can occur.
- Zero capacity management, similar to S3.
- No capacity limits, and no throttling limits in normal mode.
- Strict ordering and exactly-once delivery can be enabled (FIFO mode), though this creates a throughput limit of 300 messages / second.
- Requires polling to receive messages.
- A queue can only have one consumer: once a message has been processed and deleted, nobody else sees it. A basic buffer queue.
- Use with SNS (Simple Notification Service) to allow multiple consumers.
- Message payloads can contain up to 256 KB of text in any format.
- Each 64 KB chunk of payload is billed as 1 request.
- Can retain messages in queues for up to 14 days.
- If coming from RabbitMQ or similar, opt for Amazon MQ instead; it matches the existing interfaces.
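With at-least-once delivery, the consumer has to tolerate seeing the same message twice. One common way is to make processing idempotent by remembering message IDs. A minimal sketch under stated assumptions: `IdempotentConsumer` and the sample batch are hypothetical, the in-memory set would need to be durable storage in a real system, and the `MessageId` and `Body` keys mirror the fields of a received SQS message:

```python
class IdempotentConsumer:
    """Skip messages whose ID was already processed (at-least-once delivery)."""

    def __init__(self, process):
        self._process = process  # hypothetical callable: body -> None
        self._seen_ids = set()   # in production this would need durable storage

    def handle(self, message):
        message_id = message["MessageId"]
        if message_id in self._seen_ids:
            return False  # duplicate delivery, ignore
        self._process(message["Body"])
        self._seen_ids.add(message_id)
        return True


processed = []
consumer = IdempotentConsumer(processed.append)
# Hypothetical batch in which message "a1" is delivered twice.
batch = [
    {"MessageId": "a1", "Body": "charge order 17"},
    {"MessageId": "b2", "Body": "charge order 18"},
    {"MessageId": "a1", "Body": "charge order 17"},
]
results = [consumer.handle(m) for m in batch]
print(processed)  # ['charge order 17', 'charge order 18']
```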
SNS (Simple Notification Service)
Pub/Sub push notification service.
- A distributed publish-subscribe service.
- Can push messages to multiple consumers.
- Requires consumers to be online; otherwise the message is lost.
- Standard mode has almost infinite throughput.
- No ordering guarantee by default.
- FIFO mode ensures ordering, though restricted to 300 messages / second throughput, or 10MB per second throughput.
- Can push to different kinds of endpoints, SQS, Kinesis, Lambda, HTTPS (web hook), SMS, Mobile Push, and Email.
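The fan-out behaviour can be sketched with an in-memory stand-in. This is a toy model, not the SNS API: `Topic` and the queue names are hypothetical, and each `deque` plays the role of an SQS queue subscribed to the topic so that its consumer can read at its own pace:

```python
from collections import deque


class Topic:
    """Toy pub/sub topic: pushes each message to every current subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, deliver):
        self._subscribers.append(deliver)

    def publish(self, message):
        for deliver in self._subscribers:
            deliver(message)


topic = Topic()
billing_queue = deque()    # stands in for an SQS queue subscribed to the topic
analytics_queue = deque()  # a second queue, giving a second independent consumer
topic.subscribe(billing_queue.append)
topic.subscribe(analytics_queue.append)

topic.publish("order 17 created")

late_queue = deque()       # subscribes after the publish, so it misses the message
topic.subscribe(late_queue.append)

print(list(billing_queue), list(analytics_queue), list(late_queue))
```

Both queues receive their own copy of the message, while the subscriber that was not there at publish time gets nothing, which is why a queue is a useful endpoint for consumers that may be offline.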
Kinesis
Stream data such as event info and analytics for downstream processing.
- A giant linked list in the cloud. Guarantees ordering.
- Allows multiple consumers all reading from different points in the list.
- Potentially cheaper than SQS.
- 1 KB messages in SQS at an average rate of 500 messages per second will cost you $34.56 per day.
- A Kinesis stream with 50% capacity headroom can handle that same volume for just $0.96 per day.
- Append only; data is retained but expires after a retention period, usually 24 hours.
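The two daily figures above can be reproduced with some back-of-the-envelope arithmetic. The unit prices and limits below are assumptions, chosen because they reproduce the quoted numbers: $0.40 per million SQS requests with two billed requests per message, $0.015 per Kinesis shard-hour, $0.014 per million PUT payload units, and 1,000 KB per second of ingest per shard. Check current pricing before relying on them:

```python
import math

MESSAGES_PER_SECOND = 500
SECONDS_PER_DAY = 24 * 60 * 60
messages_per_day = MESSAGES_PER_SECOND * SECONDS_PER_DAY  # 43,200,000

# Assumed SQS pricing: $0.40 per million requests, two billed requests per message.
SQS_PRICE_PER_MILLION = 0.40
SQS_REQUESTS_PER_MESSAGE = 2
sqs_daily = messages_per_day * SQS_REQUESTS_PER_MESSAGE / 1_000_000 * SQS_PRICE_PER_MILLION

# Assumed Kinesis pricing and limits (see the note above the code).
SHARD_HOUR_PRICE = 0.015
PUT_PRICE_PER_MILLION = 0.014
SHARD_KB_PER_SECOND = 1000
HEADROOM = 0.5

# 500 messages of 1 KB each per second, with half of the capacity left free.
needed_kb_per_second = MESSAGES_PER_SECOND * 1 / (1 - HEADROOM)
shards = math.ceil(needed_kb_per_second / SHARD_KB_PER_SECOND)
kinesis_daily = (
    shards * 24 * SHARD_HOUR_PRICE
    + messages_per_day / 1_000_000 * PUT_PRICE_PER_MILLION
)

print(f"SQS: ${sqs_daily:.2f}/day, Kinesis: ${kinesis_daily:.2f}/day with {shards} shard(s)")
```

Under these assumptions a single shard covers the load, so the Kinesis cost is mostly the per-message PUT charge plus a small fixed shard fee.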