Scalability: an application / system can handle greater loads by adapting.
There are two kinds of scalability:
- Vertical Scalability (= scale up / down)
  - increasing the size of the instance, like moving to a better laptop, or from t2.micro to t2.large
- Horizontal Scalability (= elasticity) (= scale out / in)
  - increasing the number of instances or systems for your application. This implies distributed systems
  - Auto Scaling Group
  - Load Balancer
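
To make scale out / in concrete, here is a minimal Python sketch of the arithmetic behind horizontal scaling. The function name, the requests-per-second figures, and the `minimum` parameter are assumptions made up for illustration; this is not how an Auto Scaling Group is actually implemented.

```python
import math


def instances_needed(total_rps: float, rps_per_instance: float, minimum: int = 1) -> int:
    """Return how many identical instances are needed to serve total_rps.

    Hypothetical helper: assumes every instance handles the same load
    and that load is spread evenly between instances.
    """
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    # Round up: a fraction of an instance still requires a whole instance.
    return max(minimum, math.ceil(total_rps / rps_per_instance))


if __name__ == "__main__":
    print(instances_needed(950, 200))   # 5 -> scale out to 5 instances
    print(instances_needed(300, 200))   # 2 -> scale in when load drops
```

Vertical scaling would instead raise `rps_per_instance` by moving to a bigger instance type, while the instance count stays the same.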
Availability (= run instances for the same app across multiple AZs)
- High availability usually goes hand in hand with horizontal scaling
- High availability means running your application / system in at least 2 data centers (== Availability Zones)
- The goal of high availability is to survive a data center loss
- High availability can be passive (RDS Multi-AZ, for example)
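
The "at least 2 data centers" rule can be sketched in Python as below. The helper names are hypothetical and the zone names are only example values; the sketch just shows why spreading instances across zones lets the application survive the loss of one of them.

```python
def spread_across_zones(instance_count: int, zones: list[str]) -> list[str]:
    """Assign each instance to a zone in turn (hypothetical helper)."""
    return [zones[i % len(zones)] for i in range(instance_count)]


def survives_zone_loss(placement: list[str]) -> bool:
    """True if at least one instance is left after any single zone fails.

    That holds exactly when instances run in two or more distinct zones.
    """
    return len(set(placement)) >= 2


if __name__ == "__main__":
    zones = ["us-east-1a", "us-east-1b"]  # example zone names
    placement = spread_across_zones(3, zones)
    print(placement)                       # ['us-east-1a', 'us-east-1b', 'us-east-1a']
    print(survives_zone_loss(placement))   # True
    print(survives_zone_loss(["us-east-1a"]))  # False: one zone only
```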
Load balancing:
Load balancers are servers that forward traffic to multiple downstream servers (e.g. EC2 instances)
Why use a load balancer?
- Spread load across multiple downstream instances
- Expose a single point of access (DNS) to your application
- Seamlessly handle failures of downstream instances
- Do regular health checks on your instances
- Provide SSL termination (HTTPS) for your websites
- Enforce stickiness with cookies
- High availability across zones
- Separate public traffic from private traffic
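
As a rough illustration of "spread load" and "seamlessly handle failures", here is a small Python sketch of a round-robin balancer that skips instances marked unhealthy. Round robin is just one simple strategy chosen for the example; the class and method names are made up, and this is not a description of how any real load balancer works internally.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotate through backends, skipping unhealthy ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)  # all backends start healthy
        self._next = 0                     # position of the next candidate

    def mark(self, backend, healthy: bool) -> None:
        """Record the result of a health check for one backend."""
        if healthy:
            self.healthy.add(backend)
        else:
            self.healthy.discard(backend)

    def next_backend(self):
        """Return the next healthy backend in rotation."""
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next % len(self.backends)]
            self._next += 1
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["i-a", "i-b", "i-c"])  # hypothetical instance names
    print([lb.next_backend() for _ in range(4)])    # ['i-a', 'i-b', 'i-c', 'i-a']
    lb.mark("i-b", healthy=False)
    print([lb.next_backend() for _ in range(3)])    # ['i-c', 'i-a', 'i-c']
```

Clients only ever talk to the balancer, which is the "single point of access" idea: backends can fail or be added without clients noticing.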
Elastic Load Balancer
- a managed load balancer
- cheaper than creating your own load balancer, and integrates with many AWS offerings
Health Checks
- Crucial for load balancers
- They enable the load balancer to know if instances it forwards traffic to are available to reply to requests
- The health check is done on a port and a route (/health is common)
- If the response is not 200 (OK), then the instance is unhealthy
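
The rule above (port + route, anything other than 200 is unhealthy) can be sketched in Python with the standard library. The function name and the timeout default are assumptions for the example; a real load balancer also applies settings such as check intervals and failure thresholds, which are left out here.

```python
import http.client
import urllib.request


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True only if a GET on url answers with HTTP 200.

    Hypothetical helper: timeouts, refused connections, and error
    statuses such as 500 or 503 all count as unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # urllib raises HTTPError/URLError (both OSError subclasses) for
        # error statuses and connection problems; timeouts are OSError too.
        return False
```

A call might look like `is_healthy("http://10.0.1.12:8080/health")`, where the address and port are placeholders for one downstream instance. The result could then be used to decide whether traffic should still be forwarded to that instance.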