Episode 59: Elastic Load Balancing

Elastic Load Balancing, often shortened to ELB, is a foundational service in AWS that distributes incoming application traffic across multiple targets. Its purpose is threefold: to balance workloads so no single server is overwhelmed, to add resilience by routing around failures, and to simplify operations by handling traffic management at scale. Beginners should think of ELB as a skilled traffic cop at a busy intersection: rather than letting one lane clog while others sit empty, it directs vehicles smoothly, ensuring everyone gets through efficiently. In AWS, this distribution is not just about convenience — it directly contributes to fault tolerance, availability, and user experience.
AWS offers three main types of load balancers. The Application Load Balancer (ALB) operates at Layer 7, the application layer, and can make routing decisions based on content such as HTTP headers, paths, or hostnames. The Network Load Balancer (NLB) functions at Layer 4, focusing on TCP and UDP traffic, with ultra-low latency and high throughput. The Gateway Load Balancer (GWLB) is designed to insert third-party appliances, such as firewalls or intrusion detection systems, transparently into traffic flows. Beginners should imagine ALB as a receptionist who listens to the request and decides where to send it, NLB as a lightning-fast turnstile that checks tickets, and GWLB as a security checkpoint that every visitor must pass through for inspection.
Listeners and rules define how traffic enters and flows through a load balancer. A listener checks for connection requests on specific ports, like port 80 for HTTP or port 443 for HTTPS. Rules then determine what to do with the request — for example, routing all /images paths to one target group and /api to another. Beginners should picture this as an office building with different reception desks: one desk accepts visitors for customer service, another for finance, each filtering traffic to the right department. This flexibility is what allows ELB, especially ALB, to intelligently manage modern applications with multiple services behind the same entry point.
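If we sketched that first-match rule logic in code, it might look like the snippet below. This is an illustrative model of path-based routing, not the AWS API; the rule list, target group names, and the `route` function are all hypothetical.

```python
# Hypothetical sketch of ALB-style path rules choosing a target group.
# Rules are evaluated in priority order; the first match wins, and a
# default action catches anything that falls through.

RULES = [
    ("/images", "image-servers"),  # path prefix -> target group name
    ("/api", "api-servers"),
]
DEFAULT_TARGET_GROUP = "web-servers"

def route(path: str) -> str:
    """Return the target group for a request path, first match wins."""
    for prefix, target_group in RULES:
        if path.startswith(prefix):
            return target_group
    return DEFAULT_TARGET_GROUP
```

Notice that ordering matters: a broad prefix placed first would shadow more specific rules, which is exactly why ALB rules carry explicit priorities.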
Target groups are the destinations where load balancers send requests. Targets can be EC2 instances, IP addresses, or even AWS Lambda functions. Each group is referenced by one or more listener rules, and traffic is distributed among its members based on health and capacity. Beginners should think of target groups as teams of workers: the load balancer assigns each incoming task to a healthy, available worker. This abstraction simplifies scaling because new workers can be added to or removed from groups without changing how traffic enters the system.
Health checks ensure only functional targets receive traffic. An ELB periodically sends requests to each target and evaluates responses based on success codes, intervals, and thresholds. If a target fails checks, it is marked unhealthy and removed from rotation until it recovers. Beginners should imagine this as a manager checking in on employees: only those who respond correctly are given new assignments. Properly tuned health checks prevent downtime from spreading, as unhealthy instances are isolated automatically, preserving service availability.
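The consecutive-threshold behavior described above can be sketched as a small state machine. The class name and defaults here are illustrative, though the thresholds mirror how ELB health checks are configured: a target must fail several checks in a row to be removed, and pass several in a row to return.

```python
# Sketch of threshold-based health evaluation. A healthy target must
# fail `unhealthy_threshold` consecutive checks before removal, and an
# unhealthy one must pass `healthy_threshold` consecutive checks to
# rejoin rotation. Names are illustrative, not the AWS API.

class TargetHealth:
    def __init__(self, healthy_threshold=3, unhealthy_threshold=2):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = True
        self.streak = 0  # consecutive results contradicting current state

    def record(self, check_passed: bool) -> bool:
        """Record one health check result; return current health state."""
        if check_passed == self.healthy:
            self.streak = 0  # result agrees with current state
        else:
            self.streak += 1
            needed = (self.healthy_threshold if not self.healthy
                      else self.unhealthy_threshold)
            if self.streak >= needed:
                self.healthy = not self.healthy  # flip state
                self.streak = 0
        return self.healthy
```

The streak requirement is what prevents a single transient timeout from yanking a perfectly good instance out of service.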
Cross-zone load balancing is another critical feature, and its default depends on the load balancer type: ALBs enable it by default, while NLBs do not. Without it, a load balancer may distribute requests unevenly if one Availability Zone has more targets than another, because each zone's node splits its share only among local targets. Enabling cross-zone balancing ensures traffic is spread evenly across all registered targets in every zone, improving efficiency. The trade-off is increased inter-AZ data transfer, which may add costs. Beginners should picture this as distributing customers across all cash registers in a store rather than just the registers nearest the entrance. Balanced queues improve performance, but they may require extra coordination.
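To make the uneven-distribution problem concrete, here is a small sketch with invented zone and target names. With cross-zone off, each zone's node first receives an equal share of traffic and then splits it among only its own targets, so a lone target in a small zone gets overloaded.

```python
# Sketch comparing per-target traffic share with cross-zone balancing
# on vs off. Zone and target names are made up for illustration.

ZONES = {
    "us-east-1a": ["a1", "a2", "a3"],  # three targets in this zone
    "us-east-1b": ["b1"],              # only one target here
}

def share_per_target(cross_zone: bool) -> dict:
    """Fraction of total traffic each target receives."""
    shares = {}
    if cross_zone:
        all_targets = [t for zone in ZONES.values() for t in zone]
        for t in all_targets:
            shares[t] = 1 / len(all_targets)  # even across every target
    else:
        for zone_targets in ZONES.values():
            # Each zone gets an equal slice, divided among local targets.
            for t in zone_targets:
                shares[t] = (1 / len(ZONES)) / len(zone_targets)
    return shares
```

With cross-zone off, the single target in us-east-1b carries half of all traffic while each target in the larger zone carries only a sixth; with it on, every target carries a quarter.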
Stickiness, or session affinity, ties a client to a specific target for the duration of a session. This is often achieved through cookies. It can be useful for applications that store session data locally rather than in a shared cache. However, stickiness reduces distribution efficiency and may overload certain targets. Beginners should compare this to a customer always visiting the same cashier, even if other registers are free. It feels consistent but can cause bottlenecks. For most modern applications, stickiness is avoided unless absolutely necessary.
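A rough sketch of cookie-based affinity looks like this. The round-robin pool and the idea of storing the chosen target in the cookie are simplifications for illustration; real ALB stickiness uses an opaque, expiring cookie rather than a plain target name.

```python
# Sketch of cookie-based session affinity: a new client is assigned a
# target round-robin; a returning client's cookie pins it to the same
# target even if others are free. Names are illustrative.

import itertools

TARGETS = ["t1", "t2", "t3"]
_round_robin = itertools.cycle(TARGETS)

def handle(cookie=None):
    """Return (target, cookie) for a request; sticky clients keep theirs."""
    if cookie in TARGETS:
        return cookie, cookie        # honor the existing affinity
    target = next(_round_robin)      # new client: round-robin assignment
    return target, target            # cookie remembers the chosen target
```

The bottleneck risk is visible here: every repeat request with the same cookie lands on the same target regardless of how busy it is.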
TLS termination is another vital ELB capability. Instead of each target handling encryption and decryption, the load balancer can terminate TLS at the edge. Certificates are managed through AWS Certificate Manager, simplifying operations. Beginners should imagine this as a security guard at the front of a building checking every visitor’s credentials. Once verified, visitors can move freely inside without repeating the process at every door. TLS termination offloads work from targets, reducing overhead and centralizing encryption management.
Idle timeout and header size are subtle but important considerations. ELB enforces defaults for how long idle connections remain open and how large request headers can be. These settings influence application behavior, especially for long-lived connections or large cookies. Beginners should see this as rules in a waiting room: no one may sit indefinitely, and luggage must fit certain dimensions. Adjusting these values ensures compatibility with applications that have specific requirements.
Access logs and request tracing provide deep visibility into ELB behavior. Logs can capture every request, recording details such as client IPs, latencies, and response codes. Tracing with tools like AWS X-Ray extends this visibility across distributed systems, helping diagnose bottlenecks. Beginners should imagine a guestbook at the entrance of a building where every visitor signs in with their arrival time and purpose. Without these logs, understanding traffic patterns or resolving problems becomes guesswork.
Application Load Balancers also support advanced features such as WebSocket connections, HTTP/2, and customizable request headers. These enable modern, real-time applications and microservices architectures to function seamlessly. Beginners should view these as upgrades to the building’s infrastructure — adding intercoms, fast elevators, and new mail systems — that make interactions smoother. ALB is designed for flexibility in content-based routing and modern protocols, making it a cornerstone for web-facing workloads.
Network Load Balancers bring their own advanced capabilities. They provide static IP addresses for consistent entry points and support TLS pass-through, where encrypted traffic is forwarded directly to targets without termination. This is essential for applications requiring end-to-end encryption. Beginners should compare this to a sealed envelope that passes through the mailroom unopened. The NLB doesn’t inspect contents but ensures secure, rapid delivery to the right recipient. This focus on speed and transparency makes NLB ideal for performance-critical applications.
Gateway Load Balancers add transparency for security appliances. They allow third-party firewalls, intrusion prevention systems, and monitoring tools to be inserted without requiring application reconfiguration. For learners, think of GWLB as a hidden checkpoint on a highway: every car passes through, but drivers don’t need to alter their route. This invisibility and integration make GWLB especially useful in enterprise networks where compliance and inspection tools must sit inline.
Security group behavior varies by load balancer. ALBs can be associated with security groups, which act like firewalls controlling inbound and outbound traffic. NLBs historically did not use security groups, so security had to be managed at the target level; AWS has since added optional security group support for newly created NLBs, but many existing deployments still rely on target-level controls. Beginners should imagine ALBs as gates with guards checking IDs at the entrance, while traditional NLBs are more like open highways where guards are stationed only at each destination. Understanding this difference prevents misconfigurations that can lead to exposure or blocked access.
Finally, load balancers integrate with AWS WAF for protection. WAF can be attached directly to an ALB or deployed at the CloudFront edge for global coverage. Choosing placement depends on whether inspection should occur regionally or globally. Beginners should picture this as setting up metal detectors at the building entrance (ALB) versus installing them at every airport terminal worldwide (CloudFront). On the exam, expect questions asking which ELB type or WAF placement fits the scenario, testing your ability to map features to use cases.
For more cyber-related content and books, please check out cyber author dot me. Also, there are other prepcasts on cybersecurity and more at Bare Metal Cyber dot com.
High availability is one of the core promises of Elastic Load Balancing, and it is achieved by spreading targets across multiple Availability Zones. By combining multi-AZ deployments with health checks, an ELB ensures that if one zone experiences failures, traffic is automatically rerouted to healthy targets elsewhere. This avoids single points of failure and strengthens resilience. Beginners should imagine a chain of restaurants across a city: if one branch closes due to a power outage, customers are smoothly redirected to the nearest open branch. The design is seamless to the customer, and service continuity is maintained.
ELBs also play an important role in supporting blue/green deployments. By associating different versions of an application with distinct target groups, you can shift traffic from the old environment (blue) to the new one (green) with minimal disruption. Traffic can be cut over instantly or shifted gradually for canary testing. Beginners should think of this as opening a new restaurant location while the old one still operates. Once the new location proves reliable, customers are directed there full-time. This controlled transition minimizes risk during application upgrades.
Weighted routing can extend ELB functionality when paired with Route 53. In this setup, traffic can be divided between multiple load balancers or Regions according to defined percentages. This enables phased rollouts or geographic balancing strategies. Beginners should compare this to a shipping company deciding how much cargo goes through each port: 80 percent may go to the busiest port, while 20 percent diverts to a secondary port for testing or relief. Weighted routing provides flexibility for distribution strategies at a global scale.
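The 80/20 split above can be modeled in a few lines. This sketch uses deterministic bucketing so the split is exact and easy to verify; real Route 53 weighted answers are probabilistic, and the endpoint names are invented.

```python
# Sketch of weighted routing between two hypothetical load balancers.
# Each request's sequence number is mapped into weight-sized buckets,
# producing an exact 80/20 split over any full cycle.

WEIGHTS = {"primary-alb": 80, "canary-alb": 20}

def pick(sequence_number: int) -> str:
    """Map a request number onto an endpoint in proportion to weight."""
    total = sum(WEIGHTS.values())
    bucket = sequence_number % total
    for endpoint, weight in WEIGHTS.items():
        if bucket < weight:
            return endpoint
        bucket -= weight
    raise AssertionError("unreachable: buckets cover the full range")
```

Shifting the weights gradually, say from 80/20 toward 0/100, is exactly how a phased rollout or canary release is executed at the DNS layer.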
Elastic Load Balancing integrates tightly with Auto Scaling, creating dynamic environments that grow or shrink with demand. When new instances are launched by an Auto Scaling Group, they automatically register with the load balancer. Health checks further ensure unhealthy instances are removed until they recover. Lifecycle hooks can even pause scaling events for initialization tasks before targets join rotation. Beginners should think of this as hiring seasonal workers at a store: new employees are trained and then added to the cashier pool, while anyone unable to perform is temporarily sidelined. The system remains responsive without manual oversight.
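The register/sideline/deregister lifecycle can be sketched as a simple pool. The class and instance IDs are hypothetical; the point is that only instances that are both registered and passing health checks receive traffic.

```python
# Sketch of the target lifecycle behind Auto Scaling integration:
# launched instances register, failing ones are sidelined, and
# terminated ones deregister. Names are illustrative, not the AWS API.

class TargetPool:
    def __init__(self):
        self.registered = set()
        self.unhealthy = set()

    def register(self, instance_id):
        self.registered.add(instance_id)      # scale-out: instance joins

    def deregister(self, instance_id):
        self.registered.discard(instance_id)  # scale-in: instance leaves
        self.unhealthy.discard(instance_id)

    def mark_unhealthy(self, instance_id):
        self.unhealthy.add(instance_id)       # failed health checks

    def in_rotation(self):
        """Only registered AND healthy instances receive traffic."""
        return self.registered - self.unhealthy
```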
ELBs can also operate privately, not just for internet-facing traffic. Internal load balancers are deployed within a VPC to manage access between private services. For example, a backend microservice may only be reachable via an internal ALB. Beginners should view this as hallways inside a corporate office: while customers enter through the main lobby, internal teams connect through secure, private corridors. Internal load balancers reinforce segmentation and security, supporting architectures that avoid unnecessary exposure to the internet.
Hybrid architectures, where on-premises networks connect to AWS through VPN or Direct Connect, also benefit from ELBs. Private load balancers can serve traffic coming from data centers into AWS, creating seamless pathways between cloud and local workloads. Beginners should picture this as a private bridge connecting a factory to a regional warehouse: all shipments funnel through controlled checkpoints that balance and secure the flow. ELBs provide consistent entry points, regardless of whether the client sits in AWS or on-premises.
Observability is critical for managing ELBs. Metrics such as request counts, latency, error codes, and consumed Load Balancer Capacity Units (LCUs) are visible in CloudWatch. Logs provide deeper context for troubleshooting. Beginners should think of this as a dashboard in a car: the speedometer, fuel gauge, and warning lights provide quick insights into system health. Without these signals, diagnosing whether slow performance stems from the load balancer, the targets, or the network becomes guesswork. Observability closes that gap.
Cost awareness with ELB requires understanding LCUs, which measure different dimensions of usage: new connections, active connections, processed bytes, and rules. The highest of these dimensions drives billing. Beginners should compare this to a cell phone bill with multiple factors like minutes, texts, and data, where the most expensive category dominates the total. Designing with LCU optimization in mind — such as reusing connections or limiting unnecessary rules — keeps costs under control while maintaining performance.
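The "highest dimension drives billing" rule is easy to express in code. The allowance figures below reflect commonly published ALB values per LCU, but treat them as illustrative and check current AWS pricing before relying on them.

```python
# Sketch of ALB LCU math: each usage dimension is divided by its
# per-LCU allowance, and the largest ratio is the LCUs billed for the
# hour. Allowance values are illustrative; verify against AWS pricing.

LCU_ALLOWANCES = {
    "new_connections_per_sec": 25,
    "active_connections": 3000,
    "processed_gb_per_hour": 1,
    "rule_evaluations_per_sec": 1000,
}

def billed_lcus(usage: dict) -> float:
    """Return LCUs for the hour: the maximum dimension dominates."""
    return max(usage[key] / LCU_ALLOWANCES[key] for key in LCU_ALLOWANCES)
```

In the example below, doubling the new-connection allowance makes that dimension dominate, so the hour bills at 2 LCUs even though every other dimension is at or under its allowance.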
Performance tuning sometimes requires adjusting parameters like keep-alive settings, header size, or idle timeouts. For example, long-lived WebSocket connections require higher idle timeout values, while excessive header size can break requests. Beginners should see this as customizing tools for a job: a wrench must be adjusted to fit the correct bolt size. Fine-tuning load balancer parameters ensures applications perform consistently and avoid unexpected errors in production.
Security posture is reinforced through ELBs by managing TLS policies and cipher choices. Administrators can enforce modern encryption standards, deprecating weaker ciphers to meet compliance frameworks. Certificates managed through AWS Certificate Manager integrate directly into ALBs and NLBs. Beginners should think of this as a building requiring up-to-date locks — older keys may exist, but they’re too weak to resist break-ins. Enforcing strong TLS policies ensures communication remains encrypted and trustworthy, aligning technical controls with compliance needs.
Multi-Region deployment patterns extend ELB’s reach. Failover patterns rely on Route 53 health checks to redirect traffic to a secondary Region if the primary goes offline. Active/active models distribute traffic across Regions continuously for performance and redundancy. Beginners should compare this to global retail chains: some keep a backup store ready only in emergencies, while others operate multiple stores simultaneously to serve customers faster. Both approaches enhance resilience, but the right choice depends on application requirements.
Common pitfalls with ELBs often involve misconfigured health checks or incorrect security groups. If health checks target the wrong path or port, healthy instances may appear unavailable. Similarly, overly strict security group rules can block legitimate traffic. Beginners should picture this as a fire drill where the manager knocks on the wrong office door, assuming it’s empty, or guards refusing entry to employees because badges weren’t configured. On the exam, scenarios often hinge on spotting these misconfigurations as the root cause of outages.
Migrating from Classic Load Balancers to modern ALBs or NLBs is another theme. Classic Load Balancers predate features like host-based routing or enhanced observability and are now considered legacy. AWS recommends shifting to ALB or NLB depending on whether Layer 7 or Layer 4 features are required. Beginners should imagine upgrading from an old flip phone to a modern smartphone: while the old device still works, newer ones provide more capabilities, efficiency, and integration. Migration ensures organizations benefit from ongoing AWS innovations.
Documenting standard load balancer patterns helps organizations avoid reinventing the wheel. Many companies maintain catalogs of ELB configurations for common use cases, such as internet-facing web apps, private microservices, or blue/green deployments. This documentation provides clarity and accelerates future projects. Beginners should think of this as having a cookbook with proven recipes: instead of guessing, teams follow tried-and-true instructions that already account for best practices. Documentation transforms one-off designs into repeatable enterprise patterns.
From an exam perspective, the skill lies in matching symptoms to the correct load balancer type. If the scenario describes HTTP header routing, the answer is ALB. If it emphasizes high-performance TCP with static IPs, it points to NLB. If transparent firewall insertion is needed, GWLB is correct. Beginners should train themselves to recognize these cues quickly, as exam questions often hide the family name and instead describe the capabilities. By focusing on features and use cases, you’ll select the right load balancer even when acronyms aren’t given.
In conclusion, Elastic Load Balancing is more than a traffic router. It is a resilience tool, a scaling enabler, a security enforcer, and an operations simplifier. By decoupling traffic from individual instances, it ensures uptime and flexibility in evolving environments. For learners, the key message is that ELBs are not just optional add-ons but central pillars of modern AWS architectures. Whether serving web traffic, securing connections, or enabling safe deployments, ELB stands as the front door of cloud applications, keeping them reliable, scalable, and secure.
