Episode 77: Security Groups vs. NACLs

In Amazon VPCs, two layers of network controls help manage traffic: Security Groups and Network Access Control Lists (NACLs). While both filter and regulate communications, they operate at different scopes and with different philosophies. Security Groups act like firewalls attached to individual resources, providing granular, instance-level control, while NACLs serve as subnet-level gatekeepers that define broader, subnet-wide traffic policies. Understanding when to apply each is crucial to designing secure, predictable environments. Together, they form a layered defense model: fine-grained rules for specific resources, combined with coarse filters to protect entire sections of the network. By blending both, administrators ensure traffic flows where it should, while stopping unintended or malicious access at the right layer.
Security Groups, often abbreviated as SGs, function as virtual firewalls for Elastic Network Interfaces (ENIs) and, by extension, EC2 instances or other attached services. They are stateful, meaning if an inbound request is allowed, the corresponding outbound response is automatically permitted. This makes configuration simpler, since administrators don’t have to define both directions of a connection. For example, if a rule allows inbound HTTPS traffic, the return packets flow back without explicit configuration. This contrasts with stateless controls, which require explicit return paths. Security Groups thus reduce administrative complexity while still providing strong access control. Their stateful nature aligns well with most application traffic, where predictable flows need easy return handling.
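The stateful behavior described here can be pictured with a toy connection tracker. This is a minimal sketch for intuition only, not how AWS implements Security Groups; the class name `StatefulFirewall`, its methods, and the sample addresses are all hypothetical.

```python
class StatefulFirewall:
    """Toy model of a stateful filter that only has inbound allow rules."""

    def __init__(self, allowed_inbound_ports):
        self.allowed_inbound_ports = set(allowed_inbound_ports)
        self.tracked = set()  # flows that were allowed in

    def inbound_request(self, client_ip, client_port, server_port):
        """Allow the request if a rule permits the port, and remember the flow."""
        if server_port not in self.allowed_inbound_ports:
            return False
        self.tracked.add((client_ip, client_port, server_port))
        return True

    def outbound_reply(self, client_ip, client_port, server_port):
        """A reply is allowed only when it belongs to a tracked flow."""
        return (client_ip, client_port, server_port) in self.tracked


sg = StatefulFirewall(allowed_inbound_ports=[443])
print(sg.inbound_request("198.51.100.10", 49152, 443))  # True: HTTPS is allowed in
print(sg.outbound_reply("198.51.100.10", 49152, 443))   # True: reply to a tracked flow
print(sg.outbound_reply("198.51.100.10", 49153, 443))   # False: no matching request
```

Note that no outbound rule was written at all: the reply is permitted purely because the request was, which is the convenience the stateful model provides.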
The scope of Security Groups is tightly bound to ENIs. For inbound traffic they are evaluated last in the packet path, after subnet-level controls like NACLs; for outbound traffic the order is reversed, with the Security Group evaluated first and the NACL applied as the packet leaves the subnet. Rules in Security Groups are allow-only; there is no ability to deny explicitly. Administrators define which inbound traffic is permitted, such as SSH from a corporate IP or HTTPS from all users, and which outbound traffic is allowed, like database queries to a private subnet. If traffic is not explicitly allowed, it is denied by default. A powerful feature of SGs is the ability to reference other SGs as sources, enabling tier-to-tier controls. For instance, an application SG might permit inbound connections only from the web tier SG, ensuring that only approved components communicate. This referencing system simplifies multi-tier designs while preserving isolation.
Network ACLs, by contrast, operate at the subnet boundary. Every packet entering or leaving a subnet is evaluated against NACL rules. Unlike SGs, NACLs are stateless, which means both inbound and outbound directions must be defined explicitly. For example, if inbound HTTP traffic is allowed, an outbound rule must also exist for the response traffic. Rules are numbered and evaluated in order, with the first matching rule applied, whether it is an allow or deny. This makes NACLs more complex but also more flexible, as they can explicitly block traffic. For example, a NACL might deny traffic from a known malicious IP range while still allowing other traffic into the subnet. NACLs thus serve as coarse subnet-level filters, complementing the resource-specific granularity of SGs.
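The ordered, first-match evaluation can be sketched in a few lines. This is an illustrative model under stated assumptions, not an AWS API: `NaclRule`, `evaluate_nacl`, the rule numbers, and the CIDR ranges (taken from documentation address space) are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class NaclRule:
    number: int      # lower numbers are evaluated first
    cidr: str        # source CIDR for an inbound rule
    port_from: int
    port_to: int
    allow: bool      # True = allow, False = deny


def evaluate_nacl(rules, src_ip, port):
    """Return True if the first matching rule allows the packet."""
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = ip_address(src_ip) in ip_network(rule.cidr)
        in_ports = rule.port_from <= port <= rule.port_to
        if in_cidr and in_ports:
            return rule.allow  # first match wins, allow or deny
    # No numbered rule matched: the implicit final rule denies.
    return False


inbound_rules = [
    NaclRule(90, "203.0.113.0/24", 0, 65535, allow=False),  # hypothetical bad range
    NaclRule(100, "0.0.0.0/0", 443, 443, allow=True),       # HTTPS from anywhere
]

print(evaluate_nacl(inbound_rules, "203.0.113.7", 443))    # False: rule 90 denies first
print(evaluate_nacl(inbound_rules, "198.51.100.10", 443))  # True: rule 100 allows
print(evaluate_nacl(inbound_rules, "198.51.100.10", 22))   # False: nothing matches
```

Swapping the two rule numbers would let the blocked range through on port 443, which is why rule ordering deserves as much care as the rules themselves.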
Default configurations in Security Groups and NACLs demonstrate their philosophies. A newly created Security Group has no inbound rules, so it denies all inbound traffic, and it allows all outbound traffic, enforcing a least-privilege model by requiring administrators to explicitly open inbound access. The default Security Group that comes with each VPC behaves the same way, except that it also allows inbound traffic from other resources assigned to that same group. The default NACL, however, allows all inbound and outbound traffic until modified, whereas a custom NACL denies all traffic until rules are added. This means the default NACL starts permissive but can be hardened, while SGs start restrictive and require intentional allowances. Awareness of these defaults is important, as they affect how new VPCs behave. Without adjusting defaults, one might mistakenly assume stronger or weaker protections exist. Knowing these starting points helps architects implement consistent security baselines and avoid surprises during deployment.
Ephemeral ports play a hidden but crucial role in traffic flows. When a client connects to a server, it uses a random ephemeral port for its end of the session. For example, a client connecting to an EC2 instance’s HTTPS service on port 443 might use port 49152 locally. NACLs must account for these ephemeral ranges in their outbound or return rules, otherwise connections may fail unexpectedly. This is where Security Groups simplify matters: their stateful nature automatically handles return traffic without worrying about ephemeral ranges. Misunderstanding ephemeral ports is a common cause of connectivity issues, particularly when strict NACLs are configured. Recognizing this nuance is essential for troubleshooting network behavior.
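The failure mode is easy to reproduce in a small model of a stateless filter. This is a sketch under assumptions: rules are simplified to `(number, port_from, port_to, allow)` tuples that match on destination port only, the function names are hypothetical, and the 1024-65535 reply range is one commonly used choice rather than a requirement.

```python
def first_match(rules, port):
    """Apply NACL-style first-match evaluation on a destination port."""
    for _number, port_from, port_to, allow in sorted(rules):
        if port_from <= port <= port_to:
            return allow
    return False  # implicit deny when nothing matches


def https_round_trip(inbound, outbound, client_port):
    """A request to port 443 succeeds only if the reply to client_port can leave."""
    request_ok = first_match(inbound, 443)
    reply_ok = first_match(outbound, client_port)
    return request_ok and reply_ok


inbound = [(100, 443, 443, True)]
outbound_strict = [(100, 443, 443, True)]    # forgets the ephemeral reply ports
outbound_fixed = [(100, 1024, 65535, True)]  # covers the client's ephemeral port

print(https_round_trip(inbound, outbound_strict, 49152))  # False: reply is dropped
print(https_round_trip(inbound, outbound_fixed, 49152))   # True: reply can leave
```

The strict outbound rule looks reasonable at a glance, since it mentions port 443, but the reply is addressed to the client's ephemeral port, not to 443.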
Logging and visibility tools help administrators understand how SGs and NACLs behave. VPC Flow Logs capture metadata about traffic at the network interface, recording whether each flow was accepted or rejected. A record does not name the control that made the decision, but the pattern usually gives it away. For example, an accepted inbound request paired with a rejected response typically points to the stateless NACL, since a Security Group would have allowed the return traffic automatically. This visibility transforms troubleshooting from guesswork into a data-driven process. Without flow logs, diagnosing why a packet never reached its destination could take hours. With them, administrators can pinpoint misconfigurations quickly, reinforcing the principle that observability is as important as control in network security.
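As a rough illustration of working with this data, the sketch below parses flow log records in the default space-separated format and keeps the rejected ones. The two sample records are invented for the example, and the helper names `parse_record` and `rejected_flows` are hypothetical; custom log formats would need a different field list.

```python
# Field order of the default (version 2) flow log format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_record(line):
    """Split one default-format record into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def rejected_flows(lines):
    """Return the parsed records whose action is REJECT."""
    return [r for r in map(parse_record, lines) if r["action"] == "REJECT"]


# Invented sample records, for illustration only.
sample = [
    "2 123456789012 eni-0a1b2c3d4e5f67890 198.51.100.10 10.0.1.25 "
    "49152 443 6 10 840 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d4e5f67890 203.0.113.7 10.0.1.25 "
    "51000 22 6 3 180 1700000000 1700000060 REJECT OK",
]

for flow in rejected_flows(sample):
    print(flow["srcaddr"], "->", flow["dstaddr"], "port", flow["dstport"])
```

Grouping the rejected records by interface and destination port is usually enough to tell which rule set to inspect first.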
In practice, Security Groups and NACLs are often used together following common patterns. SGs control application-level logic, such as which tiers can talk to one another, while NACLs provide coarse subnet-wide filters, such as blocking traffic from untrusted IP ranges. For example, a web application might use SGs to permit connections from the web tier to the app tier, while a NACL denies all inbound traffic from outside the corporate CIDR block. This layered approach provides both precision and broad defense, ensuring no single misconfiguration exposes the system. By combining both controls, organizations achieve defense in depth at the network layer.
The principle of least privilege applies equally to SGs and NACLs. Every rule should exist for a reason, granting only the minimal access required for functionality. Overly permissive SGs, such as allowing inbound from 0.0.0.0/0 on SSH, expose resources to unnecessary risk. Likewise, NACLs that allow all traffic reduce their protective value. Regular audits help maintain hygiene, ensuring rules reflect current requirements rather than historical leftovers. For example, removing outdated rules for deprecated applications reduces attack surfaces. Least privilege is not a one-time design choice but an ongoing discipline, adapting controls as systems evolve.
Change control and rule lifecycle management are operational necessities. Rules in both SGs and NACLs should be documented, tagged, and reviewed periodically. Without lifecycle hygiene, configurations accumulate clutter and risk conflicting entries. For instance, a NACL with overlapping rules might cause unexpected results if administrators forget about rule priority. Similarly, unused SGs can linger, creating confusion about which ones matter. Establishing processes for requesting, reviewing, and retiring rules ensures network controls remain clean, understandable, and secure. Governance practices around change control elevate SGs and NACLs from ad hoc filters to trusted, auditable components of network architecture.
For exam preparation, the key distinction is that Security Groups are stateful and evaluated at the ENI level, while NACLs are stateless and evaluated at the subnet boundary. SGs allow only, while NACLs allow or deny. SGs handle return traffic automatically; NACLs require explicit return rules. Questions often test whether learners can recognize which control applies to a given scenario. For example, if the question emphasizes blocking malicious IP ranges, NACLs are likely the answer. If the question emphasizes tier-to-tier application access, Security Groups fit best. By focusing on stateful versus stateless, learners can confidently distinguish between the two layers.
For more cyber related content and books, please check out cyber author dot me. Also, there are other prepcasts on Cybersecurity and more at Bare Metal Cyber dot com.
A classic example of applying both Security Groups and NACLs can be found in a three-tier web application. The load balancer in the public subnet has a Security Group that allows inbound traffic on ports 80 and 443 from the internet. The application servers in the private subnet have a Security Group that only accepts inbound connections from the load balancer’s Security Group, not from arbitrary IP ranges. Finally, the database servers have a Security Group that permits inbound traffic solely from the application tier’s Security Group, usually on port 3306 for MySQL or 5432 for PostgreSQL. This design enforces clear boundaries: external users can reach the web tier, the web tier can reach the app tier, and only the app tier can reach the database. Layering rules this way ensures each component has access only to what it truly requires.
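The tier boundaries in this design can be written down as data and checked. The sketch below is a simplified model, not an AWS API: the group names, the application port 8080, and the `inbound_allowed` helper are assumptions made for illustration, and PostgreSQL on 5432 stands in for the database.

```python
from ipaddress import ip_address, ip_network

# Each group maps to inbound rules of the form (port, source), where the
# source is either a CIDR block or the name of another security group.
SECURITY_GROUPS = {
    "sg-lb": [(80, "0.0.0.0/0"), (443, "0.0.0.0/0")],
    "sg-app": [(8080, "sg-lb")],   # 8080 is an assumed application port
    "sg-db": [(5432, "sg-app")],
}


def inbound_allowed(dst_group, port, src_group=None, src_ip=None):
    """Return True if any inbound rule on dst_group permits the connection."""
    for rule_port, source in SECURITY_GROUPS[dst_group]:
        if rule_port != port:
            continue
        if source.startswith("sg-"):
            if source == src_group:  # group reference: match on membership
                return True
        elif src_ip is not None and ip_address(src_ip) in ip_network(source):
            return True
    return False  # allow-only model: no matching rule means denied


print(inbound_allowed("sg-lb", 443, src_ip="198.51.100.5"))    # True
print(inbound_allowed("sg-app", 8080, src_group="sg-lb"))      # True
print(inbound_allowed("sg-db", 5432, src_group="sg-app"))      # True
print(inbound_allowed("sg-db", 5432, src_group="sg-lb"))       # False
print(inbound_allowed("sg-app", 8080, src_ip="198.51.100.5"))  # False
```

Because the app and database tiers reference groups rather than addresses, instances can be added to or removed from a tier without touching any rule.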
NACLs add value when organizations want subnet-level deny rules. Since Security Groups lack explicit deny capability, NACLs provide a way to blacklist known malicious IP ranges. For example, a subnet hosting public-facing instances might have a NACL that blocks traffic from a list of suspicious IP addresses, while still allowing general internet access. This setup creates a coarse safety net, catching traffic before it even reaches the instance-level Security Groups. It’s important to remember that NACL rules are evaluated in order, with the first match applied. A deny rule listed early will block traffic regardless of subsequent allow rules. Used thoughtfully, NACLs complement SGs by applying blanket protections across broader sections of the network.
Troubleshooting connectivity issues often requires understanding how Security Groups and NACLs interact. Inbound packets are evaluated by the NACL first, then the Security Group at the ENI; outbound packets pass the Security Group first, then the NACL. If either layer blocks traffic, the connection fails. Flow Logs help by showing which flows were rejected and in which direction, which narrows down whether the subnet or ENI layer is responsible. NACLs do not provide per-rule hit counters, so the rejected flows have to be compared against the numbered rules to find the one that matched. For instance, an administrator might discover that a return packet is being blocked by a missing ephemeral port rule in the NACL. Awareness of order of evaluation and the stateless nature of NACLs is key to resolving these situations quickly.
AWS Reachability Analyzer provides another way to troubleshoot and validate SG and NACL configurations. By simulating a connection between two endpoints, it evaluates whether the current rules, route tables, and gateways permit traffic to flow. For example, an administrator can test whether an application server can reach a database, confirming that all rules and routes align. This proactive tool prevents guesswork and speeds up troubleshooting, particularly in complex environments with many overlapping rules. It turns connectivity validation into a predictable process rather than a trial-and-error exercise.
Certain ports and protocols frequently appear in Security Group and NACL configurations. ICMP, which is a protocol rather than a port, is often used for ping tests to confirm reachability. RDP on port 3389 and SSH on port 22 provide administrative access, though these should be tightly restricted to trusted sources. Database ports such as 1433 for SQL Server or 27017 for MongoDB must be carefully limited to internal application servers. Opening these ports broadly to the internet is one of the most common and dangerous misconfigurations. By controlling these well-known ports with least-privilege principles, administrators reduce exposure while maintaining functionality.
Centralized egress patterns often rely on proxies or controlled gateways. Instead of letting every instance access the internet directly, organizations may enforce all outbound traffic through a proxy or firewall. Security Groups and NACLs then restrict outbound rules, ensuring instances cannot bypass controls. For example, developers might configure instances to send outbound traffic only to a proxy’s IP, where content filtering and logging occur. This model provides stronger oversight of egress paths, limiting risks such as data exfiltration. It demonstrates how SGs and NACLs integrate into broader enterprise security patterns beyond AWS alone.
Hybrid architectures extend these principles into VPN or Direct Connect links. Security Groups must allow traffic from on-premises IP ranges, while NACLs provide subnet-wide filters for hybrid pathways. For example, a hybrid database cluster might need to accept queries from on-premises analytics tools. SGs would permit the necessary ports only from trusted ranges, while NACLs add a second layer of subnet-wide protection. Aligning both layers ensures hybrid connectivity remains secure, preventing misconfigurations that might expose cloud resources unnecessarily. This highlights how AWS networking controls adapt seamlessly to hybrid and multi-cloud designs.
In multi-account setups with shared VPCs, SGs and NACLs remain central to segmentation. Different accounts might share a VPC for cost or operational efficiency, but SGs ensure workloads remain isolated. For instance, an application in one account may only reference another SG in the same VPC, never exposing ports broadly. NACLs provide a broader guardrail, ensuring subnet-level consistency across accounts. This arrangement allows organizations to enforce both centralized governance and decentralized ownership, giving teams autonomy without sacrificing security. It shows how SGs and NACLs scale from single environments to complex enterprise topologies.
Automation plays a key role in maintaining SG and NACL hygiene. Tools can audit configurations for overly permissive rules, identify drift from baselines, and enforce tagging standards. For example, a script might flag any SG allowing 0.0.0.0/0 on SSH, prompting review. Regular audits catch mistakes early, while descriptive comments and tags ensure rules remain understandable. Without this discipline, rulesets grow unwieldy, becoming opaque and error-prone. Automated governance aligns with the principle that security is not static but an evolving practice requiring ongoing attention.
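An audit of this kind might look like the sketch below. It assumes the input is a list of dictionaries shaped like the `SecurityGroups` entries returned by boto3's `describe_security_groups`; the function name `find_open_admin_ports` and the sample groups are hypothetical, and only TCP and all-protocol rules are checked.

```python
def find_open_admin_ports(security_groups, ports=(22, 3389)):
    """Return (group_id, port) pairs where an admin port is open to the world."""
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            open_v4 = any(r.get("CidrIp") == "0.0.0.0/0"
                          for r in perm.get("IpRanges", []))
            open_v6 = any(r.get("CidrIpv6") == "::/0"
                          for r in perm.get("Ipv6Ranges", []))
            if not (open_v4 or open_v6):
                continue
            protocol = perm.get("IpProtocol")
            if protocol == "-1":            # all protocols, all ports
                exposed = list(ports)
            elif protocol in ("tcp", "6"):  # TCP by name or protocol number
                low, high = perm.get("FromPort"), perm.get("ToPort")
                if low is None or high is None:
                    continue
                exposed = [p for p in ports if low <= p <= high]
            else:
                continue                    # other protocols are out of scope here
            findings.extend((group["GroupId"], p) for p in exposed)
    return findings


# Invented sample data in the assumed shape.
sample_groups = [
    {"GroupId": "sg-open", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-corp", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]}]},
]

print(find_open_admin_ports(sample_groups))  # [('sg-open', 22)]
```

In a real audit the list would come from a paginated API call, and the findings would feed a ticket or alert rather than a print statement.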
Avoiding overly permissive rules is one of the most important lessons. Allowing all IPs (0.0.0.0/0) to connect on sensitive ports like SSH or RDP exposes instances directly to brute force attacks. While this may seem convenient for testing, it creates significant risks in production. Safer practices involve restricting access to corporate IP ranges, using bastion hosts, or leveraging AWS Systems Manager for administrative access. By narrowing rules to only what is necessary, administrators drastically reduce the attack surface. This discipline reflects the broader cloud security mindset: convenience must never outweigh protection.
Testing plays an important role in validating security configurations. Canary hosts—small instances placed in subnets—can confirm whether rules permit or block traffic as intended. Packet captures, when combined with controlled test flows, provide deeper insights into how SGs and NACLs interact. These tests reveal both expected and unintended behavior, catching misconfigurations before they cause outages or security incidents. For example, testing might uncover that a return packet is silently blocked by a missing NACL rule. Proactive testing builds confidence, ensuring controls work not only in theory but also in practice.
Rule sprawl is another consideration. AWS scales SGs and NACLs efficiently, so the practical cost of sprawling rulesets is mostly administrative: overly complicated configurations are harder to manage, more error-prone, and more likely to run into service quotas on rules per group. Consolidating redundant rules and simplifying structures improves maintainability. For example, grouping related application servers under a single SG reduces clutter compared to managing dozens of overlapping groups. The goal is to achieve balance: enough granularity to enforce security without drowning in rule sprawl. Simplicity, when aligned with least privilege, is often the best security design.
In incident response, Security Groups become powerful tools for isolation. Quarantine SGs can be created with no inbound or outbound rules, applied immediately to compromised instances to cut them off from the network. This approach allows forensic analysis without risk of lateral movement or data exfiltration. For example, if an instance shows signs of compromise, administrators can swap its SG to quarantine mode within seconds. NACLs may play a role in broader subnet lockdowns, but SGs offer faster, more precise isolation. Embedding this practice into response plans ensures security teams can act quickly under pressure.
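The swap itself is a single API call. The sketch below wraps boto3's EC2 `modify_instance_attribute`, whose `Groups` parameter replaces the instance's security groups; the wrapper name `quarantine_instance`, the `RecordingClient` stand-in, and the IDs are hypothetical, and the stand-in exists only so the example runs without AWS access.

```python
def quarantine_instance(ec2_client, instance_id, quarantine_sg_id):
    """Replace the instance's security groups with the quarantine group only."""
    ec2_client.modify_instance_attribute(
        InstanceId=instance_id,
        Groups=[quarantine_sg_id],  # replaces, rather than appends to, the list
    )
    return {"InstanceId": instance_id, "Groups": [quarantine_sg_id]}


class RecordingClient:
    """Stand-in for a boto3 EC2 client that records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def modify_instance_attribute(self, **kwargs):
        self.calls.append(kwargs)


client = RecordingClient()
quarantine_instance(client, "i-0123456789abcdef0", "sg-0123456789abcdef0")
print(client.calls)
```

With real credentials, `boto3.client("ec2")` would take the place of `RecordingClient`. Two caveats are worth checking before relying on this: a newly created Security Group starts with an allow-all outbound rule that has to be removed for the group to be truly empty, and connections that were already established may not be cut immediately, so response plans should verify isolation rather than assume it.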
From an exam perspective, the cues often hinge on scope and state. If the question emphasizes instance-level controls, tier-to-tier references, or automatic handling of return traffic, Security Groups are the answer. If the question highlights subnet-level filtering, explicit deny rules, or ephemeral port considerations, NACLs fit best. Questions may also ask about defaults: a new SG denies all inbound and allows all outbound, the default NACL allows everything until modified, and a custom NACL denies everything until rules are added. Remembering these distinctions helps map scenarios to the right control. Real-world design follows the same cues, ensuring architects pick the right tool for each job.
In conclusion, Security Groups and NACLs are complementary tools for controlling traffic in AWS. Security Groups provide fine-grained, stateful protection at the instance level, while NACLs enforce stateless, subnet-wide rules. Together, they deliver layered security, balancing precision with broad defense. Best practices include enforcing least privilege, avoiding overly permissive rules, maintaining hygiene through automation, and testing configurations regularly. In incidents, SGs enable quick isolation, while NACLs provide coarse deny lists. For both exam preparation and production design, the takeaway is clear: prefer SGs for application logic, use NACLs for subnet guardrails, and combine both for robust, resilient security.
