Episode 87: Backup & Glacier

Amazon Simple Notification Service, or SNS, is AWS’s fully managed pub/sub messaging platform, built for broadcast and fanout communication. Where Amazon SQS focuses on point-to-point messaging with queues, SNS is about one-to-many delivery, ensuring that a single message can reach multiple subscribers simultaneously. This makes it ideal for event-driven systems where multiple downstream actions need to be triggered by a single event. For example, a new user registration could trigger an SNS topic that simultaneously sends a confirmation email, logs the event for analytics, and notifies a monitoring system. By decoupling producers from consumers and enabling broadcast patterns, SNS provides scalability, flexibility, and resilience in distributed architectures.
At the heart of SNS are topics and subscriptions. A topic acts as a logical channel where producers publish messages, and subscribers receive them. Subscribers can take many forms: HTTP/S endpoints for webhooks, email addresses for notifications, Lambda functions for serverless triggers, or SQS queues for durable delivery. For example, a financial trading platform might publish price updates to a topic, with one subscriber feeding an analytics pipeline and another updating a real-time dashboard. This design ensures producers remain simple—they publish once—while SNS handles the complexity of fanning out messages to all interested consumers.
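The publish-once, deliver-to-all idea can be sketched in a few lines. This is an in-memory stand-in, not the SNS API: the Topic class, the feed lists, and the "price-updates" name are all hypothetical and exist only to illustrate the shape of the pattern.

```python
from typing import Callable, List


class Topic:
    """In-memory stand-in for a topic: publish once, deliver to every subscriber."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, handler: Callable[[str], None]) -> None:
        # Each handler plays the role of one subscription (queue, function, webhook).
        self._subscribers.append(handler)

    def publish(self, message: str) -> int:
        # The producer calls publish once; the topic does the fanout.
        for handler in self._subscribers:
            handler(message)
        return len(self._subscribers)


analytics_feed: List[str] = []
dashboard_feed: List[str] = []

prices = Topic("price-updates")  # hypothetical topic name
prices.subscribe(analytics_feed.append)
prices.subscribe(dashboard_feed.append)
delivered = prices.publish("ACME 101.25")
```

The producer never learns how many subscribers exist; adding a third consumer is one more subscribe call and no change to the publishing code.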
Message filtering enhances the power of pub/sub by allowing subscribers to define which messages they want to receive. Filters use message attributes—key-value metadata attached to each message—to route only relevant notifications to each subscriber. For example, a logistics company could publish all shipment events to a topic but filter by region, so only the Europe system receives messages tagged with “EU.” This reduces noise, minimizes downstream processing, and saves costs by avoiding unnecessary deliveries. Without filtering, every subscriber would receive all messages, often leading to wasted compute cycles. Filtering transforms broad broadcasts into more precise, targeted flows.
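To make the filtering idea concrete, here is a simplified matcher, assuming a policy that maps each attribute name to a list of allowed string values. Real SNS filter policies support more operators than this; the matches function and the sample shipment events are hypothetical.

```python
from typing import Dict, List


def matches(filter_policy: Dict[str, List[str]], attributes: Dict[str, str]) -> bool:
    """True when every policy key is present and its value is one of the allowed values."""
    return all(attributes.get(key) in allowed for key, allowed in filter_policy.items())


europe_policy = {"region": ["EU"]}

events = [
    {"body": "shipment 1 dispatched", "attributes": {"region": "EU"}},
    {"body": "shipment 2 dispatched", "attributes": {"region": "APAC"}},
    {"body": "shipment 3 delayed", "attributes": {"region": "EU"}},
]

# Only events whose attributes satisfy the policy reach the Europe subscriber.
europe_inbox = [e["body"] for e in events if matches(europe_policy, e["attributes"])]
```

In this sketch the APAC event is never delivered to the Europe subscriber, so that consumer spends no compute discarding it.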
SNS also supports FIFO topics, which combine ordering and deduplication with the pub/sub model. FIFO ensures that messages are delivered in the exact order they are published and are not duplicated, making them suitable for workflows where sequence matters. For example, processing financial transactions or updating account balances requires strict order guarantees. By combining FIFO topics with FIFO queues, AWS enables end-to-end ordering from producer through subscriber. The trade-off is reduced throughput compared to Standard topics, but the benefits of ordering and exactly-once delivery can outweigh performance considerations in sensitive workloads.
Delivery reliability is reinforced through retries and exponential backoff. If a subscriber, such as an HTTP endpoint, is temporarily unavailable, SNS retries delivery with increasing delays between attempts. This avoids overloading the subscriber while still delivering the message if the endpoint recovers within the retry policy. For example, if a webhook endpoint is offline during maintenance, SNS keeps retrying according to that policy, so a short outage need not mean lost events. Once the retries are exhausted, the message is discarded unless a dead-letter queue is attached to the subscription. These retries make SNS robust in real-world scenarios, where network hiccups and service outages are unavoidable. Paired with dead-letter strategies, they create a resilient, fault-tolerant broadcast system.
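The shape of an exponential backoff schedule can be shown with a short calculation. The base, factor, and cap values below are illustrative assumptions, not the defaults SNS applies; the actual delivery policy is configurable and differs by endpoint type.

```python
from typing import List


def backoff_delays(attempts: int, base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0) -> List[float]:
    """Delay in seconds before each retry: grows geometrically, then levels off at cap."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


delays = backoff_delays(8)
print(delays)       # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
print(sum(delays))  # 183.0
```

Early retries come quickly to ride out brief hiccups, while later ones are spaced out so a struggling endpoint is not hammered.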
Beyond cloud integration, SNS extends to mobile push notifications. With integrations for Apple Push Notification Service, Google Firebase Cloud Messaging, and others, SNS can send notifications directly to mobile devices. This makes it possible to power features like app alerts, promotional messages, or real-time updates without building a dedicated notification infrastructure. For example, a ridesharing app might use SNS to push driver assignment alerts to passengers’ phones. By abstracting the complexities of working with mobile push providers, SNS provides a unified platform for both backend eventing and end-user messaging.
Security in SNS is enforced through topic policies, which function like resource-based access control lists. These policies determine who can publish or subscribe to a topic, ensuring only authorized principals participate. For example, a topic might allow only a billing service role to publish invoices, while restricting subscriptions to the finance team’s account. This prevents unauthorized services from injecting messages into sensitive channels. Topic policies align SNS with AWS’s broader least-privilege model, embedding governance directly into the messaging fabric.
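A topic policy along the lines of the billing example might look like the document built below. The policy grammar (Version, Statement, Effect, Principal, Action, Resource) and the sns:Publish and sns:Subscribe actions are standard; the account IDs, role name, and topic name are placeholders invented for illustration.

```python
import json
from typing import Any, Dict


def invoice_topic_policy(topic_arn: str, publisher_role_arn: str,
                         subscriber_account_id: str) -> Dict[str, Any]:
    """Build a resource policy: one role may publish, one account may subscribe."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowBillingServicePublish",
                "Effect": "Allow",
                "Principal": {"AWS": publisher_role_arn},
                "Action": "sns:Publish",
                "Resource": topic_arn,
            },
            {
                "Sid": "AllowFinanceAccountSubscribe",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{subscriber_account_id}:root"},
                "Action": "sns:Subscribe",
                "Resource": topic_arn,
            },
        ],
    }


# Placeholder ARNs and account IDs, not real resources.
policy = invoice_topic_policy(
    "arn:aws:sns:us-east-1:111122223333:invoices",
    "arn:aws:iam::111122223333:role/billing-service",
    "444455556666",
)
policy_json = json.dumps(policy, indent=2)
```

Note that Allow statements only grant access. Whether other principals in the topic owner's account can still publish depends on their identity-based permissions, so a real deployment would review those as well.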
Encryption at rest is supported through server-side encryption (SSE), which uses AWS KMS keys, either the AWS-managed key for SNS or a customer-managed key, to secure stored messages. This supports compliance in regulated industries, where sensitive data must remain encrypted even while it is held temporarily inside a messaging system. Encryption at rest complements TLS in transit, so messages are protected both on the wire and while SNS stores them. For example, a healthcare platform publishing medical alerts to a topic can show that messages are encrypted in transit and at rest, which helps support HIPAA or GDPR requirements. These controls make SNS viable for sensitive workloads, not just casual notifications.
SNS also supports message signing to provide authenticity guarantees. Messages are signed with certificates, and subscribers can validate these signatures to confirm they came from AWS and not an imposter. This prevents spoofing or tampering with messages, especially for HTTP/S subscribers. For example, a webhook receiving alerts can verify the signature before processing, ensuring the message truly originated from SNS. Authenticity complements encryption, ensuring both confidentiality and trustworthiness in communications.
Cross-account and cross-Region delivery patterns expand SNS’s reach. Cross-account subscriptions allow one AWS account to publish while another consumes, useful in multi-account strategies. Cross-Region delivery ensures messages flow across AWS Regions, supporting global systems. For example, a central event hub might broadcast to regional consumers worldwide, each subscribing in their own account and Region. These patterns ensure SNS scales not only within an account but across organizational and geographic boundaries, aligning with enterprise and global architectures.
Dead-letter handling in SNS relies on SQS queues. A dead-letter queue is attached to a subscription, and when SNS cannot deliver a message to that subscription's endpoint after its retries are exhausted, the message is moved to the SQS queue for later analysis. For example, if an HTTPS endpoint stays offline or a subscribed Lambda function has been deleted, SNS can move the undeliverable message into an SQS queue for triage rather than losing it. Errors raised inside a Lambda function after SNS has handed the message off are governed by Lambda's own retry and failure settings rather than by the SNS dead-letter queue. This pairing of SNS and SQS ensures resilience: fanout broadcasts remain reliable, even when individual consumers fail. Dead-letter handling turns failures into manageable incidents instead of silent data loss.
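The dead-letter idea can be sketched as a small simulation: try a delivery a bounded number of times, and park the message somewhere durable if every attempt fails. The function names, the retry count, and the list standing in for an SQS queue are all hypothetical.

```python
from typing import Callable, List


def deliver_with_dlq(message: str, endpoint: Callable[[str], None],
                     dead_letter_queue: List[str], max_attempts: int = 3) -> bool:
    """Attempt delivery up to max_attempts times; park the message in the DLQ on failure."""
    for _ in range(max_attempts):
        try:
            endpoint(message)
            return True
        except Exception:
            continue  # a real system would wait (back off) before the next attempt
    dead_letter_queue.append(message)
    return False


def unreachable_endpoint(message: str) -> None:
    raise ConnectionError("endpoint unavailable")


received: List[str] = []
dlq: List[str] = []  # stands in for an SQS dead-letter queue

ok = deliver_with_dlq("order-1", received.append, dlq)
failed = deliver_with_dlq("order-2", unreachable_endpoint, dlq)
```

After this runs, order-1 sits with the healthy subscriber and order-2 waits in the dead-letter list for triage instead of disappearing.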
Observability in SNS comes from CloudWatch metrics, which track publishes, deliveries, retries, and failures. Alarms can detect issues such as unusually high failure rates or sudden spikes in publishes. For example, if deliveries to an HTTP endpoint spike in errors, administrators can be alerted and investigate the subscriber’s health. Logging and monitoring transform SNS from a fire-and-forget service into a transparent, observable system, ensuring broadcast patterns remain reliable.
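The logic behind a failure-rate alarm is simple arithmetic over delivered and failed counts, such as those reported by the NumberOfNotificationsDelivered and NumberOfNotificationsFailed metrics. The sketch below assumes those counts have already been fetched; the 5 percent threshold is an arbitrary illustration, not a recommended value.

```python
def failure_rate(delivered: int, failed: int) -> float:
    """Fraction of delivery outcomes that failed; 0.0 when there was no traffic."""
    total = delivered + failed
    return failed / total if total else 0.0


def should_alarm(delivered: int, failed: int, threshold: float = 0.05) -> bool:
    """True when the failure rate for the period exceeds the threshold."""
    return failure_rate(delivered, failed) > threshold


print(should_alarm(990, 10))   # False: 1% failures
print(should_alarm(900, 100))  # True: 10% failures
```

Alarming on a rate rather than a raw count keeps the alert meaningful as publish volume grows.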
The cost model for SNS is straightforward but requires awareness. Charges apply per published message and per delivery, with delivery pricing that varies by endpoint type, meaning fanout multiplies costs. For example, one message delivered to 10 subscribers counts as 10 deliveries. Mobile push notifications may have additional provider-specific costs. Filtering helps control costs by preventing unnecessary deliveries, while careful architecture ensures broadcasts only reach those who need them. SNS remains economical, but mindful design prevents surprise bills in high-fanout scenarios.
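The fanout multiplier is easy to see with a quick count. The sketch below deliberately counts units rather than dollars, since actual rates depend on endpoint type and Region; the function name and the traffic figures are made up for illustration.

```python
from typing import Dict


def billable_units(messages: int, matching_subscribers: int) -> Dict[str, int]:
    """Count publishes and deliveries; each message goes to every matching subscriber."""
    return {
        "publishes": messages,
        "deliveries": messages * matching_subscribers,
    }


# One million messages, ten subscribers, no filtering.
unfiltered = billable_units(1_000_000, 10)

# Same traffic, but filters mean only two subscribers match each message.
filtered = billable_units(1_000_000, 2)

print(unfiltered["deliveries"])  # 10000000
print(filtered["deliveries"])    # 2000000
```

In this hypothetical, filtering removes eight million deliveries without changing anything on the producer side.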
Finally, it’s important to distinguish fanout via SNS from direct-to-queue communication. While producers could send messages directly to multiple SQS queues, this requires duplicating logic in the producer and increases coupling. SNS abstracts this complexity: producers publish once, and SNS handles fanout to as many subscribers as needed. This separation maintains loose coupling and simplifies producer design, reinforcing the principle that simplicity and scalability go hand in hand in distributed systems.
For more cyber related content and books, please check out cyber author dot me. Also, there are other prepcasts on Cybersecurity and more at Bare Metal Cyber dot com.
One of the most common patterns with SNS is fanout to SQS queues for parallel consumers. Instead of having a producer directly target multiple queues, the producer publishes once to a topic, and SNS delivers the message to every subscribed SQS queue. Each consumer application then processes messages independently, at its own pace. For example, an e-commerce system publishing an “OrderPlaced” event could fan out to one queue for fulfillment, another for billing, and another for analytics. This approach decouples consumers, provides durability through SQS, and ensures that failure in one system does not block others. Fanout via SNS-to-SQS is one of the most recognizable patterns in event-driven AWS architectures.
SNS also integrates seamlessly with Lambda, making it a natural trigger mechanism for serverless applications. By subscribing Lambda functions to a topic, developers can build reactive systems where events automatically invoke code without managing servers. For example, a customer registration event could trigger a Lambda that provisions an account, another Lambda that sends a welcome email, and another that logs the activity. This pairing of SNS and Lambda allows for highly flexible workflows where adding or removing subscribers is simple, keeping producers oblivious to downstream complexity. It is a powerful way to implement loosely coupled, scalable systems that adapt as needs evolve.
HTTP and HTTPS endpoints are another subscriber type, supporting webhook-style integrations. However, they must be secured carefully. SNS includes a subscription confirmation handshake to prevent accidental or malicious subscription of endpoints. Message signing also allows endpoints to verify that messages truly originate from SNS. For example, a monitoring service might subscribe an HTTP endpoint to receive alerts, validating each message before acting. Without validation, attackers could spoof SNS messages to trigger false alarms or worse. Proper use of subscription confirmation and signature validation ensures HTTP/S endpoints remain safe and trustworthy in SNS architectures.
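An endpoint's first line of defense can be sketched as a classifier over the JSON body SNS posts. The Type, SubscribeURL, and SigningCertURL fields are part of the SNS message format; the host pattern, the classify function, and its return labels are assumptions for illustration, and the cryptographic signature check itself is left out because it needs a certificate library.

```python
import json
import re
from urllib.parse import urlparse

# Assumed host pattern for SNS-served URLs; check current AWS documentation before relying on it.
_SNS_HOST = re.compile(r"sns\.[a-z0-9-]+\.amazonaws\.com")


def is_sns_https_url(url: str) -> bool:
    """True for https URLs whose host looks like a regional SNS endpoint."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(_SNS_HOST.fullmatch(parsed.hostname or ""))


def classify(body: str) -> str:
    """Decide what to do with a posted message before any business logic runs."""
    msg = json.loads(body)
    if not is_sns_https_url(msg.get("SigningCertURL", "")):
        return "reject"  # never fetch a signing certificate from an untrusted host
    if msg.get("Type") == "SubscriptionConfirmation":
        return "confirm" if is_sns_https_url(msg.get("SubscribeURL", "")) else "reject"
    if msg.get("Type") == "Notification":
        return "verify-signature-then-process"
    return "ignore"


confirmation = json.dumps({
    "Type": "SubscriptionConfirmation",
    "SubscribeURL": "https://sns.us-east-1.amazonaws.com/?Action=ConfirmSubscription",
    "SigningCertURL": "https://sns.us-east-1.amazonaws.com/cert.pem",
})
spoofed = json.dumps({
    "Type": "Notification",
    "SigningCertURL": "https://sns.us-east-1.amazonaws.com.evil.example/cert.pem",
})

print(classify(confirmation))  # confirm
print(classify(spoofed))       # reject
```

The spoofed message is turned away because its certificate URL only resembles an SNS host; a genuine notification would still need its signature verified against the downloaded certificate before being trusted.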
Message filtering reduces unnecessary downstream processing by delivering only relevant events to each subscriber. Instead of blasting all messages to all consumers, filters ensure each subscriber gets exactly what it needs. For example, a global logistics application might tag events with a “region” attribute, and only send messages marked “APAC” to the Asia-Pacific queue. This reduces compute load, bandwidth, and cost for consumers while keeping the system responsive. Filtering shifts intelligence into SNS, simplifying downstream logic and improving overall efficiency in event-driven designs.
FIFO topics extend ordering and deduplication into the pub/sub model. By pairing FIFO topics with FIFO queues, architects can achieve end-to-end exactly-once ordered delivery. This is critical for sensitive applications like banking or order processing, where event order dictates correctness. For example, a payment followed by a refund must be processed in sequence. FIFO topics ensure this by respecting message group IDs and deduplication windows. While throughput is limited compared to Standard topics, the guarantees of strict order and exactly-once delivery are worth the trade-off in mission-critical workflows.
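Message group IDs and deduplication windows can be illustrated with a toy model. The class below is an in-memory sketch, not the SNS API; it assumes a five-minute deduplication window and takes the current time as a parameter so the behavior is easy to follow.

```python
from typing import Dict, List

DEDUP_WINDOW_SECONDS = 300  # assumed five-minute deduplication window


class FifoTopicSketch:
    """Toy FIFO topic: drops repeated dedup IDs and keeps per-group publish order."""

    def __init__(self) -> None:
        self._seen: Dict[str, float] = {}          # dedup ID -> time first accepted
        self.delivered: Dict[str, List[str]] = {}  # group ID -> bodies in order

    def publish(self, group_id: str, dedup_id: str, body: str, now: float) -> bool:
        last = self._seen.get(dedup_id)
        if last is not None and now - last < DEDUP_WINDOW_SECONDS:
            return False  # duplicate inside the window: accepted upstream, not redelivered
        self._seen[dedup_id] = now
        self.delivered.setdefault(group_id, []).append(body)
        return True


topic = FifoTopicSketch()
topic.publish("account-42", "tx-1", "payment", now=0.0)
topic.publish("account-42", "tx-1", "payment", now=10.0)  # producer retry, dropped
topic.publish("account-42", "tx-2", "refund", now=20.0)

print(topic.delivered["account-42"])  # ['payment', 'refund']
```

The retried payment is suppressed, and the refund is recorded after the payment within the same group, which is the ordering guarantee the payment-then-refund example depends on.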
SNS imposes a 256 KB message size limit, which is often sufficient but can be restrictive for larger payloads. A common strategy is the S3 pointer pattern: large data is stored in an S3 bucket, and the SNS message contains a pointer to the object. Consumers then fetch the data directly from S3. For example, an analytics pipeline might generate large reports, notify subscribers through SNS, and include S3 URLs for retrieval. This pattern keeps messages lightweight while still enabling distribution of large data sets, ensuring compliance with limits without sacrificing functionality.
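A minimal sketch of the pointer pattern follows. A plain dictionary stands in for the S3 bucket, and the message field names (inline, s3Bucket, s3Key, sizeBytes) are invented for illustration; a real implementation would upload with an S3 client and agree on a schema with its consumers.

```python
import json
from typing import Dict

MAX_MESSAGE_BYTES = 256 * 1024  # the 256 KB limit described above


def build_message(payload: str, object_store: Dict[str, str], bucket: str, key: str) -> str:
    """Return an inline message if it fits, otherwise store the payload and return a pointer."""
    inline = json.dumps({"inline": payload})
    if len(inline.encode("utf-8")) <= MAX_MESSAGE_BYTES:
        return inline
    object_store[f"{bucket}/{key}"] = payload  # stand-in for an S3 upload
    return json.dumps({"s3Bucket": bucket, "s3Key": key, "sizeBytes": len(payload.encode("utf-8"))})


store: Dict[str, str] = {}
small = build_message("ok", store, "reports", "daily/r1.txt")
large = build_message("x" * 300_000, store, "reports", "daily/r2.txt")

print(small)  # {"inline": "ok"}
print(large)  # {"s3Bucket": "reports", "s3Key": "daily/r2.txt", "sizeBytes": 300000}
```

The size check is applied to the serialized message rather than the raw payload, since the JSON wrapper itself counts toward the limit.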
Topic policies provide governance and isolation between producers and consumers. Producers may be restricted so that only trusted roles can publish to a topic, preventing unauthorized or noisy services from injecting events. Consumers can also be restricted to specific accounts or roles, controlling who receives sensitive data. For example, a financial institution might allow only the billing service to publish to a topic, and only specific departments to subscribe. These policies ensure that messaging remains intentional, preventing accidental exposure or abuse in multi-team or multi-account environments.
SNS can be used for both global broadcast and tenant-specific patterns. Global broadcast involves publishing to one topic that fans out broadly, such as sending critical system alerts to multiple teams. Tenant-specific patterns, common in SaaS, involve creating dedicated topics per customer or per application environment. This isolates messages, ensuring that one tenant’s events are never exposed to another. For example, a SaaS provider might create per-tenant topics for customer updates, enforcing security and privacy boundaries. This flexibility allows SNS to support both wide-reaching broadcasts and tightly scoped, secure communication.
Monitoring delivery health is essential in SNS. CloudWatch metrics expose publish success, delivery attempts, retries, and failures. For example, a sudden increase in delivery failures to an HTTPS endpoint may indicate an outage or misconfiguration. Administrators can act quickly by checking logs and notifying endpoint owners. Subscriptions themselves must also be monitored, since unsubscribed or disabled endpoints may silently stop receiving messages. Observability ensures that fanout systems remain reliable and that failures are caught before they cascade into missed business events.
Common pitfalls include failing to configure message filters, leading to downstream consumers being overwhelmed with irrelevant data. Another pitfall is leaving topics open to publish without proper access policies, which could allow unauthorized services—or even external accounts—to flood a topic with junk data. Careless design can undermine the security and efficiency of SNS, so adhering to least privilege and using filters is essential. Best practices focus on minimizing noise, isolating responsibilities, and validating subscriber endpoints carefully.
Cost optimization in SNS involves minimizing redundant deliveries. Because billing counts deliveries, fanning out one message to ten subscribers produces ten deliveries rather than one, each charged at the rate for its endpoint type. Filters help reduce this by ensuring only necessary subscribers receive messages. Architectural decisions, such as whether to broadcast broadly or use targeted topics, also influence cost. For example, a tenant-specific design may reduce waste by delivering only to the customers who need messages rather than broadcasting everything to everyone. By designing with precision, organizations ensure SNS delivers maximum value without inflated expenses.
It is important to distinguish SNS from EventBridge. SNS is optimized for pub/sub fanout—publish once, deliver to many subscribers—while EventBridge specializes in routing and filtering events with fine-grained patterns and SaaS integrations. For example, broadcasting alerts to multiple consumers is an SNS job, but routing only high-value orders to a fraud-detection service is an EventBridge use case. Both services complement one another, but SNS remains the simpler choice for high-speed, broad fanout patterns. This distinction is often tested on the exam and in real-world decisions.
From an exam perspective, SNS is the right choice whenever questions describe broadcasting messages to multiple subscribers, fanout across SQS queues, or triggering Lambda functions from a central event. Keywords like “pub/sub,” “fanout,” and “notify multiple systems” should immediately signal SNS. If the scenario emphasizes routing logic or SaaS integration, EventBridge is more likely. Recognizing these signals ensures exam candidates can quickly and confidently choose the right service.
In conclusion, Amazon SNS enables scalable, reliable broadcast and targeted delivery through pub/sub messaging. Topics and subscriptions provide the foundation, while filtering, retries, FIFO support, and DLQ integration enhance reliability and precision. By integrating with Lambda, SQS, and external endpoints, SNS becomes a hub for decoupling systems and enabling parallel workflows. Security policies and monitoring ensure governance, while cost optimization keeps fanout affordable. For learners, the message is simple: use SNS when one event must reach many consumers, leveraging filtering and governance to keep systems efficient, secure, and resilient.
