Episode 55: EC2 Basics (Virtual Servers)

When designing on AWS, one of the most important choices you’ll make is storage. The platform offers several options — object, block, file, and archival storage — and each is optimized for different patterns of durability, cost, and performance. Selecting the wrong one can lead to unnecessary expense or poor application performance. Selecting the right one unlocks efficiency, scalability, and resilience. Beginners should think of storage like different types of containers in a kitchen: a refrigerator for perishable items, a pantry for dry goods, a freezer for long-term preservation, and a toolbox for specialized equipment. Each has its role, and AWS storage services follow the same principle.
Amazon S3, or Simple Storage Service, is the foundation of object storage in AWS. It is designed for virtually unlimited scalability and very high durability: eleven nines, or 99.999999999 percent. S3 stores data as objects inside buckets, making it perfect for backups, static content, media assets, and big data lakes. For learners, picture S3 as a giant warehouse with shelves that never run out of space, where every object is tagged with its own identifier and metadata. Unlike block or file storage, S3 is not about mounting drives — it is about retrieving objects directly, making it a flexible backbone for modern workloads.
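To make that concrete, here is a minimal sketch of storing and retrieving an object with boto3, the AWS SDK for Python. The bucket name, key, and metadata below are placeholders, not anything specific to any real account.

```python
import boto3

s3 = boto3.client("s3")

# Upload an object. The key is the object's identifier within the bucket;
# "my-example-bucket" and "backups/report.txt" are placeholder names.
s3.put_object(
    Bucket="my-example-bucket",
    Key="backups/report.txt",
    Body=b"quarterly report contents",
    Metadata={"department": "finance"},  # optional user-defined metadata
)

# Retrieve the object directly by key: no drive to mount, no file system.
response = s3.get_object(Bucket="my-example-bucket", Key="backups/report.txt")
print(response["Body"].read())
print(response["Metadata"])  # the metadata travels with the object
```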
S3 offers multiple storage classes, starting with S3 Standard, which provides the highest availability across multiple Availability Zones. This is the default for frequently accessed data. Intelligent-Tiering introduces automation by moving objects between frequent and infrequent access tiers depending on usage patterns. Beginners should imagine a warehouse that reorganizes shelves automatically, moving items closer to the entrance if they’re popular, and further back if they’re rarely used. Intelligent-Tiering reduces costs without sacrificing availability, making it ideal when access patterns are unpredictable.
For less frequently accessed data, S3 Standard-IA (Infrequent Access) and One Zone-IA provide lower-cost options. Standard-IA stores data across multiple Availability Zones, maintaining resilience, while One Zone-IA keeps data in a single Availability Zone at a further discount. Beginners should think of Standard-IA as renting climate-controlled storage across multiple buildings, while One Zone-IA is like keeping it in just one warehouse. The tradeoff is resilience: One Zone-IA costs less but risks availability if that single zone fails. Choosing wisely depends on whether the data is easily reproducible or mission-critical.
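In the S3 API, the storage class is simply a parameter chosen per object at upload time. A short sketch, reusing the placeholder bucket from above:

```python
import boto3

s3 = boto3.client("s3")

# The StorageClass parameter selects the class per object at upload time.
# Omitting it defaults to S3 Standard.
for key, storage_class in [
    ("hot/homepage.html", "STANDARD"),                 # frequently accessed
    ("unknown/usage-log.csv", "INTELLIGENT_TIERING"),  # unpredictable access
    ("monthly/backup.tar", "STANDARD_IA"),             # infrequent, multi-AZ
    ("scratch/rebuildable.dat", "ONEZONE_IA"),         # infrequent, single AZ
]:
    s3.put_object(
        Bucket="my-example-bucket",
        Key=key,
        Body=b"...",
        StorageClass=storage_class,
    )
```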
Archival storage in S3 comes through Glacier tiers. S3 Glacier Instant Retrieval provides millisecond access for archived objects that are rarely used but still must be retrievable quickly. Glacier Flexible Retrieval allows retrieval within minutes to hours, with options for expedited or bulk retrievals. Glacier Deep Archive is the lowest-cost option, suitable for long-term retention where retrieval can take up to twelve hours. Beginners should view these like different depths of a freezer: the top drawer for quick access, the middle shelves for occasional use, and the deep back for long-term frozen goods. Each balances access speed with cost.
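One practical wrinkle: objects in Glacier Flexible Retrieval or Deep Archive must be restored before they can be read, while Instant Retrieval skips that step. A minimal sketch of requesting a temporary restore, with placeholder names:

```python
import boto3

s3 = boto3.client("s3")

# Ask S3 to stage an archived object for 7 days of temporary access.
# Tier can be "Expedited", "Standard", or "Bulk" for Flexible Retrieval;
# Deep Archive supports only "Standard" and "Bulk".
s3.restore_object(
    Bucket="my-example-bucket",
    Key="archives/2018-audit.zip",
    RestoreRequest={
        "Days": 7,
        "GlacierJobParameters": {"Tier": "Bulk"},  # cheapest, slowest
    },
)

# head_object reports restore progress in the "Restore" response field.
status = s3.head_object(Bucket="my-example-bucket", Key="archives/2018-audit.zip")
print(status.get("Restore"))
```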
S3 also includes management features that strengthen governance. Versioning ensures older versions of objects are retained even if overwritten or deleted, enabling rollback after mistakes. Lifecycle policies automatically move objects between classes or delete them when no longer needed, optimizing costs. Replication copies objects across Regions for resilience or compliance. Beginners should see these features as automated housekeeping staff in the warehouse: they keep everything organized, rotated, and backed up without manual effort.
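Versioning and replication are both bucket-level settings. A sketch, assuming placeholder bucket names and a pre-created IAM role that grants S3 permission to replicate:

```python
import boto3

s3 = boto3.client("s3")

# Versioning keeps prior object versions; it is also a prerequisite for
# replication (both source and destination buckets need it enabled).
s3.put_bucket_versioning(
    Bucket="my-example-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)

# Replicate every new object to a bucket in another Region.
# The role ARN and destination bucket ARN are placeholders.
s3.put_bucket_replication(
    Bucket="my-example-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
        "Rules": [
            {
                "Status": "Enabled",
                "Prefix": "",  # empty prefix means replicate all objects
                "Destination": {"Bucket": "arn:aws:s3:::my-example-bucket-dr"},
            }
        ],
    },
)
```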
Amazon Elastic Block Store, or EBS, provides block-level storage for EC2 instances. Unlike S3, EBS volumes behave like traditional hard drives, with low-latency access and the ability to be formatted with file systems. General-purpose gp3 volumes balance cost and performance, while provisioned IOPS volumes, io1 and io2, deliver guaranteed high input/output for databases or critical workloads. Throughput-optimized st1 and cold storage sc1 serve large, sequential workloads like logs or backups at lower cost. Beginners should think of EBS as the toolbox of specialized hard drives: you choose the model that fits performance and durability requirements.
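Creating volumes shows how performance is dialed in per type. A sketch with placeholder Availability Zone, sizes, and performance figures:

```python
import boto3

ec2 = boto3.client("ec2")

# A general-purpose gp3 volume: performance is configured independently
# of size (baseline 3000 IOPS and 125 MiB/s, raisable as shown here).
gp3 = ec2.create_volume(
    AvailabilityZone="us-east-1a",  # placeholder AZ
    Size=100,                       # GiB
    VolumeType="gp3",
    Iops=6000,
    Throughput=250,                 # MiB/s
)

# A provisioned-IOPS io2 volume for a latency-sensitive database.
io2 = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=500,
    VolumeType="io2",
    Iops=20000,
)
print(gp3["VolumeId"], io2["VolumeId"])
```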
Snapshots allow EBS volumes to be backed up to S3, providing durability and disaster recovery. Snapshots are incremental, meaning only changes are saved, reducing storage costs. Fast Snapshot Restore enables volumes created from snapshots to be immediately performant without warming up. For learners, this is like photocopying only the edited pages of a manuscript instead of the entire book, saving time and space. Snapshots make it easy to back up, replicate, and restore block storage consistently.
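A sketch of both operations, with a placeholder volume ID:

```python
import boto3

ec2 = boto3.client("ec2")

# Snapshots are incremental: after the first full copy, only changed
# blocks are stored. The volume ID below is a placeholder.
snap = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="nightly backup",
)

# Optionally pre-warm volumes restored from this snapshot so they
# deliver full performance immediately in the listed AZs.
ec2.enable_fast_snapshot_restores(
    AvailabilityZones=["us-east-1a"],
    SourceSnapshotIds=[snap["SnapshotId"]],
)
```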
Amazon Elastic File System, or EFS, offers shared file storage that multiple EC2 instances can mount simultaneously. It is scalable, elastic, and managed by AWS, expanding or contracting automatically as files are added or removed. Performance modes, such as General Purpose and Max I/O, adjust behavior for latency-sensitive versus highly parallel workloads. Throughput modes offer baseline performance or provisioned levels for heavy-duty tasks. Beginners should view EFS as a shared filing cabinet accessible by multiple workers at once, scaling seamlessly as more folders are added.
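Both modes are chosen when the file system is created. A minimal sketch; the creation token is just an arbitrary idempotency string:

```python
import boto3

efs = boto3.client("efs")

# Create a shared file system; performance and throughput modes are
# set at creation time.
fs = efs.create_file_system(
    CreationToken="shared-web-content",  # placeholder idempotency token
    PerformanceMode="generalPurpose",    # or "maxIO" for massive parallelism
    ThroughputMode="bursting",           # or "provisioned" / "elastic"
    Encrypted=True,
)
print(fs["FileSystemId"])
```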
Hybrid transfer services extend AWS storage beyond the cloud. AWS Transfer Family supports traditional protocols like SFTP, FTPS, and FTP, making it easier to migrate existing workflows into S3. AWS DataSync accelerates large-scale transfers, automating scheduling, encryption, and verification. Beginners should picture these as delivery trucks that can move large amounts of records from old offices into the new warehouse. Transfer services reduce the friction of adopting cloud storage while preserving existing workflows.
Storage Gateway bridges on-premises and AWS storage, allowing organizations to cache or extend local storage into the cloud. File Gateway exposes cloud-backed file shares, Volume Gateway provides block volumes with cloud snapshots, and Tape Gateway replaces physical tape systems with virtual ones. Beginners should think of Storage Gateway as adding hidden tunnels between your local office and AWS’s warehouses, making it feel like the cloud is just another wing of your building. It supports hybrid adoption while preserving local access.
For offline or disconnected environments, the AWS Snow Family provides physical devices for bulk data transfer. Snowball and Snowcone devices can be shipped to customer sites, loaded with data, and returned to AWS for import. Snowmobile, a shipping container-sized system, is used for massive petabyte-scale transfers. For learners, this is like using trucks or even an entire train to physically move documents when networks cannot handle the load. Snow devices solve the challenge of data gravity when moving vast datasets into the cloud.
Across all these storage services, encryption and access control remain non-negotiable. AWS provides server-side encryption for S3, EBS, and EFS, often integrated with KMS keys. IAM policies, bucket policies, and resource permissions ensure that only authorized identities can access data. Beginners should see encryption as the locks on every cabinet and access control as the keys and badges. Without them, even the most durable storage becomes unsafe. Governance and security must always complement durability, cost, and performance.
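As one example of turning that principle into configuration, default encryption can be set once per bucket, after which every new object is encrypted automatically. The bucket name and KMS key ARN below are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Encrypt every new object in the bucket by default with a KMS key.
s3.put_bucket_encryption(
    Bucket="my-example-bucket",
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "arn:aws:kms:us-east-1:111122223333:key/example-key-id",
                }
            }
        ]
    },
)
```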
The best way to understand AWS storage is to map services to real-world use cases. For a static website or content delivery, the natural choice is S3 combined with CloudFront. S3 provides object storage for HTML, CSS, images, and videos, while CloudFront accelerates delivery globally and adds protection at the edge. Beginners should think of this as storing your content in a central warehouse while a network of local shops distributes it quickly to customers around the world. This pattern is cost-effective, scalable, and extremely resilient.
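When uploading assets for this pattern, it helps to set a content type, so browsers render files rather than download them, and cache headers that CloudFront can honor. A small sketch with placeholder names and values:

```python
import boto3

s3 = boto3.client("s3")

# Upload a page for a static site. ContentType tells browsers how to
# render it; CacheControl tells CloudFront and browsers how long to
# cache it. All names and values here are placeholders.
s3.put_object(
    Bucket="my-example-bucket",
    Key="index.html",
    Body=b"<html><body>Hello from S3</body></html>",
    ContentType="text/html",
    CacheControl="max-age=86400",  # cache for one day
)
```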
When running EC2 instances, the best practice is to pair them with EBS volumes. EBS acts like a hard drive, providing persistent block storage for operating systems, databases, and applications. If the instance stops or fails, the data remains on the EBS volume, ready to reattach to another EC2 instance. For learners, this is like parking a car in a garage: the car may move or change, but the garage remains, holding its contents securely. EBS delivers consistency and low-latency performance needed for compute workloads.
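The reattachment step looks like this in boto3; the volume and instance IDs are placeholders, and the volume must live in the same Availability Zone as the target instance:

```python
import boto3

ec2 = boto3.client("ec2")

# Detach a volume from a failed instance and attach it to a healthy one.
ec2.detach_volume(
    VolumeId="vol-0123456789abcdef0",
    InstanceId="i-0aaaaaaaaaaaaaaaa",  # placeholder: failed instance
)
ec2.attach_volume(
    VolumeId="vol-0123456789abcdef0",
    InstanceId="i-0bbbbbbbbbbbbbbbb",  # placeholder: healthy instance
    Device="/dev/sdf",
)
```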
Shared Linux file systems across multiple instances are best handled by EFS. With its fully managed, elastic file storage, EFS lets many instances read and write simultaneously, making it ideal for web applications, content management systems, or analytics workflows. Beginners can picture this as a communal filing cabinet in an office: multiple workers can access the same folders at the same time, and the cabinet grows automatically as more documents are added. This eliminates manual provisioning and ensures continuous availability.
For long-term data retention, S3 Glacier tiers are the go-to option. Whether backups, archives, or compliance records, Glacier tiers reduce costs by trading off retrieval time. Glacier Instant is best when you need rare but rapid access, Flexible Retrieval balances speed and cost, and Deep Archive provides the lowest cost for data rarely touched but legally required to be kept. Beginners should think of this as choosing between quick-access storage boxes, regular storage lockers, and deep vaults in a warehouse. Each serves a different balance of cost and retrieval urgency.
Data lakes for analytics often use S3 as their foundation. By storing raw and processed data in S3, organizations can then use AWS Glue for cataloging and Amazon Athena for querying. This eliminates the need for traditional databases to store massive amounts of structured and unstructured data. For learners, this is like organizing a library where every book and article is stored in one central archive, and researchers use search tools to pull exactly what they need. S3 provides the durability and flexibility to support modern analytics.
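A sketch of querying that central archive with Athena; the database, table, and bucket names are placeholders assumed to already exist in a Glue catalog:

```python
import boto3

athena = boto3.client("athena")

# Query data sitting in S3 directly; Glue supplies the table metadata,
# and results land back in S3 at the given output location.
job = athena.start_query_execution(
    QueryString="SELECT page, COUNT(*) AS hits FROM access_logs GROUP BY page",
    QueryExecutionContext={"Database": "analytics_lake"},
    ResultConfiguration={"OutputLocation": "s3://my-example-bucket/athena-results/"},
)
print(job["QueryExecutionId"])
```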
Cost tuning in storage often revolves around choosing the right class and using lifecycle rules. For example, objects may start in S3 Standard, transition to Intelligent-Tiering, and eventually move to Glacier. Lifecycle policies automate this process, ensuring humans don’t need to manually shuffle data. Beginners should think of this as a cleaning crew that periodically moves old files from desks to storage rooms, and eventually to the basement archive. The result is significant cost savings without sacrificing compliance.
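A lifecycle rule expressing exactly that progression, with placeholder bucket name, prefix, and timings:

```python
import boto3

s3 = boto3.client("s3")

# One rule, three stages: tier down at 30 days, archive at 365 days,
# and expire after roughly 7 years (2555 days).
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-then-archive",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "INTELLIGENT_TIERING"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 2555},
            }
        ]
    },
)
```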
Performance tuning is particularly important with EBS. Provisioned IOPS volumes can deliver thousands of consistent I/O operations for workloads like Oracle or SQL Server. Throughput-optimized volumes are better for sequential reads and writes like logs. Beginners should see this as choosing between a sports car for speed, a delivery truck for throughput, or a commuter car for balanced needs. Selecting the right EBS type ensures you get the performance you pay for without overspending.
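Volume type and performance can even be changed on a live volume. A sketch with a placeholder ID and figures:

```python
import boto3

ec2 = boto3.client("ec2")

# Migrate a volume to provisioned IOPS, or simply raise its performance,
# without detaching it from the instance.
ec2.modify_volume(
    VolumeId="vol-0123456789abcdef0",
    VolumeType="io2",  # move from gp3 to provisioned IOPS
    Iops=16000,        # guaranteed IOPS for the database workload
)
```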
Durability and availability differ between services. S3 and Glacier promise eleven nines of durability and multiple Availability Zones for resilience. EBS provides high durability but is tied to a single AZ unless you back it up with snapshots. EFS is regional by default, spanning multiple zones for resilience. Beginners should see this as comparing safes: some are fireproof and distributed across multiple buildings, while others are strong but located in just one building. Matching durability needs to service design is key for reliability.
Cross-Region replication enhances resilience further. S3 buckets can replicate objects automatically to other Regions, supporting disaster recovery or compliance requirements. Similarly, EBS snapshots can be copied across Regions. For learners, this is like keeping a backup of important documents in another city in case the main office is disrupted. Replication adds cost but ensures continuity when regional outages or disasters strike.
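For EBS, the copy is initiated from the destination Region. A sketch with placeholder Regions and snapshot ID:

```python
import boto3

# copy_snapshot is called in the destination Region and pulls the
# snapshot over from the source Region.
ec2_west = boto3.client("ec2", region_name="us-west-2")
copy = ec2_west.copy_snapshot(
    SourceRegion="us-east-1",
    SourceSnapshotId="snap-0123456789abcdef0",
    Description="DR copy of nightly backup",
)
print(copy["SnapshotId"])
```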
Compliance often requires immutable storage, and S3 Object Lock delivers this. By enabling write-once, read-many retention, Object Lock prevents objects from being deleted or modified until their retention period expires. This is essential for regulations like SEC 17a-4 or financial audit requirements. Beginners should think of Object Lock as sealing records in tamper-proof envelopes: they remain available but cannot be altered until rules allow. This protects against both accidental deletion and intentional tampering.
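Two details matter in practice: Object Lock must be enabled when the bucket is created, and COMPLIANCE mode retention cannot be shortened or removed by any identity. A sketch with placeholder names:

```python
import boto3

s3 = boto3.client("s3")

# Object Lock must be switched on at bucket creation; it cannot simply
# be enabled later on an ordinary bucket.
s3.create_bucket(
    Bucket="my-compliance-bucket",
    ObjectLockEnabledForBucket=True,
)

# Default retention: every new object is write-once, read-many for
# 7 years; COMPLIANCE mode means nobody can shorten or remove it.
s3.put_object_lock_configuration(
    Bucket="my-compliance-bucket",
    ObjectLockConfiguration={
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": 7}},
    },
)
```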
Hybrid storage needs are solved with tools like Storage Gateway and DataSync. Storage Gateway enables organizations to extend existing applications into AWS with file shares, block volumes, or tape replacements. DataSync accelerates bulk movement of data into S3, EFS, or FSx with encryption and verification. For learners, these services are like building tunnels and high-speed highways between local offices and AWS warehouses. They support gradual cloud adoption and ongoing hybrid strategies.
Common pitfalls with AWS storage often appear on the exam. These include leaving S3 buckets public unintentionally, choosing the wrong storage class and overpaying, or failing to configure lifecycle rules, leading to spiraling costs. Beginners should see these as rookie mistakes in office management: leaving filing cabinets unlocked, using premium storage for dusty archives, or never cleaning out old files. Avoiding these pitfalls is just as important as knowing the features themselves.
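The first pitfall has a direct remedy: S3 Block Public Access, which can be enabled per bucket or account-wide. A sketch with a placeholder bucket name:

```python
import boto3

s3 = boto3.client("s3")

# Block Public Access shuts off the classic pitfall of accidentally
# public buckets; enabling all four settings is the safe default.
s3.put_public_access_block(
    Bucket="my-example-bucket",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```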
From an exam perspective, the skill is selecting storage based on access patterns, cost, and service-level agreements. If frequent random access is required, choose EBS. If multiple servers need shared files, choose EFS. If you are storing static or infrequently accessed objects, S3 and its storage classes fit. If archival or compliance retention is the concern, Glacier tiers or Object Lock are the right answers. Learners should practice mapping scenarios to storage choices quickly, as many exam questions are framed this way.
In conclusion, AWS offers a spectrum of storage solutions: S3 for object, EBS for block, EFS for shared file, and Glacier for archival. Each service shines in different contexts, and the real power comes from automating lifecycle transitions, replication, and encryption. For learners, the playbook is simple: match storage type to workload, optimize for cost and durability, and use governance tools to keep everything compliant. When done correctly, AWS storage not only holds data but also enables performance, resilience, and long-term business value.
