
Database

Đà Nẵng, COVID year 2

Relational Database Service (RDS)

  • Relational DB PaaS (almost PaaS).
  • Supported engine options: MySQL, PostgreSQL, Oracle, Microsoft SQL Server, MariaDB, Amazon Aurora.
  • Each DB engine has a set of parameters in a DB parameter group that control the behavior of the databases that it manages.
  • Multi-AZ: synchronous replica in secondary AZ.
  • ACID compliance: transactions are Atomic, Consistent, Isolated, and Durable (ACID), which ensures data integrity.
  • Billing options: on-demand and reserved instance.
  • Bring Your Own License (BYOL) is supported for Oracle.
  • Automated or manual backups.
  • Automated or manual upgrades.
  • Can be placed in a VPC to control network access and security.

RDS Multi-AZ:

  • Synchronous replica in secondary AZ. Master-Standby relationship.
  • The standby replica is not visible to clients: it does not serve reads or writes.
  • DB snapshots are taken against the standby instance.
  • AWS manages DNS for failover.
  • Multi-AZ is different from an RDS read replica.

RDS Read Replica:

  • Read only instance, for load offloading.
  • Created from a snapshot of the master instance.
  • Asynchronous replication.
  • Up to 5 read replicas for each DB instance.
  • Same or different AZ. Same or different region.
  • Read replicas in a different Region are supported only on MariaDB, MySQL, Oracle, and PostgreSQL. Not supported on Microsoft SQL Server.
  • For Microsoft SQL Server read replicas, the source DB instance must be a Multi-AZ deployment with Always On Availability Groups (AGs). This is an Enterprise Edition feature.
  • For some DB engines, you can transform a read replica to a normal independent instance: promote to master.

RDS Automated Backups:

  • Automated daily incremental volume snapshot of your entire DB instance (VM), not just the DBs.
  • Backups are stored in Amazon S3.
  • Automated backups occur daily during the preferred backup window.
  • If you don't specify a preferred backup window, RDS assigns a default 30-minute backup window that is selected at random from an 8-hour block of time for each AWS Region.
  • Retention: Up to 35 days. Default: 1 day.
  • Deleting an RDS instance deletes the automated snapshots but not the manual snapshots.
  • RDS uploads transaction logs for DB instances to Amazon S3 every 5 minutes. The recovery point objective (RPO) is therefore about 5 minutes.

RDS Restore:

  • You cannot restore to an existing instance.
  • Customized DB parameters and SGs are not restored.
  • RDS combines daily backups with transaction logs to restore the DB instance (point-in-time recovery).
  • You can restore to any point in time up to the last five minutes.
  • Third-party backup and restore processes and options are different.
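
The point-in-time recovery described above can be pictured as two steps: pick the most recent daily backup taken at or before the target time, then replay the transaction logs up to the target. The following is a minimal sketch of that idea only. The function name `plan_restore` and the sample timestamps are hypothetical, and it does not claim to reflect how RDS implements restores internally.

```python
from datetime import datetime, timedelta

def plan_restore(snapshots, log_uploads, target):
    """Return (base_snapshot, logs_to_replay) for a restore to `target`.

    snapshots:   datetimes of daily backups (hypothetical data)
    log_uploads: datetimes at which transaction logs were uploaded
    """
    usable = [s for s in snapshots if s <= target]
    if not usable:
        raise ValueError("no backup exists at or before the target time")
    base = max(usable)
    # Replay every log uploaded after the base backup, up to the target.
    logs = sorted(t for t in log_uploads if base < t <= target)
    return base, logs

day = datetime(2021, 6, 1, 3, 0)
snapshots = [day, day + timedelta(days=1)]
# Assumed for illustration: logs are uploaded every 5 minutes for two days.
log_uploads = [day + timedelta(minutes=5 * i) for i in range(1, 2 * 24 * 12)]

target = day + timedelta(days=1, minutes=17)
base, logs = plan_restore(snapshots, log_uploads, target)
print(base, len(logs), logs[-1])
```

In this sample the restore starts from the second daily backup and replays three log uploads, so the recovered state is at most five minutes older than the requested time.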

Amazon Aurora:

  • A MySQL and PostgreSQL-compatible relational database built for the cloud.
  • With some workloads, Aurora can deliver up to 5x the throughput of MySQL and up to 3x the throughput of PostgreSQL.
  • The underlying storage grows automatically as needed.
  • Automatic clustering, replication, and storage allocation.
  • An Aurora cluster is composed of:
    • DB instances: up to 16.
    • A cluster volume that manages the data for those DB instances.

  • Aurora cluster volume:
    • A storage volume that spans multiple AZs, with each AZ having 2 copies of the DB cluster data.
    • Each data chunk (10GB) is written in 6 copies: 3 AZs, 2 copies per AZ.
    • Tolerates the loss of 2 copies without affecting writes, and of 3 copies without affecting reads.
    • Based on SSD disks.
    • Automatically grows and shrinks as the amount of data in your database increases or decreases.
    • Can grow to a maximum size of 128 tebibytes (TiB).
    • You are only charged for the space that you use in an Aurora cluster volume.
  • Two types of DB instances make up an Aurora DB cluster:
    • Primary DB instance: Read and Write. One per cluster.
    • Aurora Replica: Read only. Up to 15 per cluster.
  • Replicas can be placed in other AZs.
  • Aurora automatically fails over to an Aurora Replica.
  • Aurora multi-master clusters: all DB instances have read/write capability.
  • Clusters are created in a VPC. Security Groups can be used.
  • Storage can be encrypted.
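
The six-copy layout of the cluster volume (3 AZs, 2 copies per AZ) can be illustrated with a small availability check. The sketch below assumes quorum sizes of 4 of 6 copies for writes and 3 of 6 for reads; these values are not stated in the notes above, and the function name `availability` is hypothetical.

```python
COPIES_PER_AZ = 2
AZ_COUNT = 3
TOTAL_COPIES = COPIES_PER_AZ * AZ_COUNT  # 6 copies of each 10 GB chunk

WRITE_QUORUM = 4  # assumption: 4 of 6 copies must acknowledge a write
READ_QUORUM = 3   # assumption: 3 of 6 copies are enough to serve a read

def availability(lost_copies):
    """Return (can_write, can_read) after losing `lost_copies` copies."""
    if not 0 <= lost_copies <= TOTAL_COPIES:
        raise ValueError("lost_copies must be between 0 and 6")
    healthy = TOTAL_COPIES - lost_copies
    return healthy >= WRITE_QUORUM, healthy >= READ_QUORUM

for lost in range(5):
    print(lost, availability(lost))
```

Under these assumptions, losing a whole AZ (2 copies) leaves both reads and writes available, while losing an AZ plus one more copy leaves the data readable but not writable.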

Aurora Global Database:

  • A feature that allows a single Amazon Aurora database to span multiple AWS regions.
  • Composed of a primary DB cluster in one Region, and up to 5 read-only secondary DB clusters in different Regions.
  • Enables fast local reads in each region.
  • Asynchronous replication. Typical replication latency: about 1 second (RPO ≈ 1 s).
  • Manual failover: a secondary Region can be promoted to full read/write capability in less than 1 minute (RTO ≈ 1 min).

Aurora Serverless:

  • An on-demand, autoscaling configuration for Aurora.
  • Automatically starts up, shuts down, and scales capacity up or down based on your application's needs.
  • Advantages: simpler than provisioned Aurora, scalable, cost-effective.
  • Does not support Global databases, multi-master, replicas, …

Amazon DynamoDB

  • Fully managed (PaaS) NoSQL database with high scalability and low latency.
  • Automatically and synchronously replicates data across 3 AZs.
  • Not ideal for joins and complex transactions.
  • All items with the same partition key value are stored together (physical storage), in sorted order by sort key value.
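
To make the partition key and sort key layout concrete, here is a toy in-memory model: items are grouped by partition key and kept ordered by sort key, so a range query within one partition is a simple slice. The class `ToyTable` and its methods are hypothetical and only mimic the storage idea; this is not the DynamoDB API.

```python
from bisect import bisect_left, bisect_right, insort

class ToyTable:
    """Items grouped by partition key, ordered by sort key within each group."""

    def __init__(self):
        self._sort_keys = {}  # partition key -> sorted list of sort keys
        self._items = {}      # (partition key, sort key) -> item

    def put(self, pk, sk, item):
        # A (pk, sk) pair is unique: writing it again replaces the item.
        if (pk, sk) not in self._items:
            insort(self._sort_keys.setdefault(pk, []), sk)
        self._items[(pk, sk)] = item

    def query(self, pk, sk_from, sk_to):
        """Return items of one partition with sk_from <= sort key <= sk_to."""
        keys = self._sort_keys.get(pk, [])
        lo = bisect_left(keys, sk_from)
        hi = bisect_right(keys, sk_to)
        return [self._items[(pk, sk)] for sk in keys[lo:hi]]

table = ToyTable()
table.put("user#1", "2021-03-02", {"order": "B"})
table.put("user#1", "2021-01-15", {"order": "A"})
table.put("user#2", "2021-02-01", {"order": "C"})
table.put("user#1", "2021-05-20", {"order": "D"})

result = table.query("user#1", "2021-01-01", "2021-03-31")
print(result)
```

The query returns orders A and B in sort-key order, regardless of the order in which they were written, and never touches the `user#2` partition.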

DynamoDB Capacity modes:

  • Amazon DynamoDB has two read/write capacity modes for processing reads and writes on your tables:
    • On-demand: pay-per-request.
    • Provisioned: default.
  • The mode can be changed.
  • Provisioned mode:
    • Free-tier eligible.
    • Sets throughput limits for cost predictability.
    • Supports auto scaling.
    • Supports reserved capacity.
  • Capacity Units:
    • You specify throughput requirements in terms of capacity units.
    • A read capacity unit (RCU) represents one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size.
    • A write capacity unit (WCU) represents one write per second, for an item up to 1 KB in size.
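
The capacity unit definitions above translate into simple arithmetic. The sketch below is a rough estimator with hypothetical function names. It assumes that item sizes are rounded up to the next 4 KB (reads) or 1 KB (writes) boundary and that 1 KB means 1024 bytes, which goes slightly beyond what the bullets state.

```python
import math

def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate RCUs: one unit per 4 KB block per strongly consistent read."""
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    units = blocks * reads_per_second
    if not strongly_consistent:
        units = units / 2  # eventually consistent reads cost half
    return math.ceil(units)

def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate WCUs: one unit per 1 KB block per write."""
    blocks = max(1, math.ceil(item_size_bytes / 1024))
    return blocks * writes_per_second

# 6 KB items read 100 times per second, 2.5 KB items written 10 times per second.
print(read_capacity_units(6 * 1024, 100))
print(read_capacity_units(6 * 1024, 100, strongly_consistent=False))
print(write_capacity_units(2560, 10))
```

With these inputs the estimate is 200 RCUs for strongly consistent reads, 100 RCUs for eventually consistent reads, and 30 WCUs for the writes.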

DynamoDB Accelerator (DAX):

  • A managed in-memory cache for DynamoDB.
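
The general idea of an in-memory cache in front of a table can be sketched as a read-through cache with a time-to-live. Everything in this example is hypothetical (the class name, the 300-second TTL, the plain dictionary standing in for a table); it illustrates the caching pattern and is not the DAX client API.

```python
import time

class ReadThroughCache:
    """Serve reads from memory; fall back to the table on a miss or expiry."""

    def __init__(self, load_from_table, ttl_seconds=300.0, clock=time.monotonic):
        self._load = load_from_table   # callable: key -> item (or None)
        self._ttl = ttl_seconds        # assumed TTL, chosen for illustration
        self._clock = clock
        self._entries = {}             # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: read from the table and cache the result.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

table = {"user#1": {"name": "An"}}  # stand-in for a DynamoDB table
cache = ReadThroughCache(table.get)
cache.get("user#1")  # miss: loaded from the table
cache.get("user#1")  # hit: served from memory
print(cache.hits, cache.misses)
```

The second read never reaches the table, which is where the latency gain of a cache comes from. The trade-off is that a cached item can be stale until its TTL expires.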

DynamoDB Global Tables:

  • Provide a fully managed solution for deploying a multiregion, multi-active database.
  • A DynamoDB global table consists of multiple replica tables (one per Region) that DynamoDB treats as a single unit.
  • All replicas are Read/Write.
  • Newly written items are usually propagated to all replica tables within a second.
  • When doing a read, your application can choose between "eventually consistent" (default) and "strongly consistent" reads.
  • If your application requires strongly consistent reads, it must perform all of its strongly consistent reads and writes in the same Region. DynamoDB does not support strongly consistent reads across Regions.
  • If applications update the same item in different regions, the last writer wins.
  • Failover: You can apply custom business logic to determine when to redirect requests to other Regions.
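
The "last writer wins" rule for concurrent updates can be sketched as picking the write with the latest timestamp. The tuple shape, the function name `resolve`, and the use of the Region name as a tie-breaker are assumptions made for this illustration; the notes above do not describe how DynamoDB orders writes internally.

```python
def resolve(writes):
    """Pick the surviving version of an item from concurrent regional writes.

    writes: list of (timestamp, region, value) tuples (hypothetical shape).
    """
    if not writes:
        raise ValueError("no writes to resolve")
    # Latest timestamp wins; the region name is only a deterministic
    # tie-breaker chosen for this sketch.
    return max(writes, key=lambda w: (w[0], w[1]))

writes = [
    (1000.120, "ap-southeast-1", {"stock": 7}),
    (1000.450, "eu-west-1", {"stock": 5}),
    (1000.300, "us-east-1", {"stock": 9}),
]
winner = resolve(writes)
print(winner[1], winner[2])
```

Here the `eu-west-1` write survives and the other two updates are silently overwritten, which is why applications that cannot tolerate lost updates usually route writes for a given item to a single Region.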

DynamoDB Backup:

Two types of backups:

  • On-demand backups:
    • Full backups of your tables for long-term archival.
    • All on-demand backups are cataloged, discoverable, and retained until explicitly deleted.
    • Can be scheduled using AWS Backup.
  • Continuous automatic backups:
    • Enable point-in-time recovery (PITR).
    • You can restore to any point in time from 35 days ago up to 5 minutes ago.
  • Both types support cross-region restores.
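
The point-in-time recovery window above (between 5 minutes and 35 days in the past) can be checked with a few lines of date arithmetic. This is a minimal sketch with hypothetical helper names, and it assumes continuous backups have been enabled for the whole 35-day period.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=35)   # oldest restorable point
MIN_LAG = timedelta(minutes=5)   # newest restorable point is 5 minutes ago

def restore_window(now):
    """Return (earliest, latest) restorable timestamps as seen at `now`."""
    return now - RETENTION, now - MIN_LAG

def is_restorable(target, now):
    earliest, latest = restore_window(now)
    return earliest <= target <= latest

now = datetime(2021, 9, 1, 12, 0, tzinfo=timezone.utc)
print(is_restorable(now - timedelta(minutes=2), now))  # too recent
print(is_restorable(now - timedelta(days=10), now))    # inside the window
print(is_restorable(now - timedelta(days=40), now))    # too old
```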

Performance: RDS vs DynamoDB:

  • DynamoDB can handle higher rates for read/write operations (10K+ operations per second).
  • DynamoDB can scale with sharding. RDS has a single node/shard.
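
Sharding spreads items across several storage nodes by hashing the partition key. The sketch below shows the general technique with MD5 and a modulo. The hash function, the shard count, and the name `shard_for` are assumptions for illustration; DynamoDB's internal hash function and partition management are not described in these notes.

```python
import hashlib
from collections import Counter

def shard_for(partition_key, shard_count):
    """Map a partition key to a shard index in [0, shard_count)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

# Distribute 1000 hypothetical keys over 4 shards and count items per shard.
counts = Counter(shard_for(f"user#{i}", 4) for i in range(1000))
print(sorted(counts.items()))
```

Because the mapping depends only on the key, every request for the same key lands on the same shard, and adding shards lets the table absorb more read and write traffic than a single node could.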

Amazon ElastiCache:

  • Managed in-memory caching service.
  • Choice of engines: Memcached or Redis.

Memcached:

  • Simple features.
  • Multithreaded (sharding).
  • Low maintenance.
  • Easy horizontal scaling with Auto Discovery.

Redis:

  • Has many more features than Memcached.

Amazon Redshift:

  • Fully managed petabyte-scale relational data warehouse service.

Amazon Redshift Spectrum:

  • Enables you to query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables.

Amazon Neptune:

  • A fully managed graph database service.

Amazon DocumentDB:

  • A fully managed database service for MongoDB-compatible databases in the cloud.