This guide introduces Amazon RDS and what it's used for, then walks you through setting up an RDS database with detailed instructions.
Amazon RDS (Relational Database Service) is an on-demand, managed database service offering from AWS (Amazon Web Services).
- Per Amazon, "Customers use Amazon RDS databases primarily for online-transaction processing (OLTP) workloads," while Amazon Redshift, for comparison, is used primarily for analytics and reporting. If Redshift will better meet your needs, be sure to connect with Zuar for expert help.
- Unlike Amazon's DynamoDB, which is fully managed apart from creating tables, RDS requires some assembly to get rolling. You don't need to be an expert in EC2 or any other compute service, though, and you don't need an advanced DBA (Database Administrator) skill set either.
The RDS service lets you spin up both open- and closed-source database engines, as well as Amazon's own Aurora engine, which is MySQL and PostgreSQL compatible and needs no separate license. RDS provides a Free Tier with limitations, but it expires a year after you open your account.
Aurora arguably has more features managed by Amazon than the other RDS offerings. Amazon advertises that Aurora bakes in replication, backup, monitoring, upgrades, and other tasks that users of a different RDS engine would, in some cases, be responsible for themselves.
In addition to Aurora, RDS is also available on MySQL, PostgreSQL, MariaDB, Oracle, and Microsoft SQL Server. With the latter two you can use a free/community edition, import a license, or purchase a license alongside the provisioning of RDS itself.
How to Choose the Right RDS Database Engine
This question may not have the clearest answer! If you are moving a workload from a legacy system, especially one from before 2014, you may want to keep using Oracle or MS SQL Server rather than rewrite code, in which case the answer is clear. If your system already uses PostgreSQL or MySQL, you may want to consider Aurora for its ease of overall management and TCO (Total Cost of Ownership).
The decision points to keep in mind are:
- Cost: How much will this cost to run? Do I need a license? How hard is it to manage?
- Performance: Do I know my replication factors? Do I need to be multi-region? Do I know how much CPU and memory I need?
- Future Proofing: PostgreSQL, MySQL and Aurora are growing in use, while closed-source distributions like Oracle and MS SQL Server are declining in use.
How to Set Up and Configure an RDS Database
The preferred method to instantiate RDS instances is from the AWS Management Console, seen below.
Notice that AWS encourages the user to create an Aurora database. There is a pane below the 'Resources' pane (not shown) that allows you to create a database other than Aurora.
Alternatively, a database that is active can be viewed here:
Configuring a Database (Non Aurora)
There are two creation methods to choose from, regardless of database type. We'll walk through 'Easy create' first, then 'Standard create'. 'Easy create' is extremely simple and a great way to get rolling, but it uses defaults for almost every setting besides the password, so it is inadequate for any moderately complex network or VPC scheme.
You can select any of the database types, and each one has a username field. All but Aurora have a password field, and some ask you to choose a main database name, with a default already filled in. Also note that the MySQL, MariaDB, and MS SQL Server options offer a 'DB Instance Size' of 'Free Tier' as well.
Note that we've limited the examples below to the PostgreSQL configuration.
Once you've configured identifiers, usernames, and passwords, you can view the defaults that 'Easy create' sets. It helps to use an AWS-generated password, or a strong generated password stored in a vault, because after this screen you will need a tool such as PgAdmin or a CLI to change the password.
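If you would rather generate the password yourself before storing it in a vault, something like the following would do. It is a minimal sketch using Python's standard `secrets` module. The function name `generate_db_password`, the 24-character default, and the conservative symbol set are illustrative assumptions (chosen to avoid slashes, quotes, the at sign, and spaces, which RDS master passwords typically reject), so check them against the hints the console shows for your engine.

```python
import secrets
import string

# Assumption: a conservative symbol set that avoids slash, quotes, at sign
# and whitespace, which the console is expected to reject.
SAFE_PUNCTUATION = "!#$%^&*()-_=+"
ALPHABET = string.ascii_letters + string.digits + SAFE_PUNCTUATION


def generate_db_password(length: int = 24) -> str:
    """Return a random password with at least one lower, upper, digit and symbol."""
    if length < 8:
        raise ValueError("length must be at least 8")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until every character class is represented.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)
                and any(c in SAFE_PUNCTUATION for c in candidate)):
            return candidate


if __name__ == "__main__":
    print(generate_db_password())
```

Paste the result into the password field and into your vault at the same time, so the two never drift apart.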
Inspect the defaults. Assuming the database is for a service within your default VPC, it should work out of the box. It's important to realize that a database made with 'Easy create' is not reachable from the open internet, so you will need a VPN service to use it from an external computer (with PgAdmin, for example). Also note that 'Easy create' does not show what the database will cost, or even an estimate. 'Standard create' (see below) shows this, or it can be calculated here.
With 'Standard create', as with 'Easy create', you start by selecting the engine type, and the credentials settings are similar.
The same goes for the DB instance class and storage, as these are AWS-specific settings. Here, the defaults will cover most workloads, but you may need to research what type of storage your expected workloads require. Furthermore, storage autoscaling can be a very powerful tool: it spares you from manually growing storage as the database grows, or at least prevents production or dev outages once the allocated storage threshold is reached.
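For readers who script their setup, the storage choices above map onto a few keyword arguments of the `create_db_instance` call in boto3 (the AWS SDK for Python). The sketch below only builds and validates those arguments; the helper name `storage_settings`, the `gp3` default, and the example sizes are assumptions for illustration, not recommendations.

```python
def storage_settings(allocated_gib: int, max_gib: int, storage_type: str = "gp3") -> dict:
    """Build the storage-related keyword arguments for create_db_instance.

    Setting MaxAllocatedStorage above AllocatedStorage is what turns on
    storage autoscaling; RDS may grow the volume up to that ceiling.
    """
    if allocated_gib <= 0:
        raise ValueError("allocated_gib must be positive")
    if max_gib <= allocated_gib:
        raise ValueError("max_gib must exceed allocated_gib so autoscaling has room")
    return {
        "AllocatedStorage": allocated_gib,
        "MaxAllocatedStorage": max_gib,
        "StorageType": storage_type,
    }


if __name__ == "__main__":
    # Example sizes are placeholders; pick values that fit your workload.
    print(storage_settings(allocated_gib=20, max_gib=100))
    # With boto3 installed and credentials configured, the dict could be
    # merged into a call such as:
    #   boto3.client("rds").create_db_instance(..., **storage_settings(20, 100))
```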
Availability and durability options differ on each platform and are not always configurable. This is mainly a means of having database replication distributed across AWS Availability Zones (AZs).
Connectivity is the hardest part of standing up RDS for the uninitiated. Blindly opening up the database to the public internet is highly discouraged, so public access is set to 'No' by default. When you create an AWS account, AWS creates a default VPC security group, and any other AWS infrastructure within this security group will be able to interact with your database. You can learn more about VPC security groups here.
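If the default security group is too broad for your needs, one common alternative is a rule that admits only one other security group on the database port. The sketch below builds such a rule in the shape expected by the EC2 `authorize_security_group_ingress` call in boto3. The helper name `postgres_ingress_rule` and the group IDs are hypothetical, and port 5432 assumes the PostgreSQL default.

```python
def postgres_ingress_rule(source_group_id: str, port: int = 5432) -> dict:
    """Build one IpPermissions entry allowing TCP traffic from another security group."""
    if not source_group_id.startswith("sg-"):
        raise ValueError("expected a security group ID such as 'sg-...'")
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # Referencing a group instead of a CIDR keeps the database private:
        # only resources attached to that group can reach it.
        "UserIdGroupPairs": [
            {"GroupId": source_group_id, "Description": "app servers to database"}
        ],
    }


if __name__ == "__main__":
    rule = postgres_ingress_rule("sg-0123456789abcdef0")  # hypothetical ID
    print(rule)
    # With boto3 installed, the rule could be applied to the database's group:
    #   boto3.client("ec2").authorize_security_group_ingress(
    #       GroupId="sg-<database group>", IpPermissions=[rule])
```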
Unless you have very targeted IAM roles and granular controls, or have stood up Kerberos, password authentication might be the best option.
Furthermore, the 'Additional configuration' section allows for much more granular changes.
Performance Insights data is retained for 7 days unless otherwise specified, and it is stored encrypted with KMS keys (note that this cannot be changed).
Monitoring works out of the box for PostgreSQL and AWS services and is enabled by default. Maintenance windows can also be modified to allow for controlled outages of the database.
Finally, deletion protection can be enabled, which ensures that the database cannot be deleted while the option is enabled.
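The same 'Additional configuration' choices can be expressed as keyword arguments to boto3's `create_db_instance` or `modify_db_instance`. The sketch below is one way to assemble them, assuming the 7-day Performance Insights retention described above. The helper name `additional_configuration` is made up for this example, and the check only validates the `ddd:hh24:mi-ddd:hh24:mi` (UTC) shape of the window, not its length.

```python
import re

_DAY = r"(?:mon|tue|wed|thu|fri|sat|sun)"
# Format check only, e.g. "sun:05:00-sun:06:00"; times are in UTC.
_WINDOW = re.compile(rf"^{_DAY}:[0-2]\d:[0-5]\d-{_DAY}:[0-2]\d:[0-5]\d$")


def additional_configuration(maintenance_window: str,
                             deletion_protection: bool = True,
                             insights_retention_days: int = 7) -> dict:
    """Build monitoring, maintenance and protection arguments for an RDS instance."""
    if not _WINDOW.match(maintenance_window):
        raise ValueError("window must look like 'sun:05:00-sun:06:00'")
    return {
        "EnablePerformanceInsights": True,
        "PerformanceInsightsRetentionPeriod": insights_retention_days,
        "PreferredMaintenanceWindow": maintenance_window,
        "DeletionProtection": deletion_protection,
    }


if __name__ == "__main__":
    print(additional_configuration("sun:05:00-sun:06:00"))
```

Keeping deletion protection on by default in a helper like this means a script has to opt out explicitly before a database can be removed.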
The monthly cost estimator uses a baked-in formula, so actual costs may be much lower or much higher depending on how you use your RDS instances.
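To sanity-check the console's figure, a rough back-of-the-envelope estimate can be done by hand. The sketch below is not AWS's formula: it assumes cost is simply instance hours plus provisioned storage, ignores backups, I/O, and data transfer, and uses placeholder rates that are not real AWS prices. Substitute the rates published for your region and instance class.

```python
HOURS_PER_MONTH = 730  # common approximation: 365 days * 24 hours / 12 months


def estimate_monthly_cost(instance_hourly_usd: float,
                          storage_gib: float,
                          storage_gib_month_usd: float,
                          hours: float = HOURS_PER_MONTH) -> float:
    """Rough monthly estimate: compute hours plus provisioned storage only."""
    compute = instance_hourly_usd * hours
    storage = storage_gib * storage_gib_month_usd
    return round(compute + storage, 2)


if __name__ == "__main__":
    # Placeholder rates for illustration only, not actual AWS pricing.
    print(estimate_monthly_cost(instance_hourly_usd=0.02,
                                storage_gib=20,
                                storage_gib_month_usd=0.10))
```

Stopping the instance outside working hours lowers the `hours` term, which is one reason real bills can differ so much from the console estimate.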
Managing Your Database
Congratulations, your RDS instance is turned on!
Now you'll want to use a CLI (see below), or a database management tool (platform-specific like SQL Server Management Studio or PgAdmin, or generic like DBeaver). This will allow you as a user to tune some aspects of the database as if it were installed on-premises. That said, some management features are disabled when using RDS rather than a physical install.
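Most of these tools accept a PostgreSQL connection URI. The sketch below assembles one with Python's standard library, percent-encoding the credentials so that special characters in a generated password do not break the URI. The helper name `postgres_uri` and the endpoint shown are hypothetical; copy the real endpoint from your instance's details page. Requiring SSL is an assumption made here as a sensible default, not something RDS enforces in every configuration.

```python
from urllib.parse import quote


def postgres_uri(user: str, password: str, host: str,
                 dbname: str = "postgres", port: int = 5432,
                 sslmode: str = "require") -> str:
    """Build a libpq-style connection URI with percent-encoded credentials."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}?sslmode={sslmode}"
    )


if __name__ == "__main__":
    # Hypothetical endpoint and credentials for illustration.
    print(postgres_uri("postgres", "p@ss/word",
                       "mydb.example.us-east-1.rds.amazonaws.com"))
```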
In certain instances there are EC2 templates for installs of databases. These are not considered RDS and require much more management, but are options to consider.
Besides the CLIs that are baked into database distributions, the AWS CLI allows for many changes and updates to your RDS instances from a terminal, and makes scripting a breeze. After you instantiate the RDS instance with the console, you can change your settings there as well.
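As an illustration of that kind of scripting, the sketch below assembles an `aws rds modify-db-instance` command as an argument list that could be handed to `subprocess.run`. It only builds and prints the command; nothing is executed. The helper name `modify_instance_command` and the instance identifier `demo-db` are hypothetical, and the example assumes the AWS CLI is installed and configured.

```python
import shlex
from typing import List, Optional


def modify_instance_command(identifier: str,
                            allocated_gib: Optional[int] = None,
                            deletion_protection: Optional[bool] = None,
                            apply_immediately: bool = False) -> List[str]:
    """Build the argv for an `aws rds modify-db-instance` call."""
    argv = ["aws", "rds", "modify-db-instance",
            "--db-instance-identifier", identifier]
    if allocated_gib is not None:
        argv += ["--allocated-storage", str(allocated_gib)]
    if deletion_protection is not None:
        argv.append("--deletion-protection" if deletion_protection
                    else "--no-deletion-protection")
    # Without --apply-immediately, changes wait for the next maintenance window.
    argv.append("--apply-immediately" if apply_immediately
                else "--no-apply-immediately")
    return argv


if __name__ == "__main__":
    cmd = modify_instance_command("demo-db", allocated_gib=50,
                                  deletion_protection=True)
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when identifiers or values contain unusual characters.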
This stuff is complex. But that's why Zuar was founded. We work with organizations of all sizes to help them with their database needs, and get them set up with data strategies that utilize up-to-date yet proven technologies.
- Get the most out of your data without hiring an entire team to make it happen. Learn about Zuar's data services for migration, integrations, pipelines, infrastructure, and models.
- Pulling data into a single destination and normalizing that data, whether in the cloud or on-premises, can be difficult for any organization. Zuar's Mitto solution provides comprehensive ETL and automated pipeline functionality without the learning curve and cost of many other solutions. You can learn more here.