Amazon provides the AWS Database Migration Service (DMS) to migrate data between different database platforms, provided that either the source or the target is in AWS. It supports a range of database engines, and is cost effective compared to many of the licensed migration tools on the market. Amazon declares that DMS can migrate your database in “just a few clicks”. But is it ever that simple in practice? The DMS user guide provides plenty of information on its limitations, but you may find yourself jumping between pages before finally discovering a showstopper. Do the limitations of the tool present too many barriers to its ease of use? And, importantly, can we rely on it as an enterprise migration tool? This article discusses some of the considerations to make before using it as a migration method in your organisation, with a particular focus on Oracle as a source data store.
How it Works
A high-level overview of AWS DMS is as follows:
- Create a replication instance.
- Create source and target endpoints. These store connection information about your databases.
- Create migration task(s) to migrate the data. A task has three major phases:
- Full data load.
- Apply cached changes made to source during full load.
- Ongoing replication (Change Data Capture) until point of application switchover.
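The three phases above map onto the task's migration type. As a rough sketch, and assuming placeholder ARNs and a hypothetical task name, the parameters below are in the shape accepted by boto3's `dms.create_replication_task()` call; "full-load-and-cdc" covers all three phases in one task.

```python
# Illustrative parameters for a DMS migration task. The ARNs and the task
# identifier are placeholders, not real resources; with real values this
# dict could be passed to boto3's dms.create_replication_task().
task_params = {
    "ReplicationTaskIdentifier": "orders-migration",       # hypothetical name
    "SourceEndpointArn": "arn:aws:dms:region:acct:endpoint:SRC",   # placeholder
    "TargetEndpointArn": "arn:aws:dms:region:acct:endpoint:TGT",   # placeholder
    "ReplicationInstanceArn": "arn:aws:dms:region:acct:rep:RI",    # placeholder
    # "full-load-and-cdc" performs the full load, applies the cached
    # changes, then continues ongoing replication until the task is stopped.
    # "full-load" and "cdc" are the other valid values.
    "MigrationType": "full-load-and-cdc",
    # Table mappings select which schemas/tables to migrate (schema "APP"
    # here is an assumption for illustration).
    "TableMappings": (
        '{"rules": [{"rule-type": "selection", "rule-id": "1", '
        '"rule-name": "1", "object-locator": {"schema-name": "APP", '
        '"table-name": "%"}, "rule-action": "include"}]}'
    ),
}
```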
Considerations & Limitations
The replication instance is an Amazon EC2 instance that hosts one or more migration tasks. The replication instance should be sized appropriately for the number of concurrent migration tasks you plan to run, in terms of CPU, memory and storage required for the Change Data Capture process. When performing production migrations, it may be prudent to use separate replication instances to avoid resource contention, particularly if you are unsure of the resource requirements. Choosing the Multi-AZ option also provides High Availability via a synchronous standby replica of the replication instance.
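To make the sizing decision concrete, here is a minimal sketch of the parameters one might pass to boto3's `dms.create_replication_instance()`. The identifier, instance class, and storage size are assumptions to adapt to your concurrent task load; `MultiAZ` enables the synchronous standby replica described above.

```python
# Hedged sketch of a Multi-AZ replication instance definition. The
# identifier, instance class, and storage figure are illustrative only.
instance_params = {
    "ReplicationInstanceIdentifier": "prod-migration-ri",  # hypothetical name
    "ReplicationInstanceClass": "dms.c5.xlarge",           # assumed sizing
    "AllocatedStorage": 100,  # GB, used for CDC caching and task logs
    "MultiAZ": True,          # synchronous standby replica for HA
}
```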
Source and Target Endpoints
AWS DMS supports a number of database engines as source and target endpoints. They can be on-premises, hosted on Amazon EC2, or an Amazon RDS service. Amazon S3 buckets can also be used as a data source for AWS DMS.
Supported data stores for Source Endpoints:
Supported data stores for Target Endpoints:
Different considerations and limitations exist depending on the type of data store. We will discuss some of these later.
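An endpoint is essentially stored connection information. As an illustration, the following is the shape of an Oracle source endpoint definition as accepted by boto3's `dms.create_endpoint()`; the host, service name, and credentials are placeholders.

```python
# Illustrative Oracle source endpoint definition. Server, database and
# credentials are placeholders; a target endpoint is defined the same way
# with EndpointType "target" and the relevant engine name.
endpoint_params = {
    "EndpointIdentifier": "oracle-source",  # hypothetical name
    "EndpointType": "source",
    "EngineName": "oracle",
    "ServerName": "db.example.com",         # placeholder host
    "Port": 1521,
    "DatabaseName": "ORCL",                 # placeholder service name
    "Username": "dms_user",                 # placeholder migration user
    "Password": "********",                 # placeholder
}
```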
There is also a limit to the size of a database that AWS DMS can migrate, based on resource quotas per AWS Region per AWS account. This should be considered before choosing AWS DMS as a migration tool.
The migration task migrates data between the source and target data stores. It only creates objects in the target that are considered necessary to migrate the data – tables, primary keys, and some unique indexes. Other indexes, constraints and code objects must be handled separately.
For homogeneous migrations, it is recommended to use the database software’s native tools to migrate the metadata, before using AWS DMS to migrate the data.
For heterogeneous migrations, the AWS Schema Conversion Tool can be used for the schema metadata. AWS SCT can automatically convert most of the schema metadata to the target database format but, since it has its own limitations, it highlights all code that requires manual conversion. It can be a useful tool for identifying which elements of an application schema do not translate from one database platform to another and require redesign.
When AWS DMS performs the full data load phase, it does not do so in any particular order, so child table data could be loaded before parent table data. This means that any referential integrity constraints between tables, enforced by foreign keys or triggers, must be disabled to prevent violations and adverse effects on the data load. Of course, disabling foreign key constraints and triggers in the live source database would likely be unacceptable. The foreign key constraints and triggers should instead be disabled on the target side after the schema creation, and prior to the data load with DMS. They should then be re-enabled after the data migration is complete, or optionally after the start of the Change Data Capture (CDC) phase. The latter causes a higher load on the target database during the CDC phase, as constraints are validated against the migrated data stream, but has the benefit of a reduced application switchover time.
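A minimal sketch of generating that disable/enable DDL for the target is shown below. The schema, table, and constraint names are made up for illustration; in practice the list would be driven from the Oracle data dictionary (e.g. `DBA_CONSTRAINTS` where `CONSTRAINT_TYPE = 'R'`).

```python
# Generate ALTER TABLE statements to disable foreign key constraints on
# the target before the DMS load, and to re-enable them afterwards.
# The constraint list here is a hypothetical example.
fks = [
    ("APP", "ORDERS", "FK_ORDERS_CUSTOMER"),  # (schema, table, constraint)
]

def disable_fk_ddl(constraints):
    return [f"ALTER TABLE {s}.{t} DISABLE CONSTRAINT {c}"
            for s, t, c in constraints]

def enable_fk_ddl(constraints):
    return [f"ALTER TABLE {s}.{t} ENABLE CONSTRAINT {c}"
            for s, t, c in constraints]

print(disable_fk_ddl(fks)[0])
# → ALTER TABLE APP.ORDERS DISABLE CONSTRAINT FK_ORDERS_CUSTOMER
```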
AWS DMS begins applying cached changes for a table as soon as the full load for that table is complete. Therefore, only after DMS has applied all cached changes for all tables, and switched to the ongoing replication phase, will the database be transactionally consistent. This is important to understand in the context of any migration progress checks on the target.
Considerations when using an Oracle Database as a Source data store
There are a significant number of limitations of AWS DMS in supporting an Oracle Database as a source data store. Some of these are very likely to arise in real-world databases and must therefore be carefully considered before choosing AWS DMS as a migration tool.
- Changes resulting from partition or subpartition operations (ADD, DROP, EXCHANGE, TRUNCATE) are not replicated.
- Any data in LONG or LONG RAW columns longer than 64KB is truncated to 64KB.
- Virtual columns are not supported.
- Global temporary tables are not loaded or captured.
- The ROWID datatype, and materialized views based on a ROWID column are not supported.
- BFILE, REF, UROWID, ANYDATA and user-defined data types cannot be replicated, resulting in a null column.
- For Oracle nested tables, only one level of table nesting is supported.
Refer to the following page for more:
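A practical pre-migration check is to scan the source data dictionary for columns with problematic types. The sketch below builds such a query; the type list reflects the limitations above (some types are not replicated at all, LONG/LONG RAW are truncated at 64KB), and the schema filter is an assumption to adapt.

```python
# Build an Oracle data-dictionary query to flag columns whose data types
# DMS either cannot replicate or will truncate. The type list and schema
# filter are illustrative; extend them for your environment.
FLAGGED_TYPES = ("BFILE", "REF", "UROWID", "ANYDATA",
                 "ROWID", "LONG", "LONG RAW")

def flagged_columns_sql(schema):
    types = ", ".join(f"'{t}'" for t in FLAGGED_TYPES)
    return (
        "SELECT table_name, column_name, data_type "
        "FROM dba_tab_columns "
        f"WHERE owner = '{schema}' AND data_type IN ({types})"
    )

print(flagged_columns_sql("APP"))
```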
The Change Data Capture phase of the migration task can be performed by either Oracle LogMiner or AWS DMS Binary Reader. The default is to use LogMiner, but there are again various situations under which each method may be unsuitable or unsupported.
- In order to use a pluggable database as a source data store, the Binary Reader option must be chosen for the Change Data Capture.
- Binary Reader does not support all Oracle compression and encryption options. Specifically, HCC Query Low compression is not supported, and TDE is only supported for self-managed databases (on-premise or Amazon EC2 hosted, not Amazon RDS).
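Switching from LogMiner to Binary Reader is done through extra connection attributes on the Oracle source endpoint. The pair below is the documented toggle; further attributes (for example for ASM-stored redo logs) may be needed depending on the storage configuration.

```python
# Extra connection attributes that tell a DMS Oracle source endpoint to
# use Binary Reader instead of LogMiner for Change Data Capture. These
# would be set in the endpoint's ExtraConnectionAttributes field.
extra_connection_attributes = "useLogminerReader=N;useBfile=Y"

# Parse into a dict for easier inspection.
attrs = dict(kv.split("=") for kv in extra_connection_attributes.split(";"))
```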
There are further considerations and prerequisites. In order to use the Binary Reader CDC method, additional privileges must be granted to the migration user specified in the source endpoint. If the redo logs are stored on non-ASM storage, the suggestion from AWS is to grant the CREATE ANY DIRECTORY system privilege, which will allow DMS to create directory objects to read from the redo logs. Granting of the CREATE ANY DIRECTORY system privilege is likely prohibited in many production databases, due to the security hole it exposes. The alternative is to manually create directories for all redo log locations and grant read access on the individual directory objects. The migration user in the source Oracle database is also required to have SELECT privileges on every table that will be replicated. Migrations often involve the whole database or whole schemas and granting SELECT ANY TABLE could again raise security concerns. The SELECT privileges would therefore need to be granted individually to each table.
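Since the broad SELECT ANY TABLE and CREATE ANY DIRECTORY privileges are often prohibited, the per-object grants have to be scripted. A minimal sketch, with illustrative table, directory, and user names:

```python
# Generate per-object grants for the DMS migration user, avoiding the
# broad SELECT ANY TABLE / CREATE ANY DIRECTORY privileges. The table
# list, directory names and paths are hypothetical examples.
def table_grants(user, tables):
    return [f"GRANT SELECT ON {t} TO {user}" for t in tables]

def redo_directory_ddl(user, directories):
    stmts = []
    for name, path in directories:
        stmts.append(f"CREATE DIRECTORY {name} AS '{path}'")
        stmts.append(f"GRANT READ ON DIRECTORY {name} TO {user}")
    return stmts

grants = table_grants("DMS_USER", ["APP.ORDERS", "APP.CUSTOMERS"])
dirs = redo_directory_ddl("DMS_USER", [("REDO_DIR1", "/u01/oradata/redo")])
print(grants[0])
# → GRANT SELECT ON APP.ORDERS TO DMS_USER
```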
When creating the migration task, there is an option relevant to Oracle source databases – “Include LOB columns in replication”. The options are to exclude LOB columns entirely from the migration, to include them, or to truncate them to a maximum size. It is difficult to imagine a real world use case for truncating LOBs to an arbitrary size, but the option is there and is recommended by AWS to be used whenever possible, offering “significant performance gains”. In Full LOB mode, LOBs are migrated one at a time, chunk by chunk. If the LOBs are smaller than the maximum VARCHAR size (64K), then DMS can treat them as VARCHAR and fetch them in bulk. Hence AWS recommends a maximum LOB size of 64K when using DMS.
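In task settings JSON, LOB handling lives under `TargetMetadata`. The fragment below sketches Limited LOB mode with the 64K cap discussed above; the exact values are assumptions to adapt per workload.

```python
import json

# Illustrative TargetMetadata task settings for LOB handling. Limited
# LOB mode with LobMaxSize of 64 (KB) follows the recommendation above;
# LOBs larger than the cap are truncated. LobChunkSize applies only in
# Full LOB mode.
task_settings = {
    "TargetMetadata": {
        "SupportLobs": True,
        "FullLobMode": False,
        "LimitedSizeLobMode": True,
        "LobMaxSize": 64,
        "LobChunkSize": 0,
    }
}
settings_json = json.dumps(task_settings)
```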
The declaration that the AWS Database Migration Service can migrate your database in “just a few clicks” may be true for a very simple data set. But in the real world, there are many considerations before you can reap the promised benefits of its simplicity. Planning is crucial to the success of any database migration. When using Cloud services such as AWS DMS, whereby there is a further layer of abstraction between the administrators and the databases, understanding the potential pitfalls becomes even more important.