There are several reasons why a user might consider migrating from Amazon DocumentDB to MongoDB Atlas.
- Features: Amazon DocumentDB has a limited set of features compared to MongoDB Atlas, which provides advanced features such as global clusters, multi-cloud support, and real-time analytics.
- Scalability: Amazon DocumentDB has limited scalability options and might struggle to handle large amounts of data. MongoDB Atlas provides a fully managed and highly scalable MongoDB service.
- Performance: Amazon DocumentDB can have performance issues when dealing with complex and resource-intensive queries. MongoDB Atlas is built on top of the latest version of MongoDB, which provides improved performance.
- Cost savings: Amazon DocumentDB can be more expensive than MongoDB Atlas, which offers a cost-effective, pay-as-you-go pricing model that eliminates the need for upfront investments in hardware and infrastructure.
- Ease of use and flexibility: Amazon DocumentDB has limited customisation options and might not offer the same level of flexibility as MongoDB Atlas, which offers a user-friendly web interface and an API for managing your database, making it easier for developers to get started and scale their applications.
- Support: MongoDB Atlas provides 24/7 support from experienced MongoDB experts, ensuring that you have access to the resources you need to succeed.
These are some of the limitations that Amazon DocumentDB users might encounter, and that might lead them to consider migrating to MongoDB Atlas.
However, the specific limitations depend on each use case, and the decision to migrate ultimately depends on the needs and requirements of your organization.
Architecture
In this article, we will see how to migrate data from Amazon DocumentDB to an Atlas cluster using AWS DMS.
Later, when the migration is complete and the data is in sync, we can simply switch the application's connection string from Amazon DocumentDB to the Atlas cluster.
First, we will get a brief understanding of some of the key components involved in the migration, and then we will go over the migration approach.
MongoDB Atlas
MongoDB Atlas is a fully managed cloud database service provided by MongoDB Inc. It allows you to run, manage, and scale MongoDB databases in the cloud. Atlas supports various cloud providers, including AWS, Azure, and GCP. With MongoDB Atlas, you don’t have to worry about the underlying infrastructure, security, or maintenance, leaving you free to focus on your applications and services. It offers features such as auto-scaling, real-time backups, and easy-to-use management tools, making it a popular choice for modern applications.
Amazon DocumentDB
Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, and fully managed document database service provided by Amazon Web Services (AWS). It is designed to be compatible with MongoDB, a popular NoSQL database, which allows developers to use MongoDB APIs and tools to interact with DocumentDB.
AWS DMS
AWS Database Migration Service (AWS DMS) is a managed service provided by Amazon Web Services (AWS) for migrating data. It allows organizations to move data from a variety of sources, including relational databases, NoSQL databases, data warehouses, and cloud-based storage services, to target databases, data warehouses, and S3 storage.
DMS supports homogeneous migrations (e.g., Oracle to Oracle) and heterogeneous migrations (e.g., Microsoft SQL Server to Amazon Aurora).
AWS DMS handles the complexities of data migration, such as data transformation, network optimization, and error handling, and provides real-time data replication, minimal downtime, and near-zero data loss.
AWS DMS is a cost-effective solution for data migration and can help organizations move their data to the cloud quickly and easily, with minimal effort and without sacrificing performance or security.
Endpoints
AWS DMS endpoints describe the source and target databases or data stores that you want to migrate data from and to.
DMS supports a variety of database engines and data stores as source and target endpoints, including relational databases like Amazon RDS, Microsoft SQL Server, and Oracle, as well as non-relational databases like Amazon DynamoDB and MongoDB.
When setting up a DMS migration, you must specify the source and target endpoints, which can either be on-premises or in the cloud.
By using AWS DMS endpoints, you can simplify the data migration process and focus on the important task of migrating your data to the desired target.
Replication/Migration Instance
In AWS DMS, a replication instance (also called a migration instance in this article) is a managed Amazon Elastic Compute Cloud (EC2) instance that runs the DMS replication engine. It is responsible for performing the data migration tasks, such as connecting to the source and target endpoints, reading the data from the source, and writing it to the target.
You can choose from a variety of EC2 instance types to run your migration instances, depending on the size and complexity of your data migration. Larger and more complex migrations may require more powerful migration instances with more memory and CPU resources.
AWS DMS automatically manages the migration instances, including starting, stopping, and scaling the instances as needed. This can help simplify the migration process and reduce the operational overhead of managing the instances.
Using migration instances in DMS is a cost-effective way to migrate your data, as you only pay for the EC2 instances when they are running, and can stop or terminate them when not in use. Additionally, DMS provides a scalable and flexible migration solution that can handle large and complex migrations with minimal downtime.
Replication/Migration Tasks
A migration task is a unit of work that represents a single data migration operation. It is created when you set up a data migration using AWS DMS, and it is defined by specifying the source and target endpoints, the data to be migrated, and the migration type (e.g., full load, incremental load).
DMS uses migration tasks to perform the actual data migration. During a migration, DMS reads the data from the source endpoint and writes it to the target endpoint, using the migration instance to run the replication engine.
AWS DMS provides a number of options to configure and manage migration tasks, including the ability to:
- Schedule migrations to run at specific times or intervals
- Monitor migration task status and progress
- Adjust migration task settings, such as the batch size and maximum number of errors
- Pause and resume migration tasks
- Stop and restart migration tasks
- Test migration tasks before running them in production
By using migration tasks in AWS DMS, you can simplify the data migration process and manage your migrations in a flexible and scalable manner. This can help reduce downtime, minimize the impact on your systems, and ensure that your data is accurately and securely migrated.
Prerequisites
Amazon DocumentDB
- Create an AWS account if you do not already have one.
- Create a DocumentDB cluster with the appropriate engine version and instance class, using the default VPC and subnet settings.
- Create a username and password for connecting to DocumentDB.
- Optionally, create an EC2 instance within the same VPC to access DocumentDB and load test data.
- Enable change streams on the database or collection if you need change data capture (CDC).
MongoDB Atlas (AWS)
- Create an Atlas account.
- Create an Organisation with one project.
- Create an Atlas cluster (the peering option is available only on M10 and above).
- Create a database user with access to any database.
DocumentDB to Atlas: Overview of the Approach Using AWS DMS
Now that we know the key components of DMS, let’s leverage them for a simple migration from DocumentDB to Atlas. Below are the key steps involved in the migration.
- Create a Replication Instance to be used for the migration.
- Upload the SSL Certificates associated with DocumentDB and Atlas.
- Create Source (DocumentDB) and Target (Atlas) Endpoints.
- Enable Change Streams on the Source Database (DocumentDB).
- Create the Migration Task based on your requirements leveraging the Endpoints and Replication Instances.
Let’s go over each of the steps in detail.
Replication Instance
Create a replication instance for the migration, configuring the details shown in the snapshot below.
- Make it publicly accessible and add its public IP address to the IP access list in Atlas.
- Choose the instance class and storage based on the volume of data to be migrated.
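The same choices can also be captured as request parameters instead of console clicks. The following is a minimal sketch, assuming the field names of the AWS DMS CreateReplicationInstance API; the helper name `replicationInstanceParams`, the identifier, the instance class, and the storage size are illustrative placeholders, not recommendations from this article.

```javascript
// Sketch: build request parameters for the AWS DMS CreateReplicationInstance API.
// The helper name and all example values are hypothetical placeholders.
function replicationInstanceParams({ identifier, instanceClass, storageGb }) {
  if (!Number.isInteger(storageGb) || storageGb <= 0) {
    throw new RangeError("storageGb must be a positive integer");
  }
  return {
    ReplicationInstanceIdentifier: identifier,
    ReplicationInstanceClass: instanceClass, // size this for your data volume
    AllocatedStorage: storageGb,             // storage in gigabytes
    PubliclyAccessible: true,                // lets Atlas see a public IP to allow
  };
}

const exampleInstanceParams = replicationInstanceParams({
  identifier: "docdb-to-atlas",   // assumed name
  instanceClass: "dms.t3.medium", // assumed class
  storageGb: 50,                  // assumed size
});
console.log(JSON.stringify(exampleInstanceParams, null, 2));
```

An object like this could be passed to whichever AWS SDK or CLI wrapper you use; the console flow described above achieves the same result.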
Certificates
Download the certificates for Amazon DocumentDB and Atlas from here and upload them in the Certificates section of the DMS service to leverage them during endpoint creation.
Amazon DocumentDB certificate
Atlas certificate
DocumentDB Source Endpoint
Create a source endpoint with the details of the Amazon DocumentDB instance.
- Use the DocumentDB certificate uploaded in the Certificates section.
- Optionally, create an EC2 instance within the same VPC to connect to DocumentDB.
- Once the endpoint is created, test the connectivity from the migration instance using the Connections tab within the Endpoints section.
- When migrating from DocumentDB version 3.6, explicitly specify the database name during endpoint creation to support change streams.
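As a rough illustration of the source endpoint settings listed above, the sketch below assembles them in the shape of the AWS DMS CreateEndpoint API request. The helper name `docDbSourceEndpointParams`, the endpoint identifier, the host name, and the certificate ARN are hypothetical; `verify-full` is assumed as the SSL mode because the endpoint uses an uploaded certificate.

```javascript
// Sketch: request parameters for the AWS DMS CreateEndpoint API,
// describing Amazon DocumentDB as the source. All values are placeholders.
function docDbSourceEndpointParams({ host, database, username, password, certificateArn }) {
  if (!database) {
    // The article advises naming the database explicitly (needed for 3.6 change streams).
    throw new Error("database name is required");
  }
  return {
    EndpointIdentifier: "docdb-source", // hypothetical identifier
    EndpointType: "source",
    EngineName: "docdb",
    ServerName: host,
    Port: 27017,                        // default DocumentDB port
    DatabaseName: database,
    Username: username,
    Password: password,
    SslMode: "verify-full",             // assumed, since a certificate is supplied
    CertificateArn: certificateArn,
  };
}

const exampleSourceEndpoint = docDbSourceEndpointParams({
  host: "my-docdb.cluster-example.us-east-1.docdb.amazonaws.com", // hypothetical
  database: "DB_NAME",
  username: "dms_user",                                            // hypothetical
  password: "change-me",                                           // never hard-code real secrets
  certificateArn: "arn:aws:dms:us-east-1:123456789012:cert:EXAMPLE", // hypothetical
});
console.log(exampleSourceEndpoint.EngineName, exampleSourceEndpoint.ServerName);
```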
Atlas Target Endpoint
Currently, DMS supports MongoDB only as a source endpoint, not as a target. However, it supports Amazon DocumentDB (with MongoDB compatibility) as both a source and a target.
Since Amazon DocumentDB is compatible with MongoDB, we can select Amazon DocumentDB as the target engine and configure the endpoint with the details of the Atlas cluster.
- Use the Atlas certificate uploaded in the Certificates section.
- Create a database user in Atlas and use it for authentication.
- For the server name, provide the host name of the primary member of the Atlas replica set.
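One way to find the primary's host name is to connect to the Atlas cluster with the MongoDB shell and inspect the result of `db.hello()`, whose `primary` field has the form `host:port`. The sketch below, with a hypothetical helper `primaryServerName` and a made-up host name, splits that value into the server name and port expected by the endpoint form.

```javascript
// Sketch: extract the server name and port of the replica set primary
// from a document shaped like the output of db.hello().
function primaryServerName(helloResponse) {
  const primary = helloResponse.primary;
  if (typeof primary !== "string" || primary.length === 0) {
    throw new Error("hello response does not report a primary");
  }
  const separator = primary.lastIndexOf(":");
  if (separator === -1) {
    return { host: primary, port: 27017 }; // fall back to the default port
  }
  return {
    host: primary.slice(0, separator),
    port: Number(primary.slice(separator + 1)),
  };
}

// Hypothetical sample; a real response contains many more fields.
const sampleHello = { primary: "cluster0-shard-00-01.example.mongodb.net:27017" };
const atlasPrimary = primaryServerName(sampleHello);
console.log(atlasPrimary.host, atlasPrimary.port);
```

Keep in mind that the primary can change after an election, so an endpoint pinned to one member may need to be updated if that happens.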
Enable Change Capture
To capture ongoing changes from Amazon DocumentDB, enable change streams, the change data capture (CDC) feature that tracks changes made to your Amazon DocumentDB cluster.
Log in to the EC2 instance, connect to the DocumentDB cluster, and enable CDC by running the command below, replacing DB_NAME with the name of your database.
db.adminCommand({
  modifyChangeStreams: 1,
  database: "DB_NAME",
  collection: "",
  enable: true
});
Note that Amazon DocumentDB CDC only captures changes made to the primary instance of the cluster and does not capture changes made to read replicas.
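The command above enables change streams for every collection in one database, because `collection` is left empty. As a small sketch of the other scopes, the hypothetical helper `modifyChangeStreamsCommand` below builds the same command document for a single collection, a whole database, or (with both names empty) the whole cluster; the database and collection names are placeholders.

```javascript
// Sketch: build the document passed to db.adminCommand(...) for different scopes.
// An empty string means "all": all collections in a database, or all databases.
function modifyChangeStreamsCommand({ database = "", collection = "", enable = true } = {}) {
  if (database === "" && collection !== "") {
    throw new Error("a collection needs a database name");
  }
  return { modifyChangeStreams: 1, database, collection, enable };
}

// One collection only (placeholder names).
const singleCollection = modifyChangeStreamsCommand({ database: "DB_NAME", collection: "orders" });
// Every collection in one database, as in the command shown above.
const wholeDatabase = modifyChangeStreamsCommand({ database: "DB_NAME" });
// Every database in the cluster.
const wholeCluster = modifyChangeStreamsCommand();

console.log(JSON.stringify([singleCollection, wholeDatabase, wholeCluster], null, 2));
```

In the shell, each of these documents would be passed to `db.adminCommand(...)` exactly as in the original command.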
Migration Task
Create a data migration task of type full load plus CDC (migrate existing data and replicate ongoing changes), using the source and target endpoints and the migration instance created in the previous steps.
- Choose the appropriate migration type from the choices: Migrate existing data, Migrate existing data and replicate ongoing changes, and Replicate data changes only.
- In the task settings, enable CloudWatch logs if you need detailed logging about the migration.
- Choose the truncate option for target table preparation to clear any existing data in the target collections and start fresh.
- In the table mapping section, you can migrate all collections at once or go collection by collection. Configure this as needed.
- DMS provides a number of pre-defined schema mappings for common source and target databases. You can also create custom schema mappings for more complex migrations.
- Choose any initial assessment to be run before the migration, if required.
- You can create multiple migration tasks as required.
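The console choices above correspond to values in the AWS DMS CreateReplicationTask API. The sketch below maps the three migration type labels to their API values and builds a small task settings document that turns on logging and truncates target collections before the load. The helper name `taskOptions` is hypothetical, and only a small subset of the available task settings is shown.

```javascript
// Sketch: translate console choices into CreateReplicationTask values.
const MIGRATION_TYPES = new Map([
  ["Migrate existing data", "full-load"],
  ["Migrate existing data and replicate ongoing changes", "full-load-and-cdc"],
  ["Replicate data changes only", "cdc"],
]);

function taskOptions(consoleChoice, { truncateTarget = true, enableLogging = true } = {}) {
  const migrationType = MIGRATION_TYPES.get(consoleChoice);
  if (migrationType === undefined) {
    throw new Error(`Unknown migration type: ${consoleChoice}`);
  }
  const settings = {
    FullLoadSettings: {
      // TRUNCATE_BEFORE_LOAD empties existing target collections; DO_NOTHING leaves them.
      TargetTablePrepMode: truncateTarget ? "TRUNCATE_BEFORE_LOAD" : "DO_NOTHING",
    },
    Logging: { EnableLogging: enableLogging }, // CloudWatch logging
  };
  return {
    MigrationType: migrationType,
    ReplicationTaskSettings: JSON.stringify(settings), // the API expects a JSON string
  };
}

const exampleTaskOptions = taskOptions("Migrate existing data and replicate ongoing changes");
console.log(exampleTaskOptions.MigrationType);
console.log(exampleTaskOptions.ReplicationTaskSettings);
```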
Schema Mapping is the process of defining the relationships between the source and target databases, including the mapping of tables, columns, and data types. The schema mapping is used by DMS to convert the data from the source database to the format required by the target database during the migration process.
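The choice between migrating all collections at once and going collection by collection is expressed in DMS as table-mapping selection rules, where the schema name is the database and the table name is the collection. The sketch below generates such rules; the helper name `selectionRules` and the database and collection names are placeholders.

```javascript
// Sketch: build DMS table-mapping selection rules for a document database.
// "%" is the DMS wildcard, so the default includes every collection.
function selectionRules(database, collections = ["%"]) {
  return {
    rules: collections.map((collection, index) => ({
      "rule-type": "selection",
      "rule-id": String(index + 1),
      "rule-name": String(index + 1),
      "object-locator": {
        "schema-name": database,   // database name
        "table-name": collection,  // collection name or wildcard
      },
      "rule-action": "include",
    })),
  };
}

// All collections in the database at once.
const allCollections = selectionRules("DB_NAME");
// Only the named collections (placeholder names).
const someCollections = selectionRules("DB_NAME", ["orders", "customers"]);

console.log(JSON.stringify(allCollections, null, 2));
console.log(someCollections.rules.length);
```

The resulting JSON can be pasted into the JSON editor of the table mapping section or passed as the table mappings of a task.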
Monitoring Migration Task
Once the migration task has started, it can be monitored at the individual collection level, both for the initial snapshot and for ongoing changes.
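Outside the console, the same per-collection view is available through the DMS DescribeTableStatistics API. The sketch below condenses entries shaped like that API's `TableStatistics` list into one line per collection; the helper name `summarizeTableStatistics` and the sample numbers are invented for illustration.

```javascript
// Sketch: summarize per-collection progress from entries shaped like the
// TableStatistics list returned by the DMS DescribeTableStatistics API.
function summarizeTableStatistics(tableStatistics) {
  return tableStatistics.map((entry) => ({
    collection: `${entry.SchemaName}.${entry.TableName}`,
    state: entry.TableState,
    fullLoadRows: entry.FullLoadRows ?? 0, // documents copied by the initial load
    cdcChanges: (entry.Inserts ?? 0) + (entry.Updates ?? 0) + (entry.Deletes ?? 0),
  }));
}

// Invented sample data, not real task output.
const sampleStatistics = [
  { SchemaName: "DB_NAME", TableName: "orders", TableState: "Table completed",
    FullLoadRows: 1000, Inserts: 5, Updates: 2, Deletes: 1 },
  { SchemaName: "DB_NAME", TableName: "customers", TableState: "Table completed",
    FullLoadRows: 250 },
];

const progress = summarizeTableStatistics(sampleStatistics);
for (const row of progress) {
  console.log(`${row.collection}: ${row.state}, loaded=${row.fullLoadRows}, cdc=${row.cdcChanges}`);
}
```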
Conclusion
And finally, below are some important points to consider when migrating data from Amazon DocumentDB to MongoDB Atlas:
- Compatibility: Ensure that the data in your Amazon DocumentDB cluster is compatible with MongoDB Atlas. This may require data transformation or schema changes. Leverage the schema mapping feature offered by DMS for complex migrations.
- Data Volume: Evaluate the volume of data you need to migrate and choose an appropriate migration strategy.
- Network: Make sure you have a reliable and secure network connection between your Amazon DocumentDB cluster and MongoDB Atlas.
- Security: Ensure that the migration process is secure and that sensitive data is properly protected in transit and at rest, for example by using VPC peering to keep migration traffic on private networks.
- Performance: Consider the impact on performance during the migration process and ensure that the migration does not cause downtime for your applications.
- Monitoring: Monitor the migration process closely and troubleshoot any issues as they arise.
- Testing: Test the migrated data thoroughly in MongoDB Atlas to ensure that all data has been successfully migrated and that all applications are functioning as expected.
- Rollback Plan: Have a plan in place to roll back to Amazon DocumentDB in case of any issues during the migration process.
- Ongoing Maintenance: Consider the ongoing maintenance requirements for your MongoDB Atlas cluster, including backup, scaling, and monitoring.
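For the testing point above, a simple first check is to compare per-collection document counts between the source and the target. The sketch below assumes you have already collected the counts on each side (for example with `countDocuments()` in the shell); the helper name `findCountMismatches` and the sample numbers are hypothetical. Matching counts do not prove that the documents are identical, so treat this as a quick sanity check rather than full validation.

```javascript
// Sketch: report collections whose document counts differ between
// the source (DocumentDB) and the target (Atlas).
function findCountMismatches(sourceCounts, targetCounts) {
  const names = new Set([...Object.keys(sourceCounts), ...Object.keys(targetCounts)]);
  const mismatches = [];
  for (const name of [...names].sort()) {
    const source = sourceCounts[name] ?? null; // null means the collection is missing
    const target = targetCounts[name] ?? null;
    if (source !== target) {
      mismatches.push({ collection: name, source, target });
    }
  }
  return mismatches;
}

// Invented sample counts.
const sourceCounts = { orders: 1000, customers: 250, audit: 40 };
const targetCounts = { orders: 1000, customers: 249 };

const mismatches = findCountMismatches(sourceCounts, targetCounts);
console.log(mismatches);
```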
This article walked through a simple migration of data from Amazon DocumentDB to an Atlas cluster hosted in AWS using AWS Database Migration Service (AWS DMS).
Happy Migrations!!!