    Overview

    About this document

    This document provides information you can use to configure and manage replication on your Unity storage system. Along with relevant concepts and instructions for configuring replication using the Unisphere GUI, this document also includes information on the CLI commands associated with configuring replication.

    For more information on other Unisphere features or CLI commands, refer to the Unisphere online help and CLI User Guide.

    Additional resources

    As part of an improvement effort, revisions of the software and hardware are periodically released. Therefore, some functions described in this document might not be supported by all versions of the software or hardware currently in use. The product release notes provide the most up-to-date information on product features. Contact your technical support professional if a product does not function properly or does not function as described in this document.

    Where to get help

    Support, product, and licensing information can be obtained as follows:

    For product and feature documentation or release notes, go to Unity Technical Documentation at: www.emc.com/en-us/documentation/unity-family.htm.

    For information about products, software updates, licensing, and service, go to Online Support (registration required) at: https://Support.EMC.com. After logging in, locate the appropriate Support by Product page.

    For technical support and service requests, go to Online Support at: https://Support.EMC.com. After logging in, locate Create a service request. To open a service request, you must have a valid support agreement. Contact your Sales Representative for details about obtaining a valid support agreement or to answer any questions about your account.

    Special notice conventions used in this document

    DANGER: Indicates a hazardous situation which, if not avoided, will result in death or serious injury.
    WARNING: Indicates a hazardous situation which, if not avoided, could result in death or serious injury.
    CAUTION: Indicates a hazardous situation which, if not avoided, could result in minor or moderate injury.
    NOTICE: Addresses practices not related to personal injury.
    NOTE: Presents information that is important, but not hazard-related.

    About replication

    Data replication is one of the many data protection methodologies that enable your data center to avoid disruptions in business operations. It is a process in which storage data is duplicated to a remote or local system. It provides an enhanced level of redundancy in case the main storage backup system fails. It minimizes the downtime-associated costs of a system failure and simplifies the recovery process from a natural disaster or human error. The replication feature leverages the Unified Snapshots technology to produce a read-only, point-in-time copy of source storage data and periodically updates the copy to keep it consistent with the source data. It leverages crash consistent replicas to provide remote data protection of storage resources.

    The system supports asynchronous replication of all storage resources and NAS servers. For block-based storage resources (LUNs, consistency groups, VMware VMFS datastores, and thin clones), synchronous replication is also supported.

    When replicating from a Unity system running OE version 4.1.x or later to a system running OE version 4.0.x, you cannot have new 4.1.x or later features enabled on the source system.

    Replication modes

    Replication can operate in the following modes:

    • Asynchronous – (Applies to block and file storage.) Use this mode when you want the data between the source and destination storage resources synchronized automatically at a specific interval, based on the Recovery Point Objective (RPO).
    • Synchronous – (Applies to block storage only.) Use this mode when you want the data between the source and destination storage resources to always remain in sync.
    • Manual – (Applies to block and file storage.) Use this mode when you want to manually synchronize changes in the source storage resource to the destination storage resource. When you choose this mode, ensure that you periodically synchronize the session to avoid excessive pool space consumption.
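    The mode list above can be summarized as a small lookup. This is a minimal sketch for illustration only; the names `ReplicationMode`, `SUPPORTED_STORAGE`, and `modes_for` are hypothetical and are not part of any Unisphere interface.

```python
from enum import Enum
from typing import List


class ReplicationMode(Enum):
    ASYNCHRONOUS = "asynchronous"
    SYNCHRONOUS = "synchronous"
    MANUAL = "manual"


# Storage categories each mode applies to, as stated in the list above.
SUPPORTED_STORAGE = {
    ReplicationMode.ASYNCHRONOUS: {"block", "file"},
    ReplicationMode.SYNCHRONOUS: {"block"},
    ReplicationMode.MANUAL: {"block", "file"},
}


def modes_for(storage_category: str) -> List[ReplicationMode]:
    """Return the replication modes that apply to 'block' or 'file' storage."""
    if storage_category not in ("block", "file"):
        raise ValueError(f"unknown storage category: {storage_category!r}")
    return [m for m in ReplicationMode if storage_category in SUPPORTED_STORAGE[m]]
```

    For example, `modes_for("file")` leaves out the synchronous mode, because synchronous replication applies to block storage only.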
    Recovery Point Objective

    Recovery Point Objective (RPO) is an industry-accepted term that indicates the acceptable amount of data, measured in units of time, that may be lost in a failure. When you set up an asynchronous replication session, you can configure automatic synchronization based on the RPO. You can specify an RPO from a minimum of 5 minutes up to a maximum of 1440 minutes (24 hours). The default RPO is 60 minutes (1 hour). In the case of synchronous replication, the RPO is set to 0. You can use the Unisphere CLI or Unisphere Management REST API to specify a more granular RPO.

    Although a smaller time interval provides more protection and consumes less space, it also has a higher performance impact and results in more network traffic. A higher RPO value may result in more space consumption, which may affect the snapshot schedules and space thresholds.
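    The RPO limits described in this section can be expressed as a simple validation routine. The following is a sketch under stated assumptions: the function name `resolve_rpo_minutes` and the constant names are hypothetical, and only the numeric limits (5 to 1440 minutes, default 60, and 0 for synchronous replication) come from this document.

```python
from typing import Optional

MIN_RPO_MINUTES = 5       # minimum RPO stated in this document
MAX_RPO_MINUTES = 1440    # maximum RPO (24 hours)
DEFAULT_RPO_MINUTES = 60  # default RPO (1 hour)


def resolve_rpo_minutes(mode: str, requested: Optional[int] = None) -> int:
    """Return the effective RPO in minutes for a replication session."""
    if mode == "synchronous":
        # Synchronous replication always uses an RPO of 0.
        return 0
    if mode != "asynchronous":
        raise ValueError(f"automatic RPO does not apply to mode {mode!r}")
    if requested is None:
        return DEFAULT_RPO_MINUTES
    if not MIN_RPO_MINUTES <= requested <= MAX_RPO_MINUTES:
        raise ValueError(
            f"RPO must be between {MIN_RPO_MINUTES} and "
            f"{MAX_RPO_MINUTES} minutes, got {requested}"
        )
    return requested
```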
    Source and destination storage resources

    For all replicated storage resources except for thin clones, once replication is configured, the destination storage resource is automatically created. For LUN storage provisioning, you can select whether to convert a thin LUN to a non-thin (thick) LUN, or a thick LUN to a thin LUN in an all-Flash pool. Also, you can select whether to enable compression on a thin LUN in an all-Flash pool. For file systems, replication matches the destination storage resource to the source. Therefore, thin and compression cannot be selected for file systems. Any modifications to the attributes of the source storage resource are not automatically synchronized over to the destination storage resource. When a failover occurs, ensure that you modify the attributes of the associated destination storage resource to match the attributes of the source storage resource.

    When a thin clone is replicated, the destination resource is automatically created with the same attributes as the source thin clone, except that the destination resource is a full copy, rather than a thin clone.

    Snapshots

    Asynchronous replication supports the replication of read-only user snapshots to either a local or a remote site along with the resource data. Both scheduled snapshots and user-created snapshots can be replicated. Snapshots are supported for all resources that support asynchronous replication (that is, UFS64, LUN (standalone), LUN consistency group, VMFS, and VMNFS).

    User snapshots do not apply to the NAS server resource type.

    Snapshot replication for scheduled snapshots can be enabled during session creation, or enabled or disabled at any time during the lifetime of the replication session. User snapshots can be replicated with a remote retention policy that is different from that of the source.

    To support snapshot replication, both the source and destination systems must be running Unity OE version 4.2 or later. Snapshot replication can be enabled on an existing Unity OE version 4.0.x- or OE version 4.1.x-based session after both the production and the remote systems have been upgraded to Unity OE version 4.2 or later. Only read-only snapshots are eligible for replication, and they can only be replicated to the disaster recovery site where the replication destination storage resource is located. Any snapshots that are writable, such as attached block snapshots or file snapshots with shares or exports, are not replicated.
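    The eligibility rules in the preceding paragraph (both systems at OE version 4.2 or later, read-only snapshots only) can be sketched as follows. The function names are hypothetical, and the sketch assumes version strings made up of dot-separated integers such as "4.2" or "4.1.2".

```python
from typing import Tuple

MIN_SNAPSHOT_REPLICATION_OE = (4, 2)


def parse_oe_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '4.2.1' into (4, 2, 1)."""
    return tuple(int(part) for part in version.split("."))


def snapshot_replication_supported(source_oe: str, destination_oe: str) -> bool:
    """Both systems must run OE version 4.2 or later."""
    return (
        parse_oe_version(source_oe) >= MIN_SNAPSHOT_REPLICATION_OE
        and parse_oe_version(destination_oe) >= MIN_SNAPSHOT_REPLICATION_OE
    )


def snapshot_is_eligible(read_only: bool, source_oe: str, destination_oe: str) -> bool:
    """Only read-only snapshots are replicated; writable ones are skipped."""
    return read_only and snapshot_replication_supported(source_oe, destination_oe)
```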

    Snapshots that exist prior to replication session creation can be selected for replication during replication session creation. Snapshots that are older than the last sync (RPO) time can be manually selected for replication and included in the next RPO sync.

    A user snapshot can have one of the following replication state attributes:

    • Not marked for replication (No) - snapshot is not marked for replication
    • Pending sync (Pending) - snapshot is marked for replication but is awaiting transfer
    • Replicated (Yes) - snapshot has successfully transferred to the disaster recovery resource
    • Failed to replicate (Failed) - snapshot failed to replicate
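    When tracking snapshots in a script, the four replication states above map naturally onto an enumeration. This is an illustrative sketch; `SnapshotReplicationState` and `snapshots_in_state` are hypothetical names, and the member values reuse the short labels shown in parentheses in the list.

```python
from enum import Enum
from typing import Dict, List


class SnapshotReplicationState(Enum):
    NOT_MARKED = "No"      # not marked for replication
    PENDING = "Pending"    # marked for replication, awaiting transfer
    REPLICATED = "Yes"     # transferred to the disaster recovery resource
    FAILED = "Failed"      # replication failed


def snapshots_in_state(
    states: Dict[str, SnapshotReplicationState],
    wanted: SnapshotReplicationState,
) -> List[str]:
    """Return the names of snapshots whose state equals `wanted`, sorted."""
    return sorted(name for name, state in states.items() if state is wanted)
```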
    File-based replication session actions

    On Unity systems running OE version 4.2, the following asynchronous replication actions affect both the NAS server and its associated file systems when run at the NAS server level:

    • Failover
    • Failover-with-sync
    • Failback
    • Pause
    • Resume

    Each of these actions triggers a group operation towards the NAS server replication session and its associated file system replication sessions. A NAS server replication as a group is available for local and remote asynchronous replication.

    Do not perform a group operation at both sides of a replication session at the same time. The storage system does not prohibit this action; however, a group operation performed at the same time at both sides of a replication session can cause the group replication session to enter an unhealthy state. Also, failover-with-sync is not a transparent operation. During the failover-with-sync process, read and write requests from hosts may be rejected.

    A group asynchronous replication session operation on a NAS server supports up to 500 file system replication sessions in such a way that those sessions look like one replicated unit. If group operations are conducted on a group session that contains more than 500 file system replication sessions, the group replication session may enter an unhealthy state, along with some file system replication sessions.

    Although a group asynchronous replication session looks like one operation, each file system is replicated individually. If any of the individual file system replication sessions fail, you can resolve the issue and then select the individual file system to replicate.

    Those same asynchronous replication actions towards a file system remain at the file system level. Those actions are still individual operations toward file system replication sessions.

    The following asynchronous replication actions affect only the NAS server when run at the NAS server level, and remain individual operations when run against file system replication sessions:

    • Create
    • Sync
    • Delete
    • Modify
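    The distinction between group actions and individual actions, together with the 500-session limit, can be captured in a short pre-check. This is a sketch only: the action strings, the `level` values, and the function names are assumptions made for illustration and do not reflect real CLI or REST identifiers.

```python
MAX_GROUP_FILE_SYSTEM_SESSIONS = 500

# Actions that fan out to the associated file system sessions when run
# at the NAS server level (OE version 4.2).
GROUP_ACTIONS = {"failover", "failover-with-sync", "failback", "pause", "resume"}

# Actions that affect only the session they are run against.
INDIVIDUAL_ACTIONS = {"create", "sync", "delete", "modify"}


def is_group_operation(action: str, level: str) -> bool:
    """True when the action also affects the NAS server's file systems."""
    if action not in GROUP_ACTIONS | INDIVIDUAL_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return level == "nas_server" and action in GROUP_ACTIONS


def group_size_is_safe(file_system_session_count: int) -> bool:
    """A group operation supports up to 500 file system sessions."""
    return 0 <= file_system_session_count <= MAX_GROUP_FILE_SYSTEM_SESSIONS
```

    A script could call `group_size_is_safe` before issuing a NAS server level failover and fall back to per-file-system operations when the count is too high.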

    Using replication for disaster recovery

    In a disaster recovery scenario, the primary (source) system is unavailable due to a natural or human-caused disaster. Data access is still available because a replication session was configured between the primary and destination systems, and the destination system contains a full copy, or replica, of the production data. The replica is up to date as of the last time the destination synchronized with the source, as specified by the automatic synchronization recovery point objective (RPO) setting. By issuing a session failover on the destination system, you make the destination system the new production system, using the replica of the primary system’s data that resides on the destination system. Using replicas for disaster recovery minimizes potential data loss. The amount of potential data loss is affected by the RPO that is configured when setting up the replication session. In a synchronous replication configuration, where the RPO is set to 0, the amount of potential data loss is minimal.

    The failover operation always restores the destination resource to the replication common base snapshot. If failing over to the common base is not sufficient and replicated user snapshots exist, the destination resource should be manually restored to any of the replicated user snapshots.

    Once the session is failed over to the destination system, the destination storage resource becomes read-write. At this point, ensure that the storage resource has the correct access permissions to the host and share. When originally establishing a replication session between the primary and destination systems, create the right host access on the destination system ahead of time to reduce downtime in the event of a disaster.

    To resume operations on the source, fail back the replication session.

    File-based replication consideration

    Switch over the NAS server replication session using the Failover option. This action triggers a group operation towards the NAS server replication session and its associated file system replication sessions.

    The NAS server replication session should be in one of the following states in order for it to be failed over to the destination system:

    • Idle
    • Auto Sync Configured
    • Lost Communication
    • Unrecoverable Error

    If the NAS server replication session is in one of the following states, it cannot be failed over to the destination system:

    • Paused
    • Error states other than Lost Communication or Unrecoverable Error
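    The two state lists above amount to a membership test. The sketch below uses the state names exactly as they appear in this section; the function name `can_fail_over` is hypothetical, and treating any state not listed as eligible as ineligible is a conservative assumption rather than documented behavior.

```python
# NAS server replication session states from which failover is allowed.
FAILOVER_ELIGIBLE_STATES = frozenset(
    {"Idle", "Auto Sync Configured", "Lost Communication", "Unrecoverable Error"}
)


def can_fail_over(session_state: str) -> bool:
    """Return True if a NAS server session in this state can be failed over.

    'Paused' and error states other than 'Lost Communication' or
    'Unrecoverable Error' are rejected. Unlisted states are also rejected,
    which is an assumption of this sketch.
    """
    return session_state in FAILOVER_ELIGIBLE_STATES
```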

    To resume operations on the source, fail back the NAS replication session.

    Using replication for planned downtime

    Unlike a disaster, in which the primary (source) system is lost due to an unforeseen event, planned downtime is a situation in which you deliberately take the source system offline for maintenance or for testing purposes on the destination system. Prior to the planned downtime, both the source and destination are running with an active replication session. When you take the source offline in this scenario, the destination system is used as the production system for the duration of the maintenance period. Once maintenance or testing completes, return production to the original source system. Planned downtime does not involve data loss.

    To initiate a planned downtime, use the Failover with sync option on the source system. When you fail over a replication session from the source system, the destination system is fully synchronized with the source to ensure that there is no data loss. The session remains active for synchronous replication, and paused for asynchronous replication, while the source becomes Read-Only and the destination becomes Read-Write. The destination storage resource can be used for providing access to the host.

    Performing a failover with sync operation results in replication copying all the data, including any snapshots that have been created or marked for copy since the last sync occurred, to the destination site. Once the copy is finished, the destination is an exact replica of the source site and the roles are switched, as in the failover operation.

    To restore operations on the source, fail back the replication session.
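    The outcome of a failover with sync, as described in this section, depends only on the replication mode. The following sketch restates that outcome as data; the function name and dictionary keys are hypothetical.

```python
from typing import Dict


def state_after_failover_with_sync(mode: str) -> Dict[str, str]:
    """Describe the session and access states after a failover with sync."""
    # The session stays active for synchronous replication and is paused
    # for asynchronous replication.
    session_status = {"synchronous": "active", "asynchronous": "paused"}
    if mode not in session_status:
        raise ValueError(f"unsupported replication mode: {mode!r}")
    return {
        "session": session_status[mode],
        "source_access": "read-only",
        "destination_access": "read-write",
    }
```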

    File-based replication consideration

    The NAS server replication session should be in one of the following states in order for it to be failed over with sync to the destination system:

    • Idle
    • Auto Sync Configured

    If the NAS server replication session is in one of the following states, it cannot be failed over with sync to the destination system:

    • Paused
    • Error states

    To minimize disruption during a planned downtime window, ensure that the NAS server and associated file system replication sessions are manually synchronized first and then failed over. Follow these steps:

    1. Synchronize the NAS server replication session using the Sync option.
    2. Synchronize the replication sessions for each of the file systems associated with the NAS server using the Sync option. This ensures that the destination file systems have the latest data and minimal data will need to be transferred when the replication sessions switch over.
    3. Inform file system users and quiesce I/O operations from hosts and applications using the file systems in the NAS server.
    4. Switch over the NAS server replication session using the Failover with sync option. This action triggers a group operation towards the NAS server replication session and its associated file system replication sessions.
    5. Once all replication sessions have successfully failed over, resume I/O operations with the relevant applications and hosts.
      Any I/O attempted when the failover is occurring may result in read/write errors or stale file handle exceptions.
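    The numbered steps above can be sketched as a small orchestration routine. `ReplicationClient` and its methods are hypothetical names introduced for illustration and do not correspond to a real Unisphere CLI or REST API. The sketch also assumes that `failover_with_sync` blocks until every session in the group has failed over, and `RecordingClient` is an in-memory stand-in used only to show the call order.

```python
from typing import Callable, List, Protocol, Tuple

# States from which failover with sync is allowed, per this section.
FAILOVER_WITH_SYNC_STATES = frozenset({"Idle", "Auto Sync Configured"})


class ReplicationClient(Protocol):
    """Hypothetical interface; not a real Unisphere API."""

    def session_state(self, session_id: str) -> str: ...
    def sync(self, session_id: str) -> None: ...
    def failover_with_sync(self, session_id: str) -> None: ...


def planned_failover(
    client: ReplicationClient,
    nas_session: str,
    file_system_sessions: List[str],
    quiesce_io: Callable[[], None],
    resume_io: Callable[[], None],
) -> None:
    """Run the planned downtime procedure for one NAS server."""
    state = client.session_state(nas_session)
    if state not in FAILOVER_WITH_SYNC_STATES:
        raise RuntimeError(f"cannot fail over with sync from state {state!r}")
    client.sync(nas_session)                # step 1: sync the NAS server session
    for fs_session in file_system_sessions:
        client.sync(fs_session)             # step 2: sync each file system session
    quiesce_io()                            # step 3: stop host and application I/O
    client.failover_with_sync(nas_session)  # step 4: group failover with sync
    resume_io()                             # step 5: resume I/O after failover


class RecordingClient:
    """In-memory stand-in that records the calls it receives."""

    def __init__(self, state: str = "Idle") -> None:
        self.state = state
        self.calls: List[Tuple[str, str]] = []

    def session_state(self, session_id: str) -> str:
        return self.state

    def sync(self, session_id: str) -> None:
        self.calls.append(("sync", session_id))

    def failover_with_sync(self, session_id: str) -> None:
        self.calls.append(("failover_with_sync", session_id))
```

    Syncing manually before the failover keeps the amount of data transferred during the switch small, which shortens the window in which host I/O must stay quiesced.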

    Fail back a replication session

    To resume operations on a source system, the associated replication session needs to be failed back. To fail back a replication session, use the Failback option on the original destination system. Failback synchronizes the original source with the changes made on the original destination after failover, including any snapshots that have been created since the failover operation occurred. It then restores the source as the production system and restarts the replication session in the original direction.

    File-based replication consideration

    To resume operations on a source system, the associated NAS server replication session needs to be failed back. To fail back a NAS server replication session, use the Failback option on the original destination system. This action triggers a group operation towards the NAS server replication session and its associated file system replication sessions.