NetApp - Configuration Backup and Restore
Backing up the cluster configuration enables you to restore the configuration of any node or the cluster in the event of a disaster or emergency.
Configuration backup files are archive files (.7z) that contain information for all configurable options that are necessary for the cluster, and the nodes within it, to operate properly. There are two types of configuration backup files.
Node configuration backup file
Each healthy node in the cluster includes a node configuration backup file, which contains all of the configuration information and metadata necessary for the node to operate in a healthy state in the cluster.
Cluster configuration backup file
These files include an archive of all of the node configuration backup files in the cluster, plus the replicated cluster configuration information (the replicated database, or RDB file). Cluster configuration backup files enable you to restore the configuration of the entire cluster or of any node in the cluster. The cluster configuration backup schedules create these files automatically and store them on several nodes in the cluster.
Procedure to perform configuration backup
On node cluster1-02
cluster1::*> system configuration backup create -node cluster1-02 -backup-type cluster -backup-name test
[Job 1950] Job is queued: Cluster Backup OnDemand Job.
On node cluster1-04
cluster1::*> system configuration backup copy -from-node cluster1-02 -backup test.7z -to-node cluster1-04
On node cluster1-02
cluster1::*> cluster modify -node cluster1-04 -eligibility false
On node cluster1-04
cluster1::*> system configuration recovery node restore -backup test.7z -nodename-in-backup cluster1-04
Warning: This command overwrites local configuration files with files contained
in the specified backup file. Use this command only to recover from a
disaster that resulted in the loss of the local configuration files.
The node will reboot after restoring the local configuration.
Do you want to continue? {y|n}: y
Verifying that the node is offline in the cluster.
Verifying that the backup tarball exists.
Extracting the backup tarball.
Verifying that software and hardware of the node match with the backup.
Stopping cluster applications.
...
...
cluster1::> cluster show
Node Health Eligibility
--------------------- ------- ------------
clus false false
cluster1-02 false true
cluster1-03 false true
cluster1-04 false false
4 entries were displayed.
On node cluster1-02
cluster1::*> cluster modify -node cluster1-04 -eligibility true
cluster1::> cluster show
Node Health Eligibility
--------------------- ------- ------------
clus false false
cluster1-02 true true
cluster1-03 true true
cluster1-04 true true
4 entries were displayed.
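The health check above can also be scripted. The following is a minimal Python sketch, assuming you have captured the text of cluster show into a string yourself. The function names and the sample text are hypothetical, and nothing here talks to the cluster.

```python
def parse_cluster_show(output):
    """Parse captured `cluster show` text into {node: (healthy, eligible)}."""
    nodes = {}
    for line in output.splitlines():
        parts = line.split()
        # Data rows have exactly three fields: node name, health, eligibility.
        # The header, separator, and summary lines do not match this shape.
        if (len(parts) == 3
                and parts[1] in ("true", "false")
                and parts[2] in ("true", "false")):
            nodes[parts[0]] = (parts[1] == "true", parts[2] == "true")
    return nodes


def nodes_needing_attention(nodes):
    """Return the sorted names of nodes that are not both healthy and eligible."""
    return sorted(name for name, (healthy, eligible) in nodes.items()
                  if not (healthy and eligible))


# Hypothetical sample in the same layout as the output shown above.
sample_cluster_show = """\
Node Health Eligibility
--------------------- ------- ------------
cluster1-02 true true
cluster1-03 true true
cluster1-04 false false
3 entries were displayed.
"""

status = parse_cluster_show(sample_cluster_show)
pending = nodes_needing_attention(status)
```

An empty list from nodes_needing_attention suggests every listed node is healthy and eligible again.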
Procedure to restore the cluster from the backup
To restore a cluster configuration from an existing configuration, you re-create the cluster using the cluster configuration backup that you made available to the recovering node, and then rejoin each remaining node to the re-created cluster.
cluster1::*> storage failover show
Takeover
Node Partner Possible State Description
-------------- -------------- -------- -------------------------------------
clus - - Node unreachable
cluster1-02 cluster1-01 true Connected to cluster1-01
cluster1-03 cluster1-04 true Connected to cluster1-04
cluster1-04 cluster1-03 true Connected to cluster1-03
4 entries were displayed.
cluster1::*> storage failover modify -node cluster1-02 -enabled false
cluster1::*> storage failover show
Takeover
Node Partner Possible State Description
-------------- -------------- -------- -------------------------------------
clus - - Node unreachable
cluster1-02 cluster1-01 false Connected to cluster1-01, Takeover
is not possible: Storage failover is
disabled
cluster1-03 cluster1-04 false Connected to cluster1-04, Takeover
is not possible: Storage failover is
disabled
cluster1-04 cluster1-03 false Connected to cluster1-03, Takeover
is not possible: Storage failover is
disabled
4 entries were displayed.
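Before halting nodes, it can help to confirm that no node still reports takeover as possible. The sketch below is one way to do that in Python, assuming the storage failover show text has been captured into a string. The helper names are hypothetical, and the sample reuses rows from the output above.

```python
def takeover_possible(output):
    """Map each node in captured `storage failover show` text to its
    Takeover Possible value ("true", "false", or "-")."""
    result = {}
    for line in output.splitlines():
        parts = line.split()
        # A node row starts with node, partner, then the Possible column.
        # Wrapped description lines and headers do not have true/false/-
        # in the third field, so they are skipped.
        if len(parts) >= 3 and parts[2] in ("true", "false", "-"):
            result[parts[0]] = parts[2]
    return result


def nodes_with_takeover_enabled(output):
    """Return the sorted names of nodes that still report takeover as possible."""
    return sorted(node for node, value in takeover_possible(output).items()
                  if value == "true")


sample_failover_show = """\
Takeover
Node Partner Possible State Description
-------------- -------------- -------- -------------------------------------
cluster1-03 cluster1-04 false Connected to cluster1-04, Takeover
is not possible: Storage failover is
disabled
cluster1-04 cluster1-03 false Connected to cluster1-03, Takeover
is not possible: Storage failover is
disabled
2 entries were displayed.
"""

possible = takeover_possible(sample_failover_show)
still_enabled = nodes_with_takeover_enabled(sample_failover_show)
```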
Halt each node except for the recovering node
cluster1::*> system node halt -node cluster1-03
Warning: Are you sure you want to halt node "cluster1-03"? {y|n}: y
cluster1::*> system node halt -node cluster1-04
Warning: Are you sure you want to halt node "cluster1-04"? {y|n}: y
cluster1::*> system configuration recovery cluster recreate -from backup -backup test.7z
Warning: This command will destroy your existing cluster. It will rebuild a
new single-node cluster consisting of this node by using the contents
of the specified backup package. This command should only be used to
recover from a disaster. Do not perform any other recovery operations
while this operation is in progress. This command will cause all the
cluster applications on this node to restart, causing an interruption
in CLI and Web interface.
Do you want to continue? {y|n}: y
Executing cluster recreate script.
Checking to ensure that backup replicas exist.
Stopping cluster applications.
The management gateway server restarted. Waiting to see if the connection can be reestablished
Removing current replicas.
Restoring replicas from backup.
Restarting cluster applications; access to the CLI and Web interface will be available shortly.
The management gateway server restarted. Waiting to see if the connection can be reestablished..
The connection with the management gateway server has been reestablished.
If the root cause of the interruption was a process core, you can see the core file details by issuing the following command:
system node coredump show -node local -type application -corename mgwd.* -instance
cluster1::> set -priv advanced
Warning: These advanced commands are potentially dangerous; use them only when
directed to do so by NetApp personnel.
Do you want to continue? {y|n}: y
cluster1::*> system configuration recovery cluster show
Recovery Status: in-progress
Is Recovery Status Persisted: true
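The recovery status output is a simple list of "name: value" lines, so it is easy to read from a script. Here is a minimal Python sketch, assuming the command output has been captured as text. The function name is hypothetical.

```python
def parse_recovery_show(output):
    """Parse captured `system configuration recovery cluster show` text
    into a dict of field name to value."""
    fields = {}
    for line in output.splitlines():
        # Skip CLI prompt lines such as "cluster1::*> ...".
        if "::" in line:
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


sample_recovery_show = """\
cluster1::*> system configuration recovery cluster show
Recovery Status: in-progress
Is Recovery Status Persisted: true
"""

recovery = parse_recovery_show(sample_recovery_show)
in_progress = recovery.get("Recovery Status") == "in-progress"
```

While in_progress is true, the remaining nodes still need to be rejoined and the recovery status has not yet been set to complete.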
Boot each node that needs to be rejoined to the re-created cluster. Boot and rejoin the nodes one at a time.
cluster1::*> system configuration recovery cluster rejoin -node cluster1-03
Warning: This command will rejoin node "node2" into the local
cluster, potentially overwriting critical cluster
configuration files. This command should only be used
to recover from a disaster. Do not perform any other
recovery operations while this operation is in progress.
This command will cause node "node2" to reboot.
Do you want to continue? {y|n}: y
The target node reboots and then joins the cluster. Make sure the node is part of the cluster:
cluster1::> cluster show -eligibility true
Once all nodes are healthy, and if the restore was done from a backup file, use the following command to set the recovery status to complete:
cluster1::> system configuration recovery cluster modify -recovery-status complete
If the RDB on another node is not in sync, use the following command from a healthy node to sync the RDB:
cluster1::*> system configuration recovery cluster sync -node cluster1-03
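The rejoin phase follows a fixed order: rejoin one node, check that it is in the cluster, move on to the next node, and finally mark the recovery as complete. The Python sketch below only builds that ordered list of command strings so the sequence can be reviewed before anything is typed at the console. The function name is hypothetical, and the command strings are the ones used in this post.

```python
def rejoin_plan(nodes):
    """Return the ordered CLI commands for rejoining nodes one at a time
    and then marking the recovery as complete. Builds strings only; it
    does not connect to or change any cluster."""
    commands = []
    for node in nodes:
        # Rejoin a single node, then verify membership before the next one.
        commands.append(
            f"system configuration recovery cluster rejoin -node {node}")
        commands.append("cluster show -eligibility true")
    # Run only after every node is healthy again.
    commands.append(
        "system configuration recovery cluster modify -recovery-status complete")
    return commands


plan = rejoin_plan(["cluster1-03", "cluster1-04"])
for step, command in enumerate(plan, start=1):
    print(f"{step}. {command}")
```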