Sunday, 24 June 2018

HPE Superdome X

 


What is the HPE Superdome X?
The Superdome X is an enterprise-level x86 server designed to support mission-critical workloads that require maximum scalability and reliability. It is intended to run the most resource-intensive business processing, decision support, virtualization and database workloads, including SQL Server, SAP and Oracle. The Integrity Superdome X consists of a single compute enclosure containing one to eight BL920s Gen8 or Gen9 blades as well as interconnect modules, manageability modules, fans, power supplies, and an integrated LCD Insight Display. The Insight Display can be used for basic enclosure maintenance and shows the overall enclosure health. The compute enclosure supports four XFMs (crossbar fabric modules) that provide the crossbar fabric which carries data between blades.

To service any internal compute enclosure component, complete the following steps
in order:
1. Power off the partition.
2. Power off all XFMs.
3. Disconnect the power cables from the lower power supplies.
4. Disconnect the power cables from the upper power supplies.

Each BL920s server blade contains two x86 processors and up to 48 DIMMs. Integrity Superdome X supports multiple nPartitions of 2, 4, 6, 8, 12, or 16 sockets (1, 2, 3, 4, 6, or 8 blades). Each nPartition must include blades of the same type, but the system can include nPartitions with different blade types.

Integrity Superdome X provides I/O through mezzanine cards and FlexLOMs on individual server blades. Each BL920s blade has two FLB slots and three Mezzanine slots.

The Integrity Superdome X compute enclosure supports two power input modules, using either single phase or 3-phase power cords. Connecting two AC sources to each power input module provides 2N redundancy for AC input and DC output of the power supplies. There are 12 power supplies per Integrity Superdome X compute enclosure. Six power supplies are installed in the upper section of the enclosure, and six power supplies are installed in the lower section of the enclosure.





Isn’t the Superdome X an Itanium server?
No. HPE markets a separate server called the Integrity Superdome 2 that is built around the Itanium chip and runs HP-UX. The HPE Superdome X is an x86 server that uses the Intel Xeon E7 v3 processor family and runs SLES, RHEL, Microsoft Windows Server 2012 R2, VMware vSphere and CentOS. It will also be certified for Windows Server 2016 when Microsoft releases it.
What is the maximum scalability of the Superdome X?
The HPE Superdome X provides extreme scalability. In its maximum configuration it can support up to 16 sockets and 288 cores. You can configure the Superdome X with one to eight scalable BL920s Gen9 x86 blades. The maximum memory capacity is 3 TB per blade, for a total of 24 TB of RAM in a fully configured Superdome X server. SQL Server 2016 can scale to consume all of these cores, and with Windows Server 2016, scalability extends up to 640 cores.
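The arithmetic behind those maximums is straightforward (the 288-core figure implies 18-core E7 v3 parts, which is an inference from the numbers above): 8 blades x 2 sockets = 16 sockets, 16 sockets x 18 cores = 288 cores, and 8 blades x 3 TB = 24 TB of RAM.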

What availability features does the Superdome X have?
The HPE Superdome X is designed to provide five nines (99.999 percent) of availability. All key Superdome X hardware components are redundant and hot-swappable, including power supplies, fans, and I/O switches. The Superdome X uses a “firmware first” architecture that contains errors in firmware before any corrupted data can reach the OS. In addition, the built-in Error Analysis Engine (EAE) constantly analyzes all possible hardware faults, predicts errors and can automatically initiate recovery actions without operator intervention.

What are nPars?
The Superdome X supports multiple hardware partitions that are called nPars. Each nPar partition can be completely electrically isolated from the other partitions. Using the HPE Superdome X nPar technology, you can effectively run multiple diverse workloads on the same server system and those workloads will not interfere with one another. For instance, you can run an instance of the SQL Server relational database in one partition and SQL Server Analysis Services and Reporting Services in another partition. Even though these workloads have very different characteristics, they would be completely isolated from one another just as if they were running on separate systems.


Which virtualization technologies are supported with Superdome X?
Superdome X is certified for Hyper-V, VMware vSphere, and KVM/RHEV virtualization.


HPE Integrity Superdome X Management
•    Superdome Onboard Administrator (OA)—view and manage the entire Superdome X system.
•    iLO Management—remote access to the individual servers.
•    HPE System Update Manager—firmware management and system updates.
•    HPE Insight Remote Support (7.x) software—24x7 remote monitoring, automated case creation, diagnosis, notifications, and connectivity to HPE Support.
•    HPE Insight Online and the mobile dashboard—monitor device health and alerts, contracts and warranties, or service credits.



IBM XIV 2810/12-114


--42U 19" standard rack
--1 ATS, 3 UPSs, 15 modules (12 drives per module), 1U management console - Chabuka; modules 4, 5, 6, 7, 8 and 9 are interface modules.
--Connect a laptop to the laptop port with DHCP enabled; the IP received is 14.10.202.1 (the laptop port acts as a DHCP server with IP 14.10.202.250).
--The TA tool is required for initial configuration, code load and various maintenance activities (guided procedures); login technician/????????.
--Logical configuration is done through the XIV GUI; logins admin/adminadmin and technician.
--XCLI commands: state_list, monitor_redist, help, event_list, component_phaseout component=<component id>,
--component_list filter=notok, component_test component=1:module:10
--fc_port_list, fc_connectivity_list logged_in=yes
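A quick post-maintenance health check using the commands above might look like the following (a sketch only; exact parameters and output depend on the code level):
state_list
component_list filter=notok
monitor_redist
event_list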
--servicecenter.xiv.ibm.com is used for remote support; connect and disconnect with the XCLI commands support_center_connect and support_center_disconnect.
--/dev/sda: CF - configuration and root filesystem
--/dev/sdb: 37 GB from each disk; SX area for traces/events/cores
--/dev/sdc: 60 GB, 1 volume per interface module (x6)
--Boot-up time is about 4 minutes.
--Upgrade from 10.0 to 10.1 is disruptive.
--Upgrades since 10.1 are concurrent; I/O cutover time for hosts is < 13 sec.
--With 1 TB drives: 180 TB raw - (12 x 1 TB) - (3 x 1 TB) - 6.8 TB for SX = 158.2 TB, mirrored (/2) => ~79 TB usable.
--With 2 TB drives: 360 TB raw - (12 x 2 TB) - (3 x 2 TB) - 8 TB for SX = 322 TB, mirrored (/2) => ~161 TB usable.
--Data is broken into 1 MB partitions and mirrored so that the two copies of a partition are stored on separate modules.
--Each logical volume is built from partitions spread across all drives; in the event of a failure entire modules are rebuilt, and only used capacity is rebuilt.
--All DDMs take part in the rebuild.
-- storage pool  ==> volume
-- host connection ==> host ==> map volume to host
--The storage space of the IBM XIV storage system is partitioned into storage pools, where each volume belongs to a specific storage pool. Storage pools provide improved management and regulation of storage space.
--Storage pool size ranges from 17 GB to 80,654 GB. The size of a storage pool can be increased, limited only by the free space on the system.
--The size of a storage pool can be decreased, limited only by the space consumed by its volumes and clones.
--Volumes can be moved between storage pools as long as there is enough free space in the target storage pool.
--All of the above storage pool operations are pure accounting transactions and do not involve copying data from one storage pool to another.
--A volume can belong to only one storage pool and one consistency group. All volumes in a consistency group belong to the same storage pool. A volume can have multiple clones; a clone is a point-in-time copy of a volume.
-- XIV queue depth 1400 per port and 256 per volume.

IOPS = Queue Depth / Latency
Throughput = IOPS x I/O Size = (Queue Depth / Latency) x I/O Size
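For example, assuming a hypothetical average latency of 1 ms, a host queue depth of 64 gives 64 / 0.001 s = 64,000 IOPS; at an 8 KB I/O size that is roughly 64,000 x 8 KB ≈ 512 MBps.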

-- Host queue depth - A minimum queue depth of 64 should be used. Performance can be improved with higher values dependent on relative workload levels and content.
--Multipathing: AIX MPIO - currently only the default Path Control Module is supported, in active/passive mode.
--Linux Device Mapper - requires RPMs to be installed and a kernel recompile.
--Solaris native MPxIO; Windows DSM to support XIV; VMware native multipathing in active/standby mode.
--It is not advisable to use two protocols (FCP/iSCSI) to access the same volume, although this might be used to migrate a host from FC to iSCSI. To access different volumes from the same host through different protocols, use separate host definitions.
--Supports traditional (SCSI-2) and persistent (SCSI-3) reserves.
--Reserves can be displayed and cleared using XCLI commands:
reservation_list - list volume reservations
reservation_key_list - list reservation keys
reservation_clear - clear reservations of a volume
--In multi-host environments, reserves can be used to block volume access from other hosts while the volume is updated by the reserving host. A problem arises when the reserving host crashes while the reserve is still outstanding. The customer can use the above commands to analyze the situation and resolve the problem by clearing the reservation.
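For example (volume name hypothetical, and exact parameter names may vary by code level): run reservation_list vol=db_vol01 to inspect the stale reserve, then reservation_clear vol=db_vol01 to release it.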
--An SSR should never use these commands, as the risk of damaging data integrity is very high.


 

Monday, 2 April 2018

Benchmarks/Metrics


Several tools test system performance. Some are specific to an application or environment, while others are more general. Whenever a tool is used, it is critical to understand what the tool was designed for and how it operates in different environments and with different storage array features such as deduplication. The following are several tools and the associated use case for each:

IOmeter. IOmeter is an I/O subsystem measurement and characterization tool for single and clustered systems. It was originally developed by Intel Corporation. For more information about IOmeter, refer to http://iometer.org.
OLTP-A. OLTP-A consists of a single workload with 8K blocks and a 40%/60% read/write mix with both random and sequential patterns. Though most databases are more read intensive, this benchmark was selected because of the write environment, which in an all-SSD system tends to be the bottleneck. The workload simulates a write-heavy online transaction-processing database and executes both queries and updates to the database during operation.
SLOB (Silly Little Oracle Benchmark). SLOB is a complete toolkit for generating I/O through an Oracle database and is used to analyze the I/O capabilities of the database. SLOB is designed to generate high I/O levels without causing application bottlenecks.
VMmark. VMmark is a free tool from VMware to measure performance for applications running in VMware environments. For example, this tool helps to identify the number of applications that can be supported using a single storage system.
sqlio. sqlio is a tool provided by Microsoft that can also be used to determine the I/O capacity of a configuration.
Vdbench. Vdbench is a command-line tool to generate disk I/O for validating storage performance (a minimal parameter-file sketch follows this list).
STAC-M3. The STAC Benchmark Council developed the STAC-M3 Benchmarks. These benchmarks are primarily used in the financial community to measure performance associated with financial applications.
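As a rough illustration of how such synthetic I/O is defined, here is a minimal Vdbench parameter-file sketch approximating the OLTP-A style mix described above (8K transfers, 40% reads, fully random); the device path is hypothetical and options vary by Vdbench version:

sd=sd1,lun=/dev/sdb,openflags=o_direct
wd=wd1,sd=sd1,xfersize=8k,rdpct=40,seekpct=100
rd=run1,wd=wd1,iorate=max,elapsed=300,interval=5

Run it with ./vdbench -f oltp.parm and compare the reported IOPS and response times against expectations.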

Metrics Terminology



The terms used to describe performance can be thought of in three pillars: I/Os per second, latency, and throughput or bandwidth. Depending on the workload, the performance test values obtained are dramatically different by design intent. For example, if the workload is predominantly small random reads, the IOPS values are high, but the throughput values are relatively low in comparison. Conversely, if the workload is mostly large sequential I/O, especially 1 MB I/O, the measured IOPS and throughput (MBps) values are numerically close to the same, since each I/O moves about 1 MB. Both of these circumstances are expected behaviors. In general terms, the relationship between IOPS and throughput can be expressed as:
Throughput (MBps) = IOPS x block size (MB)
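For example, 10,000 IOPS at an 8 KB (~0.008 MB) block size is roughly 80 MBps, while the same 80 MBps of large sequential 1 MB I/O corresponds to only 80 IOPS.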

The following sections explore each performance measure in more detail.

I/Os per Second (IOPS)
I/Os per second is the measure of how many input/output operations pass between an initiator (host server) and a target (E-Series storage system) in one second. Based on the protocol that manages the path between the initiator and target, the maximum level of IOPS achievable varies significantly. For example, a 10Gbps iSCSI link cannot process the same level of I/Os as a 32Gb FC link. Factors external to the storage system also introduce overhead and affect IOPS, ranging from network settings and host settings on HBAs, HCAs, and NICs to settings in the OS and other host-side or application-related issues. Therefore, achievable IOPS for a given system depends on many factors and must be understood holistically for you to achieve the best performance profile for a given environment.

Latency
The second pillar is latency: the time required to move an I/O from the initiator to the target or back in the other direction. All of the same factors that affect IOPS have a related impact on latency. However, while the other measures ultimately top out at the limits of the hardware and protocols involved, latency spikes and grows dramatically as those limits are exceeded. As a result, latency is the measure used to define operating ranges that apply to normal data center workloads.

Throughput/Bandwidth
Throughput is the third pillar of storage performance and is a measure of how much data can pass between host initiators and storage system targets in one second. Like IOPS, throughput is heavily related to the type of workflow. For example, from a throughput perspective, it takes many small I/Os to equal one large I/O. As a result, the type of I/O transferred as well as the bandwidth (size of links) and protocol used play significant roles in the amount of data that can be transferred in one second. With this fact in mind, some host OS suppliers have built in tunable settings to allow hosts to use larger I/Os or block sizes when transferring data to and from storage. As a result, just like IOPS, the ability to achieve certain throughput requirements is a multilevel activity and depends on factors both inside the storage system and within each unique customer data center.

Bye...

Sunday, 18 February 2018


Remote Management System





  • Intel uses RMM2 (Remote Management Module 2)
  • Dell uses DRAC (Dell Remote Access Controller)
  • Sun (now Oracle) uses ILOM (Integrated Lights Out Manager)
  • IBM uses IMM (Integrated Management Module)
  • HP uses iLO (Integrated Lights-Out)

  • Coming out with pros and cons of each of these....

    Bye...
    Dell Technologies - VxRail 


    - Jointly by Dell EMC and VMware

    - Dell EMC PowerEdge 14th-generation servers - VMware - vSAN
    - VxRail Appliances are built using a distributed-cluster architecture consisting of modular blocks that scale linearly as the system grows from as small as 3 nodes to as large as 64 nodes. Nodes are available in different form factors, with single-node appliances for different use cases: E (entry-level), P (performance-optimized), V (VDI-optimized with GPU), and S (storage-optimized, supporting high-capacity HDDs).


    - All appliance models support either 10GbE or 1GbE networking. 10Gb Ethernet networks are required for all-flash configurations and for environments that will scale to more than 8 nodes. Additional ports are available, allowing the customer to expand VM-network traffic.
    - Scale up and Scale out


    The number of Ethernet switch ports required depends on the VxRail model and whether it is configured for hybrid storage or for all flash. The all-flash system requires two 10GbE ports, and hybrid systems use either two 10GbE ports per node or four 1GbE ports per node. For 1GbE  networks, the 10GbE ports auto-negotiate down to 1GbE. Additional network connectivity can be accomplished by adding additional NIC cards. The additional PCIe NICs are not configured by VxRail management, but can be used by the customer to support non-VxRail traffic, primarily VM traffic. The additional ports are managed through vCenter. Network traffic is segregated using switch-based VLAN technology and vSphere Network I/O Control (NIOC). Four types of network traffic exist in a VxRail cluster:
    Management -  Management traffic is used for connecting to the VMware vCenter web client, VxRail Manager, and other management interfaces, and for communications between the management components and the ESXi nodes in the cluster. Either the default VLAN or a specific management VLAN is used for management traffic.
    vSAN -  Data access for read and write activity as well as for optimization and data rebuild is performed over the vSAN network. Low network latency is critical for this traffic and a specific VLAN isolates this traffic.
    vMotion -  VMware vMotion allows virtual-machine mobility between nodes. A separate VLAN is used to isolate this traffic.
    Virtual Machine  -  Users access virtual machines and the services they provide over the VM network(s). At least one VM VLAN is configured when the system is initially configured, and others may be defined as required.
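    As a purely illustrative example of such segregation (the VLAN IDs are hypothetical), a small cluster might place management on the default VLAN, vSAN on VLAN 200, vMotion on VLAN 300, and VM traffic on VLAN 400 and above.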

    VxRail Manager - the VxRail management platform - is the appliance hardware lifecycle management and serviceability interface for VxRail clusters. In newer VxRail versions (4.7), a plugin for vCenter allows all of these activities to be performed from within vCenter.

    vSphere - vCenter and ESXi; vSAN software-defined storage (at least 1 SSD is required for vSAN)
    After the hardware and network configuration is complete, access the VxRail cluster at the default IP address 192.168.10.200 and follow the step-by-step procedure to configure VxRail. This can also be automated by putting all input values in a JSON file; the JSON file can be created using the VxRail PEQ (pre-engagement questionnaire).

    Initial Screen on browser with 192.168.10.200



    One of the screens shown during configuration
     


    VxRail Cluster Initialized



      


    ESRS - EMC Secure Remote Services,
    /var/log/VMware/marvin/tomcat/log/marvin.log
    http://vxrail-ip/stats/log

    Bye...

    NetApp E-Series

    

    Controller State:

     
    Optimal - The remaining controller marks the internal state of its alternate as "Not Present".
    Quiesced - The remaining controller marks the internal state of its alternate as "Not Present". No I/O requests are processed until the state is no longer quiesced.
    Service Mode - The remaining controller marks the internal state of its alternate as "Not Present". No I/O requests are processed until the state is no longer service mode.
    Suspended - The remaining controller becomes Online.
    Lockdown - The remaining controller remains in the Lockdown state.
    Offline - The remaining controller is released from reset, enters the Service Mode state, and proceeds with Start-of-Day (SOD) processing.


    Dynamic Disk Pools:
     
    These were initially called CRUSH (Controlled Replication Under Scalable Hashing).

    A stripe with 3 segments has segment 1 on drive 1, segment 2 on drive 2, and segment 3 on drive 3.

    A volume with 3 pieces has piece 1 on drive 1, piece 2 on drive 2, and piece 3 on drive 3.

    Piece 1 contains all of the volume's segments on drive 1, and piece 2 contains all of its segments on drive 2.

    Segments combine to form stripes.
    Pieces combine to form the volume.

    Stripes are broken into segments. All of a volume's segments residing on one drive are collectively called a piece. Each piece is written to one disk of the RAID group.
     
    In DDP, a C-stripe (or D-stripe) is 5GB (4GB data and 1GB parity).
    It is made up of C-pieces (or D-pieces).
    A C-stripe always has 10 pieces, irrespective of the number of disks in the dynamic pool.
    Each RAID 6 stripe is 1MB (8+2 with a 128K segment size).
    A C-stripe contains 4096 traditional RAID 6 stripes.
    No drive contains two C-pieces from the same C-stripe. Each C-piece is 512MB in size.
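    These figures hang together arithmetically: each RAID 6 stripe carries 8 x 128K = 1MB of data plus 2 x 128K = 256K of parity, so 4096 stripes per C-stripe give 4GB of data and 1GB of parity (the 5GB C-stripe), which also matches 10 C-pieces x 512MB.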
    Preservation Capacity
    When disk pools are created, a certain amount of capacity is preserved for emergency use.  This capacity is expressed in terms of a number of disks in the management software, but the actual capacity is spread across the entire pool of disks.  The default amount of capacity that is preserved is based on the number of disks in the pool.
    Preservation capacity ranges from 0 to 10 disks' worth; preservation capacity is active.
    Dynamic pools - thin-provisioned volumes (TPV):
    The minimum repository size is 4GB, and expansions must be in 4GB increments.
    Virtual capacity can be specified between 32MB and 63TB.
    Provisioned capacity is between 4GB and 64TB. The provisioned-capacity quota limits automatic expansion of the repository; the quota equals the provisioned capacity when the expansion mode is manual.
     
     
    DS5300 - 7.3x - 10.73

    MD3260 - DE6600 - 60 drives /DE5600 - 24 drives /DE1600 12 drives - 4U/2U - 6Gbps ESM
    DE460C/DE224C/DE212C - 4U/2U - 12Gbps IOM
    MD3460 - E2760 80.20.x - 11.xx
    E2800, E5700 - 8.4x - 11.84 (embedded web user interface)

    DACstore - All configuration information is stored in DACstore. DACstore is stored on all drives but is invisible to hosts and users.  The capacity reserved for DACstore is subtracted from the usable capacity of a volume group. DACstore resides on the innermost portion of the disk drives; reads/writes to the innermost tracks are slower, so the faster outer tracks are reserved for customer data.
    Bye...
    

    Sunday, 21 January 2018


    NetApp HCI

    NetApp HCI is architected in building blocks at either the chassis or node level.   Each chassis can hold 4 nodes, made up of storage nodes running SolidFire Element OS and/or compute nodes running the VMware hypervisor (other hypervisors may come at a later stage). Nodes are inserted and removed from the back of the chassis, and SSDs for storage nodes are populated in the front.  The minimum configuration is 2 chassis with 6 nodes: 4 storage and 2 compute.  The 2 additional blank slots can be used for expansion. Compute and storage nodes can be mixed and matched.

    Storage nodes and compute nodes come in 3 configurations: small, medium, and large.

    Storage - Large 22TB/44TB, Medium 6TB/22TB, Small 3TB/11TB

    Compute - Large 36 cores 768GB, Medium 24 cores 512GB, Small 16 cores 384GB

    The specific value propositions of NetApp HCI are the following

    Guaranteed performance: delivers predictable performance, consolidates mixed workloads, and provides granular control at the virtual machine level.

    Flexibility and scale: scales compute and storage independently, optimizes and protects existing investments, and eliminates the HCI "tax" by separating the scaling of compute and storage.

    Automated infrastructure: deploys capabilities rapidly, automates and streamlines management, and simplifies processes through a comprehensive API library.

     

    First-generation HCI scales compute and storage together in a fixed ratio. NetApp HCI scales them independently, so that if customers need only compute, they do not have to pay for and overprovision storage.  Because NetApp storage and compute nodes scale independently, customers can mix and match to fit their needs.  All nodes in the minimum configuration should be the same size, and the largest node should be no more than one-third larger than the combination of the rest of the nodes.

     

    With the NetApp Deployment Engine (NDE), HCI can be deployed quickly (in around 30 minutes).

    NetApp has automated and streamlined the deployment steps, reducing more than 400 entries to fewer than 30 entries.  This automation reduces the risk of error and enables customers to begin using HCI in about 30 minutes.  Because the system is intuitive, it reuses data such as user names and passwords when possible, so customers need to enter the data only once. Customers are not required to reenter data or select several options at varying complexity levels. The system automatically checks for user errors, eliminating manual checks.

    Originally, data centers were constructed around hardware, and software played only a supporting role.  Hyperconverged infrastructure (HCI) is "software-defined" because it employs a high degree of virtualization for storage, servers and support services.  The virtualization layer, which is a common software layer, runs on and manages the hardware. Software-defined data center (SDDC) architecture also enables higher degrees of automation; the software layer has automation helpers, such as APIs.

     

    HCI addresses business requirements by improving data efficiency and simplifying management of all infrastructure resources and virtual machines. HCI accomplishes this goal by providing a single point of administration at a fraction of the cost of a three-tier architecture.  Bringing all data center resources into the resource stack improves performance, and the data architecture improves data efficiency by providing one-time deduplication, compression and optimization of data.  A reduced need for hardware resources, streamlined operations, and automation greatly reduce the TCO.

     

    NetApp HCI is good for workload consolidation in highly virtualized, mixed-workload environments, where customers want to run thousands of applications predictably, with guaranteed performance.

    NetApp HCI is good for web infrastructures where customers want to deliver predictable performance to web applications and scale resources independently to meet or exceed SLAs.

    NetApp HCI is good for database environments running SQL and NoSQL (for example, MongoDB) database workloads that need resources to run properly without the capital expenditure (capex) and operational expenditure (opex) burdens of dedicated hardware.

    NetApp HCI is good for end-user computing environments where customers want to cost-effectively deliver the flexibility and adaptability that are required to manage an evolving large-scale, end-user computing environment. With granular quality-of-service (QoS) controls and an independent scale-out architecture, NetApp HCI is uniquely suited to manage and adapt to mixed and unpredictable performance demands for every application, with true multitenancy.  NetApp HCI is designed for the Data Fabric, so customers can access their data across any cloud – hybrid, public or private.
    How it differs from Nutanix...look for my future blogs or see updates on the same blog itself.
     
    Bye...

    Saturday, 6 January 2018


    NetBackup 7.x Technical Overview


    NetBackup Components and Architecture

    NetBackup's 3-tier architecture (Master Server, Media Server, Client servers) gives the power, scalability, and flexibility needed to match the demands of modern enterprise-class workloads.

    Master Server Overview

    The master server hosts the catalog database, backup policy creation and scheduling, the administration console, the Enterprise Media Manager (EMM), and centralized monitoring, reporting, and restore execution. The EMM server manages and allocates the resources required for NetBackup operations.  It is part of the master server and can be installed with the master or on a separate server.
    NetBackup is not a single program but rather a collection of processes that work together.
    Process name prefixes
    bp____ = legacy processes (bp comes from Backup Plus, the original product)
    nb____ = newer processes (6.x and later), multithreaded and always running
    nbrb = NetBackup Resource Broker; allocates and tracks resources
    nbproxy = NetBackup proxy used to talk to legacy processes; it is the intermediary between the old bp____ and new nb____ processes

    Master Server Processes

    bprd (request daemon) is always running on the master server and is responsible for accepting backup and restore requests.

    nbpem (policy execution manager) creates policies and runs them at their scheduled times.  When a policy is updated, nbpem is informed, and all clients and objects in that policy are updated too.

    bpjobd (job monitor)

    nbjm (job manager) takes job information from nbpem and updates nbpem once the job is completed.

    bpdbm (database manager) is responsible for the database and catalog.  It runs all the time on the NetBackup master server.
    The EMM server can run on the master server, or it can run separately and provide resources to other master servers too. nbrb and nbemm run only on the EMM server.  bpsched (pre-6.x) has been replaced by nbpem, nbjm and nbrb.

    nbrb (resource broker) acquires resources from nbemm running on the EMM server.

    nbemm (EMM media manager)

    The nbproxy process is required for retrieving storage lifecycle policies from the client so that the information can be provided to OpsCenter within NetBackup.
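    To see which of these daemons are actually running on a master or media server, the bundled bpps utility is the usual quick check (typical invocation shown; output varies by version and server role):
    /usr/openv/netbackup/bin/bpps -x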


    Media Server

    Media server and FT (Fibre Transport) media server - transfer data over the SAN, control storage interaction, and read/write data to/from storage; they are controlled by the master server, and multiple media servers can be used for load balancing.

    Media Server Process

    bpbrm (backup/restore manager)

    bptm/bpdm (tape/disk manager)

    bpcd (communication between master and clients)

    nbftsrv/nbfdrv64 (FT services)

    Client Overview

    Software agent installed on the client; standard client, SAN client, snapshot client; acts as the data movement engine; controlled by the master; supports encryption and deduplication.

    Clients Process

    bpcd (communication)

    vnetd (firewall communications)

    bpbkar (backup/archive client)

    tar (restore service)

    nbftclnt (SAN client)

    BasicDisk storage unit

    NetBackup can use simple disk storage as a backup and staging location, and it does not require a license.  It has some limitations when compared to AdvancedDisk.  Disk storage devices can be local or available via the network (NAS). Disk storage devices can be exposed to NetBackup as BasicDisk storage units. Once defined as a storage unit, a device can be used as a backup destination within a policy.

    AdvancedDisk requires the DPO (Data Protection Optimization) option.  With AdvancedDisk, multiple disk volumes can be pooled to create logical units (pools).  It supports SLPs (Storage Lifecycle Policies), it is easy to add capacity to an AdvancedDisk pool, and it supports CIFS/NFS shares and encryption.

    Basic MSDP (Media Server Deduplication Pool): the deduplication engine is embedded in the NBU 7.x code base.  Deduplication can be done at the client level, at the media server level, or on third-party appliances.  With media server deduplication, the media server deduplicates data on the local host; with off-host deduplication, the media server runs deduplication inline. MSDP requires DPO.

    OpenStorage (OST) requires DPO.  It enables multiple NetBackup media servers to share intelligent disk appliance storage.

     


    NetBackup appliances are purpose-built backup appliances that give standard and predictable performance.  The NetBackup 5230 and NetBackup 5330 storage shelves use RAID 6, and the operating system is on RAID 1. The appliances are monitored by Veritas support via call home.

     

    Management Options

    Web GUI, the NetBackup remote client installed on a 64-bit system, or SSH.

    IPMI - Manage system remotely, change BIOS settings, power on/off or recycle appliance, reimage appliance.

    NetBackup Features

    NetBackup Instant Recovery for VMware makes it possible to start a VM directly from its backup and then use vMotion to move the VM from backup storage to regular storage. This high-speed recovery boots the backup VM image directly from backup storage; the backup VM image is kept in read-only mode during recovery.

    Auto Image Replication (AIR) moves images from one NetBackup domain to another.  It requires DPO. AIR leverages SLPs to simplify multi-site disaster recovery.

    Accelerator technology can transform the way you protect your critical IT infrastructure by providing the power of a full backup using incremental backups; it uses synthetic backups.

    The FlashBackup capability is designed specifically to offer a performance solution for servers with highly utilized disk file systems containing a large number of files. The NetBackup client creates a raw backup of the file system instead of a file-by-file backup, which can increase performance for highly utilized file systems with many files, and it still supports restore of individual file objects. The file system backup is transferred to the NetBackup media server as a single, raw image; the backup process changes from a file stream into a bit stream.

    NetBackup helps customers leverage the flexibility of public cloud storage by supporting all major cloud storage providers, and it differentiates itself from other solutions through its proprietary OpenStorage (OST) technology.

    NetBackup OpsCenter is a reporting and monitoring tool that can manage multiple domains centrally. NetBackup OpsCenter is free, while NetBackup OpsCenter Analytics is licensed and can do forecasting and generate custom reports.

    NetBackup should always be updated from the top down: OpsCenter, master server, media server, client. They do not all need to be done at the same time; a master can work with mixed media server versions and mixed client versions, with some limitations and exceptions.  OpsCenter must always be at the highest level, or at least match the master server.

    To raise bprd logging verbosity, add the following to /usr/openv/netbackup/bp.conf:
    BPRD_VERBOSE = 5
    then make bprd reread its configuration:
    # /usr/openv/netbackup/bin/admincmd/bprdreq -rereadconfig

    Unified (vxlog) logging can be configured from the command line or the GUI. Some processes, such as nbemm, nbproxy and PBX, have to be configured through the command line using vxlogcfg.
    Use vxlogview to retrieve the logs.

    Cleaning up
    /usr/openv/netbackup/bin/vxlogmgr -F  purges all vxlogs.

    NetBackup Support Utility - NBSU - collects logs for support analysis.
    /usr/openv/netbackup/bin/support

    Troubleshooting -
    Documentation and preparation are key. The catalog backup e-mail contains most of the information you need to perform the recovery.  Annual DR tests should be performed to keep documentation current. For DR tests bring extra backup tapes for each application (including multiple catalog tapes).  Do not use "overwrite files" on system restores unless your system admins tell you to.

    Network communication between master/media or media/client:
    ../admincmd/bptestbpcd -host <hostname> -debug -verbose
    /usr/openv/netbackup/bin/bpclntcmd -pn - checks connectivity to the master server from a media server or client.

    ../netbackup/bin/vxlogview -p 51216 -t 00:05:00 - prints log output from the last 5 minutes.

    To restart the PBX process when NetBackup is stopped:
    /opt/VRTSpbx/bin/vxpbx_exchanged (stop|start)

    DataCollect is a utility included with NetBackup appliances to collect logs for support analysis.

    Catalog Backup configuration
    /usr/openv/netbackup/db
    /usr/openv/var
    /usr/openv/netbackup/vault/sessions
    /usr/openv/db/staging

    The following important files are not included in the catalog backup:
    /usr/openv/netbackup/bp.conf
    /usr/openv/volmgr/vm.conf
    /usr/openv/netbackup(include/exclude lists)
    HKLM\software\Veritas\CurrentVersion\Config

    Daily full catalog backups plus differential incremental backups every 6 hours, with a retention of 1-2 weeks.

    Use Storage Lifecycle Policies (SLPs) to make multiple copies, when possible, for automation.

    NetBackup Auto Image Replication introduced in NetBackup 7.1, allows a NetBackup domain to replicate its backup storage and catalog to one or more NetBackup domains. 

    OpenStorage allows storage vendors to become part of STEP (Symantec Technology Enabled Program) and get access to the OST API.  Storage vendors can write plugins using the OST APIs that are installed on the NetBackup media server.  This enables tight integration between the storage and NetBackup. OpenStorage supports any connectivity, any protocol (FC, TCP/IP, or a combination) and any format.  Without OST, if a storage device such as Data Domain performs deduplication, replication, copy creation, or writing directly to tape, NetBackup never knows about it.  That is why OST is required for tight integration of the storage with NetBackup.
     
    Deploying the OST plug-in for AltaVault 4.2 and NetBackup media server 7.6/7.7 (with OS updates applied): download the OST plug-in for Windows/Red Hat from NetApp for AltaVault.
    • In AltaVault, create an OST share and select an OST user.
    • In the NetBackup disk storage server wizard, select OST (OpenStorage), enter <OST share name>_<AltaVault name> and the OST user; the disk pool is created by the same wizard. Then create a storage unit, create a policy that uses the new storage unit, and initiate a backup.
    • Restore using the NetBackup client.  Images that are manually expired are removed from the OST share on AltaVault.
    AltaVault CLI (login admin/admin):
    config t                 - enter configuration mode
    ost enable / no ost enable   - enable or disable OST
    show ost server          - display OST server settings
    ost user ?               - help for OST user commands
    ost share ?              - help for OST share commands
    ost ssl enable / no ost ssl enable - enabling or disabling OST SSL requires a NetBackup stop and start to take effect

    On Linux, to verify that the plug-in is installed correctly, use:
    /usr/openv/netbackup/bin/admincmd/bpstsinfo -pi | grep NetApp




    NetBackup startup begins with the bprd process on the master server and ltid on the master and media servers.  All other processes, including nbpem, nbjm, nbrb and nbemm, start as required.

    Backup flow at the process level





    NBPEM (policy execution manager) -> NBJM -> BPJOBD (makes the entry in the job DB) -> NBJM -> NBRB -> NBEMM -> NBJM -> BPJOBD -> NBJM -> BPDBM (catalog entry) -> NBJM -> BPBRM (media server) -> BPBKAR (client) -> LTID -> BPTM (a BPTM child is spawned; BPBKAR sends data to the BPTM child, which places it into buffers) -> BPTM (the BPTM parent writes the data to storage). Once the backup completes, the above processes run in reverse to return the completion acknowledgement.

    Reference - Symantec Veritas website




    https://www.youtube.com/watch?v=PBYg8naRf1M

    For NetBackup pre-6.x versions, refer to:

    https://vox.veritas.com/t5/Backup-Recovery-Community-Blog/Netbackup-processes-and-commands/ba-p/778784

    https://annurkarthik.wordpress.com/category/data-protection/symantec-netbackup/full-system-level-restore-symantec-netbackup/

    Bye...