Tuesday, 17 September 2019

Miscellaneous - My reference 

- Transient: a sudden rise in voltage for a very short period, roughly 5 to 50 nanoseconds. ESD and lightning are examples. ESD can reach 8,000 volts for a billionth of a second, which is long enough to damage electronic equipment. The solution is a transient voltage surge suppressor (TVSS) that grounds the excess voltage.
- Spike/surge/swell (high voltage) or sag (low voltage) lasting less than a minute can be conditioned by a UPS. State-of-the-art UPSes have a power factor of 1, i.e. VA = W. A consumer UPS for a PC may have a power factor of 0.6, i.e. 100 VA gives 60 watts.
- Overvoltage or undervoltage is high or low voltage lasting longer than 1 minute and damages electrical equipment. I will not use my water pump during overvoltage or undervoltage as it is directly connected without any stabilizer. My AC gets power through a stabilizer, so it should handle overvoltage and undervoltage conditions. For a DC you need a power conditioner and UPS; the same is required for voltage fluctuation.
- In a DC, the standby power need is fulfilled by a DG (diesel generator) or by batteries.
- DG components: the starter starts the DG, the alternator converts mechanical energy to AC, the voltage regulator controls the voltage produced by the alternator, and the governor controls the engine speed and thus the quality (frequency) of the AC output. Once the AC output is stabilized, power feeding from the DG starts. When two or more generators are paralleled for more output or redundancy, they must be governed at the same speed. If two DGs are out of sync, one of them will carry a larger fraction of the load; the governor corrects this.
- 42U 19" standard rack 1U=1.75inch ADU - Air Distribution Unit
ARU - Air removing unit,
- Never mix hot and cold aisle
- Use of blanking panel for open space in rack improves air flow
- Use thermostat normally you don't always need 22 degree Celsius 24 degree Celsius is okay in many cases
- Environment cooling can be used to reduce overall cost
- proper cabling, avoid spaghetti of cables  

Monday, 16 September 2019

Miscellaneous - My reference 

- 99.999% uptime may not give the right picture. For example, DC1 and DC2 are both down for 5 minutes in total:
DC1 is down once for 5 minutes.
DC2 is down 10 times, each with a downtime of 30 seconds; the total downtime is also 5 minutes.
Time to recover should be considered as well (see the quick calculation after this list).
- Total flooding fire extinguishing systems - halon (an ozone depleter) is no longer used; fluorine-based compounds and inert-gas mixtures are used instead.
Water sprinklers get activated at 75 degrees centigrade.
- Temperature 22-24 Celsius, with a maximum approved by ASHRAE of 27.22 Celsius. Important: frequent temperature variance can alter the characteristics of chips. That is one among many reasons why servers are not powered off even when they are not required for processing.
- Humidity 40-60%: low humidity increases the chance of static charges, high humidity can cause droplets and corrosion.
- CRAC units should not work in competition, with one cooling/dehumidifying and the other heating/humidifying. DCIM/BMS can be used to check for this condition.
CRAC units should be tested to ensure that measured temperatures (supply and return) and humidity readings are consistent with design values. Set points for temperature and humidity should be consistent on all CRAC units in the data center. Unequal set points will lead to demand fighting and fluctuations in the room.
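
A quick calculation of what the uptime example above implies (a minimal shell sketch; the figures are the ones from the example):

# Allowed downtime per year at 99.999% availability:
awk 'BEGIN { printf "%.2f minutes/year\n", (1 - 0.99999) * 365.25 * 24 * 60 }'
# Both DCs lose 5 minutes in total, but the recovery behaviour differs:
echo "DC1: $((1 * 300)) seconds down, 1 recovery"
echo "DC2: $((10 * 30)) seconds down, 10 recoveries"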

CRAC (computer room air conditioner) - self-contained precision cooling, good for small data centers.
CRAH (computer room air handler) - works with big chillers; good for bigger DCs and also costs less than CRAC for moderate or bigger DCs (>500 KW).
Spot coolers / floor-mounted coolers: 1 KW to 5 KW
Large floor-mounted coolers: 20 KW to 200 KW
CRAC: 100 KW to 400 KW
CRAH: 500 KW+
Air-cooled DX system: requires roof (3 m)
Air-cooled self-contained system: requires duct
Glycol-cooled system: requires roof (3 m)
Water-cooled system: requires roof (3 m)
Chilled water system: requires roof (3 m)
 
An increase in temperature requires more moisture to maintain the same relative humidity: warmer air can hold more vapor than it actually contains, so the relative vapor concentration drops and you need a humidifier to maintain it.
With cooling the opposite is the case: as temperature decreases, the air's ability to hold moisture decreases, so the relative vapor concentration rises and you need to dehumidify to maintain relative humidity.
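
A rough numeric illustration of the effect described above (my own sketch using the Magnus approximation for saturation vapor pressure; the 22 C / 50% starting point is an assumption):

# Heat air from 22 C to 27 C without adding or removing moisture and watch relative humidity fall.
awk 'function es(t) { return 6.112 * exp(17.62 * t / (243.12 + t)) }   # saturation vapor pressure, hPa
BEGIN {
  e = es(22) * 0.50                                # actual vapor pressure at 22 C and 50% RH
  printf "RH after heating to 27 C: %.0f%%\n", 100 * e / es(27)
}'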


Sunday, 15 September 2019

Miscellaneous - My reference
  •  Deduplication, Compression, Erasure Coding
    • Deduplication - fingerprints using SHA-1
    • Compression - LZ4 (inline), LZ4HC (post-process)
    • EC - generally post-process. Logic built into the system initiates migration from RF2/RF3 to EC.
      • For frequently accessed data, avoid deduplication and compression as both of them are resource intensive
      • VDI workloads and backups are good candidates for deduplication
      • Regular files are good candidates for compression
      • CAD files are bad candidates for compression
      • For high I/O requirements, do not use EC.
    • For EC a minimum of 4 nodes is required. vSAN implements EC only on all-flash; Nutanix implements it on hybrid as well. The same is true for compression and deduplication.
    • For a large block size, say 16 KB (Nutanix), deduplication is less resource intensive than at 4 KB (as in SolidFire and NetApp)
  • RF2-FTT1: 2 copies of data are maintained and 3 copies of metadata and configuration are maintained. Can withstand 1 disk or node failure.
  • RF3-FTT2: 3 copies of data are maintained and 5 copies of metadata and configuration are maintained. Can withstand 2 disk or 2 node failures. (For Nutanix with RF3, a storage container can have a replication factor of 2 or 3. No RF changes are allowed on an EC-enabled storage container.)
  • FTT1 / RAID 1 - minimum 3 nodes
  • EC / RAID 5 - minimum 4 nodes (see the capacity sketch after this list)
  • ESXi and NTNX boot partitions remain unencrypted. SEDs support encrypting individual disk partitions selectively using the "BAND" feature (a range of blocks).
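
A rough capacity sketch for the RF-to-EC point above (my own arithmetic; the 4+1 strip size and the 10 TB figure are assumptions for illustration):

# Raw capacity consumed by 10 TB of data under RF2 (two full copies) vs 4+1 erasure coding.
DATA_TB=10
echo "RF2:    $((DATA_TB * 2)) TB raw (2.0x overhead)"
awk -v d="$DATA_TB" 'BEGIN { printf "EC 4+1: %.1f TB raw (1.25x overhead)\n", d * 5 / 4 }'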

Sunday, 8 September 2019

Miscellaneous - My reference

  • A 25 MW DC can run around 50,000 servers => 25,000 KW : 50,000 servers => 1 KW = 2 servers, i.e. 1 server at 500 watts, but that assumes a DC PUE of 1, which is not possible in the Middle East. In the Middle East a 25 MW DC can power more or less around 20,000 servers (see the power sketch after this list).
  • DCIM
    • Hotspot in DC identification. Some capable of creating CFD to find hotspot
    • Power capacity available, used and forecast of power requirement
    • Locating and inventorying assets such as racks, IT and network devices, cooling, and UPS. This requires that the DCIM process be followed as assets are placed and deployed in the DC
    • Maximize uptime by generating alerts and reporting predictive failures.
    • OpenDCIM (free software), Sunbird, StruxureWare, Equinix IBX SmartView
  • TCP and UDP are both layer 4 protocols of the OSI model. TCP is connection oriented and acknowledges packet delivery, hence slower but reliable, while UDP is connection-less and does not acknowledge packet delivery, hence faster.
  • Layer 2 - data link, deals with MAC addresses; layer 3 consists of IP/ARP and is capable of routing
  • Docker enables creating multiple containers that share the same kernel/OS, making it fast to deploy multiple containers on a single system. Kubernetes is an orchestration engine that can be used to deploy and manage many such containers (see the container sketch after this list).
  • VMware HA enables a VM to start on another node of the cluster in case the node on which the VM is running fails. Distributed Resource Scheduler (DRS) enables movement of VMs among the cluster nodes depending on the load on each node: VMs can be moved using vMotion from heavily utilized nodes to less utilized nodes.
  • For normal operations the core to vCPU ratio is 1:4; it can be 1:1 or 1:2 for high performance requirements.
  • In case a customer has around 1000 systems/storage arrays and wants to move to hyperconverged or do some sort of tech refresh, the following details should be collected
    • Performance
      • Total cores and Total frequency vs utilized cores and utilized frequency 
      • Total Memory vs utilized memory 
      • Total storage vs allocated storage vs used storage, rate of storage efficiency like compression, deduplication, compaction, Raid/Erasure Code/Redundancy factor, replication factor used for data and metadata. 
      • Latency IOPS and Throughput for local storage and networked shared storage 
      • Network throughput, type of network Ethernet 1gb/10gb/25gb/40gb copper/fiber, FC 8/16/32gb
    • Tools that can be utilized: Live Optics, OnCommand Insight, HP OpenView, SolarWinds
  • Tick-Tock should really be Tock-Tick, as the microarchitecture change is the Tock and the subsequent process shrink of the processor is the Tick. Bridge (Sandy, Ivy) -> Well (Has, Broad) -> Lake (Sky, Copper, Tiger, Meteor)
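
Sketch for the 25 MW example in the first bullet above (my own arithmetic; the PUE of 2.5 is an assumption chosen to roughly reproduce the 20,000-server figure):

# Servers supportable = (facility power / PUE) / power per server
FACILITY_KW=25000; SERVER_KW=0.5
awk -v f="$FACILITY_KW" -v s="$SERVER_KW" 'BEGIN {
  printf "PUE 1.0: %d servers\n", f / 1.0 / s      # ideal case, ~50,000
  printf "PUE 2.5: %d servers\n", f / 2.5 / s      # hot climate, ~20,000
}'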
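
A minimal container sketch for the Docker/Kubernetes bullet above (image name, port and replica count are just examples):

# Run one nginx container directly with Docker, mapping host port 8080 to 80 in the container.
docker run -d --name web -p 8080:80 nginx
# The equivalent workload under Kubernetes orchestration, scaled out to three replicas.
kubectl create deployment web --image=nginx
kubectl scale deployment web --replicas=3
kubectl expose deployment web --port=80 --type=NodePort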


An S3 bucket can be mounted on an EC2 instance as a filesystem using s3fs. s3fs is a FUSE (Filesystem in Userspace) file system that enables mounting an S3 bucket on the local filesystem.
#sudo yum update
#sudo yum install automake fuse fuse-devel gcc-c++ git libcurl-devel libxml2-devel make openssl-devel
#git clone https://github.com/s3fs-fuse/s3fs-fuse.git (clone the s3fs source code from git)
#cd s3fs-fuse
#./autogen.sh
#./configure --prefix=/usr --with-openssl
#make
#sudo make install
#which s3fs
Authentication requires an access key and a secret access key:
#echo accesskey:secretkey > /etc/passwd-s3fs
#sudo chmod 640 /etc/passwd-s3fs
#mkdir /mys3bucket
#s3fs vijayraj -o use_cache=/tmp -o allow_other -o uid=1001 -o mp_umask=002 -o multireq_max=5 /mys3bucket
Make an entry in rc.local so the bucket is mounted on boot:
#vi /etc/rc.local
/usr/bin/s3fs your_bucketname -o use_cache=/tmp -o allow_other -o uid=1001 -o mp_umask=002 -o multireq_max=5 /mys3bucket
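
An alternative to the rc.local entry above is a persistent mount via /etc/fstab (a sketch following the s3fs-fuse documentation; the bucket name and options mirror the example above and should be adjusted):

# Append an fstab entry so the bucket is mounted automatically at boot, then verify.
echo 'your_bucketname /mys3bucket fuse.s3fs _netdev,allow_other,use_cache=/tmp,uid=1001,mp_umask=002,multireq_max=5 0 0' >> /etc/fstab
mount -a
df -h /mys3bucket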


Multipath

systemctl start multipathd.service
systemctl enable multipathd.service
/sbin/mpathconf --enable --user_friendly_names y
systemctl start multipathd
multipath -ll

# multipath -ll
mpatha (3600a098000f78555000000e65d47e35f) dm-3 NETAPP  ,INF-01-00
size=400T features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 7:0:0:0 sdb     8:16  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 7:0:1:0 sdd     8:48  active ready running


[root@xxxxx ~]# mkfs.xfs /dev/mapper/mpatha


# mkdir -p /dat02_share
# chown nfsnobody:nfsnobody /dat02_share

# mount /dev/mapper/mpatha /dat02_share
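
To make the mount above survive a reboot (a sketch; the device alias and mount point are the ones used above):

# Persist the XFS mount of the multipath device, then verify.
echo '/dev/mapper/mpatha  /dat02_share  xfs  _netdev,defaults  0 0' >> /etc/fstab
mount -a && df -h /dat02_share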


# systemctl start nfs
firewall-cmd --permanent --zone=public --add-service=nfs
firewall-cmd --reload
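
The NFS service is started and the firewall opened above, but the directory still has to be exported (a sketch; the client subnet 192.168.0.0/24 is an assumption):

# Export the mounted filesystem to NFS clients and re-read the export table.
echo '/dat02_share 192.168.0.0/24(rw,sync,no_root_squash)' >> /etc/exports
exportfs -ra
showmount -e localhost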

iptables -I INPUT -j ACCEPT
iptables -I OUTPUT -j ACCEPT

iptables -I INPUT -p tcp --dport 80 -j ACCEPT -m comment --comment "Allow HTTP"
iptables -I INPUT -p tcp --dport 443 -j ACCEPT -m comment --comment "Allow HTTPS"
iptables -I INPUT -p tcp -m state --state NEW --dport 22 -j ACCEPT -m comment --comment "Allow SSH"
iptables -I INPUT -p tcp --dport 8071:8079 -j ACCEPT -m comment --comment "Allow torrents"

iptables -A INPUT -i lo -j ACCEPT -m comment --comment "Allow all loopback traffic"
iptables -A INPUT ! -i lo -d 127.0.0.0/8 -j REJECT -m comment --comment "Drop all traffic to 127 that doesn't use lo"
iptables -A OUTPUT -j ACCEPT -m comment --comment "Accept all outgoing"
iptables -A INPUT -j ACCEPT -m comment --comment "Accept all incoming"
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT -m comment --comment "Allow all incoming on established connections"
iptables -A INPUT -j REJECT -m comment --comment "Reject all incoming"
iptables -A FORWARD -j REJECT -m comment --comment "Reject all forwarded"
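
The rules above are reference snippets, and some of them contradict each other (for example accept-all followed by reject-all); iptables evaluates rules top to bottom, so order decides which rule wins. To review the effective order and keep the rules across a reboot (a sketch; the save path is the RHEL/CentOS convention):

# Show the INPUT rules in evaluation order, with hit counters.
iptables -L INPUT -n -v --line-numbers
# Persist the current ruleset (RHEL/CentOS style).
iptables-save > /etc/sysconfig/iptables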




Bye...

Sunday, 24 June 2018

HPE Superdome X

 


What is the HPE Superdome X?
The Superdome X is an enterprise-level x86 server that is designed to support mission-critical workloads that require maximum scalability and reliability. It is intended to run the most resource-intensive business processing, decision support, virtualization and database workloads, including SQL Server, SAP and Oracle. The Integrity Superdome X consists of a single compute enclosure containing one to eight BL920s Gen8 or Gen9+ blades as well as interconnect modules, manageability modules, fans, power supplies, and an integrated LCD Insight Display. The Insight Display can be used for basic enclosure maintenance and displays the overall enclosure health. The compute enclosure supports four XFMs that provide the crossbar fabric which carries data between blades.

To service any internal compute enclosure component, complete the following steps
in order:
1. Power off the partition.
2. Power off all XFMs.
3. Disconnect the power cables from the lower power supplies.
4. Disconnect the power cables from the upper power supplies.

Each BL920s server blade contains two x86 processors and up to 48 DIMMs. Server blades and partitions: Integrity Superdome X supports multiple nPartitions of 2, 4, 6, 8, 12, or 16 sockets (1, 2, 3, 4, 6, or 8 blades). Each nPartition must include blades of the same type, but the system can include nPartitions with different blade types.

Integrity Superdome X provides I/O through mezzanine cards and FlexLOMs on individual server blades. Each BL920s blade has two FLB slots and three Mezzanine slots.

The Integrity Superdome X compute enclosure supports two power input modules, using either single phase or 3-phase power cords. Connecting two AC sources to each power input module provides 2N redundancy for AC input and DC output of the power supplies. There are 12 power supplies per Integrity Superdome X compute enclosure. Six power supplies are installed in the upper section of the enclosure, and six power supplies are installed in the lower section of the enclosure.





Isn’t the Superdome X an Itanium server?
No. HPE markets a separate server called the Integrity Superdome 2 that is built around the Itanium chip and it runs HP-UX. The HPE Superdome X is an x86 server that uses the Intel Xeon Processor E7 v3 processor family and it runs SLES, RHEL, Microsoft Windows Server 2012 R2, VMware vSphere and CentOS. It will also be certified for Windows Server 2016 when Microsoft releases it.
What is the maximum scalability of the Superdome X?
The HPE Superdome X provides extreme scalability. In its maximum configuration it can support up to 16 sockets and 288 cores. You can configure the Superdome X with one to eight scalable BL920 Gen9 x86 blades. The maximum memory capacity is 3 TB per blade for a total of 24 TB of RAM for a fully configured Superdome X server.  SQL Server 2016 can scale to consume all of these cores and with Windows Server 2016, scalability will be up to 640 cores.

What availability features does the Superdome X have?
The HPE Superdome X is designed to provide five nines (99.999 percent) of availability. All key Superdome X hardware components are redundant and hot-swappable. This includes components like power supplies, fans, and I/O switches. The Superdome X uses a “firmware first” architecture that is able to contain errors in the firmware before any corrupted data can reach the OS. In addition, the built-in Error Analysis Engine (EAE) constantly analyzes all possible hardware faults, predicts errors and can automatically initiate recovery actions without any operator actions.

What are nPars?
The Superdome X supports multiple hardware partitions that are called nPars. Each nPar partition can be completely electrically isolated from the other partitions. Using the HPE Superdome X nPar technology, you can effectively run multiple diverse workloads on the same server system and those workloads will not interfere with one another. For instance, you can run an instance of the SQL Server relational database in one partition and SQL Server Analysis Services and Reporting Services in another partition. Even though these workloads have very different characteristics, they would be completely isolated from one another just as if they were running on separate systems.


Which virtualization technologies are supported with Superdome X?
Superdome X is certified for Hyper-V, VMware vSphere, and KVM/RHEV virtualization.


HPE Integrity Superdome X Management
See the entire Superdome X system through the Superdome Onboard Administrator (OA).
•    iLO Management—remote access the individual servers
•    HPE Smart Update Manager—firmware management and system updates.
•    HPE Insight Remote Support (7.x) software—24x7 remote monitoring, automated case creation, diagnosis, notifications, and connectivity to HPE Support.
•    HPE Insight Online and the mobile dashboard—monitor device health and alerts, contract and warranties, or service credits.



IBM XIV 2810/12-114


--42U 19" standard rack
--1 ATS, 3 UPS, 15 modules, 12 drives per module, 1U management console - Chabuka: modules 4-5-6-7-8-9 are the interface modules.
--Connect a laptop to the laptop port with DHCP enabled; the IP received is 14.10.202.1 (the laptop port acts as a DHCP server with IP 14.10.202.250).
-- The TA tool is required for initial configuration, code load and various maintenance activities. Guided procedure: technician/????????.
-- Logical configuration: XIV GUI, admin/adminadmin or technician.
-- XCLI commands: state_list, monitor_redist, help, event_list, component_phaseout component=<componentid>,
--component_list filter=notok, component_test component=1:module:10
-- fc_port_list, fc_connectivity_list logged_in=yes
--servicecenter.xiv.ibm.com is used for remote management:
support_center_connect and support_center_disconnect
--/dev/sda - CF: configuration and root filesystem
--/dev/sdb - 37 GB from each disk. SX - Traces/Events/Cores
--/dev/sdc - 60 GB, 1 volume per interface module (x6)
-- Boot-up time ~4 minutes
-- Upgrade from 10.0 to 10.1 is disruptive
-- Upgrades since 10.1 are concurrent; I/O cutover time for hosts < 13 sec
--With 1 TB drives --> 180 TB raw - (12*1 TB + 3*1 TB + 6.8 TB for SX) => mirrored /2 = 79 TB usable
--With 2 TB drives --> 360 TB raw - (12*2 TB + 3*2 TB + 8 TB for SX) => mirrored /2 = 161 TB usable
-- Data is broken into 1 MB partitions and mirrored such that the two copies of a partition are stored on separate modules.
-- Each logical volume is created from partitions across all drives; entire modules are rebuilt in the event of a failure, and only used capacity is rebuilt.
-- All DDMs take part in the rebuild.
-- storage pool  ==> volume
-- host connection ==> host ==> map volume to host
--The storage space of the IBM XIV storage system is partitioned into storage pools, where each volume belongs to a specific storage pool.
Storage pools provide improved management and regulation of storage space.
--The size of a storage pool is 17 GB to 80,654 GB. The size of a storage pool can be increased, limited only by the free space on the system.
--The size of a storage pool can be decreased, limited only by the space consumed by its volumes and clones.
-- Volumes can be moved between storage pools as long as there is enough free space in the target storage pool.
-- All of the above transactions on storage pools are pure accounting transactions and do not impose any data copying from one storage pool to another.
-- A volume can belong to one storage pool and one consistency group. All volumes in a consistency group belong to the same storage pool. A volume can have multiple clones; a clone is a point-in-time copy of a volume.
-- XIV queue depth 1400 per port and 256 per volume.

IOPS = Queue Depth/Latency
Throughput=IOPS*IO Size ==> Queue Depth/Latency *IO Size
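
A worked example of the two formulas above (the numbers are assumptions: queue depth 64, 1 ms latency, 8 KB I/O size):

# IOPS = queue depth / latency; throughput = IOPS * IO size
awk 'BEGIN {
  qd = 64; lat = 0.001; io_kb = 8
  iops = qd / lat
  printf "IOPS: %d, Throughput: %.0f MB/s\n", iops, iops * io_kb / 1024
}'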

-- Host queue depth: a minimum queue depth of 64 should be used. Performance can be improved with higher values depending on relative workload levels and content.
-- Multipathing: AIX MPIO - currently only the default Path Control Module is supported, in active/passive mode
-- Linux Device Mapper - requires RPMs to be installed and a kernel recompile.
-- Solaris: native MPxIO; Windows: DSM with XIV support; VMware: native active/standby
-- It is not advisable to use two protocols (FCP/iSCSI) to access the same volume; this might be used only to migrate a host from FC to iSCSI. To access different volumes
from the same host through different protocols, use separate host definitions.
-- supports traditional (scsi2) and persistent (scsi3) reserves
--reserves can be displayed and cleared using xcli commands
reservation_list - list volume reservations
reservation_key_list - list reservation keys
reservation_clear - clear reservations of a volume
-- In multi-host environments, reserves can be used to block volume access from other hosts while the volume is updated from the reserving host.
A problem exists when the reserving host crashes while the reserve is still outstanding. The above commands can be used by the customer to analyze the situation and resolve the problem by clearing the reservation.
-- An SSR should never use such commands, as the risk of damaging data integrity is very high.


 

Monday, 2 April 2018

Benchmarks/Metrics


Several tools test system performance. Some are specific to an application or environment, while others are more general. Whenever a tool is used, it is critical to understand for what the tool was designed and how it operates in different environments and with different storage array features such as deduplication. The following are several tools and the associated use case for each:

IOmeter. IOmeter is an I/O subsystem measurement and characterization tool for single and clustered systems. It was originally developed by Intel Corporation. For more information about IOmeter, refer to http://iometer.org.
OLTP-A. OLTP-A consists of a single workload with 8K blocks and a 40%/60% read/write mix with both random and sequential patterns. Though most databases are more read intensive, this benchmark was selected because of the write environment, which in an all-SSD system tends to be the bottleneck. The workload simulates a write-heavy online transaction-processing database and executes both queries and updates to the database during operation.
SLOB (Silly Little Oracle Benchmark). SLOB is a complete toolkit for generating I/O through an Oracle database and is used to analyze the I/O capabilities of the database. SLOB is designed to generate high I/O levels without causing application bottlenecks.
VMmark. VMmark is a free tool from VMware to measure performance for applications running in VMware environments. For example, this tool helps to identify the number of applications that can be supported using a single storage system.
sqlio. sqlio is a tool provided by Microsoft that can also be used to determine the I/O capacity of a configuration.
Vdbench. Vdbench is a command-line tool to generate disk I/O for validating storage performance.
STAC-M3. The STAC Benchmark Council developed the STAC-M3 Benchmarks. These benchmarks are primarily used in the financial community to measure performance associated with financial applications.
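
As a small illustration of how a synthetic workload such as OLTP-A's 8K, 40/60 read/write mix can be described in one of these tools, here is a minimal Vdbench parameter-file sketch (the device path, run time and rates are placeholders; check the Vdbench documentation for exact syntax):

cat > oltp_like.vdb <<'EOF'
sd=sd1,lun=/dev/sdb,openflags=o_direct
wd=wd1,sd=sd1,xfersize=8k,rdpct=40,seekpct=100
rd=rd1,wd=wd1,iorate=max,elapsed=120,interval=5
EOF
./vdbench -f oltp_like.vdb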

Metrics Terminology



The terms used to describe performance can be thought of in three pillars: I/Os per second, latency, and throughput or bandwidth. Depending on the workload, the performance test values obtained are dramatically different by design intent. For example, if the workload is predominantly small random reads, the IOPS values are high, but the throughput values are relatively low in comparison. Conversely, if the workload is mostly large sequential I/O, especially 1M I/O, the measured IOPS values and throughput values are close to the same. Both of these circumstances are expected behaviors. In general terms, the relationship between IOPS and throughput can be expressed as:
Throughput (MBps) = IOPS multiplied by block size (MB)
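
A quick comparison that illustrates the point above (the IOPS figures are assumptions for illustration only):

# Same formula, two workloads: small random 8K I/O vs large sequential 1M I/O.
awk 'BEGIN {
  printf "8K random:      %d IOPS x 0.008 MB = %.0f MBps\n", 100000, 100000 * 0.008
  printf "1M sequential:  %d IOPS x 1.000 MB = %.0f MBps\n",   3000,   3000 * 1.0
}'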

The following sections explore each performance measure in more detail.

I/Os per Second (IOPS)
I/Os per second is the measure of how many input/output operations pass between an initiator (host server) and a target (E-Series storage system) in one second. Based on the protocol that manages the path between the initiator and target, the maximum level of IOPS achievable varies significantly. For example, a 10Gbps iSCSI link cannot process the same level of I/Os as a 32Gb FC link. Factors external to the storage system also introduce overhead and affect IOPS, ranging from network settings and host settings on HBAs, HCAs, and NICs to settings in the OS and many other host-side or application-related issues. Therefore, achievable IOPS for a given system depends on many factors and must be understood holistically for you to achieve the best performance profile for a given environment.

Latency
The second pillar is latency: the time required to move an I/O from the initiator to the target or back in the other direction. All of the same factors that affect IOPS have a related impact on latency. However, the other measures ultimately top out to the limits of the hardware and protocols involved. Latency spikes and grows dramatically as hardware and protocol limits are exceeded. As a result, latency is the measure used to define operating ranges that apply to normal data center workloads.

Throughput/Bandwidth
Throughput is the third pillar of storage performance and is a measure of how much data can pass between host initiators and storage system targets in one second. Like IOPS, throughput is heavily related to the type of workload. For example, from a throughput perspective, it takes many small I/Os to move as much data as one large I/O. As a result, the type of I/O transferred as well as the bandwidth (size of links) and protocol used play significant roles in the amount of data that can be transferred in one second. With this fact in mind, some host OS suppliers have built in tunable settings to allow hosts to use larger I/Os or block sizes when transferring data to and from storage. As a result, just like IOPS, the ability to achieve certain throughput requirements is a multilevel activity and depends on factors both inside the storage system and within each unique customer data center.

Bye...