Status Overview

The health of the system is monitored using a collection of JumpScripts, documented in Monitoring the System Health.

The Status Overview page gives you an immediate view of the health of the system.

You can access the Status Overview page in two ways:

  • By clicking the green/orange/red status dot in the top navigation bar

  • Via the left navigation bar: under Grid Portal, click Status Overview

Under Process Status you get an overview of the system health, based on the most recent health check.

Clicking Run Health Check schedules a new health check to start immediately.

Clicking any of the Details links brings you to the Node Status page, which provides detailed health information for the selected node.

Clicking Run Health Check on Node first asks for your confirmation.

Once you confirm, all health check jobs (JumpScripts) start, which you can verify on the Jobs page.

On the Node Status page you can see more details by clicking the various section titles. Here you can also start the health check for any of the items listed under each section.

Depending on the type of node, the following sections are available:

Section Master Node CPU Node Storage Node
AYS Process X X X
Databases X
Disks X X X
JSAgent X X X
Network X
Orphanage X X
Redis X X X
System Load X X X
Temperature X X X
Workers X X X
Hardware X X
Stack Status X
Deployment Test X
OVS Services X

AYS Process

Databases

Disks

JSAgent

Network

Orphanage

Depending on the node, you will see information about "orphan" disks or "orphan" virtual machines.

In the case of the master node, this looks like this:

In the case of a CPU node you get an overview of all "orphan" virtual machines. These are virtual machines that are marked as destroyed in the Grid and Cloud Broker Portal but still exist on a physical node. This is unwanted, so "orphan" virtual machines get removed as part of the automatic health checks.

To manually remove an "orphan" virtual machine, run the following commands at the command line of the physical machine where the "orphan" virtual machine exists:

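# Name of the "orphan" virtual machine (example value)
vm="vm-8"
# Collect the disk image paths from the domain XML before the definition is removed
disks="$(virsh dumpxml "$vm" | grep 'source file' | cut -d "'" -f 2)"
# Stop the virtual machine and remove its definition
virsh destroy "$vm"; virsh undefine "$vm"
# Delete the disk images ($disks is left unquoted so that each path becomes its own argument)
rm $disks
rm -rf "/mnt/vmstor/$vm"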
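
Before deleting anything, it can help to preview which disk files the commands above would remove. The following is a minimal sketch and not part of the product: the helper name list_vm_disks is hypothetical, and it simply reuses the same grep/cut pipeline on domain XML read from standard input, so it assumes the single-quoted "source file" attributes that the pipeline above already relies on.

```shell
# Hypothetical helper (not part of the platform): prints the disk image paths
# found in libvirt domain XML read from standard input, one path per line.
# It uses the same grep/cut pipeline as the removal commands above.
list_vm_disks() {
    grep 'source file' | cut -d "'" -f 2
}

# Usage on a physical node (assumes vm-8 is the "orphan" virtual machine):
#   virsh dumpxml vm-8 | list_vm_disks
```

Reviewing this list first lets you confirm that only files belonging to the "orphan" virtual machine are about to be removed.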

Redis

System Load

Temperature

Workers

Hardware

Node Status

Deployment Test

OVS Services
