Adding a CIM Provider VIB file to the SFCB CIMOM on ESXi 5.0/5.1 using esxcli

Background

The ESXi 5.x series of VMware servers is a substantially updated platform compared with the ESX/ESXi 4.x series. Aside from many updates and improvements, one major change in the 5.x series is that the Service Console (which was basically a Linux-based shell around the VMkernel) has been completely removed. In its place, there is an optional stripped-down shell that offers a few basic Unix-like commands (based on the BusyBox package) and minimal command-line support.

The removal of the service console essentially means that installing customized software on the ESXi server itself is substantially restricted. No longer can we merely bundle our own code/libraries and expect them to work on the ESXi server. Instead, they must conform to the VIB file format. Under the hood, a VIB file is simply a zipped-up package (allegedly based on the Debian packaging format) that contains the binaries that we want to install as well as descriptor XML files listing dependencies, the paths where the binaries need to go, etc. In addition, a signed VIB file will contain a certificate identity as well as a unique hash identifying the package. Also, the esxcli command is the recommended way of installing VIB files and of querying hardware and software information on the platform. While it takes some getting used to, it is far more powerful and convenient than earlier avatars of the same command.

Lastly, one big change in the ESXi 5.x series is that SFCB (Small Footprint CIM Broker) is the standard CIMOM that comes pre-installed on the platform. This means that if we want to plug in some CIM providers, it is easiest to plug the SFCB-compliant version of the CIM provider into the SFCB CIMOM. That is the problem solved in this blog post, using a sample CIM provider mundanely entitled “my-cim-provider”.

The script


#!/bin/sh

PROVIDER_VIB=my-cim-provider
CFG_FILE=/etc/sfcb/sfcb.cfg
CFG_BACKUP_FILE=/etc/sfcb/sfcb.cfg_bk

#Check if the hostd daemon is running.
#This is required for the esxcli command.
check_hostd()
{
echo
echo "[Checking for hostd daemon]"

HOSTD_STATUS=`/etc/init.d/hostd status`

if [ "$HOSTD_STATUS" = "hostd is not running." ];then
echo "hostd is not currently running."
echo "Starting hostd as it is required for the installation"

HOSTD_START_STATUS=`/etc/init.d/hostd start`
if [ "$HOSTD_START_STATUS" = "hostd started" ]; then
echo "hostd started successfully"
fi
else
echo "hostd daemon is currently running on the machine"
fi

echo "[Finished checking for hostd daemon]"
echo
}

#Check if the VIB file is already installed on the machine.
check_if_vib_already_installed()
{
echo
echo "[Checking if the CIM Provider is already installed on the machine]"

esxcli software vib list | grep -i $PROVIDER_VIB >/dev/null

if [ "$?" = "0" ]; then
echo "The CIM Provider is already installed."
echo "Would you like to uninstall the VIB file? Enter 'y' or 'n'"
read option
if [ "$option" = "y" ];then
uninstall_vib_file
else
echo "Exiting installation"
exit 0
fi
else
echo "The CIM Provider is currently not installed on the machine"
fi

echo "[Finished checking if the CIM Provider is already installed on the machine]"
echo
}

#Uninstall the existing VIB file, if present.
uninstall_vib_file()
{
echo
echo "[Uninstalling the VIB file: $PROVIDER_VIB]"

/etc/init.d/sfcbd-watchdog stop >/dev/null
esxcli software vib remove --vibname=$PROVIDER_VIB --maintenance-mode -f

if [ "$?" = "0" ];then
echo "VIB file: $PROVIDER_VIB uninstalled successfully."
/etc/init.d/sfcbd-watchdog start >/dev/null
echo "Rebooting machine as it is required by the uninstallation"
reboot -f
else
echo "Failed to uninstall the VIB file: $PROVIDER_VIB"
/etc/init.d/sfcbd-watchdog start >/dev/null
exit 1
fi
}

#Edit the SFCB config file with desired values for
#CIMOM parameters.
modify_sfcb_cfg_file()
{
echo
echo "[Updating the file: $CFG_FILE]"

echo "Backing up the existing config file first..."
#Backup the original sfcb.cfg file
cp -f $CFG_FILE $CFG_BACKUP_FILE
echo "Finished backing up the config file to $CFG_BACKUP_FILE"

#Values to be changed
doBasicAuth=false
enableHttp=true
httpLocalOnly=false
sslClientCertificate=ignore
httpProcs=10

#Set the values in the config file
sed -i "s/doBasicAuth:.*/doBasicAuth:   $doBasicAuth/g" $CFG_FILE
sed -i "s/enableHttp:.*/enableHttp:   $enableHttp/g" $CFG_FILE
sed -i "s/sslClientCertificate:.*/sslClientCertificate:   $sslClientCertificate/g" $CFG_FILE
sed -i "s/httpLocalOnly:.*/httpLocalOnly:   $httpLocalOnly/g" $CFG_FILE
sed -i "s/httpProcs:.*/httpProcs:   $httpProcs/g" $CFG_FILE

#Restart the SFCB service
/etc/init.d/sfcbd-watchdog restart >/dev/null

echo "[Finished updating the config file: $CFG_FILE]"
echo
}

#In the case the user wants to reboot the machine later.
reboot_canceled()
{
echo "You have decided to cancel the machine reboot. Please reboot the machine to complete the installation"
echo "[Installation of CIM Provider complete]"
exit 0
}

#The main installation logic.
install_vib_file()
{
echo
VIB_FILE=`pwd`/qlogic-cna-provider.vib

echo "[Installing the QLogic Provider VIB file: $VIB_FILE]"
esxcli software vib install -v file://$VIB_FILE -f --maintenance-mode --no-sig-check
echo "[Finished installing the QLogic Provider VIB file: $VIB_FILE]"

#Update the SFCB config file with specific values required by IIAS
modify_sfcb_cfg_file

#reboot the machine - required after installation
trap 'reboot_canceled' INT
echo "Rebooting the machine to complete installation. Press <Ctrl+C> to cancel reboot in "

for i in 10 9 8 7 6 5 4 3 2 1
do
echo $i seconds...
sleep 1
done

echo "[Rebooting machine NOW. Installation of CIM Provider is complete]"
reboot -f
}

#Main script starts here
echo "[Starting installation of CIM Provider]"

check_hostd
check_if_vib_already_installed
install_vib_file

Explanation

The code is pretty straightforward. Thankfully, basic shell scripting is still allowed on the ESXi 5.x console. However, please note that in order to use the command line, you need to enable the SSH service on the ESXi 5.x server using the vSphere Client (Configuration->Security Profile).

The first thing we need to do is to check whether the hostd daemon is running. This is required for the esxcli command to work. I found this out the hard way, since it had been some time since I had worked with the ESXi platform (the last one being ESXi 4.1), and documentation for the ESXi platform has been meager at best, and even more so for the 5.x series. If the hostd daemon is not running, we start it up.
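As a minimal sketch of a slightly more tolerant check (an assumption, not what the script above does): instead of comparing the whole status line against one exact string, the check can look for the phrase "not running". The helper name hostd_is_running is hypothetical, and the exact wording of the status message may differ between ESXi builds.

```shell
#!/bin/sh
# Hypothetical helper: takes the text printed by `/etc/init.d/hostd status`
# and succeeds (returns 0) unless that text contains "not running".
hostd_is_running()
{
    case "$1" in
        *"not running"*) return 1 ;;
        *)               return 0 ;;
    esac
}

# Usage on the ESXi console (commented out so the sketch has no side effects):
# if ! hostd_is_running "`/etc/init.d/hostd status`"; then
#     /etc/init.d/hostd start
# fi
```

Taking the status text as an argument keeps the decision logic separate from the command that produces it, so it can be tried out on any machine.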

The second thing we do is to check if the VIB file (given by the variable, PROVIDER_VIB) is already installed on the machine. In this specific case, we assume that update is not possible, and we need to uninstall the existing package before we can proceed with the installation of a possibly newer version of the same package. If this is not true, then this check can be skipped, and an update command invoked instead of the normal installation command, later on. One additional check that might possibly be done here is to check for the package version, if that is relevant to your specific needs. In this case, if the VIB file is already installed, we need to uninstall it first, and so we provide the user with that option.
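The version check mentioned above could be sketched roughly as follows. This assumes that the output of esxcli software vib list is a whitespace-separated table with the VIB name in the first column and its version in the second, and that awk is available in the ESXi shell; the helper name vib_version is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `esxcli software vib list` output on stdin and
# prints the second column (assumed to be the version) of the row whose
# first column (assumed to be the name) equals $1. Prints nothing otherwise.
vib_version()
{
    awk -v name="$1" '$1 == name { print $2 }'
}

# Usage on the ESXi console (commented out):
# INSTALLED=`esxcli software vib list | vib_version my-cim-provider`
# [ -n "$INSTALLED" ] && echo "Installed version: $INSTALLED"
```

Unlike grep -i, this matches the name column exactly, so a VIB whose name merely contains the provider name is not mistaken for it.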

If the user has chosen to proceed with the uninstallation of the existing VIB file, we need to stop the SFCB service (via its watchdog), and then invoke the command to uninstall the VIB file:


esxcli software vib remove --vibname=$PROVIDER_VIB --maintenance-mode -f

Different VIB files have different requirements when it comes to uninstallation or installation. For our CIM provider, we need to put the ESXi machine into maintenance mode, and we also need to forcefully uninstall it, if need be (using the -f flag). Also, in this case, we need to reboot the machine after the uninstallation. This need not be the case for other VIB files.

After the uninstallation is done (or if the VIB file was not present on the machine in the first place), we proceed with the actual installation of the VIB file. For this, we set up the variable, VIB_FILE, to contain the absolute path to the CIM Provider VIB file. In this case, we assume that the VIB file is in the same directory as the installer script. If this is not the case, you can set up the path to the VIB file accordingly, the only requirement being that it must be the absolute path to the VIB file, anywhere visible to the esxcli command (i.e., the ESXi 5.x console). The command used for the installation of the package is:


esxcli software vib install -v file://$VIB_FILE -f --maintenance-mode --no-sig-check

Again, we pass the --maintenance-mode flag, which lets the operation proceed as if the host were in maintenance mode, and then additionally we ask the installation to skip the signature check on the package using the --no-sig-check flag (if the package is signed). This is not good practice, but it will work in case there are some problems with the signature and we still want to proceed with the installation. Finally, we force the installation using the -f flag.
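Since the path handed to esxcli has to be absolute, a small helper can turn whatever path the installer was given into an absolute one. This is only a sketch: abs_path is a hypothetical name, and it assumes the file's directory exists and that the shell supports the $(...) form of command substitution.

```shell
#!/bin/sh
# Hypothetical helper: prints the absolute path of the file given as $1 by
# resolving its directory part with cd/pwd and re-attaching the file name.
abs_path()
{
    echo "$(cd "$(dirname "$1")" && pwd)/$(basename "$1")"
}

# Usage in the installer (commented out): locate the VIB next to the script
# itself rather than in the current working directory.
# VIB_FILE=$(abs_path "$(dirname "$0")/my-cim-provider.vib")
```

Resolving against the script's own location, rather than `pwd`, means the installer still finds the VIB when it is launched from another directory.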

Now comes the interesting part. For our CIM Provider, my-cim-provider, we need to modify some of the default values of the SFCB CIMOM. This configuration is located in the /etc/sfcb/sfcb.cfg file (given by the variable, CFG_FILE). The specific changes that we want to make are: disable basic authentication (doBasicAuth=false), enable the HTTP port (enableHttp=true), enable the HTTP port for non-local connections (httpLocalOnly=false), ignore the SSL client certificate (sslClientCertificate=ignore) since we don’t want to use SSL, and finally increase the number of HTTP processes used by the SFCB CIMOM from the default 4 to a healthy 10 (httpProcs=10). For your specific needs, different parameters might need different values, but the same approach applies. Note that any time there is a change to the SFCB configuration, we need to restart the SFCB daemon.

First off, we back up the existing SFCB configuration file, so that the user can restore the original settings in case of any issues. Then we use sed to update the required parameters to their new values. A sample command is:


sed -i "s/doBasicAuth:.*/doBasicAuth:   $doBasicAuth/g" $CFG_FILE

What this line basically means is: replace the string matched by the regex (doBasicAuth:.*), that is, the parameter name and whatever value follows it on the line, with the parameter name followed by the new value ($doBasicAuth). For cosmetic reasons, we include as many spaces before the value, $doBasicAuth (which is “false”), as were in the original SFCB configuration file. The g flag instructs sed to replace every match on a line rather than just the first one; since sed already processes every line of the file, all occurrences of the parameter are updated. Most machines will have only one instance of each parameter, so this is more of a safety measure to ensure that even if there are multiple instances of the same parameter in the file, the values are updated consistently. sed is a powerful tool that is often overlooked in favor of other tools such as awk and Perl, but for in-place string replacement in files it is hard to beat.

Finally, we restart the SFCB service. Note that I consistently redirect the standard output of the commands to /dev/null (error messages on standard error will still be shown unless 2>&1 is added as well). This keeps the installer’s output more or less the same across platforms by not echoing the messages from the invoked commands. While seeing the verbose output of the commands might be useful for debugging during development, it is hardly fair to overload the customer with such extraneous messages. Customize it as per your own needs.
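The five sed calls follow the same pattern, so they could be folded into one small function. The sketch below is an assumption about how one might do that, not part of the original installer: set_cfg_value is a hypothetical name, the pattern is anchored to the start of the line so commented-out copies of a parameter are left alone, and the key and value are assumed to contain no characters that are special to sed (such as / or &).

```shell
#!/bin/sh
# Hypothetical helper: set_cfg_value KEY VALUE FILE
# Rewrites every line of FILE that starts with "KEY:" to "KEY:   VALUE",
# editing the file in place (requires a sed that supports -i).
set_cfg_value()
{
    sed -i "s/^$1:.*/$1:   $2/" "$3"
}

# Usage in the installer (commented out):
# set_cfg_value doBasicAuth false /etc/sfcb/sfcb.cfg
# set_cfg_value httpProcs 10 /etc/sfcb/sfcb.cfg
```

Trying the function on a scratch copy of sfcb.cfg first is a cheap way to confirm that the pattern matches the way the parameters are actually written on a given host.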

Finally, we need to reboot the machine after the installation of the VIB file (again, this may not be the case for your own VIB file). I again provide the user the option to reboot the machine at a later stage. For this, I make use of a nifty feature of various shells that is often under-appreciated: traps. The general form of the trap command is:


trap '<your logic/function call>' <SIGNAL, such as SIGINT or simply, INT>

For this specific script, I instruct the user to press <Ctrl+C> within 10 seconds to abort the reboot. This sends a SIGINT (or INT for short) signal, which the trap routes to the reboot_canceled function, which informs the user appropriately and exits the installer script normally. In case the signal is not received within 10 seconds, the machine is rebooted.
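A variation on the same idea, sketched here under my own assumptions, is to have the trap handler set a flag instead of exiting, so that the script can decide what to do once the countdown ends. The names CANCELLED, on_cancel and countdown are hypothetical.

```shell
#!/bin/sh
# Sketch: a countdown that stops early once a flag is set by a signal handler.
CANCELLED=0

on_cancel()
{
    CANCELLED=1
}

# Run on_cancel when the user presses Ctrl+C (SIGINT).
trap 'on_cancel' INT

# countdown N: prints "N seconds...", "N-1 seconds...", and so on, one line
# per second, and stops as soon as CANCELLED becomes 1.
countdown()
{
    i=$1
    while [ "$i" -gt 0 ] && [ "$CANCELLED" -eq 0 ]; do
        echo "$i seconds..."
        sleep 1
        i=$((i - 1))
    done
}

# Usage (commented out so the sketch never reboots anything):
# countdown 10
# if [ "$CANCELLED" -eq 1 ]; then echo "Reboot cancelled"; else reboot -f; fi
```

The flag-based form keeps the decision in one place after the loop, which is handy if the installer needs to do some cleanup before it exits.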

After the reboot, the user can then check the status of the VIB file to ensure that it has been installed successfully. It can be done with the following command (which, arguably, can be put in its own script and then executed by the user to check the status of the VIB installation):


esxcli software vib list | grep -i my-cim-provider

So that’s it – as simple as it can get on the new ESXi 5.x platform!

Creating a service and a service watchdog using simple shell scripts in Linux

Recently at work I was given a feature to support the customization and installation of OpenPegasus CIMOM (CIM Server) on Linux machines in binary mode. What this means is that instead of building from source code on the Linux machines (as would be the sane thing to do in view of the huge compatibility issues), it was decided to create the binaries on my development box, and then bundle only the required portions as part of an installation script. The main reason for this was the fact that we had a dependency on an external CIM Provider (QLogic), who obviously provided us only with the binaries built on a base Linux machine (specifically, RHEL 5.8).

There were many interesting problems that arose due to library dependencies, OS/ABI incompatibilities, and GCC/GLIBC dependencies. I also learned a lot about the whole process of working with third-party vendors. I plan to cover all of them in a series of upcoming blog posts. For now, however, I would like to post some useful information about how I helped the installer team enhance their installation scripts by creating a service and a service watchdog for the OpenPegasus CIMOM bundled with the QLogic provider. For representative purposes, I will use the term “My Service” to refer to the hypothetical service. I will also provide the main logic of the relevant scripts that I wrote for the purpose without violating any NDA restrictions of my workplace! So let’s get right on to it then.

Creating a service in Linux using a shell script

Creating a service in Linux is a pretty simple task. You really just add execution privileges to the shell script, drop it into the /etc/init.d folder, and then invoke a series of commands. The code for the service that installs the OpenPegasus (version 2.11.0 used) CIMOM with the bundled QLogic CIM Provider binaries is listed as follows:

#!/bin/sh

# chkconfig: 2345 55 10
# description:My service
# processname:myservice

usage() {
        echo "service myservice {start|stop|status}"
        exit 0
}

export PEGASUS_ROOT=/opt/pegasus2.11.0
export PEGASUS_HOME=$PEGASUS_ROOT
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PEGASUS_ROOT/lib
export PEGASUS_PLATFORM=LINUX_IX86_GNU
export PATH=$PATH:$PEGASUS_ROOT/bin
export PEGASUS_HAS_SSL=yes

case $1 in

	start) $PEGASUS_ROOT/bin/cimserver
		;;
	stop) $PEGASUS_ROOT/bin/cimserver -s
		;;
	status) if [ -n "`pidof $PEGASUS_ROOT/bin/cimserver`" ]; then
			echo "Running"
		else
			echo "Not running"
		fi
		;;
	*) usage
		;;
esac

Explanation:

We start off with the usual shebang followed by the path to the “sh” executable (#!/bin/sh). The following lines are quite interesting, and worth explaining in a bit more detail. The # chkconfig: 2345 55 10 line merely informs the OS that we want this service script to be activated for Linux Run Levels 2, 3, 4 and 5. Check out “Linux Run Levels” for more information on Run Levels in Linux. The parameter 55 refers to the priority to be assigned for service startup (we usually want this to be a moderately high value, since services with lower numbers are started first) while the last parameter 10 refers to the service stop priority (this can be a moderately low value). The specific values for these parameters will depend on your service’s usage patterns. The # description line is optional, and is used to give a descriptive name to the service. The # processname line is the name that you will use for your service, and is usually the same as your script name.
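To make the three fields of the chkconfig header concrete, here is a small sketch that pulls them out of an init script. The function name chkconfig_header is hypothetical, and the sketch assumes the header is written as "# chkconfig: RUNLEVELS START STOP" (with or without the space after the #).

```shell
#!/bin/sh
# Hypothetical helper: prints "RUNLEVELS START STOP" taken from the first
# chkconfig header line found in the init script given as $1.
chkconfig_header()
{
    awk '/^# *chkconfig:/ {
             sub(/^# *chkconfig: */, "")   # drop the "# chkconfig:" prefix
             print $1, $2, $3              # run levels, start and stop priority
             exit
         }' "$1"
}

# Usage (commented out):
# chkconfig_header /etc/init.d/myservice
```

For the service script above this would print the run levels 2345 followed by the start priority 55 and the stop priority 10.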

The rest of the logic is pretty simple: I want to support three options – start, stop, and status. For this purpose, I export the relevant environment variables in this script itself so that it does not pollute any other namespace (you could export them in ~/.profile, or ~/.bash_profile, or ~/.bashrc for instance if you want them to be globally available). Then I merely put the logic to start/stop/query the cimserver executable, which is the executable that actually represents the OpenPegasus CIMOM. The core logic of this service script is the command pidof $PEGASUS_ROOT/bin/cimserver, which returns the PID of the specified executable in the current environment.

To install this script as a service, the following commands are performed:

#cp myservice /etc/init.d
#chmod +x /etc/init.d/myservice
#chkconfig --add myservice
#chkconfig --level 2345 myservice on

The #chkconfig --add myservice command is the one that actually adds your script as a Linux service. For this, the script must be executable (chmod +x might be too permissive, feel free to choose a lower level of execution permission), and must be present in /etc/init.d (or at least have a soft link to it in this directory). Then, finally, the #chkconfig --level 2345 myservice on command makes your service start automatically at system boot-up. This ensures that your service is always on so long as your Linux box is up. Neat!

But what happens if the service crashes while the machine is still up? It certainly will not restart itself. For this purpose, I decided to add a service watchdog for “myservice”, as shown in the following section.

Creating a service watchdog in Linux using a shell script

The service watchdog’s responsibility is to monitor the main service (say, every minute or so), check its status, and then restart it if it is not running. This ensures a maximum downtime of a minute (or whatever value you chose) for your service. It is quite a nifty feature indeed. This is similar to the scenario where, in Windows, you would set the service properties to “Automatically Restart”. The code for the watchdog for “myservice” is given below:

#!/bin/sh

#chkconfig: 2345 90 10
#description: watchdog for myservice
#processname: myservice-watchdog

MYSERVICE_PID=`pidof /opt/pegasus2.11.0/bin/cimserver`

#Start the service if the executable is not running.
check_myservice() {
        if [ -z "$MYSERVICE_PID" ];then
                service myservice start
        fi
}

usage() {
	echo "myservice-watchdog {start|stop|status}"
	exit 0
}

case "$1" in
	#No option (the cron invocation): only run the check.
	"") check_myservice
		;;
	start ) if [ -z "$MYSERVICE_PID" ];then
		service myservice start
		else
			echo "myservice is already running"
		fi
		;;
	stop ) if [ -n "$MYSERVICE_PID" ];then
		service myservice stop
		else
			echo "myservice is already stopped"
		fi
		;;
	status) if [ -z "$MYSERVICE_PID" ];then
			echo "myservice is not running"
		else
			echo "myservice is running"
		fi
		;;
	*) usage
		;;
esac

Explanation:

The logic for the watchdog might seem curiously similar to that of the service itself, and that is right. There were a number of reasons why I chose this approach:

  • The idea is to always monitor the state of the executable itself, and not the service. This ensures that if, for some reason, the service script returns spurious data, the watchdog can avoid spawning multiple instances of the executable, which would most likely fail anyway.
  • The watchdog is also installed as a service. This is not usually required, but in this case it needs to support the following options: start, stop, and status. In addition, the check_myservice function is the one that is used to monitor the service itself (actually the executable).
  • The watchdog is triggered to be run every minute using crontab. This will only run the check_myservice function, whereas any direct invocation of the watchdog will have to supply any one of the following options: start/stop/status.
  • The idea is to always handle the executable indirectly via the watchdog (start/stop/status) rather than directly through the service itself, even if that is also possible. This is more of a best practice than a strict requirement.

The watchdog is installed as a service using the following commands:

#cp myservice-watchdog /etc/init.d
#chmod +x /etc/init.d/myservice-watchdog
#chkconfig --add myservice-watchdog
#chkconfig --level 2345 myservice-watchdog on

The explanation for the steps is the same as that for the installation of the main service itself. It is also worth noticing that the watchdog is also installed as a daemon.

Then we need to create a cron job that will trigger the check_myservice function of the watchdog every minute. For this, the best option (since we are triggering the whole process through an installation script) is to create a cron job in a text file, place that file in the /etc/cron.d directory (where system cron jobs can be placed), and then restart the crond daemon process to make the new cron job visible to the OS, as follows:

#echo "* * * * * root /etc/init.d/myservice-watchdog" > my.cron
#echo "" >> my.cron
#cp my.cron /etc/cron.d
#service crond restart

And that’s it! Note that, unlike a per-user crontab, an entry in /etc/cron.d names the user to run the job as (root here) between the schedule and the command. The other important bit to remember is the #echo "" >> my.cron line: cron expects a newline after the last cron job in the file. If it is missing, cron will not fail or throw an error, but will silently avoid triggering the job! Trust me, this is mental agony that you definitely do not want to experience. The cron job itself is pretty simple: call the watchdog every minute (read up on the syntax and semantics of cron jobs in Linux if you are confused by that line).
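Since a missing trailing newline fails silently, an installer can check for it explicitly before copying the file into place. The following is a minimal sketch; ends_with_newline is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if file $1 is non-empty and its last byte
# is a newline character.
ends_with_newline()
{
    # tail -c 1 prints the last byte; wc -l counts it only if it is a newline.
    [ -s "$1" ] && [ $(tail -c 1 "$1" | wc -l) -eq 1 ]
}

# Usage in the installer (commented out). The "root" user field is the form
# that entries in /etc/cron.d expect.
# printf '%s\n' "* * * * * root /etc/init.d/myservice-watchdog" > my.cron
# ends_with_newline my.cron || echo "" >> my.cron
```

A check like this costs one line in the installer and removes the guesswork if the cron file is ever generated some other way.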

I hope that this serves a useful purpose for anyone that is planning to explore creating services and watchdogs using shell scripts in Linux.