Creating a service and a service watchdog using simple shell scripts in Linux


Recently at work I was given a feature to support the customization and installation of OpenPegasus CIMOM (CIM Server) on Linux machines in binary mode. What this means is that instead of building from source code on the Linux machines (as would be the sane thing to do in view of the huge compatibility issues), it was decided to create the binaries on my development box, and then bundle only the required portions as part of an installation script. The main reason for this was the fact that we had a dependency on an external CIM Provider (QLogic), who obviously provided us only with the binaries built on a base Linux machine (specifically, RHEL 5.8).

There were many interesting problems that arose due to library dependencies, OS/ABI incompatibilities, and GCC/GLIBC dependencies. I also learned a lot about the whole process of working with third-party vendors. I plan to cover all of them in a series of upcoming blog posts. For now, however, I would like to post some useful information about I helped the installer team enhance their installation scripts by creating a service and a service watchdog for the OpenPegasus CIMOM bundled with the QLogic provider. For representative purposes, I will use the term “My Service” to refer to the hypothetical service. I will also provide the main logic of the relevant scripts that I wrote for the purpose without violating any NDA restrictions of my workplace! So let’s get right on to it then.

Creating a service in Linux using a shell script

Creating a service in Linux is a pretty simple task. You really just add execution privileges to the shell script, drop it into the /etc/init.d folder, and then invoke a series of commands. The code for the service that installs the OpenPegasus (version 2.11.0 used) CIMOM with the bundled QLogic CIM Provider binaries is listed as follows:

#!/bin/sh

# chkconfig: 2345 55 10
# description:My service
# processname:myservice

usage() {
        echo "service myservice {start|stop|status|"
        exit 0
}

export PEGASUS_ROOT=/opt/pegasus2.11.0
export PEGASUS_HOME=$PEGASUS_ROOT
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PEGASUS_ROOT/lib
export PEGASUS_PLATFORM=LINUX_IX86_GNU
export PATH=$PATH:$PEGASUS_ROOT/bin
export PEGASUS_HAS_SSL=yes

case $1 in

	start) $PEGASUS_ROOT/bin/cimserver
		;;
	stop) $PEGASUS_ROOT/bin/cimserver -s
		;;
	status) if [ `pidof $PEGASUS_ROOT/bin/cimserver` ]; then
			echo "Running"
		else
			echo "Not running"
		fi
		;;
	*) usage
		;;
esac

Explanation:

We start off with the usual shebang followed by the path to the “sh” executable (#!/bin/sh). The following lines are quite interesting, and worth explaining in a bit more detail. The # chkconfig 2345 90 10 line merely informs the OS that we want this service script to be activated for Linux Run Levels 2,3,4 and 5. Check out “Linux Run Levels” for more information on Run Levels in Linux. The parameter 90 refers to the priority to be assigned for service startup (we usually want this to be a moderately high value) while the last parameter 10 refers to the service stop priority (this can be a moderately low value). The specific values for this parameter will depend on your service’s usage patterns. The #description line is optional, and is used to give a descriptive name to the service. The #processname line is the name that you will use for your service, and is usually the same as your script name.

The rest of the logic is pretty simple: I want to support three options – start, stop, and status. For this purpose, I export the relevant environment variables in this script itself so that it does not pollute any other namespace (you could export them in ~/.profile, or ~/.bash_profile, or ~/.bashrc for instance if you want them to be globally available). Then I merely put the logic to start/stop/query the cimserver executable, which is the executable that actually represents the OpenPegasus CIMOM. The core logic of this service script is the command pidof $PEGASUS_ROOT/bin/cimserver, which returns the PID of the specified executable in the current environment.

To install this script as a service, the following commands are performed:

#cp myservice /etc.init.d
#chmod +x myservice
#chkconfig --add myservice
#chkconfig --level 2345 myservice on

The #chkconfig –add myservice is the command that actually adds your script as a Linux service. For this, the script must be executable (chmod +x might be too permissive, feel free to choose a lower level of execution permission), and must be present in /etc/init.d (or at least a soft-link created to the file in this directory). Then, finally, the #chkconfig –level 2345 myservice on command makes your service automatically start with system boot-up. This ensures that your service is always on so long as your Linux box is up. Neat!

But what happens if the service crashes while the machine is still up? It certainly will not restart itself. For this purpose, I decided to add a service watchdog for “myservice”, as shown in the following section.

Creating a service watchdog in Linux using a shell script

The service watchdog’s responsibility is to monitor the main service (say, every minute or so), check its status, and then restart it if it is not running. This ensures a maximum downtime of a minute (or whatever value you chose) for your service. It is quite a nifty feature indeed. This is similar to the scenario where, in Windows, you would set the service properties to “Automatically Restart”. The code for the watchdog for “myservice” is given below:

#!/bin/sh

#chkconfig: 2345 90 10
#description: watchdog for myservice
#processname: myservice-watchdog

MYSERVICE_PID=`pidof /opt/pegasus2.11.0/bin/cimserver`

check_myservice() {
        if [ -z $MYSERVICE_PID ];then
                service myservice start
        fi
}

check_myservice

usage() {
	echo "myservice-watchdog {start|stop|status}"
	exit 0
}

case $1 in
	start ) if [-z $MYSERVICE_PID ];then
		service myservice start
		else
			echo "myservice is already running"
		fi
		;;
	stop ) if [ -n $MYSERVICE_PID ];then
		service myservice stop
		else
			echo "myservice is already stopped"
		fi
		;;
	status) if [ -z $MYSERVICE_PID ];then
			echo "myservice is not running"
		else
			echo "myservice is running"
		fi
		;;
	*) usage
		;;
esac

Explanation:

The logic for the watchdog might seem curiously similar to that of the service itself, and that is right. There were a number of reasons why I chose this approach:

  • The idea is to always monitor the state of the executable itself, and not the service. This ensures that if, for some reason, the service script returns spurious data, the watchdog can avoid spawning multiple instances of the executable, which would most likely fail anyway.
  • The watchdog is also installed a service.  This is not usually required, but in this case it needs to support the following options: start, stop, and status. In addition, the check_myservice function is the one that is used to monitor the service itself (actually the executable).
  • The watchdog is triggered to be run every minute using crontab. This will only run the check_myservice function, whereas any direct invocation of the watchdog will have to supply any one of the following options: start/stop/status.
  • The idea is to always handle the executable indirectly via the watchdog (start/stop/status) rather than directly through the service itself, even if that is also possible. This is more of a best practice than a strict requirement.

The watchdog is installed as a service using the following commands:

#cp myservice /etc.init.d
#chmod +x myservice
#chkconfig --add myservice-watchdog
#chkconfig --level 2345 myservice-watchdog on

The explanation for the steps is the same as that for the installation of the main service itself. It is also worth noticing that the watchdog is also installed as a daemon.

Then we need to create a cron job that will trigger the check_myservice function of the watchdog every minute. For this, the best option (since we are triggering the whole process through an installation script) is to create a cron job in a text file, place that file in the /etc/cron.d directory (where user cron jobs can be placed), and the restarting the crond daemon process to make the new cron job visible to the OS, as follows:

#echo "* * * * * /etc/init.d/myservice-watchdog" > my.cron
#echo "" >> my.cron
#cp my.cron /etc/cron.d
#service crond restart

And that’s it! The most important bit to remember here is that the #echo “” >> my.cron line is required because of a bug in the way crontab behaves – it expects a newline or an empty line after the last cron job in the file. If it is missing, crontab will not fail, or throw an error, but silently avoid triggering the job! Trust me, this is mental agony that you definitely do not want to experience. The cronjob itself is pretty simple – simply call the watchdog every minute (read up on the syntax and semantics of cron jobs in Linux if you are confused by that line).

I hope that this serves a useful purpose for anyone that is planning to explore creating services and watchdogs using shell scripts in Linux.

Advertisements
Creating a service and a service watchdog using simple shell scripts in Linux

4 thoughts on “Creating a service and a service watchdog using simple shell scripts in Linux

  1. sandeep says:

    I created a service, which is used to restart apache tomcat server when ever it is stopped. Its working fine. But whenever the service stops, automatically apache tomcat is getting stopped, service is killing the port on which tomcat is running. Whats the reason ? Is there any solution ?

    Linux distribution : ubuntu.

    Like

    1. Hi Sandeep,

      Could you post your service code that monitors the Apache Tomcat Server process? Unless this service explicitly kills the Tomcat process when it itself is stopped (and I wouldn’t imagine why this would be so), it is very strange behavior indeed. Maybe I can take a look at the service code, and try to figure out if something is the matter with it!

      Like

  2. @Brian,

    Yes, you’re right. I have not included the actual user invocations of the service/watchdog commands. The “usage” portion of the code in both the Service code as well as the Watchdog code shows how to use these commands. For instance, to check the status using the watchdog service, the user would simply need to type:

    $ myservice-watchdog status

    Likewise for start and stop:

    $myservice-watchdog stop

    $myservice-watchdog start

    Like

Speak your mind!

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s