
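1. Introduction
This article describes how to use Nagios to monitor the running state of Weblogic services, mainly the state of the Weblogic Server and of the Weblogic JDBC Pool. The stock Nagios plugins do not include Weblogic monitoring, so we write our own script according to the Nagios Plugin API to extend them and provide the checks we need. The Weblogic runtime state information is obtained through JMX.
This article draws on the Nagios Plugin section of the official Nagios 3 documentation and on the JMX and command-line sections of the official Weblogic documentation. The Weblogic version used is 8.14.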
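2. Nagios Plugin API Overview
A Nagios plugin, whether it is a script (shell, Perl, and so on) or an executable compiled from C, must do at least two things:
1. Exit with a return value.
2. Write at least one line of text to standard output (STDOUT).
Return values are defined as follows: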
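Plugin Return Code    Service State    Host State
0                     OK               UP
1                     WARNING          UP or DOWN/UNREACHABLE*
2                     CRITICAL         DOWN/UNREACHABLE
3                     UNKNOWN          DOWN/UNREACHABLE
* A return code of 1 maps to DOWN/UNREACHABLE only when the use_aggressive_host_checking option is enabled; otherwise it maps to UP.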
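The output must contain at least one line of text, which mainly reports the state of the monitored application or service.
For example: DISK OK - free space: / 3326 MB (56%);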
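The two requirements above can be illustrated with a minimal sketch of a plugin. The function name `check_free_pct` and the thresholds (warning below 20% free, critical below 10%) are hypothetical and chosen only for illustration; a real plugin would measure the value itself rather than take it as an argument.

```shell
#!/bin/sh
# Minimal Nagios plugin sketch (illustrative only).
# The return codes follow the table above.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# check_free_pct PERCENT_FREE
# Prints exactly one status line and returns the matching state code.
# The thresholds 20 and 10 are assumptions made for this example.
check_free_pct() {
    pct="$1"
    case "$pct" in
        ''|*[!0-9]*)
            echo "DISK UNKNOWN - invalid value: '$pct'"
            return $STATE_UNKNOWN
            ;;
    esac
    if [ "$pct" -lt 10 ]; then
        echo "DISK CRITICAL - free space: ${pct}%"
        return $STATE_CRITICAL
    elif [ "$pct" -lt 20 ]; then
        echo "DISK WARNING - free space: ${pct}%"
        return $STATE_WARNING
    fi
    echo "DISK OK - free space: ${pct}%"
    return $STATE_OK
}

# A real plugin would end with something like:
#   check_free_pct "$measured_value"; exit $?
```

Nagios reads the exit status to decide the state and shows the printed line as the status text, so the two should always agree.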
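3. How Weblogic Monitoring Is Implemented
We obtain the Weblogic runtime state from the command line by calling Weblogic's weblogic.Admin class. This class is powerful: it can be used both to manage and to configure Weblogic.
The commonly used commands are introduced below.
1. Get the server's runtime state
2. Get the JDBC Pool's runtime state
Replace the variables in the commands with the actual values of your environment.
${URL}          the Weblogic URL, for example t3://192.168.1.2:7002
${USER_NAME}    user name
${PASS_WORD}    password
${DOMAIN_NAME}  name of the Weblogic domain, for example mydomain
${SERVER_NAME}  server name
${POOL_NAME}    JDBC Pool name
Before running the commands above, set JAVA_HOME, add $JAVA_HOME/bin to PATH, and add Weblogic's weblogic81/server/lib/weblogic.jar to CLASSPATH.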
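Both commands can be read off the script in section 4: each calls weblogic.Admin with `get -pretty -mbean` and differs only in the MBean object name. The sketch below builds the two object names and the full command lines. The helper names `server_mbean`, `jdbc_mbean`, and `wls_get_cmd` are hypothetical, and the functions only print the commands rather than run them.

```shell
#!/bin/sh
# server_mbean DOMAIN SERVER
# Prints the object name of the ServerRuntime MBean.
server_mbean() {
    printf '%s:Location=%s,Name=%s,Type=ServerRuntime\n' "$1" "$2" "$2"
}

# jdbc_mbean DOMAIN SERVER POOL
# Prints the object name of the JDBCConnectionPoolRuntime MBean.
jdbc_mbean() {
    printf '%s:Location=%s,Name=%s,ServerRuntime=%s,Type=JDBCConnectionPoolRuntime\n' \
        "$1" "$2" "$3" "$2"
}

# wls_get_cmd URL USER PASSWORD MBEAN
# Prints (does not execute) the weblogic.Admin command line.
wls_get_cmd() {
    printf 'java weblogic.Admin -url %s -username %s -password %s get -pretty -mbean "%s"\n' \
        "$1" "$2" "$3" "$4"
}

# Example with placeholder values; the credentials are illustrative.
wls_get_cmd t3://192.168.1.2:7002 weblogic weblogic "$(server_mbean mydomain myserver)"
wls_get_cmd t3://192.168.1.2:7002 weblogic weblogic "$(jdbc_mbean mydomain myserver mypool)"
```

Printing the command first makes it easy to check the object name by eye before running it against a live server.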
4. The Shell Script
With the monitoring method in place, we write our own shell script following the Nagios Plugin API rules. The script is as follows:
check_wls.sh
#!/bin/ksh
#check_wls.sh --jdbcpool url username password domainname servername poolname
#check_wls.sh --server url username password domainname servername
PROGNAME=`basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\\\/][^\\\\/][^\\\\/]*$,,'`
REVISION=`echo '$Revision: 1749 $' | sed -e 's/[^0-9.]//g'`
. $PROGPATH/utils.sh
# Print a short usage summary.
print_usage() {
    echo "Usage:"
    echo " $PROGNAME --jdbcpool url username password domainname servername poolname"
    echo " $PROGNAME --server url username password domainname servername"
    echo " $PROGNAME --help"
    echo " $PROGNAME --version"
}
# Print the revision, the usage summary and a description of each option.
print_help() {
    print_revision $PROGNAME $REVISION
    echo ""
    print_usage
    echo ""
    echo "Check Weblogic status"
    echo ""
    echo "--jdbcpool url username password domainname servername poolname"
    echo " Check Weblogic JDBC Pool"
    echo "--server url username password domainname servername"
    echo " Check Weblogic Server"
}
# JAVA_HOME and CLASSPATH must be set, and CLASSPATH must include weblogic.jar.
if [[ -z "$JAVA_HOME" ]]
then
    echo "Please set JAVA_HOME!"
    exit $STATE_UNKNOWN
fi
if [[ -z "$CLASSPATH" ]]
then
    echo "Please set CLASSPATH!"
    exit $STATE_UNKNOWN
else
    echo $CLASSPATH | grep "weblogic.jar" | wc -l | read N
    if [[ "$N" = "0" ]]
    then
        echo "Please add weblogic.jar to CLASSPATH!"
        exit $STATE_UNKNOWN
    fi
fi
PATH=$JAVA_HOME/bin:$PATH
export PATH
JDBC_TYPE="JDBCConnectionPoolRuntime"
SERVER_TYPE="ServerRuntime"
cmd="$1"
# Information options
case "$cmd" in
    --help|-h)
        print_help
        exit $STATE_OK
        ;;
    --version|-V)
        print_revision $PROGNAME $REVISION
        exit $STATE_OK
        ;;
esac
case "$cmd" in
    --server)
        URL=${2}
        USER_NAME=${3}
        PASS_WORD=${4}
        DOMAIN_NAME=${5}
        SERVER_NAME=${6}
        SERVER_INFO="${DOMAIN_NAME}:${SERVER_NAME}"
        RE=`java weblogic.Admin -url ${URL} -username ${USER_NAME} -password ${PASS_WORD} get -pretty \
            -mbean "${DOMAIN_NAME}:Location=${SERVER_NAME},Name=${SERVER_NAME},Type=${SERVER_TYPE}"`
        # Output without any line starting with "-" is treated as an error.
        # The pipelines ending in "read" rely on ksh running the last stage in the current shell.
        printf '%s\n' "${RE}" | grep ^"-" | wc -l | read N
        if [[ "$N" -lt "1" ]]
        then
            # error: report the command output on a single line
            printf '%s\n' "${RE}" | awk '{ printf "%s ", $0 }' | read ERR_INFO
            echo "CRITICAL - ${ERR_INFO}"
            exit $STATE_CRITICAL
        fi
        if [[ "$N" -ge "1" ]]
        then
            HEALTH_STATE=""
            RUN_STATE=""
            # Pick the HealthState and State attributes out of the output
            printf '%s\n' "${RE}" | while read NAME VALUE
            do
                #echo "NAME:${NAME} VALUE:${VALUE}"
                case "${NAME}" in
                    HealthState:)
                        HEALTH_STATE=${VALUE}
                        ;;
                    State:)
                        RUN_STATE=${VALUE}
                        ;;
                esac
            done
            #echo "HEALTH_STATE:${HEALTH_STATE}"
            #echo "RUN_STATE:${RUN_STATE}"
            HEALTH_STATE_INFO=${HEALTH_STATE}
            # Keep only the state name: first comma-separated field, part after the colon
            echo ${HEALTH_STATE_INFO} | awk -F, '{ print $1 }' | awk -F: '{ print $2 }' | read HEALTH_STATE
            #echo "HEALTH_STATE:${HEALTH_STATE}"
            #HEALTH_OK HEALTH_WARN HEALTH_CRITICAL HEALTH_FAILED
            if [[ "${RUN_STATE}" != "RUNNING" ]]
            then
                echo "CRITICAL - ${SERVER_INFO} State is ${RUN_STATE}"
                exit $STATE_CRITICAL
            fi
            case "${HEALTH_STATE}" in
                HEALTH_OK)
                    ;;
                HEALTH_WARN)
                    echo "WARNING - ${SERVER_INFO} HealthState is ${HEALTH_STATE_INFO}"
                    exit $STATE_WARNING
                    ;;
                HEALTH_CRITICAL)
                    echo "CRITICAL - ${SERVER_INFO} HealthState is ${HEALTH_STATE_INFO}"
                    exit $STATE_CRITICAL
                    ;;
                HEALTH_FAILED)
                    echo "FAILED - ${SERVER_INFO} HealthState is ${HEALTH_STATE_INFO}"
                    exit $STATE_CRITICAL
                    ;;
            esac
        fi
        echo "OK - ${SERVER_INFO} State is ${RUN_STATE},HealthState is ${HEALTH_STATE_INFO}"
        exit $STATE_OK
        ;;
    --jdbcpool)
        URL=${2}
        USER_NAME=${3}
        PASS_WORD=${4}
        DOMAIN_NAME=${5}
        SERVER_NAME=${6}
        POOL_NAME=${7}
        POOL_INFO="${DOMAIN_NAME}:${SERVER_NAME}:${POOL_NAME}"
        RE=`java weblogic.Admin -url ${URL} -username ${USER_NAME} -password ${PASS_WORD} GET -pretty \
            -mbean "${DOMAIN_NAME}:Location=${SERVER_NAME},Name=${POOL_NAME},ServerRuntime=${SERVER_NAME},Type=${JDBC_TYPE}"`
        # Output without any line starting with "-" is treated as an error.
        printf '%s\n' "${RE}" | grep ^"-" | wc -l | read N
        if [[ "$N" -lt "1" ]]
        then
            # error: report the command output on a single line
            printf '%s\n' "${RE}" | awk '{ printf "%s ", $0 }' | read ERR_INFO
            echo "CRITICAL - ${ERR_INFO}"
            exit $STATE_CRITICAL
        fi
        if [[ "$N" -ge "1" ]]
        then
            POOL_STATE=""
            WAIT_CNT=""
            RUN_STATE=""
            # Pick PoolState, WaitingForConnectionCurrentCount and State out of the output
            printf '%s\n' "${RE}" | while read NAME VALUE
            do
                #echo "NAME:${NAME} VALUE:${VALUE}"
                case "${NAME}" in
                    PoolState:)
                        POOL_STATE=${VALUE}
                        ;;
                    WaitingForConnectionCurrentCount:)
                        WAIT_CNT=${VALUE}
                        ;;
                    State:)
                        RUN_STATE=${VALUE}
                        ;;
                esac
            done
            #echo "POOL_STATE:${POOL_STATE}"
            #echo "WAIT_CNT:${WAIT_CNT}"
            #echo "RUN_STATE:${RUN_STATE}"
            if [[ "${POOL_STATE}" != "true" ]]
            then
                echo "CRITICAL - ${POOL_INFO} PoolState is ${POOL_STATE}"
                exit $STATE_CRITICAL
            fi
            if [[ "${RUN_STATE}" != "Running" ]]
            then
                echo "CRITICAL - ${POOL_INFO} State is ${RUN_STATE}"
                exit $STATE_CRITICAL
            fi
            if [[ "${WAIT_CNT}" -gt "0" ]]
            then
                echo "WARNING - ${POOL_INFO} WaitingForConnectionCurrentCount is ${WAIT_CNT}"
                exit $STATE_WARNING
            fi
        else
            # error
            printf '%s\n' "${RE}" | awk '{ printf "%s ", $0 }' | read ERR_INFO
            echo "CRITICAL - ${ERR_INFO}"
            exit $STATE_CRITICAL
        fi
        echo "OK - ${POOL_INFO} State is ${RUN_STATE},PoolState is ${POOL_STATE},WaitingForConnectionCurrentCount is ${WAIT_CNT}"
        exit $STATE_OK
        ;;
    *)
        print_usage
        exit $STATE_UNKNOWN
        ;;
esac
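One portability point is worth noting. The script depends on a ksh behaviour (at least in AT&T ksh88 and ksh93): the last command of a pipeline runs in the current shell, so variables set by `... | read N` or inside `... | while read` are still visible afterwards. In bash, by default, that stage runs in a subshell and the variables would stay empty. Below is a minimal sketch of a more portable way to extract one attribute, assuming the `-pretty` output consists of `Name: value` lines as the script's parsing implies. The helper name `get_attr` and the sample text are hypothetical; the sample was not captured from a real server.

```shell
#!/bin/sh
# get_attr NAME TEXT
# Prints the value of the first "NAME: value" line found in TEXT.
get_attr() {
    printf '%s\n' "$2" |
        awk -v key="$1:" '$1 == key { sub(/^[ \t]*[^ \t]+[ \t]+/, ""); print; exit }'
}

# Hypothetical sample shaped like the output the script expects.
SAMPLE='---------------------------
MBeanName: "mydomain:Location=myserver,Name=myserver,Type=ServerRuntime"
    HealthState: State:HEALTH_OK,ReasonCode:[]
    State: RUNNING'

RUN_STATE=$(get_attr State "$SAMPLE")
HEALTH_INFO=$(get_attr HealthState "$SAMPLE")
# Same extraction as in the script: first comma field, then the part after the colon.
HEALTH_STATE=$(printf '%s\n' "$HEALTH_INFO" | awk -F, '{ print $1 }' | awk -F: '{ print $2 }')
echo "State=$RUN_STATE HealthState=$HEALTH_STATE"
```

Command substitution keeps the assignment in the current shell in every POSIX shell, at the cost of one awk call per attribute.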
5. Configuring Weblogic Monitoring
Upload check_wls.sh to the libexec directory of the Nagios installation and create a symbolic link to it named check_wls.
$ ln -s ./check_wls.sh ./check_wls
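Add the corresponding command definitions to the nrpe configuration file.
The Weblogic settings used here are:
${URL}          t3://172.17.1.2:7001
${USER_NAME}    weblogic
${PASS_WORD}    weblogic
${DOMAIN_NAME}  mydomain
${SERVER_NAME}  myserver
${POOL_NAME}    mypool
Edit the nrpe.cfg file and add the check_wls command definitions.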
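As an illustration, the command definitions for the values listed above might look like the output of the sketch below. The `command[name]=command line` form is NRPE's standard syntax. The command names `check_wls_server` and `check_wls_jdbcpool` and the libexec path `/usr/local/nagios/libexec` are assumptions, not taken from the original configuration.

```shell
#!/bin/sh
# Prints two NRPE command definitions to review and paste into nrpe.cfg.
# LIBEXEC and the command names are assumptions made for this example.
LIBEXEC=/usr/local/nagios/libexec
WLS_ARGS="t3://172.17.1.2:7001 weblogic weblogic mydomain myserver"

nrpe_commands() {
    cat <<EOF
command[check_wls_server]=$LIBEXEC/check_wls --server $WLS_ARGS
command[check_wls_jdbcpool]=$LIBEXEC/check_wls --jdbcpool $WLS_ARGS mypool
EOF
}

nrpe_commands
```

Because the password is part of the command line, it is readable in nrpe.cfg and in the process list while a check runs, so restricting read access to the file is worth considering.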
Add the environment variables (CLASSPATH and JAVA_HOME) to the nrpe startup script.
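A sketch of the lines that could be added near the top of the nrpe startup script follows. Both install locations are hypothetical and depend on where the JDK and Weblogic live on the host; only the relative location weblogic81/server/lib/weblogic.jar comes from section 3.

```shell
# Hypothetical install locations; adjust them to the real environment.
JAVA_HOME=/usr/java/j2sdk1.4.2
BEA_HOME=/opt/bea
CLASSPATH=$BEA_HOME/weblogic81/server/lib/weblogic.jar
export JAVA_HOME CLASSPATH
```

With these exported, the checks at the top of check_wls.sh for JAVA_HOME, CLASSPATH, and weblogic.jar should pass when nrpe runs the plugin.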
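Edit the nagios.cfg file on the monitoring host and add the corresponding definitions.
Verify that the configuration is correct.
Restart the nagios service on the monitoring host and the nrpe service on the remote host.
Check the monitoring results in a web browser (IE in the original setup).
Figure 5.1
This completes the configuration.
6. Conclusion
This article has described one way to monitor Weblogic applications with Nagios: a custom shell script written to the Nagios Plugin API rules. It also briefly covered the configuration steps and provided the shell source code. Corrections and comments are welcome.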
