<Service> Two Nagios Monitoring Modes

Date: Feb. 13, 2017 Category:

Directory:

1 directory, 13 files

## Including object configuration files from the main configuration

    [root@why-3 etc]# vi nagios.cfg
    cfg_file=/usr/local/nagios/etc/objects/service.cfg
    cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
    #cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
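
To see which object files the main configuration actually pulls in, one option is to filter it for uncommented `cfg_file=` and `cfg_dir=` entries. The sketch below is only an illustration: `list_object_configs` is a hypothetical helper name, and the demo runs against a throwaway file (with a commented-out localhost.cfg line assumed for the demo) rather than the real nagios.cfg.

```shell
# Print the active (uncommented) cfg_file= and cfg_dir= entries of a main config file.
# $1 = path to the file to inspect
list_object_configs() {
    grep -E '^[[:space:]]*(cfg_file|cfg_dir)=' "$1"
}

# Demo on a temporary file that mimics the entries discussed above.
demo_cfg=$(mktemp)
cat > "$demo_cfg" <<'EOF'
cfg_file=/usr/local/nagios/etc/objects/service.cfg
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
EOF
list_object_configs "$demo_cfg"   # the commented-out line is not printed
rm -f "$demo_cfg"
```

On a real installation the same function could be pointed at /usr/local/nagios/etc/nagios.cfg.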

## Creating the new configuration files

    [root@why-3 objects]# head -51 localhost.cfg > hosts.cfg
    [root@why-3 objects]# chown nagios.nagios hosts.cfg
    [root@why-3 objects]# touch service.cfg
    [root@why-3 objects]# chown nagios.nagios service.cfg

## Host configuration
### Configuration overview

    [root@why-3 objects]# vi hosts.cfg
    define host{
            use         linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
                                        ; The template is defined in /usr/local/nagios/etc/objects/templates.cfg
            host_name   localhost       ; Name of the monitored host
            alias       localhost       ; Alias of the monitored host
            address     127.0.0.1       ; IP address of the monitored host
                                        ; Directives not set here take their values from the template
            }

    # Host group
    define hostgroup{
            hostgroup_name  linux-servers   ; The name of the hostgroup
            alias           Linux Servers   ; Long name of the group
            members         localhost       ; Comma separated list of hosts that belong to this group
                                            ; (group members, given by host_name and separated by ',')
            }

### Configuration

    define host{
            use         linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
            host_name   why-203
            alias       why-203-alias
            address     192.168.0.203
            }

    define host{
            use         linux-server    ; Name of host template to use
                                        ; This host definition will inherit all variables that are defined
                                        ; in (or inherited by) the linux-server host template definition.
            host_name   why-202
            alias       why-202-alias
            address     192.168.0.202
            }

    define hostgroup{
            hostgroup_name  why-nagis           ; The name of the hostgroup
            alias           why-nagis-alias     ; Long name of the group
            members         why-202,why-203     ; Comma separated list of hosts that belong to this group
            }
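
Because the host blocks differ only in name, alias, and address, they could also be generated from a short list instead of typed by hand. This is a minimal sketch, not part of the original setup: `gen_host` is a hypothetical helper, and the `<name>-alias` naming simply copies the pattern used above.

```shell
# Print one "define host" block that inherits from the linux-server template.
# $1 = host_name, $2 = alias, $3 = address
gen_host() {
    printf 'define host{\n'
    printf '        use         linux-server\n'
    printf '        host_name   %s\n' "$1"
    printf '        alias       %s\n' "$2"
    printf '        address     %s\n' "$3"
    printf '        }\n'
}

# Read "name address" pairs and emit one definition per pair.
while read -r name address; do
    gen_host "$name" "$name-alias" "$address"
done <<'EOF'
why-202 192.168.0.202
why-203 192.168.0.203
EOF
```

Redirecting the loop's output to a file would give something that can be reviewed before it is placed next to hosts.cfg.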

## Configuring the monitored service

    [root@why-3 objects]# vi /usr/local/nagios/etc/objects/service.cfg
    define service {
            use                     generic-service
            host_name               why-202
            service_description     Disk Partition
            check_command           check_nrpe!check_disk
            }

## Checking the configuration
### Method 1

    [root@why-3 objects]# /etc/init.d/nagios checkconfig
    Running configuration check... CONFIG ERROR!  Check your Nagios configuration.

### Method 2

    [root@why-3 objects]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

    Nagios Core 3.5.1
    Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
    Copyright (c) 1999-2009 Ethan Galstad
    Last Modified: 08-30-2013
    License: GPL

    Website: http://www.nagios.org
    Reading configuration data...
       Read main config file okay...
    Processing object config file '/usr/local/nagios/etc/objects/commands.cfg'...
    Processing object config file '/usr/local/nagios/etc/objects/contacts.cfg'...
    Processing object config file '/usr/local/nagios/etc/objects/timeperiods.cfg'...
    Processing object config file '/usr/local/nagios/etc/objects/templates.cfg'...
    Processing object config file '/usr/local/nagios/etc/objects/service.cfg'...
    Processing object config file '/usr/local/nagios/etc/objects/hosts.cfg'...
       Read object config files okay...

    Running pre-flight check on configuration data...

    Checking services...
    Error: Service check command 'check_nrpe' specified in service 'Disk Partition' for host 'why-202' not defined anywhere!
            Checked 1 services.
    Checking hosts...
    Warning: Host 'why-203' has no services associated with it!
            Checked 2 hosts.
    Checking host groups...
            Checked 1 host groups.
    Checking service groups...
            Checked 0 service groups.
    Checking contacts...
            Checked 1 contacts.
    Checking contact groups...
            Checked 1 contact groups.
    Checking service escalations...
            Checked 0 service escalations.
    Checking service dependencies...
            Checked 0 service dependencies.
    Checking host escalations...
            Checked 0 host escalations.
    Checking host dependencies...
            Checked 0 host dependencies.
    Checking commands...
            Checked 24 commands.
    Checking time periods...
            Checked 5 time periods.
    Checking for circular paths between hosts...
    Checking for circular host and service dependencies...
    Checking global event handlers...
    Checking obsessive compulsive processor commands...
    Checking misc settings...

    Total Warnings: 1
    Total Errors:   1

    ***> One or more problems was encountered while running the pre-flight check...

         Check your configuration file(s) to ensure that they contain valid
         directives and data defintions.  If you are upgrading from a previous
         version of Nagios, you should be aware that some variables/definitions
         may have been removed or modified in this version.  Make sure to read
         the HTML documentation regarding the config files, as well as the
         'Whats New' section to find out what has changed.
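
The pre-flight output is long, and usually only the two totals at the end decide whether a reload is safe. The following sketch pulls them out with awk; `preflight_totals` is a hypothetical helper name, and the demo feeds it two sample lines instead of running Nagios.

```shell
# Read `nagios -v` output on stdin and print the totals as "warnings=N errors=M".
preflight_totals() {
    awk '
        /^Total Warnings:/ { w = $3 }   # third field is the warning count
        /^Total Errors:/   { e = $3 }   # third field is the error count
        END { printf "warnings=%s errors=%s\n", w, e }
    '
}

# Demo with the two summary lines from the run above.
printf 'Total Warnings: 1\nTotal Errors:   1\n' | preflight_totals
```

In practice the function could sit at the end of a pipe such as `/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_totals`.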
### Modifying the init script

    [root@why-3 objects]# vi /etc/init.d/nagios
            checkconfig)
                    printf "Running configuration check..."
                    $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
                    if [ $? -eq 0 ]; then
                            echo " OK."
                    else
                            echo " CONFIG ERROR!  Check your Nagios configuration."
                            exit 1
                    fi
                    ;;

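Both standard output and standard error are redirected to /dev/null, which is why the first method shows no details. Removing `> /dev/null 2>&1` from that line makes the check print its full output.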
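
As an illustration of the same idea outside the init script, the check can be wrapped in a small function that leaves the output visible and still reports success or failure through its exit status. This is a sketch under assumptions: `run_config_check` is a hypothetical name, and the binary and config paths are passed in as arguments rather than taken from the script's `$NagiosBin` and `$NagiosCfgFile`.

```shell
# Run the pre-flight check without discarding its output.
# $1 = path to the nagios binary, $2 = path to nagios.cfg
run_config_check() {
    printf 'Running configuration check...\n'
    if "$1" -v "$2"; then
        echo " OK."
    else
        echo " CONFIG ERROR!  Check your Nagios configuration."
        return 1
    fi
}
```

A call such as `run_config_check /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg` would then print the errors and warnings directly.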
### Resolving the error

    Error: Service check command 'check_nrpe' specified in service 'Disk Partition' for host 'why-202' not defined anywhere!
            Checked 1 services.

The check_nrpe command has not been defined. Defining it in objects/commands.cfg resolves the error.

    [root@why-3 objects]# vi /usr/local/nagios/etc/objects/commands.cfg
    # 'check_nrpe' command definition
    define command{
            command_name    check_nrpe
            command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
            }
    [root@why-3 objects]# /etc/init.d/nagios checkconfig
    Total Warnings: 1
    Total Errors:   0

    Things look okay - No serious problems were detected during the pre-flight check
     OK.
    [root@why-3 objects]# /etc/init.d/nagios reload
    Running configuration check...done.
    Reloading nagios configuration...done

### How the configuration is resolved

`$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$` is equivalent to `/usr/local/nagios/libexec/check_nrpe -H 192.168.0.202 -c check_disk`.

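In the service definition, `check_command   check_nrpe!check_disk` is resolved in two parts. check_nrpe refers to the check_nrpe command defined in commands.cfg. check_disk is passed to that command as $ARG1$ and selects the command[check_disk] entry in nrpe.cfg on the monitored host. The `host_name       why-202` directive is looked up in hosts.cfg, and the address of the host with that host_name is assigned to $HOSTADDRESS$.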

    [root@why-3 ~]# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.202 -c check_disk
    DISK OK - free space: / 11925 MB (63% inode=84%);| /=6760MB;15748;18110;0;19685
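
The substitution described above can be imitated in a few lines of shell, which may help when working out by hand what Nagios will execute. This is only a sketch of the idea, not how Nagios itself expands macros: the helper names are hypothetical, only `$USER1$`, `$HOSTADDRESS$` and `$ARG1$` are handled, and replacement values containing `|` or `&` would confuse the sed expressions.

```shell
# Split a check_command value such as "check_nrpe!check_disk" at the first '!'.
command_name_of() { printf '%s\n' "${1%%!*}"; }
first_arg_of()    { printf '%s\n' "${1#*!}"; }

# Substitute the three macros used in this article into a command_line template.
# $1 = template, $2 = plugin directory, $3 = host address, $4 = first argument
expand_command_line() {
    printf '%s\n' "$1" | sed \
        -e "s|\\\$USER1\\\$|$2|g" \
        -e "s|\\\$HOSTADDRESS\\\$|$3|g" \
        -e "s|\\\$ARG1\\\$|$4|g"
}

check_command='check_nrpe!check_disk'
template='$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$'
command_name_of "$check_command"
expand_command_line "$template" /usr/local/nagios/libexec 192.168.0.202 \
    "$(first_arg_of "$check_command")"
```

The second line printed is the same command that is run manually above.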


### Viewing the result in the web interface
![](http://image.whysdomain.com/linux/nagios5.PNG)
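The page shows an error. Roughly, it says that you do not appear to have permission to view the service you requested, and that if you believe this is a mistake you should check the HTTP access permissions for the CGI or the CGI configuration file.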

    [root@why-3 objects]# vi ../cgi.cfg
    authorized_for_system_information=nagiosadmin

The authorized user is nagiosadmin. In vi, `:%s/nagiosadmin/why/g` replaces it with why throughout the file.
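
The same substitution can be done non-interactively with sed, which is convenient when the change has to be repeated or reviewed first. The sketch below is an assumption-laden illustration: `replace_cgi_user` is a hypothetical helper, it writes the result to standard output instead of editing the file, and the demo uses a temporary file rather than the real cgi.cfg.

```shell
# Print a cgi.cfg-style file with every occurrence of one user name replaced.
# $1 = file, $2 = old user, $3 = new user. The file itself is not modified.
replace_cgi_user() {
    sed "s/$2/$3/g" "$1"
}

# Demo on a one-line temporary file.
demo=$(mktemp)
echo 'authorized_for_system_information=nagiosadmin' > "$demo"
replace_cgi_user "$demo" nagiosadmin why
rm -f "$demo"
```

Redirecting the output to a new file and comparing it with the original before moving it into place keeps the edit easy to undo.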

    [root@why-3 objects]# /etc/init.d/nagios reload
    Running configuration check...done.
    Reloading nagios configuration...done

![](http://image.whysdomain.com/linux/nagios6.PNG)
Additional monitored services can be configured in service.cfg in the same way.
# Troubleshooting errors
### Error: Service check command

    Error: Service check command 'check_nrpe' specified in service 'Disk Partition' for host 'why-202' not defined anywhere!
            Checked 1 services.

The check_nrpe command has not been defined. Defining it in objects/commands.cfg resolves the error.

    [root@why-3 objects]# vi /usr/local/nagios/etc/objects/commands.cfg
    # 'check_nrpe' command definition
    define command{
            command_name    check_nrpe
            command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
            }

### Command 'check_ping' not defined

    [root@why-3 libexec]# ./check_nrpe -H 192.168.0.203 -c check_ping
    NRPE: Command 'check_ping' not defined

command[check_ping] has not been defined in nrpe.cfg on the monitored host.

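If the check succeeds from the command line but the web interface still shows this error, the problem is probably in how the command is called in service.cfg.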
### Error - Could not complete SSL handshake

    [root@why-3 libexec]# ./check_nrpe -H 192.168.0.203 -c check_ping
    CHECK_NRPE: Error - Could not complete SSL handshake

The Nagios server has not been added to allowed_hosts in nrpe.cfg on the target host.
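
A quick way to confirm this on the target host is to test whether a given address appears in the comma-separated allowed_hosts line. The following is a sketch only: `is_allowed_host` is a hypothetical helper, it does exact string matching (so network ranges or host names in the list are not interpreted), and 192.168.0.3 is a made-up server address used purely for the demo.

```shell
# Succeed if the address is listed literally in the allowed_hosts line.
# $1 = path to nrpe.cfg, $2 = address to look for
is_allowed_host() {
    grep -E '^allowed_hosts=' "$1" \
        | cut -d= -f2 \
        | tr ',' '\n' \
        | tr -d ' ' \
        | grep -Fxq "$2"
}

# Demo on a temporary file; 192.168.0.3 is a hypothetical server address.
demo=$(mktemp)
echo 'allowed_hosts=127.0.0.1,192.168.0.3' > "$demo"
if is_allowed_host "$demo" 192.168.0.3; then
    echo "listed"
else
    echo "not listed"
fi
rm -f "$demo"
```

After the address is added, the NRPE daemon on the target host normally has to be restarted or reloaded for the change to take effect.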
### Unknown argument

    [root@why-3 libexec]# ./check_nrpe -H 192.168.0.203 -c check_ping
    Unknown argument
    Usage: (remainder omitted)

This error means the arguments passed to the plugin in command[check_ping] in nrpe.cfg are wrong.
### Unable to read output
The web interface shows: NRPE: Unable to read output

This means the plugin does not have execute permission.
### Connection refused by host
This means NRPE is not running on the client, or the client cannot be reached for some other network reason.
### More on using check_nrpe
Run check_nrpe --help to see the remaining options.
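
The error cases in this section can be collected into a single lookup, which is handy in a wrapper script or as a reminder. This is a sketch, not an exhaustive diagnosis: `diagnose_nrpe_error` is a hypothetical helper and it only recognizes the messages discussed above.

```shell
# Map an error message from this section to its most likely fix.
# $1 = the message text
diagnose_nrpe_error() {
    case "$1" in
        *"not defined anywhere"*)
            echo "define the command in objects/commands.cfg on the Nagios server" ;;
        *"Command '"*"' not defined"*)
            echo "add the command[...] entry to nrpe.cfg on the monitored host" ;;
        *"Could not complete SSL handshake"*)
            echo "add the Nagios server to allowed_hosts in nrpe.cfg" ;;
        *"Unknown argument"*)
            echo "fix the plugin arguments in the command[...] entry in nrpe.cfg" ;;
        *"Unable to read output"*)
            echo "give the plugin execute permission" ;;
        *"Connection refused"*)
            echo "check that NRPE is running and reachable on the client" ;;
        *)
            echo "not covered here; see check_nrpe --help" ;;
    esac
}

diagnose_nrpe_error "CHECK_NRPE: Error - Could not complete SSL handshake"
```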

# Active mode

    [root@why-3 libexec]# ./check_tcp --help
    check_tcp v1.4.16 (nagios-plugins 1.4.16)
    Copyright (c) 1999 Ethan Galstad nagios@nagios.org
    Copyright (c) 1999-2008 Nagios Plugin Development Team nagiosplug-devel@lists.sourceforge.net

    This plugin tests TCP connections with the specified host (or unix socket).

    Usage: check_tcp -H host -p port [-w ] [-c ] [-s ] [-e ] [-q ][-m ] [-d ] [-t ] [-r ] [-M ] [-v] [-4|-6] [-j] [-D [,]] [-S ] [-E]
    [root@why-3 libexec]# ./check_tcp -H 192.168.0.202 -p 80
    TCP OK - 0.002 second response time on port 80|time=0.001631s;;;0.000000;10.000000
    [root@why-3 libexec]# ./check_tcp -H 192.168.0.202 -p 22
    TCP OK - 0.002 second response time on port 22|time=0.002433s;;;0.000000;10.000000

## Using the plugins

    [root@why-3 libexec]# ./check_http --help
    check_http v1.4.16 (nagios-plugins 1.4.16)
    Copyright (c) 1999 Ethan Galstad nagios@nagios.org
    Copyright (c) 1999-2011 Nagios Plugin Development Team nagiosplug-devel@lists.sourceforge.net

    This plugin tests the HTTP service on the specified host. It can test
    normal (http) and secure (https) servers, follow redirects, search for
    strings and regular expressions, check connection times, and report on
    certificate expiration times.

    Usage: check_http -H | -I [-u ] [-p ] [-w ] [-c ] [-t ] [-L] [-a auth] [-b proxy_auth] [-f ] [-e ] [-s string] [-l] [-r | -R ] [-P string] [-m :] [-4|-6] [-N] [-M ] [-A string] [-k string] [-S ] [--sni] [-C [,]] [-T ] [-j method]
    NOTE: One or both of -H and -I must be specified

## Testing the plugin from the shell

    [root@why-3 libexec]# ./check_http -I 192.168.0.202
    HTTP OK: HTTP/1.1 200 OK - 236 bytes in 0.105 second response time |time=0.105220s;;;0.000000 size=236B;;;0
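
A plugin result has two parts on one line: the human-readable status before the `|` and the performance data after it, while the state itself is carried by the plugin's exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below splits a line and names an exit code; the helper names are hypothetical, and the demo uses the output shown above instead of calling the plugin.

```shell
# Text before the first '|' (the status message).
status_text_of() { printf '%s\n' "${1%%|*}"; }
# Text after the first '|' (the performance data).
perfdata_of()    { printf '%s\n' "${1#*|}"; }

# Translate a plugin exit code into the state name Nagios displays.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

line='HTTP OK: HTTP/1.1 200 OK - 236 bytes in 0.105 second response time |time=0.105220s;;;0.000000 size=236B;;;0'
perfdata_of "$line"
state_name 0
```

When running a plugin by hand, `echo $?` immediately afterwards shows the exit code that `state_name` would translate.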

## Configuring active monitoring

    [root@why-3 libexec]# cd ../etc
    [root@why-3 etc]# vi nagios.cfg
    cfg_dir=/usr/local/nagios/etc/services
    [root@why-3 etc]# mkdir services
    [root@why-3 etc]# cd !$
    cd services
    [root@why-3 services]# vi webservice.cfg    # any file name works, but it must end in .cfg
    [root@why-3 services]# cat webservice.cfg
    define service{
            use                     generic-service
            host_name               why-202
            service_description     blog_url
            check_command           check_weburl!-I 192.168.0.202
            max_check_attempts      3
            normal_check_interval   2
            retry_check_interval    1
            check_period            24x7
            notification_interval   30
            notification_period     24x7
            notification_options    w,u,c,r
            contact_groups          admins
            }
    [root@why-3 services]# cd ..
    [root@why-3 etc]# vi objects/commands.cfg
    # 'check_weburl' command definition
    define command{
            command_name    check_weburl
            command_line    $USER1$/check_http $ARG1$ -w 10 -c 30
            }
    [root@why-3 etc]# /etc/init.d/nagios checkconfig
    [root@why-3 etc]# /etc/init.d/nagios reload
    Running configuration check...done.
    Reloading nagios configuration...done

![](http://image.whysdomain.com/linux/nagios7.PNG)
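Service configuration files can be named after the host name, port, or service, or separated by placing them in different directories. check_http can also monitor a specific URL with -u; a complex URL that has not been rewritten into a static-looking form must be wrapped in double quotes (""). When monitoring by domain name, hosts that share the same domain can be given aliases through DNS or the hosts file, and the alias URLs monitored instead. MySQL can be monitored with check_mysql and its -S option, using -H for the host, -P for the port, -u for the user, and -p for the password.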
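
The reason for the double quotes is that characters such as `?` and `&` mean something to the shell, so an unquoted dynamic URL would not reach the plugin as a single argument. A tiny sketch makes this visible; `count_args` is a hypothetical helper and the URL is a made-up example, not one from this setup.

```shell
# Report how many arguments a command would receive.
count_args() { echo "$#"; }

# With quotes, the whole URL (including '?' and '&') stays one argument,
# so the call sees four arguments: -I, the address, -u, and the URL.
count_args -I 192.168.0.202 -u "/index.php?id=1&page=2"
```

The same quoting applies when testing check_http by hand with -u.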
# Service groups

    [root@why-3 etc]# cd services/
    [root@why-3 services]# vi servergroup.cfg
    define servicegroup{
            servicegroup_name       Disk Partition
            alias                   Two Servers
            members                 why-203,Disk Partition,why-202,Disk Partition
            }
    [root@why-3 services]# /etc/init.d/nagios checkconfig
    [root@why-3 services]# /etc/init.d/nagios reload
    Running configuration check...done.
    Reloading nagios configuration...done

Then select Service Groups in the web interface.
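
The members value of a service group is a flat list of host,service pairs, which gets tedious to type when the same service is grouped across many hosts. The following sketch builds that string; `build_members` is a hypothetical helper and is not part of Nagios.

```shell
# Build a servicegroup "members" value for one service on several hosts.
# $1 = service_description, remaining arguments = host names
build_members() {
    service=$1
    shift
    members=
    for host in "$@"; do
        # Append "host,service", adding a separating comma after the first pair.
        members="${members:+$members,}$host,$service"
    done
    printf '%s\n' "$members"
}

build_members "Disk Partition" why-203 why-202
```

Each pair has to name a service that is actually defined for that host, otherwise the pre-flight check reports an error.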