open-falcon
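Installing Open-Falcon
Open-Falcon was open-sourced by Xiaomi. Because it was developed in China, its documentation is all in Chinese, so I will not introduce it at length here; see the project introduction.
Environment preparation
See the official environment preparation guide.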
$ yum install -y redis mariadb-server git
$ systemctl start mariadb
$ systemctl start redis
$ cd /tmp/ && git clone https://github.com/open-falcon/falcon-plus.git
$ cd /tmp/falcon-plus/scripts/mysql/db_schema/
$ mysql -h 127.0.0.1 -u root -p < 1_uic-db-schema.sql
$ mysql -h 127.0.0.1 -u root -p < 2_portal-db-schema.sql
$ mysql -h 127.0.0.1 -u root -p < 3_dashboard-db-schema.sql
$ mysql -h 127.0.0.1 -u root -p < 4_graph-db-schema.sql
$ mysql -h 127.0.0.1 -u root -p < 5_alarms-db-schema.sql
$ rm -rf /tmp/falcon-plus/
To set the MySQL root password:
MariaDB [(none)]> set password=password('123456');
Download the binary package
$ wget https://github.com/open-falcon/falcon-plus/releases/download/v0.2.1/open-falcon-v0.2.1.tar.gz
Installation
Agent
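The agent collects metrics and pushes them to transfer every 60 seconds over a long-lived connection. It also exposes an HTTP interface for data that users push manually.
For configuration, see the official documentation and the configuration file directory.
Start the agent service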
$ ./open-falcon start agent
[falcon-agent] 4367
View the agent log
$ ./open-falcon monitor agent
2019/03/22 12:08:21 cfg.go:128: read config file: /root/agent/config/cfg.json successfully
2019/03/22 12:08:21 var.go:31: get local addr failed !
2019/03/22 12:08:21 http.go:74: listening :1988
2019/03/22 12:08:21 rpc.go:41: dial 0.0.0.0:6030 fail: dial tcp 0.0.0.0:6030: getsockopt: connection refused
2019/03/22 12:08:23 rpc.go:41: dial 0.0.0.0:6030 fail: dial tcp 0.0.0.0:6030: getsockopt: connection refused
2019/03/22 12:08:27 rpc.go:41: dial 0.0.0.0:6030 fail: dial tcp 0.0.0.0:6030: getsockopt: connection refused
2019/03/22 12:08:35 rpc.go:41: dial 0.0.0.0:6030 fail: dial tcp 0.0.0.0:6030: getsockopt: connection refused
Verify
$ ./agent/bin/falcon-agent --check
ps aux ... ok
kernel ... ok
net.if ... ok
cpustat ... ok
ss -s ... ok
netstat ... ok
ss -tln ... ok
du -bs ... ok
df.bytes ... ok
loadavg ... ok
disk.io ... ok
memory ... ok
The agent listens on local port 1988.
Opening the web service on port 1988 in a browser shows the current status.
Push data directly to the agent
$ ts=`date +%s`; curl -X POST -d "[{\"metric\": \"metric.demo\", \"endpoint\": \"qd-open-falcon-judge01.hd\", \"timestamp\": $ts,\"step\": 60,\"value\": 9,\"counterType\": \"GAUGE\",\"tags\": \"project=falcon,module=judge\"}]" http://127.0.0.1:1988/v1/push
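The same push can be scripted. The sketch below is a minimal Python equivalent of the curl command above, using only the standard library. The helper names build_metric and push_metrics are made up for illustration, and the URL assumes the agent is listening on 127.0.0.1:1988 as in the log shown earlier.

```python
import json
import time
import urllib.request

AGENT_PUSH_URL = "http://127.0.0.1:1988/v1/push"  # assumes a local agent on port 1988


def build_metric(metric, endpoint, value, step=60, counter_type="GAUGE",
                 tags="", timestamp=None):
    """Build one data point with the same fields as the curl example."""
    return {
        "metric": metric,
        "endpoint": endpoint,
        "timestamp": int(time.time()) if timestamp is None else int(timestamp),
        "step": step,
        "value": value,
        "counterType": counter_type,
        "tags": tags,
    }


def push_metrics(points, url=AGENT_PUSH_URL, timeout=5):
    """POST a list of data points to the agent and return the response body."""
    body = json.dumps(points).encode("utf-8")
    req = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example (requires a running agent):
#   point = build_metric("metric.demo", "qd-open-falcon-judge01.hd", 9,
#                        tags="project=falcon,module=judge")
#   print(push_metrics([point]))
```

The payload is always a JSON list, so several data points can be sent in one request.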
transfer
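transfer receives the data reported by agents, shards it according to a hash rule, and pushes it to graph, judge, and other components.
For configuration, see the official documentation.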
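The hash rule itself is not spelled out in these notes. Purely as an illustration, the sketch below shows one common way to shard data by key, a consistent-hash ring. The class name, the key format, the node names, and the replica count are all assumptions made for the example and are not taken from transfer's code.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring: each node is placed on the ring several times."""

    def __init__(self, nodes, replicas=100):
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            for i in range(replicas):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        """Return the first node clockwise from the key's position on the ring."""
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["graph-00", "graph-01"])  # hypothetical backend names
# Hypothetical key: endpoint plus metric identifies one time series.
target = ring.node_for("qd-open-falcon-judge01.hd/metric.demo")
```

With this kind of scheme the same series always lands on the same backend, and adding a node moves only part of the keys.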
Start the service
$ ./open-falcon start transfer
[falcon-transfer] 6036
Check the ports the service is listening on
$ ss -nlpt | grep transfer
LISTEN 0 128 :::4444 :::* users:(("falcon-transfer",pid=6036,fd=6))
LISTEN 0 128 :::6060 :::* users:(("falcon-transfer",pid=6036,fd=3))
LISTEN 0 128 :::8433 :::* users:(("falcon-transfer",pid=6036,fd=5))
graph
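graph stores the data used for plotting. The data comes from transfer; graph also handles query requests from the api component and returns the plotting data.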
$ ./open-falcon start graph
[falcon-graph] 16267
For configuration, see the official documentation.
Note that the database settings must be filled in:
"db": {
"dsn": "root:123456@tcp(127.0.0.1:3306)/graph?loc=Local&parseTime=true",
"maxIdle": 4
},
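If the DSN needs to be generated for several environments, a small helper can format it. This is only a sketch that mirrors the shape of the dsn value above; graph_dsn is a hypothetical name, and the function does no escaping of special characters in the password.

```python
def graph_dsn(user, password, host="127.0.0.1", port=3306, db="graph"):
    """Format a DSN in the same shape as the cfg.json snippet above."""
    return f"{user}:{password}@tcp({host}:{port})/{db}?loc=Local&parseTime=true"


dsn = graph_dsn("root", "123456")
```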
Check the ports the service is listening on
$ ss -nlpt | grep graph
LISTEN 0 128 :::6070 :::* users:(("falcon-graph",pid=16267,fd=5))
LISTEN 0 128 :::6071 :::* users:(("falcon-graph",pid=16267,fd=6))
Port 6070 receives data; port 6071 is the control port.
API
The api component provides a RESTful API.
$ ./open-falcon start api
dashboard
mkdir /home/work
export HOME=/home/work
export WORKSPACE=$HOME/open-falcon
mkdir -p $WORKSPACE
cd $WORKSPACE
git clone https://github.com/open-falcon/dashboard.git
yum install -y python-virtualenv
yum install -y python-devel
yum install -y openldap-devel
yum install -y mysql-devel
yum groupinstall "Development tools"
cd $WORKSPACE/dashboard/
virtualenv ./env
# -i points pip at the Douban mirror, although this mirror seems to have problems
./env/bin/pip install -r pip_requirements.txt -i https://pypi.douban.com/simple
Edit the configuration file rrd/config.py
Adjust the settings for each component to match your environment
# portal database
# TODO: read from api instead of db
PORTAL_DB_HOST = os.environ.get("PORTAL_DB_HOST","127.0.0.1")
PORTAL_DB_PORT = int(os.environ.get("PORTAL_DB_PORT",3306))
PORTAL_DB_USER = os.environ.get("PORTAL_DB_USER","falcon")
PORTAL_DB_PASS = os.environ.get("PORTAL_DB_PASS","falcon")
PORTAL_DB_NAME = os.environ.get("PORTAL_DB_NAME","falcon_portal")
# alarm database
# TODO: read from api instead of db
ALARM_DB_HOST = os.environ.get("ALARM_DB_HOST","127.0.0.1")
ALARM_DB_PORT = int(os.environ.get("ALARM_DB_PORT",3306))
ALARM_DB_USER = os.environ.get("ALARM_DB_USER","root")
ALARM_DB_PASS = os.environ.get("ALARM_DB_PASS","")
ALARM_DB_NAME = os.environ.get("ALARM_DB_NAME","alarms")
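Because each setting is read through os.environ.get with a default, the values can also be overridden from the environment instead of editing the file. The sketch below reproduces that lookup pattern in isolation; portal_db_settings is a hypothetical helper, not part of the dashboard.

```python
import os


def portal_db_settings(env=None):
    """Resolve the portal DB settings the same way rrd/config.py does."""
    env = os.environ if env is None else env
    return {
        "host": env.get("PORTAL_DB_HOST", "127.0.0.1"),
        "port": int(env.get("PORTAL_DB_PORT", 3306)),
        "user": env.get("PORTAL_DB_USER", "falcon"),
        "password": env.get("PORTAL_DB_PASS", "falcon"),
        "name": env.get("PORTAL_DB_NAME", "falcon_portal"),
    }


# With an empty environment the defaults apply; a set variable wins.
defaults = portal_db_settings({})
overridden = portal_db_settings({"PORTAL_DB_USER": "root", "PORTAL_DB_PASS": "123456"})
```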
Start the service
$ bash control start
falcon-dashboard started..., pid=28399
View the log
$ bash control tail
[2019-03-22 16:29:05 +0000] [28399] [INFO] Starting gunicorn 19.9.0
[2019-03-22 16:29:05 +0000] [28399] [INFO] Listening at: http://0.0.0.0:8081 (28399)
[2019-03-22 16:29:05 +0000] [28399] [INFO] Using worker: sync
[2019-03-22 16:29:05 +0000] [28404] [INFO] Booting worker with pid: 28404
[2019-03-22 16:29:05 +0000] [28405] [INFO] Booting worker with pid: 28405
[2019-03-22 16:29:05 +0000] [28406] [INFO] Booting worker with pid: 28406
[2019-03-22 16:29:05 +0000] [28408] [INFO] Booting worker with pid: 28408
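Users must register manually. The first manually registered user named root becomes the super administrator.
To disable registration, set signup_disable to true in the api component's configuration file cfg.json, then restart the api service.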
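The change can be made by hand in an editor. As a scripted alternative, here is a minimal sketch that flips the flag in the JSON text; it assumes signup_disable is a top-level key of cfg.json, and disable_signup is a hypothetical helper name.

```python
import json


def disable_signup(cfg_text):
    """Return cfg.json text with signup_disable set to true."""
    cfg = json.loads(cfg_text)
    cfg["signup_disable"] = True  # assumed to be a top-level key
    return json.dumps(cfg, indent=4)


sample = '{"signup_disable": false}'  # made-up minimal config for the example
updated = disable_signup(sample)
```

Read the real file, pass its contents through the function, write the result back, and then restart the api service.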
Judge
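Judge evaluates alarm conditions. transfer forwards the data it receives from agents to both graph and judge.
judge exposes an HTTP endpoint, /count, which reports how much data the judge instance has processed.
For configuration, see the official documentation.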
$ ./open-falcon start judge
[falcon-judge] 1839
Alarm
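The alarm module handles alarm events. judge writes the events it generates to Redis, and alarm reads them from Redis and processes them.
The alerting logic lives in alarm, which can, for example, merge events. Alarms that have been sent are saved to MySQL and can be queried in the dashboard.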
$ ./open-falcon start alarm
[falcon-alarm] 2856
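Alert configuration
Falcon defines a uniform HTTP interface for the email, SMS, WeChat, and phone senders; each one receives the following data:
SMS sending HTTP interface:
method: post
params:
- content: SMS body
- tos: multiple phone numbers separated by commas
Mail sending HTTP interface:
method: post
params:
- content: mail body
- subject: mail subject
- tos: multiple email addresses separated by commas
IM sending HTTP interface:
method: post
params:
- content: IM message body
- tos: multiple IM account numbers separated by commas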
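A sender service therefore only has to accept a POST and read these parameters. The sketch below shows the decoding step, assuming the parameters arrive as a form-encoded body (an assumption; the notes do not state the encoding). parse_send_request is a hypothetical helper and the phone numbers are placeholders.

```python
from urllib.parse import parse_qs


def parse_send_request(body):
    """Decode a form-encoded POST body into content, subject and recipients."""
    form = parse_qs(body, keep_blank_values=True)
    tos = form.get("tos", [""])[0]
    return {
        "content": form.get("content", [""])[0],
        "subject": form.get("subject", [""])[0],  # only present for mail
        "tos": [t for t in tos.split(",") if t],  # comma-separated recipients
    }


req = parse_send_request("tos=13800000000,13900000000&content=cpu.idle+%3C+10")
```

The actual delivery (SMS gateway, SMTP, IM API) would then loop over the recipients in tos.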
HBS(Heartbeat Server)
Checks whether agents are alive
$ ./open-falcon start hbs
Nodata
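nodata detects anomalies in metric reporting and works together with the judge module. When a metric configured for nodata stops reporting, nodata sends a placeholder data point that triggers a judge alarm.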
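The idea can be sketched in a few lines. This is not the nodata module's implementation; the function name, the grace_steps threshold, and the shape of the placeholder point are assumptions chosen for the example.

```python
def nodata_point(last_ts, now, step, default_value, grace_steps=2):
    """Return a placeholder point if no data arrived recently, else None."""
    if now - last_ts <= step * grace_steps:
        return None  # data is still arriving on schedule
    # Align the placeholder timestamp to the reporting step.
    return {"timestamp": now - now % step, "value": default_value}
```

A judge strategy that alarms on the placeholder value (for example, value == -1) then fires when the metric goes silent.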
Aggregator
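Cluster aggregation module: it aggregates a given metric across all hosts in a cluster to provide a cluster-level view of monitoring.
There are various other auxiliary services as well.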