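Kapacitor is InfluxData's open-source data processing engine for processing, monitoring, and alerting on time series data. Tasks are defined in TICKscript. Kapacitor can process both stream and batch data from InfluxDB: it can periodically aggregate and process data from InfluxDB and write the results back to InfluxDB, or it can raise alerts (Email, HTTP, TCP, HipChat, OpsGenie, Alerta, Sensu, PagerDuty, Slack, and other channels are supported).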
1. Installation
After changing the data_dir parameter and the [logging] path in kapacitor.conf, the service failed to restart with the following error:
[XXXXXXXX kapacitor]# systemctl status kapacitor
● kapacitor.service - Time series data processing engine.
   Loaded: loaded (/usr/lib/systemd/system/kapacitor.service; disabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Fri 2019-01-29 15:17:42 CST; 312ms ago
     Docs: https://github.com/influxdb/kapacitor
  Process: 28842 ExecStart=/usr/bin/kapacitord -config /etc/kapacitor/kapacitor.conf $KAPACITOR_OPTS (code=exited, status=1/FAILURE)
 Main PID: 28842 (code=exited, status=1/FAILURE)

Jan 29 15:17:42 systemd[1]: Unit kapacitor.service entered failed state.
Jan 29 15:17:42 systemd[1]: kapacitor.service failed.
Jan 29 15:17:42 systemd[1]: kapacitor.service holdoff time over, scheduling restart.
Jan 29 15:17:42 systemd[1]: Stopped Time series data processing engine..
Jan 29 15:17:42 systemd[1]: start request repeated too quickly for kapacitor.service
Jan 29 15:17:42 systemd[1]: Failed to start Time series data processing engine..
Jan 29 15:17:42 systemd[1]: Unit kapacitor.service entered failed state.
Jan 29 15:17:42 systemd[1]: kapacitor.service failed.
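Inspecting the unit file /usr/lib/systemd/system/kapacitor.service shows that the service runs under the kapacitor account, which did not own the newly configured directories.
Solution: give the kapacitor account ownership of the new data and log directories:
chown -R kapacitor:kapacitor data
chown -R kapacitor:kapacitor logs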
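Before restarting, it can help to confirm that nothing under the new directories is still owned by another account. The following is a minimal sketch, not part of Kapacitor: list_not_owned_by is a hypothetical helper name, and the directory paths in the usage comment are placeholders for whatever data_dir and the log path were changed to.

```shell
# Hypothetical helper: print every entry under the given directories
# that is NOT owned by the expected user. Prints nothing when ownership is fine.
# Usage: list_not_owned_by USER DIR...
list_not_owned_by() {
    owner=$1
    shift
    find "$@" ! -user "$owner" -print
}

# Example (placeholder paths, adjust to the configured locations):
# list_not_owned_by kapacitor /path/to/data /path/to/logs
```

If the helper prints any path, that entry still needs the chown shown above.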
2. Error while debugging
The debugging command was:
kapacitor record stream -task cpu_alert -duration 60s
Error message:
failed to create recording file: open /var/lib/kapacitor/replay/119w1985-0101-120c-83b0-c9XXXXXXXXX.srpl: permission denied
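Check the permissions on the path named in the error; the replay directory has the same ownership problem as above.
Solution: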
chown -R kapacitor:kapacitor replay
3. Logs too large: adjust the log level
After one week in production, kapacitor.log had grown to 4 GB.
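One way to reduce the volume is to raise the logging threshold in kapacitor.conf. The sketch below assumes the stock configuration layout, where the [logging] section has a level key (the default file documents DEBUG, INFO, and ERROR); set_kapacitor_log_level is a hypothetical helper name, and it relies on GNU sed for the -i option.

```shell
# Hypothetical helper: set the `level` key inside the [logging] section
# of a kapacitor.conf-style file, leaving other sections untouched.
# Usage: set_kapacitor_log_level CONF_FILE LEVEL
set_kapacitor_log_level() {
    conf=$1
    level=$2
    # Restrict the substitution to the lines from [logging] up to the next
    # section header, then rewrite only the value of `level`.
    sed -i "/^\[logging\]/,/^\[/ s/^\([[:space:]]*level[[:space:]]*=[[:space:]]*\).*/\1\"$level\"/" "$conf"
}

# Example (assumes the config path shown in the unit file above):
# set_kapacitor_log_level /etc/kapacitor/kapacitor.conf ERROR
```

The new level only takes effect after the service is restarted, for example with systemctl restart kapacitor.service.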
4. Starting, stopping, and checking the service
Start the service
systemctl start kapacitor.service
Stop the service
systemctl stop kapacitor.service
Check the service status
systemctl status kapacitor.service
5. Some useful commands
List the deployed tasks
kapacitor list tasks
To change a task, edit its TICKscript file directly, then define the task again so the change is picked up.
For example, for the task with id cpu_alert:
kapacitor define cpu_alert -tick cpu_alert.tick
6. Batch tasks: note that record batch has no -duration option
For example:
kapacitor record batch -task XXXXX -duration 60s
The error message is:
flag provided but not defined: -duration
The usage text explains why: record batch takes -past or -start/-stop instead.
Usage: kapacitor record batch [options]

    Record the result of a InfluxDB query from a task.
    Prints the recording ID on exit.
    See 'kapacitor help replay' for how to replay a recording.

Examples:

    $ kapacitor record batch -task cpu_idle -start 2015-09-01T00:00:00Z -stop 2015-09-02T00:00:00Z

        This records the result of the query defined in task 'cpu_idle' and runs the query
        until the queries reaches the stop time, starting at time 'start' and incrementing
        by the schedule defined in the task.

    $ kapacitor record batch -task cpu_idle -past 10h

        This records the result of the query defined in task 'cpu_idle' and runs the query
        until the queries reaches the present time.
        The starting time for the queries is 'now - 10h' and increments by the schedule defined in the task.

Options:
  -no-wait
        Do not wait for the recording to finish.
  -past string
        Set start time via 'now - past'.
  -recording-id string
        The ID to give to this recording. If not set an random ID is chosen.
  -start string
        The start time for the set of queries.
  -stop string
        The stop time for the set of queries (default now).
  -task string
        The ID of a task. Uses the queries contained in the task.
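Because the two task types take different time options, a small wrapper can keep them straight. This is only a sketch: build_record_cmd is a hypothetical helper, it uses just the options that appear in this post (-duration for stream, -past for batch), and it prints the command rather than running it.

```shell
# Hypothetical helper: print the record command that matches the task type.
# Usage: build_record_cmd TYPE TASK_ID WINDOW
#   TYPE   is "stream" or "batch"
#   WINDOW is a duration such as 60s or 10h
build_record_cmd() {
    case "$1" in
        stream) echo "kapacitor record stream -task $2 -duration $3" ;;
        batch)  echo "kapacitor record batch -task $2 -past $3" ;;
        *)      echo "unknown task type: $1" >&2; return 1 ;;
    esac
}

# Example:
# build_record_cmd batch cpu_idle 10h
```

Printing the command first makes it easy to review before executing it against a production instance.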
References
https://docs.influxdata.com/kapacitor/v1.5/introduction/getting-started/#test-the-task