Added
SMS alert improvements
Added a post alert method, which can be used to send dynamic SMS messages.
```yaml
# Alert method
alert:
- "post"
http_post_url: "http://adminhome.jinhui365.cn/sendSmsForAppAlert"
# phoneList: phone numbers as a comma-separated string, for example "15235446827,15235446827"
# content: the SMS text; ${} marks dynamic content
http_post_static_payload:
  phoneList: "15235446827"
  content: "${@timestamp}, iOS ${l}-level log alert. Version ${v}, device model ${d}, log count ${num_hits}."
```
The dynamic content is taken from the alert email content; currently only top-level keys are supported.
For example, suppose the email content is:
```
@timestamp: 2018-01-16T02:38:08.340Z
@version: 1
_id: AWD81Lv0TU1e05j5IHmu
_index: ios-2018.01.16
_type: logs
arg: {
    "appkey": "jh28a4c4bc6734f58b",
    "branchNo": "88",
    "client": "iOS",
    "encrypt": 0,
    "fundAccount": "881125524",
    "signcode": "2FBDC7A4842E30B8F9ACD112261ECF28",
    "timestamp": "1504442929685",
    "token": "u4ftBaBm-xg=",
    "uid": "1861574",
    "version": "5.15.0"
}
c: iOS
d: iPhone 6s Plus
host: ubuntu
i: B6C1AB45-C3CC-4C60-BD10-259778C6699B
ip: 223.104.95.169 贵阳市 移动
l: warn
message: request failed:0 /receipt/list
n: WiFi
num_hits: 3182
num_matches: 3
o: 中国移动
p: com.jinhui365.iphone-pay
path: /data/node-dev-tools/logs/iOS.log
s: 10.3.1
t: 1504442928.55
uid: 1861574
v: 5.15.0
```
The content field can then reference those top-level keys:
```yaml
content: "Test, test, this is a test! message:${message},num_hits:${num_hits}"
```
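The substitution described above can be sketched in Python. This is only an illustration of the stated behaviour (replace `${key}` with a top-level field of the alert content); the article does not show how the post alerter implements it, and `render_content` and `PLACEHOLDER` are hypothetical names.

```python
import re

# Matches ${key}; the key may contain characters such as "@" (e.g. "@timestamp").
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def render_content(template, match):
    """Replace each ${key} with the top-level value from `match`.

    Keys that are not top-level fields are left untouched, so nested
    paths such as ${arg.version} stay visible in the rendered text.
    """
    def substitute(m):
        key = m.group(1)
        return str(match[key]) if key in match else m.group(0)

    return PLACEHOLDER.sub(substitute, template)


# A trimmed-down version of the email content shown in the article.
match = {
    "message": "request failed:0 /receipt/list",
    "num_hits": 3182,
    "arg": {"version": "5.15.0"},
}
print(render_content("message:${message},num_hits:${num_hits}", match))
```

A regular expression is used here rather than `string.Template` because keys such as `@timestamp` are not valid Python identifiers.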
Jira log alert records
```yaml
# Alert title
```
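Structure
Installation
- Install the Kibana plugin and ElastAlert Server by following the ElastAlert Server installation guide.
- Install Elastalert by following the ElastAlert official website.
Configuration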
Elastalert Server
```
{
```
For the full configuration, see the Elastalert Server configuration.
ElastAlert
```yaml
es_host: 10.0.0.219
```
For the full configuration, see the Elastalert configuration.
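Usage guide
How to open it
- The feature is embedded in Kibana. Open Kibana at http://10.0.0.219:5601; to the right of Settings in the top-left corner there is an expand button that lets you enter the elastalert feature.
- Alternatively, go directly to http://10.0.0.219:5601/app/elastalert.
Basic operations
- + New Rule adds a new rule.
- Click a rule to delete or modify it.
- The right side of the rule page lists some templates that you can click to display.
- On the rule page, the exit button is at the top left, and [Test] and [Save] are at the top right; once a test finishes, its output appears on the right.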
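Notes
1. Test a new rule before saving it; saving directly may stop monitoring if the rule contains an error.
2. If you start adding a new rule and decide not to continue, remember to delete that rule from the directory page.
3. If a mistake stops monitoring, delete the faulty rule file, then visit http://10.0.0.219:3030/status/control/start to restart monitoring, and check the monitoring state at http://10.0.0.219:3030/status.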
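The recovery step in note 3 can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and because the article does not show the response format of `/status`, the body is returned as raw text.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "http://10.0.0.219:3030"  # ElastAlert Server address used in this article


def control_url(action, base_url=BASE_URL):
    """Build the /status/control/:action URL; only start and stop are valid."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action!r}")
    return urljoin(base_url, f"/status/control/{action}")


def status_url(base_url=BASE_URL):
    """Build the /status URL."""
    return urljoin(base_url, "/status")


def restart_and_check(fetch=None):
    """Request a start, then return the raw body of /status.

    `fetch` is injectable so the flow can be exercised without a server.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url, timeout=10).read().decode("utf-8")
    fetch(control_url("start"))
    return fetch(status_url())
```

Calling `restart_and_check()` with no arguments issues the two GET requests against the server; delete the faulty rule file first, as the note says.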
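ElastAlert configuration and rule reference
ElastAlert configuration
Parameter | Description | Notes |
---|---|---|
es_host | Host address of Elasticsearch | |
es_port | Port of Elasticsearch | Default is 9200 |
rules_folder | Name of the folder containing the rule files | |
run_every | How often ElastAlert sends a query to Elasticsearch | |
buffer_time | Size of the time window, on the time field, covered by each query | |
realert | Period after an alert during which further alerts are suppressed | Default is 1 minute; can be set to 0 |
query_delay | A delay subtracted from the time of each query | |
writeback_index | The Elasticsearch index in which ElastAlert stores its own logs | |
alert_time_limit | Time window for retrying failed alerts | |
es_send_get_body_as | Request method used to query Elasticsearch | Default is GET |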
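ElastAlert rules
Parameter | Description | Notes |
---|---|---|
name | Rule name | English only; must not contain Chinese characters |
type | The check type of the alert rule | |
alert | Alert method | |
index | Index to monitor | |
filter | Query conditions | |
realert | Alert only once within the given period | |
If the alert methods include email, the mailbox that receives the mail | ||
aggregation | Aggregates matches so that alerts accumulated over a period are reported together. A schedule can also be used to send all alerts from that period at set times | Consider whether to use it |
import | Pulls in shared settings | Once there are many rules, consider extracting the common parts |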
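To make the realert behaviour in the table concrete, here is a small Python sketch of "alert only once within the given period". It illustrates the idea only and is not ElastAlert's own code; `RealertGate` is a hypothetical name.

```python
from datetime import datetime, timedelta


class RealertGate:
    """Allow at most one alert per key within the `realert` window."""

    def __init__(self, realert):
        self.realert = realert
        self._silenced_until = {}

    def should_alert(self, key, now):
        until = self._silenced_until.get(key)
        if until is not None and now < until:
            return False  # still inside the silence window
        # Alert now and silence this key for the next `realert` period.
        self._silenced_until[key] = now + self.realert
        return True


gate = RealertGate(timedelta(minutes=5))
start = datetime(2018, 1, 16, 9, 0, 0)
print(gate.should_alert("ios-warn", start))
print(gate.should_alert("ios-warn", start + timedelta(minutes=1)))
```

With `realert` set to zero, the gate never suppresses anything, which matches the note in the configuration table that the value can be set to 0.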
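Alert types
- any: alert on every match;
- blacklist: the value of the compare_key field matches any entry in the blacklist array;
- whitelist: the value of the compare_key field matches none of the entries in the whitelist array;
- frequency: for the same query_key, num_events filtered events occur within timeframe;
- change: for the same query_key, the value of the compare_key field changes within timeframe;
- spike: for the same query_key, the volume of data in two consecutive timeframe windows differs by a ratio greater than spike_height. spike_type sets the direction to up, down, or both. threshold_ref sets a lower bound on the volume in the previous window and threshold_cur a lower bound on the volume in the current window; if the volume is below the bound, the rule does not trigger;
- flatline: the volume of data within timeframe is below threshold;
- new_term: a value appears in fields that is not among the terms_size (default 50) most common values seen in the preceding terms_window_size (default 30 days);
- cardinality: for the same query_key, the number of distinct values of cardinality_field within timeframe exceeds max_cardinality or falls below min_cardinality.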
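As an illustration of the frequency type, the sketch below counts events in a sliding window over in-memory timestamps. It is a simplified model for a single query_key, not ElastAlert's implementation; `frequency_matches` is a hypothetical name.

```python
from collections import deque
from datetime import datetime, timedelta


def frequency_matches(timestamps, num_events, timeframe):
    """Return the timestamps at which `num_events` events fall within `timeframe`."""
    window = deque()
    matches = []
    for ts in sorted(timestamps):
        window.append(ts)
        # Drop events that have fallen out of the window ending at ts.
        while ts - window[0] > timeframe:
            window.popleft()
        if len(window) >= num_events:
            matches.append(ts)
            window.clear()  # start counting afresh after a match
    return matches


base = datetime(2018, 1, 16, 9, 0, 0)
events = [base + timedelta(seconds=s) for s in (0, 10, 20, 200, 210)]
print(frequency_matches(events, num_events=3, timeframe=timedelta(minutes=1)))
```

Supporting query_key would mean keeping one such window per key value, for example in a dictionary keyed by that value.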
Alert methods
- Command
- jira
- post
For how to write concrete rules, see the rule templates later in this article.
ElastAlert
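Query types
query_string
Query
```yaml
filter:
- query_string:
    query: "username: bob"
```
The query_string type follows the same query rules as Lucene; see Lucene Query for details.
You can also convert the JSON shown in Kibana into YAML and use it as the query.
term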
Exact match on a key-value pair
terms
Match a key against multiple values
```yaml
filter:
- terms:
    field: ["value1", "value2"]
```
wildcard
Standard shell wildcards
range
Range
```yaml
filter:
- range:
    status_code:
      from: 500
      to: 599
```
Negation, and, or
And, or, not
```yaml
filter:
- or:
    - term:
        field: "value"
    - wildcard:
        field: "foo*bar"
    - and:
        - not:
            term:
              field: "value"
        - not:
            term:
              _type: "something"
```
All of the rules above are described in detail in the ElastAlert Filters documentation.
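To show how the nested and/or/not filters combine, here is a small Python evaluator that applies the same structure to an in-memory document. It only models the boolean logic; in practice Elasticsearch evaluates the filters, and its matching on analyzed fields and its wildcard syntax can differ from this sketch. The function name `matches` is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(filt, doc):
    """Evaluate one filter node (term, terms, wildcard, range, and, or, not)."""
    (kind, body), = filt.items()  # each node is a single-key mapping
    if kind == "term":
        (field, value), = body.items()
        return doc.get(field) == value
    if kind == "terms":
        (field, values), = body.items()
        return doc.get(field) in values
    if kind == "wildcard":
        (field, pattern), = body.items()
        return fnmatchcase(str(doc.get(field, "")), pattern)
    if kind == "range":
        (field, bounds), = body.items()
        value = doc.get(field)
        return value is not None and bounds["from"] <= value <= bounds["to"]
    if kind == "and":
        return all(matches(f, doc) for f in body)
    if kind == "or":
        return any(matches(f, doc) for f in body)
    if kind == "not":
        return not matches(body, doc)
    raise ValueError(f"unsupported filter: {kind}")


# The and/or/not example from the article, written as Python data.
rule_filter = {"or": [
    {"term": {"field": "value"}},
    {"wildcard": {"field": "foo*bar"}},
    {"and": [
        {"not": {"term": {"field": "value"}}},
        {"not": {"term": {"_type": "something"}}},
    ]},
]}
print(matches(rule_filter, {"field": "other", "_type": "logs"}))
```

Reading the example this way makes the intent visible: a document matches if `field` equals `value`, or `field` fits `foo*bar`, or neither `field` is `value` nor `_type` is `something`.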
Rule templates
Complex query_string query
```yaml
# Time standard
```
And/or/not query
```yaml
# Time standard
```
ElastAlert Server API
This server exposes the following REST APIs:
- **GET `/`**
Exposes the current version running.
- **GET `/status`**
Returns either ‘SETUP’, ‘READY’, ‘ERROR’, ‘STARTING’, ‘CLOSING’, ‘FIRST_RUN’ or ‘IDLE’ depending on the current ElastAlert process status.
- **GET `/status/control/:action`**
Where `:action` can be either ‘start’ or ‘stop’, which will respectively start or stop the current ElastAlert process.
- **[WIP] GET `/status/errors`**
When `/status` returns ‘ERROR’ this returns a list of errors that were triggered.
- **GET `/rules`**
Returns a list of directories and rules that exist in the `rulesPath` (from the config) and are being run by the ElastAlert process.
- **GET `/rules/:id`**
Where `:id` is the id of the rule returned by GET `/rules`, which will return the file contents of that rule.
- **POST `/rules/:id`**
Where `:id` is the id of the rule returned by GET `/rules`, which will allow you to edit the rule. The body sent should be:
```javascript
{
  // Required - The full yaml rule config.
  "yaml": "..."
}
```
- **DELETE `/rules/:id`**
Where `:id` is the id of the rule returned by GET `/rules`, which will delete the given rule.
- **GET `/templates`**
Returns a list of directories and templates that exist in the `templatesPath` (from the config) and are being run by the ElastAlert process.
- **GET `/templates/:id`**
Where `:id` is the id of the template returned by GET `/templates`, which will return the file contents of that template.
- **POST `/templates/:id`**
Where `:id` is the id of the template returned by GET `/templates`, which will allow you to edit the template. The body sent should be:
```javascript
{
  // Required - The full yaml template config.
  "yaml": "..."
}
```
- **DELETE `/templates/:id`**
Where `:id` is the id of the template returned by GET `/templates`, which will delete the given template.
- **POST `/test`**
This allows you to test a rule. The body sent should be:
```javascript
{
  // Required - The full yaml rule config.
  "rule": "...",
  // Optional - The options to use for testing the rule.
  "options": {
    // Can be either "all", "schemaOnly" or "countOnly". "all" will give the full console output.
    // "schemaOnly" will only validate the yaml config. "countOnly" will only find the number of matching documents and list available fields.
    "testType": "all",
    // Can be any number larger than 0 and this tells ElastAlert over a period of how many days the test should be run
    "days": "1",
    // Whether to send real alerts
    "alert": false
  }
}
```
- **[WIP] GET `/config`**
Gets the ElastAlert configuration from `config.yaml` in `elastalertPath` (from the config).
- **[WIP] POST `/config`**
Allows you to edit the ElastAlert configuration from `config.yaml` in `elastalertPath` (from the config). The required body to be sent will be edited when the work on this API is done.
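The request bodies for POST `/test` and POST `/rules/:id` can be assembled in Python before sending. The sketch below only builds the JSON strings; the helper names are hypothetical, and the field names and allowed values are taken from the API description above. The output is strict JSON, so it carries no comments.

```python
import json


def build_test_request(rule_yaml, test_type="all", days=1, alert=False):
    """Serialize the body for POST /test."""
    if test_type not in ("all", "schemaOnly", "countOnly"):
        raise ValueError(f"unsupported testType: {test_type!r}")
    if days <= 0:
        raise ValueError("days must be larger than 0")
    body = {
        "rule": rule_yaml,
        "options": {
            "testType": test_type,
            "days": str(days),  # the documented example passes days as a string
            "alert": alert,     # False avoids sending real alerts during a test
        },
    }
    return json.dumps(body)


def build_rule_update(rule_yaml):
    """Serialize the body for POST /rules/:id or POST /templates/:id."""
    return json.dumps({"yaml": rule_yaml})


print(build_test_request("name: demo\ntype: any\n", test_type="schemaOnly"))
```

Letting `json.dumps` do the escaping avoids hand-writing newlines and quotes when the rule YAML is embedded in the `rule` or `yaml` string.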
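## ElastAlert monitoring pattern
ElastAlert polls at the frequency set by run_every in the config, and each query covers a time block of at most buffer_time. The basic query pattern is as follows.
Configuration: run_every: 20s, buffer_time: 1min. Monitoring starts at 9:00 on January 17.
Current time | Log time block |
---|---|
9:00:00 | 8:59:00~9:00:00 |
9:00:20 | 9:00:00~9:00:20 |
9:00:40 | 9:00:00~9:00:40 |
9:01:00 | 9:00:00~9:01:00 |
9:01:20 | 9:01:00~9:01:20 |
9:01:40 | 9:01:00~9:01:40 |
9:02:00 | 9:01:00~9:02:00 |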
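The pattern in the table can be reproduced with a few lines of Python. This sketch models the observed behaviour listed above (the block start advances in buffer_time steps from the launch time); it is not taken from ElastAlert's source, and `query_window` is a hypothetical name.

```python
from datetime import datetime, timedelta


def query_window(start, now, buffer_time):
    """Return the (from, to) block queried at `now`, per the table above."""
    elapsed = now - start
    # Number of buffer_time blocks needed to cover `elapsed`, rounded up.
    blocks = -(-elapsed // buffer_time)
    window_start = start + (blocks - 1) * buffer_time
    return window_start, now


start = datetime(2018, 1, 17, 9, 0, 0)
run_every = timedelta(seconds=20)
buffer_time = timedelta(minutes=1)
for tick in range(7):
    now = start + tick * run_every
    begin, end = query_window(start, now, buffer_time)
    print(now.time(), f"{begin.time()}~{end.time()}")
```

Dividing one `timedelta` by another with `//` yields an integer, so the rounding-up step needs no conversion to seconds.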
Links
- Bitsensor blog
- ElastAlert Kibana Plugin
- Elastalert Server
- Elastalert Github
- ElastAlert official website