Building the ELK Stack and a Deep Dive into Logstash
About ELK
1. What is ELK
ELK is a set of three pieces of software:
1. E: Elasticsearch   Java program   stores and queries logs
2. L: Logstash        Java program   collects and filters logs
3. K: Kibana          Java program   provides a web UI that visualizes the data
4. F: Filebeat        Go program     collects and filters logs (a common companion to the three)
2. What ELK does
1. Collect:   gather logs from all servers
2. Transport: ship logs reliably to ES or other destinations
3. Store:     ES stores log data quickly and efficiently
4. Analyze:   analyze the data through a web UI
5. Monitor:   monitor the cluster architecture
3. Advantages of ELK
1. Flexible processing: Elasticsearch provides real-time full-text indexing with powerful search.
2. Relatively simple configuration: Elasticsearch exposes everything over a JSON interface, Logstash uses modular configuration, and Kibana's configuration file is simpler still.
3. Efficient retrieval: thanks to its design, queries run in real time yet can return results in seconds even over tens of billions of documents.
4. Linear cluster scaling: both Elasticsearch and Logstash scale out linearly.
5. Polished front end: Kibana's UI looks good and is easy to operate.
4. Why use ELK
# Collect all logs
web server logs
business service logs
system logs
# Statistics and analysis:
1. total number of visits
2. top 10 client IPs by visit count
3. most-visited URLs on the site
4. the three values above for the morning
5. the three values above for the afternoon
6. compare morning vs. afternoon traffic
7. compare daily user growth or decline over the past week
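The per-IP and per-URL statistics above can be prototyped directly against a raw access log in shell, before any ELK component is installed. A minimal sketch, using a made-up three-line sample log in the common combined format:

```shell
# Write a tiny sample access log (the entries are fabricated for illustration).
cat > /tmp/sample_access.log <<'EOF'
10.0.0.7 - - [13/Aug/2020:10:01:02 +0800] "GET /index.html HTTP/1.1" 200 612
10.0.0.7 - - [13/Aug/2020:10:01:03 +0800] "GET /login HTTP/1.1" 200 1024
10.0.0.9 - - [13/Aug/2020:10:02:11 +0800] "GET /index.html HTTP/1.1" 200 612
EOF

# 1. Total number of requests (PV)
wc -l < /tmp/sample_access.log

# 2. Top 10 client IPs by request count ($1 is the client IP)
awk '{print $1}' /tmp/sample_access.log | sort | uniq -c | sort -rn | head -10

# 3. Top 10 requested URLs ($7 is the request path)
awk '{print $7}' /tmp/sample_access.log | sort | uniq -c | sort -rn | head -10
```

ELK performs the same kind of aggregation, but continuously, across many servers, and with a browsable UI.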
Building ELK
Overall architecture:
IP address | Services installed | Minimum RAM |
---|---|---|
172.16.1.51 | Elasticsearch Logstash Kibana | 3G |
172.16.1.52 | Elasticsearch Logstash | 2G |
172.16.1.53 | Elasticsearch Logstash | 2G |
# Main configuration files:
/etc/elasticsearch/elasticsearch.yml
/etc/kibana/kibana.yml
/etc/logstash/logstash.yml (rarely used directly)
/etc/logstash/conf.d/*.conf (the usual way to configure pipelines)
# Main program files
/usr/lib/systemd/system/elasticsearch.service
/usr/share/logstash/bin/logstash
/etc/systemd/system/kibana.service
#1. Before installing, use the same version for all three ELK components to avoid needless incompatibilities.
#2. Before installing, keep the environment of the three hosts as uniform as possible (character set, etc.).
#3. Before installing, synchronize the clocks so all hosts agree on the time.
1. Building the Elasticsearch cluster
1) ES setup
1. Synchronize server time (important)
[root@db01 ~]# yum install -y ntpdate
[root@db01 ~]# ntpdate time1.aliyun.com
2. Install the Java environment
# Upload
[root@db01 ~]# rz jdk-8u181-linux-x64.rpm
# Install
[root@db01 ~]# rpm -ivh jdk-8u181-linux-x64.rpm
3. Install ES
1. Upload or download the package
[root@db01 ~]# rz elasticsearch-6.6.0.rpm
# Download: https://www.elastic.co/downloads/elasticsearch
2. Install
[root@db01 ~]# rpm -ivh elasticsearch-6.6.0.rpm
3. Follow the post-install prompts
[root@db01 ~]# systemctl daemon-reload
[root@db01 ~]# systemctl enable elasticsearch.service
Created symlink from /etc/systemd/system/multi-user.target.wants/elasticsearch.service to /usr/lib/systemd/system/elasticsearch.service.
[root@db01 ~]# systemctl start elasticsearch.service
Startup may fail when memory locking is enabled; the fix:
# Allow unlimited locked memory in the unit file
[root@db01 ~]# vim /usr/lib/systemd/system/elasticsearch.service
[Service]
... ...
LimitMEMLOCK=infinity
# Start ES again
[root@db01 ~]# systemctl daemon-reload
[root@db01 ~]# systemctl start elasticsearch.service
4. Verify
[root@db01 ~]# netstat -lntp
tcp6 0 0 127.0.0.1:9200 :::* LISTEN 20040/java
tcp6 0 0 127.0.0.1:9300 :::* LISTEN 20040/java
2) ES cluster setup notes
1. For node discovery you do not need to list every node's IP in the configuration; the local node's IP plus the IP of any one other cluster member is enough:
on .52: discovery.zen.ping.unicast.hosts: ["10.0.0.51", "10.0.0.52"]
on .53: discovery.zen.ping.unicast.hosts: ["10.0.0.51", "10.0.0.53"]
2. The number of master-eligible nodes required for an election must be (number of nodes / 2) + 1:
discovery.zen.minimum_master_nodes: 2
3. ES defaults to 5 primary shards and 1 replica per index; once an index is created the primary shard count cannot be changed, but the replica count can.
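The quorum rule in point 2 is just integer division plus one; a quick shell sketch to sanity-check the value before writing it into the config:

```shell
# Master-election quorum: (master-eligible nodes / 2) + 1, using integer
# division. A majority requirement like this is what prevents split brain.
nodes=3
echo $(( nodes / 2 + 1 ))   # prints 2 for a 3-node cluster
```

For 4 or 5 nodes the formula gives 3, so a partitioned minority can never elect its own master.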
3) ES cluster configuration
[root@db01 ~]# vim /etc/elasticsearch/elasticsearch.yml
# Cluster name
cluster.name: my-application
# Node name
node.name: node-1
# Data directory
path.data: /service/es/data
# Log directory
path.logs: /service/es/logs
# Enable memory locking
bootstrap.memory_lock: true
# Addresses ES listens on
network.host: 10.0.0.51,127.0.0.1
# HTTP port
http.port: 9200
# Cluster transport port
transport.tcp.port: 9300
transport.tcp.compress: true
# Cluster member addresses; hostnames such as els or els.shuaiguoxia.com also
# work, as long as every node can resolve them
discovery.zen.ping.unicast.hosts: ["192.168.60.201", "192.168.60.202", "192.168.60.203"]
# Master-election quorum; to avoid split brain this must be at least half the nodes + 1
discovery.zen.minimum_master_nodes: 2
# Full configuration
[root@db01 ~]# grep "^[a-z]" /etc/elasticsearch/elasticsearch.yml
cluster.name: my-application
node.name: node-1
path.data: /service/es/data
path.logs: /service/es/logs
bootstrap.memory_lock: true
network.host: 10.0.0.51,172.16.1.51,127.0.0.1
http.port: 9200
# Cluster settings
transport.tcp.port: 9300
transport.tcp.compress: true
discovery.zen.ping.unicast.hosts: ["172.16.1.51", "172.16.1.52", "172.16.1.53"]
discovery.zen.minimum_master_nodes: 2
-----------------------
Apply the same configuration on the 52 and 53 machines (changing node.name on each).
2. Installing Logstash
1) Install the Java environment
1. Upload the Java package
2. Install the Java environment
2) Synchronize time
[root@web01 ~]# ntpdate time1.aliyun.com
3) Install Logstash
1. Upload the package
[root@web01 ~]# rz logstash-6.6.0.rpm
2. Install
[root@web01 ~]# rpm -ivh logstash-6.6.0.rpm
3. Fix ownership
[root@web01 ~]# chown -R logstash.logstash /usr/share/logstash/
# Main executable
/usr/share/logstash/bin/logstash
3. Using Logstash
1) Input and output plugins
INPUT and OUTPUT plugins
INPUT: plugins that let Logstash collect logs from a given source
OUTPUT: plugins that send event data to a given destination
INPUT event sources | OUTPUT destinations | CODEC codecs |
---|---|---|
azure_event_hubs (Azure Event Hubs) | elasticsearch (search-engine database) | avro (data serialization) |
beats (Filebeat log shipper) | email | CEF (Common Event Format) |
elasticsearch (search-engine database) | file | es_bulk (ES bulk API) |
file | http (HTTP protocol) | json (serialized, formatted data) |
generator | kafka (Java-based message queue) | json_lines (newline-delimited JSON) |
heartbeat (high-availability software) | rabbitmq (message queue, used by OpenStack) | line |
http_poller (polls an HTTP API) | redis (cache / message queue / NoSQL) | multiline (multi-line matching) |
jdbc (Java database driver) | s3* (object storage) | plain (plain text, no delimiter between events) |
kafka (Java-based message queue) | stdout (standard output) | rubydebug (Ruby-style formatting) |
rabbitmq (message queue, used by OpenStack) | tcp (TCP protocol) | |
redis (cache / message queue / NoSQL) | udp (UDP protocol) | |
s3* (object storage) | ||
stdin (standard input) | ||
syslog (system log) | ||
tcp (TCP protocol) | ||
udp (UDP protocol) |
2) Logstash input/output test
# Add Logstash to the PATH
[root@web01 ~]# vim /etc/profile.d/logstash.sh
export PATH=/usr/share/logstash/bin/:$PATH
[root@web01 ~]# source /etc/profile
# Test: collect standard input and write to standard output
[root@web01 ~]# logstash -e 'input { stdin {} } output { stdout {} }'
# Type some input
123456
{
# timestamp
"@timestamp" => 2020-08-13T01:34:24.430Z,
# host
"host" => "web01",
# version
"@version" => "1",
# content
"message" => "123456"
}
# Collect stdin to stdout in a specific format: the rubydebug codec
[root@web01 ~]# logstash -e 'input { stdin {} } output { stdout { codec => rubydebug } }'
123456
{
"message" => "123456",
"@version" => "1",
"@timestamp" => 2020-08-13T01:39:40.837Z,
"host" => "web01"
}
3) Collecting standard input to a file
# Collect stdin to a file
[root@web01 ~]# logstash -e 'input { stdin {} } output { file { path => "/tmp/test.txt" } }'
# Collect stdin to a date-stamped file
[root@web01 ~]# logstash -e 'input { stdin {} } output { file { path => "/tmp/test_%{+YYYY-MM-dd}.txt" } }'
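Logstash expands `%{+YYYY-MM-dd}` from each event's `@timestamp`, so the second command rolls output into one file per day. The name it would produce today can be previewed in plain shell:

```shell
# Preview the file name the %{+YYYY-MM-dd} pattern yields today.
# Note: Logstash formats @timestamp in UTC, so near midnight this
# local-time preview can differ from Logstash's choice by one day.
ts=$(date +%Y-%m-%d)
echo "/tmp/test_${ts}.txt"
```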
4) Collecting standard input to ES
# Collect stdin to ES; hosts => [] may list several cluster addresses
[root@web01 ~]# logstash -e 'input { stdin {} } output { elasticsearch { hosts => ["10.0.0.51:9200"] index => "test_%{+YYYY-MM-dd}" } }'
# Type some input
# Check in the web UI
4. Setting up Kibana
1) Install Kibana
# Upload the package
[root@db01 ~]# rz kibana-6.6.0-x86_64.rpm
# Install
[root@db01 ~]# rpm -ivh kibana-6.6.0-x86_64.rpm
2) Configure Kibana
[root@db01 ~]# vim /etc/kibana/kibana.yml
[root@db01 ~]# grep "^[a-z]" /etc/kibana/kibana.yml
# Port the process listens on
server.port: 5601
# Listen address
server.host: "10.0.0.51"
# ES address
elasticsearch.hosts: ["http://10.0.0.51:9200"]
# Kibana also creates an index of its own
kibana.index: ".kibana"
3) Start Kibana
[root@db01 ~]# systemctl start kibana.service
# Verify
[root@db01 ~]# netstat -lntp
tcp 0 0 10.0.0.51:5601 0.0.0.0:* LISTEN 88636/node
4) Visit the page
http://10.0.0.51:5601
1. time range area
2. log list area
3. search area
4. data display area
Using Logstash
Logstash is an open-source data collection engine. It scales horizontally, has more plugins than any other ELK component, and can receive data from many different sources and deliver it to one or more destinations.
1. Logstash configuration files
# Default configuration file
/etc/logstash/logstash.yml
# Rarely edited; it holds the settings used when Logstash runs under systemd
/etc/logstash/conf.d/*.conf
# The usual way to configure log collection;
these files use Logstash's own pipeline configuration syntax (logstash.yml itself is YAML)
2. Collecting logs from a file to a file
1) Configuration
[root@web01 ~]# vim /etc/logstash/conf.d/message_file.conf
input {
  file {
    path => "/var/log/messages"
    start_position => "beginning"
  }
}
output {
  file {
    path => "/tmp/message_file_%{+YYYY-MM-dd}.log"
  }
}
2) Start
# Test the configuration syntax
[root@web01 ~]# logstash -f /etc/logstash/conf.d/message_file.conf -t
# Start
[root@web01 ~]# logstash -f /etc/logstash/conf.d/message_file.conf
3) Check that the file was created
[root@web01 tmp]# ll
total 4
-rw-r--r-- 1 root root 1050 Aug 13 11:24 message_file_2020-08-13.log
3. Collecting logs from a file to ES
1) Configuration
[root@web01 ~]# vim /etc/logstash/conf.d/message_es.conf
input {
  file {
    path => "/var/log/messages"
    start_position => "beginning"
  }
}
output {
  elasticsearch {
    hosts => ["10.0.0.51:9200","10.0.0.52:9200","10.0.0.53:9200"]
    index => "message_es_%{+YYYY-MM-dd}"
  }
}
2) Start
[root@web01 ~]# logstash -f /etc/logstash/conf.d/message_es.conf
3) Check the web UI
4. Running multiple Logstash instances
1) Create a data directory per instance
'Logstash keeps its own working data in a single directory by default,
so starting it twice with plain logstash -f /etc/logstash/conf.d/aa.conf fails;
each instance must be given its own data directory.'
[root@web01 ~]# mkdir /data/logstash/{message_file,message_es} -p
# Fix ownership
[root@web01 ~]# chown -R logstash.logstash /data/
2) Start each instance with its own data directory
[root@web01 ~]# logstash -f /etc/logstash/conf.d/message_es.conf --path.data=/data/logstash/message_es &
[1] 18693
[root@web01 ~]# logstash -f /etc/logstash/conf.d/message_file.conf --path.data=/data/logstash/message_file &
[2] 18747
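End to end, the per-instance directories and launch commands can be sketched as follows. This sketch uses a temporary base directory so it is self-contained; the section itself uses /data/logstash, and the launch lines are left as comments since they require Logstash to be installed:

```shell
# Each Logstash instance needs a private --path.data directory.
BASE=$(mktemp -d)                     # stand-in for /data/logstash
mkdir -p "$BASE/message_file" "$BASE/message_es"
ls "$BASE"

# The two instances would then be started as:
# logstash -f /etc/logstash/conf.d/message_es.conf   --path.data="$BASE/message_es" &
# logstash -f /etc/logstash/conf.d/message_file.conf --path.data="$BASE/message_file" &
```

Without distinct --path.data values, the second instance refuses to start because the first already holds the lock on the shared data directory.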
3) Verify
Check the output file and the ES web UI.
5. One Logstash instance collecting multiple logs
1) Configuration
[root@web01 ~]# vim /etc/logstash/conf.d/more_file.conf
input {
  file {
    type => "messages_log"
    path => "/var/log/messages"
    start_position => "beginning"
  }
  file {
    type => "secure_log"
    path => "/var/log/secure"
    start_position => "beginning"
  }
}
output {
  if [type] == "messages_log" {
    file {
      path => "/tmp/messages_%{+YYYY-MM-dd}"
    }
  }
  if [type] == "secure_log" {
    file {
      path => "/tmp/secure_%{+YYYY-MM-dd}"
    }
  }
}
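The `if [type]` conditionals route each event stream to its own dated file. The same branching logic can be mimicked in plain shell; the `_demo` file names below are made up so they do not collide with the files Logstash itself writes:

```shell
# Route lines of the form "<type> <message>" to a per-type file,
# mirroring the if [type] == ... conditionals in the output block above.
day=$(date +%Y-%m-%d)
printf '%s\n' "messages_log kernel: demo event" "secure_log sshd: demo login" |
while read -r type msg; do
  case "$type" in
    messages_log) echo "$msg" >> "/tmp/messages_demo_${day}" ;;
    secure_log)   echo "$msg" >> "/tmp/secure_demo_${day}"   ;;
  esac
done
cat "/tmp/messages_demo_${day}"
```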
Example: collect both nginx and tomcat logs, ship them to Elasticsearch, and display them in Kibana
1. Create the data directory
[root@db01 ~]# mkdir -p /data/logstash/tom_ngi
2. Fix ownership
[root@db01 ~]# chown -R logstash.logstash /data/
3. Edit the configuration file
[root@db01 ~]# vim /etc/logstash/conf.d/ngi_tom_es.conf
-----
input {
  file {
    type => "nginx_log"
    path => "/var/log/nginx/access.log"
    start_position => "beginning"
  }
  file {
    type => "tomcat_log"
    path => "/usr/local/tomcat/logs/catalina.out"
    start_position => "beginning"
  }
}
output {
  if [type] == "nginx_log" {
    elasticsearch {
      hosts => ["10.0.0.51:9200","10.0.0.52:9200","10.0.0.53:9200"]
      index => "nginx_es_%{+YYYY-MM-dd}"
    }
  }
  if [type] == "tomcat_log" {
    elasticsearch {
      hosts => ["10.0.0.51:9200","10.0.0.52:9200","10.0.0.53:9200"]
      index => "tomcat_es_%{+YYYY-MM-dd}"
    }
  }
}
-------
4. Start Logstash
[root@db01 ~]# logstash -f /etc/logstash/conf.d/ngi_tom_es.conf --path.data=/data/logstash/tom_ngi &
5. Visit http://10.0.0.51:5601 and check the data in Kibana
Appendix: logstash command-line options:
Usage:
bin/logstash [OPTIONS]
Options:
-n, --node.name NAME Specify the name of this logstash instance, if no value is given
it will default to the current hostname.
(default: "db01")
-f, --path.config CONFIG_PATH Load the logstash config from a specific file
or directory. If a directory is given, all
files in that directory will be concatenated
in lexicographical order and then parsed as a
single config file. You can also specify
wildcards (globs) and any matched files will
be loaded in the order described above.
-e, --config.string CONFIG_STRING Use the given string as the configuration
data. Same syntax as the config file. If no
input is specified, then the following is
used as the default input:
"input { stdin { type => stdin } }"
and if no output is specified, then the
following is used as the default output:
"output { stdout { codec => rubydebug } }"
If you wish to use both defaults, please use
the empty string for the '-e' flag.
(default: nil)
--field-reference-parser MODE Use the given MODE when parsing field
references.
The field reference parser is used to expand
field references in your pipeline configs,
and will be becoming more strict to better
handle illegal and ambiguous inputs in a
future release of Logstash.
Available MODEs are:
- `LEGACY`: parse with the legacy parser,
which is known to handle ambiguous- and
illegal-syntax in surprising ways;
warnings will not be emitted.
- `COMPAT`: warn once for each distinct
ambiguous- or illegal-syntax input, but
continue to expand field references with
the legacy parser.
- `STRICT`: parse in a strict manner; when
given ambiguous- or illegal-syntax input,
raises a runtime exception that should
be handled by the calling plugin.
The MODE can also be set with
`config.field_reference.parser`
(default: "COMPAT")
--modules MODULES Load Logstash modules.
Modules can be defined using multiple instances
'--modules module1 --modules module2',
or comma-separated syntax
'--modules=module1,module2'
Cannot be used in conjunction with '-e' or '-f'
Use of '--modules' will override modules declared
in the 'logstash.yml' file.
-M, --modules.variable MODULES_VARIABLE Load variables for module template.
Multiple instances of '-M' or
'--modules.variable' are supported.
Ignored if '--modules' flag is not used.
Should be in the format of
'-M "MODULE_NAME.var.PLUGIN_TYPE.PLUGIN_NAME.VARIABLE_NAME=VALUE"'
as in
'-M "example.var.filter.mutate.fieldname=fieldvalue"'
--setup Load index template into Elasticsearch, and saved searches,
index-pattern, visualizations, and dashboards into Kibana when
running modules.
(default: false)
--cloud.id CLOUD_ID Sets the elasticsearch and kibana host settings for
module connections in Elastic Cloud.
Your Elastic Cloud User interface or the Cloud support
team should provide this.
Add an optional label prefix '<label>:' to help you
identify multiple cloud.ids.
e.g. 'staging:dXMtZWFzdC0xLmF3cy5mb3VuZC5pbyRub3RhcmVhbCRpZGVudGlmaWVy'
--cloud.auth CLOUD_AUTH Sets the elasticsearch and kibana username and password
for module connections in Elastic Cloud
e.g. 'username:<password>'
--pipeline.id ID Sets the ID of the pipeline.
(default: "main")
-w, --pipeline.workers COUNT Sets the number of pipeline workers to run.
(default: 1)
--java-execution Use Java execution engine.
(default: false)
-b, --pipeline.batch.size SIZE Size of batches the pipeline is to work in.
(default: 125)
-u, --pipeline.batch.delay DELAY_IN_MS When creating pipeline batches, how long to wait while polling
for the next event.
(default: 50)
--pipeline.unsafe_shutdown Force logstash to exit during shutdown even
if there are still inflight events in memory.
By default, logstash will refuse to quit until all
received events have been pushed to the outputs.
(default: false)
--path.data PATH This should point to a writable directory. Logstash
will use this directory whenever it needs to store
data. Plugins will also have access to this path.
(default: "/usr/share/logstash/data")
-p, --path.plugins PATH A path of where to find plugins. This flag
can be given multiple times to include
multiple paths. Plugins are expected to be
in a specific directory hierarchy:
'PATH/logstash/TYPE/NAME.rb' where TYPE is
'inputs' 'filters', 'outputs' or 'codecs'
and NAME is the name of the plugin.
(default: [])
-l, --path.logs PATH Write logstash internal logs to the given
file. Without this flag, logstash will emit
logs to standard output.
(default: "/usr/share/logstash/logs")
--log.level LEVEL Set the log level for logstash. Possible values are:
- fatal
- error
- warn
- info
- debug
- trace
(default: "info")
--config.debug Print the compiled config ruby code out as a debug log (you must also have --log.level=debug enabled).
WARNING: This will include any 'password' options passed to plugin configs as plaintext, and may result
in plaintext passwords appearing in your logs!
(default: false)
-i, --interactive SHELL Drop to shell instead of running as normal.
Valid shells are "irb" and "pry"
-V, --version Emit the version of logstash and its friends,
then exit.
-t, --config.test_and_exit Check configuration for valid syntax and then exit.
(default: false)
-r, --config.reload.automatic Monitor configuration changes and reload
whenever it is changed.
NOTE: use SIGHUP to manually reload the config
(default: false)
--config.reload.interval RELOAD_INTERVAL How frequently to poll the configuration location
for changes, in seconds.
(default: 3000000000)
--http.host HTTP_HOST Web API binding host (default: "127.0.0.1")
--http.port HTTP_PORT Web API http port (default: 9600..9700)
--log.format FORMAT Specify if Logstash should write its own logs in JSON form (one
event per line) or in plain text (using Ruby's Object#inspect)
(default: "plain")
--path.settings SETTINGS_DIR Directory containing logstash.yml file. This can also be
set through the LS_SETTINGS_DIR environment variable.
(default: "/usr/share/logstash/config")
--verbose Set the log level to info.
DEPRECATED: use --log.level=info instead.
--debug Set the log level to debug.
DEPRECATED: use --log.level=debug instead.
--quiet Set the log level to info.
DEPRECATED: use --log.level=info instead.
-h, --help print help