Setting Up ELK: The Complete Process

Published: 2023-02-04

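ELK Setup

Elasticsearch and Kibana

Elasticsearch and Kibana are provided by an Alibaba Cloud Elasticsearch instance and its bundled Kibana, so their installation and configuration steps are skipped here.

Alibaba Cloud Elasticsearch instance, version 6.7.0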

Logstash

Download page: installing-logstash

Version downloaded: Logstash 6.7.0

Version compatibility:
Product compatibility
Support matrix

Download

Download link: logstash-6.7.0.tar.gz

cd /usr/local
mkdir logstash
cd logstash
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.7.0.tar.gz

Deployment

  • Extract the archive
tar -zxvf logstash-6.7.0.tar.gz

cd logstash-6.7.0

Try it out with a stdin-to-stdout pipeline:

./bin/logstash -e 'input { stdin { } } output { stdout {} }'

Install plugins

# Go to the Logstash directory
cd /usr/local/logstash/logstash-6.7.0
# List the installed plugins
./bin/logstash-plugin list

Configuration file

The project's log output format: "[%d{yyyy-MM-dd HH:mm:ss.SSS}] [%X{requestId}] [%t] %-5level: [%c{1.}:%L] - %msg%xEx%n"

input {
    file {
        path => "/logs/application/app-test.log"
        # Merge every line that does not start with "[" into the previous event
        codec => multiline {
            pattern => "^\["
            negate => true
            what => "previous"
        }
    }
}
filter {
    grok {
        match => {"message"=>"\[(?<timestamp>%{YEAR}-%{MONTHNUM2}-%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}\.%{NONNEGINT})\] \[(%{NOTSPACE:requestId})?\] \[%{NOTSPACE:thread}\] %{LOGLEVEL:level}%{SPACE}: \[%{JAVACLASS:class}:%{NONNEGINT:line}\] - %{JAVALOGMESSAGE:msgDesc}$"}
    }
    # Overwrite @timestamp with the time parsed from the log line
    #date {
    #    match => ["timestamp", "yyyy-MM-dd HH:mm:ss.SSS"]
    #    target => "@timestamp"
    #}
    # Work around the 8-hour (UTC+8) offset by shifting @timestamp forward 8 hours.
    # Note that this changes the stored instant itself, not just how it is displayed.
    ruby {
        code => "event.set('time', event.get('@timestamp').time.localtime + 8*60*60)"
    }
    ruby {
        code => "event.set('@timestamp', event.get('time'))"
    }
    # 1. Add a field holding @timestamp + 8 hours
    #ruby {
    #    code => "event.set('index_date', event.get('@timestamp').time.localtime + 8*60*60)"
    #    code => "event.set('index_date', event.get('@timestamp').time.localtime)"
    #}
    # 2. Convert it to a string with mutate (gsub only handles strings), then use regexes to reduce it to the desired date
    #mutate {
    #    convert => ["index_date", "string"]
    #    gsub => ["index_date", "T([\S\s]*?)Z", ""]
    #    gsub => ["index_date", "-", "."]
    #}
    # Remove intermediate fields. A removed field cannot be referenced afterwards,
    # e.g. once index_date is removed, later steps can no longer use it.
    mutate {
        remove_field => ["time"]
    }
}
output {
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "logstash_%{+YYYY.MM.dd}"
        document_type => "_doc"
        user => "elastic"
        password => "elastic"
    }
    #stdout {
    #    codec => rubydebug
    #}
}
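
The grok expression above can be sanity-checked offline before starting Logstash. The sketch below is a hand-written Python approximation, not the grok engine itself: the character classes are simplified stand-ins for patterns such as NOTSPACE, LOGLEVEL and JAVACLASS, and the names LOG_LINE and parse_line are made up for this example. The sample line follows the log format quoted earlier.

```python
import re

# Approximation of the grok expression (assumption: simplified character
# classes replace the real grok patterns).
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"\[(?P<requestId>\S+)?\] "
    r"\[(?P<thread>\S+)\] "
    r"(?P<level>[A-Za-z]+)\s*: "
    r"\[(?P<cls>[\w$.]+):(?P<line>\d+)\] - "
    r"(?P<msgDesc>.*)$"
)


def parse_line(line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None


sample = (
    "[2022-07-29 11:46:46.294] [24f099ad-4707-4a36-9f80-ef96f56a2b1c] "
    "[系统定时任务-4] DEBUG: [c.e.m.d.A.listByState:159] - "
    "==> Parameters: 20(Integer), 35(Integer)"
)
fields = parse_line(sample)
print(fields["level"], fields["cls"], fields["line"])
```

A line that does not begin with "[" (for example a stack-trace line) returns None here, which is why the multiline codec has to merge such lines into the previous event before grok runs.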


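Validate the configuration file:

# Check the configuration syntax and exit
./bin/logstash -t -f /usr/local/logstash/logstash-6.7.0/logstash-simple.conf

Start with a specific configuration file

# Start with the given configuration file
./bin/logstash -f /usr/local/logstash/logstash-6.7.0/logstash-simple.conf
# Start with the given configuration file and reload it automatically when it changes
./bin/logstash -f /usr/local/logstash/logstash-6.7.0/logstash-simple.conf --config.reload.automatic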

Enable automatic index creation in Elasticsearch

PUT /_cluster/settings
{
    "persistent" : {
        "action": {
          "auto_create_index": "true"
        }
    }
}

Captured data

Elasticsearch query:

GET logstash_2022.07.28/_search
{
  "from": 0,
  "size": 200,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gte": "2022-07-28T00:10:00.000Z",
              "lte": "2022-07-28T00:12:00.000Z"
            }
          }
        },
        {
          "match": {
            "class": "c.e.m.d.A.listByState"
          }
        }
        
      ]
    }
  }
}
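
When the same search has to be repeated for other time windows or classes, the request body can be generated instead of edited by hand. The following is a minimal Python sketch; build_query is a hypothetical helper name, and the result still has to be sent to the index's _search endpoint with whatever HTTP client is available.

```python
import json


def build_query(start, end, class_name, size=200):
    """Build the same bool/must body as the query above."""
    return {
        "from": 0,
        "size": size,
        "query": {
            "bool": {
                "must": [
                    {"range": {"@timestamp": {"gte": start, "lte": end}}},
                    {"match": {"class": class_name}},
                ]
            }
        },
    }


body = build_query(
    "2022-07-28T00:10:00.000Z",
    "2022-07-28T00:12:00.000Z",
    "c.e.m.d.A.listByState",
)
print(json.dumps(body, indent=2))
```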

Query result: (screenshot in the original post)

Filebeat Deployment

Beats documentation: Beats Platform Reference

Filebeat documentation: Filebeat Reference [6.7]

Installation

cd /usr/local
mkdir filebeat
cd filebeat
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.7.2-linux-x86_64.tar.gz
tar xzvf filebeat-6.7.2-linux-x86_64.tar.gz

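Filebeat + Logstash Deployment

Modify the Logstash configuration

Add a new configuration file, logstash-filebeat.conf, that changes the Logstash input to beats, with the following content.
Official reference: plugins-inputs-beats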

input {
    beats {
        id => "test_beats"
        port => "5044"
    }
}
filter {
    grok {
        match => {"message"=>"\[(?<timestamp>%{YEAR}-%{MONTHNUM2}-%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}\.%{NONNEGINT})\] \[(%{NOTSPACE:requestId})?\] \[%{NOTSPACE:thread}\] %{LOGLEVEL:level}%{SPACE}: \[%{JAVACLASS:class}:%{NONNEGINT:line}\] - %{JAVALOGMESSAGE:msgDesc}$"}
    }
    # Overwrite @timestamp with the time parsed from the log line
    #date {
    #    match => ["timestamp", "yyyy-MM-dd HH:mm:ss.SSS"]
    #    target => "@timestamp"
    #}
    # Work around the 8-hour (UTC+8) offset by shifting @timestamp forward 8 hours.
    # Note that this changes the stored instant itself, not just how it is displayed.
    ruby {
        code => "event.set('time', event.get('@timestamp').time.localtime + 8*60*60)"
    }
    ruby {
        code => "event.set('@timestamp', event.get('time'))"
    }
    # 1. Add a field holding @timestamp + 8 hours
    #ruby {
    #    code => "event.set('index_date', event.get('@timestamp').time.localtime + 8*60*60)"
    #    code => "event.set('index_date', event.get('@timestamp').time.localtime)"
    #}
    # 2. Convert it to a string with mutate (gsub only handles strings), then use regexes to reduce it to the desired date
    #mutate {
    #    convert => ["index_date", "string"]
    #    gsub => ["index_date", "T([\S\s]*?)Z", ""]
    #    gsub => ["index_date", "-", "."]
    #}
    # Remove intermediate fields. A removed field cannot be referenced afterwards,
    # e.g. once index_date is removed, later steps can no longer use it.
    mutate {
        remove_field => ["time"]
    }
}
output {
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "logstash_%{[@metadata][beat]}_%{[@metadata][version]}_%{+YYYY.MM.dd}"
        document_type => "_doc"
        user => "elastic"
        password => "elastic"
    }
    #stdout {
    #    codec => rubydebug
    #}
}
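
The index setting in the output above combines the Beat name, the Beat version and the event date, with the date portion formatted from @timestamp in UTC. The Python sketch below only mimics that substitution so the resulting index name is easy to predict; index_name is a hypothetical helper, and the sample values (filebeat, 6.7.2, a 2022-07-29 timestamp) match the test results shown later in this post.

```python
from datetime import datetime, timezone


def index_name(beat, version, timestamp):
    """Mimic logstash_%{[@metadata][beat]}_%{[@metadata][version]}_%{+YYYY.MM.dd}."""
    # The date part is taken from the event timestamp expressed in UTC.
    day = timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return f"logstash_{beat}_{version}_{day}"


ts = datetime(2022, 7, 29, 15, 25, 48, tzinfo=timezone.utc)
print(index_name("filebeat", "6.7.2", ts))  # logstash_filebeat_6.7.2_2022.07.29
```

Because the date comes from @timestamp, shifting @timestamp by 8 hours in the filter also shifts the point at which events roll over into the next daily index.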


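Start with a specific configuration file

# Start with the given configuration file
./bin/logstash -f /usr/local/logstash/logstash-6.7.0/logstash-filebeat.conf
# Start with the given configuration file and reload it automatically when it changes
./bin/logstash -f /usr/local/logstash/logstash-6.7.0/logstash-filebeat.conf --config.reload.automatic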

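Modify the Filebeat configuration

The configuration file defaults to filebeat.yml in Filebeat's current directory. See the official filebeat-configuration documentation for the available settings; the same directory also ships a filebeat.reference.yml file for reference.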

###################### Filebeat Configuration Example #########################


#=========================== Filebeat inputs =============================

filebeat.inputs:

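# Read from log files; multiple inputs may be defined
- type: log
  # Whether this input is enabled
  enabled: true
  # Log file paths; multiple paths may be listed
  paths:
    - /logs/application/app-test.log
  # Extra fields that tag where the log data comes from
  fields:
    app_id: application
    env: test
  #  review: 1

  ### Multiline settings
  # Lines matching this pattern (starting with "[") begin a new event
  multiline.pattern: '^\['
  # Negate the pattern: lines that do NOT match are treated as continuation lines
  multiline.negate: true
  # Append continuation lines after the preceding matching line; negate: true + match: after is the usual combination
  multiline.match: after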

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
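
The multiline settings in this configuration (pattern ^\[, negate: true, match: after) mean that a line starting with "[" opens a new event and every other line is appended to the event before it. The Python sketch below reproduces only that grouping rule to make the behaviour concrete; it is not how Filebeat is implemented, and group_multiline and the sample lines are invented for illustration.

```python
import re


def group_multiline(lines, pattern=r"^\["):
    """Group raw lines into events: a matching line starts a new event,
    a non-matching line is appended to the current one."""
    start = re.compile(pattern)
    events, current = [], []
    for line in lines:
        if start.search(line) and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events


# Hypothetical log excerpt: one error with a two-line stack trace, then one info line.
raw = [
    "[2022-07-29 11:46:46.294] [] [main] ERROR: [c.e.A:10] - failed",
    "java.lang.IllegalStateException: boom",
    "    at c.e.A.run(A.java:10)",
    "[2022-07-29 11:46:47.001] [] [main] INFO : [c.e.A:12] - recovered",
]
events = group_multiline(raw)
print(len(events))  # 2
```

The stack-trace lines stay attached to the ERROR event, so Logstash receives one message per log record rather than one per physical line.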


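Start Filebeat with the configuration file

# Go to the Filebeat directory
cd /usr/local/filebeat/filebeat-6.7.2-linux-x86_64
# Test the configuration
./filebeat test config
# Test the connection to the configured output
./filebeat test output
# Start with the default configuration file
./filebeat 
# or
./filebeat run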

Test results

Elasticsearch query:

GET logstash_filebeat_6.7.2_2022.07.29/_search
{
  "from": 0,
  "size": 200,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gte": "2022-07-29T15:10:00.000Z",
              "lte": "2022-07-29T16:50:00.000Z"
            }
          }
        },
        {
          "match": {
            "class": "c.e.m.d.A.listByState"
          }
        }
        
      ]
    }
  }
}

Query result (first hit only; the rest of the response is omitted):

{
  "took" : 49,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 12279,
    "max_score" : 2.8607354,
    "hits" : [
      {
        "_index" : "logstash_filebeat_6.7.2_2022.07.29",
        "_type" : "_doc",
        "_id" : "v8_YSIIBjAlbEWaZ6-k_",
        "_score" : 2.8607354,
        "_source" : {
          "beat" : {
            "name" : "izbp1b7qkjpdb078ppwg7mz",
            "hostname" : "izbp1b7qkjpdb078ppwg7mz",
            "version" : "6.7.2"
          },
          "line" : "159",
          "level" : "DEBUG",
          "message" : "[2022-07-29 11:46:46.294] [24f099ad-4707-4a36-9f80-ef96f56a2b1c] [系统定时任务-4] DEBUG: [c.e.m.d.A.listByState:159] - ==> Parameters: 20(Integer), 35(Integer)",
          "tags" : [
            "beats_input_codec_plain_applied"
          ],
          "offset" : 10198689,
          "msgDesc" : "==> Parameters: 20(Integer), 35(Integer)",
          "host" : {
            "containerized" : true,
            "os" : {
              "name" : "CentOS Linux",
              "family" : "redhat",
              "codename" : "Core",
              "platform" : "centos",
              "version" : "7 (Core)"
            },
            "name" : "izbp1b7qkjpdb078ppwg7mz",
            "id" : "f0f31005fb5a436d88e3c6cbf54e25aa",
            "architecture" : "x86_64"
          },
          "source" : "/logs/xxxxxxxxxxxx/xxxxx.log",
          "@version" : "1",
          "requestId" : "24f099ad-4707-4a36-9f80-ef96f56a2b1c",
          "thread" : "系统定时任务-4",
          "prospector" : {
            "type" : "log"
          },
          "@timestamp" : "2022-07-29T15:25:48.969Z",
          "log" : {
            "file" : {
              "path" : "/logs/xxxxxxxxxxxx/xxxxx.log"
            }
          },
          "class" : "c.e.m.d.A.listByState",
          "fields" : {
            "env" : "test",
            "app_id" : "xxxxxxxxxxxx"
          },
          "meta" : {
            "cloud" : {
              "region" : "cn-hangzhou",
              "instance_id" : "i-bp1b7qkjpdb078ppwg7m",
              "availability_zone" : "cn-hangzhou-g",
              "provider" : "ecs"
            }
          },
          "timestamp" : "2022-07-29 11:46:46.294",
          "input" : {
            "type" : "log"
          }
        }
      },

Grok tools

Tool for translating a Log4j pattern into a Grok pattern:
https://grokconstructor.appspot.com/do/translator

List of the standard patterns:
https://github.com/logstash-plugins/logstash-patterns-core/blob/main/patterns/legacy/grok-patterns

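Notes

1. Remember to start all of these commands in the background, in the same way a Java application is usually started.
2. Give each application its own index where possible; otherwise a single day's index can grow too large.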

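Open issues

1. Find a better solution to the Logstash time zone problem.
2. Try having Filebeat output directly to Elasticsearch (tested: the log content is hard to parse this way).
3. Try Filebeat together with the Beats features in Kibana.
4. Investigate how to use index lifecycle policies in Elasticsearch and Kibana.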
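
For item 1, one candidate worth evaluating is the date filter that is commented out in the Logstash configurations, combined with its timezone option, instead of adding 8 hours in a ruby filter. This has not been verified against the setup described here. The Python sketch below only shows the conversion such a filter would perform: the log's local time is interpreted as UTC+8 and stored as the corresponding UTC instant. The fixed offset CST and the helper log_time_to_utc are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the application logs in China Standard Time, a fixed UTC+8 offset.
CST = timezone(timedelta(hours=8))


def log_time_to_utc(text):
    """Parse a 'yyyy-MM-dd HH:mm:ss.SSS' log time as UTC+8 and return it in UTC."""
    local = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f").replace(tzinfo=CST)
    return local.astimezone(timezone.utc)


utc = log_time_to_utc("2022-07-29 11:46:46.294")
print(utc.isoformat(timespec="milliseconds"))  # 2022-07-29T03:46:46.294+00:00
```

With @timestamp holding the true UTC instant, the display time zone can be left to Kibana, and range queries no longer need to account for a manual 8-hour shift.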

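Summary

1. Filebeat can ship data to Elasticsearch on its own, but its log parsing capabilities are relatively weak.
2. Logstash can read log files directly and offers powerful log parsing through grok.
3. Prefer Logstash alone, or Filebeat + Logstash + Elasticsearch + Kibana, over Filebeat + Elasticsearch + Kibana directly. Most companies use a uniform log format across their projects, so the grok expression can be configured once in a single Logstash instance: every project ships its logs through Filebeat to Logstash for parsing, and Logstash then writes the result to Elasticsearch.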
