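Current pain points with logs

Ops staff frequently have to log in to servers to fetch logs for developers and testers.
Logs are only examined after something has gone wrong, so problems are never anticipated from the logs in advance.
For a clustered service, the logs have to be collected from several machines.
Developers produce logs with no agreed standard: log directories are inconsistent and log types are not clearly separated (system, error, access, runtime, device, and debug logs).

ELK can address the pain points above.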

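For logs to be useful, they need to pass through four stages:

Collection
Storage
Search and presentation
Analysis, leading to failure prediction and business insight

Elasticsearch, Logstash, and Kibana together cover the first three stages.

Elasticsearch (es): storage and search

Logstash: collection

Kibana: presentation

Elasticsearch is written in Java and Logstash in JRuby; both run on the JVM, so a JDK such as OpenJDK must be installed on the host. (Android is also said to be moving from the Sun JDK to OpenJDK, reportedly to make the system lighter.)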

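Installing and configuring es

The recommended way to install es is with yum. (You can also install from the tar archive: download it, unpack it, and run it. The advantage is that changing versions is easy.)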

https://www.elastic.co/guide/en/elasticsearch/reference/current/rpm.html

1.Download and install the public signing key:

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

2.Create a file called elasticsearch.repo in the /etc/yum.repos.d/ directory for RedHat based distributions

[elasticsearch-6.x]

name=Elasticsearch repository for 6.x packages

baseurl=https://artifacts.elastic.co/packages/6.x/yum

gpgcheck=1

gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch

enabled=1

autorefresh=1

type=rpm-md

3.And your repository is ready for use. You can now install Elasticsearch with one of the following

sudo yum install elasticsearch

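Configuration:

es needs only a few settings: the cluster name (very important), the node name (very important), whether to lock memory, the data path, the log path, the IP address to listen on, and the HTTP port.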

elasticsearch.yml

cluster.name: oldgirl

node.name: linux-node-1

path.data: /var/lib/elasticsearch

path.logs: /var/log/elasticsearch

bootstrap.memory_lock: true

network.host: 0.0.0.0

http.port: 9200

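Setting bootstrap.memory_lock: true locks the process memory. With this setting the service may fail to start, because /etc/security/limits.conf does not yet allow the elasticsearch user to lock memory. Add the lines suggested by the warning in the log. On systemd-managed installs this may also require LimitMEMLOCK=infinity in the service unit; treat that as an assumption about your environment.

[2018-07-01T14:15:44,143][WARN ][o.e.b.JNANatives ] Increase RLIMIT_MEMLOCK, soft limit: 65536, hard limit: 65536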

[2018-07-01T14:15:44,144][WARN ][o.e.b.JNANatives ] These can be adjusted by modifying /etc/security/limits.conf, for example:

# allow user 'elasticsearch' mlockall

elasticsearch soft memlock unlimited

elasticsearch hard memlock unlimited

[2018-07-01T14:15:44,144][WARN ][o.e.b.JNANatives
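After editing limits.conf it can help to confirm what limit a process actually receives. The following is a minimal sketch using Python's standard resource module; it is not part of the original procedure, and it only reports the limit of the process that runs it, so it would need to be run as the elasticsearch user in a fresh session to be meaningful.

```python
import resource


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK values of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def memlock_is_unlimited():
    """True when both limits are unlimited, which is what the limits.conf lines request."""
    soft, hard = memlock_limits()
    return soft == resource.RLIM_INFINITY and hard == resource.RLIM_INFINITY


soft, hard = memlock_limits()
print("soft:", soft, "hard:", hard, "unlimited:", memlock_is_unlimited())
```

If the result is not unlimited after a new login, the limits.conf entry is probably not being applied to that session.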

A single-node es is now installed. Test it by opening http://IP:9200:

{

"name" : "linux-node-1",

"cluster_name" : "oldgirl",

"cluster_uuid" : "5hmMNxc5QxG6q-2t2VNqrg",

"version" : {

"number" : "6.3.0",

"build_flavor" : "default",

"build_type" : "rpm",

"build_hash" : "424e937",

"build_date" : "2018-06-11T23:38:03.357887Z",

"build_snapshot" : false,

"lucene_version" : "7.3.1",

"minimum_wire_compatibility_version" : "5.6.0",

"minimum_index_compatibility_version" : "5.0.0"

},

"tagline" : "You Know, for Search"

}

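A response like the one above shows that es is up and running. The next step is to store data in it.

How do you interact with es? There are two main ways:

the Java API and the RESTful API.

We use the RESTful API and exchange data with es as JSON.

For example, run this in a shell: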

curl -H Content-Type:application/json -i -X GET 'http://127.0.0.1:9200/_count?pretty' -d '

{

"query": {

"match_all": {}

}

}'

Response:

HTTP/1.1 200 OK

content-type: application/json; charset=UTF-8

content-length: 114

{

"count" : 0,

"_shards" : {

"total" : 0,

"successful" : 0,

"skipped" : 0,

"failed" : 0

}

}

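-X GET sets the request method.

-i prints the response headers.

The -H Content-Type:application/json header is required: it tells the server to parse the request body as JSON. Without it, es returns the following error: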

HTTP/1.1 406 Not Acceptable

content-type: application/json; charset=UTF-8

content-length: 109

{

"error" : "Content-Type header [application/x-www-form-urlencoded] is not supported",

"status" : 406

}
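The same request can be built from a script. The sketch below uses only Python's standard library and mirrors the curl call above: a GET to _count with a JSON body and an explicit Content-Type header. The function name build_count_request is made up for illustration, and the commented-out lines that send the request assume an es node is reachable at the given address.

```python
import json
import urllib.request


def build_count_request(base_url="http://127.0.0.1:9200"):
    """Build the _count request with a JSON body and the JSON Content-Type header."""
    body = json.dumps({"query": {"match_all": {}}}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/_count?pretty",
        data=body,
        # Without this header es 6.x rejects the request with HTTP 406.
        headers={"Content-Type": "application/json"},
        method="GET",
    )


req = build_count_request()
print(req.get_method(), req.full_url)

# To actually send it (requires a running node):
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```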

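Calling the RESTful API with curl from the shell works but is inconvenient. es has many plugins, so we would like one that provides a web interface for talking to the RESTful API.

The officially recommended plugin is no longer supported in Elasticsearch 6.x, so we use the open-source elasticsearch-head instead. GitHub: https://github.com/mobz/elasticsearch-head

Installation: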

Running with built in server

git clone https://github.com/mobz/elasticsearch-head.git

cd elasticsearch-head

npm install

npm run start

open http://localhost:9100/

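Then edit the Elasticsearch configuration file:

vim /etc/elasticsearch/elasticsearch.yml

Append the following two lines at the end, then restart Elasticsearch so that they take effect: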

http.cors.enabled: true

http.cors.allow-origin: "*"

Then open the page:

Open http://localhost:9100/

and connect to http://localhost:9200

We can now interact with the Elasticsearch RESTful API through a web interface.

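Next, build an Elasticsearch cluster.

Installation is the same on every node; just set the same cluster.name in each configuration file.

After startup, a node finds the other members of its cluster by multicast or unicast discovery. Note that multicast does not work in 6.x, so use unicast, configured like this:

discovery.zen.ping.unicast.hosts: ["host1", "host2"]  # IP addresses are preferable here

You do not need to list every node; one or two are enough, because a node learns about the remaining members from the hosts it contacts.

There are two ways to check whether a node has joined the cluster. One is the Overview tab of elasticsearch-head.

The other is the Elasticsearch log, whose file name is the cluster name.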
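A third check, not used in the original walkthrough, is the _cluster/health endpoint, whose JSON response includes cluster_name, status, and number_of_nodes. The sketch below only interprets such a response; the sample dictionary is hand-written for illustration and is not output captured from the cluster described here.

```python
def has_joined(health, cluster_name, expected_nodes):
    """Decide from a _cluster/health response whether the cluster has formed.

    health: the parsed JSON body of GET /_cluster/health
    expected_nodes: how many nodes we expect to see (an assumption of the caller)
    """
    return (
        health.get("cluster_name") == cluster_name
        and health.get("number_of_nodes", 0) >= expected_nodes
    )


# Hypothetical response for a two-node cluster named "oldgirl".
sample = {"cluster_name": "oldgirl", "status": "green", "number_of_nodes": 2}
print(has_joined(sample, "oldgirl", 2))
```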

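There is also the monitoring plugin bigdesk, which unfortunately has not been supported since 2.0, and the kopf plugin, which is not supported on newer versions either. In short, es is moving toward being a platform. It is enough to understand these tools for learning purposes; in production, prefer a platform product, which saves a lot of operational cost.

Of these three commonly used plugins, two can no longer be used.

With the es cluster installed and configured and the basic concepts covered, we move on to Logstash. There is much more to learn about es, but for ops the most important task is collecting logs, so the rest of this article focuses on Logstash.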

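Installing Logstash

Does Logstash have to be installed on every server? Not necessarily. If logs arrive over the network, it does not; if it has to read log files on disk, it does.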

https://www.elastic.co/guide/en/logstash/current/installing-logstash.html

YUM

Download and install the public signing key:

rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

Add the following in your /etc/yum.repos.d/ directory in a file with a .repo suffix, for example logstash.repo

vim /etc/yum.repos.d/logstash.repo

[logstash-6.x]

name=Elastic repository for 6.x packages

baseurl=https://artifacts.elastic.co/packages/6.x/yum

gpgcheck=1

gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch

enabled=1

autorefresh=1

type=rpm-md

And your repository is ready for use. You can install it with:

sudo yum install logstash

Logstash is written in JRuby, so startup is somewhat slow.

/usr/share/logstash/bin/logstash -e 'input { stdin{} } output { stdout{} }'

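-e runs the configuration given on the command line.

The pipeline has one input and one output.

stdin{} and stdout{} are two plugins.

Startup can take about a minute.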

[root@node2 elasticsearch]# /usr/share/logstash/bin/logstash -e 'input { stdin{} } output { stdout{} }'

WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults

Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console

[WARN ] 2018-07-01 15:03:59.682 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified

[INFO ] 2018-07-01 15:04:00.629 [LogStash::Runner] runner - Starting Logstash {"logstash.version"=>"6.3.0"}

[INFO ] 2018-07-01 15:04:03.885 [Converge PipelineAction::Create] pipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}

The stdin plugin is now waiting for input:

[INFO ] 2018-07-01 15:04:04.098 [Converge PipelineAction::Create] pipeline - Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x1b16cf42 run>"}

[INFO ] 2018-07-01 15:04:04.225 [Ruby-0-Thread-1: /usr/share/logstash/lib/bootstrap/environment.rb:6] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}

[INFO ] 2018-07-01 15:04:04.547 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}

hello world

{

"@version" => "1",

"@timestamp" => 2018-07-01T07:04:13.785Z,

"message" => "hello world",

"host" => "node2.shared"

}

hehehe

{

"@version" => "1",

"@timestamp" => 2018-07-01T07:04:20.411Z,

"message" => "hehehe",

"host" => "node2.shared"

}

That is the standard input/output example.
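One detail in this output is easy to misread: @timestamp is in UTC (note the trailing Z), while the Logstash log lines above are in local time. The events were typed at about 15:04 local time but carry 07:04Z, which suggests the host runs at UTC+8; that offset is inferred from the output, not stated in the original. A small sketch of the conversion:

```python
from datetime import datetime, timedelta, timezone


def to_local(ts, offset_hours=8):
    """Convert a Logstash @timestamp string (UTC) to a fixed-offset local time.

    offset_hours=8 is an assumption based on the log excerpts above.
    """
    utc = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    return utc.astimezone(timezone(timedelta(hours=offset_hours)))


print(to_local("2018-07-01T07:04:13.785Z").isoformat())
# prints 2018-07-01T15:04:13.785000+08:00
```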

/usr/share/logstash/bin/logstash -e 'input { stdin{} } output { stdout{ codec => rubydebug } }'

...

hello

{

"message" => "hello",

"@version" => "1",

"@timestamp" => 2018-07-01T07:08:02.456Z,

"host" => "node2.shared"

}

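Logstash calls each incoming piece of data an event, not a line. Several lines may form a single event; an error message, for example, usually spans more than one line.

Writing content to es

Keep the standard input and change the output: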

/usr/share/logstash/bin/logstash -e 'input { stdin{} } output { elasticsearch { hosts => ["10.211.55.8:9200"] } }'

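Official documentation: https://www.elastic.co/guide/en/logstash/current/index.html

Writing to es is that simple.

Can events go to es and to the console at the same time? Yes, and this is duplication rather than load balancing: one input can feed multiple outputs.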

/usr/share/logstash/bin/logstash -e 'input { stdin{} } output { elasticsearch { hosts => ["10.211.55.8:9200"] } stdout { codec => rubydebug } }'

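What is this for? In production, write events to text files as well as to es. Keeping a plain-text copy has three benefits: 1. it is the simplest format, 2. it can be reprocessed later, 3. it compresses best. So what is the best format for keeping logs? Plain text.

Next we move the pipeline into a Logstash configuration file, which is more convenient than typing it on the command line every time.

The simplest configuration file: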

vim /etc/logstash/conf.d/logstash-simple.conf

input { stdin { } }

output {

elasticsearch { hosts => ["10.211.55.8:9200"] }

stdout { codec => rubydebug }

}

Then start it:

/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-simple.conf

Our main topic is the Logstash configuration syntax:

# This is a comment. You should use comments to describe

# parts of your configuration.

input {

...

}

filter {

...

}

output {

...

}

input {} and output {} are required; filter {} is optional.

input {

file {

path => "/var/log/messages"

type => "syslog"

}

file {

path => "/var/log/apache/access.log"

type => "apache"

}

}

Example 1

The most common input is a file.

vim /etc/logstash/conf.d/file.conf

input {

file {

path => "/var/log/messages"

type => "system"

start_position => "beginning"

}

}

output {

stdout { codec => rubydebug }

elasticsearch {

hosts => ["10.211.55.8:9200"]

index => "system-%{+YYYY.MM.dd}"

}

}
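The index => "system-%{+YYYY.MM.dd}" setting creates one index per day, and the date comes from the event's @timestamp, which is in UTC. The sketch below imitates that naming in Python to show the consequence: an event logged shortly after local midnight on a UTC+8 host still lands in the previous day's index. The function daily_index is illustrative only and is not part of Logstash.

```python
from datetime import datetime, timedelta, timezone


def daily_index(prefix, timestamp):
    """Build an index name the way "prefix-%{+YYYY.MM.dd}" does: from the UTC date."""
    return prefix + "-" + timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")


utc_event = datetime(2018, 7, 1, 16, 30, tzinfo=timezone.utc)
local_event = datetime(2018, 7, 2, 7, 30, tzinfo=timezone(timedelta(hours=8)))

print(daily_index("system", utc_event))    # system-2018.07.01
print(daily_index("system", local_event))  # system-2018.07.01 (23:30 UTC on July 1)
```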

Next we collect Java logs in addition to the system log.

Example 2

vim /etc/logstash/conf.d/file.conf

input {

file {

path => "/var/log/messages"

type => "system"

start_position => "beginning"

}

file {

    path => "/var/log/elasticsearch/oldgirl.log"

    type => "es-error"

    start_position => "beginning"

}

}

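The type field can then be used in if conditions.

In the 6.x file plugin documentation, type does not appear among the file-specific settings because it is one of the common options shared by all input plugins. It works, and the option name cannot be changed to anything else.

Note that we have not yet split the message into fields. If the parsed event already contains a field named type, the type set in the file input will not override it, so conditions based on that value stop working.

You can also run several Logstash processes on one server, one per service, but that costs extra CPU and memory.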

Detected a 6.x and above cluster: the type event field won't be used to determine the document _type {:es_version=>6}

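This startup message tells us that the type set in the file input is not the _type shown in the es data browser.

Viewing the logs in Elasticsearch this way reveals a problem: an error should be one event, but because the file is read line by line, a multi-line error is split into several events, which is inconvenient. To collect it into a single event we need a codec.

Example 3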

input {

stdin {

codec => multiline {

pattern => "pattern, a regexp"

negate => "true" or "false"

what => "previous" or "next"

}

}

}

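The three parameters are:

pattern: a regular expression that decides which lines are candidates for merging

negate: when "true", lines that do NOT match the pattern are merged; when "false" (the default), lines that match are merged

what: "previous" merges the line into the preceding event, "next" into the following one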

input {

stdin {

codec => multiline {

pattern => "^\["

negate => "true"

what => "previous"

}

}

}

output {

stdout {

codec => rubydebug

}

}

A line starting with [ begins a new event; a line that does not start with [ is merged into the previous event.
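The merging rule can be simulated in a few lines of Python. This is a sketch of the semantics for what => "previous" only, not the codec's actual implementation, and the sample log lines are invented. The real codec also has to wait for the next starting line before it knows that an event is complete.

```python
import re


def merge_multiline(lines, pattern=r"^\[", negate=True):
    """Group lines into events using the what => "previous" rule.

    negate=True: lines that do NOT match the pattern join the previous event.
    negate=False: lines that DO match the pattern join the previous event.
    """
    regex = re.compile(pattern)
    events = []
    for line in lines:
        matched = bool(regex.search(line))
        joins_previous = (not matched) if negate else matched
        if joins_previous and events:
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events


# Invented sample: one warning with a two-line stack trace, then an info line.
sample = [
    "[2018-07-01T14:15:44,143][WARN ] something failed",
    "java.lang.IllegalStateException: boom",
    "    at Foo.bar(Foo.java:1)",
    "[2018-07-01T14:15:45,000][INFO ] recovered",
]
for event in merge_multiline(sample):
    print(repr(event))
```

With this sample the four lines become two events: the warning together with its stack trace, and the info line on its own.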

vim /etc/logstash/conf.d/all.conf

input {

file {

path => "/var/log/messages"

type => "system"

start_position => "beginning"

}

file {

    path => "/var/log/elasticsearch/oldgirl.log"

    type => "es-error"

    start_position => "beginning"

        codec => multiline {

             pattern => "^\["

             negate => "true"

             what => "previous"

           }

}

}

/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/all.conf

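Browsing the data in elasticsearch-head is inconvenient, so next we bring in Kibana.

Kibana is the visualization platform for Elasticsearch.

https://www.elastic.co/guide/en/kibana/current/index.html

Kibana was first written in PHP, then rewritten in Ruby, then JRuby, and is now built on Node.js.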

wget https://artifacts.elastic.co/downloads/kibana/kibana-6.3.0-linux-x86_64.tar.gz

shasum -a 512 kibana-6.3.0-linux-x86_64.tar.gz

tar -xzf kibana-6.3.0-linux-x86_64.tar.gz

mv kibana-6.3.0-linux-x86_64/ /usr/local/

cd /usr/local/

ln -s kibana-6.3.0-linux-x86_64/ kibana

Edit the Kibana configuration file:

cd /usr/local/kibana/config

vim kibana.yml

Change these four settings:

server.port: 5601

server.host: "0.0.0.0"

elasticsearch.url: "http://10.211.55.8:9200"

kibana.index: ".kibana"

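kibana.index deserves a note. Kibana has no database of its own, but its data has to live somewhere. Since it is already tied to es, it stores its data there: this setting names the index, .kibana, that is created in es to hold Kibana's data.

Once configured, start Kibana directly.

We have collected the system log and a Java log (the es runtime log). Next we collect the nginx log.

es has the concept of a field, comparable to a column in a table. An index can be thought of as a database and a _type as a table within it; fields are the columns, that is, the content of message broken out into key: value pairs.

nginx can be configured in nginx.conf to write its access log as JSON. Logstash passes these events on to es, which stores the JSON keys as separate fields, making later searches in Kibana more efficient.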

To make nginx log in JSON, see the log module documentation on nginx.org:

http://nginx.org/en/docs/http/ngx_http_log_module.html

The relevant directive:

Syntax: log_format name [escape=default|json|none] string ...;

Default: log_format combined "...";

Context: http

We only need to add the following to the http block of the nginx configuration:

log_format main '$remote_addr - $remote_user [$time_local] "$request" '

'$status $body_bytes_sent "$http_referer" '

'"$http_user_agent" "$http_x_forwarded_for"';

log_format json '{"@timestamp":"$time_iso8601",'

'"@version":"1",'

'"url":"$uri",'

'"status":"$status",'

'"domain":"$host",'

'"host":"$server_addr",'

'"size":$body_bytes_sent,'

'"responsetime":$request_time,'

'"referer": "$http_referer",'

'"ua": "$http_user_agent"'

'}';

access_log /var/log/nginx/access_json.log json;

access_log /var/log/nginx/access.log main;

Start nginx, make a few requests to generate log entries, and confirm that they are in JSON format.
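One way to confirm this is to try parsing every line. The sketch below reports the line numbers that are not valid JSON; both sample lines are invented. The second one shows a likely failure mode of a hand-built format string: a value containing a double quote (for example in the user agent) breaks the JSON unless nginx escapes it, which is what the escape=json parameter in the syntax line above is for.

```python
import json


def invalid_json_lines(lines):
    """Return the 1-based numbers of non-empty lines that fail to parse as JSON."""
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            json.loads(line)
        except ValueError:
            bad.append(number)
    return bad


# Invented samples: a well-formed entry and one with an unescaped quote.
good = '{"@timestamp":"2018-07-01T22:23:24+08:00","status":"304","size":0,"responsetime":0.000}'
bad = '{"ua": "agent with "quotes" inside"}'
print(invalid_json_lines([good, bad]))  # [2]

# Against the real file (path taken from the access_log directive above):
# with open("/var/log/nginx/access_json.log") as f:
#     print(invalid_json_lines(f))
```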

Now write a json.conf file:

vim /etc/logstash/conf.d/json.conf

input {

file {

path => "/var/log/nginx/access_json.log"

codec => json

}

}

output {

stdout {

codec => rubydebug

}

}

Run it; the result looks like this:

/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/json.conf

[INFO ] 2018-07-01 22:22:36.797 [Ruby-0-Thread-1: /usr/share/logstash/lib/bootstrap/environment.rb:6] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}

[INFO ] 2018-07-01 22:22:37.539 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}

{

"domain" => "10.211.55.8",

"@version" => "1",

"host" => "10.211.55.8",

"responsetime" => 0.0,

"@timestamp" => 2018-07-01T14:23:24.000Z,

"size" => 0,

"status" => "304",

"path" => "/var/log/nginx/access_json.log",

"ua" => "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36",

"url" => "/index.html",

"referer" => "-"

}

Now we can add this input to all.conf:

input {

file {

path => "/var/log/messages"

type => "system"

start_position => "beginning"

}

file {

path => "/var/log/nginx/access_json.log"

type => "nginx-log"

start_position => "beginning"

codec => json

}

file {

    path => "/var/log/elasticsearch/oldgirl.log"

    type => "es-error"

    start_position => "beginning"

        codec => multiline {

             pattern => "^\["

             negate => "true"

             what => "previous"

           }

}

}

output {

if [type] == "system" {

elasticsearch {

hosts => ["10.211.55.8:9200"]

index => "system-%{+YYYY.MM.dd}"

}

}

if [type] == "es-error" {

elasticsearch {

hosts => ["10.211.55.8:9200"]

index => "es-error-%{+YYYY.MM.dd}"

}

}

if [type] == "nginx-log" {

elasticsearch {

hosts => ["10.211.55.8:9200"]

index => "nginx-log-%{+YYYY.MM.dd}"

}

}

}
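The output section is a chain of independent if blocks with no else branch, so an event whose type matches none of them is not written anywhere. A rough Python model of that routing, for illustration only (the mapping simply restates the three conditions above):

```python
# type value -> index prefix, restating the three if blocks in the output section
INDEX_PREFIX_BY_TYPE = {
    "system": "system",
    "es-error": "es-error",
    "nginx-log": "nginx-log",
}


def target_prefix(event):
    """Return the index prefix an event would be routed to, or None if no block matches."""
    return INDEX_PREFIX_BY_TYPE.get(event.get("type"))


print(target_prefix({"type": "nginx-log", "status": "304"}))  # nginx-log
print(target_prefix({"type": "unknown"}))                     # None
```

When adding a new input, it is therefore worth adding the matching output condition in the same change.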

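The new index now shows up in elasticsearch-head.

Add the new index pattern in Kibana and you can start querying.

Collecting the messages log

We already collected /var/log/messages earlier, but with the file plugin.

System logs are produced by the syslog daemon, and syslog can forward logs to a remote host.

So Logstash can listen on a port and syslog can send its logs straight to that port.

Ideally every service in production logs through syslog. Then Logstash does not need to be installed on every machine to pick up logs; a single Logstash listening port is enough.

nginx can also log to syslog: stock nginx has supported it natively since version 1.7.1, as do Taobao's open-source Tengine and nginx with Lua.

The syslog plugin is in the list of input plugins: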

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-syslog.html

vim /etc/logstash/conf.d/syslog.conf

input {

syslog {

type => "system-syslog"

host => "10.211.55.8"

port => "514"

}

}

output {

stdout {

codec => "rubydebug"

}

}

/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/syslog.conf

After startup, confirm that port 514 is listening.

Next, edit the system's rsyslog.conf:

vim /etc/rsyslog.conf

Find the line

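#*.* @@remote-host:514

Remove the # and change it to the following (@@ forwards over TCP; a single @ would use UDP):

*.* @@10.211.55.8:514

Then restart the rsyslog service: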

systemctl restart rsyslog

As soon as rsyslog restarts, log events appear:

{

"pid" => "20915",

"severity" => 5,

"logsource" => "node2",

"facility_label" => "security/authorization",

"timestamp" => "Jul 2 20:56:43",

"type" => "system-syslog",

"program" => "polkitd",

"@timestamp" => 2018-07-02T12:56:43.000Z,

"facility" => 10,

"host" => "10.211.55.8",

"@version" => "1",

"message" => "Unregistered Authentication Agent for unix-process:1927:9050003 (system bus name :1.1149, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale zh_CN.UTF-8) (disconnected from bus)\n",

"priority" => 85,

"severity_label" => "Notice"

}
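The priority, facility, and severity fields in this event are related by the syslog encoding priority = facility * 8 + severity. For the event above, 10 * 8 + 5 = 85. A short sketch of the decoding:

```python
def decode_priority(priority):
    """Split a syslog priority value into (facility, severity)."""
    facility, severity = divmod(priority, 8)
    return facility, severity


print(decode_priority(85))  # (10, 5): facility 10 with severity 5 (Notice)
```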

Then we can merge the syslog.conf settings into all.conf:

input {

file {

path => "/var/log/messages"

type => "system"

start_position => "beginning"

}

file {

path => "/var/log/nginx/access_json.log"

type => "nginx-log"

start_position => "beginning"

codec => json

}

file {

    path => "/var/log/elasticsearch/oldgirl.log"

    type => "es-error"

    start_position => "beginning"

        codec => multiline {

             pattern => "^\["

             negate => "true"

             what => "previous"

           }

}

syslog {

	type => "system-syslog"

	host => "10.211.55.8"

	port => "514"

}

}

output {

if [type] == "system" {

elasticsearch {

hosts => ["10.211.55.8:9200"]

index => "system-%{+YYYY.MM.dd}"

}

}

if [type] == "es-error" {

elasticsearch {

hosts => ["10.211.55.8:9200"]

index => "es-error-%{+YYYY.MM.dd}"

}

}

if [type] == "nginx-log" {

elasticsearch {

hosts => ["10.211.55.8:9200"]

index => "nginx-log-%{+YYYY.MM.dd}"

}

}

if [type] == "system-syslog" {

elasticsearch {

hosts => ["10.211.55.8:9200"]

index => "system-syslog-%{+YYYY.MM.dd}"

}

}

}

After starting it, run

logger "hallo 1"

logger "hallo 1"

logger "hallo 1"

logger "hallo 1"

logger "hallo 1"

logger "hallo 1"

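to test.

The configuration above can serve as a template for production.

Another commonly used Logstash plugin is the tcp input.

The system-syslog input handles syslog traffic. If an application would rather not write its logs to a file, Logstash can open a TCP listening port directly,

and the application can write its logs straight to that port.

The configuration looks like this:

vim /etc/logstash/conf.d/tcp.conf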

input {

tcp {

host => "10.211.55.8"

port => "6666"

}

}

output {

stdout {

codec => "rubydebug"

}

}

Start it: /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/tcp.conf

Then test with nc:

nc 10.211.55.8 6666 < /etc/resolv.conf

{

"host" => "node2.shared",

"message" => "# Generated by NetworkManager",

"@timestamp" => 2018-07-02T13:20:27.921Z,

"port" => 44257,

"@version" => "1"

}

{

"host" => "node2.shared",

"message" => "search localdomain shared",

"@timestamp" => 2018-07-02T13:20:27.943Z,

"port" => 44257,

"@version" => "1"

}

{

"host" => "node2.shared",

"message" => "nameserver 10.211.55.1",

"@timestamp" => 2018-07-02T13:20:27.944Z,

"port" => 44257,

"@version" => "1"

}

echo "hehe" | nc 10.211.55.8 6666

{

"host" => "node2.shared",

"message" => "hehe",

"@timestamp" => 2018-07-02T13:21:39.242Z,

"port" => 44259,

"@version" => "1"

}

echo "oldgirl" > /dev/tcp/10.211.55.8/6666

{

"host" => "node2.shared",

"message" => "oldgirl",

"@timestamp" => 2018-07-02T13:23:23.936Z,

"port" => 44260,

"@version" => "1"

}
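An application can do what nc does here with a plain socket. The sketch below sends newline-terminated lines over TCP, which the tests above suggest the tcp input turns into one event per line. The function name send_lines is made up for illustration, and the address in the usage comment is the listener from this walkthrough.

```python
import socket


def send_lines(host, port, lines):
    """Send each string as one newline-terminated line over a single TCP connection.

    Returns the number of bytes sent.
    """
    payload = "".join(line + "\n" for line in lines).encode("utf-8")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)
    return len(payload)


# Usage against the listener configured above (requires Logstash to be running):
# send_lines("10.211.55.8", 6666, ["hehe", "oldgirl"])
```

Each call opens a new connection, which matches the changing port values in the output above; a long-running application would more likely keep one connection open.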
