共计 3462 个字符,预计需要花费 9 分钟才能阅读完成。
这篇文章主要介绍“Flume 的 Sink 怎么使用”,在日常操作中,相信很多人在 Flume 的 Sink 怎么使用问题上存在疑惑,丸趣 TV 小编查阅了各式资料,整理出简单好用的操作方法,希望对大家解答“Flume 的 Sink 怎么使用”的疑惑有所帮助!接下来,请跟着丸趣 TV 小编一起来学习吧!
Logger Sink
Logs 会输出到 console,是为了 debug 用的。
[root@hftest0001 conf]# pwd
/opt/apache-flume-1.6.0-bin/conf
[root@hftest0001 conf]# vi s-exec_c-m_s-logger.conf
# Component names for this agent: one exec source, one memory channel, one logger sink.
agent.sources = exec_tail
agent.channels = memoryChannel
agent.sinks = loggerSink
# Exec source: runs `tail -F`, so every line appended to the file becomes a Flume event.
agent.sources.exec_tail.type = exec
agent.sources.exec_tail.command = tail -F /opt/flume-data/exec-tail.log
agent.sources.exec_tail.channels = memoryChannel
# Logger sink: prints each event to the agent's log/console (debugging only).
agent.sinks.loggerSink.type = logger
agent.sinks.loggerSink.channel = memoryChannel
# In-memory channel buffering at most 100 events between source and sink.
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 100
[root@hftest0001 apache-flume-1.6.0-bin]# pwd
/opt/apache-flume-1.6.0-bin
[root@hftest0001 opt]# mkdir -p /opt/flume-data/
[root@hftest0001 opt]# touch /opt/flume-data/exec-tail.log
[root@hftest0001 apache-flume-1.6.0-bin]# flume-ng agent -n agent -c conf/ -f conf/s-exec_c-m_s-logger.conf
[root@hftest0001 opt]# echo "Hello Flume" >> /opt/flume-data/exec-tail.log
观察 console,会打印类似这样的一行:Event: { headers:{} body: 48 65 6C 6C 6F 20 46 6C 75 6D 65 Hello Flume }
HDFS Sink
[root@hftest0001 conf]# pwd
/opt/apache-flume-1.6.0-bin/conf
[root@hftest0001 conf]# vi s-exec_c-m_s-hdfs.conf
# Component names for this agent: exec source -> memory channel -> HDFS sink.
agent.sources = exec_tail
agent.channels = memoryChannel
agent.sinks = hdfs_sink

# Exec source tailing a local file. The timestamp interceptor adds a
# "timestamp" header to each event so the %y-%m-%d escapes in hdfs.path
# can be resolved.
agent.sources.exec_tail.type = exec
agent.sources.exec_tail.command = tail -F /opt/flume-data/exec-tail.log
agent.sources.exec_tail.interceptors = i1
agent.sources.exec_tail.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
agent.sources.exec_tail.channels = memoryChannel

agent.sinks.hdfs_sink.type = hdfs
# HDFS directory to write into. NOTE: the annotation lives on its own
# comment line — in a properties file everything after '=' is part of
# the value, so trailing text would corrupt the path.
agent.sinks.hdfs_sink.hdfs.path = hdfs://10.224.243.124:9000/flume/events/%y-%m-%d
# Three roll-file policies (tune these to avoid many empty/small files):
#   rollInterval — time based:  default 30 s;      0 disables
#   rollSize     — size based:  default 1024 bytes; 0 disables
#   rollCount    — event count: default 10 events;  0 disables
#agent.sinks.hdfs_sink.hdfs.rollInterval = 30
#agent.sinks.hdfs_sink.hdfs.rollSize = 1024
#agent.sinks.hdfs_sink.hdfs.rollCount = 10
# File format written to HDFS (default: SequenceFile):
#   SequenceFile     - Hadoop writables (LongWritable / BytesWritable ...)
#   DataStream       - no compression; hdfs.codeC must NOT be set
#   CompressedStream - compressed; a valid hdfs.codeC is required
agent.sinks.hdfs_sink.hdfs.fileType = DataStream
#agent.sinks.hdfs_sink.hdfs.codeC
agent.sinks.hdfs_sink.hdfs.writeFormat = Text
agent.sinks.hdfs_sink.hdfs.filePrefix = flume
# Timeout (ms) for HDFS operations such as open/write/flush; consider
# raising it on systems with poor network connectivity.
# (Fixed: the original had a doubled "hdfs.hdfs." key prefix.)
#agent.sinks.hdfs_sink.hdfs.callTimeout = 10000
agent.sinks.hdfs_sink.channel = memoryChannel

agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 100
File Roll Sink
[root@hftest0001 conf]# pwd
/opt/apache-flume-1.6.0-bin/conf
[root@hftest0001 conf]# vi s-exec_c-m_s-file-roll.conf
# Component names for this agent: exec source -> memory channel -> file_roll sink.
agent.sources = exec_tail-1
agent.channels = memoryChannel
agent.sinks = file_roll-1
# Exec source: tail the log file and turn appended lines into events.
agent.sources.exec_tail-1.type = exec
agent.sources.exec_tail-1.command = tail -F /opt/flume-data/exec-tail.log
agent.sources.exec_tail-1.channels = memoryChannel
# file_roll sink: writes events to local files under sink.directory
# (the directory must already exist).
agent.sinks.file_roll-1.type = file_roll
agent.sinks.file_roll-1.sink.directory= /opt/flume-data/file-roll-1
# Roll policy: by default a new file is started every 30 s; setting 0
# disables rolling, so everything is written to a single file.
#agent.sinks.file_roll-1.sink.rollInterval= 30
agent.sinks.file_roll-1.channel = memoryChannel
# In-memory channel buffering at most 100 events.
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 100
到此,关于“Flume 的 Sink 怎么使用”的学习就结束了,希望能够解决大家的疑惑。理论与实践的搭配能更好的帮助大家学习,快去试试吧!若想继续学习更多相关知识,请继续关注丸趣 TV 网站,丸趣 TV 小编会继续努力为大家带来更多实用的文章!
正文完