共计 3435 个字符,预计需要花费 9 分钟才能阅读完成。
这篇文章将为大家详细讲解有关 Flink 怎么用,丸趣 TV 小编觉得挺实用的,因此分享给大家做个参考,希望大家阅读完这篇文章后可以有所收获。
Flink 安装准备
Flink 运行支持 Linux、苹果、Windows 主流平台。不过最好还是使用 Linux。下面给出安装前的准备:
安装 Jdk1.7.X 或者以上的版本
在 Flink 官网下载对应 Hadoop 预编译版本
将预编译版本解压,进入解压缩文件,为了方便,后文统一称此目录为:FLINK_HOME。
开始安装单机快速尝试
单机尝试非常简单,直接执行命令:
Linux 用户:sh bin/start-local.sh
Windows 用户,在命令窗户输入:bin\start-local.bat
等待其出现如下提示之后:
D:\Java\flink\flink-0.10.1 bin\start-local.bat
Starting Flink job manager. Webinterface by default on http://localhost:8081/.
Don t close this batch window. Stop job manager by pressing Ctrl+C.
在浏览器中输入:http://localhost:8081/,Flink 默认监听 8081 端口,防止其他进程占用此端口。此时出现下面的管理界面:
可以发现这个界面和 Spark 的管理界面的逻辑差不多,主要是管理正在运行的 Job,已经完成的 Job,以及 Task 管理和 Job 管理,Task 应该是管理 Job 的,以后再仔细分析里面的逻辑。
跑第一个例子
下面迫不及待先来跑一个分布式系统最经典的例子:WordCount,下面以 FLINK_HOME 的 README.txt 文件作为示例文件,测试 WordCount 程序,在 Windows 上面运行代码以及运行过程如下图:
D:\Java\flink\flink-0.10.1 bin\flink.bat run .\examples\WordCount.jar file:/D:/Java/flink/flink-0.10.1/README.txt file:/D:/Java/flink/flink-0.10.1/wordcount-result.txt
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.li
b.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
01/15/2016 16:30:51 Job execution switched to status RUNNING.
01/15/2016 16:30:51 CHAIN DataSource (at getTextDataSet(WordCount.java:142)
(org.apache.flink.api.java.io.TextInputFormat)) - FlatMap (FlatMap at main(WordCount.java:69)) - Combine(SUM(1), at main(WordCount.java:72)(1/1) switched to SCHEDULED
01/15/2016 16:30:51 CHAIN DataSource (at getTextDataSet(WordCount.java:142)
(org.apache.flink.api.java.io.TextInputFormat)) - FlatMap (FlatMap at main(WordCount.java:69)) - Combine(SUM(1), at main(WordCount.java:72)(1/1) switched to DEPLOYING
01/15/2016 16:30:52 CHAIN DataSource (at getTextDataSet(WordCount.java:142)
(org.apache.flink.api.java.io.TextInputFormat)) - FlatMap (FlatMap at main(WordCount.java:69)) - Combine(SUM(1), at main(WordCount.java:72)(1/1) switched to RUNNING
01/15/2016 16:30:52 Reduce (SUM(1), at main(WordCount.java:72)(1/1) switched to SCHEDULED
01/15/2016 16:30:52 Reduce (SUM(1), at main(WordCount.java:72)(1/1) switched to DEPLOYING
01/15/2016 16:30:52 CHAIN DataSource (at getTextDataSet(WordCount.java:142)
(org.apache.flink.api.java.io.TextInputFormat)) - FlatMap (FlatMap at main(WordCount.java:69)) - Combine(SUM(1), at main(WordCount.java:72)(1/1) switched to FINISHED
01/15/2016 16:30:52 Reduce (SUM(1), at main(WordCount.java:72)(1/1) switched to RUNNING
01/15/2016 16:30:53 DataSink (CsvOutputFormat (path: file:/D:/Java/flink/flink-0.10.1/wordcount-result.txt, delimiter: ))(1/1) switched to SCHEDULED
01/15/2016 16:30:53 DataSink (CsvOutputFormat (path: file:/D:/Java/flink/flink-0.10.1/wordcount-result.txt, delimiter: ))(1/1) switched to DEPLOYING
01/15/2016 16:30:53 Reduce (SUM(1), at main(WordCount.java:72)(1/1) switched to FINISHED
01/15/2016 16:30:53 DataSink (CsvOutputFormat (path: file:/D:/Java/flink/flink-0.10.1/wordcount-result.txt, delimiter: ))(1/1) switched to RUNNING
01/15/2016 16:30:53 DataSink (CsvOutputFormat (path: file:/D:/Java/flink/flink-0.10.1/wordcount-result.txt, delimiter: ))(1/1) switched to FINISHED
01/15/2016 16:30:53 Job execution switched to status FINISHED.
可以看到输出日志非常详细,很方便就清楚整个运行流程,得到输出文件 wordcount-result.txt 前面 10 条内容如下:
1 1
13 1
5d002 1
740 1
about 1
account 1
administration 1
algorithms 1
and 7
another 1
any 2
关于“Flink 怎么用”这篇文章就分享到这里了,希望以上内容可以对大家有一定的帮助,使各位可以学到更多知识,如果觉得文章不错,请把它分享出去让更多的人看到。