如何实现一个MapReduce读取数据存入HBase

190次阅读

共计 3245 个字符，预计需要花费 9 分钟才能阅读完成。

这篇文章给大家介绍如何实现一个 MapReduce 读取数据存入 HBase，内容非常详细，感兴趣的小伙伴们可以参考借鉴，希望对大家能有所帮助。

车辆位置数据文件，格式：车辆 id 速度：油耗：当前里程。

通过 MapReduce 算出每辆车的平均速度、油耗、里程

vid1 78:8:120
vid1 56:11:124
vid1 98:5:130
vid1 72:6:131
vid2 78:4:281
vid2 58:9:298
vid2 67:15:309

创建 Map 类和 map 函数

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
public class VehicleMapper extends Mapper Object, Text, Text, Text  {
 @Override
 public void map(Object key, Text value, Context context) throws IOException, InterruptedException {String vehicle = value.toString();//  将输入的纯文本的数据转换成 String
 //  将输入的数据先按行进行分割
 StringTokenizer tokenizerArticle = new StringTokenizer(vehicle,  \n 
 //  分别对每一行进行处理
 while (tokenizerArticle.hasMoreTokens()) {
 //  每行按空格划分
 StringTokenizer tokenizer = new StringTokenizer(tokenizerArticle.nextToken());
 String vehicleId = tokenizer.nextToken(); // vid
 String vehicleInfo = tokenizer.nextToken(); //  车辆信息
 Text vid = new Text(vehicleId);
 Text info = new Text(vehicleInfo);
 context.write(vid, info);
}

创建 Reduce 类

import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
public class VehicleReduce extends TableReducer Text, Text, ImmutableBytesWritable  {
 @Override
 public void reduce(Text key, Iterable Text  values, Context context) throws IOException, InterruptedException {
 int speed = 0;
 int oil = 0;
 int mile = 0;
 int count = 0;
 for (Text val : values) {String str = val.toString();
 String[] arr = str.split( : 
 speed += Integer.valueOf(arr[0]);
 oil += Integer.valueOf(arr[1]);
 mile += Integer.valueOf(arr[2]) - mile; //  累积里程
 count++;
 speed = (int) speed / count; //  求平均值
 oil = (int) oil / count;
 mile = (int) mile / count;
 String result = speed +  :  + oil +  :  + mile;
 Put put = new Put(key.getBytes());
 put.add(Bytes.toBytes( info), Bytes.toBytes(property), Bytes.toBytes(result));
 ImmutableBytesWritable keys = new ImmutableBytesWritable(key.getBytes());
 context.write(keys, put);
}

运行任务

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
public class VehicleMapReduceJob {public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {Configuration conf = new Configuration();
 conf = HBaseConfiguration.create(conf);
 Job job = new Job(conf,  HBase_VehicleInfo 
 job.setJarByClass(VehicleMapReduceJob.class);
 job.setMapperClass(VehicleMapper.class);
 job.setMapOutputKeyClass(Text.class);
 job.setMapOutputValueClass(Text.class);
 FileInputFormat.addInputPath(job, new Path(args[0])); //  设置输入文件路径
 TableMapReduceUtil.initTableReducerJob(vehicle , VehicleReduce.class, job);
 System.exit(job.waitForCompletion(true) ? 0 : 1);
}

将代码导出成 vehicle.jar，放在 hadoop-1.2.1 目录下，输入命令

./bin/hadoop jar vehicle.jar com/xh/vehicle/VehicleMapReduceJob input/vehicle.txt

HBase 结果查询：

关于如何实现一个 MapReduce 读取数据存入 HBase 就分享到这里了，希望以上内容可以对大家有一定的帮助，可以学到更多知识。如果觉得文章不错，可以把它分享出去让更多的人看到。

正文完

发表至：计算机运维

2023-08-17

转载说明：除特殊说明外本站除技术相关以外文章皆由网络搜集发布，转载请注明出处。

ubuntu和entos操作系统之间的差异是什么

如何解析全新的Windows 8资源管理器

linux中Shell脚本编程规范是什么

win7系统鼠标dpi如何调

如何进行Storm DRPC实现机制分析