What Are Kafka's Key Features?


This article walks through the main features of Kafka and the design decisions behind each of them.

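1. High concurrency
This usually means that a system can sustain a large number of connections as well as a very high level of concurrent requests.
In Kafka this comes mainly from a well-designed network layer, namely the Reactor-pattern network foundation discussed earlier.
The network layer is built on Java's NIO library and uses multiplexed I/O: a single selector can manage thousands of connections, which greatly reduces the server-side cost of maintaining connections compared with traditional blocking I/O (BIO).
On top of that, the Reactor design splits the work among three roles (acceptor, processor, and handler), which decouples network events from business logic and makes network event handling more efficient.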
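To make the selector idea concrete, here is a minimal sketch of a single-threaded echo server built on Java NIO. It is not Kafka's network layer: the class name MiniReactor and its methods are made up for illustration, and the three Reactor roles are collapsed into one thread (accept() plays the acceptor, the select loop plays the processor, handle() plays the handler), whereas a real broker would run them on separate threads.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical illustration: one selector multiplexing every connection.
class MiniReactor implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel server;
    private volatile boolean running = true;

    MiniReactor() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0 = any free port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        try {
            while (running) {
                selector.select(); // blocks until an event arrives or wakeup() is called
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) accept();
                    else if (key.isReadable()) handle((SocketChannel) key.channel());
                }
            }
            // Shutdown: close every registered channel, then the selector itself.
            for (SelectionKey key : selector.keys()) key.channel().close();
            selector.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "Acceptor" role: register each new connection with the same selector.
    private void accept() throws IOException {
        SocketChannel client = server.accept();
        if (client != null) {
            client.configureBlocking(false);
            client.register(selector, SelectionKey.OP_READ);
        }
    }

    // "Handler" role: the business logic here is simply echoing the bytes back.
    private void handle(SocketChannel client) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        if (client.read(buffer) < 0) { // peer closed the connection
            client.close();
            return;
        }
        buffer.flip();
        while (buffer.hasRemaining()) client.write(buffer);
    }

    void shutdown() {
        running = false;
        selector.wakeup();
    }

    // Starts the server, sends one message with a blocking client, returns the echo.
    static String echoOnce(String message) throws Exception {
        MiniReactor reactor = new MiniReactor();
        Thread thread = new Thread(reactor, "mini-reactor");
        thread.start();
        try (SocketChannel client =
                 SocketChannel.open(new InetSocketAddress("127.0.0.1", reactor.port()))) {
            byte[] out = message.getBytes(StandardCharsets.UTF_8);
            client.write(ByteBuffer.wrap(out));
            ByteBuffer in = ByteBuffer.allocate(out.length);
            while (in.hasRemaining() && client.read(in) >= 0) { /* keep reading */ }
            return new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
        } finally {
            reactor.shutdown();
            thread.join();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(echoOnce("hello kafka")); // the server echoes the text back
    }
}
```

The point of the sketch is that no thread is parked per connection: idle sockets cost only a registration with the selector, which is what lets one thread serve many clients.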

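2. High throughput
Throughput has two sides, which are discussed separately.
2.1 Write throughput. This comes mainly from the very high performance of append-only writes. How does Kafka implement them? Put simply, it holds a channel to the target file and appends through that channel.
How does it come to hold the file's channel? When a segment (that is, a log file) is created, Kafka already knows where the file lives and keeps a reference to it, so it does not need to locate the file again for each write, and sequential appends avoid the seek overhead of random writes.
Appends then go through that file's channel.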

public static FileRecords open(File file,
                               boolean mutable,
                               boolean fileAlreadyExists,
                               int initFileSize,
                               boolean preallocate) throws IOException {
    // Obtain the FileChannel that belongs to this log file
    FileChannel channel = openChannel(file, mutable, fileAlreadyExists, initFileSize, preallocate);
    int end = (!fileAlreadyExists && preallocate) ? 0 : Integer.MAX_VALUE;
    return new FileRecords(file, channel, 0, end, false);
}

private static FileChannel openChannel(File file,
                                       boolean mutable,
                                       boolean fileAlreadyExists,
                                       int initFileSize,
                                       boolean preallocate) throws IOException {
    // Obtain the FileChannel through a RandomAccessFile
    if (mutable) {
        if (fileAlreadyExists) {
            return new RandomAccessFile(file, "rw").getChannel();
        } else {
            if (preallocate) {
                RandomAccessFile randomAccessFile = new RandomAccessFile(file, "rw");
                randomAccessFile.setLength(initFileSize);
                return randomAccessFile.getChannel();
            } else {
                return new RandomAccessFile(file, "rw").getChannel();
            }
        }
    } else {
        return new FileInputStream(file).getChannel();
    }
}

public int writeFullyTo(GatheringByteChannel channel) throws IOException {
    // buffer is a field of MemoryRecords that is assigned at construction time.
    // Where does it come from? It is taken out of the ProduceRequest.
    buffer.mark();
    // The classic NIO write loop
    int written = 0;
    while (written < sizeInBytes())
        // The write lands in the OS cache rather than going straight to the disk file
        written += channel.write(buffer);
    buffer.reset();
    return written;
}
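The append path above can be tried out in isolation with the following sketch. It is an illustration under assumptions, not Kafka code: AppendLogSketch and its methods are hypothetical names, and it opens the channel with StandardOpenOption.APPEND instead of obtaining it from a RandomAccessFile as the Kafka snippets do.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of an append-only log backed by one long-lived FileChannel.
class AppendLogSketch implements AutoCloseable {
    private final FileChannel channel;

    AppendLogSketch(Path file) throws IOException {
        // The channel is opened once and reused for every append.
        channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    // Appends one record and returns the file position at which it starts.
    long append(byte[] record) throws IOException {
        long startPosition = channel.size();
        ByteBuffer buffer = ByteBuffer.wrap(record);
        while (buffer.hasRemaining()) channel.write(buffer); // same loop shape as writeFullyTo
        return startPosition;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    // Writes two records to a temporary file and returns where the second one starts.
    static long demo() throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        try (AppendLogSketch log = new AppendLogSketch(file)) {
            log.append("first".getBytes(StandardCharsets.UTF_8));
            return log.append("second".getBytes(StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // prints 5, the length of "first"
    }
}
```

Each record's start position is simply the file size before the write, which is why an append-only layout needs no search for a free slot.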

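2.2 Read throughput. This relies mainly on the much-discussed zero copy. Put simply, the OS provides a system call (sendfile on Linux) that lets the network card, given a small amount of metadata, read the target data directly from the OS cache.
This avoids copying the data into user space (the JVM) and then again into the socket buffer. It nearly eliminates the CPU cost of copying data and also reduces switches between user mode and kernel mode, so zero copy performs extremely well when sending data.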

So how does Kafka take advantage of zero copy? Quite simply, in the writeTo method of FileRecords, shown here:

public long writeTo(GatheringByteChannel destChannel, long offset, int length) throws IOException {
    long newSize = Math.min(channel.size(), end) - start;
    int oldSize = sizeInBytes();
    if (newSize < oldSize)
        throw new KafkaException(String.format(
                "Size of FileRecords %s has been truncated during write: old size %d, new size %d",
                file.getAbsolutePath(), oldSize, newSize));
    long position = start + offset;
    int count = Math.min(length, oldSize);
    final long bytesTransferred;
    if (destChannel instanceof TransportLayer) {
        TransportLayer tl = (TransportLayer) destChannel;
        bytesTransferred = tl.transferFrom(channel, position, count);
    } else {
        bytesTransferred = channel.transferTo(position, count, destChannel);
    }
    return bytesTransferred;
}
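The key call in writeTo is FileChannel.transferTo. The sketch below exercises that call on its own; ZeroCopySketch and its methods are hypothetical names. One caveat: the demo sends the file to an in-memory channel so that it runs anywhere, and in that case the JVM copies the bytes itself. The kernel-level zero-copy path is only available when the destination is something like a SocketChannel on a platform that supports it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of sending a file through FileChannel.transferTo.
class ZeroCopySketch {

    // Transfers the whole file to dest without reading it into a user-space buffer here.
    static long send(Path source, WritableByteChannel dest) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long position = 0;
            long size = in.size();
            while (position < size) {
                // transferTo may move fewer bytes than requested, so loop until done.
                position += in.transferTo(position, size - position, dest);
            }
            return position;
        }
    }

    // Writes payload to a temporary file, sends it to an in-memory sink, returns what arrived.
    static String demo(String payload) throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        try {
            Files.write(file, payload.getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            send(file, Channels.newChannel(sink));
            return new String(sink.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo("record batch bytes")); // prints the payload unchanged
    }
}
```

Note that the application code never touches the file contents; it only passes a position and a byte count, which matches the "small amount of metadata" described above.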

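3. High performance and low latency
These two are discussed together because high performance is a broad notion: overall performance comes from good design in many different places. The timing wheel mentioned earlier is a classic example.
Low latency comes mainly from being able to write to the OS cache. Unless forced flushing is configured, a write counts as locally successful as soon as it reaches the OS cache. Writing to memory is very fast, so combined with append-only writes the latency of the whole operation is very low.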
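The trade-off between acknowledging from the OS cache and forcing a flush can be observed with a small experiment. This is only a sketch: FlushPolicySketch and timedAppend are hypothetical names, and the timings depend entirely on the machine and file system. On the Kafka side, forced flushing is governed by broker settings such as log.flush.interval.messages and log.flush.interval.ms.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration: the same append with and without a forced flush.
class FlushPolicySketch {

    // Appends payload and returns the elapsed nanoseconds; syncs to disk only if forceFlush is true.
    static long timedAppend(FileChannel channel, byte[] payload, boolean forceFlush)
            throws IOException {
        long start = System.nanoTime();
        ByteBuffer buffer = ByteBuffer.wrap(payload);
        while (buffer.hasRemaining()) channel.write(buffer); // reaches the OS cache
        if (forceFlush) channel.force(true);                 // waits for the storage device
        return System.nanoTime() - start;
    }

    // Returns {nanos without flush, nanos with flush, final file size}.
    static long[] demo() throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            byte[] payload = new byte[4096];
            long cachedNanos = timedAppend(channel, payload, false);
            long flushedNanos = timedAppend(channel, payload, true);
            return new long[] {cachedNanos, flushedNanos, channel.size()};
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] result = demo();
        System.out.println("cache only: " + result[0] + " ns, forced flush: " + result[1] + " ns");
    }
}
```

On most hardware the forced flush is noticeably slower, which is why acknowledging from the OS cache keeps latency low. The price is that data not yet flushed can be lost if the machine crashes, and the replication described in the next section is what covers that risk.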

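4. High reliability and high availability
High reliability usually refers to message reliability. It rests mainly on replication: each piece of data has multiple replicas spread across different machines, which provides good reliability.
High availability usually means that the service keeps working when a machine crashes or fails in some other way. On the server side this shows up mainly in the controller design: the controller learns of broker changes through ZooKeeper (zk) and performs the corresponding state changes.
Finally there is the ISR design together with leader and follower replicas: when the broker hosting a leader replica goes down, a new leader can be elected from the remaining replicas (by default, those still in the ISR) so that the partition keeps serving requests.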
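The failover rule can be sketched as a small function. This is a simplified illustration rather than the controller's actual code: LeaderElectionSketch and electLeader are hypothetical names, and the rule assumed here is "pick the first replica in assignment order that is both alive and in the ISR", with unclean leader election disabled.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration of choosing a new partition leader after a broker failure.
class LeaderElectionSketch {

    // Returns the first assigned replica that is alive and in sync, if there is one.
    static Optional<Integer> electLeader(List<Integer> assignedReplicas,
                                         Set<Integer> isr,
                                         Set<Integer> liveBrokers) {
        return assignedReplicas.stream()
                .filter(id -> isr.contains(id) && liveBrokers.contains(id))
                .findFirst();
    }

    // Broker 1 led the partition and has crashed; brokers 2 and 3 remain in the ISR.
    static int demo() {
        List<Integer> assignedReplicas = List.of(1, 2, 3);
        Set<Integer> isr = Set.of(2, 3);
        Set<Integer> liveBrokers = Set.of(2, 3);
        return electLeader(assignedReplicas, isr, liveBrokers).orElse(-1);
    }

    // No in-sync replica is alive, so no leader can be chosen under this rule.
    static boolean demoOffline() {
        return electLeader(List.of(1, 2, 3), Set.of(1), Set.of(2, 3)).isPresent();
    }

    public static void main(String[] args) {
        System.out.println(demo());        // prints 2
        System.out.println(demoOffline()); // prints false
    }
}
```

Restricting candidates to the ISR is what ties availability back to reliability: a replica that has fallen behind is not promoted, so acknowledged messages are not silently dropped during failover.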
