APACHE KAFKA - OVERVIEW (RHEL 7)

Murat Can ÇOBAN

In this article, I will walk you through Apache Kafka, from its own components to its installation, and we will finish with a small hands-on application.


What is Apache Kafka?


Apache Kafka is one of the most popular technologies in the big data world. In short, it is a high-performance distributed messaging system. Originally developed by LinkedIn engineers and open-sourced in early 2011, Kafka entered the Apache Incubator the same year and graduated as a top-level Apache project in 2012.



a) Apache Kafka Components

  • Producer

  • Topic

  • Partition

  • Offset

  • Consumer

  • Broker

  • Zookeeper


1) Producer

It is the unit that sends messages to topics.


2) Topic

This is where incoming messages are stored. There can be more than one topic. You can think of a topic like a table in a database.


3) Partition

Each Kafka topic is divided into partitions. Partitions split the data into separate sections so that a topic can be written and read in parallel.



4) Offset

Each record in a partition is assigned an offset, a sequential number that reflects the order in which the record arrived.
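To make partitions and offsets concrete, here is a toy model in shell. It is only a sketch and not how Kafka stores data: each partition is an append-only file, and a record's offset is its zero-based line number in that file. The names TOPIC_DIR, PARTITIONS, partition_for and produce are made up for this example, and cksum stands in for the hash that Kafka's default partitioner applies to the record key.

```shell
#!/bin/sh
# Toy model only: one append-only file per partition, offset = line number.
TOPIC_DIR="${TOPIC_DIR:-$(mktemp -d)}"   # hypothetical scratch directory
PARTITIONS=3                             # hypothetical partition count

# Choose a partition from the record key.
# cksum (a CRC) is a stand-in for Kafka's real key hash.
partition_for() {
    hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( hash % PARTITIONS ))
}

# Append a record to its partition and print "<partition> <offset>".
produce() {
    key=$1
    value=$2
    p=$(partition_for "$key")
    log="$TOPIC_DIR/partition-$p.log"
    touch "$log"
    # The next offset equals the number of records already in the partition.
    offset=$(wc -l < "$log" | tr -d ' ')
    printf '%s\t%s\n' "$key" "$value" >> "$log"
    echo "$p $offset"
}
```

Calling produce user-1 "first" and then produce user-1 "second" places both records in the same partition, with offsets 0 and 1. Records with different keys may land in different partitions, each with its own independent offset sequence.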


5) Consumer

It is the unit that reads messages from topics. There may be more than one.


6) Broker

Each Kafka server is called a broker. Running more than one broker is important for high availability. One broker is elected as the controller, and if it crashes, another broker takes over so that the cluster keeps operating.


7) Zookeeper

Zookeeper is a coordination service for distributed systems. In a Kafka cluster, it keeps track of cluster metadata and lets the brokers coordinate with each other.


b) Apache Kafka Setup



First, we start by installing Java.

yum install java-1.8.0-openjdk.x86_64

Then we check the Java version.

java -version




We add the JAVA_HOME and JRE_HOME environment variables to the /etc/bashrc file.

vi /etc/bashrc

export JRE_HOME=/usr/lib/jvm/jre
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
export PATH=$PATH:$JRE_HOME/bin:$JAVA_HOME/bin

To apply the changes in the current shell, we source the file.

source /etc/bashrc
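Before moving on, it can help to confirm that JAVA_HOME really points at a Java installation. The check below is a minimal sketch; check_java_home is a helper name invented for this example.

```shell
# Returns 0 if the given directory contains an executable bin/java.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if check_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks fine: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no bin/java: '${JAVA_HOME:-}'" >&2
fi
```

If the second message appears, check the paths in /etc/bashrc against the contents of /usr/lib/jvm.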

Now we start the Kafka installation by creating a user.

useradd kafka -m

We assign a password to this user and add it to the wheel group so that it can use sudo.

passwd kafka
sudo usermod -aG wheel kafka

Let's switch to the user we created.

su - kafka

We move the archive to our home directory and unpack Kafka there. Here I want to show you how to install Kafka offline, from an archive that was downloaded beforehand.

cd /Downloads
mv kafka_2.12-3.1.1.tgz /home/kafka
cd /home/kafka
tar -xvzf kafka_2.12-3.1.1.tgz
mv kafka_2.12-3.1.1/* .
rmdir /home/kafka/kafka_2.12-3.1.1
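As an alternative to extracting and then moving the files, tar can drop the archive's top-level folder while unpacking. The sketch below assumes a tar that supports --strip-components (GNU tar does); extract_flat is a helper name made up for this example.

```shell
# Extract an archive into a directory, dropping the archive's top-level folder.
extract_flat() {
    archive=$1
    dest=$2
    mkdir -p "$dest"
    tar -xzf "$archive" -C "$dest" --strip-components=1
}
```

For example, extract_flat /path/to/kafka-archive.tgz /home/kafka would place bin, config and libs directly under /home/kafka. Unlike mv with a * pattern, this also carries over any hidden files in the archive.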

Let's create a service for Zookeeper.


sudo vi /lib/systemd/system/zookeeper.service
[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/zookeeper-server-start.sh /home/kafka/config/zookeeper.properties
ExecStop=/home/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Now we create a service for Kafka.

sudo vi /etc/systemd/system/kafka.service
[Unit]
Requires=network.target remote-fs.target zookeeper.service
After=network.target remote-fs.target zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/kafka-server-start.sh /home/kafka/config/server.properties
ExecStop=/home/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target


Let's edit the Kafka configuration.

vi /home/kafka/config/server.properties
listeners=PLAINTEXT://:9092
log.dirs=/var/log/kafka-logs
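After editing, the values can be read back from the file to make sure they are set and not commented out. This is a small sketch; get_prop is a helper name invented for this example, and it assumes plain key=value lines without spaces around the equals sign.

```shell
# Print the value of a key from a Java-style properties file.
# If the key appears more than once, the last occurrence wins.
get_prop() {
    awk -F= -v key="$2" '
        $1 == key { sub(/^[^=]*=/, ""); value = $0 }
        END { print value }
    ' "$1"
}
```

Running get_prop /home/kafka/config/server.properties log.dirs should then print /var/log/kafka-logs. An empty line means the key is missing or still commented out.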

Run the command below so that systemd picks up the services we created.

sudo systemctl daemon-reload

Let's create the Kafka log directory and give ownership of it to the kafka user.

sudo mkdir -p /var/log/kafka-logs
sudo chown -R kafka:kafka /var/log/kafka-logs

Run the commands below so that the services start automatically at boot.

sudo systemctl enable zookeeper.service
sudo systemctl enable kafka.service

Now let's start our services in order.

sudo systemctl start zookeeper.service
sudo systemctl start kafka.service

Let's check the status of the started services.

systemctl status zookeeper.service

systemctl status kafka.service
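A service may need a few seconds before it reports as active, so a script that continues with Kafka commands can wait for it first. The loop below is a sketch built on systemctl is-active; wait_active and its default of 10 tries are choices made for this example.

```shell
# Poll a systemd unit once per second until it is active.
# $1 = unit name, $2 = maximum number of tries (default 10).
wait_active() {
    unit=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        if systemctl is-active --quiet "$unit"; then
            return 0
        fi
        tries=$(( tries - 1 ))
        sleep 1
    done
    return 1
}
```

For example, wait_active zookeeper.service && wait_active kafka.service succeeds only once both services are running. An active unit does not guarantee that the broker already accepts connections on port 9092, so treat it as a first check only.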

c) Apache Kafka Producer-Consumer App


As the kafka user, we create a topic named data4tech.
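The topic can be created with the kafka-topics.sh script that ships with Kafka. The command below is a sketch, not necessarily the exact one used for this article: the single partition and replication factor of 1 are assumptions that suit a one-broker test setup, and create_topic is a wrapper name made up for this example. Kafka versions older than 2.2 expect --zookeeper localhost:2181 in place of --bootstrap-server.

```shell
# Assumed install location, matching the paths used in the service files.
KAFKA_HOME="${KAFKA_HOME:-/home/kafka}"

# Create a topic with one partition and one replica (single-broker setup).
create_topic() {
    "$KAFKA_HOME/bin/kafka-topics.sh" --create \
        --bootstrap-server localhost:9092 \
        --replication-factor 1 \
        --partitions 1 \
        --topic "$1"
}
```

Running create_topic data4tech creates the topic used in the rest of this section.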


Let's start a producer on the console.

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic data4tech

Let's start a consumer on the console, in a second terminal.

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic data4tech --from-beginning

Now, let's test whether the consumer receives the messages we send from the producer.

The messages we type into the producer reach the consumer almost immediately.
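The same round trip can also be run without typing into two terminals. The sketch below pipes one message into the console producer and reads a fixed number of messages back. send_message and read_messages are names made up for this example, and the 10-second timeout is an arbitrary choice.

```shell
# Assumed install location, matching the paths used in the service files.
KAFKA_HOME="${KAFKA_HOME:-/home/kafka}"

# Send one message to a topic. $1 = topic, $2 = message text.
send_message() {
    printf '%s\n' "$2" | "$KAFKA_HOME/bin/kafka-console-producer.sh" \
        --broker-list localhost:9092 --topic "$1"
}

# Read messages from the start of a topic, then exit.
# $1 = topic, $2 = number of messages to read.
read_messages() {
    "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
        --bootstrap-server localhost:9092 --topic "$1" \
        --from-beginning --max-messages "$2" --timeout-ms 10000
}
```

For example, send_message data4tech "hello" followed by read_messages data4tech 1 should print the first message stored in the topic.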




In this article, we discussed Apache Kafka. If you liked the article, you can support us by sharing it.


Hope to see you in new posts, take care.

©2021, Data4Tech 