In this article, I will walk you through Apache Kafka, from installation to its core components, and we will reinforce each step with hands-on examples.
What is Apache Kafka?
Apache Kafka is one of the most popular technologies in the big data world. In short, it is a high-performance distributed messaging system. Developed by LinkedIn engineers and open-sourced on GitHub in 2010, Kafka was proposed as an Apache Software Foundation Incubator project in 2011 and graduated as Apache Kafka in 2012.
a) Apache Kafka Components
Producer: It is the unit that sends messages to topics.
Topic: This is where incoming messages are stored. There can be more than one topic. You can think of a topic like a table in a database.
Partition: Kafka topics are divided into partitions. Partitions allow topics to be parallelized by splitting the data into separate sections.
Offset: Each record in a partition is assigned a sequential offset, and records are stored in the order in which they arrive.
Consumer: It is the unit that reads messages from topics. There may be more than one consumer.
Broker: Each Kafka server is called a broker. Having more than one broker is important for making the system highly available. A leader broker is elected, and if that broker crashes, the other brokers step in and keep the system running.
Zookeeper: Zookeeper is a coordination service for distributed systems. In a Kafka cluster, it enables Kafka components to communicate with each other.
b) Apache Kafka Setup
First, we start by installing Java.
yum install java-1.8.0-openjdk.x86_64
Then we check the Java version.
java -version
We add the JAVA_HOME and JRE_HOME environment variables to the /etc/bashrc file.
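The exact lines depend on where the package manager placed the JDK; the snippet below is a sketch assuming the default OpenJDK 1.8 location on a yum-based system. Verify the path on your machine before using it.

```shell
# Lines to append to /etc/bashrc.
# The paths below are assumptions for a default OpenJDK 1.8 install;
# verify yours with: readlink -f "$(which java)"
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
export JRE_HOME=/usr/lib/jvm/jre
```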
To apply the changes, we source the file.
source /etc/bashrc
Now we start the Kafka installation and create a user.
useradd kafka -m
We assign a password to this user.
sudo passwd kafka
sudo usermod -aG wheel kafka
Let's switch to the user we created.
su - kafka
We go to our home directory and install Kafka. Here I want to show you how to install Kafka offline.
mv kafka_2.12-2.1.0.tgz /home/kafka
tar -xvzf kafka_2.12-2.1.0.tgz
mv kafka_2.12-2.1.0/* .
Let's create a service for Zookeeper.
sudo vi /lib/systemd/system/zookeeper.service
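The contents of the unit file are not shown above; a minimal sketch of what zookeeper.service might contain is below, assuming Kafka was unpacked into /home/kafka as in the earlier steps.

```ini
[Unit]
Description=Apache Zookeeper server
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/zookeeper-server-start.sh /home/kafka/config/zookeeper.properties
ExecStop=/home/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
```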
Now we are creating a service for Kafka.
sudo vi /etc/systemd/system/kafka.service
Requires=network.target remote-fs.target zookeeper.service
After=network.target remote-fs.target zookeeper.service
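Only the dependency lines of kafka.service are shown above; a minimal sketch of the full unit, assuming the same /home/kafka layout, might look like this.

```ini
[Unit]
Requires=network.target remote-fs.target zookeeper.service
After=network.target remote-fs.target zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/home/kafka/bin/kafka-server-start.sh /home/kafka/config/server.properties > /home/kafka/kafka.log 2>&1'
ExecStop=/home/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
```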
Let's edit Kafka configurations.
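The article does not show which settings were changed; at minimum, the broker's data directory in config/server.properties should point at the log directory created in the next step. The values below are a sketch assuming default ports.

```properties
# config/server.properties (relevant lines; values are assumptions)
broker.id=0
listeners=PLAINTEXT://localhost:9092
log.dirs=/var/log/kafka-logs
zookeeper.connect=localhost:2181
```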
To register the services we created, we reload the systemd daemon.
sudo systemctl daemon-reload
Let's create the Kafka log file and authorize the kafka user.
sudo mkdir -p /var/log/kafka-logs
sudo chown kafka:kafka -R /var/log/kafka-logs
Enter the commands below so that the services start automatically at boot.
systemctl enable zookeeper.service
systemctl enable kafka.service
Now let's start our services in order.
systemctl start zookeeper.service
systemctl start kafka.service
Let's check the status of the started services.
systemctl status zookeeper.service
systemctl status kafka.service
c) Apache Kafka Producer-Consumer App
Switching to the kafka user, we create a topic named data4tech.
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic data4tech
Let's create a producer message system on the console.
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic data4tech
Let's create a consumer message system on the console.
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic data4tech --from-beginning
Now, let's test that the messages sent from the producer are received by the consumer.
As you can see, the messages we send from the producer reach the consumer immediately.
In this article, we covered Apache Kafka together. If you liked the article, you can support us by sharing it.
Hope to see you in new posts, take care.