How to Install Apache Kafka on Debian 9
Published Wed, 15 Jul 2020 on LinuxHostSupport (https://linuxhostsupport.com/blog/how-to-install-apache-kafka-on-debian-9/)

In this guide, we will show you how to install Apache Kafka on a Debian 9 VPS.

Apache Kafka is a free and open-source distributed streaming platform that lets you publish and subscribe to streams of records and store those streams in a fault-tolerant, durable manner. It is written in Scala and Java. Used by thousands of companies across the world, Apache Kafka lets you build streaming and stream-processing applications that read and store data in real time. Use cases range from logging and messaging to processing almost any sort of data stream you could imagine. Let’s get started with the installation.

In order to run Apache Kafka on your VPS, the following requirements have to be met:

  • Java 8 or higher needs to be installed
  • ZooKeeper installed and running on the server
  • A VPS with at least 4GB of RAM

If you don’t have Java or ZooKeeper, don’t worry, we’ll be installing them in this tutorial as well.

Step 1 – Update OS Packages

Before we can start with the Apache Kafka installation, we have to make sure that all Debian OS packages that are installed on the server are up to date. We can do this by executing the following commands:

sudo apt-get update
sudo apt-get upgrade

Step 2 – Install Java

In order to run Apache Kafka on our server, we’ll need to have Java installed. We can check if Java is already installed using this command:

which java

If there is no output, that means that Java is not installed on the server yet. We can install it using the following command:

sudo apt-get install default-jdk

In order to check the Java version, run the following command on your server:

java -version

We should receive output similar to the following:

openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-2~deb9u1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
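
The version check can also be scripted. The following is a minimal sketch, not part of the original guide: the helper names java_major, installed_java_version and java_is_supported are made up for illustration. It assumes the version string follows either the legacy "1.8.0_181" scheme shown above or the newer "11.0.2" scheme.

```shell
#!/bin/sh
# Print the major Java version for a version string.
# Handles both the legacy "1.8.0_181" scheme and the newer "11.0.2" scheme.
java_major() {
    v=$1
    case $v in
        1.*) v=${v#1.} ;;          # 1.8.0_181 -> 8.0_181
    esac
    printf '%s\n' "${v%%[._-]*}"   # keep only the leading number
}

# Extract the quoted version string from `java -version`, which writes to stderr.
installed_java_version() {
    java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}

# Succeed when the given version string meets the "Java 8 or higher" requirement.
java_is_supported() {
    [ "$(java_major "$1")" -ge 8 ]
}
```

For example, java_is_supported "$(installed_java_version)" && echo "Java is recent enough" prints the message only when the installed Java is version 8 or newer.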

Step 3 – Install ZooKeeper

Kafka uses ZooKeeper to store persistent cluster metadata, so we need to install ZooKeeper. The ZooKeeper service is responsible for configuration management, leader election, synchronization, etc. ZooKeeper is available in the official Debian package repository, so we can install it using the following command:

sudo apt-get install zookeeperd

ZooKeeper listens on port 2181 by default and doesn’t require much maintenance.
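
One way to confirm that ZooKeeper is answering on that port is its "ruok" four-letter command, to which a healthy server replies "imok". The sketch below assumes that nc (netcat) is installed and that four-letter commands are enabled, which is the default in the 3.4 release line (newer releases require whitelisting them through 4lw.commands.whitelist). The function name zk_is_ok is hypothetical.

```shell
#!/bin/sh
# Ask a ZooKeeper server whether it is healthy using the "ruok" four-letter command.
# A healthy server replies "imok". Host and port default to localhost:2181.
zk_is_ok() {
    host=${1:-localhost}
    port=${2:-2181}
    # -w 2 gives up after two seconds if the server does not answer.
    reply=$(printf 'ruok' | nc -w 2 "$host" "$port" 2>/dev/null)
    [ "$reply" = "imok" ]
}
```

Running zk_is_ok && echo "ZooKeeper is up" prints the message only when the server replies as expected.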

Step 4 – Install Apache Kafka

Create a new system user dedicated to the Kafka service using the following command (we’re using kafka as the username, but you can use any name you like):

useradd kafka -m

Set a password for the newly created user:

passwd kafka

Use a strong password and enter it twice. Next, add the user to the sudo group with:

adduser kafka sudo

Stop the ZooKeeper service:

systemctl stop zookeeper.service

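Switch to the newly created user. This user is in the sudo group, so prefix any of the commands below that change system files or services with sudo: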

su kafka

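Download Apache Kafka and extract it into the home directory of the kafka user. The commands below fetch version 2.1.0 (built for Scala 2.12); check https://kafka.apache.org/downloads for the latest release and adjust the version numbers accordingly: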

cd ~
wget -O kafka.tgz http://apache.osuosl.org/kafka/2.1.0/kafka_2.12-2.1.0.tgz
tar -xvzf kafka.tgz
mv kafka_2.12-2.1.0/* .
rmdir /home/kafka/kafka_2.12-2.1.0
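
Kafka release archives are named kafka_<Scala version>-<Kafka version>.tgz, so the file name and URL used above can be built from the two version numbers. The sketch below illustrates this; the function names are hypothetical, and the mirror layout (a directory per Kafka version) is assumed from the URL used in this guide.

```shell
#!/bin/sh
# Build the archive name for a Kafka release:
# kafka_<scala version>-<kafka version>.tgz
kafka_archive_name() {
    scala_version=$1
    kafka_version=$2
    printf 'kafka_%s-%s.tgz\n' "$scala_version" "$kafka_version"
}

# Build a download URL from a mirror base URL (no trailing slash),
# a Scala version and a Kafka version.
kafka_download_url() {
    mirror=$1
    printf '%s/%s/%s\n' "$mirror" "$3" "$(kafka_archive_name "$2" "$3")"
}
```

For example, kafka_download_url http://apache.osuosl.org/kafka 2.12 2.1.0 prints the URL passed to wget above.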

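Edit the ZooKeeper systemd unit file so that the service runs the ZooKeeper scripts bundled with Kafka:

vi /lib/systemd/system/zookeeper.service

Make sure the file contains the following lines:

[Unit]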
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/zookeeper-server-start.sh /home/kafka/config/zookeeper.properties
ExecStop=/home/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Create a systemd unit file for Apache Kafka, so that you can run Kafka as a service on your server:

vi /etc/systemd/system/kafka.service

Add the following lines:

[Unit]
Requires=network.target remote-fs.target zookeeper.service
After=network.target remote-fs.target zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/kafka-server-start.sh /home/kafka/config/server.properties
ExecStop=/home/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Edit the server.properties file:

vi /home/kafka/config/server.properties

Add or modify the following properties:

listeners=PLAINTEXT://:9092
log.dirs=/var/log/kafka
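
If you would rather apply these settings from a script than from an editor, a helper along the following lines can be used. This is a sketch under stated assumptions: set_property is a hypothetical function, and it assumes simple key=value lines where a commented-out default looks like "#key=value".

```shell
#!/bin/sh
# Set key=value in a Java properties file: replace the first existing
# (possibly commented-out) line for the key, drop later duplicates,
# or append the setting when the key is absent.
set_property() {
    file=$1
    tmp=$(mktemp) || return 1
    # Key and value are passed through the environment so that awk does not
    # interpret backslashes in them.
    PROP_KEY=$2 PROP_VALUE=$3 awk '
        BEGIN { k = ENVIRON["PROP_KEY"]; v = ENVIRON["PROP_VALUE"] }
        {
            line = $0
            sub(/^#/, "", line)              # treat "#key=..." as the same key
            if (index(line, k "=") == 1) {
                if (!done) { print k "=" v; done = 1 }
                next
            }
            print
        }
        END { if (!done) print k "=" v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example, set_property /home/kafka/config/server.properties log.dirs /var/log/kafka updates the existing log.dirs line instead of adding a second one.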

After we make changes to a unit file, we should always run the systemctl daemon-reload command:

systemctl daemon-reload

Create the /var/log/kafka directory referenced by log.dirs and make the kafka user its owner:

mkdir -p /var/log/kafka
chown kafka:kafka -R /var/log/kafka

Despite its name, log.dirs is where Kafka stores its topic data (the commit log), not application log files. Then, start the ZooKeeper and Apache Kafka services:

systemctl start zookeeper.service
systemctl start kafka.service

Enable the Apache Kafka service to automatically start on server boot:

systemctl enable kafka.service

In order to check if ZooKeeper and Kafka services are up and running, run the following command on your VPS:

systemctl status zookeeper.service

We should then receive an output similar to this:

zookeeper.service
Loaded: loaded (/lib/systemd/system/zookeeper.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2018-12-19 06:23:33 EST; 25min ago
Main PID: 20157 (java)
Tasks: 21 (limit: 4915)
CGroup: /system.slice/zookeeper.service
└─20157 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true -Xloggc:/home/kafka/bin/../l

Run this command next:

systemctl status kafka.service

The output of this command should be similar to this one:

kafka.service
Loaded: loaded (/etc/systemd/system/kafka.service; disabled; vendor preset: enabled)
Active: active (running) since Wed 2018-12-19 06:46:49 EST; 27s ago
Process: 22520 ExecStop=/home/kafka/bin/kafka-server-stop.sh (code=exited, status=0/SUCCESS)
Main PID: 22540 (java)
Tasks: 62 (limit: 4915)
CGroup: /system.slice/kafka.service
└─22540 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true -Xloggc:/home/kafka/bin/../logs/

We can also use the netstat command to check if the Kafka and ZooKeeper services are listening on ports 9092 and 2181, respectively:

netstat -tunlp | grep -e \:9092 -e \:2181

The output should be similar to this:

tcp6       0      0 :::9092                 :::*                    LISTEN      22540/java
tcp6       0      0 :::2181                 :::*                    LISTEN      20157/java

If both services are running and both ports are listening, Apache Kafka has been installed successfully.
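
As an optional smoke test, you can create a topic, publish a message and read it back with the command-line tools shipped in Kafka's bin directory. The sketch below is not part of the original guide. It assumes Kafka 2.1.0 installed in /home/kafka as above and a single broker on localhost. The function name kafka_smoke_test and the default topic name are hypothetical.

```shell
#!/bin/sh
# Smoke test: create a topic, publish one message, and read it back.
# KAFKA_HOME defaults to /home/kafka, the install directory used in this guide.
kafka_smoke_test() {
    topic=${1:-smoke-test}
    bin=${KAFKA_HOME:-/home/kafka}/bin

    # Kafka 2.1.0 creates topics through ZooKeeper.
    "$bin/kafka-topics.sh" --create --zookeeper localhost:2181 \
        --replication-factor 1 --partitions 1 --topic "$topic" || return 1

    # Publish a single message read from standard input.
    echo "hello-kafka" | "$bin/kafka-console-producer.sh" \
        --broker-list localhost:9092 --topic "$topic" || return 1

    # Read one message from the start of the topic, giving up after 10 seconds.
    "$bin/kafka-console-consumer.sh" --bootstrap-server localhost:9092 \
        --topic "$topic" --from-beginning --max-messages 1 --timeout-ms 10000
}
```

If the broker is working, the consumer step should print the message that was just published. Note that creating the same topic a second time fails, so pass a fresh topic name when repeating the test.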


Of course, you don’t have to install and configure Apache Kafka on Debian 9 yourself if you use one of our Managed Debian Support solutions, in which case you can simply ask our expert Linux admins to set up and configure Apache Kafka on Debian 9 for you. They are available 24×7 and will take care of your request immediately.

How to Install Apache Kafka on CentOS 7
Published Wed, 29 Apr 2020 on LinuxHostSupport (https://linuxhostsupport.com/blog/how-to-install-apache-kafka-on-centos-7/)

In this tutorial, we will show you how to install Apache Kafka on CentOS 7.

Apache Kafka is an open-source messaging system and distributed streaming platform. It’s designed to be scalable and responsive when dealing with real-time data feeds. It’s well suited to real-time analytics and data processing, and thanks to its rich API support, developers can integrate Apache Kafka and adapt it to their exact needs.

Let’s begin with the installation.

Prerequisites:

Apache Kafka has the following requirements:

  • Java 8 or higher installed on the server
  • ZooKeeper installed and running on the server
  • A server/VPS with a minimum of 4GB RAM.

Step 1: Connect to the Server

Log in to the server via SSH as user root using the following command:

ssh root@IP_ADDRESS -p PORT_NUMBER

Replace “IP_ADDRESS” and “PORT_NUMBER” with your actual server IP address and SSH port number.

Step 2: Update OS Packages

Once logged in, make sure that your server OS packages are up-to-date by running the following commands:

yum clean all
yum update

Step 3: Install Java

Apache Kafka requires Java, so in order to run it on your server, we need to install Java first. We can check if Java is already installed on the server using this command:

which java

If there is no output, it means that Java is not installed on the server yet. We can install Java from an RPM package:

yum install java-1.8.0-openjdk.x86_64

We can check the Java version installed on the server by running the following command:

java -version

The output should be similar to this:

openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

Add the JAVA_HOME and JRE_HOME environment variables to the end of the /etc/bashrc file:

sudo vi /etc/bashrc

Append the following lines to the original content of the file:

export JRE_HOME=/usr/lib/jvm/jre
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
export PATH=$PATH:$JRE_HOME/bin:$JAVA_HOME/bin

Open the ~/.bashrc file and make sure that the following lines exist:

if [ -f /etc/bashrc ] ; then
  . /etc/bashrc
fi

Run the following command to activate the path settings immediately:

source /etc/bashrc
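
If the JVM is installed somewhere other than the paths used above, a candidate for JAVA_HOME can be derived from the java binary itself. This is a minimal sketch; the function names are hypothetical, and the path in the example below is only an illustration of what the resolved binary path might look like.

```shell
#!/bin/sh
# Derive a JAVA_HOME candidate from the resolved path of the java binary
# by stripping the trailing "/bin/java".
java_home_from_binary() {
    printf '%s\n' "${1%/bin/java}"
}

# Resolve the symlink chain behind the java command found on PATH.
resolved_java_binary() {
    readlink -f "$(command -v java)"
}
```

For example, java_home_from_binary /usr/lib/jvm/java-1.8.0-openjdk/jre/bin/java prints /usr/lib/jvm/java-1.8.0-openjdk/jre, and java_home_from_binary "$(resolved_java_binary)" does the same for the Java actually installed on the server.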

Step 4: Install Apache Kafka

Create a new system user dedicated for the Kafka service using the following command:

useradd kafka -m

Set a password for the newly created user:

passwd kafka

Use a strong password and enter it twice. Then, add the user to the wheel group so that it can use sudo:

sudo usermod -aG wheel kafka

Log in as the newly created user with:

su kafka

Download Apache Kafka and extract it in the home directory of the kafka user account. The commands below use version 2.1.0; check https://kafka.apache.org/downloads for the latest release and adjust the version numbers accordingly:

cd ~
wget http://apache.osuosl.org/kafka/2.1.0/kafka_2.12-2.1.0.tgz
tar -xvzf kafka_2.12-2.1.0.tgz
mv kafka_2.12-2.1.0/* .
rmdir /home/kafka/kafka_2.12-2.1.0
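
It is good practice to verify a downloaded archive against the SHA-512 digest published alongside the release, ideally before extracting it. The sketch below assumes that sha512sum from GNU coreutils is available. The function name verify_sha512 is hypothetical, and the expected digest has to be copied by hand from the release's .sha512 file.

```shell
#!/bin/sh
# Compare a file's SHA-512 digest with an expected hex digest.
# Whitespace and letter case in the expected value are ignored, because
# published digests may be wrapped across lines or written in upper case.
verify_sha512() {
    file=$1
    expected=$(printf '%s' "$2" | tr -d ' \t\n' | tr 'A-F' 'a-f')
    actual=$(sha512sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}
```

For example, verify_sha512 kafka_2.12-2.1.0.tgz "<published digest>" && echo "checksum OK" prints the message only when the digests match.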

Apache Kafka uses ZooKeeper to store persistent cluster metadata. The ZooKeeper service is responsible for configuration management, leader election, synchronization, etc. We don’t need to install it separately, because the ZooKeeper files are included with Apache Kafka. ZooKeeper listens on port 2181 by default and doesn’t require much maintenance.

Create a ZooKeeper systemd unit file so that we can run ZooKeeper as a service:

sudo vi /lib/systemd/system/zookeeper.service

Add the following lines:

[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/zookeeper-server-start.sh /home/kafka/config/zookeeper.properties
ExecStop=/home/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Create a systemd unit file for Apache Kafka:

sudo vi /etc/systemd/system/kafka.service

Add the following lines:

[Unit]
Requires=network.target remote-fs.target zookeeper.service
After=network.target remote-fs.target zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/kafka-server-start.sh /home/kafka/config/server.properties
ExecStop=/home/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
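
If Kafka is installed under a different user or directory, the unit file above can be generated instead of edited by hand. The following sketch prints the same unit with the user and install directory filled in. The function name render_kafka_unit is hypothetical.

```shell
#!/bin/sh
# Print a kafka.service unit for a given service user and install directory.
render_kafka_unit() {
    user=$1
    home=$2
    cat <<EOF
[Unit]
Requires=network.target remote-fs.target zookeeper.service
After=network.target remote-fs.target zookeeper.service

[Service]
Type=simple
User=$user
ExecStart=$home/bin/kafka-server-start.sh $home/config/server.properties
ExecStop=$home/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
EOF
}
```

For example, render_kafka_unit kafka /home/kafka | sudo tee /etc/systemd/system/kafka.service writes the unit used in this tutorial.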

Edit the server.properties file:

vi /home/kafka/config/server.properties

Add or modify the following settings:

listeners=PLAINTEXT://:9092
log.dirs=/var/log/kafka-logs

After we make changes to a unit file, we should run the systemctl daemon-reload command for the changes to take effect:

sudo systemctl daemon-reload

Create the /var/log/kafka-logs directory referenced by log.dirs and make the kafka user its owner:

sudo mkdir -p /var/log/kafka-logs
sudo chown kafka:kafka -R /var/log/kafka-logs

Despite its name, log.dirs is where Kafka stores its topic data (the commit log), not application log files. Once that’s done, start the ZooKeeper and Apache Kafka services:

sudo systemctl start zookeeper.service
sudo systemctl start kafka.service

Enable the ZooKeeper and Apache Kafka services to automatically start on server boot:

sudo systemctl enable zookeeper.service
sudo systemctl enable kafka.service

In order to check if ZooKeeper and Kafka services are up and running, run the following commands on the VPS:

systemctl status zookeeper.service

We should receive an output similar to this:

zookeeper.service
   Loaded: loaded (/usr/lib/systemd/system/zookeeper.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2019-01-25 12:42:42 CST; 16s ago
 Main PID: 11682 (java)
   CGroup: /system.slice/zookeeper.service
           └─11682 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.h...

Run this command next:

systemctl status kafka.service

The output of this command should be similar to this one:

kafka.service
   Loaded: loaded (/etc/systemd/system/kafka.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2019-01-25 12:42:50 CST; 42s ago
 Main PID: 11991 (java)
   CGroup: /system.slice/kafka.service
           └─11991 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headl...
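
Instead of reading the status output by eye, both services can be checked in one go with systemctl is-active, which reports the state through its exit status. The function name check_services in this sketch is hypothetical.

```shell
#!/bin/sh
# Report whether each named systemd unit is active; fail if any of them is not.
check_services() {
    failed=0
    for unit in "$@"; do
        if systemctl is-active --quiet "$unit"; then
            echo "$unit: active"
        else
            echo "$unit: NOT active"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, check_services zookeeper.service kafka.service prints one line per service and returns a non-zero exit status if either of them is not running.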

We can also use the netstat command to check if the Kafka and ZooKeeper services are listening on ports 9092 and 2181, respectively:

sudo netstat -tunlp | grep -e \:9092 -e \:2181

The output should be similar to this:

tcp6       0      0 :::9092                 :::*                    LISTEN      11991/java
tcp6       0      0 :::2181                 :::*                    LISTEN      11682/java
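
The same check can be done in a script by extracting the listening ports from the netstat output. The sketch below assumes the column layout shown above, with the local address in the fourth column and the state in the sixth. The function names are hypothetical.

```shell
#!/bin/sh
# Read `netstat -tunlp` output on standard input and print the port of every
# socket in the LISTEN state, one per line.
listening_ports() {
    awk '$6 == "LISTEN" { n = split($4, parts, ":"); print parts[n] }'
}

# Succeed when the given port appears in netstat output read from standard input.
port_is_listening() {
    listening_ports | grep -qx "$1"
}
```

For example, sudo netstat -tunlp | port_is_listening 9092 && echo "Kafka is listening" prints the message only when something is listening on port 9092.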

That is it. We successfully installed Apache Kafka.


Of course, you don’t have to install and configure Apache Kafka on CentOS 7 yourself if you use one of our Fully Managed CentOS Support solutions, in which case you can simply ask our expert Linux admins to set up and configure Apache Kafka on CentOS 7 for you. They are available 24×7 and will take care of your request immediately.
