A Beginner's Guide to Installing Apache Kafka on Ubuntu
Last updated: June 20th 2024
Apache Kafka is popular for building real-time data pipelines and streaming applications. Whether you're a beginner or an experienced user, our guide will make installing Kafka on your Ubuntu machine a breeze. Let’s get started without any further ado.
What is Apache Kafka?
Apache Kafka is an open-source platform for processing real-time streaming data from various sources. It uses a publish-subscribe model to handle high-volume data streams efficiently.
With features like partitioning, replication, and fault tolerance, Kafka excels at high-throughput message processing in finance, manufacturing, and healthcare industries. Everyday use cases include tracking website activity, reporting metrics, log aggregation, and stream processing.
Kafka reliably stores event streams, processes them in real-time, and offers scalability and security. Deployable across environments, it utilizes a distributed messaging system over TCP for data transmission. Let’s quickly look at how to install Apache Kafka on Ubuntu.
Prerequisites
- An Ubuntu server with a minimum of 5GB of RAM. If your system has less than 4GB of RAM, it could cause the Kafka service to fail. So, I recommend setting up a server with at least that amount of RAM, with some extra for breathing space and buffer. (Webdock’s SSD Bit is very affordable!)
Installing OpenJDK
OpenJDK is an open-source implementation of the Java Platform, Standard Edition. It provides tools and libraries to develop Java applications for free. To check if Java is pre-installed:
$ java -version -y
In most cases, it won’t be preinstalled. So we’ll have to install it and get started. Start by updating the package index:
$ sudo apt update -y
Then, proceed to install it:
$ sudo apt install openjdk-21-jre -y
Then recheck the version to verify the installation.
To compile and run certain Java programs, you might require both the Java Development Kit (JDK) and the Java Runtime Environment (JRE). To get the JDK, simply run the command provided; this will also set up the JRE for you:
$ sudo apt-get install openjdk-21-jdk -y
To confirm the installation of JDK, you need to check the version of javac, which is the Java compiler:
$ javac -version
Installing Kafka
To start using Kafka for network requests, the initial task involves setting up a specific user for the service. This helps to reduce potential harm to your Ubuntu system if the Kafka server gets compromised. You must log in to your server as a sudo user and create a user. I will name the user 'Franz.' ;)
$ sudo adduser franz
Answer the questions it asks and set a password. Remember the password, as we’ll need it in the upcoming steps.
To grant the franz user the necessary privileges for installing Kafka's dependencies, you should add the user to the sudo group using the adduser command:
$ sudo adduser franz sudo
And then login:
$ su -l franz
Once you have set up a user for Kafka, the next step is to download and unpack the Kafka binaries. First, make a folder named Core Files in /home/kafka to keep your downloaded files organized:
$ mkdir ~/core-files
Then, get a copy of the Kafka binaries using curl:
$ curl "https://downloads.apache.org/kafka/3.7.0/kafka-3.7.0-src.tgz" -o ~/core-files/kafka.tgz
This command downloads the Apache Kafka source code version 3.7.0 as a compressed file named kafka.tgz to the Downloads file. At the time of writing this guide, the latest stable version is 3.7.0. However, you may have to adjust the versions in the future. You can keep track of Apache Kafka’s repository.
First, create a directory named kafka and then navigate into this folder. This will serve as the main directory for your Kafka setup:
$ mkdir ~/kafka && cd ~/kafka
And then unzip the downloaded file. Since it is compressed using the TAR methodology, I’ll use the same to unzip it:
$ tar -xvzf ~/core-files/kafka.tgz --strip 1
Notice the --strip 1 flag? That’s to ensure the archive’s contents are extracted directly into your ~/kafka/ directory, not a subdirectory.
After downloading and extracting the binaries, you can start configuring your Kafka server.
A Kafka topic is the name given to the category or similar feats where messages can be published. By default, Kafka does not permit the deletion of topics. To change this, you need to adjust the configuration file:
$ sudo nano ~/kafka/config/server.properties
Start by adding a configuration that lets you remove Kafka topics. Insert this line at the end of the file:
delete.topic.enable = true
And then, update the log.dirs property. Locate the log.dirs property, approximately on the 46th line, and swap out the current path with the new highlighted path:
log.dirs=/home/franz/kafka/kafka-logs
Save and close the file by pressing CTRL + X and then Y.
Systemd unit files on Ubuntu let you set up how systemd handles services, sockets, devices, and other parts of the system. These files outline the actions and dependencies required to start up the system. We’ll need a similar systemd file for Kafka, which will assist you in executing typical service actions like initiating, halting, and rebooting Kafka following the standard procedures for Linux services.
Kafka relies on Zookeeper to handle the cluster's state and settings. So, I’ll start by creating a unit file:
$ sudo nano /etc/systemd/system/zookeeper.service
With the said content:
[Unit] Requires=network.target remote-fs.target After=network.target remote-fs.target [Service] Type=simple User=franz ExecStart=/home/franz/kafka/bin/zookeeper-server-start.sh /home/franz/kafka/config/zookeeper.properties ExecStop=/home/franz/kafka/bin/zookeeper-server-stop.sh Restart=on-abnormal [Install] WantedBy=multi-user.target
Save and exit the file once added.
The [Unit] section indicates that Zookeeper needs both the network and filesystem to be up and running before it can begin operating.
In the [Service] section, it clarifies that systemd will use the zookeeper-server-start.sh and zookeeper-server-stop.sh scripts to manage to start and stop the service. It also notes that Zookeeper should be automatically restarted if it crashes unexpectedly.
Let’s set up the systemd service file for Kafka:
$ sudo nano /etc/systemd/system/kafka.service
By adding the following content:
[Unit] Requires=zookeeper.service After=zookeeper.service [Service] Type=simple User=franz ExecStart=/bin/sh -c '/home/franz/kafka/bin/kafka-server-start.sh /home/franz/kafka/config/server.properties > /home/franz/kafka/kafka.log 2>&1' ExecStop=/home/franz/kafka/bin/kafka-server-stop.sh Restart=on-abnormal [Install] WantedBy=multi-user.target
Save and exit the file.
In the [Unit] section, you indicate that this unit file relies on zookeeper.service. This dependency ensures that zookeeper will start up automatically when the kafka service begins.
In the [Service] section, you tell systemd to use the kafka-server-start.sh and kafka-server-stop.sh scripts to handle starting and stopping the service. Additionally, you specify that Kafka should restart if it terminates unexpectedly.
Before we move forward, we would need to build a Java Archive (JAR) file with Gradle, or you may face exit-code=1 error:
$ ./gradlew jar -PscalaVersion=2.13.12
This command runs the Gradle wrapper script to build a JAR file using Scala version 2.13.12. It compiles the project's source code and packages it into a JAR file. However, depending on your hardware, this process might take a significant time. It took me about 15 minutes to compile, so it might be a good time to get a quick stretch.
All set up and done? Let’s start Kafka:
$ sudo systemctl start kafka
And check status to confirm it works well:
$ sudo systemctl status kafka
The status indicated that the Kafka server is now active on port 9092, the standard Kafka port. Although I have started the Kafka service, it won't restart on its own if the server is rebooted. To set Kafka to start automatically after a reboot, use this command:
$ sudo systemctl enable zookeeper
Also run:
$ sudo systemctl enable kafka
Now, let’s quickly check to see if everything works. Let’s start by creating a topic. I’ll call this TheMetamorphosis.
~/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic TheMetamorphosis
This command would ensure that each message is copied once for fault tolerance and set up one partition to organize the data.
Now, let’s echo a random quote from the acclaimed book:
$ echo "I only fear danger where I want to fear it." | ~/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic TheMetamorphosis > /dev/null
To create a Kafka consumer, use the kafka-console-consumer.sh script. You need to provide the ZooKeeper server’s hostname, port, and topic name. The command below shows how to consume messages from the topic:
~/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic TheMetamorphosis --from-beginning
I have included -- from-beginning flag so I can read messages published before the consumer starts.
The output would be the quote I echoed after running the command:
Let’s test it from the consumer’s side. Log in to a separate session/terminal as the Franz user and echo another message. I’ll include another random quote from the book:
$ echo "He was a tool of the boss, without brains or backbone." | ~/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic TheMetamorphosis > /dev/null
In the initial terminal, you’ll see the message being relayed:
After you finish testing, hit CTRL+C to halt the consumer script in your main terminal.
The following steps can be considered optional. However, it might be a good idea to do them to enhance your server’s security. Since the Franz user is somewhat “exposed,” it would be a great idea to lock its password login and ensure the user can only log in using the sudo su command.
This way, even if the user's password gets leaked, it would be very difficult for anyone to access the Kafka configuration and make a mess. Just run:
$ sudo passwd franz -l
After confirming, to log in to the franz account, you would need to run the following as an alternate sudo/root user:
$ sudo su - franz
You can revert this change, of course, by replacing the -l flag with the -u flag. It would also be a good idea to remove sudo privileges from Franz:
$ sudo deluser franz sudo
Conclusion
In conclusion, installing Apache Kafka on Ubuntu is a straightforward process that enables you to leverage its powerful real-time data streaming capabilities. By following the steps outlined in this article, you can set up Kafka, configure brokers and topics, and start publishing and consuming messages.
With Kafka up and running, you can unlock new possibilities for real-time application data processing and analysis.
Meet Aayush, a WordPress website designer with almost a decade of experience who crafts visually appealing websites and has a knack for writing engaging technology blogs. In his spare time, he enjoys illuminating the minds around him.
Related articles
-
How to install OpenLiteSpeed on Webdock
In this article we show you how you can install the OpenLiteSpeed web server on Ubuntu Jammy.
Last updated: March 29th 2023
-
How to Deploy your First Node.js Application on your Ubuntu Web Server
Deploy a Node.js application on your Ubuntu server and how to issue SSL certificates for your Node application.
Last updated: July 19th 2023
-
How to set up WireGuard on your Webdock Server
This article details how you can quickly and easily install WireGuard on your Webdock Server.
Last updated: July 21st 2024
-
How to set up OpenVPN on your Webdock Server
This article details how you can quickly and easily set up a VPN on your Webdock server.
Last updated: November 10th 2022
-
How to Install and configure aaPanel on Ubuntu
Install aaPanel on Ubuntu as well as building a LEMP stack, pointing your domain to your server and setting SSL certificates with Let's Encrypt.
Last updated: November 10th 2022
-
How to install azuracast on Webdock
This guide shows you how to work around certain issues when installing azuracast on Webdock.
Last updated: November 10th 2022
-
How to set up Runcloud on Webdock
This article details the steps you need to go through in order to install Runcloud on a Webdock server.
Last updated: July 29th 2024
-
How to set up cPanel on Webdock
This article details the steps you need to go through in order to install cPanel on a Webdock server.
Last updated: September 23rd 2024
-
How to set up Gridpane on Webdock
This article details the steps you need to go through in order to install Gridpane on a Webdock server.
Last updated: November 10th 2022
-
How to set up Ploi on Webdock
This article details the steps you need to go through in order to install Ploi on a Webdock server.
Last updated: November 10th 2022
-
How to set up Laravel Forge on Webdock
This article details the steps you need to go through in order to install Laravel Forge on a Webdock server.
Last updated: November 10th 2022
-
How to set up Plesk on Webdock
This article details the steps you need to go through in order to install Plesk on a Webdock server.
Last updated: November 10th 2022
-
How to set up Cyberpanel on Webdock
This article details the steps you need to go through in order to install Cyberpanel on a Webdock server.
Last updated: December 9th 2023
-
How to set up SpinupWP on Webdock
This article details the steps you need to go through in order to installSpinupWP on a Webdock server.
Last updated: March 15th 2023
-
How to set up DirectAdmin on Webdock
This article details the steps you need to go through in order to install DirectAdmin on a Webdock server.
Last updated: September 1st 2023
-
How to set up Hestia on Webdock
This article details the steps you need to go through in order to install Hestia on a Webdock server.
Last updated: November 10th 2022
-
How to set up Virtualmin on Webdock
This article details the steps you need to go through in order to install Virtualmin on a Webdock server.
Last updated: November 10th 2022
-
How to install and create pipelines in Jenkins
This guide describes the step-by-step procedure of installing Jenkins and creating pipelines in Jenkins.
Last updated: November 10th 2022
-
Basic WordPress site setup with aaPanel
In this guide, we will install and setup a basic WordPress site with aaPanel.
Last updated: January 23rd 2023
-
How to use Nginx as reverse proxy and secure connections with SSL certificates
Using Nginx to proxy pass your site with SSL security.
Last updated: July 19th 2023
-
Setting up monitoring with Netdata on your Webdock server
Setting up monitoring on your server to receive alerts and to know real-time resource consumption on your server.
Last updated: July 19th 2023
-
How to Setup Python Web Application With Flask Gunicorn and Nginx
A simple Python Flask web app hosting with Gunicorn and Nginx
Last updated: July 19th 2023
-
How to Daemonize an Application with Systemd
Using systemd to autostart your application on system startup.
Last updated: July 19th 2023
-
Set-up New Relic Monitoring on Your Webdock Server
This guide provides step-by-step instructions to install New Relic to monitor your VPS.
Last updated: July 29th 2024
-
Getting Started with Ruby on Rails on Webdock
In this guide, we will show you how to get started with Ruby on Rails on your Webdock server
Last updated: July 19th 2023
-
How to Install VaultWarden on Your Webdock Server
This guide provided step-by-step instructions to Vaultwarden, an open-source Password Manager on your Webdock Server.
Last updated: July 19th 2023
-
How to Install the Latest Version of HTOP on Ubuntu Server
Instructions to install latest htop package on your Ubuntu server
Last updated: July 29th 2024
-
How to Install ImageMagick 7 on Ubuntu LEMP/LAMP stacks
Simple instructions to install ImageMagick 7 along with the PHP extension
Last updated: November 1st 2023
-
A Quick Guide to Installing Rust on Ubuntu
Instructions to install Rust
Last updated: December 18th 2023
-
How To Install Proxmox on Your Webdock Server
This article provides instructions on how to install Proxmox on your Webdock server
Last updated: July 29th 2024
-
How To Run Nextcloud on Your Webdock Ubuntu Server
Instructions to Install Nextcloud on your server with Docker
Last updated: February 13th 2024
-
The Ultimate Guide to Setting Up Mastodon server
A detailed guide with instructions to set up Mastodon on your Webdock server
Last updated: February 20th 2024
-
A Guide To Setting Up Mindustry Game Server on Ubuntu
Step-by-step instructions to set up your own Mindustry server
Last updated: November 4th 2024
-
A Guide to Installing ERPNext on Your Webdock Server
Step-by-step instructions to install ERPNext - a resource planning software - on your server
Last updated: May 28th 2024
-
A Quick Guide To Installing Rocket Chat on Ubuntu
Instructions to install Rocket Chat on your Ubuntu server
Last updated: July 2nd 2024
-
How to Install Ghost CMS on a Webdock Server
Easily install Ghost CMS on your Webdock server with these instructions!
Last updated: September 2nd 2024
-
How to Install cPanel on a Webdock Ubuntu Server
Instructions for setting up cPanel on an Ubuntu server
Last updated: September 23rd 2024