Kafka on Docker for Mac
On Linux, it’s quite easy to get a single-node Kafka cluster up & running with Docker. Confluent gives a great primer in their documentation: https://docs.confluent.io/current/installation/docker/docs/quickstart.html#getting-started-with-docker-compose.
On macOS, things get a little more complicated because containers are not natively supported by the OS. To use Docker on macOS, one must use docker-machine (the older method) or the newer Docker for Mac. This post presents a method that works with both. The result will be a single-node cluster. You will be able to:
- produce/consume from another container
- produce/consume from the host
Note: with a few changes to the Kafka Docker image and the docker-compose.yml file, it's possible to run a multi-node cluster.
The problem
The problem on Mac is that Docker runs inside a VM. This is true for Docker Machine, and it’s also true for Docker for Mac. The implementation is different but the result is essentially the same: we need an address:port that is resolvable both from the host and from other containers.
On Linux, there is no problem: you slap --network host onto the containers when you run them, and everybody can use localhost:port to communicate.
On Mac, --network host is useless.
With Docker Machine, you get an IP, which is the IP of the virtual machine where Docker runs. You can use this IP from your host, but it will not work from another container.
With Docker for Mac, the VM implementation is different: from the host’s perspective, published ports are mapped on localhost. But from within a container, that’s not true: localhost resolves to the container itself, not to the host.
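You can see the difference with a quick experiment (the name web, port 8080, and the nginx and busybox images are just arbitrary choices for this check):
# Start a throwaway web server and publish its port on the host
docker run -d --name web -p 8080:80 nginx
# From the host, Docker for Mac maps the published port on localhost
curl -s -o /dev/null localhost:8080 && echo "reachable from the host"
# From another container, localhost is the container itself, so the same request fails
docker run --rm busybox wget -q -O /dev/null http://localhost:8080 || echo "not reachable from a container"
# Clean up
docker rm -f web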
The solution I found uses the famous pause container.
The solution
To resolve the issue, I did what Kubernetes does to allow multiple containers in one pod to talk to each other using localhost. We will use the pause container to do port forwarding from the host to the containers. The pause container will expose ports on the host and map them on its localhost interface. The other containers will then use the pause container’s network interface as their own.
Kafka works with Zookeeper, and they both need at least one port exposed: 9092 and 2181 respectively. So we can start a pause container to expose the two ports:
docker run -d --name pause \
-p 9092:9092 \
-p 2181:2181 \
gcr.io/google_containers/pause-amd64:3.0
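You can check that Docker published the two ports on the host:
docker port pause
# 2181/tcp -> 0.0.0.0:2181
# 9092/tcp -> 0.0.0.0:9092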
Now we can start Zookeeper:
docker run -d \
--net=container:pause \
--ipc=container:pause \
--pid=container:pause \
-e "ZOOKEEPER_CLIENT_PORT=2181" \
confluentinc/cp-zookeeper:4.0.0-3
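To verify that Zookeeper is up, you can send it the ruok four-letter command from the host; it should answer imok:
echo ruok | nc localhost 2181
# imok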
And finally, you can start Kafka:
docker run -d \
--net=container:pause \
--ipc=container:pause \
--pid=container:pause \
-e "KAFKA_ZOOKEEPER_CONNECT=localhost:2181" \
-e "KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092" \
-e "KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1" \
confluentinc/cp-kafka:4.0.0-3
At this point, you should have a running cluster with one Kafka node:
CONTAINER ID IMAGE PORTS
14a04b6682ca confluentinc/cp-kafka:4.0.0-3
30dfc793b973 confluentinc/cp-zookeeper:4.0.0-3
540fe4395d1f gcr.io/google_containers/pause-amd64:3.0 0.0.0.0:2181->2181/tcp, 0.0.0.0:9092->9092/tcp
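Notice that only the pause container publishes ports; the other two share its network namespace. You can confirm this with docker inspect (using the container IDs above), and check that the broker registered itself in Zookeeper with the zookeeper-shell tool that ships in the cp-kafka image (the auto-generated broker id may differ):
docker inspect -f '{{.HostConfig.NetworkMode}}' 14a04b6682ca
# container:<full ID of the pause container>
docker run --rm --net=container:pause confluentinc/cp-kafka:4.0.0-3 \
  zookeeper-shell localhost:2181 ls /brokers/ids
# should end with something like [1001]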
Test the cluster
To make sure everything is in order, here are a few commands to test whether the cluster can be used.
Set up a temporary directory with the executables needed to produce/consume from your host:
mkdir /tmp/kafka-tests
cd /tmp/kafka-tests
wget http://apache.parentingamerica.com/kafka/0.11.0.2/kafka_2.12-0.11.0.2.tgz
tar -xvf kafka_2.12-0.11.0.2.tgz
cd kafka_2.12-0.11.0.2/bin/
When you have the executables, you can do the following.
Create the topic:
/tmp/kafka-tests/kafka_2.12-0.11.0.2/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic test --partitions 1 --replication-factor 1
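You can verify that the topic was created as expected with --describe:
/tmp/kafka-tests/kafka_2.12-0.11.0.2/bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic test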
Produce messages:
seq 1 45 | /tmp/kafka-tests/kafka_2.12-0.11.0.2/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
Consume all messages of a topic:
/tmp/kafka-tests/kafka_2.12-0.11.0.2/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
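The consumer should print the 45 numbers produced above. It runs until you stop it with Ctrl+C; if you prefer it to exit on its own, you can add --max-messages:
/tmp/kafka-tests/kafka_2.12-0.11.0.2/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning --max-messages 45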
Consume from another container:
docker run -it \
--net=container:pause \
--ipc=container:pause \
--pid=container:pause \
confluentinc/cp-kafka:4.0.0-3 \
/usr/bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic test --from-beginning
Produce from another container:
docker run -it \
--net=container:pause \
--ipc=container:pause \
--pid=container:pause \
confluentinc/cp-kafka:4.0.0-3 \
/usr/bin/kafka-console-producer --broker-list localhost:9092 --topic test
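When you are done, you can tear everything down. The Kafka and Zookeeper containers were started without names, so one way is to remove them by image before removing the pause container:
# Remove the containers started from the Confluent images
docker ps -a -q --filter ancestor=confluentinc/cp-kafka:4.0.0-3 | xargs docker rm -f
docker ps -a -q --filter ancestor=confluentinc/cp-zookeeper:4.0.0-3 | xargs docker rm -f
# Remove the pause container (this releases ports 2181 and 9092)
docker rm -f pause
# Remove the temporary test directory
rm -rf /tmp/kafka-tests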