Let's Build a 28-core Raspberry Pi Cluster!

This is the second instalment of my 'How I built' trilogy detailing the build process of all 3 of my clusters. Read the first here.

Where you are in the cluster timeline

Octopains

In October 2018, more than a year after building my first cluster, Octopi, I had officially outgrown it. I first noticed the performance bottlenecks when I tried to run WordPress on it: a single page load of a freshly installed WordPress blog took about 10 seconds! That was unsurprising, given the measly single-core 700MHz processor on each node.

By then, I had graduated from university and secured a job as a Data Scientist + Full-Stack Engineer at a startup, and I thought to myself:

What better a time to build myself a new cluster?

Kraken

The second cluster was named Kraken.

The kraken (/ˈkrɑːkən/) is a legendary cephalopod-like sea monster of gigantic size in Scandinavian folklore. (Wikipedia)

The Kraken is basically a much bigger and more monstrous octopus, hence the name, symbolizing an evolution of the first cluster. It was built out of 7 Raspberry Pi 3Bs powered by a single USB charger.

With my first paycheck, I had intended to finally build a cluster of 8 nodes, but once again I was unable to realize this cursed idea. Consumer-grade network switches topped out at 8 ports, enough for only 7 nodes plus a cable to the router.

Item          TP-Link 8-port Switch   TP-Link 16-Port Smart Switch
Cost (SGD)*   S$26.46                 S$195.03
Cost (USD)    $18.99                  $139.97
* SGD/USD exchange rate is 0.71770 as of the time of writing

An alternative was to use a commercial-grade network switch with 16 ports, but that was obviously out of the question (and budget).

Cluster Specifications

          7 × Raspberry Pi 3 Model B (per node)   Overall
CPU       4 × Cortex-A53 1.2 GHz                  28 × Cortex-A53 1.2 GHz
RAM       1GB                                     7GB
USB       4 × USB 2.0                             28 × USB 2.0
Storage   16GB MicroSD Class 10                   128GB¹ Class 10

¹ Comprised of 1 × 32GB, 6 × 16GB

The Kraken cluster is born! (No that's not my ceiling)

The total damage my wallet sustained from Kraken was S$626.63, or $449.73 USD, which was virtually all I had left for the month after subtracting expenses and education loan repayments.

Parts list

Item                                              Qty   Cost (SGD)   Cost (USD)*
Raspberry Pi 3 Model B                            7     S$420.00     $301.43
Anker PowerPort 10 60W 10-port USB Charger        1     S$32.58      $23.38
TP-Link TL-SF1008D 8-Port Ethernet Switch         1     S$20.00      $14.35
Cat 5e 30cm cable                                 7     S$3.50       $2.51
Cat 5e 1m cable                                   1     S$2.00       $1.44
MicroUSB charging cable                           7     S$21.00      $15.07
16GB Class 10 MicroSD Card                        7     S$70.00      $50.24
32GB Class 10 MicroSD Card                        1     S$16.00      $11.48
GeauxRobot Raspberry Pi 7-Layer Dog Bone Stack¹   1     S$41.55      $29.82
Total                                                   S$626.63     $449.73
* SGD/USD exchange rate is 0.71770 as of the time of writing
¹ Optional but handsome
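
If you want to adapt the parts list to your own prices or currency, the arithmetic behind the totals can be sketched in a few lines of Python. This is only a rough sketch: the figures are copied from the list above, the names PARTS, SGD_TO_USD and to_usd are my own for illustration, and I'm assuming each USD figure is simply the SGD line total multiplied by the quoted rate and rounded to the nearest cent.

```python
from decimal import Decimal, ROUND_HALF_UP

# Exchange rate quoted in the post (USD per 1 SGD).
SGD_TO_USD = Decimal("0.71770")

# (item, quantity, line total in SGD), copied from the parts list.
PARTS = [
    ("Raspberry Pi 3 Model B", 7, Decimal("420.00")),
    ("Anker PowerPort 10 60W 10-port USB Charger", 1, Decimal("32.58")),
    ("TP-Link TL-SF1008D 8-Port Ethernet Switch", 1, Decimal("20.00")),
    ("Cat 5e 30cm cable", 7, Decimal("3.50")),
    ("Cat 5e 1m cable", 1, Decimal("2.00")),
    ("MicroUSB charging cable", 7, Decimal("21.00")),
    ("16GB Class 10 MicroSD Card", 7, Decimal("70.00")),
    ("32GB Class 10 MicroSD Card", 1, Decimal("16.00")),
    ("GeauxRobot 7-Layer Dog Bone Stack (optional)", 1, Decimal("41.55")),
]


def to_usd(sgd):
    """Convert an SGD amount to USD, rounded to the nearest cent."""
    return (sgd * SGD_TO_USD).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


total_sgd = sum((cost for _, _, cost in PARTS), Decimal("0"))
total_usd = to_usd(total_sgd)

for item, qty, cost in PARTS:
    print(f"{item:<46} x{qty}  S${cost:>7}  ${to_usd(cost):>7}")
print(f"Total: S${total_sgd} = ${total_usd} USD")
```

Using Decimal rather than floats keeps the cents exact, which matters when you are checking a receipt down to the last digit.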

Notably, I chose a 32GB MicroSD card as storage for the first node, as I intended it to be the master node of a Docker swarm setup and anticipated needing additional storage for building and deploying Docker images.

Kraken (top) and Octopi (bottom) co-existing, shot with a potato

Kraken gigabit upgrade

One year later, in September 2019, I found myself regularly hitting the 100Mbps bandwidth limit of the built-in Ethernet ports on the Raspberry Pi 3Bs, especially during large file transfers in or out of the machines, where transfer speeds hovered around a sad 8MB/s.

I looked around and found Jeff Geerling's blog post, where I learnt that I could use USB Gigabit Ethernet adapters to increase the bandwidth to slightly over 200Mbps. So I bought a bunch of cheap Chinese USB Gigabit Ethernet adapters and a gigabit switch, and got down to upgrading.

Cheap, sketchy, unbranded USB Gigabit Ethernet Adapter

Additional parts list

Item                                            Qty   Cost (SGD)   Cost (USD)*
TP-Link TL-SG108 8-Port Gigabit Switch          1     S$37.00      $26.56
USB 3.0 Gigabit RJ45 Ethernet Network Adapter   7     S$45.36      $32.55
Total                                                 S$82.36      $59.11
* SGD/USD exchange rate is 0.71770 as of the time of writing

After the upgrade, this is what Kraken looked like:

Gigabit Kraken, Octopi and my cable management nightmare

This upgrade brings the total cost of the cluster to S$708.99 or $508.84 USD.

Benchmarks

Running iperf before the upgrade, we see a maximum bandwidth of 93.1 Mbps.

~ ❯ iperf -c 192.168.3.11
------------------------------------------------------------
Client connecting to 192.168.3.11, TCP port 5001
TCP window size:  129 KByte (default)
------------------------------------------------------------
[  4] local 192.168.3.71 port 57041 connected with 192.168.3.11 port 5001
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   111 MBytes  93.1 Mbits/sec

Running iperf after installing the gigabit adapter, we see a maximum bandwidth of 224 Mbps!

~ ❯ iperf -c 192.168.3.11
------------------------------------------------------------
Client connecting to 192.168.3.11, TCP port 5001
TCP window size:  145 KByte (default)
------------------------------------------------------------
[  4] local 192.168.3.71 port 57298 connected with 192.168.3.11 port 5001
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   268 MBytes   224 Mbits/sec

With that simple mod, I've gained 131Mbps per node. However, these are best-case figures: iperf only receives data from the network and discards it, whereas typical usage involves writing the data received from the network to disk.
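
To relate the iperf figures to the file-transfer speeds mentioned earlier, here is a small Python sketch of the unit conversions. It assumes iperf's "MBytes" are 2**20 bytes while its "Mbits" are 10**6 bits, which is consistent with the first run above (111 MBytes over 10 seconds works out to 93.1 Mbits/sec). The function names are my own for illustration.

```python
MIB = 1024 * 1024  # assumption: iperf's "MBytes" are 2**20 bytes


def mbits_per_sec(transfer_mbytes, seconds):
    """Convert an iperf transfer total into megabits per second (10**6 bits)."""
    return transfer_mbytes * MIB * 8 / seconds / 1_000_000


def max_mbytes_per_sec(mbits):
    """Best-case payload rate in MB/s (10**6 bytes) for a given link rate."""
    return mbits / 8


before = mbits_per_sec(111, 10.0)  # run on the built-in 100Mbps port
after = mbits_per_sec(268, 10.0)   # run on the USB gigabit adapter
gain = 224 - 93.1                  # using the rates iperf itself reported

print(f"before: {before:.1f} Mbits/sec")
print(f"after:  {after:.1f} Mbits/sec")
print(f"gain:   {gain:.1f} Mbits/sec")
print(f"best-case transfer: {max_mbytes_per_sec(93.1):.1f} MB/s "
      f"-> {max_mbytes_per_sec(224):.1f} MB/s")
```

Recomputing the second run from its rounded 268 MBytes lands slightly above the 224 Mbits/sec that iperf printed, presumably because the transfer total is itself rounded. Either way, the best-case ceiling on the built-in port is under 12MB/s before any disk overhead, which puts the 8MB/s I was seeing into perspective.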

Some caveats

Sustained full utilization of this newly available bandwidth is highly unlikely in typical usage, even in web serving. It will mostly help transfer large assets like images faster on the first load, after which the images will be cached by users' browsers.

Additionally, the actual bandwidth is still constrained by the notorious shared USB 2.0 bus in the Raspberry Pi models 1 to 3.

For the uninitiated, the 480Mbps theoretical unidirectional bandwidth in a single USB 2.0 bus is shared between the Ethernet Port, the SD card slot and all USB ports.

Bandwidth distribution may look like this:

Operation                    Read bandwidth   Write bandwidth
Actual available bandwidth   200Mbps          200Mbps
SD Card                      -80Mbps          -80Mbps
Ethernet disconnected        -0Mbps           -0Mbps
No USB usage                 -0Mbps           -0Mbps
Available for GbE            120Mbps          120Mbps

Although the tabulated numbers give the impression that there is little to no performance gain from this upgrade, they represent the worst-case scenario. Typically, one would expect mostly reads and few writes in web servers.

Bandwidth distribution should typically look like this in the real world:

Operation                    Read bandwidth   Write bandwidth
Actual available bandwidth   300Mbps          100Mbps
SD Card                      -100Mbps         -50Mbps
Ethernet disconnected        -0Mbps           -0Mbps
No USB usage                 -0Mbps           -0Mbps
Available for GbE            200Mbps          50Mbps
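
The two bandwidth budgets above are simple subtractions, so they are easy to replay with your own estimates. Below is a minimal Python sketch using the figures from both tables. The numbers are my rough estimates rather than measurements, and the names available_for_gbe and SCENARIOS are purely illustrative.

```python
def available_for_gbe(usable_mbps, consumers):
    """Return what is left of the shared bus after other consumers take their share."""
    return max(usable_mbps - sum(consumers.values()), 0)


# Figures copied from the two tables: direction -> (usable Mbps, other consumers).
SCENARIOS = {
    "worst case": {
        "read": (200, {"SD card": 80, "built-in Ethernet": 0, "other USB": 0}),
        "write": (200, {"SD card": 80, "built-in Ethernet": 0, "other USB": 0}),
    },
    "typical": {
        "read": (300, {"SD card": 100, "built-in Ethernet": 0, "other USB": 0}),
        "write": (100, {"SD card": 50, "built-in Ethernet": 0, "other USB": 0}),
    },
}

results = {
    name: {
        direction: available_for_gbe(usable, consumers)
        for direction, (usable, consumers) in directions.items()
    }
    for name, directions in SCENARIOS.items()
}

for name, left in results.items():
    print(f"{name}: {left['read']}Mbps read, {left['write']}Mbps write left for GbE")
```

Plugging in a non-zero value for "other USB" (say, an external drive) shows how quickly the headroom for the gigabit adapter disappears.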

Should you build this cluster?

This build is for you if you:

  • Wish to learn Docker and its associated frameworks
  • Do not want to spend a fortune on procuring multiple Raspberry Pi 4Bs to set up a cluster
  • Want to explore Kubernetes

This build is not for you if you:

  • Are already familiar with the Docker ecosystem
  • Are looking for high-performance home setups

My two cents

I highly recommend building this cluster for anyone who's keen on getting into Docker and Kubernetes for 2 main reasons.

Firstly, this cluster is compatible with the latest versions of officially supported Docker images. Under a 32-bit OS, Raspberry Pi 3Bs run on the armv7 CPU architecture, which happens to be the lowest common denominator among Arm processors today.

The latest Arm processors (arm64) are backward-compatible with code written and compiled for armv7. In contrast, armv6 processors (Raspberry Pi 1 and Zero) cannot run code compiled for armv7 or arm64, so they are essentially in the process of being phased out from the community.
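
If you are unsure which image architecture a given node needs, a quick check is to map the machine string reported by the OS to the platform name Docker uses. The sketch below is illustrative only: the function name docker_platform is my own, and I'm assuming a 32-bit OS on a Pi 3B reports armv7l while a Pi 1 or Zero reports armv6l.

```python
import platform

# Machine strings (as reported by `uname -m`) mapped to Docker platform names.
DOCKER_PLATFORMS = {
    "armv6l": "linux/arm/v6",   # e.g. Raspberry Pi 1 and Zero
    "armv7l": "linux/arm/v7",   # e.g. Raspberry Pi 3B on a 32-bit OS
    "aarch64": "linux/arm64",   # 64-bit Arm
    "arm64": "linux/arm64",     # same architecture, alternative spelling
    "x86_64": "linux/amd64",    # typical desktop or laptop
}


def docker_platform(machine=None):
    """Return the Docker platform for a machine string, or None if unknown."""
    if machine is None:
        machine = platform.machine()
    return DOCKER_PLATFORMS.get(machine)


print(f"This machine: {platform.machine()} -> {docker_platform()}")
print(f"Pi 3B (32-bit OS): {docker_platform('armv7l')}")
```

Running this on each node before pulling images saves you from discovering an architecture mismatch only when a container refuses to start.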

Secondly, this cluster would be perfect for all but the most bandwidth-demanding applications: hosting your own blog, file sync service, media library managers, link shorteners, note-taking applications and so on. The only time it would not perform as intended is when your application demands heavy sustained writes, such as when encoding videos, given the USB 2.0 bus bottleneck in the Raspberry Pi 3, which is resolved only in the Raspberry Pi 4.

All in all, building a Raspberry Pi 3 cluster is, in my opinion, the most cost-effective way of learning Docker and clustering in general, and will remain so in the foreseeable future. So if you are just getting into Docker, I'd highly recommend this build.

What's next?

In my next post in this series, I shall introduce the Leviathan cluster, which I built specifically for high-I/O applications and which is capable of real-time transcoding and streaming of HEVC -> h264 videos.