Using IPFS Cluster Service for Global IPFS Data Persistence
How to install and configure IPFS’s cluster service across your IPFS network
Automatic replication and pinning across your IPFS network
IPFS Cluster Service, a lesser-documented tool in the IPFS family, automatically replicates and pins content throughout your network of IPFS nodes. It runs alongside IPFS as a separate service and needs to be installed on all the IPFS peers that make up your cluster. This article acts as a guide to the installation and configuration process, as well as an introduction to managing content on your cluster.
An IPFS cluster is a powerful tool that gives you a simple means of distributing content globally, via IPFS, with an easy to use CLI. The project homepage can be found at https://cluster.ipfs.io, and the project is also hosted on GitHub.
Now, there are two methods of deploying an IPFS Cluster Service. Which method you favour will depend on your deployment methods. They are as follows:
- Method 1: Configure your entire cluster peerset up front and list the peers in the configuration file of your initial cluster node. Deploying a cluster service on this node will then sync the remaining peerset nodes to the cluster. This method is favourable for automated solutions with tools such as Ansible, whereby your deployment is run as an automated process. It is documented in more detail in the official documentation.
- Method 2: To deploy the first peer of the cluster and introduce more as they are deployed. This method is a more favourable approach for manually introducing IPFS nodes to a cluster.
This article will adopt Method 2, and will focus on the essential, higher-level configuration rather than the more technical details.
Prerequisites - IPFS
An IPFS node is required to be running alongside IPFS cluster service; the cluster service is an extension of IPFS’s core feature set, a complementary package that provides additional functionality on top of the core network. If you have not yet set up an IPFS node and would like to experiment with a cluster service, read my introduction to IPFS:
Introduction to IPFS: Run Nodes on Your Network, with HTTP Gateways
How to install IPFS nodes across your VPS network and configure your own Gateways
With that, let’s get to how to install and configure an IPFS cluster service.
Installing IPFS Cluster Packages
There are two packages to install, both of which are available on the IPFS distributions page: ipfs-cluster-service and ipfs-cluster-ctl. The first package, ipfs-cluster-service, runs a full cluster peer, whereas ipfs-cluster-ctl provides us with a command line interface to manage a running cluster peer.
Provided you are on a Linux OS, download the archive from the distributions page (refer to it for the latest version and platform specific packages), then run the following commands to install the service:
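# unpack the downloaded archive (this creates an ipfs-cluster-service/ folder)
tar xvfz ipfs-cluster-service_v0.7.0_linux-amd64.tar.gz
# copy the binary from the unpacked folder into a $PATH location
sudo cp ipfs-cluster-service/ipfs-cluster-service /usr/local/bin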
What we are doing here is downloading and unpacking the ipfs-cluster-service package inside your home directory. Then we are copying the unpacked binary file into a known $PATH location. I have opted for /usr/local/bin.
To test whether the package installed correctly, run ipfs-cluster-service help to bring up the corresponding output.
Installing the ipfs-cluster-ctl package is an almost identical process. The following commands will set things up:
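# unpack the downloaded archive (this creates an ipfs-cluster-ctl/ folder)
tar xvfz ipfs-cluster-ctl_v0.7.0_linux-amd64.tar.gz
# copy the binary from the unpacked folder into a $PATH location
sudo cp ipfs-cluster-ctl/ipfs-cluster-ctl /usr/local/bin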
Run ipfs-cluster-ctl help to verify that the binaries are working as expected.
Before moving on, read the overview page of IPFS cluster service to familiarise yourself with the features and limitations of the project.
Initiating a Cluster Service
Like an IPFS repository, a cluster service folder must be initiated containing the configurations needed to run the service. Where the default IPFS repository is initiated at ~/.ipfs, the default cluster service folder is initiated at ~/.ipfs-cluster.
If you prefer an alternative directory, it can be changed by either using the -c flag: ipfs-cluster-service -c <path>, or setting the IPFS_CLUSTER_PATH environment variable.
An .ipfs-cluster folder contains a service.json file, a raft/ folder and a peerstore file. Let’s briefly explore what these files and folder consist of:
- service.json: The configuration file of our cluster. A full example of this file can be found in the official documentation.
- raft/: A folder containing the consensus data, as well as snapshots, of our cluster. Raft is the name of the consensus algorithm the cluster service uses to keep its peers in agreement on the shared state.
- peerstore: Simply a list of IPFS multiaddresses of each peer in the cluster. This file however will not be present upon initiating a cluster service, and instead will be introduced / updated as more peers are added to your cluster.
Note: If you are adhering to Method 1 of cluster setup, the peerstore file is required (among other things) before running the IPFS Cluster daemon.
Initiating the ~/.ipfs-cluster folder is done with the ipfs-cluster-service init command. However, before we hastily run this command there are a couple of things we need to know.
One of those things is the Secret Key of the cluster.
The Cluster Secret Key
The secret key of a cluster is a 32-byte, hex-encoded random string, which every cluster peer needs in its service.json file.
By default, initiating a cluster generates a secret key, which can then be obtained from service.json. This secret key then needs to be applied to all other peers that make up our cluster.
The automatic generation of a secret key is convenient for our initial cluster peer, but successive peers of the cluster will require that identical secret key in their service.json file.
A secret key can be generated and predefined in the CLUSTER_SECRET environment variable, and will subsequently be used upon running ipfs-cluster-service init.
Run the following to generate a secret key and display it in your terminal:
export CLUSTER_SECRET=$(od -vN 32 -An -tx1 /dev/urandom | tr -d ' \n')
echo $CLUSTER_SECRET
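Since every peer must carry exactly the same key, it can be worth checking that a secret is well-formed before copying it around. The following is a minimal sketch, assuming a POSIX shell; is_valid_cluster_secret is a hypothetical helper written for this article, not part of IPFS Cluster. It checks that the value is 64 hex characters, which is what 32 random bytes look like once hex-encoded.

```shell
# Generate 32 random bytes as 64 hex characters (same od/tr pipeline as above).
CLUSTER_SECRET=$(od -vN 32 -An -tx1 /dev/urandom | tr -d ' \n')
export CLUSTER_SECRET

# Hypothetical helper, not part of IPFS Cluster:
# succeeds only if its argument is exactly 64 hex characters.
is_valid_cluster_secret() {
  printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{64}$'
}

if is_valid_cluster_secret "$CLUSTER_SECRET"; then
  echo "secret looks well-formed"
else
  echo "secret is malformed" >&2
fi
```

The same helper can be reused on additional peers to catch a truncated copy and paste before running the init command.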
With a known secret key stored as CLUSTER_SECRET, we can now initiate the cluster service by running ipfs-cluster-service init.
It is beneficial at this point to familiarise ourselves with what is actually possible to configure within a cluster. A detailed configuration breakdown can be found within the official documentation, starting with the main cluster section.
Opening Required Ports
A small amount of firewall management is needed to get IPFS cluster service communicating between peers.
The ports IPFS itself uses, such as 8080, should already be open. Ports 9094, 9095 and 9096 are used by the cluster service by default. For a firewalld user, open a public zoned port with the following:
sudo firewall-cmd --zone=public --add-port=9094/tcp --permanent
# repeat for other ports
# reload firewalld
sudo firewall-cmd --reload
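The "repeat for other ports" step can be expressed as a loop. This is a sketch only: the port list 9094, 9095 and 9096 is an assumption based on the cluster service defaults, and the run wrapper and DRY_RUN variable are hypothetical conveniences so that the commands are printed for review rather than executed by default.

```shell
# Assumed default cluster ports; adjust to match your service.json.
CLUSTER_PORTS="9094 9095 9096"

# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for port in $CLUSTER_PORTS; do
  run sudo firewall-cmd --zone=public --add-port="${port}/tcp" --permanent
done

# Reload firewalld so the permanent rules take effect.
run sudo firewall-cmd --reload
```

Once the printed commands look right, running the script with DRY_RUN=0 applies them.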
This is all that is required here.
Running IPFS Cluster Service in the Background
As with the main IPFS process, whose background setup is documented in my IPFS introduction article, we also run the IPFS cluster service as a background process. My preferred process manager is supervisord, and I suggest using it for IPFS cluster service too.
Provided you already have supervisord installed, amend your configuration file with an additional process for the cluster service. I have provided the IPFS_CLUSTER_PATH environment variable, set to the full path of the cluster folder.
Note: Supervisord runs as root, so the default ~/.ipfs-cluster directory would resolve to root's home directory (typically /root/.ipfs-cluster) rather than your own. Not what we want here.
I have also provided the full path to the ipfs-cluster-service binary in the command value.
Note: I have had inconsistent experiences when simply providing ipfs-cluster-service daemon, with some VPS systems not recognising the program. This may be due to the /usr/local/bin path I opted for in the installation process. But suffice to say, providing the absolute path fixed this issue.
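A minimal sketch of such a supervisord entry follows. It writes the stanza to a local file so it can be reviewed before being merged into your supervisord configuration. The folder /home/youruser/.ipfs-cluster, the output file name and the autostart/autorestart settings are assumptions for illustration; the program name and binary path follow the ones used in this article.

```shell
# Hypothetical values: point CLUSTER_DIR at the cluster folder of your own user.
CLUSTER_DIR=${CLUSTER_DIR:-/home/youruser/.ipfs-cluster}
CONF_FILE=${CONF_FILE:-./ipfs-cluster-service.conf}

# Write a supervisord program stanza with an absolute cluster path
# and an absolute path to the binary.
cat > "$CONF_FILE" <<EOF
[program:ipfs-cluster-service]
environment=IPFS_CLUSTER_PATH="$CLUSTER_DIR"
command=/usr/local/bin/ipfs-cluster-service daemon
autostart=true
autorestart=true
EOF

# Show the result for review.
cat "$CONF_FILE"
```

After checking the output, the stanza can be appended to your existing supervisord configuration file.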
Now, updating your supervisord processes will start the daemon for your cluster peer. We will want to test things in the foreground firstly; update and then stop the new process thereafter:
sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl stop ipfs-cluster-service
Testing IPFS Cluster Service
At this point it is wise to run ipfs-cluster-service daemon in your terminal to observe whether the cluster successfully runs. At the time of this writing you should see a message not dissimilar to * CLUSTER READY * after some initial output.
When you have confirmed there are no errors, start your supervisord process again:
sudo supervisorctl start ipfs-cluster-service
At this point the initial cluster peer is up and running, and we have our secret key ready for additional peers to be added. Let’s now visit bootstrapping an additional peer to the cluster.
Bootstrapping Additional Peers
Bootstrapping additional peers involves the same process outlined above, with a couple of subtle differences:
- Install the ipfs-cluster-service and ipfs-cluster-ctl packages (identical to initial peer).
- Export the CLUSTER_SECRET environment variable, setting the secret key you gathered from the initial peer before initiating the cluster service:
export CLUSTER_SECRET=<initial_peer_secret>
ipfs-cluster-service init
- Open the required firewall ports (identical to initial peer).
- Run the daemon in the foreground, this time with the --bootstrap flag containing the multiaddress of the initial cluster peer. The format of this address is /ip4/<initial_peer_ip_address>/tcp/9096/ipfs/<initial_peer_identity_hash>. Replace the two placeholder values with the public IP address the initial peer is running on and the identity hash of that peer respectively:
ipfs-cluster-service daemon --bootstrap /ip4/<initial_peer_ip_address>/tcp/9096/ipfs/<initial_peer_identity_hash>
- Add the daemon command within your supervisord configuration. Since your cluster peer has already been bootstrapped, we no longer need to include the --bootstrap flag for subsequent daemon starts.
Note: If restarting a cluster peer, the --bootstrap flag is not necessary, unless the peer has been removed from the peerset and needs to be added again.
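To avoid typos when assembling the bootstrap multiaddress by hand, the format used in the steps above can be wrapped in a small function. This is a sketch; bootstrap_multiaddr is a hypothetical helper, and the IP address and identity hash in the example call are placeholders rather than real values.

```shell
# Hypothetical helper, not part of IPFS Cluster.
# Usage: bootstrap_multiaddr <initial_peer_ip_address> <initial_peer_identity_hash> [port]
bootstrap_multiaddr() {
  ip=$1
  peer_id=$2
  port=${3:-9096}   # 9096 is the port used in the bootstrap command in this article
  printf '/ip4/%s/tcp/%s/ipfs/%s\n' "$ip" "$port" "$peer_id"
}

# Placeholder values: 203.0.113.10 is a documentation-only address and
# QmInitialPeerId stands in for the identity hash of your initial peer.
bootstrap_multiaddr 203.0.113.10 QmInitialPeerId
# prints /ip4/203.0.113.10/tcp/9096/ipfs/QmInitialPeerId
```

The output can then be passed straight to the daemon, for example ipfs-cluster-service daemon --bootstrap "$(bootstrap_multiaddr <initial_peer_ip_address> <initial_peer_identity_hash>)".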
To verify your peers are being added to the cluster successfully, run:
ipfs-cluster-ctl peers ls
Managing Content in the Cluster
ipfs-cluster-ctl can now be used to manage the content on your cluster. Refer to the following page for usage instructions, including pinning content, listing peers, syncing and recovering:
Usage of ipfs-cluster-ctl
IPFS Cluster CTL is the client application to manage the cluster nodes and perform actions. IPFS Cluster is Pinset orchestration for IPFS.
For clarity, the pin commands add content to your cluster. Pinning content ensures its persistence across the cluster, and globally persisting content is, in my opinion, the best use case of IPFS Cluster Service thus far.
# adds content to the cluster
ipfs-cluster-ctl add myfile.txt http://domain.com/file.txt
# pins a CID in the cluster
ipfs-cluster-ctl pin add Qma4Lid2T1F68E3Xa3CpE6vVJDLwxXLD8RfiB9g1Tmqp58
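When there are many CIDs to persist, the pin add command above can be driven from a text file. The following is a sketch under stated assumptions: pin_all and DRY_RUN are hypothetical names introduced here, and the input file is assumed to hold one CID per line, with blank lines and lines starting with # ignored. By default it only prints the commands it would run.

```shell
# DRY_RUN=1 (the default here) prints the commands; set DRY_RUN=0 to pin for real.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: reads one CID per line from the file given as $1
# and pins each one with `ipfs-cluster-ctl pin add`.
pin_all() {
  while IFS= read -r cid || [ -n "$cid" ]; do
    # Skip blank lines and comment lines.
    case "$cid" in
      ''|'#'*) continue ;;
    esac
    if [ "$DRY_RUN" = "1" ]; then
      echo "ipfs-cluster-ctl pin add $cid"
    else
      # Detach stdin so the command cannot consume the rest of the CID list.
      ipfs-cluster-ctl pin add "$cid" < /dev/null
    fi
  done < "$1"
}

# Example with a temporary file holding the CID used earlier in this article.
cids_file=$(mktemp)
printf '%s\n' '# example pinset' 'Qma4Lid2T1F68E3Xa3CpE6vVJDLwxXLD8RfiB9g1Tmqp58' > "$cids_file"
pin_all "$cids_file"
rm -f "$cids_file"
```

Keeping the list in a file also gives you a simple record of what the cluster is expected to persist.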
A dedicated production deployments page is readily available with tips, tweaks and guides on how to deploy a cluster using various tools. Some of the stand-out content here includes:
- Adjusting service.json to suit your needs.
- A tip for automatically updating ipfs-cluster-service on system restarts.
- A subsection on cleanly removing peers from a cluster.
- A subsection on backups and how to export / import the last persisted state.
This article acted as an introduction to IPFS Cluster Service. From here, it is recommended that the reader continues to read through the official documentation, the node upgrade process and the more advanced concepts including composite clusters.
Finally, the roadmap acts as a valuable resource for planning your deployments to coincide with vital upgrades that your organisation may require.