Developers’ Corner: Unveiling GeoServer Active Clustering Extension


Dear all,
today we want to talk about our Active Clustering Extension for GeoServer which is, let’s cover this right away so that you won’t be wondering, available as Free and Open Source software.

GeoServer can be clustered to achieve a Highly Available setup, more scalability, or both. Regardless of the reason why you intend to create a clustered deployment for GeoServer, there are limitations that must be taken into account and, where possible, worked around. You can find additional information about these limitations at this link.

The main limitation when it comes to installing multiple GeoServer instances in parallel to serve the same content is that each instance loads the entire configuration (mind you, only the configuration, not the data) in memory at start-up time for performance reasons. From that point on, if you share the data directory between multiple instances and make a change to the configuration of one of them, via the GUI or via the REST interface, you have to reload the configuration on the other instances to make them aware of the change. This obviously does not scale as the frequency of the changes and/or the number of served layers increases.
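To make the manual workaround concrete, here is a minimal sketch of what "reload the configuration on the other instances" looks like when scripted. It relies on GeoServer's REST reload endpoint (`POST /rest/reload`); the instance URLs and the credentials are assumptions for illustration only, and nothing is sent until `reload_all` is called.

```python
import base64
import urllib.request

# Hypothetical instance URLs; adjust them to your own deployment.
INSTANCES = [
    "http://localhost:8080/geoserver",
    "http://localhost:8081/geoserver",
    "http://localhost:8082/geoserver",
]


def build_reload_request(base_url, user, password):
    """Build (but do not send) a POST to the REST reload endpoint."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rest/reload",
        data=b"",
        method="POST",
        headers={"Authorization": "Basic " + token},
    )


def reload_all(instances, user, password, skip=None):
    """Ask every instance except `skip` (the one that was edited) to reload."""
    for base_url in instances:
        if base_url == skip:
            continue
        request = build_reload_request(base_url, user, password)
        with urllib.request.urlopen(request, timeout=60) as response:
            print(base_url, response.status)
```

Every configuration change costs one full reload per remaining instance, which is exactly the overhead that grows with the number of instances and layers.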

To tackle this issue we have implemented the Active Clustering Extension for GeoServer, which uses a Multi Master approach built on a Message Oriented Middleware (MOM) infrastructure, also known as a broker, to keep all the instances in sync by exchanging messages that distribute configuration changes.

  • Master instances accept changes to the internal configuration, persist them in their own data directory and forward them to the Slaves via the broker (optionally with guaranteed delivery)
  • Slave instances are not supposed to be used to change the configuration from either REST or the User Interface, since they are configured to receive configuration changes disseminated by the Master(s) via the MOM and inject them directly in memory, which avoids configuration reloads
  • The Broker is responsible for letting the Masters and the Slaves exchange messages in a durable (this is configurable) fashion
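The three roles above can be illustrated with a toy, in-memory model. This is not the extension's code: the `Broker` and `Instance` classes and the configuration keys are hypothetical names, and the real broker is a separate messaging component rather than a Python list of subscribers.

```python
class Broker:
    """Toy stand-in for the MOM: fans each message out to all subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, node):
        self.subscribers.append(node)

    def publish(self, sender, change):
        for node in self.subscribers:
            if node is not sender:  # the sender has already applied the change
                node.receive(change)


class Instance:
    """Toy GeoServer instance holding its configuration in memory."""

    def __init__(self, name, broker, master=False):
        self.name = name
        self.master = master
        self.broker = broker
        self.config = {}
        broker.subscribe(self)

    def change(self, key, value):
        if not self.master:
            raise PermissionError(f"{self.name} is a slave and only receives changes")
        self.config[key] = value
        self.broker.publish(self, (key, value))

    def receive(self, change):
        key, value = change
        self.config[key] = value  # injected in memory, no full reload


broker = Broker()
m1 = Instance("master-1", broker, master=True)
m2 = Instance("master-2", broker, master=True)
s1 = Instance("slave-1", broker)
m1.change("layer:roads", "enabled")
m2.change("layer:rivers", "enabled")
print(s1.config)
```

After the two changes, both masters and the slave hold the same configuration even though each change was made on a different master.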

You can find more detailed information in our GeoServer training here.

This extension has been designed to be as easy as possible to install and configure. A GUI is provided to check the status of the cluster connection and to manage the main settings. Once you are logged into GeoServer as administrator you will find a ‘Cluster’ entry in the left-side menu.

The Root Clustering menu in the GeoServer GUI


Here you can administer the connections and the principal aspects of the cluster extension as shown below.

Clustering settings detailed GUI


To complete the plugin we also provided a REST interface which will help you to configure the settings programmatically.

An embedded broker is included in the Active Clustering Extension by default. It can be started, stopped and configured to run as the default messaging broker using the provided GUI. It is provided to keep the set-up as simple as possible, so that you can create a cluster of GeoServers with only a few clicks.

Moreover, the Active Clustering Extension is able to automatically (and dynamically) discover other brokers, using a Multicast-based protocol (make sure this is supported in your network), to create a High Availability deployment that stores, dispatches and applies changes to the GeoServer cluster.


Peer-to-peer set-up with shared data directory and embedded brokers in discovery mode

Instead of sharing the GeoServer data directory, each instance can be configured to have its own (private) data directory. In this case the instance must be configured to use a durable subscription to the Broker and to keep its data directory in sync with the Master’s, so that every configuration change received from a Master (through the MOM) is also persisted to disk. With this configuration, when a Slave goes down and comes back up, it receives the configuration changes it missed and uses them to realign its data directory with the Master’s.
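The catch-up behaviour of a durable subscription can be sketched with a toy broker that queues messages for subscribers that are offline. The class, the subscriber name and the change strings are hypothetical, and a list stands in for the Slave's private data directory; the real broker also persists its queues, which this sketch does not.

```python
from collections import deque


class DurableBroker:
    """Toy broker that buffers messages for registered subscribers while they are offline."""

    def __init__(self):
        self.pending = {}  # subscriber name -> queued changes
        self.online = {}   # subscriber name -> delivery callback

    def register(self, name):
        self.pending.setdefault(name, deque())

    def connect(self, name, callback):
        self.online[name] = callback
        queue = self.pending[name]
        while queue:  # replay everything missed while offline, in order
            callback(queue.popleft())

    def disconnect(self, name):
        self.online.pop(name, None)

    def publish(self, change):
        for name, queue in self.pending.items():
            callback = self.online.get(name)
            if callback:
                callback(change)
            else:
                queue.append(change)


slave_data_dir = []  # stands in for the Slave's private data directory
broker = DurableBroker()
broker.register("slave-1")
broker.connect("slave-1", slave_data_dir.append)
broker.publish("add layer roads")
broker.disconnect("slave-1")           # the Slave goes down
broker.publish("add layer rivers")     # changes made while it is down
broker.publish("remove layer roads")
broker.connect("slave-1", slave_data_dir.append)  # the Slave comes back up
print(slave_data_dir)
```

The two changes published while the Slave was down are delivered, in their original order, as soon as it reconnects.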

More information can be found on this page.

Let’s now check how easy it is to create a cluster of peer-to-peer instances. I am going to create a cluster of 3 instances on my laptop using the nightly build binary for the 2.4.x branch.

Download and install the 2.4.x binary

  1. Download the GeoServer binary from this link.
  2. Unzip to a folder on your local file system
  3. Rename the geoserver-2.4-SNAPSHOT directory to geoserver-2.4-SNAPSHOT-1
  4. Make two copies of this directory and name them geoserver-2.4-SNAPSHOT-2 and geoserver-2.4-SNAPSHOT-3

Now you have 3 GeoServer instances that have private data directories but which are aligned and ready to be clustered.
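If you prefer to script the rename and copy steps, the following sketch does the same thing. The `make_instances` helper is a hypothetical name, and the demo at the bottom runs against an empty placeholder directory in a temporary folder rather than a real GeoServer download.

```python
import shutil
import tempfile
from pathlib import Path


def make_instances(unzipped, count=3):
    """Rename <dir> to <dir>-1, then copy it to <dir>-2 .. <dir>-count."""
    unzipped = Path(unzipped)
    first = unzipped.with_name(unzipped.name + "-1")
    unzipped.rename(first)
    created = [first]
    for i in range(2, count + 1):
        target = unzipped.with_name(f"{unzipped.name}-{i}")
        shutil.copytree(first, target)
        created.append(target)
    return created


# Demo on a placeholder directory tree (not a real GeoServer binary).
with tempfile.TemporaryDirectory() as tmp:
    unzipped = Path(tmp) / "geoserver-2.4-SNAPSHOT"
    (unzipped / "data_dir").mkdir(parents=True)
    names = [p.name for p in make_instances(unzipped)]
    print(names)
```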

Configure the binaries

You now need to configure the 3 instances to start on different HTTP ports. What we are doing here is scaling up (vertical scaling) by putting more instances on the same machine, hence some additional care must be taken.

  1. Edit the conf/jetty.xml file
  2. Make sure the  HTTP port in the connector configuration is unique between the three instances.
  3. Remove the file slf4j-simple-1.0.1  from the lib directory (this is not needed on newer releases)

My set-up for instance 2 looks like the following:

<Call name="addConnector">
  <Arg>
    <New class="org.mortbay.jetty.nio.SelectChannelConnector">
      <Set name="port"><SystemProperty name="jetty.port" default="8081"/></Set>
      <Set name="maxIdleTime">30000</Set>
      <Set name="Acceptors">2</Set>
      <Set name="confidentialPort">8443</Set>
    </New>
  </Arg>
</Call>
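Editing the port by hand in three files is easy to get wrong, so here is a small sketch that rewrites the default value of the `jetty.port` property in the text of a `jetty.xml` file. The `set_default_port` helper is a hypothetical name, and the pattern assumes the file uses plain straight quotes and the same attribute order as the snippet above.

```python
import re

# Matches the SystemProperty element that carries the default HTTP port.
PORT_ATTR = re.compile(r'(<SystemProperty\s+name="jetty\.port"\s+default=")\d+(")')


def set_default_port(jetty_xml, port):
    """Return the jetty.xml text with the jetty.port default replaced by `port`."""
    updated, count = PORT_ATTR.subn(
        lambda m: f"{m.group(1)}{port}{m.group(2)}", jetty_xml
    )
    if count != 1:
        raise ValueError(f"expected exactly one jetty.port property, found {count}")
    return updated


sample = '<Set name="port"><SystemProperty name="jetty.port" default="8080"/></Set>'
print(set_default_port(sample, 8081))
```

To apply it to a real instance, read `conf/jetty.xml` as text, pass it through `set_default_port` with a port that no other instance uses, and write the result back.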

Install the extension

The default set-up is as simple as possible.

  1. Get the Active Clustering Extension from here
  2. Unzip in a temporary directory
  3. Copy all the jar files into the webapps\geoserver\WEB-INF\lib directory of each instance (remember I am on Windows, the path would have forward slashes on Linux 😉 )
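The copy step has to be repeated for every instance, so it is a natural candidate for a short script. In this sketch the `install_jars` helper, the jar file name and the directory layout created in the demo are all placeholders; only the `webapps/geoserver/WEB-INF/lib` target path comes from the steps above.

```python
import shutil
import tempfile
from pathlib import Path


def install_jars(extension_dir, instance_dirs):
    """Copy every .jar from extension_dir into each instance's WEB-INF/lib."""
    jars = sorted(Path(extension_dir).glob("*.jar"))
    copied = []
    for instance in instance_dirs:
        lib = Path(instance) / "webapps" / "geoserver" / "WEB-INF" / "lib"
        if not lib.is_dir():
            raise FileNotFoundError(lib)
        for jar in jars:
            copied.append(Path(shutil.copy2(jar, lib)))
    return copied


# Demo with placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    ext = Path(tmp) / "extension"
    ext.mkdir()
    (ext / "example-cluster.jar").touch()  # hypothetical jar name
    (ext / "README.txt").touch()           # non-jar files are skipped
    instances = [Path(tmp) / f"geoserver-2.4-SNAPSHOT-{i}" for i in (1, 2, 3)]
    for inst in instances:
        (inst / "webapps" / "geoserver" / "WEB-INF" / "lib").mkdir(parents=True)
    copied_names = [p.name for p in install_jars(ext, instances)]
    print(copied_names)
```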

Test the extension

For each instance:

  1. Browse to the bin  directory inside the instance
  2. Double-click on startup.bat

This will open up a DOS console printing the logs while GeoServer starts. When you see a message like:

2014-05-11 10:14:07.104::INFO:  Started SelectChannelConnector@0.0.0.0:8080

it means that the instance has started. If you see a line in the log like this one:

11 may 10:16:09 WARN [TransportConnection.Transport] – Transport Connection to: tcp://xxx.xxx.xxx.xxx:57234 failed: java.io.EOFException

don’t worry, it’s just the discovery agent doing its work.
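When you are watching three consoles at once, a small filter can help tell the "started" message apart from the harmless discovery warning. This is a sketch based only on the two log lines quoted above; the `classify` helper is a hypothetical name and the patterns may need adjusting for other versions or log formats.

```python
import re

# Patterns derived from the two sample log lines quoted in this post.
STARTED = re.compile(r"Started SelectChannelConnector@[\d.]+:(\d+)")
DISCOVERY_NOISE = re.compile(
    r"WARN \[TransportConnection\.Transport\].*java\.io\.EOFException"
)


def classify(line):
    """Return ("started", port), ("ignore", None) or ("other", None) for a log line."""
    match = STARTED.search(line)
    if match:
        return ("started", int(match.group(1)))
    if DISCOVERY_NOISE.search(line):
        return ("ignore", None)
    return ("other", None)


started_line = "2014-05-11 10:14:07.104::INFO:  Started SelectChannelConnector@0.0.0.0:8080"
warning_line = (
    "11 may 10:16:09 WARN [TransportConnection.Transport] - Transport Connection to: "
    "tcp://xxx.xxx.xxx.xxx:57234 failed: java.io.EOFException"
)
print(classify(started_line))
print(classify(warning_line))
```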

The complete documentation can be viewed here or downloaded here. Mind you, although this extension is used in a few production deployments, it has not been tested extensively and should therefore be considered Beta quality; we would be happy to hear your feedback with a view to proposing this module as a standard GeoServer Community Module.

The GeoSolutions team,
