Welcome to a new blog series by the Xibo Development Team! The "Inside Xibo" series will give users, system administrators and developers an insight into how Xibo works and the architecture decisions we take to make Xibo the best in class.
In this blog we will focus on XMR and Push Messaging.
What is XMR?
XMR was introduced in Xibo 1.8. Put simply, XMR is a push messaging component. Before XMR became part of Xibo, we relied entirely on the Player periodically connecting to ask for updates, or in other words, pull messaging.
Looking back on the history of Xibo, pull messaging served us well, and we still make use of it today with our routine collection interval, so why the change? The evolution of this change followed a few simple steps:
- Users expect to see the results of their actions straight away. Waiting a minute is acceptable; waiting 30 minutes is not.
- To achieve this, system administrators were under pressure to reduce the “collection interval” for a display to 1 minute or lower. (The collection interval is the period of time a display waits between connections to the CMS. It could also be called a polling interval.)
- Xibo became more popular, resulting in more and larger on-premises installations (100+ displays) and many more Xibo in the Cloud customers.
- New and more powerful features meant the CMS and the display exchanged larger quantities of data with each collection.
A perfect storm was brewing: more frequent requests for data, more data and more connected displays all led to scalability concerns. The root of the problem was that a display connecting to the CMS at any given moment was likely to be making a “wasted” connection, because no changes had occurred at the CMS since its last visit. These wasted connections far outnumbered the ones where changes were available - this had to be improved.
An obvious solution to this growing problem is push messaging: if the CMS can contact a display and tell it that there are changes waiting for it, then the display can dramatically increase the time between its regular connections. We called our push messaging component Xibo Message Relay, XMR for short, and developed it as a standalone component to sit alongside the main CMS.
When users make changes via the web portal or API, Xibo determines which Displays are affected by those changes and then tells XMR to let the Player know. The Player receives the message and initiates a collection from the CMS.
Since introducing XMR on Xibo in the Cloud we’re seeing an average collection interval of 30 minutes, up from between 1 and 5 minutes pre-XMR.
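To put the effect of a longer collection interval into numbers, here is a small back-of-the-envelope sketch. The fleet size of 1,000 displays is a made-up figure for illustration, not a statistic from Xibo in the Cloud, and the function name is our own.

```python
def requests_per_day(displays: int, interval_minutes: float) -> float:
    """Routine collections per day for a fleet polling at a fixed interval."""
    minutes_per_day = 24 * 60
    return displays * minutes_per_day / interval_minutes


# Hypothetical fleet size, chosen only to make the arithmetic concrete.
fleet = 1000

before = requests_per_day(fleet, 1)   # 1 minute interval, pre-XMR
after = requests_per_day(fleet, 30)   # 30 minute interval, with XMR

print(f"1 minute interval:  {before:,.0f} collections per day")
print(f"30 minute interval: {after:,.0f} collections per day")
print(f"Reduction factor:   {before / after:.0f}x")
```

Moving from a 1 minute to a 30 minute interval cuts routine collections by a factor of 30, whatever the fleet size. The collections triggered by push messages come on top of this, but those only happen when there is actually something to collect.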
What else can it do?
So XMR solved our scalability problem for routine CMS/display communication, but is there anything else it's useful for? Yes, absolutely. It turns out that there are all sorts of things you might want the display to do “right now” which were impossible to do before. These range from requesting a screenshot or running a command to switching immediately to a layout.
As long as the display is solely responsible for connecting to the CMS and pulling data, you can secure the CMS API with TLS and have a high level of confidence that the display will only do what the CMS tells it to do. Building capability into the display to receive data from an external source and then take action opens a new attack surface that must be properly managed. For XMR we have taken two approaches, one technical and one a design principle.
Each display generates an RSA public/private key pair on first start and sends the public key to the CMS. This public key is then used to seal every message sent to the display, which can then verify the seal is valid before processing the message. This is not encryption and we’re not trying to protect the contents of the message.
We don’t need to encrypt the contents of the message, because our design principle for messages states that the message can only contain an instruction and never any data. For example, the message might say “collectNow” to trigger the player to pull new data from the CMS.
Messages also contain a timeout, which gets sealed along with the rest of the message, thus preventing replay attacks.
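As a rough illustration of these two ideas, the sketch below shows the kind of check a player could run once a message's seal has been verified. The field names `action`, `createdDt` and `ttl`, the `screenshot` instruction name and the allow-list itself are assumptions made for this example, not a description of the real message format. Only `collectNow` comes from the text above, and seal verification is deliberately left out.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical allow-list: a message may only name an instruction, never carry data.
KNOWN_ACTIONS = {"collectNow", "screenshot"}


def should_process(message: dict, now: datetime) -> bool:
    """Return True only for an unexpired message naming a known instruction."""
    if message.get("action") not in KNOWN_ACTIONS:
        return False
    created = datetime.fromisoformat(message["createdDt"])
    expires = created + timedelta(seconds=message["ttl"])
    # A captured message replayed after this point is ignored.
    return now <= expires


sent_at = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
message = {"action": "collectNow", "createdDt": sent_at.isoformat(), "ttl": 10}

print(should_process(message, sent_at + timedelta(seconds=5)))    # within the timeout
print(should_process(message, sent_at + timedelta(seconds=11)))   # replayed too late
```

Because the timeout is sealed together with the instruction, an attacker who captures a message cannot extend its lifetime without breaking the seal.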
Under the hood
We had a few key requirements when searching for a solution:
- Good libraries for PHP, Java (Android), C#, node and C++
- Simple to install in the CMS environment (platform dependencies for non-Docker users)
- Support for at least 2 messaging patterns - Request/Response and Publish/Subscribe
- TCP sockets
- Open Source
We looked at various technologies for this and decided that a message queue framework or messaging library made the most sense. There are some great resources describing what a message queue is - simply put, they allow two applications to exchange messages with each other. Of the two messaging patterns we identified, Publish/Subscribe is key as it is designed for one-to-many communication. In our case the CMS is the "one" and the displays are the "many".
There are a few great messaging libraries to choose from. We chose ZeroMQ, which we felt would fit nicely with the rest of the Xibo ecosystem. ZeroMQ is a flexible, open source, universal messaging library. It has built-in support for both messaging patterns we want to use, and libraries for all the languages we need.
XMR itself is a separate CLI component written using ReactPHP, which sits alongside the CMS on the server. It is CMS agnostic, which means one XMR installation can serve multiple CMS instances. XMR binds two ports: one for a Request/Response queue used by the CMS and the other for a Publish/Subscribe queue used by the displays.
ReactPHP is super fast and we’re very pleased with its performance and reliability. It was tempting to use a different language more naturally suited to this sort of activity, however on balance, making the XMR component easy for our users to install matters more for the majority. Our initial thinking was that we could write another XMR component for larger installations which required higher performance, however this has not been necessary. Xibo in the Cloud routinely processes between 2 and 3 XMR messages a second on average - that is around 240,000 per day!
If a request via the web portal, API or XTR (task runner) is determined to affect a display, the CMS creates a message for that display, seals it with the display's public key, wraps it in an envelope with some quality control data, and sends that message to XMR using the Request/Response message queue. XMR then determines the message's priority and destination, and sends it via the Publish/Subscribe queue to be received by the display, unsealed and actioned accordingly.
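The flow above can be sketched in a few lines. This is an in-memory toy using only the Python standard library, not the real XMR (which is written with ReactPHP on top of ZeroMQ sockets). The class name and the envelope fields `channel`, `qos` and `message` are hypothetical stand-ins for the destination, priority and sealed payload.

```python
import heapq
import itertools


class RelaySketch:
    """Toy relay: accepts envelopes from a CMS and publishes them by priority."""

    def __init__(self):
        self._queue = []                   # heap of (-priority, arrival order, envelope)
        self._arrival = itertools.count()  # tie-breaker that keeps arrival order
        self._subscribers = {}             # channel -> list of callbacks

    def subscribe(self, channel, callback):
        """A display registers interest in its own channel only."""
        self._subscribers.setdefault(channel, []).append(callback)

    def accept(self, envelope):
        """Request/Response side: queue the envelope and acknowledge it."""
        entry = (-envelope["qos"], next(self._arrival), envelope)
        heapq.heappush(self._queue, entry)
        return True

    def flush(self):
        """Publish/Subscribe side: deliver queued messages, highest priority first."""
        delivered = 0
        while self._queue:
            _, _, envelope = heapq.heappop(self._queue)
            for callback in self._subscribers.get(envelope["channel"], []):
                callback(envelope["message"])  # the relay never unseals the payload
                delivered += 1
        return delivered


relay = RelaySketch()
received = []
relay.subscribe("display-1", received.append)

relay.accept({"channel": "display-1", "qos": 1, "message": "sealed-low"})
relay.accept({"channel": "display-1", "qos": 10, "message": "sealed-high"})
relay.accept({"channel": "display-2", "qos": 5, "message": "sealed-other"})

delivered = relay.flush()
print(received)   # ['sealed-high', 'sealed-low']
print(delivered)  # 2
```

The message for `display-2` is dropped here because nobody is subscribed to that channel, which mirrors how Publish/Subscribe behaves: a display that is offline when a message is published simply misses it and catches up at its next routine collection.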
What's next for ZeroMQ?
Xibo has been using ZeroMQ in production for several years and we’re very happy with its performance. Adding push messaging to the platform has brought about fantastic user interface improvements, as well as scalability improvements for Xibo’s APIs.
We’re excited to keep ZeroMQ as a core technology, and as we move forward with our minor releases for Xibo v3 we have some great use cases for it. In Xibo 3.1 we are planning to add synced playback to Xibo’s feature set, which we’re calling Mirror Sync. All types of display in our suite have ZeroMQ onboard, which makes it a natural choice for forming a peer-to-peer message queue amongst players. That queue will then be used to exchange synchronisation messages for Mirror Sync.
- ZMQ: https://zeromq.org/
- ReactPHP: https://reactphp.org/
- XMR: https://xibo.org.uk/docs/setup/xmr-push-messaging
- Software Triggers with XMR: https://xibo.org.uk/docs/developer/player-control/software-triggers