mscharhag, Programming and Stuff;

A blog about programming and software development topics, mostly focused on Java technologies including Java EE, Spring and Grails.

Wednesday, 6 May, 2020

REST / Using feeds to publish events

Dealing with events

When working with multiple decoupled services (e.g. in a microservice architecture), you will very likely need a way to publish domain events from one service to one or more other services.

Many widely adopted solutions rely on a separate piece of infrastructure to solve this problem (such as an event bus or a message queue).

Event feeds

Another approach to this problem is the use of feeds. Feeds like RSS or Atom are typically used to subscribe to websites. Whenever a new article is published on a subscribed website, a feed reader application (e.g. a browser add-on or mobile app) can inform the user about the new article. Feed readers typically poll a provided feed endpoint at regular intervals to see if new articles are available.

Instead of publishing new articles to RSS readers, we can use a feed to publish events to other services. This requires no additional infrastructure besides a standard database to store events (which you might already have).

RSS and Atom are both XML formats and therefore not a good fit if we want to provide a JSON API. There is also JSON Feed, which is similar to RSS and Atom but uses JSON. Like RSS and Atom, JSON Feed focuses on website content, so many of the (optional) feed and feed item properties are not very useful for publishing domain events (like favicon, content_html, image, banner_image and attachments). However, JSON Feed has a simple extension mechanism that allows us to define custom fields in our feeds. The names of these fields have to start with an underscore. If JSON Feed does not match your needs, you can also come up with your own feed format, which should not be that hard.

An example JSON Feed with two published domain events might look like this:

{
  "version": "https://jsonfeed.org/version/1",
  "title": "user service events",
  "feed_url": "http://userservice.myapi.com/events",
  "next_url": "http://userservice.myapi.com/events?offset=2", 
  "items": [
    {
      "id": "42",
      "url": "http://userservice.myapi.com/user/123",
      "date_published": "2020-05-01T14:00:00-07:00",
      "_type": "NameChanged",
      "_data": {
        "oldName" : "John Foo",
        "newName" : "John Bar"
      }
    }, {
      "id": "43",
      "url": "http://userservice.myapi.com/user/789",
      "date_published": "2020-05-02T17:00:00-03:00",
      "_type": "UserDeleted",
      "_data": {
        "name" : "Anna Smith",
        "email" : "anna@smith.com"
      }
    }
  ]
}

The first event (with id 42) indicates that the name of the user resource /user/123 has been changed. Within the _data block we provide some additional event information that might be useful for the subscriber. The second event indicates that the resource /user/789 has been deleted; its _data block contains the deleted user data. _type and _data are not defined in the JSON Feed format and therefore start with an underscore (the JSON Feed extension format).

The feed property next_url can be used to provide some sort of pagination. It tells the client where to look for more events after all events in the current feed have been processed. Our feed contains only two events, so we tell the client to call the feed endpoint with an offset parameter of two to get the next events.
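To make this a bit more concrete, here is a minimal Python sketch of how a subscriber might read such a feed. The helper names (describe_event, read_feed) are made up for illustration, and the embedded document is just the example feed from above in compact form. A real subscriber would fetch the document over HTTP and would probably do something more useful than building summary strings.

```python
import json

# The example feed from above, embedded as a string for this sketch.
FEED_JSON = """
{
  "version": "https://jsonfeed.org/version/1",
  "title": "user service events",
  "feed_url": "http://userservice.myapi.com/events",
  "next_url": "http://userservice.myapi.com/events?offset=2",
  "items": [
    {"id": "42", "url": "http://userservice.myapi.com/user/123",
     "date_published": "2020-05-01T14:00:00-07:00",
     "_type": "NameChanged",
     "_data": {"oldName": "John Foo", "newName": "John Bar"}},
    {"id": "43", "url": "http://userservice.myapi.com/user/789",
     "date_published": "2020-05-02T17:00:00-03:00",
     "_type": "UserDeleted",
     "_data": {"name": "Anna Smith", "email": "anna@smith.com"}}
  ]
}
"""


def describe_event(item):
    """Turn one feed item into a short summary, based on the custom _type field."""
    event_type = item["_type"]
    data = item.get("_data", {})
    if event_type == "NameChanged":
        return f'{item["url"]}: renamed from {data["oldName"]} to {data["newName"]}'
    if event_type == "UserDeleted":
        return f'{item["url"]}: deleted ({data["name"]})'
    # Unknown event types are reported instead of raising an error.
    return f'{item["url"]}: unhandled event type {event_type}'


def read_feed(text):
    """Parse a feed document and return (summaries, next_url)."""
    feed = json.loads(text)
    summaries = [describe_event(item) for item in feed.get("items", [])]
    # next_url is optional, so it may be None.
    return summaries, feed.get("next_url")


summaries, next_url = read_feed(FEED_JSON)
for line in summaries:
    print(line)
print("continue at:", next_url)
```

Reporting unknown _type values instead of failing is a deliberate choice in this sketch: it lets the publishing service introduce new event types without breaking existing subscribers.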

General considerations

Whether you use JSON Feed or come up with your own feed format, here are some general things you should consider when building a feed to publish events:

Feed items are immutable

Feed items represent domain events, which are immutable. When necessary, clients can use the unique feed item id to check if they already processed a feed item.
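A subscriber could use the item id to skip events it has already handled, roughly like this. This is only a sketch: IdempotentConsumer is a hypothetical class, and the set of seen ids lives in memory here, whereas a real client would have to persist it.

```python
class IdempotentConsumer:
    """Passes each feed item to a handler at most once, keyed by item id."""

    def __init__(self, handler):
        self._handler = handler
        # Assumption: kept in memory for brevity. A real client would store
        # the processed ids (or the last processed id) durably.
        self._seen_ids = set()

    def consume(self, items):
        """Handle all items that were not seen before; return how many were handled."""
        handled = 0
        for item in items:
            if item["id"] in self._seen_ids:
                continue  # already processed, e.g. after a repeated request
            self._handler(item)
            self._seen_ids.add(item["id"])
            handled += 1
        return handled


processed = []
consumer = IdempotentConsumer(processed.append)
batch = [{"id": "42", "_type": "NameChanged"}, {"id": "43", "_type": "UserDeleted"}]
consumer.consume(batch)
consumer.consume(batch)  # the second delivery of the same items is ignored
```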

The feed item order is not modified

The order of the items in the feed is not changed. Newer items are appended to the end of the feed.

Clients should be able to request only the feed items they have not processed so far

Clients should not have to process all feed items over and over again just to see if new items are available (e.g. by checking the date_published item property). The feed should therefore provide a way to return only the new items. When using JSON Feed, this can be accomplished with the next_url property.

The following diagram tries to visualize a possible next_url behavior:

At the first feed request only two events might be available. Both are returned by the server, together with a next_url that contains an offset parameter of 2. After the client has processed both feed items, it requests the next items using an offset of 2. No new items are available, so an empty feed without a new next_url is returned by the server. The client remembers the previous next_url and retries the request some time later again. This time a new item is returned with an updated next_url containing an offset of 3.
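The sequence described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: EventFeed is an in-memory stand-in for the server and its database, FeedClient replaces a real HTTP client, and the offset query parameter is just one possible way to build next_url.

```python
from urllib.parse import parse_qs, urlparse

BASE_URL = "http://userservice.myapi.com/events"


class EventFeed:
    """In-memory, append-only event store standing in for the server side."""

    def __init__(self):
        self._events = []

    def publish(self, event_type, data):
        # New events are only ever appended, so existing offsets stay valid.
        event_id = str(len(self._events) + 1)
        self._events.append({"id": event_id, "_type": event_type, "_data": data})

    def get(self, url):
        """Return the feed document for BASE_URL or BASE_URL?offset=N."""
        query = parse_qs(urlparse(url).query)
        offset = int(query.get("offset", ["0"])[0])
        items = self._events[offset:]
        feed = {
            "version": "https://jsonfeed.org/version/1",
            "title": "user service events",
            "feed_url": BASE_URL,
            "items": items,
        }
        if items:
            # Only hand out a new next_url when there was something to return.
            feed["next_url"] = f"{BASE_URL}?offset={offset + len(items)}"
        return feed


class FeedClient:
    """Polls the feed and remembers where to continue."""

    def __init__(self, feed, start_url=BASE_URL):
        self._feed = feed
        self.next_url = start_url

    def poll(self):
        response = self._feed.get(self.next_url)
        # An empty feed carries no next_url, so the previous one is kept.
        self.next_url = response.get("next_url", self.next_url)
        return response["items"]


feed = EventFeed()
feed.publish("NameChanged", {"oldName": "John Foo", "newName": "John Bar"})
feed.publish("UserDeleted", {"name": "Anna Smith"})

client = FeedClient(feed)
first = client.poll()    # two items, next_url now ends with offset=2
second = client.poll()   # no items, next_url is unchanged
feed.publish("NameChanged", {"oldName": "A", "newName": "B"})
third = client.poll()    # one new item, next_url now ends with offset=3
```

A plain numeric offset only works because items are never removed or reordered. If that cannot be guaranteed, a cursor based on the last seen item id might be a safer choice.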

Of course you can come up with different ways of accomplishing the same result.

And performance?

Obviously a feed cannot compete with high-throughput messaging solutions from a performance point of view. However, I think it is sufficient for many use cases. If it reduces the complexity of your system, it might be a worthwhile trade-off.

Things to consider are:

  • The number of events created by the server
  • The number of feed subscribers
  • The amount of data associated with an event
  • The acceptable delay between publishing and processing of an event, which defines the polling interval for subscribers
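These factors can be combined into a rough back-of-the-envelope estimate. The two functions below are hypothetical helpers that only do simple arithmetic. They assume that all subscribers poll at the same fixed interval and that events arrive at a steady rate, which will rarely be exactly true in practice.

```python
def polling_requests_per_second(subscribers, polling_interval_seconds):
    """Average number of feed requests per second caused by polling alone."""
    return subscribers / polling_interval_seconds


def events_per_poll(events_per_second, polling_interval_seconds):
    """Average number of new events a subscriber receives with each request."""
    return events_per_second * polling_interval_seconds


# Example numbers, chosen arbitrarily for illustration:
# 50 subscribers polling every 10 seconds, 2 new events per second.
load = polling_requests_per_second(50, 10)
batch_size = events_per_poll(2, 10)
print(f"{load} requests/s, about {batch_size} events per response")
```

A shorter polling interval reduces the delay before an event is processed, but increases the request rate on the server in the same proportion.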

Due to the immutable nature of domain events, caching of events can be an option on the server to reduce database lookups. Long polling and conditional GET requests are possible options to reduce network load.
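A conditional GET could look roughly like the following sketch, which uses an ETag derived from the feed content. The function feed_response is a made-up name and is not tied to any web framework. It also simplifies the HTTP rules: a real If-None-Match header may contain several values or weak validators, which are ignored here.

```python
import hashlib
import json


def feed_response(feed, if_none_match=None):
    """Return (status, headers, body) for a feed request.

    Assumption: the ETag is a SHA-256 hash of the serialized feed. Because
    events are immutable and only appended, it changes exactly when the
    returned feed page changes.
    """
    body = json.dumps(feed, sort_keys=True).encode("utf-8")
    etag = '"' + hashlib.sha256(body).hexdigest() + '"'
    if if_none_match == etag:
        # The client already has this version, so no body is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Type": "application/json"}, body


page = {"title": "user service events", "items": []}
status, headers, body = feed_response(page)
# The client sends the ETag back with its next request.
status_again, _, body_again = feed_response(page, if_none_match=headers["ETag"])
print(status, status_again)
```

The server still has to build the feed page to compute the hash in this variant. Deriving the ETag from something cheaper, such as the id of the newest event on the page, would avoid that work.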

Conclusion

Feeds provide an alternative way of publishing events to other systems using a REST API without additional infrastructure besides a database to store events. You can use existing feed formats like JSON Feed or come up with your own custom feed format.

Because of the polling nature of a feed, this solution is probably not the best choice if you have tons of events and a lot of consumers.
 
