mscharhag, Programming and Stuff;

A blog about programming and software development topics, mostly focused on Java technologies including Java EE, Spring and Grails.

  • Sunday, 30 May, 2021

    Providing useful API error messages with Spring Boot

    For API users it is quite important that an API provides useful error messages. Otherwise, it can be hard to figure out why things do not work. Debugging can quickly become a larger effort for the client than implementing useful error responses on the server side would have been. This is especially true if clients are not able to solve the problem themselves and additional communication is required.

    Nonetheless, this topic is often ignored or implemented half-heartedly.

    Client and security perspectives

    There are different perspectives on error messages. Detailed error messages are more helpful for clients, while from a security perspective it is preferable to expose as little information as possible. Luckily, those two views often do not conflict much when implemented correctly.

    Clients are usually interested in very specific error messages if the error is produced by them. This should usually be indicated by a 4xx status code. Here, we need specific messages that point to the mistake made by the client without exposing any internal implementation detail.

    On the other hand, if the client request is valid and the error is produced by the server (5xx status codes), we should be conservative with error messages. In this case, the client is not able to solve the problem and therefore does not require any details about the error.

    A response indicating an error should contain at least two things: a human-readable message and an error code. The former helps the developer who sees the error message in the log file. The latter allows specific error processing on the client (e.g. showing a specific error message to the user).
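    For example, a client can branch on the machine-readable code without depending on the exact wording of the server message. The codes and texts below are made up purely for illustration:

    ```java
    // Hypothetical client-side mapping from machine-readable error codes
    // to user-facing texts. Codes and messages are illustrative only.
    public class ErrorMessages {

        public static String userMessageFor(String errorCode) {
            switch (errorCode) {
                case "ARTICLE_NOT_FOUND":
                    return "This article does not exist.";
                case "ARTICLE_NOT_APPROVED":
                    return "This article has not been approved yet.";
                default:
                    return "An unexpected error occurred.";
            }
        }
    }
    ```

    Branching on the code keeps clients stable even if the human-readable message changes later.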

    How to build a useful error response in a Spring Boot application?

    Assume we have a small application in which we can publish articles. A simple Spring controller to do this might look like this:

    @RestController
    public class ArticleController {
    
        @Autowired
        private ArticleService articleService;
    
        @PostMapping("/articles/{id}/publish")
        public void publishArticle(@PathVariable ArticleId id) {
            articleService.publishArticle(id);
        }
    }

    Nothing special here, the controller just delegates the operation to a service, which looks like this:

    @Service
    public class ArticleService {
    
        @Autowired
        private ArticleRepository articleRepository;
    
        public void publishArticle(ArticleId id) {
            Article article = articleRepository.findById(id)
                    .orElseThrow(() -> new ArticleNotFoundException(id));
    
            if (!article.isApproved()) {
                throw new ArticleNotApprovedException(article);
            }
    
            ...
        }
    }

    Inside the service we throw specific exceptions for possible client errors. Note that those exceptions do not just describe the error. They also carry information that can help us later to produce a good error message:

    public class ArticleNotFoundException extends RuntimeException {
        private final ArticleId articleId;
    
        public ArticleNotFoundException(ArticleId articleId) {
            super(String.format("No article with id %s found", articleId));
            this.articleId = articleId;
        }
        
        // getter
    }

    If the exception is specific enough we do not need a generic message parameter. Instead, we can define the message inside the exception constructor.

    Next we can use an @ExceptionHandler method in a @ControllerAdvice bean to handle the actual exception:

    @ControllerAdvice
    public class ArticleExceptionHandler {
    
        @ExceptionHandler(ArticleNotFoundException.class)
        public ResponseEntity<ErrorResponse> onArticleNotFoundException(ArticleNotFoundException e) {
            String message = String.format("No article with id %s found", e.getArticleId());
            return ResponseEntity
                    .status(HttpStatus.NOT_FOUND)
                    .body(new ErrorResponse("ARTICLE_NOT_FOUND", message));
        }
        
        ...
    }

    If controller methods throw exceptions, Spring tries to find a method annotated with a matching @ExceptionHandler annotation. @ExceptionHandler methods can have flexible method signatures, similar to standard controller methods. For example, we can add an HttpServletRequest parameter and Spring will pass in the current request object. Possible parameters and return types are described in the Javadoc of @ExceptionHandler.
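    A sketch of such a handler might look like the following. Note that the chosen status code, the logger usage and the getters on the exception are assumptions for illustration:

    ```java
    @ExceptionHandler(ArticleNotApprovedException.class)
    public ResponseEntity<ErrorResponse> onArticleNotApproved(
            ArticleNotApprovedException e, HttpServletRequest request) {
        // the injected request gives us access to the URI, headers, etc.
        log.warn("Article not approved, request uri: {}", request.getRequestURI());
        String message = String.format("Article %s is not approved", e.getArticle().getId());
        return ResponseEntity
                .status(HttpStatus.CONFLICT)
                .body(new ErrorResponse("ARTICLE_NOT_APPROVED", message));
    }
    ```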

    In this example, we create a simple ErrorResponse object that consists of an error code and a message.

    The message is constructed based on the data carried by the exception. It is also possible to pass the exception message to the client. However, in this case we need to make sure everyone in the team is aware of this and exception messages do not contain sensitive information. Otherwise, we might accidentally leak internal information to the client.

    ErrorResponse is a simple POJO used for JSON serialization:

    public class ErrorResponse {
        private final String code;
        private final String message;
    
        public ErrorResponse(String code, String message) {
            this.code = code;
            this.message = message;
        }
    
        // getter
    }
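    Serialized with default Jackson settings (assuming standard getter-based serialization), the response for our missing article would look roughly like this:

    ```json
    {
        "code": "ARTICLE_NOT_FOUND",
        "message": "No article with id 123 found"
    }
    ```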

    Testing error responses

    A good test suite should not miss tests for specific error responses. In our example we can verify error behaviour in different ways. One way is to use a Spring MockMvc test.

    For example:

    @SpringBootTest
    @AutoConfigureMockMvc
    public class ArticleExceptionHandlerTest {
    
        @Autowired
        private MockMvc mvc;
    
        @MockBean
        private ArticleRepository articleRepository;
    
        @Test
        public void articleNotFound() throws Exception {
            when(articleRepository.findById(new ArticleId("123"))).thenReturn(Optional.empty());
    
            mvc.perform(post("/articles/123/publish"))
                    .andExpect(status().isNotFound())
                    .andExpect(jsonPath("$.code").value("ARTICLE_NOT_FOUND"))
                    .andExpect(jsonPath("$.message").value("No article with id 123 found"));
        }
    }


    Here, we use a mocked ArticleRepository that returns an empty Optional for the passed id. We then verify that the error code and message match the expected strings.

    In case you want to learn more about testing Spring applications with MockMvc: I recently wrote an article showing how to improve MockMvc tests.

    Summary

    Useful error messages are an important part of an API.

    If errors are produced by the client (HTTP 4xx status codes), servers should provide a descriptive error response containing at least an error code and a human-readable error message. Responses for unexpected server errors (HTTP 5xx) should be conservative to avoid accidentally exposing internal information.

    To provide useful error responses we can use specific exceptions that carry related data. Within @ExceptionHandler methods we then construct error messages based on the exception data.

  • Monday, 3 May, 2021

    Supporting bulk operations in REST APIs

    Bulk (or batch) operations are used to perform an action on more than one resource in a single request. This can help reduce networking overhead. For network performance it is usually better to make fewer large requests than many small ones.

    However, before adding support for bulk operations you should think twice about whether this feature is really needed. Often network performance is not what limits request throughput. You should also consider techniques like HTTP pipelining as an alternative way to improve performance.

    When implementing bulk operations we should differentiate between two different cases:

    • Bulk operations that group together many arbitrary operations in one request. For example: Delete product with id 42, create a user named John and retrieve all product-reviews created yesterday.
    • Bulk operations that perform one operation on different resources of the same type. For example: Delete the products with id 23, 45, 67 and 89.

    In the next section we will explore different solutions that can help us in both situations. Be aware that the solutions shown might not look very REST-like. Bulk operations in general are not very compatible with REST constraints, as we operate on different resources with a single request. So there simply is no real REST solution.

    In the following examples we will always return a synchronous response. However, as bulk operations usually take longer to process it is likely you are also interested in an asynchronous processing style. In this case, my post about asynchronous operations with REST might also be interesting to you.

    Expressing multiple operations within the request body

    Probably a way that comes to mind quickly is to use a standard data format like JSON to define a list of desired operations.

    Let's start with a simple example request:

    POST /batch
    
    [
        {
            "path": "/products",
            "method": "post",
            "body": {
                "name": "Cool Gadget",
                "price": "$ 12.45 USD"
            }
        }, {
            "path": "/users/43",
            "method": "put",
            "body": {
                "name": "Paul"
            }
        },
        ...
    ]

    We use a generic /batch endpoint that accepts a simple JSON format to describe desired operations using URIs and HTTP methods. Here, we want to execute a POST request to /products and a PUT request to /users/43.
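    On the server side, each entry of this format could be mapped to a simple DTO. The following record is a sketch; its field names simply mirror the JSON keys shown above:

    ```java
    import java.util.Map;

    // Sketch of a DTO for one entry of the generic batch format.
    // "path", "method" and "body" mirror the JSON keys from the example.
    public record BatchEntry(String path, String method, Map<String, Object> body) { }
    ```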

    A response body for the shown request might look like this:

    [
        {
            "path": "/products",
            "method": "post",
            "body": {
                "id": 123,
                "name": "Cool Gadget",
                "price": "$ 12.45 USD"
            },
            "status": 201
        }, {
            "path": "/users/43",
            "method": "put",
            "body": {
                "id": 43,
                "name": "Paul"
            },
            "status": 200
        },
        ...
    ]

    For each requested operation we get a result object containing the URI and HTTP method again. Additionally we get the status code and response body for each operation.

    This does not look too bad. In fact, APIs like this can be found in practice. Facebook, for example, uses a similar approach to batch multiple Graph API requests.

    However, there are some things to consider with this approach:

    How are the desired operations executed on the server side? Maybe they are implemented as simple method calls. It is also possible to construct real HTTP requests from the JSON data and then process those requests. In this case, it is important to think about request headers, which might contain information required by the processing endpoint (e.g. authentication tokens).

    Headers in general are missing in this example. However, headers might be important. For example, it is perfectly viable for a server to respond to a POST request with HTTP 201 and an empty body (see my post about resource creation). The URI of the newly created resource is usually transported using a Location header. Without access to this header the client might not know how to look up the newly created resource. So think about adding support for headers in your request format.

    In the example we assume that all requests and responses use JSON bodies, which might not always be the case (think of file uploads, for example). As an alternative we can define the request body as a string, which gives us more flexibility. In this case, we need to escape JSON double quotes, which can be awkward to read.

    An example request that includes headers and uses a string body might look like this:

    [
        {
            "path": "/users/43",
            "method": "put",
            "headers": [{ 
                "name": "Content-Type", 
                "value": "application/json"
            }],
            "body": "{ \"name\": \"Paul\" }"
        },
        ...
    ]

    Multipart Content-Type to the rescue?

    In the previous section we essentially translated HTTP requests and responses to JSON so we can group them together in a single request. However, we can do the same in a more standardized way with multipart content-types.

    A multipart Content-Type header indicates that the HTTP message body consists of multiple distinct body parts and each part can have its own Content-Type. We can use this to merge multiple HTTP requests into a single multipart request body.

    A quick note before we look at an example: My example snippets for HTTP requests and responses are usually simplified (unnecessary headers, HTTP versions, etc. might be skipped). However, in the next snippet we pack HTTP requests into the body of a multipart request requiring correct HTTP syntax. Therefore, the next snippets use the exact HTTP message syntax.

    Now let's look at an example multipart request containing two HTTP requests:

     1  POST http://api.my-cool-service.com/batch HTTP/1.1
     2  Content-Type: multipart/mixed; boundary=request_delimiter
     3  Content-Length: <total body length in bytes>
     4
     5  --request_delimiter
     6  Content-Type: application/http
     7  Content-ID: fa32d92f-87d9-4097-9aa3-e4aa7527c8a7
     8
     9  POST http://api.my-cool-service.com/products HTTP/1.1
    10  Content-Type: application/json
    11
    12  {
    13      "name": "Cool Gadget",
    14      "price": "$ 12.45 USD"
    15  }
    16  --request_delimiter
    17  Content-Type: application/http
    18  Content-ID: a0e98ffb-0b62-42a1-a321-54c6e9ef4c99
    19
    20  PUT http://api.my-cool-service.com/users/43 HTTP/1.1
    21  Content-Type: application/json
    22
    23  {
    24    "section": "Section 2"
    25  }
    26  --request_delimiter--

    Multipart content types require a boundary parameter. This parameter specifies the so-called encapsulation boundary which acts like a delimiter between different body parts.

    Quoting the RFC:

    The encapsulation boundary is defined as a line consisting entirely of two hyphen characters ("-", decimal code 45) followed by the boundary parameter value from the Content-Type header field.

    In line 2 we set the Content-Type to multipart/mixed with a boundary parameter of request_delimiter. The blank line after the Content-Length header separates HTTP headers from the body. The following lines define the multipart request body.

    We start with the encapsulation boundary indicating the beginning of the first body part. The body part headers follow. Here, we set the Content-Type header of the body part to application/http, which indicates that this body part contains an HTTP message. We also set a Content-ID header, which can be used to identify a specific body part. We use a client-generated UUID for this.

    The next blank line (line 8) indicates that the actual body part begins (in our case, the embedded HTTP request). The first body part ends with the encapsulation boundary on line 16.

    After the encapsulation boundary, follows the next body part which uses the same format as the first one.

    Note that the encapsulation boundary following the last body part contains two additional hyphens at the end, which indicates that no further body parts follow.
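    To make the boundary rules concrete, here is a small sketch (not a complete multipart parser) that splits a body into its parts using exactly these rules; MultipartSplitter is a made-up helper name:

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Pattern;

    // Sketch: split a multipart body by its encapsulation boundary.
    // A real parser must also handle boundaries at line starts only,
    // preamble/epilogue text and transport padding.
    public class MultipartSplitter {

        public static List<String> split(String body, String boundary) {
            String delimiter = "--" + boundary;
            List<String> parts = new ArrayList<>();
            for (String part : body.split(Pattern.quote(delimiter))) {
                String trimmed = part.strip();
                // skip the empty prefix before the first boundary and
                // the trailing "--" left over from the final boundary
                if (!trimmed.isEmpty() && !trimmed.equals("--")) {
                    parts.add(trimmed);
                }
            }
            return parts;
        }
    }
    ```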

    A response to this request might follow the same principle and look like this:

     1  HTTP/1.1 200
     2  Content-Type: multipart/mixed; boundary=response_delimiter
     3  Content-Length: <total body length in bytes>
     4
     5  --response_delimiter
     6  Content-Type: application/http
     7  Content-ID: fa32d92f-87d9-4097-9aa3-e4aa7527c8a7
     8
     9  HTTP/1.1 201 Created
    10  Content-Type: application/json
    11  Location: http://api.my-cool-service.com/products/123
    12
    13  {
    14      "id": 123,
    15      "name": "Cool Gadget",
    16      "price": "$ 12.45 USD"
    17  }
    18  --response_delimiter
    19  Content-Type: application/http
    20  Content-ID: a0e98ffb-0b62-42a1-a321-54c6e9ef4c99
    21  
    22  HTTP/1.1 200 OK
    23  Content-Type: application/json
    24
    25  {
    26      "id": 43,
    27      "name": "Paul"
    28  }
    29  --response_delimiter--

    This multipart response body contains two body parts, each containing an HTTP response. Note that the first body part also contains a Location header, which should be included when sending an HTTP 201 (Created) response.

    Multipart messages seem like a nice way to merge multiple HTTP messages into a single one, as they use a standardized and generally understood technique.

    However, there is one big caveat here. Clients and the server need to be able to construct and process the actual HTTP messages in raw text format. Usually this functionality is hidden behind HTTP client libraries and server side frameworks and might not be easily accessible.

    Bulk operations on REST resources

    In the previous examples we used a generic /batch endpoint that can modify many different types of resources in a single request. Now we will apply bulk operations to a specific set of resources to move a bit closer to a REST-like style.

    Sometimes only a single operation needs to support bulk data. In such a case, we can simply create a new resource that accepts a collection of bulk entries.

    For example, assume we want to import a couple of products with a single request:

    POST /product-import
    
    [
        {
            "name": "Cool Gadget",
            "price": "$ 12.45 USD"
        },
        {
            "name": "Very cool Gadget",
            "price": "$ 19.99 USD"
        },
        ...
    ]

    A simple response body might look like this:

    [
        {
            "status": "imported",
            "id": 234235
        },
        {
            "status": "failed",
            "error": "Product name too long, max 15 characters allowed"
        },
        ...
    ]

    Again we return a collection containing details about every entry. As we provide a response to a specific operation (importing products), there is no need to use a generic response format. Instead, we can use a specific format that communicates the import status and potential import errors.

    Partially updating collections

    In a previous post we learned that PATCH can be used for partial modification of resources. PATCH can also use a separate format to describe the desired changes.

    Both sound useful for implementing bulk operations. By using PATCH on a resource collection (e.g. /products) we can partially modify the collection. We can use this to add new elements to the collection or update existing elements.

    For example we can use the following snippet to modify the /products collection:

    PATCH /products
    
    [
        {
            "action": "replace",
            "path": "/123",
            "value": {
                "name": "Yellow cap",
                "description": "It's a cap and it's yellow"
            }        
        },
        {
            "action": "delete",
            "path": "/124"
        },
        {
            "action": "create",
            "value": {
                "name": "Cool new product",
                "description": "It is very cool!"
            }
        }
    ]

    Here we perform three operations on the /products collection in a single request. We update resource /products/123 with new information, delete resource /products/124 and create a completely new product.

    A response might look something like this:

    [
        {
            "action": "replace",
            "path": "/123",
            "status": "success"
        }, 
        {
            "action": "delete",
            "path": "/124",
            "status": "success"
        }, {
            "action": "create",
            "status": "success"
        }
    ]

    Here we need to use a generic response entry format again, as it needs to be compatible with all possible request actions.
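    On the server, dispatching such entries could look roughly like the following sketch. The class name and in-memory store are hypothetical; a real implementation would validate input and talk to a database:

    ```java
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical dispatcher for the "replace" / "delete" / "create"
    // actions of the collection-patch format, backed by an in-memory map.
    public class ProductPatcher {

        private final Map<String, Map<String, Object>> products = new HashMap<>();
        private int nextId = 125;

        public String apply(String action, String path, Map<String, Object> value) {
            switch (action) {
                case "replace":
                    products.put(path, value);
                    return "success";
                case "delete":
                    // deleting an unknown path is reported as failed
                    return products.remove(path) != null ? "success" : "failed";
                case "create":
                    products.put("/" + nextId++, value);
                    return "success";
                default:
                    return "failed";
            }
        }
    }
    ```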

    However, there is a huge caveat: PATCH requires changes to be applied atomically.

    The RFC says:

    The server MUST apply the entire set of changes atomically and never provide [..] a partially modified representation. If the entire patch document cannot be successfully applied, then the server MUST NOT apply any of the changes.

    I usually would not recommend implementing bulk operations in an atomic way, as this can increase complexity a lot.

    A simple workaround to be compatible with the HTTP specifications is to create a separate sub-resource and use POST instead of PATCH.

    For example:

    POST /products/batch 
    

    (same request body as the previous PATCH request)

    If you really want to go the atomic way, you might need to think about the response format again. In this case, it is not possible that some requested changes are applied while others are not. Instead, you need to communicate which requested changes failed and which would have been applied had everything else succeeded.

    In this case, a response might look like this:

    [
        {
            "action": "replace",
            "path": "/123",
            "status": "rolled back"
        }, 
        {
            "action": "delete",
            "path": "/124",
            "status": "failed",
            "error": "resource not found"
        },
        ...
    ]

    Which HTTP status code is appropriate for responses to bulk requests?

    With bulk requests we have the problem that some parts of the request might execute successfully while others fail. If everything worked, it is easy: we can simply return HTTP 200 (OK).

    Even if all requested changes fail it can be argued that HTTP 200 is still a valid response code as long as the bulk operation itself completed successfully.

    Either way, the client needs to process the response body to get detailed information about the processing status.

    Another idea that might come to mind is HTTP 207 (Multi-Status). HTTP 207 is part of RFC 4918 (HTTP Extensions for WebDAV) and is described like this:

    A Multi-Status response conveys information about multiple resources in situations where multiple status codes might be appropriate. [..] Although '207' is used as the overall response status code, the recipient needs to consult the contents of the multistatus response body for further information about the success or failure of the method execution. The response MAY be used in success, partial success and also in failure situations.

    So far this reads like a great fit.

    Unfortunately, HTTP 207 is part of the WebDAV specification and requires a specific response body format that looks like this:

    <?xml version="1.0" encoding="utf-8" ?>
    <d:multistatus xmlns:d="DAV:">
        <d:response>
            <d:href>http://www.example.com/container/resource3</d:href>
            <d:status>HTTP/1.1 423 Locked</d:status>
            <d:error><d:lock-token-submitted/></d:error>
        </d:response>
    </d:multistatus>

    This is likely not the response format you want. Some might argue that it is fine to reuse HTTP 207 with a custom response format. Personally, I would not recommend doing this and would instead use a simple HTTP 200 status code.

    In case the bulk request is processed asynchronously, HTTP 202 (Accepted) is the status code to use.

    Summary

    We looked at different approaches of building bulk APIs. All approaches have different up- and downsides. There is no single correct way as it always depends on your requirements.

    If you need a generic way to submit multiple actions in a single request you can use a custom JSON format. Alternatively you can use a multipart content-type to merge multiple requests into a single request.

    You can also come up with separate resources that express the desired operation. This is usually the simplest and most pragmatic way if only one or a few operations need to support bulk data.

    In all scenarios you should evaluate if bulk operations really produce the desired performance gains. Otherwise, the additional complexity of bulk operations is usually not worth the effort.


    Interested in more REST related articles? Have a look at my REST API design page.

  • Thursday, 8 April, 2021

    Looking into the JDK 16 vector API

    JDK 16 comes with the incubator module jdk.incubator.vector (JEP 338) which provides a portable API for expressing vector computations. In this post we will have a quick look at this new API.

    Note that the API is in incubator status and likely to change in future releases.

    Why vector operations?

    When supported by the underlying hardware, vector operations can increase the number of computations performed in a single CPU cycle.

    Assume we want to add two vectors each containing a sequence of four integer values. Vector hardware allows us to perform this operation (four integer additions in total) in a single CPU cycle. Ordinary additions would only perform one integer addition in the same time.

    The new vector API allows us to define vector operations in a platform agnostic way. These operations then compile to vector hardware instructions at runtime.

    Note that HotSpot already supports auto-vectorization which can transform scalar operations into vector hardware instructions. However, this approach is quite limited and utilizes only a small set of available vector hardware instructions.

    A few example domains that might benefit from the new vector API are machine learning, linear algebra or cryptography.

    Enabling the vector incubator module (jdk.incubator.vector)

    To use the new vector API we need to use JDK 16 (or newer). We also need to add the jdk.incubator.vector module to our project. This can be done with a module-info.java file:

    module com.mscharhag.vectorapi {
        requires jdk.incubator.vector;
    }

    Implementing a simple vector operation

    Let's start with a simple example:

    float[] a = new float[] {1f, 2f, 3f, 4f};
    float[] b = new float[] {5f, 8f, 10f, 12f};
    
    FloatVector first = FloatVector.fromArray(FloatVector.SPECIES_128, a, 0);
    FloatVector second = FloatVector.fromArray(FloatVector.SPECIES_128, b, 0);
    
    FloatVector result = first
            .add(second)
            .pow(2)
            .neg();

    We start with two float arrays (a and b) each containing four elements. These provide the input data for our vectors.

    Next we create two FloatVectors using the static fromArray(..) factory method. The first parameter defines the vector species, which fixes the size of the vector in bits (here: 128). The last parameter defines an offset into the passed array (here: 0).

    In Java a float value has a size of four bytes (= 32 bits). So, four float values match exactly the size of our vector (128 bits).

    After that, we can define our vector operations. In this example we add both vectors together, then we square and negate the result.

    The resulting vector contains the values:

    [-36.0, -100.0, -169.0, -256.0]
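    The same result can be verified with a plain scalar loop; the vector pipeline above computes -(a[i] + b[i])^2 for each lane:

    ```java
    // Scalar equivalent of the vector pipeline: add, square, negate.
    public class ScalarCheck {

        public static float[] addPowNeg(float[] a, float[] b) {
            float[] result = new float[a.length];
            for (int i = 0; i < a.length; i++) {
                float sum = a[i] + b[i];
                result[i] = -(sum * sum); // pow(2) followed by neg()
            }
            return result;
        }
    }
    ```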

    We can write the resulting vector into an array using the intoArray(..) method:

    float[] resultArray = new float[4];
    result.intoArray(resultArray, 0);

    In this example we use FloatVector to define operations on float values. Of course we can use other numeric types too. Vector classes are available for byte, short, int, long, float and double values (ByteVector, ShortVector, IntVector, LongVector, etc.).

    Working with loops

    While the previous example was simple to understand it does not show a typical use case of the new vector API. To gain any benefits from vector operations we usually need to process larger amounts of data.

    In the following example we start with three arrays a, b and c, each having 10000 elements. We want to add the values of a and b and store them in c: c[i] = a[i] + b[i].

    Our code looks like this:

    final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_128;
    
    float[] a = randomFloatArray(10_000);
    float[] b = randomFloatArray(10_000);
    float[] c = new float[10_000];
    
    for (int i = 0; i < a.length; i += SPECIES.length()) {
        VectorMask<Float> mask = SPECIES.indexInRange(i, a.length);
        FloatVector first = FloatVector.fromArray(SPECIES, a, i, mask);
        FloatVector second = FloatVector.fromArray(SPECIES, b, i, mask);
        first.add(second).intoArray(c, i, mask);
    }

    Here we iterate over the input arrays in strides of the vector length. A VectorMask helps us when a vector cannot be completely filled from the input data (e.g. during the last loop iteration).
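    In scalar terms, the mask corresponds to clamping the last stride to the array length. The following sketch mirrors the loop above without the vector API (a stride of 4 matches the float lane count of SPECIES_128):

    ```java
    // Scalar mirror of the masked vector loop: process the arrays in
    // strides and clamp the final stride, which is what the mask does.
    public class StridedAdd {

        static final int STRIDE = 4; // float lane count of a 128-bit species

        public static void add(float[] a, float[] b, float[] c) {
            for (int i = 0; i < a.length; i += STRIDE) {
                int upper = Math.min(i + STRIDE, a.length); // clamp last stride
                for (int j = i; j < upper; j++) {
                    c[j] = a[j] + b[j];
                }
            }
        }
    }
    ```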

    Summary

    We can use the new vector API to define vector operations that optimize computations for vector hardware. This way we can increase the number of computations performed in a single CPU cycle. The central elements of the vector API are the type-specific vector classes like FloatVector or LongVector.

    You can find the example source code on GitHub.

  • Monday, 8 March, 2021

    Kotlin dependency injection with Koin

    Dependency injection is a common technique in today's software design. With dependency injection we pass dependencies to a component instead of creating them inside the component. This way we can separate the construction and use of dependencies.

    In this post we will look at Koin, a lightweight Kotlin dependency injection library. Koin describes itself as a DSL, a light container and a pragmatic API.

    Getting started with Koin

    We start with adding the Koin dependency to our project:

    <dependency>
        <groupId>org.koin</groupId>
        <artifactId>koin-core</artifactId>
        <version>2.2.2</version>
    </dependency>

    Koin artifacts are available on jcenter.bintray.com. If it is not already configured, you can add this repository with:

    <repositories>
        <repository>
            <id>central</id>
            <name>bintray</name>
            <url>https://jcenter.bintray.com</url>
        </repository>
    </repositories>
    

    Or if you are using Gradle:

    repositories {
        jcenter()    
    }
    
    dependencies {
        compile "org.koin:koin-core:2.2.2"
    }
    

    Now let's create a simple UserService class with a dependency on an AddressValidator object:

    class UserService(
        private val addressValidator: AddressValidator
    ) {
        fun createUser(username: String, address: Address) {
            // use addressValidator to validate address before creating user
        }
    }

    AddressValidator simply looks like this:

    class AddressValidator {
        fun validate(address: Address): Boolean {
            // validate address
        }
    }

    Next we will use Koin to wire both components together. We do this by creating a Koin module:

    val myModule = module {
        single { AddressValidator() }
        single(createdAtStart = true) { UserService(get()) }
    }

    This creates a module with two singletons (defined by the single function). single accepts a lambda expression as parameter that is used to create the component. Here, we simply call the constructors of our previously defined classes.

    With get() we can resolve dependencies from a Koin module. In this example we use get() to obtain the previously defined AddressValidator instance and pass it to the UserService constructor.

    The createdAtStart option tells Koin to create this instance (and its dependencies) when the Koin application is started.

    We start a Koin application with:

    val app = startKoin {
        modules(myModule)
    }

    startKoin launches the Koin container which loads and initializes dependencies. One or more Koin modules can be passed to the startKoin function. A KoinApplication object is returned.

    Retrieving objects from the Koin container

    Sometimes it is necessary to retrieve objects from the Koin dependency container. This can be done by using the KoinApplication object returned by the startKoin function:

    // retrieve UserService instance from previously defined module
    val userService = app.koin.get<UserService>()

    Another approach is to use the KoinComponent interface. KoinComponent provides an inject method we can use to retrieve objects from the Koin container. For example:

    class MyApp : KoinComponent {
       
        private val userService by inject<UserService>()
    
        ...
    }

    Factories

    Sometimes object creation is not as simple as just calling a constructor. In this case, a factory method can come in handy. Koin's use of lambda expressions for object creation supports us here: we can simply call factory functions from the lambda expression.

    For example, assume the creation of a UserService instance is more complex. We can come up with something like this:

    val myModule = module {
    
        fun provideUserService(addressValidator: AddressValidator): UserService {
            val userService = UserService(addressValidator)
            // more code to configure userService
            return userService
        }
    
        single { AddressValidator() }
        single { provideUserService(get()) }
    }

    As mentioned earlier, single is used to create singletons. This means Koin creates only one object instance that is then shared by other objects.

    However, sometimes we need a new object instance for every dependency. In this case, the factory function helps us:

    val myModule = module {
        factory { AddressValidator() }
        single { UserService(get()) }
        single { OtherService(get()) } // OtherService constructor takes an AddressValidator instance
    }

    With factory, Koin creates a new AddressValidator object whenever an AddressValidator is needed. Here, UserService and OtherService get two different AddressValidator instances via get().

    Providing interface implementations

    Let's assume AddressValidator is an interface that is implemented by AddressValidatorImpl. We can still write our Koin module like this:

    val myModule = module {
        single { AddressValidatorImpl() }
        single { UserService(get()) }
    }

    This defines an AddressValidatorImpl instance that can be injected into other components. However, it is likely that AddressValidatorImpl should only be exposed as the AddressValidator interface. This way we can enforce that other components depend only on AddressValidator and not on a specific interface implementation. We can accomplish this by adding a generic type parameter to the single function:

    val myModule = module {
        single<AddressValidator> { AddressValidatorImpl() }
        single { UserService(get()) }
    }

    This way we expose only the AddressValidator interface while still creating an AddressValidatorImpl instance.

    Properties and configuration

    Obtaining properties from a configuration file is a common task. Koin supports loading property files and gives us the option to inject properties into components.

    First we need to tell Koin to load properties, which is done by using the fileProperties function. fileProperties has an optional fileName argument we can use to specify the path to a property file. If no argument is given, Koin tries to load koin.properties from the classpath.

    For example:

    val app = startKoin {
       
        // loads properties from koin.properties
        fileProperties()
        
        // loads properties from custom property file
        fileProperties("/other.properties")
        
        modules(myModule)
    }

    Assume we have a component that requires some configuration property:

    class ConfigurableComponent(val someProperty: String)

    ... and a koin.properties file with a single entry:

    foo.bar=baz

    We can now retrieve this property and inject it to ConfigurableComponent by using the getProperty function:

    val myModule = module {
        single { ConfigurableComponent(getProperty("foo.bar")) }
    }

    Summary

    Koin is an easy to use dependency injection container for Kotlin. Koin provides a simple DSL to define components and injection rules. We use this DSL to create Koin modules which are then used to initialize the dependency injection container. Koin is also able to inject properties loaded from files.

    For more information you should visit the Koin documentation page. You can find the sources for this post on GitHub.

  • Thursday, 18 February, 2021

    REST API Design: Dealing with concurrent updates

    Concurrency control can be an important part of a REST API, especially if you expect concurrent update requests for the same resource. In this post we will look at different options to avoid lost updates over HTTP.

    Let's start with an example request flow, to understand the problem:

    We start with Alice and Bob requesting the resource /articles/123 from the server, which responds with the current resource state. Then, Bob executes an update request based on the previously received data. Shortly after that, Alice also executes an update request. Alice's request is also based on the previously received resource state and does not include the changes made by Bob. After the server has finished processing Alice's update, Bob's changes are lost.

    HTTP provides a solution for this problem: Conditional requests, defined in RFC 7232.

    Conditional requests use validators and preconditions defined in specific headers. Validators are metadata generated by the server that can be used to define preconditions. For example, last modification dates or ETags are validators that can be used for preconditions. Based on those preconditions the server can decide if an update request should be executed.

    For state changing requests the If-Unmodified-Since and If-Match headers are particularly interesting. We will learn how to avoid concurrent updates using those headers in the next sections.

    Using a last modification date with an If-Unmodified-Since header

    Probably the easiest way to avoid lost updates is the use of a last modification date. Saving the date of last modification for a resource is often a good idea so it is likely we already have this value in our database. If this is not the case, it is often very easy to add.

    When returning a response to the client we can now add the last modification date in the Last-Modified response header. The Last-Modified header uses the following format:

    <day-name>, <day> <month-name> <year> <hour>:<minute>:<second> GMT

    For example:

    Request:

    GET /articles/123

    Response:

    HTTP/1.1 200 OK
    Last-Modified: Sat, 13 Feb 2021 12:34:56 GMT
    
    {
        "title": "Sunny summer",
        "text": "bla bla ..."
    }
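    Such a header value can be produced with java.time's built-in RFC 1123 formatter. A minimal sketch (the class and method names are illustrative):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class LastModifiedHeader {

    // Formats an Instant as a Last-Modified header value.
    // DateTimeFormatter.RFC_1123_DATE_TIME emits exactly the format shown
    // above and prints a zero offset as "GMT".
    public static String format(Instant lastModified) {
        return DateTimeFormatter.RFC_1123_DATE_TIME
                .format(lastModified.atZone(ZoneOffset.UTC));
    }
}
```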

    To update this resource the client now has to add the If-Unmodified-Since header to the request. The value of this header is set to the last modification date retrieved from the previous GET request.

    Example update request:

    PUT /articles/123
    If-Unmodified-Since: Sat, 13 Feb 2021 12:34:56 GMT
    
    {
        "title": "Sunny winter",
        "text": "bla bla ..."
    }

    Before executing the update, the server has to compare the last modification date of the resource with the value from the If-Unmodified-Since header. The update is only executed if both values are identical.

    One might argue that it is enough to check if the last modification date of the resource is newer than the value of the If-Unmodified-Since header. However, this gives clients the option to overrule other concurrent requests by sending a modified last modification date (e.g. a future date).

    A problem with this approach is that the precision of the Last-Modified header is limited to seconds. If multiple concurrent update requests for the same resource are executed within the same second, we can still run into the lost update problem.
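    The server-side check described above can be sketched as follows (names are illustrative; the stored date is truncated to seconds before comparison because the header carries no sub-second precision, and we test for exact equality rather than ordering):

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class IfUnmodifiedSinceCheck {

    // Returns true if the update may be executed: the If-Unmodified-Since
    // value must be exactly equal to the stored last modification date,
    // truncated to second precision.
    public static boolean preconditionHolds(String ifUnmodifiedSince, Instant storedLastModified) {
        Instant headerInstant = ZonedDateTime
                .parse(ifUnmodifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant();
        return headerInstant.equals(storedLastModified.truncatedTo(ChronoUnit.SECONDS));
    }
}
```

    If this check fails, the server responds with HTTP 412 instead of executing the update.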

    Using an ETag with an If-Match header

    Another approach is the use of an entity tag (ETag). ETags are opaque strings generated by the server for the requested resource representation. For example, the hash of the resource representation can be used as ETag.

    ETags are sent to the client using the ETag header. For example:

    Request:

    GET /articles/123

    Response:

    HTTP/1.1 200 OK
    ETag: "a915ecb02a9136f8cfc0c2c5b2129c4b"
    
    {
        "title": "Sunny summer",
        "text": "bla bla ..."
    }

    When updating the resource, the client sends the ETag in an If-Match header back to the server:

    PUT /articles/123
    If-Match: "a915ecb02a9136f8cfc0c2c5b2129c4b"
    
    {
        "title": "Sunny winter",
        "text": "bla bla ..."
    }

    The server now verifies that the ETag matches the current representation of the resource. If the ETag does not match, the resource state on the server has been changed between GET and PUT requests.
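    This verification can be sketched as a simple comparison (a minimal, illustrative version; a full If-Match implementation would also handle comma-separated lists of ETags):

```java
public class IfMatchCheck {

    // Strong comparison as used for state-changing requests: both values
    // must be identical, and neither may be a weak validator (W/ prefix).
    public static boolean preconditionHolds(String ifMatchHeader, String currentEtag) {
        if (ifMatchHeader.equals("*")) {
            return true; // "*" matches any current representation
        }
        if (ifMatchHeader.startsWith("W/") || currentEtag.startsWith("W/")) {
            return false;
        }
        return ifMatchHeader.equals(currentEtag);
    }
}
```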

    Strong and weak validation

    RFC 7232 differentiates between weak and strong validation:

    Weak validators are easy to generate but are far less useful for comparisons. Strong validators are ideal for comparisons but can be very difficult (and occasionally impossible) to generate efficiently.

    Strong validators change whenever a resource representation changes. In contrast, weak validators do not necessarily change with every change of the resource representation.

    ETags can be generated in weak and strong variants. Weak ETags must be prefixed by W/.

    Here are a few example ETags:

    Weak ETags:

    ETag: W/"abcd"
    ETag: W/"123"

    Strong ETags:

    ETag: "a915ecb02a9136f8cfc0c2c5b2129c4b"
    ETag: "ngl7Kfe73Mta"

    Note that the ETag must be placed within double quotes, so the quotes shown are not optional.

    Besides concurrency control, preconditions are often used for caching and bandwidth reduction. In these situations weak validators can be good enough. For concurrency control in REST APIs strong validators are usually preferable.

    Note that using Last-Modified and If-Unmodified-Since headers is considered weak validation because of the limited precision: we cannot be sure whether the server state has been changed by another request within the same second. However, whether this is an actual problem depends on the number of concurrent update requests you expect.

    Computing ETags

    Strong ETags have to be unique for all versions of all representations for a particular resource. For example, JSON and XML representations of the same resource should have different ETags.

    Generating and validating strong ETags can be a bit tricky. For example, assume we generate an ETag by hashing a JSON representation of a resource before sending it to the client. To validate the ETag for an update request we now have to load the resource, convert it to JSON and then hash the JSON representation.

    In the best case resources contain an implementation-specific field that tracks changes. This can be a precise last modification date or some form of internal revision number. For example, when using database frameworks like Java Persistence API (JPA) with optimistic locking we might already have a version field that increases with every change.

    We can then compute an ETag by hashing the resource id, the media-type (e.g. application/json) together with the last modification date or the revision number.
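    A minimal sketch of this approach (class, method and parameter names are illustrative; MD5 is chosen here only for brevity, any hash function works):

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class EtagSupport {

    // Derives a strong ETag from the resource id, the media type and a
    // revision number. Any change to the revision yields a different ETag,
    // and different representations of the same resource get different ETags.
    public static String computeEtag(String resourceId, String mediaType, long revision) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            String input = resourceId + "|" + mediaType + "|" + revision;
            byte[] hash = md5.digest(input.getBytes(StandardCharsets.UTF_8));
            // ETags must be wrapped in double quotes
            return "\"" + String.format("%032x", new BigInteger(1, hash)) + "\"";
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```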

    HTTP status codes and execution order

    When working with preconditions, two HTTP status codes are relevant:

    • 412 - Precondition failed indicates that one or more preconditions evaluated to false on the server (e.g. because the resource state has been changed on the server)
    • 428 - Precondition required has been added in RFC 6585 and indicates that the server requires the request to be conditional. The server should return this status code if an update request does not contain an expected precondition

    RFC 7232 also defines the evaluation order for HTTP 412 (Precondition failed):

    [..] a recipient cache or origin server MUST evaluate received request preconditions after it has successfully performed its normal request checks and just before it would perform the action associated with the request method.  A server MUST ignore all received preconditions if its response to the same request without those conditions would have been a status code other than a 2xx (Successful) or 412 (Precondition Failed).  In other words, redirects and failures take precedence over the evaluation of preconditions in conditional requests.

    This usually results in the following processing order of an update request:

    Before evaluating preconditions, we check if the request fulfills all other requirements. When this is not the case, we respond with a standard 4xx status code. This way we make sure that other errors are not suppressed by the 412 status code.
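    This order can be sketched as follows (a minimal, framework-agnostic sketch; the individual checks and status codes are illustrative):

```java
public class UpdateHandler {

    // Returns the HTTP status code for an update request, evaluating the
    // precondition only after all normal request checks have passed, so
    // other errors are not masked by 412.
    public static int handleUpdate(boolean authorized, boolean bodyValid,
                                   boolean preconditionPresent, boolean preconditionHolds) {
        if (!authorized) return 403;          // normal request checks first
        if (!bodyValid) return 400;
        if (!preconditionPresent) return 428; // require conditional requests
        if (!preconditionHolds) return 412;   // resource changed in the meantime
        // perform the actual update here
        return 200;
    }
}
```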


    Interested in more REST related articles? Have a look at my REST API design page.