Sunday, April 9, 2017

C#: Web API cache, ETag and 304 (Not Modified) status code (1/2)

Part 1: The Presentation

Some time ago I was assigned to a project to set up a Web API (C#, .NET 4.5). The service exposes a lot of data behind each resource, and clients request it very often.
One of the requirements was to avoid sending data back to clients when they already have the most recent version of it.



Here is the spec:
Scenario: First time request
    Given I have a data stock available on a REST resource
    When the client requests data for the first time
    Then the full data stock is sent back to the client
Scenario: Requesting again for unchanged data
    Given I have a data stock available on a REST resource
    When the client requests data for the second time
        And the data stock didn't change since the last client request
    Then the server informs the client that it has the latest data stock
        And the data stock is not sent back to the client
Scenario: Requesting again for changed data
    Given I have a data stock available on a REST resource
    When the client requests data for the third time
        And the data stock has changed since the last client request
    Then the full data stock is sent back to the client

The point here is that we need a way to track which version of the data each client has, so that we can compare it with the current version on the server side and decide whether we need to send the new data back.

There are ways to handle this with a service cache or with HTTP caching, and a very nice implementation is provided by CacheCow.

The idea is to use an HTTP header called ETag.

This tag is a string value that can be associated with each REST resource. The server sends this tag when replying to clients, in an HTTP header called ETag. Then clients, when requesting the same resource again, need to send the previously received tag value inside the dedicated HTTP header If-None-Match.

Yes, from server to client use the ETag header, from client to server use If-None-Match; that's just how it is.

That way, the server can use the incoming ETag value to know whether fresh data needs to be sent or whether the client already has the most recent version.

This is a nice way to save bandwidth: clients don't need to receive and reprocess a big amount of data they already have. But clients need to be aware of this mechanism: if they don't populate If-None-Match, then we didn't solve our problem.
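
To make the exchange concrete, here is a minimal sketch of the decision the server takes. It assumes the ETag is derived from the response body with a SHA-256 hash, which is only one common choice (and, as we will see, not what CacheCow does); the names ConditionalGet, ComputeETag and Decide are made up for this illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ConditionalGet
{
    // Hypothetical helper: builds a quoted ETag from the response body (SHA-256, hex).
    public static string ComputeETag(string body)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(body));
            return "\"" + BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant() + "\"";
        }
    }

    // Returns the status code the server would answer with:
    // 304 when the client's If-None-Match equals the current ETag, 200 otherwise.
    public static int Decide(string currentBody, string ifNoneMatch)
    {
        string currentETag = ComputeETag(currentBody);
        return string.Equals(currentETag, ifNoneMatch, StringComparison.Ordinal) ? 304 : 200;
    }
}
```

With this sketch, a first request (no If-None-Match) gets 200, a second request carrying the tag of unchanged data gets 304, and a request carrying a stale tag gets 200 again, which matches the three scenarios of the spec.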

I'll let you read more about the details of ETag out there...

This post is about the CacheCow implementation... well, not only, but...

Show me the demo!

From now on I'm assuming that you have already done the appropriate reading about HTTP caching, ETag and CacheCow, OK?

Let's get into action then!

I'm using a very simple Web API project, based on the Visual Studio template.

CacheCow is very easy to set up: just a NuGet package to install, a handler registration, and we are running!
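
For reference, the registration could look like the sketch below. It assumes the CacheCow.Server NuGet package from the 1.x line, where CachingHandler takes the HttpConfiguration in its constructor; other versions may differ, so check the README of the version you install.

```csharp
using System.Web.Http;
using CacheCow.Server;

public static class WebApiConfig
{
    public static void Register(HttpConfiguration config)
    {
        // Enables the [Route] attributes used by the controller.
        config.MapHttpAttributeRoutes();

        // Assumption: CacheCow.Server 1.x constructor taking the configuration.
        // The handler sits in the pipeline in front of the controllers.
        var cachingHandler = new CachingHandler(config);
        config.MessageHandlers.Add(cachingHandler);
    }
}
```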

I'll do my tests with some basic code, as follows:

   using System.Web.Http;

   public class ValuesController : ApiController
   {
     // GET api/values/5
     [HttpGet]
     [Route("api/values/{id}")]
     public string Get(int id)
     {
          // Stands in for heavy data loaded from the database.
          var data = "here is where we load heavy data from database to send back to client.";
          return $"value request: {id}. {data} {data} {data} {data} {data} {data} {data} {data}";
     }
   }

Run the project and do a GET request on "/api/values/5".

We get HTTP status 200 OK and about 1 KB of data. The value requested in the URL is present in the data (value request: {id}), just to be sure we are consistent with the test code. All good so far.

If we check the headers we can find:

ETag: "01797092b6dc4e8f85a3a379bdd2c495"

Now we take the ETag value, put it in the If-None-Match header, and send a GET request for the same resource again.

Well, we get status code 304 (Not Modified), 337 bytes received and no data in the body! Awesome!

Note that the ETag value must be passed surrounded by double quotes, like a string.
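
If the client is written in C#, the same request can be built with HttpClient's header types. This is a minimal sketch: the class name ConditionalRequest and the localhost URL in the usage note are assumptions for illustration. EntityTagHeaderValue expects the double quotes to be part of the tag string.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

public static class ConditionalRequest
{
    // Builds a GET request carrying the previously received ETag in If-None-Match.
    public static HttpRequestMessage Build(string url, string quotedETag)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        // The tag must include its surrounding double quotes,
        // otherwise the constructor throws a FormatException.
        request.Headers.IfNoneMatch.Add(new EntityTagHeaderValue(quotedETag));
        return request;
    }
}
```

For example, `ConditionalRequest.Build("http://localhost/api/values/5", "\"01797092b6dc4e8f85a3a379bdd2c495\"")` gives a request that can be passed to `HttpClient.SendAsync`.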

Second round: let's try to simulate some data changes on the server side. Now the controller will generate some random data on each request, as follows:

     // GET api/values/5  
     [HttpGet]  
     [Route("api/values/{id}")]  
     public string Get(int id)  
     {  
         var r = new Random();
         var data = $"{r.Next(0, 10)} {r.Next(0, 10)} {r.Next(0, 10)} {r.Next(0, 10)}";  
         return $"value request: {id}. {data}";  
     }  
   

The generated data consists of a string with the request id from the URL and 4 random integers from 0 to 9 (the upper bound of Random.Next is exclusive).

I do a GET request on "/api/values/5" twice (just to be sure the data is random) and I get:

"value request: 5. 1 8 9 2"
"value request: 5. 6 4 7 0"

What if we take a look at the ETag received? If you are actually testing with a controller that returns random data, you will see that the ETag value is always the same for all requests, even if the data is different.
(If you are just reading, then believe me: the ETag doesn't change even if the data does.)

Now, the question... why??? New data should give me a new ETag value!

Let's test it another way: put the ETag value in the If-None-Match header and do the request again...

Why status 304 and no data coming?? It is random data! It is different every time, so I should always receive it!

Let's go hunt some cows!

I did tell you to do the reading; it is very clear there: CacheCow doesn't cache data! It only caches resource requests!

Remember how to register CacheCow? It is a message handler that we add to the MessageHandlers pipeline of Web API. That way CacheCow intercepts all requests coming to the Web API before they arrive at the controllers.

When a resource is requested with GET for the first time, CacheCow generates an ETag value for this resource, caches it (in memory or persisted in a database) and gives it to the client.
When the client requests the same resource a second time, passing the previously received ETag, CacheCow compares this value with the ETag kept in its cache for the requested resource... if they match, then 304 is sent back with no data.
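
The behaviour described above can be modelled in a few lines. This is a simplified, hypothetical model written for this post and not CacheCow's actual code: ResourceTagStore and its methods are invented names, and the tag is simply a GUID generated once per resource URL.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified model of a per-resource ETag store.
public class ResourceTagStore
{
    private readonly Dictionary<string, string> _tags = new Dictionary<string, string>();

    // Handles a GET: returns 304 if the client's tag matches the stored one, else 200.
    // The tag is created once per resource and never looks at the response body.
    public int Get(string resource, string ifNoneMatch, out string etag)
    {
        if (!_tags.TryGetValue(resource, out etag))
        {
            etag = "\"" + Guid.NewGuid().ToString("N") + "\"";
            _tags[resource] = etag;
        }
        return etag == ifNoneMatch ? 304 : 200;
    }

    // Handles POST, PUT or DELETE: the stored tag is dropped,
    // so the next GET on the resource receives a new one.
    public void Invalidate(string resource)
    {
        _tags.Remove(resource);
    }
}
```

Because the tag is tied to the resource URL and not to the content, a controller returning random data still produces the same ETag on every GET, and a matching If-None-Match still yields 304, until something calls Invalidate.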



Here is a very clear comment, extracted from the CacheCow source code, file CachingHandler.cs:

     /// <summary>  
     /// This is a function responsible for controlling server's cache expiry  
     /// By default, there is no expiry in the cache items as any change   
     /// to the resource must be done via the HTTP API (using POST, PUT, DELETE).  
     /// But in some cases (usually adding Web API on top of legacy code), data is changed   
     /// in the database (e.g. configuration data) but server would not know.  
     /// In these cases a cache expiry is useful. In this case, CachingHandler uses the   
     /// LastModified to calculate whether cache key must be expired.  
     /// </summary>  
     public Func<HttpRequestMessage, HttpConfiguration, TimeSpan> CacheRefreshPolicyProvider { get; set; }   
   

By default, CacheCow tracks changes to resources based on the Web API itself, so cached ETags have to be invalidated through POST, PUT or DELETE operations.
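
For data that changes behind the API's back, the CacheRefreshPolicyProvider property quoted above can be set when the handler is created. A minimal sketch, assuming the same CacheCow.Server version as in this post; the class name CacheCowConfig and the 30-second interval are arbitrary choices for illustration, and how exactly the handler applies the interval may differ between versions.

```csharp
using System;
using System.Web.Http;
using CacheCow.Server;

public static class CacheCowConfig
{
    // Hypothetical helper name; call it from your Web API startup code.
    public static void Register(HttpConfiguration config)
    {
        var cachingHandler = new CachingHandler(config)
        {
            // Assumption: the same 30-second refresh interval for every request.
            CacheRefreshPolicyProvider = (request, configuration) => TimeSpan.FromSeconds(30)
        };
        config.MessageHandlers.Add(cachingHandler);
    }
}
```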

So, what do we do when data is changed on the backend without any REST request?

CacheCow offers a way to manually invalidate resources. To remove a resource from the cache you can use the InvalidateResource() method on the CachingHandler class, or RemoveResource() from the IEntityTagStore interface (on your IEntityTagStore implementation, for example).
You then need to link your backend code with your Web API in some way to deal with this kind of synchronization.



To see how the CacheCow handler works and do it yourself, let's go to Part 2.
