Is Azure Cosmos DB Really Expensive?

When you mention Azure Cosmos DB in any architecture or solutioning discussion, a likely question you get is “Isn’t Azure Cosmos DB expensive?”. Like anything else in this world, the right but not so useful answer is that it depends. First of all, let us understand how Cosmos DB is priced. Cosmos DB is not priced based on usage. Of course, Cosmos DB is more than a simple NoSQL database but, like other NoSQL databases, the price is based on what you reserve. So, the analogy is renting a car rather than hailing a cab: you pay regardless of whether you used the rented car or not. What you reserve with Cosmos is capacity, which Microsoft measures in Request Units (RU) per second. You pay for the RU as well as the storage space (GB).

At the time I write this, storage costs $0.25 per GB per month, which is typically not a big deal. When it comes to RU, you have to reserve a minimum of 400 (and then increase in increments of 100). What do you get with 400 RU? A read of a 1 KB document with Session consistency consumes 1 RU. A write costs 5 RU. So, with 400 RU, you can do about 400 such reads per second or 80 such writes per second or any mix of the two, roughly speaking. For a more accurate calculation, see Microsoft's documentation on request units. If your app is not this intensive, you are probably paying for more than what you need with Cosmos, since 400 RU is the minimum that you can provision per container.
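To make the "any mix" part concrete, here is a rough sketch of the budget arithmetic. It uses only the two figures quoted above (1 RU per read, 5 RU per write of a 1 KB document); the class and method names are made up for illustration, and real RU charges vary with document size, indexing and consistency level.

```csharp
using System;

public static class RuBudget
{
    // Figures quoted above for a 1 KB document with Session consistency.
    public const int ReadCostRu = 1;
    public const int WriteCostRu = 5;

    // Reads per second that still fit once the given writes per second are paid for.
    public static int ReadsLeft(int provisionedRu, int writesPerSecond)
    {
        int remaining = provisionedRu - writesPerSecond * WriteCostRu;
        return Math.Max(0, remaining / ReadCostRu);
    }

    public static void Main()
    {
        Console.WriteLine(ReadsLeft(400, 0));   // 400: reads only
        Console.WriteLine(ReadsLeft(400, 80));  // 0: writes use the whole budget
        Console.WriteLine(ReadsLeft(400, 50));  // 150: 50 writes leave room for 150 reads
    }
}
```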

HERE COMES THE MOST IMPORTANT POINT. The RU that you provision is per container. So, each container will cost about $25 per month ($23.61, to be exact, with 1 GB storage). If you are a DocumentDB person (SQL API), the container becomes a collection for you. This becomes dangerous when you are a MongoDB person and want to migrate your MongoDB collections to Cosmos collections. If you have 10 collections, you are looking at $236 per month, even if you have only a few kilobytes worth of data in your collections. Bottom line: you have to reuse a collection if you want to save money.
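The $23.61 figure can be reproduced if we assume a throughput rate of $0.008 per 100 RU per hour and 730 billable hours in a month. Neither number is stated above, so treat both as assumptions that happen to match the quoted total; only the $0.25 storage price comes from this post. The names below are illustrative.

```csharp
using System;
using System.Globalization;

public static class ContainerCost
{
    // Assumed rates (not from the article): chosen because they reproduce $23.61.
    public const decimal PricePer100RuPerHour = 0.008m;
    public const decimal HoursPerMonth = 730m;

    // Storage price quoted in the article.
    public const decimal StoragePricePerGb = 0.25m;

    public static decimal Monthly(int provisionedRu, decimal storageGb)
    {
        decimal throughput = provisionedRu / 100m * PricePer100RuPerHour * HoursPerMonth;
        return throughput + storageGb * StoragePricePerGb;
    }

    public static void Main()
    {
        decimal one = Monthly(400, 1m);
        Console.WriteLine(one.ToString("F2", CultureInfo.InvariantCulture));        // 23.61
        Console.WriteLine((one * 10).ToString("F2", CultureInfo.InvariantCulture)); // 236.10
    }
}
```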

One of the major selling points about Cosmos is that “It offers turnkey global distribution across any number of Azure regions by transparently scaling and replicating your data wherever your users are”. That is great but you have to understand that, at the time of writing, an account can have only one write region. Of course, you can have multiple read regions. Now, if you want to build applications in which writers and readers are globally distributed, then you have to have multiple geo-replicated Cosmos DB accounts. What does it mean? Say, you care about three regions. In each region, you will have one writable collection for that region's own data (Local RW) and two read-only replicas of the data synced from the other two regions (Local R), so that all reads and writes stay local.

   +--------+   +-------+  +-------+
   | Local  |   |Local  |  |Local  |
   | RW     |   |R      |  |R      |         Region 1
   |        |   |       |  |       |
   +---+--+-+   +---^---+  +^------+
       |  |         |       |
       |  |         |       |
       |  |         |       |
       |  |         |       |
+------v+ |   +-----+--+    |   +-------+
|  Local| |   | Local  |    |   | Local |    Region 2
|  R    | |   | RW     |    |   | R     |
|       | |   |        |    |   |       |
+-------+ |   +----+---+    |   +^------+
          |        |        |    |
          |        |        |    |
          |        |        |    |
          |        |        |    |
   +------v+  +----v---+  +-+----++
   | Local |  | Local  |  | Local |          Region 3
   | R     |  | R      |  | RW    |
   |       |  |        |  |       |
   +-------+  +--------+  +-------+

With nine collections, you now pay almost nine times as much. You pay for storage and provisioned RU for each container in each region, in addition to the data transfer charges between regions. But then, if you have complex requirements, you have to do complex things and pay more as a result.
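As a back-of-the-envelope sketch of where "nine" comes from: with the layout in the diagram, each region holds one writable copy plus one read replica per other region, so the count grows with the square of the number of regions. The generalization beyond three regions is my extrapolation of the diagram, and the code assumes every copy is billed like the minimum 400 RU, 1 GB collection quoted earlier. Data transfer charges are left out.

```csharp
using System;
using System.Globalization;

public static class GeoCost
{
    // Per-collection monthly figure quoted earlier in the article (400 RU, 1 GB).
    public const decimal PerCollectionMonthly = 23.61m;

    // One writable copy per region plus a read replica of every other region's data.
    public static int CollectionCopies(int regions) => regions * regions;

    // Storage and throughput only; inter-region data transfer is not included.
    public static decimal MonthlyBeforeTransfer(int regions) =>
        CollectionCopies(regions) * PerCollectionMonthly;

    public static void Main()
    {
        Console.WriteLine(CollectionCopies(3)); // 9
        Console.WriteLine(MonthlyBeforeTransfer(3).ToString("F2", CultureInfo.InvariantCulture)); // 212.49
    }
}
```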

Coming back to the reuse of a single collection, let us see how we can do this through a simple solution. The code you see here is only for illustration. Say, I have two types: Customer and Payment and these are C# classes. I can now create documents like so.

using (var client = new DocumentClient(new Uri(endPoint), key))
{
    var c = new Customer()
    {
        Id = "123",
        FirstName = "Bob",
        LastName = "Builder"
    };

    var result = await client.CreateDocumentAsync(
                                UriFactory.CreateDocumentCollectionUri(dbase, colln), c);

    var p = new Payment()
    {
        Id = "P1",
        Method = "Credit Card"
    };

    result = await client.CreateDocumentAsync(
                             UriFactory.CreateDocumentCollectionUri(dbase, colln), p);
}

Now, I want to query for customers like so.

IQueryable<Customer> query = client.CreateDocumentQuery<Customer>(
                        UriFactory.CreateDocumentCollectionUri(dbase, colln), opt);

That will return both the customer and payment documents, since Cosmos knows nothing about the types. All documents are basically JSON data. To make this better, let us create a base class like so.

public class Customer : DocumentBase
{
    ...
}

public class Payment : DocumentBase
{
    ...
}

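public abstract class DocumentBase
{
    // Protected: only derived document types should construct this.
    protected DocumentBase()
    {
        // GetType() returns the runtime (derived) type, e.g. "Customer".
        DocumentType = GetType().Name;
    }

    // Discriminator stored with every document so queries can filter by type.
    public string DocumentType { get; private set; }
}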

Create an extension method CreateDocumentQueryFor to add the type info into the query.

public static class Helper
{
    public static IQueryable<T> CreateDocumentQueryFor<T>(this DocumentClient client, 
                                   Uri documentCollectionUri, 
                                   FeedOptions feedOptions = null) where T : DocumentBase
    {
        return client.CreateDocumentQuery<T>(documentCollectionUri, feedOptions)
                                     .Where(x => x.DocumentType.Equals(typeof(T).Name));
    }
}

Since the documents we already added do not have the new DocumentType to query, add a couple of new Customers. If you now inspect the JSON, you should see the new DocumentType field like so.

{
    "Id": "456",
    "FirstName": "Tom",
    "LastName": "Engine",
    "DocumentType": "Customer",
    "_rid": "QU52ANhuFwAGAAAAAAAAAA==",
    "_self": "dbs/QU52AA==/colls/QU52ANhuFwA=/docs/QU52ANhuFwAGAAAAAAAAAA==/",
    "_etag": "\"01005422-0000-0000-0000-5ac8548e0000\"",
    "_attachments": "attachments/",
    "_ts": 1523078286
}

With that, if you now use the CreateDocumentQueryFor extension method, it should retrieve only those documents corresponding to the type you specified (Customer, in the example below).

IQueryable<Customer> query = client.CreateDocumentQueryFor<Customer>(
                        UriFactory.CreateDocumentCollectionUri(dbase, colln), opt);

To summarize, Cosmos DB could be expensive if used without understanding the rules. But if you know the rules well and play by them, it will be okay. Remember that you need at least one collection, that the minimum RU you can provision per collection is 400, and that this comes to about $25 per month. You cannot go lower than that.

Another point to note: you have to specify the partition key when you create an “Unlimited” collection. In such a case, finding one partition key that suits the different document types in a single collection will be a challenging prospect.
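One possible way around this, shown only as a sketch, is to give every document type the same dedicated partition key property and let each type decide what goes into it. Everything here is hypothetical: the PartitionedDocument class, the PartitionKey property, and the CustomerId field on Payment do not appear in the code above, and the serialization details needed by the SDK are left out.

```csharp
using System;

// Hypothetical variant of the base class with a shared partition key property.
public abstract class PartitionedDocument
{
    protected PartitionedDocument(string partitionKey)
    {
        DocumentType = GetType().Name;
        PartitionKey = partitionKey;
    }

    public string DocumentType { get; private set; }

    // The collection would be created with "/PartitionKey" as its partition key path.
    public string PartitionKey { get; private set; }
}

public class Customer : PartitionedDocument
{
    // A customer is partitioned by its own id.
    public Customer(string id) : base(id) { Id = id; }

    public string Id { get; private set; }
}

public class Payment : PartitionedDocument
{
    // CustomerId is an assumed field; a payment is partitioned by its customer.
    public Payment(string id, string customerId) : base(customerId)
    {
        Id = id;
        CustomerId = customerId;
    }

    public string Id { get; private set; }
    public string CustomerId { get; private set; }
}

public static class Demo
{
    public static void Main()
    {
        var c = new Customer("123");
        var p = new Payment("P1", "123");
        Console.WriteLine(c.PartitionKey == p.PartitionKey); // True: same logical partition
    }
}
```

With this shape, a customer and its payments land in the same logical partition. Whether that is a good distribution depends entirely on your data, so take it as one option rather than a recommendation.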

One thought on “Is Azure Cosmos DB Really Expensive?”

  1. Thank you for the article. One thing I should mention is that you can provision the throughput (RU) at the database level, which will then be shared across all collections.
