Peter Birkholm-Buch

Freelance Cloud Architect

Tag: Azure

Azure Cosmos DB

These are my thoughts and key takeaways from working with Azure Cosmos DB for a while now.

Infinite Ease

Creating a database, importing some data and querying it through the Data Explorer takes literally 5 minutes.

Creating an interface using Azure Logic Apps with the Cosmos connector or a proper API using Azure Functions with bindings and PowerShell is an additional 5 minutes.

Expose the API through the Azure API Management Gateway and you have a complete schema-less database and data access layer in 15 minutes.

Cosmos is hands down the easiest and most accessible database technology I’ve ever worked with – period!

Infinite Scale

Cosmos is built to be infinitely scalable in terms of storage and performance.

Data is stored in storage partitions in chunks of 10 GB and Cosmos automatically adds more storage partitions as needed. It works great and is seamless – the only gotcha is that Cosmos for some reason won’t scale storage partitions back to fewer than two, which in most cases probably doesn’t matter much, but it is an issue with regard to cost. See more below.

Performance is measured in a fixed unit called Request Units per second, or RUs. 1 RU is required to read 1 KB of data and 5 RUs are required to write 1 KB of data. So on the surface calculating how many RUs you need should be easy. It turns out it isn’t, and misjudging your RUs can be costly or make you lose data.
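As a rough illustration of that "on the surface" calculation, here is a minimal Python sketch that uses only the rule of thumb above (1 RU per KB read, 5 RUs per KB written). The function name and the workload numbers are made up for the example, and real RU charges also depend on things like indexing and query shape, so treat the result as a lower bound rather than a sizing tool.

```python
import math

# Rule-of-thumb costs taken from the text above; real charges vary.
READ_RU_PER_KB = 1
WRITE_RU_PER_KB = 5


def estimate_rus(reads_per_second, writes_per_second, document_kb):
    """Naive RU/s estimate for a steady workload of equally sized documents."""
    read_rus = reads_per_second * document_kb * READ_RU_PER_KB
    write_rus = writes_per_second * document_kb * WRITE_RU_PER_KB
    return math.ceil(read_rus + write_rus)


# Hypothetical workload: 200 reads/s and 50 writes/s of 2 KB documents.
estimate = estimate_rus(reads_per_second=200, writes_per_second=50, document_kb=2)
print(estimate)  # 200*2*1 + 50*2*5 = 900
```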

Infinite Cost

This is where infinite scale turns into infinite cost. On the surface it only seems fair that infinite scale should have a price – but controlling cost in Cosmos is difficult and completely different from what you have come to expect of the Cloud.

It’s all about those RUs.

You allocate a fixed amount of RUs either to an entire database (probably can’t recommend doing that) or to individual collections – and then you pay for those fixed RUs whether you use them or not!

Fine you think – how much can it cost in terms of RUs to write and read some data? Quite a lot in fact and there are several factors which have great influence over how many RUs are required to write and read data. I highly recommend reading this and this but it all boils down to the partition key.

The partition key is used to distribute data over logical data partitions when Cosmos writes data. The less data you write to the same logical partition at the same time – the more performant the write is.

When reading data, the partition key can be used in a query to tell Cosmos where the data is stored, which makes the read more performant. If you don’t know the partition key in a query then Cosmos has to read through all the data that fits the WHERE clause and that drives RUs – because reading 1 KB costs 1 RU – so reading through 1000s of KBs of data to find that single document can cost a lot of RUs. So if you have a lot of queries without a partition key and lots of spikes then you have to provision lots of RUs and pay for them 24/7.
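To make the difference concrete, here is a toy Python model of the idea. It is not how Cosmos works internally (it ignores indexing entirely) and all names and data are invented for the example. It simply charges 1 RU per KB scanned and compares a lookup that knows the partition key with one that does not.

```python
from collections import defaultdict

READ_RU_PER_KB = 1  # rule of thumb from the text


def build_partitions(documents, partition_key):
    """Group documents into logical partitions by the value of partition_key."""
    partitions = defaultdict(list)
    for doc in documents:
        partitions[doc[partition_key]].append(doc)
    return partitions


def query_cost(partitions, predicate, partition_value=None):
    """Return (matches, RUs) for a scan of one partition or of all of them."""
    if partition_value is not None:
        candidates = partitions.get(partition_value, [])
    else:
        candidates = [doc for docs in partitions.values() for doc in docs]
    scanned_kb = sum(doc["size_kb"] for doc in candidates)
    matches = [doc for doc in candidates if predicate(doc)]
    return matches, scanned_kb * READ_RU_PER_KB


# Hypothetical data set: 1000 documents of 1 KB spread over 10 customers.
documents = [{"id": i, "customer": f"c{i % 10}", "size_kb": 1} for i in range(1000)]
partitions = build_partitions(documents, "customer")


def wanted(doc):
    return doc["id"] == 42


_, cost_with_key = query_cost(partitions, wanted, partition_value="c2")
_, cost_without_key = query_cost(partitions, wanted)
print(cost_with_key, cost_without_key)  # 100 vs 1000 in this toy model
```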

Unfortunately there’s no auto scaling for RUs available in Cosmos even though the competitor seems to have one. However, recent additions to Azure Monitor make it possible to create your own.

If you have spikes and you run out of provisioned RUs then Cosmos will throttle the requests by sending back an HTTP 429 error message. It’s then up to the client to know how to handle this and perform a retry. If the client doesn’t know how, then it’s an error and the data is probably lost. Please be aware of the retries!
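A minimal sketch of what "knowing how to retry" can look like on the client side is shown below. It does not use the Cosmos SDK: `ThrottledError`, `with_retries` and `make_flaky_write` are hypothetical names for this example, and the flaky writer only simulates a throttled service. In a real client the wait time would typically come from the `x-ms-retry-after-ms` response header that accompanies a 429.

```python
import time


class ThrottledError(Exception):
    """Stand-in for an HTTP 429 response from the database."""

    def __init__(self, retry_after_seconds):
        super().__init__("HTTP 429: request rate too large")
        self.retry_after_seconds = retry_after_seconds


def with_retries(operation, max_attempts=5, sleep=time.sleep):
    """Call operation(); on throttling, wait as instructed and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottledError as error:
            if attempt == max_attempts:
                raise  # give up and surface the error instead of losing it silently
            sleep(error.retry_after_seconds)


def make_flaky_write(throttle_count):
    """Simulated write that is throttled throttle_count times, then succeeds."""
    state = {"calls": 0}

    def write():
        state["calls"] += 1
        if state["calls"] <= throttle_count:
            raise ThrottledError(retry_after_seconds=0.01)
        return "written"

    return write


result = with_retries(make_flaky_write(throttle_count=2))
print(result)
```

The important design choice is the last `raise`: when the retries are exhausted the caller still gets an error it can log or queue, rather than the write quietly disappearing.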

Infinite Possibilities

Once you’ve tackled the scaling, cost and partition keys and you start to use Cosmos with the change feed and hook it up to the event grid – you get a completely new publish and subscribe data layer capable of replacing integration middleware, ingesting data at IoT scale and driving real time data analysis with Spark and Databricks.

Cosmos is a game changer!

Azure Functions

These are my thoughts and key takeaways from working with Azure Functions for a while now.

Creating a Function in the Azure Portal is easy as Pie

Creating your first Function in the Azure Portal is a simple process and you can use pretty much any language you want. I prefer PowerShell for prototyping and management stuff – like reacting to events where I have to fire some PowerShell command to handle something in Azure – and I use JavaScript/TypeScript for the heavier programming tasks like creating “real” solutions.

As Microsoft adds support for more languages the possibilities of using serverless Functions will extend to other areas. Recently Python was GA’ed (see below) and as soon as PowerShell for Functions is GA’ed “DevOps”-people can become Azure Function Developers too. Functions is not only for web services and databases but for all things serverless! Think Flow/Logic Apps -> Functions written in PowerShell which do IT management operations.

Great Developer Experience

The developer experience for creating, running and testing Functions locally is just perfect. You can do everything completely locally and even offline – just like any other local development stack/platform/toolchain.

Local development in Visual Studio Code on a Mac

Then push changes to a central repo like Azure DevOps where CI/CD pipelines can build and deploy the Function to Azure completely automated.

On the left building the Function deployment package and on the right deploying the package to Azure

If you don’t have Azure DevOps then Functions can pull in code from pretty much any cloud reachable Git repo – it’s completely cross platform.

There is complete support for Visual Studio and Visual Studio Code and lots of other editors for creating Functions so that writing, editing, testing, debugging and so on is a first class experience.

Triggers & Bindings

Functions can be triggered or react to a range of builtin sources like Azure Event Grid, Service Bus, Cosmos DB and so on – and of course HTTP requests. So calling and activating Functions is really easy.

Bindings are a way of declaratively receiving the input from a trigger or other resources and passing the output of the Function to a receiver – like Azure Cosmos DB, storage or Service Bus. This makes it very easy to react to events – get and process data and output the result with very little friction. Moving data in and out of Cosmos is almost like magic (Please note that more complicated usage of Cosmos requires the use of the SDK!).

For me it has completely replaced the necessity to create Web APIs in .NET or Node and deploy to Azure Web App. If you’re doing web services today using the regular technology platforms and you want to move to Azure I would recommend looking into Functions rather than Web Apps for hosting web services.

Proxies & API Management Gateway

Function Proxies are like a miniature API Management Gateway (APIM) which can route URL based requests to methods in your Function or even other Functions if you’re scaling out at the implementation level.

For instance a proxy could route requests to /api/shipments to the actual implementation in the “GetShipments” method.
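Conceptually this is just a route table in front of the implementation. The Python sketch below illustrates the idea only. It is not how Function Proxies are configured, and `get_shipments`, `ROUTES` and `route` are hypothetical names standing in for the “GetShipments” method and the proxy.

```python
# Hypothetical backend standing in for the "GetShipments" implementation.
def get_shipments():
    return {"shipments": []}


# Route table: public URL path -> implementing function.
ROUTES = {"/api/shipments": get_shipments}


def route(path):
    """Return (status, body) for a request path, like a very small proxy."""
    handler = ROUTES.get(path)
    if handler is None:
        return 404, None
    return 200, handler()


status, body = route("/api/shipments")
print(status, body)
```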

I also really like the actual APIM and the integration with Functions, but updating the API specification in the APIM when a Function is updated is a bit of a pain. I would never expose a Function to the internet just through the URL or even Proxies. I would always use APIM as the front door.

Automated Scaling

You can deploy using either App Plan or Consumption Plan and unless you have very specific requirements (or extremely high load) I can’t think of a reason not to choose Consumption Plan and just let Azure handle everything.

Some of our APIs are hammered in the morning and Azure just scales the number of “servers” up in seconds and scales back down again when things settle. We haven’t missed a single call yet on the Consumption Plan, whereas we did miss calls on the App Plan because we ran out of horsepower during that single “it will never happen” freak influx of data.

Azure Logic Apps

This is not a guide or any kind of introduction to Logic Apps but my thoughts and key takeaways from working with them for a while now.

Events & Connectors

Logic Apps are great for reacting to events – especially Azure events – and I usually use them to build up the outer and more general logic of a system. This means reacting to things that happen and maybe comparing values against thresholds and other configuration settings. It’s easy to call out to other services in Azure and Office as well as external APIs.

As a general rule I don’t use Logic Apps to update data – I prefer to use Azure Functions for that, as SDKs usually support more advanced retries and error handling. However, some connectors support this too – your mileage may vary.

The connectors really are the stars of Logic Apps – you can receive and act on data from pretty much any resource in the Microsoft ecosystem and most enterprise systems that support Azure Active Directory have connectors too.

Developer Support

Creating Logic Apps using the visual designer in the Portal and Visual Studio is a breeze – Visual Studio Code still only supports editing the underlying JSON document but will display the visual designer in read-only mode. Actual development using a repo and Azure DevOps CI/CD is a bit clunky and deployment requires an ARM template to be built using a script. Better support for storing the “code” for a Logic App in a code repository and for deployment is something I’m hoping will be added in the future.

I’ve seen integration developers use Logic Apps to quickly and easily build integration pipelines using HTTP, data parsing, conditions and service bus but they get stuck when it comes to CI/CD.

No deployment slots

Logic Apps don’t support deployment slots so a service window of some sort is required when deploying in higher load scenarios. Since we’re exposing our Logic Apps through the Azure API Management Gateway we mock the requests to our Logic Apps during deployments. This is certainly not ideal and I hope that Logic Apps will get some form of deployment slot capability in the future.

Scalability

Logic Apps on paper support reacting to and handling 1000s of events and requests per minute (we’ve done that too – just be aware that the normal limit is 100,000 requests per 5 minutes). However, if you’re not careful then “long” running activations during high load can cause your entire App to freeze, and a manual restart can become required. If that happens then you have to start thinking about your usage of Logic Apps: the flows, branching, error handling and perhaps whether moving to an Azure Function is a better fit for what you’re trying to accomplish.

Setting roles and policies on Azure KeyVault to enable getting secrets from an Azure Function

This is just a follow up on this post: https://azure.microsoft.com/en-us/blog/simplifying-security-for-serverless-and-web-apps-with-azure-functions-and-app-service/ for a bit more clarification on the roles and policies required in KeyVault to make this work.

To enable an Azure Function to access secrets in KeyVault you have to do the following:

  1. Create a system assigned managed identity for the Function. Just go to Platform Features in your Function, select Identity and enable the System assigned identity – remember to click save!
  2. Give the role “Managed Applications Reader” to the managed identity of the Function.
    1. Go to KeyVault and click on “Access control (IAM)” in the menu and click on “Role assignments”.
    2. If you want to see existing App/Functions that have assignments select “App Services or Function Apps” from the Type dropdown menu.
    3. Click on Add and select the “Managed Applications Reader” from the Role drop down menu.
    4. Type in the name of your Function and select it from the menu – make sure you select the actual identity of the Function – see the icon in the screenshot below.
    5. Click on save.
      Adding roles to KeyVault
  3. Assign at least the “Get Secret” policy to the service principal of the managed identity.
    1. Click on “Access policies” in the KeyVault menu.
    2. Click on Add and click on the “Select principal” fly out menu and type in the name of your Function
    3. This time it’s the service principal we want to select – click on it and click on select
    4. From the “Secret permissions” drop down menu select at least the “Get” permission.
    5. Click on OK.
      Adding policies to KeyVault

You’re done and you should now be able to get secrets from your KeyVault in your application settings in your Function.
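For reference, an application setting that points at a secret uses the `@Microsoft.KeyVault(SecretUri=...)` reference syntax, and the Function then sees the resolved secret as an ordinary environment variable. The Python sketch below builds such a reference and reads a setting defensively. The vault name, secret name and setting name are placeholders, and `key_vault_reference` and `read_setting` are hypothetical helpers written for this example, not part of any SDK.

```python
import os

REFERENCE_PREFIX = "@Microsoft.KeyVault("


def key_vault_reference(vault_name, secret_name, secret_version=""):
    """Build the app setting value that points at a KeyVault secret.

    An empty secret_version leaves the trailing slash, meaning "latest version".
    """
    secret_uri = (
        f"https://{vault_name}.vault.azure.net/secrets/{secret_name}/{secret_version}"
    )
    return f"{REFERENCE_PREFIX}SecretUri={secret_uri})"


def read_setting(name, environ=os.environ):
    """Read an app setting and fail loudly if the reference was not resolved."""
    value = environ.get(name)
    if value is None:
        raise KeyError(f"app setting {name!r} is not defined")
    if value.startswith(REFERENCE_PREFIX):
        # Seeing the literal reference usually means the identity or policy is wrong.
        raise RuntimeError(f"KeyVault reference for {name!r} was not resolved")
    return value


# Placeholder names for illustration only.
reference = key_vault_reference("myvault", "mysecret")
print(reference)
```

If `read_setting` raises the `RuntimeError`, the role assignment and access policy steps above are the first things to recheck.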

Payroll Services Firm Transforms its Product Into a Platform with API Management

Bluegarden, a large Scandinavian payroll service, is using Microsoft Azure API Management to gain a simple, efficient, security-enabled way to share application programming interfaces (APIs) with partners. By publishing APIs for key product capabilities, Bluegarden can create an extensible product to fit every customer’s needs, expand its partners and consequently its business, and help keep APIs secure.