Azure, Azure Government, Technical

Azure Event Hubs vs AWS Kinesis

With Amazon and Microsoft being the main providers of cloud-based telemetry ingestion services, I wanted to do a feature and price comparison between the two. If nothing else, this info should help you understand each service’s capabilities and perhaps decide which service is best for your needs. I realize if you’re on AWS you’re probably going to use Kinesis and if you’re on Azure you’re probably going to use Event Hubs. But at least arm yourself with all the info before diving in!

Two caveats to this info worth noting:

  1. Yes, I work for Microsoft. I did not fudge these numbers or any of the info to paint a nicer picture for Azure. This info is factual based on my research into both services.
  2. Cloud services and their pricing change, so these specs and pricing are current as of the date of this post and you should re-check on Azure or AWS to verify.

This is a purely objective comparison focused on service specs. I’m not going to get into the usability of either service, programming efficiency, portal experiences, or anything else like that. Just numbers. Notice there are a couple of question marks on the AWS side, because I couldn’t find the info in the Kinesis documentation and the folks I asked didn’t know. If you can help fill in those gaps, or notice some of this has changed, please let me know in the comments.

 

|  | Event Hubs | AWS Kinesis |
|---|---|---|
| Input Capacity | 1MB/s per Throughput Unit (TU) | 1MB/s per Shard |
| Output Capacity | 2MB/s per TU | 2MB/s per Shard |
| Events/s | 1K | 1K |
| Latency | 50ms Avg, 99th % < 100ms | 10s min |
| Protocol | HTTPS or AMQP 1.0 | HTTPS |
| Max Message Size | 256KB | 1MB |
| Included Storage | 84GB per TU | ?? (none?) |
| Max Consumers | 1 Consumer Group (Basic Tier), 20 Consumer Groups (Standard Tier) | ?? (only limited by output capacity?) (See <Update 6/1/2016> below) |
| Monitoring | Built in portal metrics or REST API | CloudWatch |
| Message Retention | 24 hrs (up to 7 days) | 24 hrs (up to 7 days) |
| Price per Hour | $0.015/TU Basic Tier, $0.030/TU Standard Tier | $0.015/Shard |
| Price per Million Units | $0.028 Basic & Standard (64KB/unit) | $0.014 (25KB/unit) |
| Extended Data Retention Price | Only if stored event size exceeds 84GB * #TUs, $0.024/GB (assuming LRS) | $0.020/Shard hour |
| Region used for pricing | East US | US East |
| Throughput Flexibility | Adjust TUs as needed | Adjust Shards as needed |
| Supported Regions | 18 (plus GovCloud) | 9 |

<Update 6/1/2016> Turns out the answer to Max Consumers for Kinesis isn’t exactly straightforward due to its dependency on HTTP(S), as was pointed out to me after I published this post in February. Kinesis is limited to 5 read transactions per second per shard, so your max consumers depends on how you spread those transactions across your consumers. If you have five consumers each reading once per second, five is your max. Since output is capped at 2MB/s per shard, you can read up to that capacity in each transaction, but you have to design your consumers to work within those limits. Additional info on this Stack Overflow thread.</Update 6/1/2016>
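
To make that consumer math concrete, here is a rough Python sketch. It uses the two per-shard limits quoted above (5 read transactions per second and 2MB/s of output). The function names are made up for illustration, and the even split of bandwidth across consumers is my simplifying assumption, not something Kinesis enforces.

```python
import math

# Per-shard limits quoted in the post.
READS_PER_SECOND_PER_SHARD = 5
OUTPUT_MB_PER_SECOND_PER_SHARD = 2.0


def max_consumers(polls_per_second_per_consumer: float) -> int:
    """Most consumers one shard can serve if each polls at the given rate."""
    return math.floor(READS_PER_SECOND_PER_SHARD / polls_per_second_per_consumer)


def fair_share_mb_per_second(consumer_count: int) -> float:
    """Bandwidth per consumer, assuming the shard's output is split evenly."""
    return OUTPUT_MB_PER_SECOND_PER_SHARD / consumer_count


print(max_consumers(1.0))           # each consumer polls once per second
print(max_consumers(0.5))           # each consumer polls every two seconds
print(fair_share_mb_per_second(5))  # MB/s each of five consumers can expect
```

Polling less often leaves room for more consumers, but every consumer still shares the same 2MB/s of output per shard.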

To compare pricing, I’m using the sample from AWS. In case they change their sample, here is the sample the below numbers are based on:

“Let’s assume that our data producers put 100 records per second in aggregate, and each record is 35KB. In this case, the total data input rate is 3.4MB/sec (100 records/sec*35KB/record). For simplicity, we assume that the throughput and data size of each record are stable and constant throughout the day.”

Kinesis Pricing Sample

|  | Kinesis |
|---|---|
| Shards | 4 |
| Shard cost/month (31 days) | $44.64 |
| PUT cost/month | $7.50 |
| Total | $52.14 |
| Extended Retention Cost | $59.52 |
| Total w/Extended Retention | $111.66 |
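
For anyone who wants to check the Kinesis numbers, here is a small Python sketch of the arithmetic. It uses the prices from the comparison table and the AWS sample workload (100 records per second at 35KB each). The variable names are mine, and I am assuming 1MB = 1024KB and a 31-day month.

```python
import math

HOURS_PER_MONTH = 24 * 31                 # the sample prices a 31-day month
SECONDS_PER_MONTH = HOURS_PER_MONTH * 3600

records_per_second = 100
record_kb = 35

# Shards needed for ingest at 1MB/s per shard (assuming 1MB = 1024KB).
input_mb_per_second = records_per_second * record_kb / 1024
shards = math.ceil(input_mb_per_second / 1.0)

# PUT payload units are 25KB each, so a 35KB record counts as two units.
units_per_record = math.ceil(record_kb / 25)
put_units = records_per_second * units_per_record * SECONDS_PER_MONTH

shard_cost = shards * 0.015 * HOURS_PER_MONTH      # $0.015 per shard hour
put_cost = put_units / 1_000_000 * 0.014           # $0.014 per million units
extended_cost = shards * 0.020 * HOURS_PER_MONTH   # $0.020 per shard hour
total = shard_cost + put_cost

print(f"Shards: {shards}")
print(f"Shard cost/month: ${shard_cost:.2f}")
print(f"PUT cost/month: ${put_cost:.2f}")
print(f"Total: ${total:.2f}")
print(f"Total w/Extended Retention: ${total + extended_cost:.2f}")
```

The 25KB unit size is what doubles the PUT count here: each 35KB record is billed as two units.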

Event Hubs Pricing Sample

 

|  | Basic | Standard |
|---|---|---|
| TU | 4 | 4 |
| TU cost/month (31 days) | $44.64 | $89.28 |
| PUT cost/month | $7.50 | $7.50 |
| Total | $52.14 | $96.78 |
| Extended Retention Cost* | N/A | $47.24 |
| Total w/Extended Retention | N/A | $144.02 |

* Extended storage only available on Standard tier
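
The Event Hubs totals can be checked the same way. This Python sketch uses the table prices and the same sample workload. The names are mine, and `extra_storage_cost` is only an illustration of the "exceeds 84GB * #TUs" rule with made-up storage sizes. The stored volume behind the $47.24 figure is not stated here, so the sketch does not try to reproduce it.

```python
import math

HOURS_PER_MONTH = 24 * 31
SECONDS_PER_MONTH = HOURS_PER_MONTH * 3600
TU_PRICE_PER_HOUR = {"Basic": 0.015, "Standard": 0.030}

throughput_units = 4          # same count as the Kinesis shards
records_per_second = 100
record_kb = 35

# Billing units are 64KB each, so a 35KB record counts as a single unit.
units_per_record = math.ceil(record_kb / 64)
ingress_units = records_per_second * units_per_record * SECONDS_PER_MONTH
ingress_cost = ingress_units / 1_000_000 * 0.028   # $0.028 per million units


def monthly_total(tier: str) -> float:
    """TU cost plus ingress cost for one 31-day month."""
    return throughput_units * TU_PRICE_PER_HOUR[tier] * HOURS_PER_MONTH + ingress_cost


def extra_storage_cost(stored_gb: float, tus: int, price_per_gb: float = 0.024) -> float:
    """Charge only for storage beyond the 84GB included with each TU."""
    included_gb = 84 * tus
    return max(0.0, stored_gb - included_gb) * price_per_gb


print(f"Basic total: ${monthly_total('Basic'):.2f}")
print(f"Standard total: ${monthly_total('Standard'):.2f}")
print(f"Extra storage at 300GB: ${extra_storage_cost(300, 4):.2f}")  # hypothetical size
print(f"Extra storage at 500GB: ${extra_storage_cost(500, 4):.2f}")  # hypothetical size
```

Half as many billing units at twice the price per million is why the PUT cost comes out the same as on Kinesis for this workload.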

Results

On the pricing side, I found it interesting that they are the exact same price, unless you need extended retention and have to bump up to the Standard tier on Event Hubs. Comparing the specs, the items that jump out for me that might impact a decision are latency (Event Hubs blows away Kinesis), protocol (no AMQP on Kinesis), max message size (Kinesis is quite a bit larger), the size of a pricing unit (64KB for Event Hubs and 25KB for Kinesis), and the number of regions. Whichever service you choose to go with, hopefully this info helps make the decision a bit easier.


Using Event Hub and EventProcessorHost on Azure Government

There are a few needs that apply to almost every industry when it comes to building software and solutions. Manufacturing, healthcare, industrial, education, home automation, military and public safety (to name a few) all need to collect data from hundreds, thousands or millions of data sources and bring that data together, either to report on it as a whole or to send it somewhere else. For example, consider a government agency responsible for monitoring rainfall and temperature across an entire country. It would be great if that agency could set up a few thousand monitoring stations around the country and have those stations report their sensor data to a central location for aggregation, where the agency can begin to see trends across various regions and across given time spans. That is quite a bit more reliable, and much closer to real time, than sending a worker out to each station to collect data and bring it back manually to a data center.

In order to manage the intake and processing of what could be billions of pieces of data per day we will need a scalable and efficient hub for all of the sources to talk to at once. Using architecture speak, we need a durable event stream collection service. Azure Event Hub was built to support these types of use cases and perform as the event stream collection service that sits in the middle of our Internet of Things (IoT) architecture. Once we get our environmental sensors set up to send their data to Event Hub, we can easily scale that service to support the thousands of devices we need and begin building really powerful reporting solutions that utilize the ingested data.

To see what an actual Event Hub implementation would start to look like on Azure Government, where it was recently released (as of the date of this post) along with all other Azure regions, let’s start by setting up a simple Event Hub service using a single instance of EventProcessorHost following the instructions on the Azure documentation site. For the most part, using Event Hubs in Azure Government is the same as any other Azure region. However, since the endpoint for Azure Government is usgovcloudapi.net instead of windows.net for many other Azure regions, the sample needs to be modified a bit. Creating the Event Hub and storage account is exactly the same, shown in the screenshots below choosing the USGov Iowa region:

Creating the Event Hub

Creating the Storage Account

Creating the sender client is the same as shown in the example, as well. The small tweak we need to make is on the receiver, which references the storage account we created previously since EventProcessorHost utilizes a storage account when processing messages. Notice the URL for the storage endpoint in Azure Government is *.core.usgovcloudapi.net. When you create the EventProcessorHost in the receiver application, the default behavior of the class is to assume you are using a storage account located in the *.core.windows.net domain. This means if you run the sample as-is (with your Event Hub and Storage Account info, of course), you will get an error:

Since my Storage Account was named “rkmeventhubstorage”, the default behavior is to create a URI of rkmeventhubstorage.blob.core.windows.net. Obviously, that doesn’t exist. I need a URI of rkmeventhubstorage.blob.core.usgovcloudapi.net. What now?

Diving into the source for Microsoft.ServiceBus.Messaging.EventProcessorHost, you’ll see (or just save your time and trust me) that the blob client is created using the CloudStorageAccount class. Looking at the documentation for that class, you won’t see anything to help get that endpoint updated (as of the writing of this post.) Turns out there’s an undocumented property for EndpointSuffix. Bingo. All you need to do is add a property for EndpointSuffix to use core.usgovcloudapi.net and the stars will align. Here is the full Main method for the Receiver application, showing the use of the EndpointSuffix property.

// Requires: using System; using Microsoft.ServiceBus.Messaging;

// Event Hub connection details. Note the usgovcloudapi.net endpoint for Azure Government.
string eventHubConnectionString = "Endpoint=sb://rkmeventhub-ns.servicebus.usgovcloudapi.net/;SharedAccessKeyName=ReceiveRule;SharedAccessKey={YourSharedAccessKey}";
string eventHubName = "rkmeventhub";

// Storage account that EventProcessorHost uses for leases and checkpoints.
string storageAccountName = "rkmeventhubstorage";
string storageAccountKey = "{YourStorageAccountKey}";

// Without EndpointSuffix, the storage client assumes core.windows.net.
string endpointSuffix = "core.usgovcloudapi.net";
string storageConnectionString = string.Format(
    "DefaultEndpointsProtocol=https;AccountName={0};AccountKey={1};EndpointSuffix={2}",
    storageAccountName, storageAccountKey, endpointSuffix);

string eventProcessorHostName = Guid.NewGuid().ToString();
EventProcessorHost eventProcessorHost = new EventProcessorHost(eventProcessorHostName, eventHubName, EventHubConsumerGroup.DefaultGroupName, eventHubConnectionString, storageConnectionString);

// SimpleEventProcessor is assumed to be the IEventProcessor class from the documentation sample.
eventProcessorHost.RegisterEventProcessorAsync<SimpleEventProcessor>().Wait();

Console.WriteLine("Receiving. Press enter key to stop worker.");
Console.ReadLine();

After adding that property, your Receiver will be able to receive the messages successfully.
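
To see why that one property matters, here is a small Python sketch of how the suffix ends up in the blob endpoint. The two helper functions are hypothetical, written only for this illustration, and are not part of any Azure SDK. The account name is the one from my example and the key is a placeholder.

```python
def storage_connection_string(account_name: str, account_key: str,
                              endpoint_suffix: str = "core.windows.net") -> str:
    """Build a storage connection string with an explicit EndpointSuffix."""
    return ("DefaultEndpointsProtocol=https;"
            f"AccountName={account_name};AccountKey={account_key};"
            f"EndpointSuffix={endpoint_suffix}")


def blob_endpoint(account_name: str, endpoint_suffix: str = "core.windows.net") -> str:
    """Blob URI that a client would derive from the account name and suffix."""
    return f"https://{account_name}.blob.{endpoint_suffix}"


# Default suffix: the URI points at public Azure, where this account does not exist.
print(blob_endpoint("rkmeventhubstorage"))

# Azure Government suffix: the URI matches where the account actually lives.
print(blob_endpoint("rkmeventhubstorage", "core.usgovcloudapi.net"))
print(storage_connection_string("rkmeventhubstorage", "{YourStorageAccountKey}",
                                "core.usgovcloudapi.net"))
```

The only difference between the failing and working configurations is the suffix appended after the account name.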


Get Started with PowerShell on Azure Government

Many folks using Azure Government probably have a subscription or two on public Azure. If you’re bouncing between environments and using PowerShell on each, it could become cumbersome to switch between them. This post shows a method that I’ve found to be easy to implement and simple to switch between environments. As a footnote, this can also be used to set up multiple environments beyond Azure Government, such as on-premises Azure.

If you do nothing after installing the Azure PowerShell modules and then run Get-AzureEnvironment, you’ll get two results (as of this posting): AzureCloud and AzureChinaCloud. So the first thing we need to do is add another environment for Azure Government. After that, we’ll use the certificate method to connect to our subscription. I prefer this method for three reasons:

  1. I can use this same certificate for my other subscriptions, allowing me to easily switch between them on the same machine
  2. Azure Government doesn’t support using Azure AD (Add-AzureAccount), at least based on my experiences (see edit below)
  3. Using a publishing settings file may work, but honestly I haven’t spent time using this method to see if it works or works as well as using a certificate

Ok, let’s add that new local environment. Run the following Posh command (I included line breaks for readability):

Add-AzureEnvironment -Name "AzureGovernment" `
  -PublishSettingsFileUrl "https://manage.windowsazure.us/publishsettings/index?client=xplat" `
  -ServiceEndpoint "https://management.core.usgovcloudapi.net" `
  -ManagementPortalUrl "https://manage.windowsazure.us" `
  -StorageEndpoint "core.usgovcloudapi.net" `
  -ActiveDirectoryEndpoint "https://login.windows.net/" `
  -ActiveDirectoryServiceEndpointResourceId "https://management.core.usgovcloudapi.net/"

Feel free to change the -Name parameter value to whatever you want to use as this is a local environment name, but leave the rest as-is. And don’t forget the trailing slash on -ActiveDirectoryServiceEndpointResourceId or you’ll get an error when authenticating.

Now let’s create a local certificate. Open up a Visual Studio command prompt or other CLI that supports makecert and run:

makecert -sky exchange -r -n "CN=<YourCertName>" -pe -a sha1 -len 2048 -ss My "c:\temp\<YourCertName>.cer"

For a reference on how to do that, look here: https://msdn.microsoft.com/en-us/library/azure/gg551722.aspx

Once that cert is created, you need to add it to your subscription in Azure Government.

  1. Navigate to https://manage.windowsazure.us and log in
  2. At the bottom of the left navigation, click on “Settings”
  3. Click on “Management Certificates”
  4. At the bottom of the screen, click on “Upload” and choose the .cer file you created earlier and stored in c:\temp, then upload the file

Once the certificate has been added, you can now add a new subscription entry using the Azure environment and certificate previously created. First, you need to grab some configuration values:

$subId = "<YourSubscriptionId>"
$thumbprint = "<YourCertificateThumbprint>"
$cert = Get-Item Cert:\CurrentUser\My\$thumbprint
$localSubName = "<LocalSubscriptionName>"
$environmentName = "AzureGovernment"

<YourSubscriptionId> can be copied from the Management Certificates screen where you uploaded your certificate. Double click the value next to your cert and it will highlight the entire value so you can copy it, although it won’t show the entire value. You can expand the width of the column if you’d like to see the entire value (that was recently added).

<YourCertificateThumbprint> can be copied from the same location under the Thumbprint column.

<LocalSubscriptionName> is a local name you will use to refer to this subscription, so use a name that makes sense to you. Maybe “ProdAzureGovernment”, as an example.

For environmentName, use the same name you used earlier when creating the local Azure Environment. If you kept my default, the name will be “AzureGovernment”.

Now run the following (I included line breaks for readability):

Set-AzureSubscription -SubscriptionName $localSubName `
  -SubscriptionId $subId -Certificate $cert -Environment $environmentName

If all went well, you’re all set! To see your local subscriptions, run Get-AzureSubscription. You should see your new ProdAzureGovernment subscription (or whatever you called it) along with any other subscriptions you already had configured, if any. You will also see which one is default and also current. The one flagged as default will be used by default when you first fire up PowerShell. The one marked current is what you’re currently hitting when you run commands against your subscription. You can change which subscription is default and current by running Select-AzureSubscription and passing in the desired config.

Assuming you have one subscription called “MSDN” and another called “ProdAzureGovernment”, within the same PowerShell window you can switch between them by simply running Select-AzureSubscription.

Select-AzureSubscription "MSDN" -Current
Get-AzureVM

This shows all VMs on your MSDN subscription.

Select-AzureSubscription "ProdAzureGovernment" -Current
Get-AzureVM

This shows all VMs on your Azure Government subscription.

If you have your Azure Government subscription set to current and then run Get-AzureSubscription, you may receive an error stating “The given key was not present in the dictionary.” I’m not sure what the cause of this is, but all other commands I’ve run against the subscription have succeeded just fine. If I figure that out I’ll post an update.

It’s just that simple! Hope that helps. As always, if you have any questions or suggestions please post a comment.

<EDIT>Thanks to a tip from my colleague Keith Mayer, I discovered why I couldn’t get Azure AD to work. My previous script for Add-AzureEnvironment was missing the -ActiveDirectoryEndpoint parameter, which is kind of important. After adding that to the environment definition I was able to use Azure AD and the Add-AzureAccount cmdlet to authenticate against Azure Government. Yeah! This is actually the preferred method going forward as opposed to using a certificate.</EDIT>