An Introduction to Windows Azure (Part 1)

Windows Azure is the Microsoft cloud computing platform, which enables developers to quickly develop, deploy, and manage applications hosted in Microsoft data centers. As a PAAS provider, Windows Azure not only takes care of the infrastructure, but also helps manage higher-level components including operating systems, runtimes, and middleware.

This article will begin by looking at the Windows Azure data centers and will then walk through each of the available services provided by Windows Azure.

Windows Azure Data Centers

Map showing global location of datacenters

Slide 17 of WindowsAzureOverview.pptx (Windows Azure Platform Training Kit)

Microsoft has invested heavily in Windows Azure over the past few years. Six data centers across three continents have been developed to serve millions of customers. They have been built with an optimized power efficiency mechanism, self-cooling containers, and hardware homogeneity, which differentiates them from other data centers.

The data centers are located in the following cities:

  • US North Central – Chicago, IL
  • US South Central – San Antonio, TX
  • West Europe – Amsterdam
  • North Europe – Dublin
  • East Asia – Hong Kong
  • South-East Asia – Singapore

Windows Azure Datacenters- aerial and internal views

Windows Azure data centers are vast and highly sophisticated.

Images courtesy of Microsoft http://azurebootcamp.com

Windows Azure Services

Having seen the data centers, let’s move on to discuss the various services provided by Windows Azure.

Microsoft has previously categorized the Windows Azure Platform into three main components: Windows Azure, SQL Azure, and Windows Azure AppFabric. However, with the recent launch of the Metro-style Windows Azure portal, there are some slight changes to the branding, but the functionality has remained similar.  The following diagram illustrates the complete suite of Windows Azure services available today.

The complete suite of Windows Azure services available today

A. Core Services

1. Compute

The Compute service refers to computation power, usually in the form of provisioned Virtual Machines (VMs). In Windows Azure, the compute containers are often referred to as ‘roles’. At the moment, there are three types of roles:

(i) Web Roles

Web Roles offer a predefined environment, set up to allow developers to easily deploy web applications. The IIS (Internet Information Services) web server comes preinstalled and preconfigured, ready to host your web application.

(ii) Worker Roles

Worker Roles allow the developer to run an application’s background processes that do not require user interface interaction. Worker Roles are perfectly suitable to run processes such as scheduled batch jobs, asynchronous processing, and number crunching jobs.
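To make this concrete, here is a minimal sketch of a Worker Role entry point, assuming the SDK's RoleEntryPoint base class (the class name and sleep interval are illustrative):

using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    // Run() hosts the background processing loop: batch jobs,
    // queue polling, number crunching, and so on.
    public override void Run()
    {
        while (true)
        {
            // do the background work here, e.g. poll a queue for jobs

            Thread.Sleep(10000); // throttle the loop
        }
    }

    public override bool OnStart()
    {
        // one-time initialization before Run() is called
        return base.OnStart();
    }
}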

(iii) VM Roles

VM Roles enable developers to bring their customized Windows Server 2008 R2 VM to the cloud, and configure it. VM Roles are suitable for cases where the prerequisite software requires lengthy, manual installation.

Using VM Roles has one substantial drawback. Unlike Web Roles and Worker Roles, where Windows Azure automatically manages the OS, VM Roles require developers to actively manage the OS themselves.

Apart from ‘roles’, there are two other essential terms, namely ‘VM Size’ and ‘Instance’.

  • VM Size denotes the predefined specifications that Windows Azure offers for the provisioned VM. The following diagram shows various Windows Azure VM Sizes.

Various Windows Azure VM Sizes, and the associated costs

Slide 21 of WindowsAzureOverview.pptx (Windows Azure Platform Training Kit)

  • Instance refers to the actual VM that is provisioned. Developers will need to specify how many instances they need after selecting the VM Size.

Screenshot showing VM size

2. Storage

Windows Azure Storage is a cloud storage service that is highly scalable, durable, and available.

The first step in using Windows Azure Storage is to create a storage account by specifying the storage account name and the region:

Screenshot- creating a storage account

There are four types of storage abstraction available today:

(i) BLOB (Binary Large Object) Storage

Blob Storage provides a highly scalable, durable, and available file system in the cloud. Blob Storage allows customers to store any file type such as video, audio, photos, or text.
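As a quick illustration, here is a minimal sketch of writing a blob with the 1.x StorageClient library (the container name, file name, and connection-string name are assumptions):

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

var account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
var blobClient = account.CreateCloudBlobClient();

// create the container on first use, then upload a local file as a blob
CloudBlobContainer container = blobClient.GetContainerReference("photos");
container.CreateIfNotExist();

CloudBlob blob = container.GetBlobReference("holiday.jpg");
blob.UploadFile(@"C:\temp\holiday.jpg");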

(ii) Table Storage

Table Storage provides structured storage that can be used to store non-relational tabular data. A Table is a set of entities, which contain a set of properties. An application can manipulate the entities and query over any of the properties stored in a Table.
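For illustration, a minimal sketch of defining and inserting an entity with the 1.x StorageClient library (the entity class, table name, and values are hypothetical):

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

// an entity is just a class with PartitionKey/RowKey plus its own properties
public class CustomerEntity : TableServiceEntity
{
    public CustomerEntity() { }
    public CustomerEntity(string region, string id) : base(region, id) { }

    public string Name { get; set; }
}

// elsewhere: create the table and add an entity
var account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
var tableClient = account.CreateCloudTableClient();
tableClient.CreateTableIfNotExist("Customers");

TableServiceContext ctx = tableClient.GetDataServiceContext();
ctx.AddObject("Customers", new CustomerEntity("asia", "001") { Name = "Wely" });
ctx.SaveChanges();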

(iii) Queue Storage

Queue Storage is a reliable and persistent message delivery service that can be used to bridge applications. Queues are often used to reliably dispatch asynchronous work.
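A minimal sketch of that pattern with the 1.x StorageClient library (the queue name and message text are illustrative):

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

var account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
var queueClient = account.CreateCloudQueueClient();

CloudQueue queue = queueClient.GetQueueReference("tasks");
queue.CreateIfNotExist();

// producer: enqueue a work item
queue.AddMessage(new CloudQueueMessage("resize-image-42"));

// consumer: dequeue, process, then delete so it is not redelivered
CloudQueueMessage msg = queue.GetMessage();
if (msg != null)
{
    // ... process msg.AsString ...
    queue.DeleteMessage(msg);
}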

(iv) Azure Drive

Azure Drive (aka X-Drive) provides the capability to store durable data by using the existing Windows NTFS APIs. Azure Drive is essentially a VHD Page Blob mounted as an NTFS drive by a Windows Azure instance.
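For illustration, a rough sketch of creating and mounting a drive, assuming the Microsoft.WindowsAzure.CloudDrive library and a local storage resource named "DriveCache" declared for the role (names and sizes are illustrative; check the SDK documentation for the exact cache setup your role needs):

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.ServiceRuntime;
using Microsoft.WindowsAzure.StorageClient;

var account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");

// the drive cache must be initialized once per role instance
LocalResource cache = RoleEnvironment.GetLocalResource("DriveCache");
CloudDrive.InitializeCache(cache.RootPath, cache.MaximumSizeInMegabytes);

// create (if needed) a 1 GB VHD page blob and mount it as an NTFS drive
CloudDrive drive = account.CreateCloudDrive("drives/mydata.vhd");
try { drive.Create(1024); } catch (CloudDriveException) { /* already exists */ }
string driveLetter = drive.Mount(25, DriveMountOptions.None);

// ordinary NTFS file APIs now work against driveLetter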

3. Database

SQL Azure Database is a highly available database service built on existing SQL Server technology. Developers do not have to set up, install, configure, or manage any of the database infrastructure. All they need to do is define the database name, edition, and size; they are then ready to bring their objects and data to the cloud:

Screenshot- creating a database

SQL Azure uses the same T-SQL language as SQL Server, and the same tools, such as SQL Server Management Studio, to manage databases. SQL Azure is likely to shift the responsibility of DBAs toward more logical administration, since SQL Azure handles the physical administration. For example, each SQL Azure database is replicated across three copies to ensure high availability.
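As an aside, connecting to SQL Azure from code is plain ADO.NET; only the connection string changes. A minimal sketch (the server, database, and credentials are placeholders):

using System.Data.SqlClient;

var builder = new SqlConnectionStringBuilder
{
    DataSource = "tcp:yourserver.database.windows.net,1433",
    InitialCatalog = "mydb",
    UserID = "login@yourserver",   // SQL Azure expects user@server
    Password = "...",
    Encrypt = true                 // SQL Azure requires encrypted connections
};

using (var conn = new SqlConnection(builder.ConnectionString))
using (var cmd = new SqlCommand("SELECT COUNT(*) FROM Customers", conn))
{
    conn.Open();
    int count = (int)cmd.ExecuteScalar();
}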

Although some differences exist today, Microsoft plans to support the features currently unavailable in SQL Azure in the future. Users can always vote and provide feedback to the SQL Azure team for upcoming feature consideration.

In my next article, I will carry on the discussion with the additional services that Windows Azure offers, including ‘Building Block Services’, Data Services, Networking, and more, so keep an eye out for it!

This post was also published at A Cloud Place blog.

Posted in Azure | 4 Comments

Comparing IAAS and PAAS: A Developer’s Perspective

In my previous article, I discussed the basic concepts behind Cloud Computing including definitions, characteristics, and various service models. In this article I will discuss service models in more detail, and in particular the comparison between IAAS and PAAS from a developer’s standpoint.

I’m using two giant cloud players for illustrative purposes: Amazon Web Services representing IAAS and the Windows Azure Platform representing PAAS. Nonetheless, please note that the emphasis is on the service models and not the actual cloud players.

Figure 1: IAAS VS PAAS

Infrastructure as a Service (IAAS)

IAAS refers to the cloud service model that provides on-demand infrastructure services to the customer. The infrastructure may refer to rentable resources such as computation power, storage, and load-balancers.

As you can see on the left-hand side of Table 1, the IAAS provider will be responsible for managing physical resources, for example networks, servers, and clustered machines. Additionally, they will typically also manage the virtualization technology enabling customers to run VMs (virtual machines). When it comes to the Operating System (OS), it is often arguable whether it’s managed by the provider or the customer. In most cases, the IAAS provider supplies customers VM images with a preloaded OS, but the customer needs to manage it subsequently. Using AWS as an example, AMI (Amazon Machine Image) offers customers several operating systems such as Windows Server, Linux SUSE, and Linux Red Hat. Although the OS is preloaded, AWS will not maintain or update it.

Other software stacks, including middleware (such as IIS, Tomcat, caching services), runtimes (JRE and .NET Framework), and databases (SQL Server, Oracle, MySQL), are normally not provided in the VM image. That’s because the IAAS provider won’t know, and won’t care, what customers are going to do with the VM. Customers are responsible for taking care of these themselves. When all of the above-mentioned software has been settled, customers can finally deploy the application and data on the VM.

Step-by-step: Setting-up an Application on IAAS Environment

To give a comprehensive explanation, I am going to illustrate the steps involved in setting up an application in an IAAS environment. For that, I’m borrowing a slide from a presentation by Mark Russinovich at the BUILD conference. This illustration explains how a typical IAAS provisioning model works.

 

Figure 2: Setting up an App

Consider a common scenario: you have finished developing a multi-tier application, and you as the developer need to deploy it to the cloud. The application will need to be hosted on a web server and an RDBMS database. For IAAS, here are the typical steps:

1. Preparing Database Servers

Select a VM image from the VM image library. The VM image will then be provisioned and launched. If DBMS software is not included, you will need to install the DBMS on your own.

2. Preparing Web / Application Servers

Select VM images from the library to be provisioned and launched. If the web/app server and runtime aren’t installed, you’ll need to install them yourself.

3. Provisioning a Database and Its Objects

The next step is provisioning the database, including configuring the data files, log files, security, etc. Then you create the tables and add data to them.

4. Deploying Your Application

Next you take the application that you’ve developed and deploy it to the Web Server.

5. Configuring the Load-Balancer

When you need to host your application on multiple instances, you may also need to configure things such as the IP address for each instance and the load-balancer.

6. Managing Your VMs and DBMS

The final step is managing the VMs. For example, when there’s an update or service pack for the OS, the IAAS provider will not apply it for you automatically; you need to do it yourself.

Platform as a Service (PAAS)

Now, let’s jump to the other end of the cloud spectrum, “PAAS”, to see how it differs. In PAAS, the provisioning model is an on-demand application hosting environment. In addition to managing the components an IAAS provider would, a PAAS provider also helps customers manage additional responsibilities such as the OS, middleware, runtime, and even databases, as can be seen on the right-hand side of Table 1.

In other words, you can think of PAAS as renting a stack of software, hardware, and infrastructure. Customers just need to bring their application and data, and they are ready to go.

Step-by-step: Setting-up an Application on PAAS Environment

For PAAS, given that the database server VM and web server VM are already provisioned, you just need two steps, as illustrated by another slide from Mark Russinovich.

 

Figure 3: Provision and Deploy

1. Database Provisioning

You might need to indicate where (in which region) your virtual DB server is provisioned, but you don’t have to install a pile of DBMS software on your own. You still need to provision the database, create the tables, and add data.

2. Deploying Your Application

This step is similar to IAAS: you still need to deploy your application to the PAAS cloud environment.

How about the load-balancer? Taking Windows Azure as an example, it is all automatically configured and ready to take traffic, and everything else is automatically managed. You don’t have to worry about IP addresses or a load-balancer.

How about maintaining VMs? The DBMS and web server VMs are maintained by the provider. For example:

  • If the VM where your application is hosted has a hardware issue, the provider should detect the failure and rectify it immediately to make sure your application stays up and running. In Windows Azure, the Fabric Controller is the component handling these kinds of issues.
  • If there are new updates or patches for the operating system, the provider will make sure that the VM your application sits on is always updated. For example, Windows Azure uses the “Guest OS Version” to differentiate service updates. You can also choose to stick to one version or auto-update.

 

Figure 4: Configuration

Summary

To summarize, we have investigated the service models and provisioning steps of IAAS and PAAS solutions. PAAS providers indeed take on much more responsibility for your solution than an IAAS provider would. On the other hand, IAAS may offer more flexibility at a lower level (for example, public IP addresses and load-balancers).

There’s no one-size-fits-all here. As a developer or architect, you should understand the customer’s needs and determine the correct model to get the best possible outcome.

This post was also published at A Cloud Place blog.

Posted in Cloud | 1 Comment

A Comprehensive Introduction to Cloud Computing

Introduction

I’m aware that there are many cloud computing introductory articles and papers out there, so why am I writing another one? As this is my first article at A Cloudy Place, I’d like to start at the beginning and take a new approach to explaining the cloud.

Would you rather buy or rent a car?

The analogy that I will be using is very simple and I believe some readers have probably run into it at some point. Have you ever considered buying or renting a car?

Buying or Renting Cloud Computing

Buy Your Own Car

Buying a car is a big investment, and there are a lot of important decisions to take into account. Some people like all the different options, and others don’t want to bother with thousands of decisions. When buying a car you have full control over everything: its make and model, cost, interior, etc. Additionally, you’ve got to worry about taxes, insurance, inspections, and all sorts of maintenance. You’ve got the control, but it comes with hassle.

Renting a Car

Then how about renting a car? You have fewer and simpler decisions to make. You just need to select a car from what’s available, and you can switch cars if something comes up.

Rent when you need; pay when you use. You don’t have to worry about maintenance costs, tax, and insurance since they are included in your rental fee. On the other hand, there are obviously some disadvantages. You’re limited by what’s available from the rental vendor, you may not be allowed to customize the car, and the car is not dedicated to you all the time.

Translating the Analogy to Cloud Computing

This simple real life analogy is easily translatable to Cloud Computing.

Buying your own car is similar to setting up your own on-premise data center. You have the flexibility to customize whatever you like, from the physical infrastructure to the security system, hardware, and software. However, you have to invest a lot of money upfront, and you will also need to manage it all once it’s operating.

On the other hand, instead of building your own data center, you can rent computation power and storage from a cloud provider. You can scale out and in when necessary, and you pay only for what you use. There’s no specific commitment; you can start and stop anytime.

Characteristics of Cloud Computing

The following summarizes the characteristics of cloud computing.

  • On-demand

Resources should always be available when you need them, and you have control over turning them on or off to ensure there’s no shortage or waste of resources.

  • Scalable

You should be able to scale (increase or decrease the resources) when necessary. Cloud providers should have sufficient capacity to meet customers’ needs.

  • Multi-tenant

Sometimes you may be sharing the same resources (e.g. hardware) with other tenants, but of course this is transparent to the customer. The cloud provider is responsible for the security aspect, ensuring that one tenant cannot access another’s data.

  • Self-service computation and storage resources

Related processes, including billing, resource provisioning, and deployment, should be self-service and automated, involving much less manual processing. If a machine hosting our service fails, the cloud provider should be able to fail our service over to another machine immediately.

  • Reliability

The cloud provider should offer customers a reliable service, committing to the uptime of that service.

  • Utility-based subscription

You pay the cloud provider on a utility-based subscription, just like paying your electricity bill, without any upfront investment.

Cloud Computing Service Model

Cloud Computing consists of several types of service models.

  • On-premise Environment
    As you can see, in the first stack from the left, you need to take care of everything from networking all the way up to the application. This is typically what many of us do today.
  • IaaS (Infrastructure as a Service)
    IaaS helps you take care of some of the components, from networking up to provisioning the OS. But you are responsible for the middleware, runtime, data, and application. Sometimes IaaS vendors will just provide the OS but will not manage updates or patches for you. You basically rent a virtual machine (VM) with the preferred OS installed; they won’t care what you do with the VM.
    Examples of IaaS market players: Amazon Web Services, Rackspace, and VMware vCloud.
  • PaaS (Platform as a Service)
    PaaS is one level up from IaaS: cloud providers not only take care of the components that IaaS does, but also manage platform-level components like middleware and runtime. Middleware such as application/web servers (IIS, JBoss, Tomcat, etc.) and runtimes (.NET Framework, Java runtime) will be pre-installed. As a customer, you just need to focus on managing your application and data.
    Examples of PaaS market players: Google App Engine, Windows Azure Platform, and force.com.
  • SaaS (Software as a Service)
    SaaS is probably the most common model, as many of us have been using SaaS products unaware that they are actually cloud services. SaaS takes care of the whole stack, from networking up to the application level. You don’t even manage the application or data storage; all you need to do is use the system.
    Examples of SaaS market players: Gmail, Office 365, and Google Docs.

Cloud Computing Deployment Model

There are three main cloud deployment models, each one targeting its own set of customers.

  • Public Cloud
    Public cloud refers to a cloud platform that targets any type of customer, regardless of whether they’re an independent consumer, an enterprise, or even the public sector. Normally, public cloud providers are prominent players which have invested huge amounts of capital. The Windows Azure Platform by Microsoft, AWS by Amazon, and App Engine and Gmail by Google are all examples of public cloud services. Customers who possess sensitive data and applications normally do not feel comfortable using the public cloud due to privacy, policy, and security concerns. Remember, in a public cloud, the application and data are stored in the provider’s data center.
  • Private Cloud
    Private cloud is infrastructure that’s hosted internally, targeting specific customers or sometimes used exclusively within an organization. Setting up a private cloud is normally more affordable when compared to a public cloud. As a matter of fact, many organizations have implemented their own private cloud systems with product offerings from vendors such as IBM, HP, Microsoft, and so on. Customers who possess sensitive data and applications feel more comfortable with this approach, since the data and applications are hosted privately.
  • Hybrid Cloud
    Hybrid cloud is the combination of public and private clouds, or sometimes on-premise services. Customers who look into this solution generally want to utilize the scalability and cost-competitiveness that public cloud providers offer, but also want to retain their sensitive data on-premise or in a private cloud. With benefits derived from both deployment models, the hybrid solution has become more popular nowadays.

I sincerely hope this article is helpful to you. In my next article, I’ll discuss the comparison between IaaS and PaaS in more detail.

This post was also published at A Cloud Place blog.

Posted in Cloud | 1 Comment

Applying Config Transformation app.config in Windows Azure Worker Role

Background

In many cases, we need two different sets of configuration settings (say, one for the development environment and another for production). What we normally do is change the settings one by one manually before deploying to the production server, and change them back again for development. This is very annoying, especially when you have many settings.

Web.config transformation is an awesome technique to transform the original web.config into another one with modified settings.

You can find more detail about how to configure and use it in a common ASP.NET project here.

Transforming App.Config with some trick

The bad news is that the technique is only built in for web.config in ASP.NET Web Projects, not other project types like Windows Forms or Console Apps!

The good news is that we can use a trick to make it work. The idea is to perform some modifications on the project file, as illustrated in this post.

Config Transformation in Windows Azure

Since SDK 1.5 (if I remember correctly), VS Tools for Windows Azure enables us to select service configuration and build configuration.


Service Configuration is essentially the configuration for Windows Azure services. You can have two or more different configurations, let’s say one for the local environment (ServiceConfiguration.Local.cscfg) and another for the cloud (ServiceConfiguration.Cloud.cscfg).

Build configuration applies to your web.config (for a Web Role) or app.config (for a Worker Role), let’s say one for debug (Web.Debug.config) and another for release (Web.Release.config).

App.Config in Windows Azure Worker Role

For web.config, this certainly works well. Unfortunately, it isn’t applicable to app.config in a Worker Role project. Even if you apply the trick above to the app.config inside your Worker Role, it still won’t work.

That is the reason why I am writing this article.

Using SlowCheetah – XML Transforms

The idea is to utilize a Visual Studio add-in, SlowCheetah – XML Transforms, to perform the XML transformation. It’s an awesome tool (not only for Windows Azure projects) that lets us add and preview transforms on config files. Thanks to JianBo for recommending this tool!

How to?

Let’s see how it’s done…

1. Download and install SlowCheetah – XML Transforms. You might need to restart Visual Studio after the installation.

2. Prepare your Windows Azure Worker Role project. I named my Windows Azure project: WindowsAzureWorkerConfigDemo and my Worker Role: WorkerRole1.


3. Open the app.config file and add the following value:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <add key="setting1" value="original"/>
  </appSettings>
    <system.diagnostics>
        <trace>
            <listeners>
                <add type="Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener, Microsoft.WindowsAzure.Diagnostics, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
                    name="AzureDiagnostics">
                    <filter type="" />
                </add>
            </listeners>
        </trace>
    </system.diagnostics>
</configuration>

Remember to save the file after adding that value.
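For reference, the Worker Role reads this setting at runtime through the ordinary configuration API; a minimal sketch:

using System.Configuration;
using System.Diagnostics;

// in the packaged role, app.config becomes WorkerRole1.dll.config,
// and ConfigurationManager reads from it at runtime
string setting1 = ConfigurationManager.AppSettings["setting1"];
Trace.WriteLine("setting1 = " + setting1);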

4. Right-click on app.config and select Add Transform. (The Add Transform menu will only appear if you’ve successfully installed SlowCheetah.) If Visual Studio prompts you with Add Transform Project Import, click Yes to proceed.


5. You will then see two child files (app.Debug.config and app.Release.config) below your app.config.


6. Double-click on the app.Release.config and add the following snippet:

<?xml version="1.0" encoding="utf-8" ?>
<!-- For more information on using transformations 
     see the web.config examples at http://go.microsoft.com/fwlink/?LinkId=214134. -->
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <appSettings>
    <add key="setting1" value="new value" xdt:Transform="SetAttributes" xdt:Locator="Match(key)" />
  </appSettings>
</configuration>

As you can see, I’ve changed the value of setting1 to “new value”.

The xdt:Transform="SetAttributes" attribute indicates the action that will be performed; in this case, it sets the attributes of the matching entry.

The xdt:Locator="Match(key)" attribute indicates the condition under which the transform is applied; in this case, when the “key” attribute matches.

You can refer to this post to see the other possible values for xdt:Transform and xdt:Locator.

Remember to save the file after adding the settings.

7. Now, right-click on app.Release.config and click Preview Transform. (Again, this only appears if SlowCheetah is properly installed.)


8. You can now see the comparison between the original app.config and app.Release.config.


9. Right-click your Windows Azure project and click “Unload Project”. Right-click it again and select Edit [your Windows Azure project].ccproj.


10. Scroll down to the end of the file and add the following snippet before the closing </Project> tag.

  <Import Project="$(CloudExtensionsDir)Microsoft.WindowsAzure.targets" />
  <Target Name="CopyWorkerRoleConfigurations" BeforeTargets="AfterPackageComputeService">
    <Copy SourceFiles="..\WorkerRole1\bin\$(Configuration)\WorkerRole1.dll.config"
          DestinationFolder="$(IntermediateOutputPath)WorkerRole1" OverwriteReadOnlyFiles="true" />
  </Target>
</Project>

What it does is basically perform a task each time before the Windows Azure service is packaged. The task copies the WorkerRole1.dll.config file to the IntermediateOutputPath.

Save and close the file. Right-click the Windows Azure project and select Reload Project.

11. Alright, let’s package it now and see if it really works. To do that, right-click on the Windows Azure project and select Package. Choose Release for the build configuration and click Package.


When Release is selected, we expect the value of “setting1” to be “new value”, as we set inside app.Release.config.

12. Verification

Once the service is successfully packaged, you will see two files as usual (ServiceConfiguration.Cloud.cscfg and WindowsAzureWorkerConfigDemo.cspkg).

To verify that the correct configuration is included, change the extension of the cspkg file to .zip and unzip it. Inside the directory, look for the biggest file (starting with WorkerRole1, since I named my Worker Role project “WorkerRole1”).


Change its extension to .zip and unzip it again. Navigate inside that directory and look for the “approot” directory. You’ll find the WorkerRole1.dll.config file inside.

13. Open that file and check whether it has the correct value, as set in our Release build.


Mine is correct, how about yours?

Posted in Azure, Azure Development | 3 Comments

Uploading Big Files in Windows Azure Blob Storage using PutBlockList

Windows Azure Blob Storage can be thought of as a file system in the cloud. It enables us to store any unstructured data file such as text, images, video, etc. In this post, I will show how to upload a big file into Windows Azure Storage. Please be informed that we will be using a Block Blob for this case. For more information about Block Blobs and Page Blobs, please visit here.

I assume that you know how to upload a file to Windows Azure Storage. If you don’t, I recommend checking out this lab from the Windows Azure Training Kit.

Uploading a blob (commonly-used technique)

The following snippet shows how to upload a blob using a commonly-used technique, blob.UploadFromStream(), which eventually invokes the PutBlob REST API.

protected void btnUpload_Click(object sender, EventArgs e)
{
    var storageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
    blobClient = storageAccount.CreateCloudBlobClient();

    CloudBlobContainer container = blobClient.GetContainerReference("image2");
    container.CreateIfNotExist();

    var permission = container.GetPermissions();
    permission.PublicAccess = BlobContainerPublicAccessType.Container;
    container.SetPermissions(permission);

    string name = fu.FileName;
    CloudBlob blob = container.GetBlobReference(name);
    blob.UploadFromStream(fu.FileContent);
}

The above code snippet works well in most cases. However, since a single upload is limited to a maximum of 64 MB per file (for a block blob), it’s recommended to use another technique for bigger files, which I am going to describe in more detail.

Uploading a blob by splitting it into chunks and calling PutBlockList

The idea of this technique is to split a block blob into smaller chunks (blocks), upload them one by one or in parallel, and eventually join them all by calling PutBlockList().

protected void btnUpload_Click(object sender, EventArgs e)
{
    CloudBlobClient blobClient;
    var storageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
    blobClient = storageAccount.CreateCloudBlobClient();

    CloudBlobContainer container = blobClient.GetContainerReference("mycontainer");
    container.CreateIfNotExist();

    var permission = container.GetPermissions();
    permission.PublicAccess = BlobContainerPublicAccessType.Container;
    container.SetPermissions(permission);

    string name = fu.FileName;
    CloudBlockBlob blob = container.GetBlockBlobReference(name);

    int maxSize = 1 * 1024 * 1024; // 1 MB threshold; larger files are uploaded in blocks

    if (fu.PostedFile.ContentLength > maxSize)
    {
        byte[] data = fu.FileBytes; 
        int id = 0;
        int byteslength = data.Length;
        int bytesread = 0;
        int index = 0;
        List<string> blocklist = new List<string>();
        int numBytesPerChunk = 250 * 1024; //250KB per block
    
        do
        {
            byte[] buffer = new byte[numBytesPerChunk];
            int limit = index + numBytesPerChunk;
            for (int loops = 0; index < limit; index++)
            {
                buffer[loops] = data[index];
                loops++;
            }
            bytesread = index;
            string blockIdBase64 = Convert.ToBase64String(System.BitConverter.GetBytes(id));

            blob.PutBlock(blockIdBase64, new MemoryStream(buffer, true), null); 
            blocklist.Add(blockIdBase64);
            id++;
        } while (byteslength - bytesread > numBytesPerChunk);

        int final = byteslength - bytesread;
        byte[] finalbuffer = new byte[final];
        for (int loops = 0; index < byteslength; index++)
        {
            finalbuffer[loops] = data[index];
            loops++;
        }
        string blockId = Convert.ToBase64String(System.BitConverter.GetBytes(id));
        blob.PutBlock(blockId, new MemoryStream(finalbuffer, true), null);
        blocklist.Add(blockId);

        blob.PutBlockList(blocklist); 
    }
    else
        blob.UploadFromStream(fu.FileContent);            
}

Explanation about the code snippet

Since the idea is to split the big file into chunks, we need to define the size of each chunk, in this case 250 KB. By dividing the actual size by the size of each chunk, we know the number of chunks we need.


We also need a list of strings (in this case the blocklist variable) to record which blocks belong to one group. Then we loop through each chunk, upload it by calling blob.PutBlock(), and add its ID (in the form of a Base64 string) to the blocklist.

Note that there’s a left-over block that isn’t uploaded inside the loop; we need to upload it separately. When all blocks are successfully uploaded, we finally call blob.PutBlockList(). Calling PutBlockList() commits all the blocks that we’ve uploaded previously.

Pros and Cons

The benefits (pros) of the technique

There are a few benefits to using this technique:

  • In the event that uploading one of the blocks fails, due to a condition like a connection time-out or lost connection, you just need to re-upload that particular block, not the entire big file / blob.
  • It’s also possible to upload the blocks in parallel, which might result in a shorter upload time (see the sketch after this list).
  • The first technique only allows you to upload a block blob of at most 64 MB. With this technique, you can go far beyond that.
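Here is a sketch of the parallel variant, reusing the data (byte[]) and blob (CloudBlockBlob) variables from the snippet above; it assumes .NET 4’s Parallel class plus System.Linq:

int numBytesPerChunk = 250 * 1024;
int blockCount = (data.Length + numBytesPerChunk - 1) / numBytesPerChunk;

// fixed-length block IDs, generated up front so they can be committed in order
List<string> blockIds = Enumerable.Range(0, blockCount)
    .Select(i => Convert.ToBase64String(BitConverter.GetBytes(i)))
    .ToList();

Parallel.For(0, blockCount, i =>
{
    int offset = i * numBytesPerChunk;
    int size = Math.Min(numBytesPerChunk, data.Length - offset);
    using (var ms = new MemoryStream(data, offset, size))
    {
        blob.PutBlock(blockIds[i], ms, null); // blocks may arrive in any order
    }
});

blob.PutBlockList(blockIds); // the commit defines the final order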

The drawbacks (cons) of the technique

Despite the benefits, there are also a few drawbacks:

  • You have more code to write. As you can see from the sample, you can simply call the one-line blob.UploadFromStream() in the first technique, but you will need to write 20+ lines of code for the second technique.
  • It incurs more storage transactions, which may lead to higher cost in some cases. Referring to a post by the Azure Storage team: the more chunks you have, the more storage transactions are incurred.

Large blob upload that results in 100 requests via PutBlock, and then 1 PutBlockList for commit = 101 transactions

Summary

I’ve shown you how to upload a file with a simple technique at the beginning. Although it’s easy to use, it has a few limitations. The second technique (using PutBlockList) is more powerful, as it can do more than the first one. However, it certainly has its own pros and cons as well.

I hope you’ll be able to use either one of them appropriately in your scenario. Hope this helps!

Posted in Uncategorized | 22 Comments

Uploading File Securely to Windows Azure Blob Storage with Shared Access Signature via REST-API

In many scenarios, you need to give somebody access (whether write, read, etc.) but you don’t want to give him or her full permission. Wouldn’t it also be great if you could restrict the access to a certain time frame? The “somebody” could be a person or a system using any platform other than .NET. This post shows you how to upload a file to Windows Azure Storage with the REST-based API, without having to expose your Storage Account credentials.

Shared Access Signature

Shared Access Signature (SAS) is a cool feature built into Windows Azure Storage. In a nutshell, SAS is a mechanism to grant permission while retaining security, by producing a set of attributes and a signature in the URL.

For the fundamentals of SAS, I recommend reading the following post:

Here’s the walkthrough of how you can do that:

I assume that you have your Windows Azure Storage account and key with you.

Preparing SAS and Signature

1. Grant access to your container. You can use tools or a library to set SAS permissions on containers or blobs. In this example, I use Cerebrata’s Cloud Storage Studio.


As you can see, I’ve created a policy with the following attributes:

  • Start Date Time: Jan 8, 2012 / 00:00:00
  • Expiry Date Time: Jan 31, 2012 / 00:00:00
  • Permission: Write only
  • Signed Identifier: Policy1

By applying this policy to a particular container, somebody who possesses a signature will only be able to write something inside this container in the given time frame. I mentioned “a signature”; what’s the signature, then?

2. Click on the “Generate Signed URL” button if you’re using Cloud Storage Studio. Other tools offer a similar feature.


In the textbox, you’ll see something like this:

https://[your-account].blob.core.windows.net/samplecontainer1?&sr=c&si=Policy1&sig=pjJhE%2FIgsGQN9Z1231312312313123123A%3D

Basically, everything from the ? symbol to the end is the signature: ?&sr=c&si=Policy1&sig=pjJhE%2FIgsGQN9Z1231312312313123123A%3D

*Copy that value now; you will need it later.

The signature is signed securely according to your storage credentials and the properties you’ve specified.

 

Let’s jump into Visual Studio and start coding!

I’m using the simplest C# Console Application to get started. Prepare the file to be uploaded; in my case, I am using Penguins.jpg, which you can find among the Windows sample photos.

3. Since I am about to upload a picture, I need to get the byte[] data from the actual photo. To do that, I use the following method.

        public static byte[] GetBytesFromFile(string fullFilePath)
        {
            // read the whole file into a byte array
            using (FileStream fs = File.OpenRead(fullFilePath))
            {
                byte[] bytes = new byte[fs.Length];
                fs.Read(bytes, 0, Convert.ToInt32(fs.Length));
                return bytes;
            }
        }

4. The next step is the most important one: uploading the file to Windows Azure Storage through REST with the SAS.

        static WebResponse UploadFile(string storageAccount, string container, string filename, string signature, byte[] data)
        {
            var req = (HttpWebRequest)WebRequest.Create(string.Format("http://{0}.blob.core.windows.net/{1}/{2}{3}", storageAccount, container, filename, signature));
            req.Method = "PUT"; 
            req.ContentType = "text/plain";

            using (Stream stream = req.GetRequestStream())
            { 
                stream.Write(data, 0, data.Length); 
            }

            return req.GetResponse();
        }

5. To call it, do the following:

        static void Main(string[] args)
        {
            string storageAccount = "your-storage-account";
            string file = "Penguins.jpg";
            string signature = "?&sr=c&si=Policy1&sig=pjJhE%2FIgsGQN9Z1231312312313123123A%3D";
            string container = "samplecontainer1";

            byte[] data = GetBytesFromFile(file);
            WebResponse resp = UploadFile(storageAccount, container, file, signature, data);
            
            Console.ReadLine();
        }

The signature variable should be filled with the signature that you copied in step 2.

6. Let’s try it out and see if it works!

And yes, it works!


Hope this helps!

Posted in Uncategorized | 4 Comments

“A Cloudy Place”– Blogging About Cloud Computing

I am glad to share that the “a cloudy place” blog is finally up at http://acloudyplace.com


What is “a cloudy place”?

It’s a centralized blog focused on cloud technology, exclusively for developers. If you’ve heard of SQL Server Central, it’s somewhat similar, but focused on cloud computing. You can find topics such as general cloud info, Amazon Web Services, Windows Azure, and many more.

Who owns “a cloudy place”?

“a cloudy place” is owned and managed by Red Gate, a software company based in Cambridge, UK, specializing in SQL, DBA, .NET, and Oracle development tools.

What does it have to do with me?

Aha! I’ve been invited to become a contributor at “a cloudy place”. In the first few posts, you will see me write about some generic cloud concepts. Of course, I’ll discuss more about Windows Azure later on.

You can find my articles here.

Happy “cloudy” reading on “a cloudy place”!

Posted in Cloud | Leave a comment

New Announcements on Windows Azure [12 Dec 2011]

What’s new?

Several announcements were made by the Windows Azure team yesterday. Here’s the summary:

The site is simpler now; the look and feel is clean and comfortable.


  • Windows Azure SDK for Node.js

https://github.com/WindowsAzure/azure-sdk-for-node

  • The maximum database size has been increased from 50 GB to 150 GB.

http://www.windowsazure.com/en-us/home/tour/database/

*The price will be capped at $499.95 per DB per month.

  • SQL Azure Federation

http://social.technet.microsoft.com/wiki/contents/articles/2281.aspx

  • Metro-styled SQL Azure Management Portal

  • Some adjustments to pricing

http://www.windowsazure.com/en-us/pricing/details/

More Info?

For more detail, please refer to the following links:

Posted in Azure | Leave a comment

Windows Azure Storage Transaction | Unveiling the Unforeseen Cost and Tips to Cost Effective Usage

[update: the price for storage transaction is now $0.01 per 100,000]

Background

I was quite surprised to see Storage Transactions billed at 2,000% of Storage Capacity, making up about 40% of my total bill. Wow… how can that be?

Doesn’t a storage transaction cost just $0.01 per 10,000 transactions? Why has it become so expensive? In fact, this is the component many people ignore when estimating the running cost of a Windows Azure project.

This led me to explore and understand Windows Azure Storage Transactions more deeply. This article will unveil this unforeseen cost, explain thoroughly how Storage Transactions are charged, and walk through scenarios that potentially cause high Storage Transaction costs. Finally, I’ll provide some tips to avoid costly Storage Transaction charges.

Before getting into the details, let’s refresh our minds on how Windows Azure Storage is billed in general.

Understanding Windows Azure Storage Billing

Brad Calder from the Windows Azure Storage Team wrote a great post explaining what the bill looks like for Windows Azure Storage, covering Capacity, Bandwidth, and Transactions.

In summary, here’s how it’s charged (as of Nov 2011). Keep in mind that prices may change (although not very frequently, but who knows).

1. Storage Capacity = $0.14 per GB stored per month, based on the daily average

2. Storage Transactions = $0.01 per 10,000 transactions

3. Data Transfer (Bandwidth)

  • Free Ingress (inbound)
  • Outbound:
    • North America and Europe region = $ 0.15 per GB
    • Asia Pacific region = $ 0.20 per GB

Please always refer to the following for latest pricing:

Many people argue that Windows Azure Storage is much more cost-effective than SQL Azure.

Well, that’s true in “most of the time”, but not “all the time”.

Understanding How Storage Transactions Are Charged in More Detail

Now, let’s set Storage Capacity and Bandwidth aside and talk about Storage Transactions. It’s considered one transaction whenever you “touch” any component of Windows Azure Storage.

  • “Touch” means any REST call or operation, including read, write, delete, and update.
  • “Any component” means any entity inside Blobs, Tables, or Queues.

Here are some examples of transactions, extracted from the “Understanding Windows Azure Storage Billing” post.

  • A single GetBlob request to the blob service = 1 transaction
  • PutBlob with 1 request to the blob service = 1 transaction
  • Large blob upload that results in 100 requests via PutBlock, and then 1 PutBlockList for commit = 101 transactions
  • Listing through a lot of blobs using 5 requests total (due to 4 continuation markers) = 5 transactions
  • Table single entity AddObject request = 1 transaction
  • Table Save Changes (without SaveChangesOptions.Batch) with 100 entities = 100 transactions
  • Table Save Changes (with SaveChangesOptions.Batch) with 100 entities = 1 transaction
  • Table Query specifying an exact PartitionKey and RowKey match (getting a single entity) = 1 transaction
  • Table query doing a single storage request to return 500 entities (with no continuation tokens encountered) = 1 transaction
  • Table query resulting in 5 requests to table storage (due to 4 continuation tokens) = 5 transactions
  • Queue put message = 1 transaction
  • Queue get single message = 1 transaction
  • Queue get message on empty queue = 1 transaction
  • Queue batch get of 32 messages = 1 transaction
  • Queue delete message = 1 transaction

Scenarios

Now that we understand how storage transactions are charged, consider the following scenarios:

Scenario 1 – Iterating files inside Blob container

An application organizes blobs into a different container for each user. It also allows users to check the size of their container. For that, a function was created to loop through all the files inside the container and return the total size. This functionality is exposed in the UI, and an admin typically calls it a few times a day.

*Update: Actually, we can use the ListBlobs method to get the length / size of the files inside a container, which avoids a per-file call. But let’s set that aside for the moment. (Thanks to Jai Haridas for this comment.)
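For the curious, here is a sketch of the ListBlobs approach with the 1.x StorageClient library (the container name is illustrative); the point is that blob sizes come back with the listing, so you don’t pay one GetBlob transaction per file:

var storageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
var blobClient = storageAccount.CreateCloudBlobClient();

CloudBlobContainer container = blobClient.GetContainerReference("user-container");
long totalBytes = 0;

// one listing operation (plus continuations) instead of one call per blob
var options = new BlobRequestOptions { UseFlatBlobListing = true };
foreach (IListBlobItem item in container.ListBlobs(options))
{
    var blob = item as CloudBlob;
    if (blob != null)
        totalBytes += blob.Properties.Length; // size is returned with the listing
}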

Some Figures for Illustration

Assuming the following figures are used for illustration:

  • I have 1,000 users.
  • I have 10,000 files on average in each container.
  • The admin calls this function 5 times a day on average.

How much does it cost for Storage Transactions per month?

Remember: a single GetBlob request is considered 1 transaction!

1,000 users X 10,000 files X 5 queries X 30 days = 1,500,000,000 transactions

$ 0.01 per 10,000 transactions X 1,500,000,000 transactions = $ 1,500 per month

Well, that’s not cheap at all.

Tips to Bring it Down

  • Verify with the admin whether they really need to call the function 5 times a day. Educate them: tell them that each time this function is called, it costs roughly $10, since it involves 10 million transactions (10,000 files X 1,000 users). I bet the admin will avoid that if he/she knows the cost.
  • Do not expose this functionality as a real-time query to the admin. Consider automatically running the function once a day and saving the result somewhere, and just let the admin view the daily result (day by day).

By limiting the admin to viewing the result just once a day, here is what the monthly cost looks like:

1,000 users X 10,000 files X 1 query X 30 days = 300,000,000 transactions

$ 0.01 per 10,000 transactions X 300,000,000 transactions = $ 300 per month

Well, I think that’s fair enough!

 

Scenario 2 – Worker Role Constantly Pinging Queue

An application enables users to upload documents for processing. The uploaded documents are processed asynchronously at the backend, and when processing is done the user is notified by email.

Technically, it uses a Queue to store and centralize all tasks. Two instances of a Web Role take the input and store each task as a message in the Queue. On the other side, 5 instances of a Worker Role are provisioned; they constantly ping the Queue to check if there are new messages to be processed.

The following diagram illustrates what the architecture may look like.


*icons by http://azuredesignpatterns.com/, David Pallman

Some Figures for Illustration

Assuming the following figures:

  • It has 5 instances of the Worker Role
  • The Worker Roles constantly get messages from the Queue (regardless of whether it’s empty or filled)
public override void Run()
{
    while (true)
    {
        CloudQueueMessage msg = queue.GetMessage();
        if (msg != null)
        {
            // process the message
        }
    }
}
  • The Worker Roles run 24 hours per day, 30 days per month
  • It’s stated here that a single queue is able to process up to 500 messages per second. Let’s assume that on average it processes 200 messages per second (allowing for some tiny latency between the Worker Role and Storage)

How much does it cost for Storage Transactions per month?

Remember: a GetMessage call on a Queue (regardless of whether it’s empty or filled) is considered 1 transaction.

200 req X 60 sec X 60 min X 24 hours X 30 days X 5 instances = 2,592,000,000 transactions

$ 0.01 per 10,000 transactions X 2,592,000,000 transactions = $ 2,592 per month

Tips to Bring it Down  #1

Unless you have a throughput target to meet, consider adding some sleep time, especially when you’ve gotten empty results several times in a row.

Assuming we add Thread.Sleep(100) (0.1 second), each instance polls the queue at most 10 times per second to check if there’s a message.

public override void Run()
{
    while (true)
    {
        CloudQueueMessage msg = queue.GetMessage();
        if (msg != null)
        {
            // process the message
        }
        else
        {
            Thread.Sleep(100);
        }
    }
}

With that, how much do you think it will cost for a month?

10 req X 60 sec X 60 min X 24 hours X 30 days X 5 instances = 129,600,000 transactions

$ 0.01 per 10,000 transactions X 129,600,000 transactions = $ 129.60 per month

Well, that’s fair enough.

Tips to Bring it Down #2

When your 5 instances of the Worker Role have fetched empty messages many times, you should start asking yourself whether you really need those 5 instances at all.

Scaling them in will not only bring down the Storage Transaction cost, but will also save you some money on Windows Azure Compute instances.

*Thanks to Brad Calder for this thought.

Scenario 3 – Be Careful When Turning on Windows Azure Diagnostics

Another hidden scenario that may inflate your Storage Transaction bill is turning on Windows Azure Diagnostics without controlling it properly.

How Windows Azure Diagnostics Works

Windows Azure Diagnostics collects diagnostic data from your instances and copies it to a Windows Azure Storage account (in blob and table storage). That diagnostic data (such as logs) can indeed help developers monitor performance and trace the source of failure when an exception occurs.

We need to define which kinds of logs (IIS logs, crash dumps, FREB logs, arbitrary log files, performance counters, event logs, etc.) are to be collected and sent to Windows Azure Storage, either on a schedule or on demand.

However, if you do not carefully define what you really need from the diagnostics, you might end up paying an unexpected bill.

Some Figures for Illustration

Assuming the following figures:

  • You have an application that requires high processing power, running on 100 instances
  • You collect 5 performance counters (Processor\% Processor Time, Memory\Available Bytes, PhysicalDisk\% Disk Time, Network Interface\Bytes Total/sec, Processor\Interrupts/sec)
  • You perform a scheduled transfer every 5 seconds
  • The instances run 24 hours per day, 30 days per month

How much does it cost for Storage Transactions per month?

5 counters X 12 times X 60 min X 24 hours X 30 days X 100 instances = 259,200,000 transactions

$ 0.01 per 10,000 transactions X 259,200,000 transactions = $ 259.20 per month

Tips to Bring it Down

Ask yourself again whether you really need to monitor all 5 performance counters every 5 seconds. What if you reduce them to 3 counters and sample every 20 seconds?

3 counters X 3 times X 60 min X 24 hours X 30 days X 100 instances = 38,880,000 transactions

$ 0.01 per 10,000 transactions X 38,880,000 transactions = $ 38.88 per month

You can see how much you save with these numbers. Windows Azure Diagnostics is really useful, but using it improperly may have you paying unnecessary money. It’s a double-edged sword; be careful.
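As a sketch, this is roughly where you would trim things in the role’s OnStart(), using the Microsoft.WindowsAzure.Diagnostics API (the counter choice, sample rate, and transfer period are illustrative):

using System;
using Microsoft.WindowsAzure.Diagnostics;

var config = DiagnosticMonitor.GetDefaultInitialConfiguration();

config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration
{
    CounterSpecifier = @"\Processor(_Total)\% Processor Time",
    SampleRate = TimeSpan.FromSeconds(20)   // instead of every 5 seconds
});

// each scheduled transfer of each counter is a storage transaction
config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(5);

DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);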

Conclusion

To conclude, this article gives you a view of how Windows Azure Storage transaction costs may lead to costly charges if not properly managed. Different components of the Windows Azure Platform are charged differently; a cloud architect should have a deep understanding of them in order to design a scalable, reliable, yet cost-effective solution for the customer.

In cases where constant requests are a requirement, you may also want to evaluate using SQL Azure instead of Windows Azure Storage, because there is no storage transaction cost in SQL Azure.

Don’t be afraid of using any component you really need. As long as you architect and design the solution properly, the cost should be reasonable.

Hopefully, by reading this article, you’ll save some money on storage transactions.

Posted in Azure Storage | 4 Comments

I am officially MCPD in Windows Azure

After a busy stretch, I finally have a chance to blog again.

Anyway, I am pleased to share that I am now officially MCPD (Microsoft Certified Professional Developer) in Windows Azure.

Obtaining MCPD in Windows Azure requires three exams:

Passing the core 70-583 Beta Exam

I successfully passed the beta exam of Designing and Developing Windows Azure Applications about a year ago. A beta exam is more challenging than the normal exam: there’s no official preparation material, no training kit, no e-book, no online training.

The exam guide just states a few topics, with short descriptions of what will be tested.

Passing the MCTS Prerequisites: 70-516 and 70-513

As you can see, there are 2 prerequisite exams that need to be accomplished apart from the core exam 70-583. I passed 70-516 last September and 70-513 just last week.

With that, here’s my MCPD certificate in Windows Azure.


Posted in Azure | 1 Comment