Windows Azure Blob Storage can be thought of as a file system in the cloud. It lets us store unstructured data such as text, images, and video. In this post, I will show how to upload a big file to Windows Azure Storage. Please note that we will be using block blobs in this case. For more information about block blobs and page blobs, please visit here.
I assume that you know how to upload a file to Windows Azure Storage. If you don't, I recommend checking out this lab from the Windows Azure Training Kit.
The following snippet shows how to upload a blob using a commonly used technique, blob.UploadFromStream(), which eventually invokes the Put Blob REST API.
    protected void btnUpload_Click(object sender, EventArgs e)
    {
        var storageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
        CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference("image2");
        container.CreateIfNotExist();

        // Allow public read access to the container and its blobs.
        var permission = container.GetPermissions();
        permission.PublicAccess = BlobContainerPublicAccessType.Container;
        container.SetPermissions(permission);

        // Upload the whole file in a single call.
        string name = fu.FileName;
        CloudBlob blob = container.GetBlobReference(name);
        blob.UploadFromStream(fu.FileContent);
    }
The above code snippet works well in most cases. Although a block blob of up to 64 MB can be uploaded this way in a single call, for big files it is better to use another technique, which I describe in more detail below.
The idea of this technique is to split a block blob into smaller chunks (blocks), upload them one by one or in parallel, and finally join them all by calling PutBlockList().
    // Requires: using System.Collections.Generic; using System.IO;
    //           using Microsoft.WindowsAzure; using Microsoft.WindowsAzure.StorageClient;
    protected void btnUpload_Click(object sender, EventArgs e)
    {
        var storageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
        CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference("mycontainer");
        container.CreateIfNotExist();

        // Allow public read access to the container and its blobs.
        var permission = container.GetPermissions();
        permission.PublicAccess = BlobContainerPublicAccessType.Container;
        container.SetPermissions(permission);

        string name = fu.FileName;
        CloudBlockBlob blob = container.GetBlockBlobReference(name);

        int maxSize = 1 * 1024 * 1024; // 1 MB: larger files are uploaded in blocks
        if (fu.PostedFile.ContentLength > maxSize)
        {
            byte[] data = fu.FileBytes;
            int id = 0;
            int byteslength = data.Length;
            int index = 0;
            List<string> blocklist = new List<string>();
            int numBytesPerChunk = 250 * 1024; // 250 KB per block

            // Upload full-size blocks while more than one chunk of data remains.
            do
            {
                string blockIdBase64 = Convert.ToBase64String(System.BitConverter.GetBytes(id));
                blob.PutBlock(blockIdBase64, new MemoryStream(data, index, numBytesPerChunk), null);
                blocklist.Add(blockIdBase64);
                index += numBytesPerChunk;
                id++;
            } while (byteslength - index > numBytesPerChunk);

            // Upload the left-over block (at most numBytesPerChunk bytes).
            string blockId = Convert.ToBase64String(System.BitConverter.GetBytes(id));
            blob.PutBlock(blockId, new MemoryStream(data, index, byteslength - index), null);
            blocklist.Add(blockId);

            // Commit all uploaded blocks, in list order, as the final blob.
            blob.PutBlockList(blocklist);
        }
        else
        {
            // Small file: a single upload call is enough.
            blob.UploadFromStream(fu.FileContent);
        }
    }
Since the idea is to split the big file into chunks, we need to define the size of each chunk, in this case 250 KB. Dividing the file size by the chunk size (and rounding up) tells us how many chunks we need.
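To make the arithmetic concrete, here is a minimal sketch of the block-count calculation. The class and method names (BlockMath, BlockCount, LastBlockSize) and the 10 MB file size are my own illustrations, not part of the storage library; only the 250 KB block size comes from the code above.

```csharp
using System;

public static class BlockMath
{
    // Number of blocks needed: file size divided by block size, rounded up.
    public static int BlockCount(long totalBytes, int blockSize)
    {
        if (totalBytes < 0) throw new ArgumentOutOfRangeException("totalBytes");
        if (blockSize <= 0) throw new ArgumentOutOfRangeException("blockSize");
        return (int)((totalBytes + blockSize - 1) / blockSize);
    }

    // Size of the final block, which is usually smaller than the others.
    public static int LastBlockSize(long totalBytes, int blockSize)
    {
        if (totalBytes <= 0) return 0;
        int remainder = (int)(totalBytes % blockSize);
        return remainder == 0 ? blockSize : remainder;
    }

    public static void Main()
    {
        const int blockSize = 250 * 1024;         // 250 KB, as in the post
        const long fileSize = 10L * 1024 * 1024;  // hypothetical 10 MB file

        Console.WriteLine(BlockCount(fileSize, blockSize));    // prints 41
        Console.WriteLine(LastBlockSize(fileSize, blockSize)); // prints 245760
    }
}
```

So a 10 MB file at 250 KB per block needs 40 full blocks plus one smaller block of 240 KB.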
We also need a list of strings (in this case, the blocklist variable) that records which blocks belong to the blob. We then loop through each chunk, upload it by calling blob.PutBlock(), and add its block ID (as a Base64 string) to the blocklist.
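The block IDs deserve a closer look: within one blob, every block ID must be a Base64 string, and all IDs must have the same length. Encoding the 4 bytes of an int counter, as the code above does, satisfies both rules. The sketch below only builds the IDs and does not touch storage; BlockIds, ToBlockId and BuildBlockList are hypothetical helper names of mine.

```csharp
using System;
using System.Collections.Generic;

public static class BlockIds
{
    // Encodes the block's sequence number as Base64 of its 4-byte form,
    // so every ID comes out with the same length (8 characters).
    public static string ToBlockId(int sequence)
    {
        return Convert.ToBase64String(BitConverter.GetBytes(sequence));
    }

    // Builds the ordered list of IDs that would later be passed to PutBlockList().
    public static List<string> BuildBlockList(int blockCount)
    {
        var blockList = new List<string>(blockCount);
        for (int i = 0; i < blockCount; i++)
        {
            blockList.Add(ToBlockId(i));
        }
        return blockList;
    }

    public static void Main()
    {
        foreach (string id in BuildBlockList(3))
        {
            Console.WriteLine(id + " (length " + id.Length + ")");
        }
    }
}
```

The order of the IDs in the list is what determines the order of the blocks in the committed blob, so the list should be filled in the same order as the chunks appear in the file.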
Note that there is a left-over block that isn't uploaded inside the loop, so we need to upload it separately after the loop ends. When all blocks have been uploaded successfully, we finally call blob.PutBlockList(). Calling PutBlockList() commits all the blocks that we've uploaded previously.
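As an alternative to handling the left-over block separately, a single loop can clamp the size of each block with Math.Min. The sketch below shows that variation on an in-memory stand-in rather than real storage: the dictionary plays the role of the uncommitted blocks, and the final join plays the role of the commit. ChunkedUploadSketch and UploadInBlocks are hypothetical names, and nothing here calls the Azure library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ChunkedUploadSketch
{
    // Splits data into blocks, "uploads" each one, then "commits" them in order.
    public static byte[] UploadInBlocks(byte[] data, int blockSize, out int blockCount)
    {
        var staged = new Dictionary<string, byte[]>(); // stand-in for uncommitted blocks
        var blockList = new List<string>();
        int id = 0;

        for (int offset = 0; offset < data.Length; offset += blockSize)
        {
            // The last block may be shorter than blockSize.
            int count = Math.Min(blockSize, data.Length - offset);
            byte[] block = new byte[count];
            Buffer.BlockCopy(data, offset, block, 0, count);

            string blockId = Convert.ToBase64String(BitConverter.GetBytes(id++));
            staged[blockId] = block;   // real code would call blob.PutBlock(...) here
            blockList.Add(blockId);
        }
        blockCount = blockList.Count;

        // Real code would call blob.PutBlockList(blockList) here.
        using (var committed = new MemoryStream())
        {
            foreach (string blockId in blockList)
            {
                byte[] block = staged[blockId];
                committed.Write(block, 0, block.Length);
            }
            return committed.ToArray();
        }
    }

    public static void Main()
    {
        byte[] data = Enumerable.Range(0, 1000).Select(i => (byte)(i % 251)).ToArray();
        int blockCount;
        byte[] result = UploadInBlocks(data, 300, out blockCount);

        Console.WriteLine(blockCount);                 // prints 4
        Console.WriteLine(result.SequenceEqual(data)); // prints True
    }
}
```

With 1000 bytes and a block size of 300, the loop produces three full blocks and one final block of 100 bytes, and joining them in list order reproduces the original data.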
There are a few benefits of using this technique:
Despite the benefits, there are also a few drawbacks:
A large blob upload results in more transactions. For example, 100 PutBlock requests plus 1 PutBlockList to commit = 101 transactions.
I've shown you how to upload a file with a simple technique at the beginning. Although it's easy to use, it has a few limitations. The second technique (using PutBlockList) is more powerful, as it can do more than the first one. However, it has its own pros and cons as well.
I hope you will be able to use whichever one suits your scenario. Hope this helps!
