Azure Package Admin Guide

The Azure Parameter Sweep package is an easy-to-use package that allows you to parallelize embarrassingly-parallel code over a number of Sho instances running in Azure. It is especially well-suited to parameter sweeps.

Getting an Azure account

Before you can use the Azure package, you’ll need to get an Azure account, so you can upload data and run code in the cloud. To get started, go to the Azure Offers page (http://www.microsoft.com/windowsazure/offers/) and select the package that is most appropriate for you. Note that you do not need SQL Azure access to use the Sho Azure package. See the following links for information on pricing for academic and NSF-funded work:

http://www.nsf.gov/cise/news/2010_microsoft.jsp

http://www.azurepilot.com/

Server-side Installation and setup

Creating the Azure storage service

First, we’ll create the storage service that will be used to store all of our data and results in the cloud. Go to the Windows Azure service portal (http://windows.azure.com) and log in.

Then, click on “New Storage Account” in the ribbon:

When prompted, enter a URL to use to access your storage service (e.g., “myshodata”) and click the ‘Create’ button. Make a note of the prefix for this URL (the “myshodata” part) – users of the package will need to know this account name in order to use the service. The other piece of information is the primary access key associated with this service, which users will need in order to authenticate themselves. To find this, select the new storage service you just created and then click on the “View” button for the primary access key:

This will display the access keys:

Users of the service will have to set two environment variables: SHOAZUREACCOUNT, set to the name of your storage service (e.g., myshodata), and SHOAZUREKEY, set to the primary access key. You will also need to set these variables on any computer from which you wish to administer the Sho Azure package.
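
As a quick sanity check before going further, the following sketch reports which of the two variables are still unset. It is only an illustration: the helper name missing_vars is made up for this guide and is not part of the Sho Azure package, and the code reads the environment with the standard os module only.

```python
import os

# The two variables this guide asks every user to set.
REQUIRED_VARS = ["SHOAZUREACCOUNT", "SHOAZUREKEY"]

def missing_vars(names, env=None):
    """Return the names that are unset or empty in the given environment.

    Hypothetical helper for illustration; not part of the Sho Azure package.
    """
    if env is None:
        env = os.environ
    return [name for name in names if not env.get(name)]

missing = missing_vars(REQUIRED_VARS)
if missing:
    print("Not set: " + ", ".join(missing))
else:
    print("Both storage variables are set.")
```

Remember that a console started before the variables were set will not see them; restart the console after changing them.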

Copying Sho to the Cloud

Now that you have created the storage service and set your SHOAZUREACCOUNT and SHOAZUREKEY environment variables, you’ll need to copy a Sho installation to the cloud. The simplest way to do that is to run the Sho console and run the uploadSho() function from the Azure package:

Sho 2.0.4 on IronPython 2.6.1 (), .NET 4.0.30319.1 and MKL 10.3.
Includes parts of the Intel Math Kernel Library for Windows.
>>> azureutils.uploadSho()
 

This may take quite a long time. When it’s done, you’re ready to create the hosted service and upload the parameter sweeper.

Creating the Hosted service

Next, we’ll create the Azure compute service that the actual code will run on. In preparation for this, we must first load the ServiceConfiguration.cscfg file from the Publish directory in the Azure package into a text editor. Then replace <ACCOUNTNAME> and <ACCOUNTKEY> with your account name and key in the highlighted lines:

<?xml version="1.0"?>
<ServiceConfiguration serviceName="ParameterSweep" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
<Role name="ParamSweepWorkerRole">
<Instances count="1" />
<ConfigurationSettings>
<Setting name="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName=<ACCOUNTNAME>;AccountKey=<ACCOUNTKEY>" />
<Setting name="DiagnosticsConnectionString" value="DefaultEndpointsProtocol=https;AccountName=<ACCOUNTNAME>;AccountKey=<ACCOUNTKEY>" />
<Setting name="ResultsContainerName" value="paramsweepresults" />
<Setting name="JobQueueName" value="paramsweepqueue" />
</ConfigurationSettings>
<Certificates>
</Certificates>
</Role>
</ServiceConfiguration>
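
If you prefer not to edit the file by hand, the substitution can be sketched in a few lines of Python. This is a minimal illustration that works on the text of the file only; the function names are invented here, the key shown is a dummy value, and reading and writing the actual ServiceConfiguration.cscfg file is left to you.

```python
def fill_service_config(template_text, account_name, account_key):
    """Substitute the two placeholders used in ServiceConfiguration.cscfg.

    Hypothetical helper for illustration; not part of the Sho Azure package.
    """
    filled = template_text.replace("<ACCOUNTNAME>", account_name)
    filled = filled.replace("<ACCOUNTKEY>", account_key)
    return filled

def has_unfilled_placeholders(text):
    """True if either placeholder is still present in the text."""
    return "<ACCOUNTNAME>" in text or "<ACCOUNTKEY>" in text

# One of the two settings from the file above, used as sample input.
sample = ('<Setting name="DataConnectionString" '
          'value="DefaultEndpointsProtocol=https;'
          'AccountName=<ACCOUNTNAME>;AccountKey=<ACCOUNTKEY>" />')

# "EXAMPLEKEY==" is a dummy value, not a real access key.
filled = fill_service_config(sample, "myshodata", "EXAMPLEKEY==")
print(filled)
```

Both the DataConnectionString and the DiagnosticsConnectionString lines contain the placeholders, so checking the result with has_unfilled_placeholders() before uploading catches a half-edited file.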
 

Now that you have edited the configuration file to be appropriate for your deployment, let’s create a new hosted compute service and upload the deployment. First, click on the “New Hosted Service” button from the Windows Azure portal:

This will bring up the “Create a new Hosted Service” dialog box. In this screen, you’ll have to make up a name and URL for your hosted service.

In addition, you can upload the deployment package (and the corresponding configuration file that you edited above) now. Choose “Browse Locally…” and point the file dialog at the .cspkg and .cscfg files in the Publish directory inside your Azure package directory (i.e., SHODIR/packages/Azure/Publish, where SHODIR is your Sho installation directory). Note that if you keep the "Start after successful deployment" box checked, the deployment will start running in the cloud and you will start accruing usage charges!

Client-side installation and setup

To install the package, first copy the package directory to your Sho packages directory.

Then, set the following environment variables to the values you noted above:

SHOAZUREACCOUNT

The name of the storage account you created (myshodata in the example in this guide)

SHOAZUREKEY

The primary access key for the storage service

 

The following are optional, and are needed if you wish to do basic administration (changing the number of active instances) from the Sho console instead of the Azure web portal:

SHOAZUREHOST

The name of the hosting account

SHOAZUREDEPLOYSLOT

The deployment slot you used for the Sho worker roles (either “staging” or “deployment”)

SHOAZURECERTNAME

The name of the certificate you created (e.g., “Windows Azure Authentication Certificate”)

SHOAZURESUBID

The subscription ID for your Azure subscription. This can be found on the Windows Azure portal screen.

 

Usage

Here is a brief example of how to use the package so you can test your installation. Please read the user guide for more complete information.

>>> paramsweep.addDemoDir()
>>> import paramsweeptest
>>> session = paramsweep.run(paramsweeptest.add, 100, [1,2,3,4,5])

 

The run function returns a parameter sweep session object:

>>> print session
ParamSweepSession('2c574185-9cd3-4ba1-a6f3-9668424374a8')
 

The session runs asynchronously in the cloud. We can check whether it has finished by calling isDone(). If it has, we can call getResults() to get the results from each of the jobs. Alternatively, we can call waitResults(), which blocks until the session finishes and then returns the result values.

>>> print session.isDone()
True
>>> print session.getResults()
[101, 102, 103, 104, 105]
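
To make the results above easier to interpret, here is a local sketch of what the sweep appears to compute. It rests on two assumptions that this guide does not state outright: that paramsweeptest.add returns the sum of its two arguments, and that the list argument is the one being swept while 100 is held fixed. Both are consistent with the output shown, but see the user guide for the actual rules. The function local_sweep is made up for this illustration and runs entirely on the local machine.

```python
def add(a, b):
    # Stand-in for paramsweeptest.add; assumed to return the sum of its arguments.
    return a + b

def local_sweep(func, fixed, values):
    """Call func once per swept value, in order, on the local machine.

    Hypothetical helper for illustration; it does not use Azure at all.
    """
    return [func(fixed, v) for v in values]

results = local_sweep(add, 100, [1, 2, 3, 4, 5])
print(results)  # [101, 102, 103, 104, 105]
```

Running a small sweep locally like this first is a cheap way to check your function before paying for cloud time.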
 

It’s a good idea to delete the server-side data associated with this session when you’re done.

>>> session.cleanup()
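
The check, fetch, and cleanup steps can be combined into one helper, sketched below. It calls only the session methods shown in this guide (isDone(), getResults(), and cleanup()); the helper name, the polling interval, and the timeout are illustrative choices, and FakeSession is a hypothetical stand-in so the sketch can run without an Azure account.

```python
import time

def collect_and_cleanup(session, poll_seconds=5.0, timeout_seconds=600.0):
    """Poll session.isDone(), fetch the results, then delete server-side data.

    Hypothetical helper for illustration. On timeout it raises without
    calling cleanup(), so the session can still be queried later.
    """
    deadline = time.time() + timeout_seconds
    while not session.isDone():
        if time.time() >= deadline:
            raise RuntimeError("sweep did not finish within the timeout")
        time.sleep(poll_seconds)
    results = session.getResults()
    session.cleanup()
    return results


class FakeSession(object):
    """Hypothetical stand-in for a parameter sweep session object."""
    def __init__(self, polls_until_done, results):
        self._remaining = polls_until_done
        self._results = results
        self.cleaned = False

    def isDone(self):
        # Reports False for the first polls_until_done calls, then True.
        self._remaining -= 1
        return self._remaining < 0

    def getResults(self):
        return self._results

    def cleanup(self):
        self.cleaned = True


fake = FakeSession(2, [101, 102, 103, 104, 105])
print(collect_and_cleanup(fake, poll_seconds=0.01))
```

With a real session you would pass the object returned by paramsweep.run() instead of FakeSession. Cleanup is deliberately skipped on timeout, since deleting the server-side data at that point would discard results that may still arrive.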
 

Service utilities

The Sho Azure package includes a function that allows you to change the number of Azure worker role instances that are running in the cloud, waiting for jobs. If you want to use this function instead of using the web portal, you’ll have to add a certificate to your service.

First, you’ll need to generate a new self-signed certificate in a .cer file. The following blog post describes one way to do this:

http://blogs.msdn.com/b/ericnel/archive/2009/09/22/using-iis-to-generate-a-x509-certificate-for-use-with-the-windows-azure-service-management-api-step-by-step.aspx

If you are more comfortable with the command line, another method is described here:

http://consultingblogs.emc.com/gracemollison/archive/2010/02/19/creating-and-using-self-signed-certificates-for-use-with-azure-service-management-api.aspx

To add the certificate to your Azure subscription, select “Management Certificates” in the sidebar and then click the “Add Certificate” button:

Then, click the Browse button in the dialog box and find your .cer file:

Using the API from Sho

In order to change the number of deployment instances, you can call the changeNumWorkerRoles() function:

>>> req = azureservice.changeNumWorkerRoles(num)

This function returns a request ID, which you can use to check on the status of the operation. It typically takes several minutes to change the number of instances.

>>> azureservice.getOperationStatus(req)

Setup Summary

To use the parameter sweep facilities for running jobs in the cloud, your users will need to set the following environment variables on the client computers:

SHOAZUREACCOUNT

The name of the storage account you created (myshodata in the example in this guide)

SHOAZUREKEY

The primary access key for the storage service

 

For using the Sho service utilities described in this document, you will also need to set the following environment variables:

SHOAZUREHOST

The name of the hosting account

SHOAZUREDEPLOYSLOT

The deployment slot you used for the Sho worker roles (either “staging” or “deployment”)

SHOAZURECERTNAME

The name of the certificate you created (e.g., “Windows Azure Authentication Certificate”)

SHOAZURESUBID

The subscription ID for your Azure subscription. This can be found on the Windows Azure portal screen.

 

Thanks! And please send feedback to shofeedback@microsoft.com