Windows Azure Affinity Groups

Note: The following content is taken from TechNet (I like this explanation of Affinity Groups and am posting it here for my own benefit)

Sometimes people wonder about Affinity Groups in Windows Azure and their benefits, since for some this is nothing more than a way to logically group Compute and Storage.

So in order to explain this we need to dig a little deeper into how Windows Azure Data Centers are built. Basically, Windows Azure Data Centers are assembled from “Containers”, each full of clusters and racks. Each of those Containers hosts specific services, for example Compute and Storage, SQL Azure, Service Bus, Access Control Service, and so on. Those Containers are spread across the data center, so each time we subscribe to or deploy a service, the Fabric Controller (which chooses, based on our solution configuration, where the services should be deployed) can end up spreading our services across the data center. The Fabric Controller manages the nodes of a cluster. A higher-level process controls the choice of data center and cluster; this process is called RDFE, which is short for Red Dog Front End (Red Dog was the Azure project’s code name). The Azure portal web site and web service talk directly to RDFE.

Now, one thing we need to be very careful about is where we create the several services: if we place the Hosted Service in North Central US and the Storage Account in South Central US, that is bad both in terms of latency and of cost, since we are charged whenever traffic leaves the Data Center. But even if we choose the same Data Center, nothing guarantees that the services will be close together; one can be placed at one end of the Data Center and the other at the opposite end. Choosing the same Data Center removes the extra cost and improves the latency, but it would be even better to go a little further and place them in the same Container, or even in the same Cluster. The answer to this is Affinity Groups.

Basically, an Affinity Group is a way to tell the Fabric Controller that those two elements, Compute and Storage, should always be kept together and close to one another. When the Fabric Controller searches for the best-suited Container to deploy those services, it looks for one where it can deploy both in the same Cluster, making them as close as possible, reducing latency and increasing performance.

So in summary, Affinity Groups provide us with:

  • Aggregation: they aggregate our Compute and Storage services and give the Fabric Controller the information it needs to keep them in the same Data Center, and even more, in the same Cluster.
  • Lower latency: by telling the Fabric Controller that the services should be kept together, we get much better latency when accessing Storage from the Compute nodes, which makes a real difference in a highly available environment.
  • Lower costs: using them removes the possibility of ending up with one service in one Data Center and the other in another, whether because we chose wrongly or because we picked one of the “Anywhere” options for both.

Based on this, don’t forget to use Affinity Groups right from the start, since it is NOT possible to move Compute or Storage into an Affinity Group after they have been deployed.

To finalize: you may be thinking that this would be very interesting for other services as well, but no other services are able to take advantage of this affinity, since none of them share the same Container.


Export and Import BACPAC using command line

SQLPackage.exe can be used for the following tasks:

  • Extract: Creates a database snapshot (.dacpac) file from an on-premises (local) SQL Server or Windows Azure SQL Database.
  • Export: Exports an on-premises (local) database – including database schema and user data – from SQL Server or Windows Azure SQL Database to a BACPAC package (.bacpac file).
  • Import: Imports the schema and table data from a BACPAC package into a new user database in an instance of SQL Server or Windows Azure SQL Database.
  • Publish: Incrementally updates a database schema to match the schema of a source .dacpac file. If the database does not exist on the server, the publish operation will create it. Otherwise, an existing database will be updated.
  • DeployReport: Creates an XML report of the changes that would be made by a publish action.
  • DriftReport: Creates an XML report of the changes that have been made to a registered database since it was last registered.
  • Script: Creates a Transact-SQL incremental update script that updates the schema of a target to match the schema of a source.
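The report actions can be sketched the same way as the export and import commands below. This fragment only assembles a DeployReport command line and prints it (run the printed command from a Windows prompt); the source .dacpac, server name, database name, and output path are placeholders, and it assumes the /op (OutputPath) parameter for the report file. Windows paths are written with forward slashes for the shell sketch:

```shell
#!/bin/sh
# Assemble a sqlpackage DeployReport command line (placeholder names/paths).
SQLPACKAGE="C:/Program Files (x86)/Microsoft SQL Server/110/DAC/bin/sqlpackage.exe"
SOURCE_DACPAC="C:/DataExtraction/Tailspintoys.dacpac"   # /sf  - source .dacpac
TARGET_SERVER="mysqlserver"                             # /tsn - target server
TARGET_DB="Tailspintoys"                                # /tdn - target database
REPORT="C:/DataExtraction/changes.xml"                  # /op  - XML report output (assumed parameter)

# Build the full command line and show it.
CMD="\"$SQLPACKAGE\" /a:DeployReport /sf:$SOURCE_DACPAC /tsn:$TARGET_SERVER /tdn:$TARGET_DB /op:$REPORT"
echo "$CMD"
```

Running the report first is a cheap way to see what a Publish would change before touching the target database.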

Let’s see some examples

Export the local database to a bacpac file

"C:\Program Files (x86)\Microsoft SQL Server\110\DAC\bin\sqlpackage.exe" /a:Export /ssn:mysqlserver /sdn:Tailspintoys /tf:C:\DataExtraction\Tailspintoys.bacpac

ssn – Specifies the name of the server that hosts the database.
sdn – Defines the name of the source database.
tf – Specifies a disk file path where the .dacpac or .bacpac file will be written.
Since no username or password is specified, the command runs as the logged-in user with Integrated Windows Authentication, so there is no need to supply credentials.
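The export above can be parameterized so the same script serves several databases. This is only a sketch: it builds the command string from variables and prints it (paste the printed line into a command prompt to run it for real); the server, database, and file names are the ones from the example, and Windows paths are written with forward slashes for the shell sketch:

```shell
#!/bin/sh
# Build the sqlpackage Export command line from variables.
SQLPACKAGE="C:/Program Files (x86)/Microsoft SQL Server/110/DAC/bin/sqlpackage.exe"
SOURCE_SERVER="mysqlserver"                           # /ssn - source server
SOURCE_DB="Tailspintoys"                              # /sdn - source database
TARGET_FILE="C:/DataExtraction/${SOURCE_DB}.bacpac"   # /tf  - output .bacpac, named after the database

# Assemble and display the command; no credentials means Integrated Windows Authentication.
CMD="\"$SQLPACKAGE\" /a:Export /ssn:$SOURCE_SERVER /sdn:$SOURCE_DB /tf:$TARGET_FILE"
echo "$CMD"
```

Deriving the output file name from the database name keeps exports from different databases from overwriting each other.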

Import the bacpac file to SQLAzure

"C:\Program Files (x86)\Microsoft SQL Server\110\DAC\bin\sqlpackage.exe" /a:Import /sf:C:\DataExtraction\Tailspintoys.bacpac /tsn:cgrd7z8kac.database.windows.net /tdn:Tailspintoys /tu:mysysadmin@cgrd7z8kac /tp:Pa55w0rd

sf – Specifies a source file to be used as the source of the action instead of a database.
tsn – Specifies the name of the server that hosts the target database.
tdn – Specifies the name of the target database.
tu – Specifies the SQL Server user used to access the target database.
tp – Specifies the password used to access the target database.
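The import can be parameterized the same way. Again, this is only a sketch that assembles and prints the command line; the target server DNS name is inferred from the user name in the example (server cgrd7z8kac), the credentials are the placeholder ones from above, and in a real script the password should come from a secret store rather than being hard-coded:

```shell
#!/bin/sh
# Build the sqlpackage Import command line for SQL Azure (placeholder credentials).
SQLPACKAGE="C:/Program Files (x86)/Microsoft SQL Server/110/DAC/bin/sqlpackage.exe"
SOURCE_FILE="C:/DataExtraction/Tailspintoys.bacpac"   # /sf  - source .bacpac
TARGET_SERVER="cgrd7z8kac.database.windows.net"       # /tsn - SQL Azure server (inferred from the login)
TARGET_DB="Tailspintoys"                              # /tdn - target database
TARGET_USER="mysysadmin@cgrd7z8kac"                   # /tu  - SQL login (user@server form for SQL Azure)
TARGET_PASS="Pa55w0rd"                                # /tp  - password; use a secret store in practice

# Assemble and display the command.
CMD="\"$SQLPACKAGE\" /a:Import /sf:$SOURCE_FILE /tsn:$TARGET_SERVER /tdn:$TARGET_DB /tu:$TARGET_USER /tp:$TARGET_PASS"
echo "$CMD"
```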

If you are planning to automate this using a C# console application, there is a code sample available here