I welcome all residents of Habrahabr! All successful new work week! So, we continue the process of migrating the database to use SQL Azure Federations. As you remember,
last time we decided which table and field we would break the database into shards. Let's finally do it!
Migration
So, we will break the database on the table of accounts (Account), since the data stored in it and associated with it logically with each other do not overlap. Since we have a database creation script, we will try to adapt it to use SQL Azure Federations.
We assume that the database has already been created in the Windows Azure Management Portal or through SQL Server Management Studio.
Open the script to create objects in the database.
USE xPenses
GO
IF EXISTS (SELECT name FROM sysobjects where name = N'Operation') DROP TABLE Operation
...
The first thing to remove is the use of a USE operation, since this is one of the main limitations of SQL Azure. One database - one connection. Instead, add requests to create a federation:
-- A database must be selected before executing this statement
CREATE FEDERATION Accounts(AccountId BIGINT RANGE)
GO
USE FEDERATION Accounts(AccountId = 1) WITH RESET, FILTERING = OFF
GO
Please note that if you connect using SSMS to SQL Azure Server, you must select a database from the list to execute the query.
')
Thus, we will create a new federation, the data of which is distributed according to the value of the account identifier (Account ID). Please note that at the moment there are no tables in the database, that is, the AccountId field is not associated with any data set in real tables. The field name may also differ from the field name of the table, which will be distributed.
Here we can see another logical
limitation of SQL Azure Federations . The field over which the distribution will be performed must be of type INT, BIGINT, UNIQUEIDENTIFIER and VARBINARY.
After creating the federation, we need to select the first shard in which we begin to enter data. That is, the shard that stores the data of the first account (AccountId = 1).
See our script further. We need to modify the creation of the table of accounts so that SQL Azure knows that the data of this particular table will be distributed across shards across the Id field.
CREATE TABLE Account (
[Id] INTEGER NOT NULL PRIMARY KEY IDENTITY(1,1),
[EntityId] INTEGER NOT NULL FOREIGN KEY REFERENCES Entity(Id),
[Currency] NVARCHAR(3)
)
Thus, the table creation script will turn into the following:
CREATE TABLE Account (
[Id] BIGINT NOT NULL,
[EntityId] INTEGER NOT NULL FOREIGN KEY REFERENCES Entity(Id),
[Currency] NVARCHAR(3),
CONSTRAINT [PK_Account] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
) FEDERATED ON (AccountId= Id)
So what has changed? The ID field type has become BIGINT. In addition, we lost the ability to automatically generate values ​​for this field when inserting a new record. This is another of the limitations of SQL Azure Federations. However, it remains possible to use the keyword DEFAULT. For example, this will be useful if the type of the field ID is UNIQUEIDENTIFIER. In this case, we can declare a field like this:
[Id] UNIQUEIDENTIFIERNOT NULL DEFAULT NEWID()
Then when inserting new records into the table, we don’t need to specify the ID of the created record. When working with other types, this logic should be implemented at the application level.
The next thing you should pay attention to is the declaration of the main key of the table. We need to explicitly indicate that the key being created will be clustered.
The last thing to do is to indicate using the FEDERATED ON keyword that the table will be federated. We will break the data in it by the ID field.
So, with the creation of a table of accounts we figured out. We go further. As can be seen from the database schema, the table of accounts is the parent of the “credit card” and “bank account” tables.
That is, the BankAccount and CreditCard tables have a foreign key to the Account table. Since the table of accounts is now federated with us, it is impossible to ensure the integrity of the links of one table to another.
This is another limitation of SQL Azure Federations . The tables on which the data is distributed (federated table) cannot be referenced in other tables. That is, it will be necessary to remove all foreign keys from the tables referring to the accounts table (Account).
Thus, the script to create a table, for example, CreditCard, instead of the form:
CREATE TABLE CreditCard (
[Id] INTEGER NOT NULL PRIMARY KEY IDENTITY(1,1),
[AccountId] INTEGER NOT NULL FOREIGN KEY REFERENCES Account(id),
[Type] NVARCHAR(MAX)
CONSTRAINT CreditCardType CHECK (
[Type] = 'Visa'
OR [Type] = 'MasterCard'
OR [Type] = 'JCB'
OR [Type] = 'AmericanExpress'),
[Number] NVARCHAR(MAX)
)
It will look like:
CREATE TABLE CreditCard (
[Id] INTEGER NOT NULL PRIMARY KEY IDENTITY(1,1),
[AccountId] BIGINT NOT NULL,
[Type] NVARCHAR(MAX)
CONSTRAINT CreditCardType CHECK (
[Type] = 'Visa'
OR [Type] = 'MasterCard'
OR [Type] = 'JCB'
OR [Type] = 'AmericanExpress'),
[Number] NVARCHAR(MAX)
)
That is, the integrity of the reference records from the table of credit cards to the table of accounts falls on the shoulders of the application logic.
If you try to execute the database creation script at this stage, then all the tables will be successfully created, but besides the tables there are also two procedures in the database. The first of them - adding a new category does not require changes, since its logic does not go beyond the bounds of a single shard.
But the procedure for adding a new account (AddAccount) requires small changes. So consider the source code of this procedure:
CREATE PROCEDURE AddAccount(
@Name NVARCHAR(MAX),
@Description NVARCHAR(MAX),
@Currency NVARCHAR(3),
@Instrument NVARCHAR(MAX),
@Type NVARCHAR(MAX),
@Number NVARCHAR(MAX)
)
AS
INSERT INTO Entity VALUES (@Name, @Description)
DECLARE @EntityId INTEGER = (SELECT Id FROM Entity WHERE Name = @Name AND Description = @Description)
INSERT INTO Account VALUES (@EntityId, @Currency)
IF (@Instrument = 'BankAccount')
BEGIN
INSERT INTO BankAccount VALUES (
(SELECT Id FROM Account WHERE Account.EntityId = @EntityId),
@Type,
@Number
) END
IF (@Instrument = 'CreditCard')
BEGIN
INSERT INTO CreditCard VALUES (
(SELECT Id FROM Account WHERE Account.EntityId = @EntityId),
@Type,
@Number
) END
GO
In fact, the changes that need to be made are fairly obvious. Since we have lost the ability to automatically generate the value of the ID field for the account, this logic falls on the shoulders of the application logic. That is, we need to make changes to the procedure header:
CREATE PROCEDURE AddAccount(
@AccountId BIGINT,
@Name NVARCHAR(MAX),
@Description NVARCHAR(MAX),
@Currency NVARCHAR(3),
@Instrument NVARCHAR(MAX),
@Type NVARCHAR(MAX),
@Number NVARCHAR(MAX)
)
AS
INSERT INTO Entity VALUES (@Name, @Description)
DECLARE @EntityId INTEGER = (SELECT Id FROM Entity WHERE Name = @Name AND Description = @Description)
INSERT INTO Account VALUES (@AccountId, @EntityId, @Currency)
...
GO
Accordingly, the insertion of records in the account table, instead of the code:
EXEC AddAccount 'Cash', 'Everyday cash account', 'USD', NULL, NULL, NULL
Now it will take on input one more parameter (ID of the account being created):
EXEC AddAccount 1, 'Cash', 'Everyday cash account', 'USD', NULL, NULL, NULL
Probably the next logical question would be the following: is it possible to add the command to use the necessary federation (USE FEDERATION) to the body of the procedure? Since, if the ID of the created account has come to us, then we know with which federation we need to work, which means we can go straight to using the necessary shard:
USE FEDERATION Accounts(AccountId = @AcccountId) WITH RESET, FILTERING = OFF
GO
Unfortunately, if you do this, SSMS will give you an error. The thing is that the AddAccount procedure is stored in a particular shard, which means that we have no possibility of using USE FEDERATION. In addition, the use of USE FEDERATION is generally impossible in procedures. The code for switching federations must be located "level up".
This completes the changes that need to be made in the database creation script. We can execute it without errors. As a result, one federation root and one shard (federation member) will be created.
Sharding
All that is left for us to do is, in fact, produce scaling databases. That is, to separate the data related to one account from the data of another account.
To do this, create a separate script:
-- Scaling out the federation
USE FEDERATION ROOT WITH RESET
GO
ALTER FEDERATION Accounts SPLIT AT (AccountId = 2)
GO
The first thing that makes this script is to use federation root, that is, it works within the xPenses database, because the information about federations (metadata) is stored there.
Then we indicate that we want to split the federation with the name Accounts, starting with the value of the AccountId field equal to 2. That is, the data of the first account remain in the shard already created, and the data of the other accounts are transferred to the next shard. Also note that we do not indicate anywhere that the data is broken down into an account table. We only work with xPenses database metadata!
So, let's execute this command and ... The situation is quite possible when updating the Object Explorer window you will see the following:
Instead of one new shard, we got ... As many as 3! In fact, nothing unusual about it. The thing is that the data stored in the first shard is copied in accordance with the value of AccountId. That is, for an account with an ID of 1, you need to copy the data to one shard, with an ID of 2 to another. Of course it takes time. After SQL Azure redistributes data across shards, we see that we really have a database with federation metadata and two shards.
Now, if we need to allow splitting the data into 3 shards, for example, the account data with an ID of 3 is transferred to a separate shard, we just need to run the following command:
ALTER FEDERATION Accounts SPLIT AT (AccountId = 3)
GO
Conclusion
We have reviewed the migration process of the database creation script using SQL Azure Federations. As you can see, most of the sharp corners are quite simple. However, a fairly large part of the database logic should be transferred “higher”. We stopped at the database level. In real projects, it is highly recommended to start a thorough analysis of the subject area and database architecture before starting the migration to SQL Azure Federations.