Central Subscriber Model Explained

The majority of SQL Server Replication topologies are based on the Central Publisher model, which consists of a single publisher replicating to one or more subscribers.  Another replication model, which is sometimes overlooked, is the Central Subscriber model, which consists of multiple publishers replicating to one subscriber using Transactional Replication.

Central Subscriber Model

The Central Subscriber model is useful for rolling up or consolidating data from multiple sources.  Some examples include:

  • Rolling up inventory from several warehouses into a central server at corporate headquarters.
  • Sending data from remote offices within a company to a central office for business continuity.
  • Consolidating order information to one location for centralized ordering.

Priming the pump

By default, subscriptions are initialized from a snapshot generated by the Snapshot Agent and then applied by the Distribution Agent.  When the snapshot is applied, the article property Action if name is in use defaults to Drop existing object and create a new one, which instructs that the destination table be dropped if it already exists at a subscriber.  This behavior is problematic in the Central Subscriber model because snapshots must be applied from multiple publications.  The first snapshot is applied as expected; however, each subsequent snapshot application wipes out the previously applied snapshot data.

The solution to this problem is horizontal partitioning, static row filters, and setting the Action if name is in use article property to "Delete data. If article has a row filter, delete only data that matches the filter."

Horizontal partitioning

Ideally, published tables in a Central Subscriber topology will be horizontally partitioned.  To horizontally partition the tables to be published, a location-specific column should be added and included as part of a composite primary key.  Consider a table that looks like this:

CREATE TABLE TestTable
(
    ID int IDENTITY(1,1) NOT NULL,
    FirstName varchar(100) NULL,
    CONSTRAINT PK_TestTable_ID PRIMARY KEY CLUSTERED (ID ASC)
)

To horizontally partition TestTable and prepare it for a Central Subscriber configuration at Publisher 1, drop primary key constraint PK_TestTable_ID, add a location-specific column named LocationID with a default value of 1, and add the new composite primary key including the LocationID column.

ALTER TABLE TestTable
DROP CONSTRAINT PK_TestTable_ID
GO
 
ALTER TABLE TestTable
ADD LocationID INT NOT NULL DEFAULT(1)
GO
 
ALTER TABLE TestTable
ADD CONSTRAINT PK_TestTable_ID_LocationID PRIMARY KEY CLUSTERED (ID, LocationID)
GO

Next, to horizontally partition TestTable and prepare it for a Central Subscriber configuration at Publisher 2, the same preparation can be done with a default value of 2 for LocationID.

ALTER TABLE TestTable
DROP CONSTRAINT PK_TestTable_ID
GO
 
ALTER TABLE TestTable
ADD LocationID INT NOT NULL DEFAULT(2)
GO
 
ALTER TABLE TestTable
ADD CONSTRAINT PK_TestTable_ID_LocationID PRIMARY KEY CLUSTERED (ID, LocationID)
GO

Finally, to horizontally partition TestTable and prepare it for a Central Subscriber configuration at Publisher 3, the same preparation can be done with a default value of 3 for LocationID.

ALTER TABLE TestTable
DROP CONSTRAINT PK_TestTable_ID
GO
 
ALTER TABLE TestTable
ADD LocationID INT NOT NULL DEFAULT(3)
GO
 
ALTER TABLE TestTable
ADD CONSTRAINT PK_TestTable_ID_LocationID PRIMARY KEY CLUSTERED (ID, LocationID)
GO

Once the tables are horizontally partitioned, they can be properly published in a Central Subscriber topology by using static row filters, filtering on the LocationID column, and setting the article property Action if name is in use to "Delete data. If article has a row filter, delete only data that matches the filter."

Static row filters

For each article to be published in a Central Subscriber topology, a static row filter should be defined to leverage the Action if name is in use article property appropriately.  A static row filter uses a WHERE clause to select the data to be published.  To publish rows from Publisher 1, specify LocationID = 1 for the filter clause. Likewise, to publish rows from Publisher 2 and Publisher 3, specify LocationID = 2 and LocationID = 3 for the filter clause, respectively.


Action if name is in use

When creating the publications and adding articles, the article property Action if name is in use needs to be set to "Delete data. If article has a row filter, delete only data that matches the filter."  This can be set using the New Publication Wizard Article Properties dialog or by using the replication stored procedure sp_addarticle and specifying a value of delete for the @pre_creation_cmd argument.  This way, when the central subscriber is initialized or reinitialized from multiple publication snapshots, previously applied snapshot data will be preserved since only data matching the filter clause will be deleted.
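
A minimal sketch of the script approach for Publisher 1, assuming a hypothetical publication named Pub_Location1 (for transactional articles, the row filter is then applied with sp_articlefilter and sp_articleview):

-- Add the article with a static row filter; @pre_creation_cmd = N'delete'
-- corresponds to "Delete data. If article has a row filter, delete only
-- data that matches the filter."
EXEC sp_addarticle
    @publication = N'Pub_Location1',
    @article = N'TestTable',
    @source_owner = N'dbo',
    @source_object = N'TestTable',
    @type = N'logbased',
    @filter_clause = N'LocationID = 1',
    @pre_creation_cmd = N'delete'
GO

-- Create the filter and the synchronization view for the filtered article
EXEC sp_articlefilter
    @publication = N'Pub_Location1',
    @article = N'TestTable',
    @filter_name = N'FLTR_TestTable_Location1',
    @filter_clause = N'LocationID = 1'
GO

EXEC sp_articleview
    @publication = N'Pub_Location1',
    @article = N'TestTable',
    @view_name = N'SYNC_TestTable_Location1',
    @filter_clause = N'LocationID = 1'
GO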


The caveat

As we can see, horizontal partitioning requires that tables have a location-specific column added; however, the location-specific column does not necessarily need to be included as part of the primary key at the publication databases.  In addition, it is not a hard requirement that published tables in a Central Subscriber topology be horizontally partitioned.  In some shops, changing a primary key or adding columns is strictly prohibited, in which case I would urge you to take an alternative approach.  If you would like some ideas on implementing a Central Subscriber topology without modifying primary keys or horizontally partitioning publication databases, feel free to get in touch or leave a comment below.

-Brandon Williams

Executing scripts with sp_addscriptexec

Looking for ways to do things more quickly and efficiently, I thought I would talk a little bit about sp_addscriptexec and how it can be leveraged to expedite the process of executing SQL scripts in a replication topology.  By creating and saving a SQL script on a publisher, it can then be distributed to and executed at subscribers deployed via UNC or FTP using sp_addscriptexec.

The syntax for sp_addscriptexec per BOL:

sp_addscriptexec [ @publication = ] publication
[ , [ @scriptfile = ] 'scriptfile' ]
[ , [ @skiperror = ] 'skiperror' ]
[ , [ @publisher = ] 'publisher' ]

The problem with numbers

If you’re dealing with a large number of subscribers, database administration can be tricky.  Tasks such as adding logins and users, granting permissions, maintaining indexes, and managing constraints must be done individually at each node, and deploying all of the scripts can be very time-consuming.  Rather than visit each node, sp_addscriptexec should be used to post the ad-hoc script to all subscribers in the topology, saving valuable time.  Put simply – if you’re visiting each node to execute a script, you’re doing it wrong.

The benefit of using sp_addscriptexec is that the publisher can act as a centralized script repository.  Scripts can be saved to the publisher and executed on demand for subscribers.  This process is quicker and more efficient than copying, saving, and executing scripts directly at each subscriber.  Not only does this save time, but space as well.

Executing scripts

Applying scripts to subscribers from the publisher can be done as follows:

  • Create and test the script
  • Save the script to the publisher
  • Execute sp_addscriptexec at the publisher to apply the script to subscribers
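
For example, a call might look like the following, where the publication name and script path are hypothetical:

-- Post a saved script to all subscribers of the publication
EXEC sp_addscriptexec
    @publication = N'MyPublication',
    @scriptfile = N'C:\Scripts\AddLogins.sql'
GO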

From here, the script will be executed at each subscriber on the next synchronization.  Taking this one step further — to limit a script to run on only certain subscribers, a check for HOST_NAME can be performed in the script.

-- Limit script to Subscriber1, Subscriber2, and Subscriber3
IF HOST_NAME() IN ('Subscriber1', 'Subscriber2', 'Subscriber3')
	-- script goes here

Something to consider

A caveat to be aware of with sp_addscriptexec is that, by default, if the script being executed encounters an error at the subscriber, the replication agent will fail and retry on subsequent synchronizations in an endless loop.  This behavior can be overridden by passing a value of 1 for @skiperror, which instructs the agent to skip errors.  At the very least, SQL scripts should be tested thoroughly at the publisher before calling sp_addscriptexec to apply them to subscribers.

Arithmetic Overflow in sp_MSarticle_validation

This is probably a rare occurrence, but I've noticed that data validation for an article with 10 billion or more rows will fail due to an arithmetic overflow error in sp_MSarticle_validation.

Data validation for a specific article in a transactional publication is done using sp_article_validation, which in turn calls sp_MSarticle_validation.  In the definition of sp_MSarticle_validation, a local variable named @actual_rowcount is defined of type bigint.  Later in sp_MSarticle_validation, a command is built to execute sp_table_validation, and the @expected_rowcount parameter passed in is derived by converting @actual_rowcount to a varchar(10).

The offending statement can be observed in the following code snippet from sp_MSarticle_validation:

select @command = 'exec dbo.sp_table_validation @table = ''' + replace(@destination_table, '''', '''''')  + ''', @expected_rowcount = ' +
	        convert(varchar(10), @actual_rowcount) + ', @expected_checksum = ' +
	        convert(varchar(100), @actual_checksum) + ', @rowcount_only = ' + convert(varchar(5),@rowcount_only) +
	        ', @full_or_fast = ' + convert(varchar(10), @full_or_fast) +
	        ', @shutdown_agent = ' + convert(varchar(10), @shutdown_agent)

Considering @actual_rowcount is a bigint, which can hold values up to 19 digits long, it should be converted to a varchar(19) rather than a varchar(10).  A row count of 10 billion or more requires at least 11 characters, so the convert to varchar(10) raises an arithmetic overflow error, causing validation to fail.
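
The failure is easy to reproduce in isolation:

-- Converting an 11-digit bigint to varchar(10) raises
-- Msg 8115: Arithmetic overflow error converting expression to data type varchar.
DECLARE @actual_rowcount bigint = 10000000000  -- 10 billion

SELECT CONVERT(varchar(10), @actual_rowcount)  -- fails with arithmetic overflow
SELECT CONVERT(varchar(19), @actual_rowcount)  -- succeeds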

If you find yourself needing to validate an article with 10 billion or more rows, please vote this item as important to expedite a fix.

Error when creating a subscription

When creating a subscription in SQL Server, sometimes the following error will occur:

The remote server "%s" does not exist, or has not been designated as a valid Publisher, or you may not have permission to see available Publishers.

The error is pretty straightforward and indicates that the replication agent login used to connect to the Publisher is not a member of the publication access list (PAL).  That would be the Distribution Agent process account for Transactional Replication and the Merge Agent process account for Merge Replication.  Add the agent process account to the PAL and try to create the subscription again.
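
The account can be added to the PAL in the Publication Properties dialog or with sp_grant_publication_access; a quick sketch, using hypothetical names:

-- Run at the Publisher on the publication database
-- (publication and login names are hypothetical)
EXEC sp_grant_publication_access
    @publication = N'MyPublication',
    @login = N'DOMAIN\repl_agent'
GO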


Extending the Publisher to the Cloud

SQL Azure Data Sync

This weekend I decided to set up a SQL Azure sync group to synchronize a SQL Azure database with an on-premises replication publisher.  I was curious whether this would work, as I’ve been unable to find any documentation stating whether or not SQL Azure Data Sync can coexist with SQL Server Replication.  Not only was I able to get this working, but I took it a step further and downloaded the Windows Azure and Windows Phone SDKs — then wrote a quick and dirty Windows Phone app that connects to my SQL Azure sync group hub database and pulls down Customer and related Order data to my mobile device, making it available via a pivot control within the application.

I was able to create a sync group in SQL Azure Data Sync to synchronize the Customer, SalesOrderHeader, and SalesOrderDetail tables in the AdventureWorksLTAZ2008R2 database, which also acted as an on-premises Merge publisher in my local network with 3 subscribers.  The solution provided a sync group hub database containing the Merge publication Customer and related Order data, syncing one way from on-premises to the cloud.  I settled on synchronizing the sync group every 30 minutes and kept existing subscribers syncing with the on-premises publisher on their normal schedules, which was every minute in my test topology.


Windows Phone Application

From here, I created 2 projects — 1 Windows Azure WCF Service Web Role, which allows a Windows Phone app to interact with a SQL Azure database using Entity Framework, and 1 Windows Phone application that uses WCF to connect to the SQL Azure sync group hub database through the service reference and consume the Customer and related Order data through a pivot control.

A couple of screenshots of the app show the View Customer Orders list and the Customer Order detail.

As you can see, the potential for SQL Azure Data Sync is interesting.  Not only can sync groups provide an offsite database storage solution, but when synchronized with an on-premises replication publisher, they can also provide a mechanism for exposing enterprise data to the cloud to be consumed by mobile devices.  SQL Azure Data Sync is currently available for preview and can be found in the Windows Azure Management Portal.  More information can be found in SQL Azure Data Sync on MSDN.

How to let non-admin business users run replication agent jobs

This blog post is basic in nature but highlights a very real requirement that may arise when administering replication.  Occasionally we, as Database Administrators, need to grant non-administrator business users the ability to run replication agent jobs.  Instead of giving the business user the ability to execute, stop, and start all jobs by adding them to a SQL Server Agent fixed database role, it is probably wise to limit the business user to executing only the replication agent job in question.

For example, if we want to give a business user the ability to run only the Snapshot Agent job, we can first create a stored procedure RunSnapshotAgent that runs the Snapshot Agent job, using WITH EXECUTE AS to run the stored procedure as the Snapshot Agent job owner:

CREATE PROCEDURE dbo.RunSnapshotAgent
WITH EXECUTE AS 'snapshotagent_job_owner'
AS
-- sp_start_job lives in msdb, so fully qualify the call
EXEC msdb.dbo.sp_start_job
	@job_name = 'snapshot_agent_job'
GO

Where 'snapshotagent_job_owner' is the Snapshot Agent job owner and 'snapshot_agent_job' is the Snapshot Agent job name.

Next, we can create a role named snapshot_executor, and add the business user to that role:

CREATE ROLE snapshot_executor;
 
EXEC sp_addrolemember
	@rolename = 'snapshot_executor',
	@membername = 'BUSINESSUSER1';
GO

Where ‘BUSINESSUSER1’ is the database user associated with the business user’s login.

Then, grant execute rights for the RunSnapshotAgent stored procedure to the snapshot_executor role to allow the business user to start only the snapshot agent job:

GRANT EXEC ON dbo.RunSnapshotAgent TO snapshot_executor;
GO

After creating the stored procedure, creating the role, and adding the business user to that role, the business user can then execute the RunSnapshotAgent stored procedure to run only the Snapshot Agent:

EXEC dbo.RunSnapshotAgent;
GO

The process could not read file due to OS error 3

When configuring a subscription to a publication in SQL Server, sometimes the agent process cannot read from the snapshot and fails with an error like the following:

The process could not read file X:\Program Files\Microsoft SQL Server\<instance>\MSSQL\repldata\<snapshot>\<snapshot_file> due to OS error 3. (Source: MSSQL_REPL, Error number: MSSQL_REPL20016)
Get help: http://help/MSSQL_REPL20016
The system cannot find the path specified.
(Source: MSSQL_REPL, Error number: MSSQL_REPL3)
Get help: http://help/MSSQL_REPL3

This error can be puzzling at first, especially when the path does indeed exist and the replication agent process accounts have sufficient privileges to access the snapshot folder.

The error occurs because, when using pull subscriptions and/or a remote distributor, you must specify a UNC network share such as \\<computername>\snapshot rather than a local path for the snapshot folder.  To alleviate this issue, make the snapshot folder a UNC network share and update the Distributor Properties to reflect the change.
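
The change can also be scripted; a sketch, assuming the default snapshot folder maps to the working_directory distribution publisher property and using hypothetical server and share names:

-- Run at the Distributor (publisher name and UNC path are hypothetical)
EXEC sp_changedistpublisher
    @publisher = N'MyPublisher',
    @property = N'working_directory',
    @value = N'\\MyDistributor\repldata'
GO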

More Information

Securing the Snapshot Folder

Updatable Subscriptions Removed in SQL Server 2012

After spending some time looking at SQL Server 2012 CTP3, it appears that Transactional Publication with Updatable Subscriptions will officially be removed in SQL Server 2012.  According to Books Online, Microsoft recommends that users switch to Peer-to-Peer Replication instead.  This is a shame considering Peer-to-Peer is an Enterprise-only feature and will cost significantly more than Standard Edition.  On the other hand, we will still be able to achieve updatable subscriptions in Standard Edition via Merge Replication or Bidirectional Transactional Replication.

Here are the screenshots comparing the SQL Server 2008 R2 New Publication Wizard with the SQL Server Denali CTP3 New Publication Wizard:

SQL Server 2008 R2 vs. SQL Server Denali CTP3 - New Publication Wizard

Applying a Snapshot with the -AltSnapshotFolder Parameter

Did you know a snapshot can be applied using the Distribution Agent and Merge Agent -AltSnapshotFolder parameter?  This means that after generating a snapshot, it can be copied to Subscribers manually (using removable media, for instance) and applied locally.  This can be useful with large snapshots and when replicating over slow, unreliable links.

This method of initialization can be performed by adding the -AltSnapshotFolder parameter to the agent job step, by using the agent programming interfaces, or when invoking the agent executables from the command prompt.  For more information on the -AltSnapshotFolder parameter, refer to Replication Distribution Agent and Replication Merge Agent in the MSDN Library.
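
For instance, a Distribution Agent invocation from the command prompt might look something like this (all server, database, and path names are hypothetical):

REM Apply a snapshot that was manually copied to D:\LocalSnapshot
REM at the subscriber (names are placeholders)
DISTRIB.EXE -Publisher MyPublisher -PublisherDB MyPubDB ^
    -Publication MyPublication -Distributor MyDistributor ^
    -Subscriber MySubscriber -SubscriberDB MySubDB ^
    -AltSnapshotFolder "D:\LocalSnapshot"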

Marking FK Constraints NOT FOR REPLICATION

When administering Merge Replication, sometimes we need to mark foreign key constraints NOT FOR REPLICATION.  In some cases, we might require replication agent activity to be treated differently from user activity since we cannot guarantee the order of changes that are replicated.  To be more specific, sometimes parent and child records are sent in different batches, with the children getting sent first, resulting in constraint violations and conflicts.

There are a few scenarios that cause this behavior to occur, and some of the details are covered in the following links.

By marking foreign key constraints NOT FOR REPLICATION, we can prevent constraint violations from occurring when changes are made by replication agents.  With that said, it takes a fair amount of faith to trust that Merge will in fact deliver the parent records in the next batch of changes.  That's why it's wise to take a closer look when this happens and verify that, after dropping and re-adding the constraints NOT FOR REPLICATION, the parent records do arrive.
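
For a single constraint, the change looks like this (table, column, and constraint names are hypothetical):

-- Recreate one foreign key as NOT FOR REPLICATION so the constraint
-- is not enforced for changes made by replication agents
ALTER TABLE dbo.OrderDetail
DROP CONSTRAINT FK_OrderDetail_OrderHeader
GO

ALTER TABLE dbo.OrderDetail
WITH CHECK ADD CONSTRAINT FK_OrderDetail_OrderHeader
FOREIGN KEY (OrderID) REFERENCES dbo.OrderHeader (OrderID)
NOT FOR REPLICATION
GO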

Dropping and adding foreign key constraints NOT FOR REPLICATION can be time-consuming if there are a large number of them.  To speed this process up, I've been using variations of the following SELECT to generate my ALTER statements:

USE AdventureWorks;
GO
-- Generate DROP CONSTRAINT and ADD CONSTRAINT ... NOT FOR REPLICATION
-- statement pairs for every single-column foreign key in the database
SELECT
  Drop_FK = 'ALTER TABLE [' + FK.FKTableSchema +
    '].[' + FK.FKTableName + '] DROP CONSTRAINT [' + FK.FKName + ']; ',
  Add_FK_NFR = 'ALTER TABLE [' + FK.FKTableSchema +
    '].[' + FK.FKTableName +
    '] WITH CHECK ADD CONSTRAINT [' + FK.FKName +
    '] FOREIGN KEY([' + FK.FKTableColumn +
    ']) REFERENCES [' + schema_name(sys.objects.schema_id) +
    '].[' + sys.objects.[name] + ']([' + sys.columns.[name] +
    ']) NOT FOR REPLICATION; '
FROM sys.objects
INNER JOIN sys.columns
  ON (sys.columns.[object_id] = sys.objects.[object_id])
INNER JOIN (
  SELECT
    sys.foreign_keys.[name] AS FKName,
    schema_name(sys.objects.schema_id) AS FKTableSchema,
    sys.objects.[name] AS FKTableName,
    sys.columns.[name] AS FKTableColumn,
    sys.foreign_keys.referenced_object_id AS referenced_object_id,
    sys.foreign_key_columns.referenced_column_id AS referenced_column_id
  FROM sys.foreign_keys
  INNER JOIN sys.foreign_key_columns
    ON (sys.foreign_key_columns.constraint_object_id = sys.foreign_keys.[object_id])
  INNER JOIN sys.objects
    ON (sys.objects.[object_id] = sys.foreign_keys.parent_object_id)
  INNER JOIN sys.columns
    ON (sys.columns.[object_id] = sys.objects.[object_id])
      AND (sys.columns.column_id = sys.foreign_key_columns.parent_column_id)
) FK
  ON (FK.referenced_object_id = sys.objects.[object_id])
    AND (FK.referenced_column_id = sys.columns.column_id)
WHERE (sys.objects.[type] = 'U')
  AND (sys.objects.is_ms_shipped = 0)
  AND (sys.objects.[name] NOT IN ('sysdiagrams'))


This has been tested on SQL Server 2008 R2 against the AdventureWorks database.  It does not handle composite foreign keys, among other things, but should provide a good starting point.  Please verify your constraints are defined correctly before deploying to production.  If you have any questions, feel free to get in touch.