Central Subscriber Model Explained

The majority of SQL Server Replication topologies are based on the Central Publisher model, which consists of a single publisher replicating to one or more subscribers.  Another replication model, which is sometimes overlooked, is the Central Subscriber model, which consists of multiple publishers replicating to one subscriber using Transactional Replication.

Central Subscriber Model

The Central Subscriber model is useful for rolling up or consolidating data from multiple sources.  Some examples include:

  • Rolling up inventory from several warehouses into a central server at corporate headquarters.
  • Sending data from remote offices within a company to a central office for business continuity.
  • Consolidating order information to one location for centralized ordering.

Priming the pump

By default, subscriptions are initialized from a snapshot generated by the Snapshot Agent and then applied by the Distribution Agent.  When the snapshot is applied, the article property Action if name is in use defaults to Drop existing object and create a new one, which instructs that the destination table be dropped if it already exists at a subscriber.  This behavior is problematic in the Central Subscriber model, where snapshots from multiple publications must be applied to the same subscriber.  The first snapshot is applied as expected; however, each subsequent snapshot application wipes out the previously applied snapshot data.

The solution to this problem is horizontal partitioning, static row filters, and setting the Action if name is in use article property to Delete data. If article has a row filter, delete only data that matches the filter.

Horizontal partitioning

Ideally, published tables in a Central Subscriber topology will be horizontally partitioned.  In order to horizontally partition the tables to be published, a location-specific column should be added and included as a part of a composite primary key.  Consider a table that looks like this:

CREATE TABLE TestTable
(
    ID int IDENTITY(1,1) NOT NULL,
    FirstName varchar(100) NULL,
    CONSTRAINT PK_TestTable_ID PRIMARY KEY CLUSTERED (ID ASC)
)

To horizontally partition TestTable and prepare it for a Central Subscriber configuration at Publisher 1, drop primary key constraint PK_TestTable_ID, add a location-specific column named LocationID with a default value of 1, and add the new composite primary key including the LocationID column.

ALTER TABLE TestTable
DROP CONSTRAINT PK_TestTable_ID
GO
 
ALTER TABLE TestTable
ADD LocationID INT NOT NULL DEFAULT(1)
GO
 
ALTER TABLE TestTable
ADD CONSTRAINT PK_TestTable_ID_LocationID PRIMARY KEY CLUSTERED (ID, LocationID)
GO

Next, to horizontally partition TestTable and prepare it for a Central Subscriber configuration at Publisher 2, the same preparation can be done with a default value of 2 for LocationID.

ALTER TABLE TestTable
DROP CONSTRAINT PK_TestTable_ID
GO
 
ALTER TABLE TestTable
ADD LocationID INT NOT NULL DEFAULT(2)
GO
 
ALTER TABLE TestTable
ADD CONSTRAINT PK_TestTable_ID_LocationID PRIMARY KEY CLUSTERED (ID, LocationID)
GO

Finally, to horizontally partition TestTable and prepare it for a Central Subscriber configuration at Publisher 3, the same preparation can be done with a default value of 3 for LocationID.

ALTER TABLE TestTable
DROP CONSTRAINT PK_TestTable_ID
GO
 
ALTER TABLE TestTable
ADD LocationID INT NOT NULL DEFAULT(3)
GO
 
ALTER TABLE TestTable
ADD CONSTRAINT PK_TestTable_ID_LocationID PRIMARY KEY CLUSTERED (ID, LocationID)
GO

Once the tables are horizontally partitioned, they can be properly published in a Central Subscriber topology by using static row filters, filtering on the LocationID column and setting the article property Action if name is in use to Delete data. If article has a row filter, delete only data that matches the filter.

Static row filters

For each article to be published in a Central Subscriber topology, a static row filter should be defined to leverage the Action if name is in use article property appropriately.  A static row filter uses a WHERE clause to select the data to be published.  To publish rows from Publisher 1, specify LocationID = 1 for the filter clause. Likewise, to publish rows from Publisher 2 and Publisher 3, specify LocationID = 2 and LocationID = 3 for the filter clause, respectively.


Action if name is in use

When creating the publications and adding articles, the article property Action if name is in use needs to be set to Delete data. If article has a row filter, delete only data that matches the filter.  This can be set using the New Publication Wizard Article Properties dialog or by using the replication stored procedure sp_addarticle and specifying a value of delete for the @pre_creation_cmd argument.  This way, when the central subscriber is initialized or reinitialized from multiple publication snapshots, previously applied snapshot data will be preserved since only the data matching the filter clause will be deleted.
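At Publisher 1, for instance, the article definition might look like the following sketch.  The publication, filter, and view names are placeholders; @pre_creation_cmd = N'delete' corresponds to the Delete data option, and sp_articlefilter and sp_articleview generate the objects for the static row filter.

```sql
-- Sketch only: publication, filter, and view names are placeholders.
EXEC sp_addarticle
    @publication = N'TestPublication1',
    @article = N'TestTable',
    @source_owner = N'dbo',
    @source_object = N'TestTable',
    @type = N'logbased',
    @filter_clause = N'LocationID = 1',
    @pre_creation_cmd = N'delete';
GO

EXEC sp_articlefilter
    @publication = N'TestPublication1',
    @article = N'TestTable',
    @filter_name = N'FLTR_TestTable_1',
    @filter_clause = N'LocationID = 1';
GO

EXEC sp_articleview
    @publication = N'TestPublication1',
    @article = N'TestTable',
    @view_name = N'SYNC_TestTable_1',
    @filter_clause = N'LocationID = 1';
GO
```

At Publishers 2 and 3, the same calls would be made with LocationID = 2 and LocationID = 3 in the filter clause.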


The caveat

As we can see, horizontal partitioning requires that tables have a location-specific column added, however, the location-specific column does not necessarily need to be included as a part of the primary key at the publication databases.  In addition, it is not a hard requirement that published tables in a Central Subscriber topology be horizontally partitioned.  In some shops, changing a primary key or adding additional columns is strictly prohibited, in which case I would urge you to take an alternative approach.  If you would like some ideas on implementing a Central Subscriber topology without modifying primary keys or horizontally partitioning publication databases, feel free to get in touch or leave a comment below.

-Brandon Williams (blog | linkedin | twitter)

Implementing a replication agent progress bar

Using Replication Management Objects, SQL Server subscriptions can be synchronized programmatically without using SQL Server Agent or SQL Server Management Studio.  Common uses include:

  • Express edition subscribers
  • Sync from within an application on-demand without an agent job
  • Display agent status information in an application

In a previous post I provided a link to a code sample showing how to synchronize a Merge pull subscription in SQL Server Express using RMO.  Taking this a step further, for this post I will discuss how to implement a Merge Agent progress bar during synchronous execution, handling the MergeSynchronizationAgent.Status event and displaying the results in a Windows Form.

Synchronously synchronize asynchronously

When using the MergeSynchronizationAgent.Synchronize method to synchronize a subscription, it is important to realize that the agent is started synchronously and control remains with the running agent until it completes.  This can make updating UI controls a bit tricky, as UI events can be delayed until the Merge Agent finishes synchronizing, which is not very useful.  The key to making this work smoothly is to use a BackgroundWorker to synchronize the agent on a separate thread asynchronously and report progress back to the main UI thread when the MergeSynchronizationAgent.Status event is raised.

From here the synchronization BackgroundWorker.DoWork event handler can subscribe to the MergeSynchronizationAgent.Status event and start the Merge Agent synchronously for a specified subscription.

Status Event

The MergeSynchronizationAgent.Status event handler reports the Merge Agent progress, passing the StatusEventArgs PercentCompleted and Message properties back to the main UI thread, where they are handled by the synchronization BackgroundWorker.ProgressChanged event handler.

ProgressChanged Event

Finally, the synchronization BackgroundWorker.ProgressChanged event handler smoothly updates the progress bar and text box controls according to the Merge Agent status.

Sample

This code sample can be downloaded from the MSDN Code Gallery.

If you have any questions about the sample, or would like help integrating this into your application, feel free to ping me or leave a comment below.

-Brandon Williams (blog | linkedin | twitter)

Synchronizing subscriptions in SQL Server Express

I was recently asked about synchronizing a Merge pull subscription in SQL Server Express using RMO, and what to do about the lack of a SQL Server Agent.  Since SQL Server Express does not include the SQL Server Agent, pull subscriptions must be synchronized by some other means.  The problem becomes evident when trying to open the View Synchronization Status dialog for a pull subscription in SQL Server Express.

View Synchronization Status

The View Synchronization Status dialog depends on a SQL Server Agent job, which does not exist in SQL Server Express.  To compensate for the lack of a SQL Server Agent, SQL Server Express Merge pull subscriptions can instead be synchronized by executing batch scripts, using Windows Synchronization Manager, or using Replication Management Objects (RMO).  Plenty of options are available, and this is where it really pays to know the different ways a subscription can be synchronized when tasked with an Express subscriber.

Batch script

The Merge Agent (replmerg.exe) executable can be run from the command line on-demand or from a batch script as a scheduled task.  To run from the command line, execute replmerg.exe from the COM folder.

replmerg.exe

Likewise, this can be saved as a batch file and run from Task Scheduler on a schedule.  This option alone provides a pretty good replacement for the SQL Server Agent when synchronizing subscriptions in Express.
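As a sketch, a pull subscription sync from the command line might look like the following.  All server, database, and publication names are placeholders, and the COM folder path varies by SQL Server version; -SubscriptionType 1 indicates a pull subscription.

```
"C:\Program Files\Microsoft SQL Server\100\COM\replmerg.exe" ^
 -Publisher "PUBSERVER" -PublisherDB "PubDB" -Publication "MergePub" ^
 -Subscriber "SUBSERVER\SQLEXPRESS" -SubscriberDB "SubDB" ^
 -Distributor "PUBSERVER" -SubscriptionType 1 ^
 -PublisherSecurityMode 1 -DistributorSecurityMode 1 -SubscriberSecurityMode 1
```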

Windows Synchronization Manager

Another option to synchronize pull subscriptions in SQL Server Express is Windows Synchronization Manager, or Sync Center.  Sync Center provides options for setting and viewing sync partnerships, setting subscription properties and sync schedules, and viewing sync results and conflicts.  This tool is ideal for having non-technical users synchronize SQL Server Express pull subscriptions on-demand as it offers a user-friendly interface.  Synchronization schedules can also be configured as needed.


RMO

For all the developers, Replication Management Objects (RMO) can be used to synchronize SQL Server Express Merge pull subscriptions through managed code.  The RMO MergeSynchronizationAgent class exposes a Synchronize method that can be used to synchronize a subscription without an agent job, provided the required agent properties are supplied.  A code sample demonstrating how to synchronize a SQL Server Express Merge pull subscription using RMO can be downloaded from the MSDN Code Gallery.

So

As we can see, there are options for synchronizing pull subscriptions in SQL Server Express.  That is because Replication was built with SQL Server Express subscribers in mind.  Pull subscriptions can be synchronized using batch scripts, Windows Synchronization Manager, and RMO, which should be sufficient to synchronize the data.  If you happen to know of another way to synchronize SQL Server Express pull subscriptions and would like to share, feel free to leave a comment below.

Executing scripts with sp_addscriptexec

Looking for ways to do things more quickly and efficiently, I thought I would talk a little bit about sp_addscriptexec and how it can be leveraged to expedite the process of executing SQL scripts in a replication topology.  By creating and saving a SQL script on a publisher, it can then be distributed to and executed on UNC/FTP deployed subscribers using sp_addscriptexec.

The syntax for sp_addscriptexec, per BOL:

sp_addscriptexec [ @publication = ] publication
[ , [ @scriptfile = ] 'scriptfile' ]
[ , [ @skiperror = ] 'skiperror' ]
[ , [ @publisher = ] 'publisher' ]

The problem with numbers

If you’re dealing with a large number of subscribers, database administration can be tricky.  Tasks such as adding logins and users, granting permissions, maintaining indexes, and managing constraints must be done individually at each node, and deploying all of the scripts can be very time consuming.  Rather than visit each node, sp_addscriptexec should be used to post the ad-hoc script to all subscribers in the topology, saving valuable time.  Put simply – if you’re visiting each node to execute a script, you’re doing it wrong.

The benefit of using sp_addscriptexec is that the publisher can act as a centralized script repository.  Scripts can be saved to the publisher and executed on demand at subscribers.  This process is quicker and more efficient than copying, saving, and executing scripts directly at each subscriber.  Not only does this save time, but space as well.

Executing scripts

To apply scripts to subscribers from the publisher:

  • Create and test the script
  • Save the script to the publisher
  • Execute sp_addscriptexec at the publisher to apply the script to subscribers
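Putting the steps together, a call might look like this sketch, with a placeholder publication name and script path:

```sql
-- Run at the Publisher, in the publication database.
-- Publication name and script path are placeholders.
EXEC sp_addscriptexec
    @publication = N'TestPublication',
    @scriptfile = N'C:\Scripts\AddLoginsAndUsers.sql',
    @skiperror = 1;  -- skip script errors at subscribers instead of retrying
```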

From here, the script will be executed at each subscriber on the next synchronization.  Taking this one step further, to limit a script to run on only certain subscribers, a check for HOST_NAME can be performed in the script.

-- Limit script to Subscriber1, Subscriber2, and Subscriber3
IF HOST_NAME() IN ('Subscriber1', 'Subscriber2', 'Subscriber3')
	-- script goes here

Something to consider

A caveat with sp_addscriptexec is that, by default, if the script being executed encounters an error at the subscriber, the replication agent will fail and retry on subsequent syncs in an endless loop, so be aware of this.  This behavior can be overridden by passing a value of 1 for @skiperror, which instructs the agent to skip errors.  At the very least, SQL scripts should be tested thoroughly at the publisher before calling sp_addscriptexec to apply them to subscribers.

Creating merge replication conflict alerts

One of the challenges in working with Merge Replication is handling conflicts, since writes can occur at any node in a topology.  In an earlier post I demonstrated how to handle conflicts using the business logic handler framework.  Now I would like to show how to create Merge Replication conflict alerts based on the Performance Monitor counter SQLServer:Replication Merge Conflicts/sec.

Ideally, proper planning would be done in an application to minimize the chances of conflicts occurring, whether that is achieved through column-level tracking, partitioning the writes, filtering, or some combination.  However, sometimes application changes can introduce conflicts, and having alerts in place is a smart precaution to take.

To set up a conflict alert, first identify the instance name of the Performance Monitor counter SQLServer:Replication Merge Conflicts/sec to monitor.  In the Add Counters dialog in Performance Monitor, the instance name can be identified for the Merge conflicts counter for a publication:

Add Counters Dialog

For this example, the instance name that I will monitor is WS2008R2_1-AdventureWorksLT-TestMergePub1-WS2008R2_1-49.

From here sp_add_alert can be used to specify a merge conflict performance condition using our instance name to alert us when a conflict arises.  This can be done with the following bit of T-SQL:

USE msdb
GO
 
EXEC sp_add_alert @name=N'Merge Conflict Alert',
        @message_id=0,
        @severity=0,
        @enabled=1,
        @delay_between_responses=0,
        @include_event_description_in=0,
        @category_name=N'[Uncategorized]',
        @performance_condition=N'SQLServer:Replication Merge|Conflicts/sec|WS2008R2_1-AdventureWorksLT-TestMergePub1-WS2008R2_1-49|>|0'

This can also be modified to raise alerts only when a certain threshold is met to suit your needs.  An alert response can be set to send an email when the conflict performance condition is met which would give a much needed heads-up when things start to go awry.  I hope this provides a technique to detect precisely when conflicts occur and a head start on tracking down the culprit.
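As a sketch, an operator and e-mail notification could be attached to the alert like this.  The operator name and address are placeholders, and Database Mail must already be configured on the server:

```sql
USE msdb
GO

-- Placeholder operator; point the address at your DBA team.
EXEC sp_add_operator
    @name = N'ReplicationDBA',
    @enabled = 1,
    @email_address = N'dba@example.com';
GO

-- Send an e-mail when the Merge Conflict Alert fires.
EXEC sp_add_notification
    @alert_name = N'Merge Conflict Alert',
    @operator_name = N'ReplicationDBA',
    @notification_method = 1;  -- 1 = e-mail
GO
```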

Error when creating a subscription

When creating a subscription in SQL Server sometimes the following error will occur:

The remote server "%s" does not exist, or has not been designated as a valid Publisher, or you may not have permission to see available Publishers.

The error is pretty straightforward and indicates that the replication agent login used to connect to the Publisher is not a member of the publication access list (PAL).  That would be the Distribution Agent process account for Transactional Replication and the Merge Agent process account for Merge Replication.  Add the agent process account to the PAL and try to create the subscription again.

Publication Access List
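Adding the account can be done in the publication properties dialog shown above, or with T-SQL at the Publisher on the publication database.  A sketch, with placeholder publication and login names:

```sql
-- Add the agent process account to the publication access list.
EXEC sp_grant_publication_access
    @publication = N'TestPublication',
    @login = N'DOMAIN\ReplAgentAccount';

-- List current PAL members to verify.
EXEC sp_help_publication_access
    @publication = N'TestPublication';
```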

Merge Business Logic Handler Conflict Resolver

Introduction

This example demonstrates how to implement a business logic handler for a Merge article to perform custom resolution of conflicting update data changes that occur at the Publisher and Subscribers.

Building the Sample

This sample was built using Visual Studio 2010.

Description

With Merge Replication you can utilize the business logic handler framework to write a managed code assembly that gets executed during the merge synchronization process.  The assembly can respond to a variety of replication events including data changes, conflicts, and errors.

For this example I will demonstrate how to implement a business logic handler for a Merge article that handles update conflicts.  In order to do this, we will override the UpdateConflictsHandler method and make use of the ADO.NET data sets that are provided by the merge process to not only mimic, but extend, the functionality included in the Microsoft SQL Server Priority Column Resolver.

The Microsoft SQL Server Priority Column Resolver uses a single column to determine the conflict winner.  Basically, whichever of the conflicting rows at the Publisher or Subscriber has the higher value in that column wins the conflict.  This is limited in that only one column can be incorporated into the equation.  This example will extend this functionality to incorporate two columns by making use of the business logic handler framework.

Code Snippet

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;
using System.Linq;
using System.Text;
 
using Microsoft.SqlServer.Replication.BusinessLogicSupport;
 
namespace BLH_CustomConflictResolver
{
    public class SalesOrderHeaderBusinessLogicHandler : Microsoft.SqlServer.Replication.BusinessLogicSupport.BusinessLogicModule
    {
        // Variables to hold server names.
        private string publisherName;
        private string subscriberName;
 
        public SalesOrderHeaderBusinessLogicHandler()
        {
        }
 
        // Implement the Initialize method to get publication
        // and subscription information.
        public override void Initialize(string publisher, string subscriber, string distributor,
            string publisherDB, string subscriberDB, string articleName)
        {
            // Set the Publisher and Subscriber names
            publisherName = publisher;
            subscriberName = subscriber;
        }
 
        // Declare what types of row changes, conflicts, or errors to handle
        override public ChangeStates HandledChangeStates
        {
            get
            {
                // We are handling update conflicts for this example
                return ChangeStates.UpdateConflicts;
            }
        }
 
        // This handler mimics the Microsoft SQL Server Priority Column Resolver but extends its functionality to incorporate 2 columns
        // in the resolution process.
        public override ActionOnUpdateConflict UpdateConflictsHandler(DataSet publisherDataSet, DataSet subscriberDataSet, ref DataSet customDataSet,
                ref ConflictLogType conflictLogType, ref string customConflictMessage, ref int historyLogLevel, ref string historyLogMessage)
        {
            // Priority column #1
            int publisherRevisionNumber = Int32.Parse(publisherDataSet.Tables[0].Rows[0]["RevisionNumber"].ToString());
            int subscriberRevisionNumber = Int32.Parse(subscriberDataSet.Tables[0].Rows[0]["RevisionNumber"].ToString());
 
            // Priority column #2
            DateTime publisherModifiedDate = DateTime.Parse(publisherDataSet.Tables[0].Rows[0]["ModifiedDate"].ToString());
            DateTime subscriberModifiedDate = DateTime.Parse(subscriberDataSet.Tables[0].Rows[0]["ModifiedDate"].ToString());
 
            // Check priority for column #1
            if (publisherRevisionNumber > subscriberRevisionNumber)
            {
                customDataSet = publisherDataSet.Copy();
            }
            else if (subscriberRevisionNumber > publisherRevisionNumber)
            {
                customDataSet = subscriberDataSet.Copy();
            }
            else if (publisherRevisionNumber == subscriberRevisionNumber)
            {
                // Priority column #1 for publisher and subscriber are equal
                // Check priority for column #2
                if (publisherModifiedDate > subscriberModifiedDate)
                {
                    customDataSet = publisherDataSet.Copy();
                }
                else
                {
                    customDataSet = subscriberDataSet.Copy();
                }
            }
 
            return ActionOnUpdateConflict.AcceptCustomConflictData;
        }
    }
}

Downloading the Sample

This sample can be downloaded from the MSDN Code Gallery.

More Information

For more information on executing business logic during merge synchronization, see Executing Business Logic During Merge Synchronization in the MSDN Library.

Applying a Snapshot with the -AltSnapshotFolder Parameter

Did you know a snapshot can be applied using the Distribution and Merge Agent -AltSnapshotFolder parameter?  This means that after generating a snapshot, it can be copied to Subscribers manually, using removable media for instance, and applied locally.  This can be useful with large snapshots and when replicating over slow, unreliable links.

This method of initialization can be performed by adding the -AltSnapshotFolder parameter to the agent job step, using the agent programming interfaces, or when invoking the agent executables from the command prompt.  For more information on the -AltSnapshotFolder parameter, refer to Replication Distribution Agent and Replication Merge Agent in the MSDN Library.
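For example, a Merge Agent invocation from the command prompt might include the parameter like this.  This is a sketch only; the server, database, and publication names and the folder path are placeholders:

```
replmerg.exe -Publisher "PUBSERVER" -PublisherDB "PubDB" -Publication "MergePub" ^
 -Subscriber "SUBSERVER" -SubscriberDB "SubDB" -Distributor "PUBSERVER" ^
 -SubscriptionType 1 -AltSnapshotFolder "E:\SnapshotCopy" ^
 -PublisherSecurityMode 1 -DistributorSecurityMode 1 -SubscriberSecurityMode 1
```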

Marking FK Constraints NOT FOR REPLICATION

When administering Merge Replication, sometimes we need to mark foreign key constraints NOT FOR REPLICATION.  In some cases, we might require replication agent activity to be treated differently from user activity since we cannot guarantee the order of changes that are replicated.  To be more specific, sometimes parent and child records are sent in different batches, with the children getting sent first, resulting in constraint violations and conflicts.

There are a few scenarios that cause this behavior to occur and some of the details are covered in the following links.

By marking foreign key constraints NOT FOR REPLICATION, we can prevent constraint violations from occurring when changes are made by replication agents.  With that said, it takes a fair amount of faith to trust that Merge will in fact deliver the parent records in the next batch of changes.  That's why it's wise to take a closer look when this happens and verify that, after dropping and adding the constraints NOT FOR REPLICATION, the parent records do arrive.
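For a single constraint, the drop-and-add looks like this.  The table, column, and constraint names here are illustrative only:

```sql
ALTER TABLE dbo.OrderDetail
DROP CONSTRAINT FK_OrderDetail_OrderHeader;
GO

-- Re-add the constraint so it is enforced for user activity
-- but ignored for replication agent activity.
ALTER TABLE dbo.OrderDetail
WITH CHECK ADD CONSTRAINT FK_OrderDetail_OrderHeader
FOREIGN KEY (OrderHeaderID) REFERENCES dbo.OrderHeader (OrderHeaderID)
NOT FOR REPLICATION;
GO
```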

Dropping and adding foreign key constraints NOT FOR REPLICATION can be time consuming if there are a large number of them.  To speed this process up, I've been using variations of the following SELECT to generate my ALTER statements:

USE AdventureWorks;
GO
SELECT
  Drop_FK = 'ALTER TABLE [' + FK.FKTableSchema +
    '].[' + FK.FKTableName + '] DROP CONSTRAINT [' + FK.FKName + ']; ',
  Add_FK_NFR = 'ALTER TABLE [' + FK.FKTableSchema +
    '].[' + FK.FKTableName +
    '] WITH CHECK ADD CONSTRAINT [' + FK.FKName +
    '] FOREIGN KEY([' + FK.FKTableColumn +
    ']) REFERENCES [' + schema_name(sys.objects.schema_id) +
    '].[' + sys.objects.[name] + ']([' + sys.columns.[name] +
    ']) NOT FOR REPLICATION; '
FROM sys.objects
INNER JOIN sys.columns
  ON (sys.columns.[object_id] = sys.objects.[object_id])
INNER JOIN (
  SELECT
    sys.foreign_keys.[name] AS FKName,
    schema_name(sys.objects.schema_id) AS FKTableSchema,
    sys.objects.[name] AS FKTableName,
    sys.columns.[name] AS FKTableColumn,
    sys.foreign_keys.referenced_object_id AS referenced_object_id,
    sys.foreign_key_columns.referenced_column_id AS referenced_column_id
  FROM sys.foreign_keys
  INNER JOIN sys.foreign_key_columns
    ON (sys.foreign_key_columns.constraint_object_id = sys.foreign_keys.[object_id])
  INNER JOIN sys.objects
    ON (sys.objects.[object_id] = sys.foreign_keys.parent_object_id)
  INNER JOIN sys.columns
    ON (sys.columns.[object_id] = sys.objects.[object_id])
      AND (sys.columns.column_id = sys.foreign_key_columns.parent_column_id)
) FK
  ON (FK.referenced_object_id = sys.objects.[object_id])
    AND (FK.referenced_column_id = sys.columns.column_id)
WHERE (sys.objects.[type] = 'U')
  AND (sys.objects.is_ms_shipped = 0)
  AND (sys.objects.[name] NOT IN ('sysdiagrams'))

 

This has been tested on SQL Server 2008 R2 against the AdventureWorks database.  It does not handle composite foreign keys, among other things, but should provide a good starting point.  Please verify your constraints are defined correctly before deploying to production.  If you have any questions, feel free to get in touch.
