Central Subscriber Model Explained

The majority of SQL Server Replication topologies are based on the Central Publisher model, which consists of a single publisher replicating to one or more subscribers.  Another replication model, which is sometimes overlooked, is the Central Subscriber model, which consists of multiple publishers replicating to a single subscriber using Transactional Replication.

Central Subscriber Model

The Central Subscriber model is useful for rolling up or consolidating data from multiple sources.  Some examples include:

  • Rolling up inventory from several warehouses into a central server at corporate headquarters.
  • Sending data from remote offices within a company to a central office for business continuity.
  • Consolidating order information to one location for centralized ordering.

Priming the pump

By default, subscriptions are initialized from a snapshot generated by the Snapshot Agent and then applied by the Distribution Agent.  When the snapshot is applied, the article property Action if name is in use defaults to Drop existing object and create a new one, which instructs the Distribution Agent to drop the destination table if it already exists at the subscriber.  This behavior is problematic in the Central Subscriber model because snapshots must be applied from multiple publications: the first snapshot is applied as expected, but each subsequent snapshot application wipes out the previously applied snapshot data.

The solution to this problem is a combination of horizontal partitioning, static row filters, and setting the Action if name is in use article property to Delete data. If article has a row filter, delete only data that matches the filter.

Horizontal partitioning

Ideally, published tables in a Central Subscriber topology will be horizontally partitioned.  In order to horizontally partition the tables to be published, a location-specific column should be added and included as a part of a composite primary key.  Consider a table that looks like this:

CREATE TABLE TestTable
(
    ID int IDENTITY(1,1) NOT NULL,
    FirstName varchar(100) NULL,
    CONSTRAINT PK_TestTable_ID PRIMARY KEY CLUSTERED (ID ASC)
)

To horizontally partition TestTable and prepare it for a Central Subscriber configuration at Publisher 1, drop primary key constraint PK_TestTable_ID, add a location-specific column named LocationID with a default value of 1, and add the new composite primary key including the LocationID column.

ALTER TABLE TestTable
DROP CONSTRAINT PK_TestTable_ID
GO
 
ALTER TABLE TestTable
ADD LocationID INT NOT NULL DEFAULT(1)
GO
 
ALTER TABLE TestTable
ADD CONSTRAINT PK_TestTable_ID_LocationID PRIMARY KEY CLUSTERED (ID, LocationID)
GO

Next, to horizontally partition TestTable and prepare it for a Central Subscriber configuration at Publisher 2, the same preparation can be done with a default value of 2 for LocationID.

ALTER TABLE TestTable
DROP CONSTRAINT PK_TestTable_ID
GO
 
ALTER TABLE TestTable
ADD LocationID INT NOT NULL DEFAULT(2)
GO
 
ALTER TABLE TestTable
ADD CONSTRAINT PK_TestTable_ID_LocationID PRIMARY KEY CLUSTERED (ID, LocationID)
GO

Finally, to horizontally partition TestTable and prepare it for a Central Subscriber configuration at Publisher 3, the same preparation can be done with a default value of 3 for LocationID.

ALTER TABLE TestTable
DROP CONSTRAINT PK_TestTable_ID
GO
 
ALTER TABLE TestTable
ADD LocationID INT NOT NULL DEFAULT(3)
GO
 
ALTER TABLE TestTable
ADD CONSTRAINT PK_TestTable_ID_LocationID PRIMARY KEY CLUSTERED (ID, LocationID)
GO

Once the tables are horizontally partitioned, they can be properly published in a Central Subscriber topology by using static row filters, filtering on the LocationID column and setting the article property Action if name is in use to Delete data. If article has a row filter, delete only data that matches the filter.

Static row filters

For each article to be published in a Central Subscriber topology, a static row filter should be defined to leverage the Action if name is in use article property appropriately.  A static row filter uses a WHERE clause to select the data to be published.  To publish rows from Publisher 1, specify LocationID = 1 for the filter clause. Likewise, to publish rows from Publisher 2 and Publisher 3, specify LocationID = 2 and LocationID = 3 for the filter clause, respectively.

Action if name is in use

When creating the publications and adding articles, the article property Action if name is in use needs to be set to Delete data. If article has a row filter, delete only data that matches the filter.  This can be set using the New Publication Wizard Article Properties dialog or by using the replication stored procedure sp_addarticle and specifying a value of delete for the @pre_creation_cmd argument.  This way, when the central subscriber is initialized or reinitialized from multiple publication snapshots, previously applied snapshot data is preserved because only the data matching the filter clause is deleted.
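
For reference, here is a minimal sketch of adding a filtered article with sp_addarticle for Publisher 1.  The publication, filter, and view names are hypothetical; with Transactional Replication, sp_articlefilter and sp_articleview are also called to generate the filter stored procedure and synchronization view.

-- Run in the publication database at Publisher 1.
-- Publication, filter, and view names are hypothetical.
EXEC sp_addarticle
    @publication = N'Pub_Location1',
    @article = N'TestTable',
    @source_owner = N'dbo',
    @source_object = N'TestTable',
    @type = N'logbased',
    @filter_clause = N'LocationID = 1',  -- static row filter
    @pre_creation_cmd = N'delete';       -- delete only rows matching the filter
GO

EXEC sp_articlefilter
    @publication = N'Pub_Location1',
    @article = N'TestTable',
    @filter_name = N'FLT_TestTable_Location1',
    @filter_clause = N'LocationID = 1';
GO

EXEC sp_articleview
    @publication = N'Pub_Location1',
    @article = N'TestTable',
    @view_name = N'VW_TestTable_Location1',
    @filter_clause = N'LocationID = 1';
GO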

The caveat

As we can see, horizontal partitioning requires adding a location-specific column to each published table; however, the location-specific column does not necessarily need to be part of the primary key at the publication databases.  In addition, it is not a hard requirement that published tables in a Central Subscriber topology be horizontally partitioned.  In some shops, changing a primary key or adding columns is strictly prohibited, in which case I would urge you to take an alternative approach.  If you would like some ideas on implementing a Central Subscriber topology without modifying primary keys or horizontally partitioning publication databases, feel free to get in touch or leave a comment below.

-Brandon Williams (blog | linkedin | twitter)

Error when creating a subscription

When creating a subscription in SQL Server, the following error sometimes occurs:

The remote server "%s" does not exist, or has not been designated as a valid Publisher, or you may not have permission to see available Publishers.

The error is pretty straightforward and indicates that the replication agent login used to connect to the Publisher is not a member of the publication access list (PAL).  That would be the Distribution Agent process account for Transactional Replication and the Merge Agent process account for Merge Replication.  Add the agent process account to the PAL and try to create the subscription again.
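
The PAL can be managed from the Publication Properties dialog in SSMS or with the replication stored procedure sp_grant_publication_access.  A minimal sketch, with hypothetical database, publication, and account names:

-- Run in the publication database at the Publisher.
-- Publication and login names are hypothetical.
USE [PublishedDB];
GO
EXEC sp_grant_publication_access
    @publication = N'MyPublication',
    @login = N'DOMAIN\repl_agent';
GO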

[Screenshot: Publication Access List]

Extending the Publisher to the Cloud

SQL Azure Data Sync

This weekend I decided to set up a SQL Azure sync group to synchronize a SQL Azure database with an on-premise replication publisher.  I was curious whether this would work, as I’ve been unable to find any documentation stating whether or not SQL Azure Data Sync can co-exist with SQL Server Replication.  Not only was I able to get this working, but I took it a step further and downloaded the Windows Azure and Windows Phone SDKs, then wrote a quick and dirty Windows Phone app which connects to my SQL Azure sync group hub database and pulls down Customer and related Order data to my mobile device, making it available via a pivot control within the application.

I was able to create a sync group in SQL Azure Data Sync to synchronize the Customer, SalesOrderHeader, and SalesOrderDetail tables in the AdventureWorksLTAZ2008R2 database, which also acted as an on-premise Merge publisher in my local network with 3 subscribers.  The solution provided a sync group hub database containing the Merge publication Customer and related Order data, syncing one-way from on-premise to the cloud.  I settled on synchronizing the sync group every 30 minutes and kept existing subscribers syncing with the on-premise publisher on their normal schedules, which was every 1 minute in my test topology.

[Screenshot: Publisher sync group]

Windows Phone Application

From here, I created 2 projects: 1 Windows Azure WCF Service Web Role which allows the Windows Phone app to interact with a SQL Azure database using Entity Framework, and 1 Windows Phone application using WCF to connect to the SQL Azure sync group hub database through the service reference and consume the Customer and related Order data through a pivot control.

A couple of screenshots of the app:


[Screenshots: View Customer Orders and Customer Order]

As you can see, the potential for SQL Azure Data Sync is interesting.  Not only can sync groups be useful in providing an offsite database storage solution, but when synchronized with an on-premise replication publisher, they can also provide a mechanism for exposing enterprise data to the cloud to be consumed by mobile devices.  SQL Azure Data Sync is currently available for preview and can be found in the Windows Azure Management Portal.  More information can be found in SQL Azure Data Sync on MSDN.

How to let non-admin business users run replication agent jobs

This blog post is basic in nature but highlights a very real requirement that may arise when administering replication.  Occasionally we, as Database Administrators, need to grant non-administrator business users the ability to run replication agent jobs due to business needs.  Instead of giving the business user the ability to execute/stop/start any job by adding them to a SQL Server Agent fixed database role, it is probably wise to limit the business user to executing only the replication agent job in question.

For example, if we want to give a business user the ability to run only the Snapshot Agent job, we can first create a stored procedure RunSnapshotAgent that runs the Snapshot Agent job, using WITH EXECUTE AS to run the stored procedure as the Snapshot Agent job owner:

CREATE PROCEDURE dbo.RunSnapshotAgent
WITH EXECUTE AS 'snapshotagent_job_owner'
AS
-- sp_start_job lives in msdb, so call it fully qualified
EXEC msdb.dbo.sp_start_job
	@job_name = 'snapshot_agent_job';
GO

Where 'snapshotagent_job_owner' is the Snapshot Agent job owner and 'snapshot_agent_job' is the Snapshot Agent job name.

Next, we can create a role named snapshot_executor, and add the business user to that role:

CREATE ROLE snapshot_executor;
 
EXEC sp_addrolemember
	@rolename = 'snapshot_executor',
	@membername = 'BUSINESSUSER1';
GO

Where ‘BUSINESSUSER1’ is the database user for the business user’s login.

Then, grant execute rights for the RunSnapshotAgent stored procedure to the snapshot_executor role to allow the business user to start only the snapshot agent job:

GRANT EXEC ON dbo.RunSnapshotAgent TO snapshot_executor;
GO

After creating the stored procedure, creating the role, and adding the business user to that role, the business user can then execute the RunSnapshotAgent stored procedure to run only the Snapshot Agent:

EXEC dbo.RunSnapshotAgent;
GO

The process could not read file due to OS error 3

When configuring a subscription to a publication in SQL Server, sometimes the agent process cannot read from the snapshot:

The process could not read file X:\Program Files\Microsoft SQL Server\<instance>\MSSQL\repldata\<snapshot>\<snapshot_file> due to OS error 3. (Source: MSSQL_REPL, Error number: MSSQL_REPL20016)
Get help: http://help/MSSQL_REPL20016
The system cannot find the path specified.
(Source: MSSQL_REPL, Error number: MSSQL_REPL3)
Get help: http://help/MSSQL_REPL3

This error can be puzzling at first, especially when the path does indeed exist and the replication agent process accounts have sufficient privileges to access the snapshot folder.

The error occurs because, when using pull subscriptions and/or a remote distributor, you must specify a UNC network share such as \\<computername>\snapshot rather than a local path for the snapshot folder.  To resolve this issue, make the snapshot folder a UNC network share and update the Distributor Properties to reflect the change.
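
The default snapshot folder for a publisher can be updated in the Distributor Properties dialog or with the replication stored procedure sp_changedistpublisher.  A minimal sketch, with hypothetical server and share names:

-- Run at the Distributor. Server and share names are hypothetical.
EXEC sp_changedistpublisher
    @publisher = N'MYPUBLISHER',
    @property = N'working_directory',
    @value = N'\\MYDISTRIBUTOR\repldata';
GO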

More Information

Securing the Snapshot Folder

Marking FK Constraints NOT FOR REPLICATION

When administering Merge Replication, sometimes we need to mark foreign key constraints NOT FOR REPLICATION.  In some cases, we might require replication agent activity to be treated differently from user activity since we cannot guarantee the order of changes that are replicated.  To be more specific, sometimes parent and child records are sent in different batches, with the children getting sent first, resulting in constraint violations and conflicts.
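
For example, a foreign key created NOT FOR REPLICATION is enforced for user activity but is not checked when a replication agent applies changes.  A minimal sketch, with hypothetical table names:

-- Hypothetical parent/child tables.  The constraint is checked for user
-- activity but skipped when a replication agent applies changes.
ALTER TABLE dbo.OrderDetail WITH CHECK
ADD CONSTRAINT FK_OrderDetail_OrderHeader
    FOREIGN KEY (OrderID) REFERENCES dbo.OrderHeader (OrderID)
    NOT FOR REPLICATION;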

There are a few scenarios that can cause this behavior to occur.

By marking foreign key constraints NOT FOR REPLICATION, we can prevent constraint violations from occurring when changes are made by replication agents.  With that said, it takes a fair amount of faith to trust that Merge will in fact deliver the parent records in the next batch of changes.  That's why it's wise to take a closer look when this happens and verify that, after dropping and adding the constraints NOT FOR REPLICATION, the parent records do arrive.

Dropping and adding foreign key constraints NOT FOR REPLICATION can be time consuming if there are a large number of them.  To speed this process up, I've been using variations of the following SELECT to generate my ALTER statements:

USE AdventureWorks;
GO
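-- Generate a DROP statement and a matching ADD ... WITH CHECK ... NOT FOR
-- REPLICATION statement for each single-column foreign key in the database.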
SELECT
  Drop_FK = 'ALTER TABLE [' + FK.FKTableSchema +
    '].[' + FK.FKTableName + '] DROP CONSTRAINT [' + FK.FKName + ']; ',
  Add_FK_NFR = 'ALTER TABLE [' + FK.FKTableSchema +
    '].[' + FK.FKTableName +
    '] WITH CHECK ADD CONSTRAINT [' + FK.FKName +
    '] FOREIGN KEY([' + FK.FKTableColumn +
    ']) REFERENCES [' + schema_name(sys.objects.schema_id) +
    '].[' + sys.objects.[name] + ']([' + sys.columns.[name] +
    ']) NOT FOR REPLICATION; '
FROM sys.objects
INNER JOIN sys.columns
  ON (sys.columns.[object_id] = sys.objects.[object_id])
INNER JOIN (
  SELECT
    sys.foreign_keys.[name] AS FKName,
    schema_name(sys.objects.schema_id) AS FKTableSchema,
    sys.objects.[name] AS FKTableName,
    sys.columns.[name] AS FKTableColumn,
    sys.foreign_keys.referenced_object_id AS referenced_object_id,
    sys.foreign_key_columns.referenced_column_id AS referenced_column_id
  FROM sys.foreign_keys
  INNER JOIN sys.foreign_key_columns
    ON (sys.foreign_key_columns.constraint_object_id = sys.foreign_keys.[object_id])
  INNER JOIN sys.objects
    ON (sys.objects.[object_id] = sys.foreign_keys.parent_object_id)
  INNER JOIN sys.columns
    ON (sys.columns.[object_id] = sys.objects.[object_id])
      AND (sys.columns.column_id = sys.foreign_key_columns.parent_column_id)
) FK
  ON (FK.referenced_object_id = sys.objects.[object_id])
    AND (FK.referenced_column_id = sys.columns.column_id)
WHERE (sys.objects.[type] = 'U')
  AND (sys.objects.is_ms_shipped = 0)
  AND (sys.objects.[name] NOT IN ('sysdiagrams'))

 

This has been tested on SQL Server 2008 R2 against the AdventureWorks database.  It does not handle composite foreign keys, among other things, but should provide a good starting point.  Please verify your constraints are defined correctly before deploying to production.  If you have any questions, feel free to get in touch.

Non-Convergence of Replicated Data

Brad McGehee over at SQL Aloha posted his July Question: What Do You Think are the Most Common DBA Pain Points and it got me thinking about my DBA career and some of the pain points I've experienced with SQL Server Replication.

One of the biggest problems I face is ensuring the convergence of data across all nodes in my Merge topologies.  Non-convergence can occur for a variety of reasons, such as network-related issues, conflicts, improper filters, or some general catastrophic error.  Sometimes disasters happen and they need to be cleaned up.

If and when non-convergence occurs, it’s important to identify the root cause and correct it.  After doing so, you'll likely need to get everything back in sync.  If you're lucky and the volume of non-converged data is minimal, a complete teardown might not be required.  Sometimes the problem can be corrected by identifying the non-converged rows and performing dummy updates on them, or by reinitializing.
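
A dummy update modifies a row without actually changing its data, which causes the merge triggers to mark the row as changed so it is re-sent on the next synchronization.  A minimal sketch, with hypothetical table and column names:

-- Touch the row without changing its data; the merge triggers flag it
-- as changed and it is replicated on the next sync.
UPDATE dbo.Customer
SET CustomerName = CustomerName
WHERE CustomerID = 42;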

However, if the non-convergence is widespread, you may want to consider tearing everything down (publications and subscriptions), manually synchronizing the non-converged databases, and setting everything back up again.  The detection of non-convergence and the manual synchronization of databases can be done using the tablediff utility, but an even better tool is Red Gate's SQL Data Compare.
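
For reference, a minimal tablediff sketch, with hypothetical server, database, and table names; the -f switch writes a T-SQL fix script that makes the destination table match the source:

tablediff -sourceserver "PUBSERVER" -sourcedatabase "SalesDB" -sourcetable "Customer" -destinationserver "SUBSERVER" -destinationdatabase "SalesDB" -destinationtable "Customer" -f "C:\temp\Customer_fix.sql"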

SQL Data Compare is by far superior to the tablediff utility and offers a powerful user interface.  This tool has saved my bacon on numerous occasions and has got me up and running quickly after replication disasters.  It's a fine tool to have in your DBA toolbox if you're replicating databases.

Accidentally Dropped the Distribution Database

Have you made the mistake of dropping your distribution database like this?

[Screenshot: taking the distribution database offline and dropping it]

If so, remember not to do this ever again.  Disabling publishing and distribution involves more than just dropping the distribution database.  The distribution database in SQL Server requires special attention and dropping it in this fashion can wreak havoc on your distribution server.  Doing so will cause existing publications to error on synchronization and further attempts to disable publishing and distribution will fail.  This really puts a DBA between a rock and a hard place.  Setting the distribution database offline and dropping it like this is a great example of how to hose your distributor.

[Screenshot: error indicating the distribution database does not exist]

[Screenshot: error attempting to disable publishing and distribution]

Whatever you do, don’t panic.  Dropping the distribution database this way isn’t good, but we still have the distribution data and log files located in the folder we specified when configuring distribution.  The reason these files are still around is that when an offline database is dropped, the disk files are not deleted.  Knowing this, all we need to do is attach the distribution data and log files to recreate the distribution database.  This can be done via SSMS or with the following bit of T-SQL.

[Screenshot: attaching the distribution database with T-SQL]
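
A minimal sketch of the attach, with hypothetical file paths; point them at the actual location of the distribution data and log files for your instance:

-- File paths are hypothetical; use the actual location of the
-- distribution data and log files.
CREATE DATABASE distribution
ON (FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL\Data\distribution.MDF')
LOG ON (FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL\Data\distribution.LDF')
FOR ATTACH;
GO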

If you encounter access denied errors when attaching the distribution data and log files, grant the appropriate permissions to the distribution.mdf and distribution.ldf files on the Security tab of the file properties dialog.

[Screenshot: distribution file properties, Security tab]

After attaching the distribution data and log files, the distribution database should now be in a workable state.  Existing publications should resume as normal and if you choose to do so, publishing and distribution can now be disabled correctly.

Disabling Publishing and Distribution

To avoid this problem altogether... use replication stored procedures, RMO (Replication Management Objects), or the Disable Publishing and Distribution Wizard to disable publishing and distribution.  The MSDN Library topic How to: Disable Publishing and Distribution (Replication Transact-SQL Programming) shows that this can be done with T-SQL, which is nice.

I encourage you to become familiar with these procedures and know what they do.  Using the Wizard is convenient, but it’s important to realize what happens when publishing and distribution are disabled.  Not only does the distribution database need to be dropped, but replication objects need to be removed as well: Publishers have to be dropped from the Distribution server, and the Distributor needs to be uninstalled.

[Screenshot: disabling publishing and distribution with T-SQL]
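
A minimal sketch of that sequence, with hypothetical database, publication, and server names; subscriptions and publications are dropped first, then the Publisher, the distribution database, and the Distributor are removed:

-- Run in the publication database at the Publisher. Names are hypothetical.
USE [PublishedDB];
GO
EXEC sp_dropsubscription @publication = N'MyPublication', @article = N'all', @subscriber = N'all';
EXEC sp_droppublication @publication = N'MyPublication';
EXEC sp_replicationdboption @dbname = N'PublishedDB', @optname = N'publish', @value = N'false';
GO

-- Run at the Distributor.
USE master;
GO
EXEC sp_dropdistpublisher @publisher = N'MYPUBLISHER';
EXEC sp_dropdistributiondb @database = N'distribution';
EXEC sp_dropdistributor;
GO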

To disable publishing and distribution using the Wizard, right-click the Replication node in Object Explorer and click ‘Disable Publishing and Distribution…’

[Screenshot: Disable Publishing and Distribution Wizard]

I hope this sheds some light on how to disable publishing and distribution in SQL Server.  It’s wise to do this through the use of replication stored procedures, RMO, or the Disable Publishing and Distribution Wizard.  I’ve come across a couple of threads in the SQL Server Replication Forum with people getting stuck after accidentally dropping the distribution database.  If you find yourself in this situation, don’t do anything drastic.  Re-installing SQL Server is not the answer.  Just remain calm, attach the distribution data and log files, and all will be fine.