Non-Convergence of Replicated Data

Brad McGehee over at SQL Aloha posted his July Question: What Do You Think are the Most Common DBA Pain Points and it got me thinking about my DBA career and some of the pain points I've experienced with SQL Server Replication.

One of the biggest problems I face is ensuring the convergence of data in all nodes in my Merge topologies.  There are a variety of reasons for non-convergence to occur such as network related issues, conflicts, improper filters, and/or some general catastrophic error.  Sometimes disasters happen and they need to be cleaned up.

If and when non-convergence occurs, it’s important to identify the root cause and correct it.  After doing so, you'll likely need to get everything back in sync.  If you're lucky and the volume of non-converged data is minimal, a complete tear down might not be required.  Sometimes the problem can be corrected by identifying the non-converged data and performing dummy updates on these rows, or with reinitialization.

However, if the non-convergence is widespread you may want to consider tearing everything down (publications and subscriptions), manually synchronizing the non-converged databases, and setting everything back up again.  The detection of non-convergence and the manual synchronization of databases can be done using the tablediff utility but an even better tool is Red Gate's SQL Data Compare.

SQL Data Compare is by far superior to the tablediff utility and offers a powerful user interface.  This tool has saved my bacon on numerous occasions and has got me up and running quickly after replication disasters.  It's a fine tool to have in your DBA toolbox if you're replicating databases.

Accidentally Dropped the Distribution Database

Have you made the mistake of dropping your distribution database like this?

DropDistributionCapture

If so, remember not to do this ever again.  Disabling publishing and distribution involves more than just dropping the distribution database.  The distribution database in SQL Server requires special attention and dropping it in this fashion can wreak havoc on your distribution server.  Doing so will cause existing publications to error on synchronization and further attempts to disable publishing and distribution will fail.  This really puts a DBA between a rock and a hard place.  Setting the distribution database offline and dropping it like this is a great example of how to hose your distributor.

DistributionDoesNotExistCapture

CouldNotDisablePublishingCapture

Whatever you do, don’t panic.  Dropping the distribution database this way isn’t good but we still have the distribution data and log files files located in the folder we specified when configuring distribution.  The reason these files are still around is because when a database is offline and is dropped, the disk files are not deleted.  Knowing this, all we need to do is attach the distribution data and log files to recreate the distribution database.  This can be done via SSMS or with the following bit of T-SQL.

AttachDistributionTSQLCapture

If you encounter access denied errors when attaching the distribution data and log files, grant appropriate permissions to the distribution.mdf and distribution.ldf files on the Security tab in the file properties dialog.  This will circumvent the access denied errors.

DistributionFilePropertiesCapture

After attaching the distribution data and log files, the distribution database should now be in a workable state.  Existing publications should resume as normal and if you choose to do so, publishing and distribution can now be disabled correctly.

Disabling Publishing and Distribution

To avoid this problem altogether... use replication stored procedures, RMO (Replication Management Objects), or the Disable Publishing and Distribution Wizard to disable publishing and distribution.  We can learn from the MSDN Library How to: Disable Publishing and Distribution (Replication Transact-SQL Programming) that this can be done with T-SQL which is nice.

I encourage you to become familiar with these procedures and know what they do.  Using the Wizard is convenient but it’s important to realize what happens when publishing and distribution is disabled.  Not only does the distribution database need to be dropped, but replication objects need to be removed as well.  Publishers have to be dropped from the Distribution server and the Distribution server needs to be uninstalled.

DisablePublishing&DistributionViaTSQLCapture

To disable publishing and distribution using the Wizard, right-click the Replication node in Object Explorer and click ‘Disable Publishing and Distribution…

DisablePublishing&DistributionWizardCapture

I hope this sheds some light on how to disable publishing and distribution in SQL Server.  It’s wise to do this through the use of replication stored procedures, RMO, or the Disable Publishing and Distribution Wizard.  I’ve came across a couple of threads in the SQL Server Replication Forum with people getting stuck after accidentally dropping the distribution database.  If you find yourself in this situation don’t do anything drastic.  Re-installing SQL Server is not the answer.  Just remain calm, attach the distribution data and log files, and all will be fine.

You are browsing all posts categorized Transactional