Counting Merge Replication Article Changes

One of the cool things about Merge Replication is that it can reveal very interesting information about applications and the statements they run against published articles.  We can learn quite a bit about an application just from the sheer number of data changes that are replicated as a result of the statements it executes.  This can reveal not only which tables an application changes the most, but can even help identify flaws in application statements when the amount of data being replicated is more than anticipated.

Merge Replication tracks data changes through triggers and system tables in the publication and subscription databases.  The replication system tables contain metadata that indicates which data changes should be propagated.  One of the Merge Replication system tables, MSmerge_genhistory, contains one row for each generation (a collection of changes) that is delivered to a Publisher or Subscriber.  MSmerge_genhistory contains a changecount column, which can be used to determine which table articles contain the most data changes.  So to get an idea of which articles contain the most data changes within the retention period, we can do something like this:

USE MyPublicationDB
GO
 
SELECT
    a.name AS [Object Name],
    SUM(h.changecount) AS [Change Count]
FROM dbo.MSmerge_genhistory AS h WITH (NOLOCK)
INNER JOIN dbo.sysmergearticles AS a WITH (NOLOCK)
    ON h.art_nick = a.nickname
WHERE h.genstatus IN (1, 2)
GROUP BY a.name
ORDER BY [Change Count] DESC

In this scenario, the articles in my test Merge topology that have been changed within my retention period are Customer, SalesOrderDetail, Address, Product, and SalesOrderHeader, which have had 847, 542, 450, 295, and 32 data changes respectively.  If you find yourself wondering which articles in your Merge topology are exhibiting the most data changes, the changecount column in MSmerge_genhistory is what you're looking for.

Merge Article Count PivotTable And Chart

Replication Bookmarks

I've started a list of the resources I frequent the most - the MSDN Library, SQL Server Books Online - Replication, Technical Articles and Whitepapers, Blogs and Websites, Videos, Forums, and Books - called Replication Bookmarks.

I find myself visiting these pages often and thought this could be a useful reference when searching for quality content on SQL Server Replication.  The list is subject to change as I intend to add to it over time.  One list I plan to grow is Blogs and Websites, as I'm aware that a number of blog posts have been written on the topic of SQL Server Replication.  If you're interested in being added to my replication bookmarks, feel free to get in touch; I'm always looking for something to read.

Merge Business Logic Handler Conflict Resolver

Introduction

This example demonstrates how to implement a business logic handler for a Merge article to perform custom resolution of conflicting update data changes that occur at Publisher and Subscribers.

Building the Sample

This sample was built using Visual Studio 2010.

Description

With Merge Replication you can utilize the business logic handler framework to write a managed code assembly that gets executed during the merge synchronization process.  The assembly can respond to a variety of replication events including data changes, conflicts, and errors.

For this example I will demonstrate how to implement a business logic handler for a Merge article that handles update conflicts.  In order to do this, we will override the UpdateConflictsHandler method and make use of the ADO.NET DataSet objects provided by the merge process to not only mimic, but extend, the functionality included in the Microsoft SQL Server Priority Column Resolver.

The Microsoft SQL Server Priority Column Resolver uses a designated column to determine the conflict winner: whichever of the conflicting rows at the Publisher or Subscriber has the higher value in that column wins the conflict.  This is limited in that only one column can be incorporated into the equation.  This example extends that functionality to incorporate two columns by making use of the business logic handler framework.

Code Snippet

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;
using System.Linq;
using System.Text;
 
using Microsoft.SqlServer.Replication.BusinessLogicSupport;
 
namespace BLH_CustomConflictResolver
{
    public class SalesOrderHeaderBusinessLogicHandler : Microsoft.SqlServer.Replication.BusinessLogicSupport.BusinessLogicModule
    {
        // Variables to hold server names.
        private string publisherName;
        private string subscriberName;
 
        public SalesOrderHeaderBusinessLogicHandler()
        {
        }
 
        // Implement the Initialize method to get publication
        // and subscription information.
        public override void Initialize(string publisher, string subscriber, string distributor,
            string publisherDB, string subscriberDB, string articleName)
        {
            // Set the Publisher and Subscriber names
            publisherName = publisher;
            subscriberName = subscriber;
        }
 
        // Declare what types of row changes, conflicts, or errors to handle
        override public ChangeStates HandledChangeStates
        {
            get
            {
                // We are handling update conflicts for this example
                return ChangeStates.UpdateConflicts;
            }
        }
 
        // This handler mimics the Microsoft SQL Server Priority Column Resolver but extends its functionality to incorporate 2 columns
        // in the resolution process.
        public override ActionOnUpdateConflict UpdateConflictsHandler(DataSet publisherDataSet, DataSet subscriberDataSet, ref DataSet customDataSet,
                ref ConflictLogType conflictLogType, ref string customConflictMessage, ref int historyLogLevel, ref string historyLogMessage)
        {
            // Priority column #1
            int publisherRevisionNumber = Int32.Parse(publisherDataSet.Tables[0].Rows[0]["RevisionNumber"].ToString());
            int subscriberRevisionNumber = Int32.Parse(subscriberDataSet.Tables[0].Rows[0]["RevisionNumber"].ToString());
 
            // Priority column #2
            DateTime publisherModifiedDate = DateTime.Parse(publisherDataSet.Tables[0].Rows[0]["ModifiedDate"].ToString());
            DateTime subscriberModifiedDate = DateTime.Parse(subscriberDataSet.Tables[0].Rows[0]["ModifiedDate"].ToString());
 
            // Check priority for column #1
            if (publisherRevisionNumber > subscriberRevisionNumber)
            {
                customDataSet = publisherDataSet.Copy();
            }
            else if (subscriberRevisionNumber > publisherRevisionNumber)
            {
                customDataSet = subscriberDataSet.Copy();
            }
            else if (publisherRevisionNumber == subscriberRevisionNumber)
            {
                // Priority column #1 for publisher and subscriber are equal
                // Check priority for column #2
                if (publisherModifiedDate > subscriberModifiedDate)
                {
                    customDataSet = publisherDataSet.Copy();
                }
                else
                {
                    customDataSet = subscriberDataSet.Copy();
                }
            }
 
            return ActionOnUpdateConflict.AcceptCustomConflictData;
        }
    }
}
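Once built and deployed, the handler needs to be registered and then associated with the Merge article.  Here's a hedged sketch using the documented replication stored procedures; the publication name, article name, and assembly path below are illustrative assumptions, not values from the sample:

```sql
-- Register the assembly as a .NET business logic handler at the Distributor.
-- The friendly name, assembly path, and class name are illustrative.
EXEC sp_registercustomresolver
    @article_resolver     = N'SalesOrderHeader Business Logic Handler',
    @is_dotnet_assembly   = N'true',
    @dotnet_assembly_name = N'C:\CustomResolvers\BLH_CustomConflictResolver.dll',
    @dotnet_class_name    = N'BLH_CustomConflictResolver.SalesOrderHeaderBusinessLogicHandler';

-- Associate the handler with the Merge article at the Publisher.
EXEC sp_changemergearticle
    @publication = N'MyMergePublication',
    @article     = N'SalesOrderHeader',
    @property    = N'article_resolver',
    @value       = N'SalesOrderHeader Business Logic Handler',
    @force_invalidate_snapshot = 0,
    @force_reinit_subscription = 0;
```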

Downloading the Sample

This sample can be downloaded from the MSDN Code Gallery.

More Information

For more information, see the MSDN Library documentation on executing business logic during merge synchronization.

Updatable Subscriptions Removed in SQL Server 2012

After spending some time looking at SQL Server 2012 CTP3, it appears as if Transactional Publication with Updatable Subscriptions will officially be removed in SQL Server 2012.  According to Books Online, Microsoft is recommending that users switch to Peer-to-Peer Replication instead.  This is a shame considering Peer-to-Peer is an Enterprise-only feature and will cost significantly more than Standard Edition.  On the other hand, we will still be able to achieve updatable subscriptions in Standard Edition via Merge Replication or Bidirectional Transactional Replication.

Here are screenshots comparing the SQL Server 2008 R2 New Publication Wizard with the SQL Server Denali CTP3 New Publication Wizard:

[Screenshot: SQL Server 2008 R2 vs. SQL Server Denali CTP3 - New Publication Wizard]

SQL Saturday #92 Lunch: Bend Area Gathering

There's nothing I love more than free food and SQL Server, no joke.  I take my free food and SQL Server very seriously and you should too.  Fortunately for me, the folks at PASS know the way to my heart.  It seems as if they've tailored a day of free SQL Server training and food just for me.  I have no idea what's on the menu, but it's free and that's good enough for me to make the trip.  My guess is we'll get an entree of best practices, a side of tips and tricks, and a delectable networking dessert.

With that said, I'll be attending SQLSaturday #92 in Portland, OR on October 8th, 2011.  During lunch I'll be meeting with folks from Central Oregon who'd like to work together to create a PASS Chapter in the Bend/Redmond area.  Anyone interested is welcome to join the conversation.  Box lunches will be served, so pick up a lunch and join us for the discussion.

Replication Agent Logging

To troubleshoot replication agent failures, replication agent profiles can be configured to output verbose agent logs to a text file.  All replication agents support the parameters -Output output_path_and_file_name and -OutputVerboseLevel [0|1|2], which, when specified, write the agent log to the file output_path_and_file_name with varying degrees of detail depending on the verbose level passed.  In general, the verbose agent logs contain progress report messages, error messages, procedure calls, and various session statistics.

Depending on the verbose level and the agent, some or all of this information will be provided.  For the purpose of troubleshooting replication agent failures, a verbose level of 2 should be specified, which prints all error and progress report messages.  It's also worth mentioning that some agents accept undocumented output verbose levels 3 and 4, which provide even more information.  The verbose agent logs are useful when trying to pinpoint exactly where replication is failing and to get detailed information on the root cause of the failure.  In addition, the verbose agent logs are likely the first piece of diagnostic data that Microsoft Support will ask for when working a replication support incident.

Reading the logs takes a little getting used to, but after using them to troubleshoot a few times, it becomes second nature.  I often use them when replication failures occur and the error shown in Replication Monitor and the View Synchronization Status window isn't sufficient to diagnose the issue.  An example error message that occasionally crops up with Merge Replication is: The process was successfully stopped. (Source: MSSQL_REPL, Error number: MSSQL_REPL-2147200963), which isn't very descriptive.  Knowing how to output the verbose agent logs is a smart skill to have under these circumstances in order to identify the root cause.

The parameters to output the verbose agent logs can be passed to the Run Agent job step in the Job Properties dialog, Agent Profile dialog in Replication Monitor, via RMO, or by way of invoking the agents from the command line.  After adding the parameters, the replication agent must be restarted for the profile changes to take effect.  Once restarted, the replication agent will begin outputting the verbose log to the path and file name specified.  After reproducing the failure, the log can then be examined for further information.
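As an illustration, here's roughly what invoking the Merge Agent from the command line with verbose logging might look like; every server, database, and path name below is made up for the example:

```bat
REM Illustrative Merge Agent invocation with verbose logging enabled.
REM All names and paths are placeholders for your own topology.
replmerg.exe -Publisher "MyPublisher" -PublisherDB "MyPublicationDB" ^
    -Publication "MyMergePublication" -Distributor "MyDistributor" ^
    -Subscriber "MySubscriber" -SubscriberDB "MySubscriptionDB" ^
    -Output "C:\ReplLogs\mergeagent.log" -OutputVerboseLevel 2
```

The same two -Output parameters can likewise be appended to the Run Agent job step command text.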

The verbose agent log is written as a text document and can be opened with your favorite text editor.  The -OutputVerboseLevel specified for the agent profile dictates the contents of the log.  Typically, when a replication agent failure occurs, the process stops, which means the failure is likely to be found at the end of the log, in the last 20 or so lines of the output file.

Be sure to remove the agent parameters after reproducing the failure and collecting the log.  Not doing so will result in the log being appended to on subsequent synchronizations, consuming disk space.  Outputting agent logs can also impact replication performance, so it isn't recommended unless you're troubleshooting replication failures.

Non-Convergence of Replicated Data

Brad McGehee over at SQL Aloha posted his July Question: What Do You Think are the Most Common DBA Pain Points and it got me thinking about my DBA career and some of the pain points I've experienced with SQL Server Replication.

One of the biggest problems I face is ensuring the convergence of data in all nodes in my Merge topologies.  There are a variety of reasons for non-convergence to occur such as network related issues, conflicts, improper filters, and/or some general catastrophic error.  Sometimes disasters happen and they need to be cleaned up.

If and when non-convergence occurs, it’s important to identify the root cause and correct it.  After doing so, you'll likely need to get everything back in sync.  If you're lucky and the volume of non-converged data is minimal, a complete tear down might not be required.  Sometimes the problem can be corrected by identifying the non-converged data and performing dummy updates on these rows, or with reinitialization.

However, if the non-convergence is widespread you may want to consider tearing everything down (publications and subscriptions), manually synchronizing the non-converged databases, and setting everything back up again.  The detection of non-convergence and the manual synchronization of databases can be done using the tablediff utility but an even better tool is Red Gate's SQL Data Compare.
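For example, a tablediff run to compare one article between a Publisher and a Subscriber and generate a fix script might look something like this; the server, database, and table names are illustrative:

```bat
REM Illustrative tablediff comparison; -f generates a T-SQL fix script
REM that brings the destination table in line with the source.
tablediff.exe -sourceserver "MyPublisher" -sourcedatabase "MyPublicationDB" ^
    -sourcetable "SalesOrderHeader" ^
    -destinationserver "MySubscriber" -destinationdatabase "MySubscriptionDB" ^
    -destinationtable "SalesOrderHeader" ^
    -f "C:\Temp\SalesOrderHeader_Fix.sql"
```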

SQL Data Compare is far superior to the tablediff utility and offers a powerful user interface.  This tool has saved my bacon on numerous occasions and has gotten me back up and running quickly after replication disasters.  It's a fine tool to have in your DBA toolbox if you're replicating databases.

Accidentally Dropped the Distribution Database

Have you made the mistake of dropping your distribution database like this?

[Screenshot: Taking the distribution database offline and dropping it]

If so, remember not to do this ever again.  Disabling publishing and distribution involves more than just dropping the distribution database.  The distribution database in SQL Server requires special attention and dropping it in this fashion can wreak havoc on your distribution server.  Doing so will cause existing publications to error on synchronization and further attempts to disable publishing and distribution will fail.  This really puts a DBA between a rock and a hard place.  Setting the distribution database offline and dropping it like this is a great example of how to hose your distributor.

[Screenshot: Database 'distribution' does not exist error]

[Screenshot: Could not disable publishing and distribution error]

Whatever you do, don’t panic.  Dropping the distribution database this way isn’t good, but we still have the distribution data and log files located in the folder we specified when configuring distribution.  The reason these files are still around is that when an offline database is dropped, its disk files are not deleted.  Knowing this, all we need to do is attach the distribution data and log files to recreate the distribution database.  This can be done via SSMS or with the following bit of T-SQL.

[Screenshot: Attaching the distribution database with T-SQL]
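The attach boils down to something like this; the file paths below are illustrative and should point at your actual distribution data and log files:

```sql
-- Reattach the distribution data and log files to recreate the database.
-- Paths are illustrative; substitute your own data directory.
USE master
GO
CREATE DATABASE distribution ON
    (FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\distribution.MDF'),
    (FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\distribution.LDF')
FOR ATTACH
GO
```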

If you encounter access denied errors when attaching the distribution data and log files, grant the appropriate permissions on the distribution.mdf and distribution.ldf files via the Security tab in each file's Properties dialog.

[Screenshot: distribution database file properties - Security tab]

After attaching the distribution data and log files, the distribution database should now be in a workable state.  Existing publications should resume as normal and if you choose to do so, publishing and distribution can now be disabled correctly.

Disabling Publishing and Distribution

To avoid this problem altogether... use replication stored procedures, RMO (Replication Management Objects), or the Disable Publishing and Distribution Wizard to disable publishing and distribution.  As the MSDN Library topic How to: Disable Publishing and Distribution (Replication Transact-SQL Programming) shows, this can be done with T-SQL, which is nice.

I encourage you to become familiar with these procedures and know what they do.  Using the Wizard is convenient, but it’s important to realize what happens when publishing and distribution is disabled.  Not only does the distribution database need to be dropped, but replication objects need to be removed as well: Publishers have to be dropped from the Distribution server, and the Distributor needs to be uninstalled.

[Screenshot: Disabling publishing and distribution with T-SQL]
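In outline, the Distributor-side teardown looks something like this; the server name is illustrative, and subscriptions and publications must already have been dropped:

```sql
-- Run at the Distributor after all subscriptions and publications are dropped.
-- The Publisher name is illustrative.
USE master
GO
EXEC sp_dropdistpublisher @publisher = N'MyPublisher';
EXEC sp_dropdistributiondb @database = N'distribution';
EXEC sp_dropdistributor;
GO
```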

To disable publishing and distribution using the Wizard, right-click the Replication node in Object Explorer and click ‘Disable Publishing and Distribution…’

[Screenshot: Disable Publishing and Distribution Wizard]

I hope this sheds some light on how to disable publishing and distribution in SQL Server.  It’s wise to do this through the use of replication stored procedures, RMO, or the Disable Publishing and Distribution Wizard.  I’ve come across a couple of threads in the SQL Server Replication Forum with people getting stuck after accidentally dropping the distribution database.  If you find yourself in this situation, don’t do anything drastic.  Re-installing SQL Server is not the answer.  Just remain calm, attach the distribution data and log files, and all will be fine.

A hello, and welcome to my new blog

Hi everyone.  Welcome to my new blog.  I'm still in the building stages here so things may seem a little uneventful at the moment.  I'll be polishing this up as I go.  Interested in what type of content I'll be providing?  Have a look at the About page.
