The process could not read file due to OS error 3

When configuring a subscription to a publication in SQL Server, sometimes the agent process cannot read from the snapshot:

The process could not read file X:\Program Files\Microsoft SQL Server\<instance>\MSSQL\repldata\<snapshot>\<snapshot_file> due to OS error 3. (Source: MSSQL_REPL, Error number: MSSQL_REPL20016)
Get help: http://help/MSSQL_REPL20016
The system cannot find the path specified.
(Source: MSSQL_REPL, Error number: MSSQL_REPL3)
Get help: http://help/MSSQL_REPL3

This error can be puzzling at first especially when the path indeed does exist and the replication agent process accounts have sufficient privileges to access the snapshot folder.

The error occurs because when using pull subscriptions and/or a remote distributor you must specify a UNC network share such as \\<computername>\snapshot rather than a local path for the snapshot folder.  To alleviate this issue, make the snapshot folder a UNC network share and update the Distributor Properties to reflect the change.

More Information

Securing the Snapshot Folder

Merge Business Logic Handler Conflict Resolver

Introduction

This example demonstrates how to implement a business logic handler for a Merge article to perform custom resolution of conflicting update data changes that occur at Publisher and Subscribers.

Building the Sample

This sample was built using Visual Studio 2010.

Description

With Merge Replication you can utilize the business logic handler framework to write a managed code assembly that gets executed during the merge synchronization process.  The assembly can respond to a variety of replication events including data changes, conflicts, and errors.

For this example I will demonstrate how to implement a business logic handler for a Merge article that handles update conflicts.  In order to do this, we will override the UpdatesConflictHandler and make use of the ADO.NET data sets that are provided by the merge process to not only mimic, but extend the functionality included in the Microsoft SQL Server Priority Column Resolver.

The Microsoft SQL Server Priority Column Resolver, when used, uses a column to determine the conflict winner.  Basically, whichever column in the conflicting rows at Publisher or Subscriber that has the highest value, wins the conflict.  This is limited in the fact that you can only incorporate one column into the equation.  This example will extend this functionality to incorporate two columns by making use of the business logic handler framework.

Code Snippet

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;
using System.Linq;
using System.Text;
 
using Microsoft.SqlServer.Replication.BusinessLogicSupport;
 
namespace BLH_CustomConflictResolver
{
    public class SalesOrderHeaderBusinessLogicHandler : Microsoft.SqlServer.Replication.BusinessLogicSupport.BusinessLogicModule
    {
        // Variables to hold server names.
        private string publisherName;
        private string subscriberName;
 
        public SalesOrderHeaderBusinessLogicHandler()
        {
        }
 
        // Implement the Initialize method to get publication
        // and subscription information.
        public override void Initialize(string publisher, string subscriber, string distributor,
            string publisherDB, string subscriberDB, string articleName)
        {
            // Set the Publisher and Subscriber names
            publisherName = publisher;
            subscriberName = subscriber;
        }
 
        // Declare what types of row changes, conflicts, or errors to handle
        override public ChangeStates HandledChangeStates
        {
            get
            {
                // We are handling update conflicts for this example
                return ChangeStates.UpdateConflicts;
            }
        }
 
        // This handler mimics the Microsoft SQL Server Priority Column Resolver but extends its functionality to incorporate 2 columns
        // in the resolution process.
        public override ActionOnUpdateConflict UpdateConflictsHandler(DataSet publisherDataSet, DataSet subscriberDataSet, ref DataSet customDataSet,
                ref ConflictLogType conflictLogType, ref string customConflictMessage, ref int historyLogLevel, ref string historyLogMessage)
        {
            // Priority column #1
            int publisherRevisionNumber = Int32.Parse(publisherDataSet.Tables[0].Rows[0]["RevisionNumber"].ToString());
            int subscriberRevisionNumber = Int32.Parse(subscriberDataSet.Tables[0].Rows[0]["RevisionNumber"].ToString());
 
            // Priority column #2
            DateTime publisherModifiedDate = DateTime.Parse(publisherDataSet.Tables[0].Rows[0]["ModifiedDate"].ToString());
            DateTime subscriberModifiedDate = DateTime.Parse(subscriberDataSet.Tables[0].Rows[0]["ModifiedDate"].ToString());
 
            // Check priority for column #1
            if (publisherRevisionNumber &gt; subscriberRevisionNumber)
            {
                customDataSet = publisherDataSet.Copy();
            }
            else if (subscriberRevisionNumber &gt; publisherRevisionNumber)
            {
                customDataSet = subscriberDataSet.Copy();
            }
            else if (publisherRevisionNumber == subscriberRevisionNumber)
            {
                // Priority column #1 for publisher and subscriber are equal
                // Check priority for column #2
                if (publisherModifiedDate &gt; subscriberModifiedDate)
                {
                    customDataSet = publisherDataSet.Copy();
                }
                else
                {
                    customDataSet = subscriberDataSet.Copy();
                }
            }
 
            return ActionOnUpdateConflict.AcceptCustomConflictData;
        }
    }
}

Downloading the Sample

This sample can be downloaded from the MSDN Code Gallery.

More Information

For more information on executing business logic during merge synchronization see:

Applying a Snapshot -AltSnapshotFolder Parameter

Did you know a snapshot can be applied using the Distribution and Merge Agent –AltSnapshotFolder parameter?  This means that after generating a snapshot, it can be copied to Subscribers manually using removable media for instance, and applied locally.  This can be useful with large snapshots and when replicating over slow, unreliable links.

This method of initialization can be performed by adding the -AltSnapshotFolder parameter to the agent job step, using the agent programming interfaces, or when invoking agent executables from the command prompt.  For more information on the –AltSnapshotFolder parameter, refer to Replication Distribution Agent and Replication Merge Agent in the MSDN Library.

Marking FK Constraints NOT FOR REPLICATION

When administering Merge Replication, sometimes we need to mark foreign key constraints NOT FOR REPLICATION.  In some cases, we might require replication agent activity to be treated differently from user activity since we cannot guarantee the order of changes that are replicated.  To be more specific, sometimes parent and child records are sent in different batches, with the children getting sent first, resulting in constraint violations and conflicts.

There are a few scenarios that cause this behavior to occur and some of the details are covered in the following links.

By marking foreign key constraints NOT FOR REPLICATION, we can alleviate constraint violations from occurring when changes are made by replication agents.  With that said, it takes a fair amount of faith to trust that Merge will in fact deliver the parent records in the next batch of changes.  That's why it's wise to take a closer look when this happens and verify that after dropping and adding the constraints NOT FOR REPLICATION, that the parent records do arrive.

Dropping and adding foreign key constraints NOT FOR REPLICATION can be time consuming if there are a large number of them.  To speed this process up, I've been using variations of the following SELECT to generate my ALTER statements:

USE AdventureWorks;
GO
SELECT
  Drop_FK = 'ALTER TABLE [' + FK.FKTableSchema +
    '].[' + FK.FKTableName + '] DROP CONSTRAINT [' + FK.FKName + ']; ',
  Add_FK_NFR = 'ALTER TABLE [' + FK.FKTableSchema +
    '].[' + FK.FKTableName +
    '] WITH CHECK ADD CONSTRAINT [' + FK.FKName +
    '] FOREIGN KEY([' + FK.FKTableColumn +
    ']) REFERENCES [' + schema_name(sys.objects.schema_id) +
    '].[' + sys.objects.[name] + ']([' + sys.columns.[name] +
    ']) NOT FOR REPLICATION; '
FROM sys.objects
INNER JOIN sys.columns
  ON (sys.columns.[object_id] = sys.objects.[object_id])
INNER JOIN (
  SELECT
    sys.foreign_keys.[name] AS FKName,
    schema_name(sys.objects.schema_id) AS FKTableSchema,
    sys.objects.[name] AS FKTableName,
    sys.columns.[name] AS FKTableColumn,
    sys.foreign_keys.referenced_object_id AS referenced_object_id,
    sys.foreign_key_columns.referenced_column_id AS referenced_column_id
  FROM sys.foreign_keys
  INNER JOIN sys.foreign_key_columns
    ON (sys.foreign_key_columns.constraint_object_id = sys.foreign_keys.[object_id])
  INNER JOIN sys.objects
    ON (sys.objects.[object_id] = sys.foreign_keys.parent_object_id)
  INNER JOIN sys.columns
    ON (sys.columns.[object_id] = sys.objects.[object_id])
      AND (sys.columns.column_id = sys.foreign_key_columns.parent_column_id)
) FK
  ON (FK.referenced_object_id = sys.objects.[object_id])
    AND (FK.referenced_column_id = sys.columns.column_id)
WHERE (sys.objects.[type] = 'U')
  AND (sys.objects.is_ms_shipped = 0)
  AND (sys.objects.[name] NOT IN ('sysdiagrams'))

 

This has been tested on SQL Server 2008 R2 against the AdventureWorks database.  It does not handle composite foreign keys, among other things, but should provide a good starting point.  Please verify your constraints are defined correctly before deploying to production.  If you have any questions, feel free to get in touch.

Non-Convergence of Replicated Data

Brad McGehee over at SQL Aloha posted his July Question: What Do You Think are the Most Common DBA Pain Points and it got me thinking about my DBA career and some of the pain points I've experienced with SQL Server Replication.

One of the biggest problems I face is ensuring the convergence of data in all nodes in my Merge topologies.  There are a variety of reasons for non-convergence to occur such as network related issues, conflicts, improper filters, and/or some general catastrophic error.  Sometimes disasters happen and they need to be cleaned up.

If and when non-convergence occurs, it’s important to identify the root cause and correct it.  After doing so, you'll likely need to get everything back in sync.  If you're lucky and the volume of non-converged data is minimal, a complete tear down might not be required.  Sometimes the problem can be corrected by identifying the non-converged data and performing dummy updates on these rows, or with reinitialization.

However, if the non-convergence is widespread you may want to consider tearing everything down (publications and subscriptions), manually synchronizing the non-converged databases, and setting everything back up again.  The detection of non-convergence and the manual synchronization of databases can be done using the tablediff utility but an even better tool is Red Gate's SQL Data Compare.

SQL Data Compare is by far superior to the tablediff utility and offers a powerful user interface.  This tool has saved my bacon on numerous occasions and has got me up and running quickly after replication disasters.  It's a fine tool to have in your DBA toolbox if you're replicating databases.

The Proxy Auto-configuration URL was not found

When configuring Merge Replication for Web Synchronization it's common to encounter the error The Proxy Auto-configuration URL was not found and for replication to fail on synchronization.  This error occurs when either the certificate is not installed properly, and/or Internet Explorer proxy settings are set.

To alleviate this issue, verify the certificate is installed properly at the subscriber(s) for the merge agent process account.  Then, open Internet Explorer, go to Internet Options -> Connections tab -> LAN settings, and uncheck Automatically detect settings.

Accidentally Dropped the Distribution Database

Have you made the mistake of dropping your distribution database like this?

DropDistributionCapture

If so, remember not to do this ever again.  Disabling publishing and distribution involves more than just dropping the distribution database.  The distribution database in SQL Server requires special attention and dropping it in this fashion can wreak havoc on your distribution server.  Doing so will cause existing publications to error on synchronization and further attempts to disable publishing and distribution will fail.  This really puts a DBA between a rock and a hard place.  Setting the distribution database offline and dropping it like this is a great example of how to hose your distributor.

DistributionDoesNotExistCapture

CouldNotDisablePublishingCapture

Whatever you do, don’t panic.  Dropping the distribution database this way isn’t good but we still have the distribution data and log files files located in the folder we specified when configuring distribution.  The reason these files are still around is because when a database is offline and is dropped, the disk files are not deleted.  Knowing this, all we need to do is attach the distribution data and log files to recreate the distribution database.  This can be done via SSMS or with the following bit of T-SQL.

AttachDistributionTSQLCapture

If you encounter access denied errors when attaching the distribution data and log files, grant appropriate permissions to the distribution.mdf and distribution.ldf files on the Security tab in the file properties dialog.  This will circumvent the access denied errors.

DistributionFilePropertiesCapture

After attaching the distribution data and log files, the distribution database should now be in a workable state.  Existing publications should resume as normal and if you choose to do so, publishing and distribution can now be disabled correctly.

Disabling Publishing and Distribution

To avoid this problem altogether... use replication stored procedures, RMO (Replication Management Objects), or the Disable Publishing and Distribution Wizard to disable publishing and distribution.  We can learn from the MSDN Library How to: Disable Publishing and Distribution (Replication Transact-SQL Programming) that this can be done with T-SQL which is nice.

I encourage you to become familiar with these procedures and know what they do.  Using the Wizard is convenient but it’s important to realize what happens when publishing and distribution is disabled.  Not only does the distribution database need to be dropped, but replication objects need to be removed as well.  Publishers have to be dropped from the Distribution server and the Distribution server needs to be uninstalled.

DisablePublishing&DistributionViaTSQLCapture

To disable publishing and distribution using the Wizard, right-click the Replication node in Object Explorer and click ‘Disable Publishing and Distribution…

DisablePublishing&DistributionWizardCapture

I hope this sheds some light on how to disable publishing and distribution in SQL Server.  It’s wise to do this through the use of replication stored procedures, RMO, or the Disable Publishing and Distribution Wizard.  I’ve came across a couple of threads in the SQL Server Replication Forum with people getting stuck after accidentally dropping the distribution database.  If you find yourself in this situation don’t do anything drastic.  Re-installing SQL Server is not the answer.  Just remain calm, attach the distribution data and log files, and all will be fine.

Replicating XML Columns

One caveat that’s worth mentioning is replicating the XML data type in SQL Server using Merge Replication.  Sometimes, when replicating a table that contains a column of type XML via a Merge Publication, the XML column data in updated rows sent from Publisher to Subscriber, and vice-versa, arrive at the destination as empty, leading to non-convergence.

If you're experiencing this behavior, verify you have the latest Service Pack applied.  If the problem continues to persist, execute sp_helpmergearticle on the Publisher to retrieve information about the article in question and examine the value of the article property stream_blob_columns.

sp_helpmergearticle

sp_helpmergearticle_result

If the value of stream_blob_columns is set to true then this is the culprit.  When stream_blob_columns is set to true, SQL Server uses data stream optimization when replicating LOBs.  Somewhere in this optimization process things go south leading to non-convergence.  To workaround this, execute sp_changemergearticle, set stream_blob_columns to false, and synchronize your subscriptions.

sp_changemergearticle

 

After setting stream_blob_columns to false, articles containing XML columns should converge correctly in subsequent synchronizations.

Replication Monitor – Friendly Names

You may have noticed when you setup a merge pull subscription using the New Subscription Wizard in SSMS that you weren’t given the opportunity to give your subscription a description.  This is unfortunate because subscription descriptions are displayed in the Friendly Name column in Replication Monitor and give you a convenient way to tag your subscriptions with meaningful names.  You may have also noticed that when you right-click on a merge pull subscription and view it’s properties that the subscription description property is grayed out.

Subscription Properties Capture - Pull Subscription

Executing sp_changemergepullsubscription to change the subscription description to 'Seattle 1' does appear in the Subscription Properties dialog.  But Replication Monitor still displays no Friendly Name.

Subscription Properties Capture - Pull Subscription w description

Replication Monitor - No Friendly Name

Fortunately there is a table MSmerge_subscriptions in the distribution database which contains a row for each subscription in our merge pull topology and is our key to populating the Friendly Name column for our merge pull subscriptions.

-- Use the distribution database

USE distribution

GO


-- Give our merge pull subscription a description

UPDATE MSmerge_subscriptions

SET description = 'Seattle 1'

WHERE subid = 'C281B1EC-3684-42B6-B9B8-AB6DB91F6F18'

GO


MSmerge_subscriptions - w description

Replication Monitor - w Friendly Name

We can now see a Friendly Name for our merge pull subscription in Replication Monitor.  A friendly name is much easier to remember than an obnoxious server name and can save us time when identifying subscribing servers in Replication Monitor.

You are browsing all posts categorized Merge