How to get notified when a mirrored SQL Server database fails over

Time: 2021-02-16 09:37:20

We have a couple of mirrored SQL Server databases.

My first problem, and the key one, is to get a notification when the database fails over. I don't strictly need to know because, erm, it's mirrored and so (almost) everything carries on working automagically, but it would be useful to be advised. I'm also currently getting failovers when I don't think I should be, so I want to know when they occur (without too much digging) to see if I can determine why.

I have services running that I could fairly easily use to monitor this, so the alternative question would be "How do I programmatically determine which is the principal and which is the mirror?", preferably in a more intelligent fashion than just attempting to connect to each in turn (which would mostly work but...).

Thanks, Murph

Addendum:

One of the answers queries why I don't need to know when it fails over. The answer is that we're developing using ADO.NET, which has automatic failover support: all you have to do is add Failover Partner=MIRRORSERVER (where MIRRORSERVER is the name of your mirror server instance) to your connection string and your code will fail over transparently. You may get some errors depending on what connections are active, but in our case very few.
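
For illustration, here is a minimal sketch of building such a connection string with SqlConnectionStringBuilder rather than by hand. The server and database names (PRINCIPALSERVER, MIRRORSERVER, MyDatabase) are placeholders, not names from our setup, and integrated security is assumed.

```csharp
using System;
using System.Data.SqlClient;

static class MirroredConnection
{
    // Builds a connection string that names both mirroring partners.
    // All three arguments are placeholders for your own instance and database names.
    public static string Build(string principal, string mirror, string database)
    {
        SqlConnectionStringBuilder b = new SqlConnectionStringBuilder();
        b.DataSource = principal;        // instance tried first
        b.FailoverPartner = mirror;      // instance tried if the first is unavailable
        b.InitialCatalog = database;     // the mirrored database itself
        b.IntegratedSecurity = true;     // assumption: Windows authentication
        return b.ConnectionString;
    }

    static void Main()
    {
        // Prints the generated string so it can be inspected or pasted into config.
        Console.WriteLine(Build("PRINCIPALSERVER", "MIRRORSERVER", "MyDatabase"));
    }
}
```

The builder only assembles the string; nothing is validated against a server until a SqlConnection using it is opened.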

3 solutions

#1


2  

Right,

The two answers and a little thought got me to something approaching an answer.

First a little more clarification:

The app is written in C# (2.0+) and uses ADO.NET to talk to SQL Server 2005. The mirror setup is two W2k3 servers hosting the Principal and the Mirror, plus a third server hosting an Express instance as a monitor (the witness). The nice thing about this is that a failover is all but transparent to the app using the database: it will throw an error for some connections but fundamentally everything will carry on nicely. Yes, we're getting the odd false positive, but the whole point is to have the system carry on working with the least amount of fuss, and mirroring does deliver this very nicely.

Further, the issue is not with serious server failure (that's usually a bit more obvious) but with a failover for other reasons (cf. the false positives above). We need to know about those because we do have a couple of things that can't, for various reasons, fail over, and in any case so that we can try to identify the circumstances in which we get false positives.

So, given the above, simply checking the status of the boxes is not quite enough and chasing through the event log is probably overly complex - the answer is, as it turns out, fairly simple: sp_helpserver

The first column returned by sp_helpserver is the server name. If you run the request at regular intervals, saving the previous server name and doing a comparison each time, you'll be able to identify when a change has taken place and then take the appropriate action.
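
As a possible complement to comparing server names, SQL Server 2005 also exposes the catalog view sys.database_mirroring, whose mirroring_role_desc column reports PRINCIPAL or MIRROR for a mirrored database. The sketch below is an assumption about how one might query it, not something I have run against our setup. To ask each box for its own role, the connection string would need to point at that instance's master database directly, without Failover Partner, since the mirror copy of the database cannot be opened.

```csharp
using System;
using System.Data.SqlClient;

static class MirroringRole
{
    // One row per database; the mirroring columns are NULL when the database is not mirrored.
    public const string Query =
        "SELECT mirroring_role_desc FROM sys.database_mirroring " +
        "WHERE database_id = DB_ID(@db)";

    // Returns "PRINCIPAL", "MIRROR", or null if no role is reported for the database.
    public static string GetRole(string connectionString, string database)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand(Query, conn))
        {
            cmd.Parameters.AddWithValue("@db", database);
            conn.Open();
            // DBNull or an empty result both become null through the "as" cast.
            return cmd.ExecuteScalar() as string;
        }
    }

    static void Main(string[] args)
    {
        // args[0]: connection string to one instance's master database (placeholder)
        // args[1]: name of the mirrored database
        Console.WriteLine(GetRole(args[0], args[1]) ?? "(not mirrored or not visible)");
    }
}
```

Running it once against each partner would show which one currently holds the principal role, provided the login is allowed to see the database's metadata.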

The following is a console app that demonstrates the principle. It needs some work (e.g. the connection ought to be non-pooled and new each time) but it's enough for now (so I'd then accept this as "the" answer). Parameters are Principal, Mirror, Database.

using System;
using System.Data.SqlClient;

namespace FailoverMonitorConcept
{
    class Program
    {
        static void Main(string[] args)
        {
            // Expected arguments: Principal, Mirror, Database
            if (args.Length < 3)
            {
                Console.WriteLine("Usage: <principal> <mirror> <database>");
                return;
            }

            string server = args[0];
            string failover = args[1];
            string database = args[2];

            string connStr = string.Format("Integrated Security=SSPI;Persist Security Info=True;Data Source={0};Failover Partner={1};Packet Size=4096;Initial Catalog={2}", server, failover, database);
            string sql = "EXEC sp_helpserver";

            SqlConnection dc = new SqlConnection(connStr);
            SqlCommand cmd = new SqlCommand(sql, dc);
            Console.WriteLine("Connection string: " + connStr);
            Console.WriteLine("Press any key to test, press q to quit");

            // Server name seen on the previous check; empty until the first successful check.
            string priorServerName = "";
            char key = ' ';

            while (key.ToString().ToLower() != "q")
            {
                try
                {
                    // Open inside the try block so a failed connection is reported, not fatal.
                    dc.Open();

                    // ExecuteScalar returns the first column of the first row: the server name.
                    string serverName = cmd.ExecuteScalar() as string;
                    Console.WriteLine(DateTime.Now.ToLongTimeString() + " - Server name: " + serverName);
                    if (priorServerName == "")
                    {
                        priorServerName = serverName;
                    }
                    else if (priorServerName != serverName)
                    {
                        Console.WriteLine("***** SERVER CHANGED *****");
                        Console.WriteLine("New server: " + serverName);
                        priorServerName = serverName;
                    }
                }
                catch (SqlException ex)
                {
                    Console.WriteLine("Error: " + ex.ToString());
                }
                finally
                {
                    // Close is safe to call even if Open failed.
                    dc.Close();
                }

                // Wait for a key press before the next check.
                key = Console.ReadKey(true).KeyChar;
            }

            Console.WriteLine("Finis!");
        }
    }
}
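
As for the "non-pooled and new each time" improvement mentioned above, one way it might look is sketched here. The class names (ServerChangeDetector, Poller) and the 30-second interval are my own choices for illustration, and it still relies on the sp_helpserver approach described in this answer.

```csharp
using System;
using System.Data.SqlClient;
using System.Threading;

// Remembers the last server name seen and reports when it changes.
class ServerChangeDetector
{
    private string prior;

    // Returns true only when a previously seen name is replaced by a different one.
    public bool Observe(string current)
    {
        bool changed = prior != null && current != null && prior != current;
        if (current != null) prior = current;
        return changed;
    }
}

static class Poller
{
    // Opens a brand-new, non-pooled connection for every check.
    public static string CurrentServer(string principal, string mirror, string database)
    {
        SqlConnectionStringBuilder b = new SqlConnectionStringBuilder();
        b.DataSource = principal;
        b.FailoverPartner = mirror;
        b.InitialCatalog = database;
        b.IntegratedSecurity = true;
        b.Pooling = false;   // do not reuse a pooled connection from an earlier check

        using (SqlConnection conn = new SqlConnection(b.ConnectionString))
        using (SqlCommand cmd = new SqlCommand("EXEC sp_helpserver", conn))
        {
            conn.Open();
            return cmd.ExecuteScalar() as string;
        }
    }

    // Arguments: Principal, Mirror, Database (same order as the console app above).
    static void Main(string[] args)
    {
        ServerChangeDetector detector = new ServerChangeDetector();
        while (true)
        {
            try
            {
                string name = CurrentServer(args[0], args[1], args[2]);
                if (detector.Observe(name))
                    Console.WriteLine(DateTime.Now + " - server changed to " + name);
            }
            catch (SqlException ex)
            {
                Console.WriteLine("Error: " + ex.Message);
            }
            Thread.Sleep(30000);   // arbitrary 30-second polling interval
        }
    }
}
```

Keeping the comparison in its own small class means the notification step (email, log entry, whatever suits) can be swapped in where the Console.WriteLine is without touching the database code.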

I wouldn't have arrived here without a) asking the question and then b) getting the responses which made me actually think

Murph

#2


1  

If the failover logic is in your application, you could write a status screen that shows which box you're connected to by writing to a variable when the first connection attempt fails.

I think your best bet would be a ping daemon/cron job that checks the status of each box periodically and sends an email if one doesn't respond.
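
A rough sketch of such a check, meant to be run from a scheduled task, might look like the following. The SMTP relay and the email addresses are made-up placeholders, and the 2-second timeout is arbitrary. Note that a ping only tells you the host is up, not which partner is currently the principal.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Mail;
using System.Net.NetworkInformation;

static class BoxCheck
{
    // True if the host answers an ICMP echo within timeoutMs.
    public static bool Responds(string host, int timeoutMs)
    {
        try
        {
            using (Ping ping = new Ping())
            {
                return ping.Send(host, timeoutMs).Status == IPStatus.Success;
            }
        }
        catch (PingException)
        {
            return false;   // an unresolvable or unreachable host counts as "no response"
        }
    }

    // Builds the text of the alert from the list of hosts that did not answer.
    public static string BuildAlert(string[] downHosts)
    {
        return "No ping response from: " + string.Join(", ", downHosts);
    }

    // Arguments: one or more host names to check.
    static void Main(string[] args)
    {
        List<string> down = new List<string>();
        foreach (string host in args)
        {
            if (!Responds(host, 2000)) down.Add(host);
        }

        if (down.Count > 0)
        {
            string body = BuildAlert(down.ToArray());
            Console.WriteLine(body);

            // Hypothetical relay and addresses; replace with real ones.
            SmtpClient smtp = new SmtpClient("smtp.example.invalid");
            smtp.Send("monitor@example.invalid", "dba@example.invalid", "Server check failed", body);
        }
    }
}
```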

#3


1  

Use something like Host Monitor (http://www.ks-soft.net/hostmon.eng/) to monitor the Event Log for messages related to the failover event; it can send you an alert via email/SMS.

I'm curious, though, how you wouldn't need to know that the failover happened: don't you then have to update the data sources in your applications to point to the new server that you failed over to? Mirroring takes place on different hosts (the primary and the mirror), unlike clustering, which has multiple nodes that appear to be a single device from the outside.

Also, are you using a witness server in order to automatically fail over from the primary to the mirror? This is the only way I know of to make it happen automatically, and in my experience you get a lot of false positives, where network hiccups can fool the mirror and witness into thinking the primary is down when in fact it is not.
