.NET分布式事务未提交造成6107错误或系统被挂起的问题分析定位

时间:2022-05-08 06:11:28

问题描述:

      系统中多个功能不定期出现“Unable to get error message (6107) (0).”错误,即分布式事务超时,但报出错误的部分功能根本没有使用分布式事务。

原因分析:

      推测是存在分布式事务未提交的情况,回到线程池后被复用造成的,例如:

      系统中A功能存在分布式事务未提交的问题,处理A功能的线程执行完成后回到线程池,B功能复用了处理A功能的线程,A功能未提交的分布式事务将B功能的代码也包含进去了,依次类推可能还有C、D等功能,直到A功能开启的分布式事务超时报出6107错误,而报错的B或C、D等功能本身是没有问题的。

Demo验证:

   1:      ThreadPool.SetMinThreads(1, 1);
   2:      ThreadPool.SetMaxThreads(1, 1);
   3:   
   4:      ThreadPool.QueueUserWorkItem((x) =>
   5:      {
   6:          Console.WriteLine("33: " + Thread.CurrentThread.ManagedThreadId);
   7:          TransactionScope ts = new TransactionScope(TransactionScopeOption.Required, new TimeSpan(0, 0, 10));
   8:          using (OracleConnection conn = new OracleConnection(connstr))
   9:          {
  10:              conn.Open();
  11:              var cmd = conn.CreateCommand();
  12:              cmd.CommandText = @"update workitem set name = 'yyxx' where workitemid = '64365dc4-41d7-4057-af68-012d552660e2'";
  13:              cmd.ExecuteNonQuery();
  14:              conn.Close();
  15:          }
  16:          Thread.Sleep(100);
  17:          Console.WriteLine("33: end");
  18:      });
  19:              
  20:      ThreadPool.QueueUserWorkItem((x) => {
  21:          Console.WriteLine("44: " + Thread.CurrentThread.ManagedThreadId);
  22:          try
  23:          {
  24:              while (true)
  25:              {
  26:                  using (OracleConnection conn = new OracleConnection(connstr))
  27:                  {
  28:                      conn.Open();
  29:                      var cmd = conn.CreateCommand();
  30:                      cmd.CommandText = @"update workitem set name = 'xxyy' where workitemid = '68265741-4433-42e9-b7e7-63ae1ce8bbc8'";
  31:                      cmd.ExecuteNonQuery();
  32:                      conn.Close();
  33:                  }
  34:                  Thread.Sleep(100);
  35:              }
  36:          }
  37:          catch (Exception e)
  38:          {
  39:              Console.WriteLine(e);
  40:          }
  41:          Console.WriteLine("44: end");
  42:      });

 

.NET分布式事务未提交造成6107错误或系统被挂起的问题分析定位

 

解决方法:

      使用HttpModule 分别在beginRequest和endRequest中判断是否已包含了事务上下文

      如果beginRequest中没有事务,endRequest时有未提交事务,则说明当前请求对应的是问题源头;

      如果beginRequest和endRequest都有,说明当前请求是线程复用后的受害者。

   示例如下:

   1:          protected void context_EndRequest(object sender, EventArgs e)
   2:          {
   3:              Debug.WriteLine(Thread.CurrentThread.ManagedThreadId);
   4:              Type t = typeof(Transaction);
   5:              FieldInfo field = t.GetField("complete", BindingFlags.Instance | BindingFlags.NonPublic);
   6:              if (Transaction.Current != null)
   7:              {
   8:                  bool isComplete = (bool) field.GetValue(Transaction.Current);
   9:                  if (!isComplete)
  10:                  {
  11:                      Debug.WriteLine(
  12:                          string.Format(@"Creation Time: {0}\r\nDistributedIdentifier:{1}\r\nLocalIdentifier:{2}",
  13:                              Transaction.Current.TransactionInformation.CreationTime,
  14:                              Transaction.Current.TransactionInformation.DistributedIdentifier,
  15:                              Transaction.Current.TransactionInformation.LocalIdentifier));
  16:                      //Transaction.Current.Rollback();
  17:                      //Transaction.Current.Dispose();
  18:                  }
  19:              }
  20:          }