How can I improve the performance of code using DateTime.ToString?

Time: 2021-01-01 03:56:16

In my binary to text decoding application (.NET 2.0) I found that the line:

logEntryTime.ToString("dd.MM.yy HH:mm:ss:fff")

takes 33% of total processing time. Does anyone have any ideas on how to make it faster?

EDIT: This app is used to process some binary logs and it currently takes 15 hours to run. So 1/3 of this will be 5 hours.

EDIT: I am using NProf for profiling. App is processing around 17 GBytes of binary logs.

4 answers

#1


It's unfortunate that .NET doesn't have a sort of "formatter" type which can parse a pattern and remember it.

If you're always using the same format, you might want to hand-craft a formatter to do exactly that. Something along the lines of:

public static string FormatDateTime(DateTime dt)
{
    char[] chars = new char[21];
    Write2Chars(chars, 0, dt.Day);
    chars[2] = '.';
    Write2Chars(chars, 3, dt.Month);
    chars[5] = '.';
    Write2Chars(chars, 6, dt.Year % 100);
    chars[8] = ' ';
    Write2Chars(chars, 9, dt.Hour);
    chars[11] = ':';
    Write2Chars(chars, 12, dt.Minute);
    chars[14] = ':';
    Write2Chars(chars, 15, dt.Second);
    chars[17] = ':';
    Write2Chars(chars, 18, dt.Millisecond / 10);
    chars[20] = Digit(dt.Millisecond % 10);

    return new string(chars);
}

private static void Write2Chars(char[] chars, int offset, int value)
{
    chars[offset] = Digit(value / 10);
    chars[offset+1] = Digit(value % 10);
}

private static char Digit(int value)
{
    return (char) (value + '0');
}

This is pretty ugly, but it's probably a lot more efficient... benchmark it, of course!
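
Before benchmarking, it may be worth confirming that a hand-written formatter produces the same text as the built-in one. The following is a minimal sketch, not a definitive test: the class name `FormatCheck` and the method `CountMismatches` are hypothetical, the formatter is a copy of the one above with ':' between the time fields, and the comparison assumes `CultureInfo.InvariantCulture`, because ':' in a custom format string is the culture's time separator.

```csharp
using System;
using System.Globalization;

static class FormatCheck
{
    const string Pattern = "dd.MM.yy HH:mm:ss:fff";

    // Hand-written equivalent of Pattern (assumes ':' as the time separator).
    public static string FormatDateTime(DateTime dt)
    {
        char[] chars = new char[21];
        Write2Chars(chars, 0, dt.Day);
        chars[2] = '.';
        Write2Chars(chars, 3, dt.Month);
        chars[5] = '.';
        Write2Chars(chars, 6, dt.Year % 100);
        chars[8] = ' ';
        Write2Chars(chars, 9, dt.Hour);
        chars[11] = ':';
        Write2Chars(chars, 12, dt.Minute);
        chars[14] = ':';
        Write2Chars(chars, 15, dt.Second);
        chars[17] = ':';
        Write2Chars(chars, 18, dt.Millisecond / 10);
        chars[20] = (char)('0' + dt.Millisecond % 10);
        return new string(chars);
    }

    static void Write2Chars(char[] chars, int offset, int value)
    {
        chars[offset] = (char)('0' + value / 10);
        chars[offset + 1] = (char)('0' + value % 10);
    }

    // Counts the sampled values for which the two formatters disagree.
    public static int CountMismatches(DateTime start, int samples, TimeSpan step)
    {
        int mismatches = 0;
        DateTime dt = start;
        for (int i = 0; i < samples; i++)
        {
            string expected = dt.ToString(Pattern, CultureInfo.InvariantCulture);
            if (expected != FormatDateTime(dt))
            {
                mismatches++;
            }
            dt = dt.Add(step);
        }
        return mismatches;
    }

    static void Main()
    {
        // An odd step so that every field, including milliseconds, varies.
        TimeSpan step = new TimeSpan(1, 13, 13, 7, 123);
        int bad = CountMismatches(new DateTime(2000, 1, 1), 10000, step);
        Console.WriteLine("Mismatches: " + bad);
    }
}
```

If the application formats with the current culture rather than the invariant one, the comparison would need to use that culture instead, and the hard-coded ':' may not match.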

#2


Are you sure it takes 33% of the time? How have you measured that? It sounds more than a little suspicious to me...

This makes things a little bit quicker (timings for 1,000,000 iterations first, benchmark code after):

Basic: 2342ms
Custom: 1319ms

Or if we cut out the IO (Stream.Null):

Basic: 2275ms
Custom: 839ms

using System.Diagnostics;
using System;
using System.IO;
static class Program
{
    static void Main()
    {
        DateTime when = DateTime.Now;
        const int LOOP = 1000000;

        Stopwatch basic = Stopwatch.StartNew();
        using (TextWriter tw = new StreamWriter("basic.txt"))
        {
            for (int i = 0; i < LOOP; i++)
            {
                tw.Write(when.ToString("dd.MM.yy HH:mm:ss:fff"));
            }
        }
        basic.Stop();
        Console.WriteLine("Basic: " + basic.ElapsedMilliseconds + "ms");

        char[] buffer = new char[100];
        Stopwatch custom = Stopwatch.StartNew();
        using (TextWriter tw = new StreamWriter("custom.txt"))
        {
            for (int i = 0; i < LOOP; i++)
            {
                WriteDateTime(tw, when, buffer);
            }
        }
        custom.Stop();
        Console.WriteLine("Custom: " + custom.ElapsedMilliseconds + "ms");
    }
    static void WriteDateTime(TextWriter output, DateTime when, char[] buffer)
    {
        buffer[2] = buffer[5] = '.';
        buffer[8] = ' ';
        buffer[11] = buffer[14] = buffer[17] = ':';
        Write2(buffer, when.Day, 0);
        Write2(buffer, when.Month, 3);
        Write2(buffer, when.Year % 100, 6);
        Write2(buffer, when.Hour, 9);
        Write2(buffer, when.Minute, 12);
        Write2(buffer, when.Second, 15);
        Write3(buffer, when.Millisecond, 18);
        output.Write(buffer, 0, 21);
    }
    static void Write2(char[] buffer, int value, int offset)
    {
        buffer[offset++] = (char)('0' + (value / 10));
        buffer[offset] = (char)('0' + (value % 10));
    }
    static void Write3(char[] buffer, int value, int offset)
    {
        buffer[offset++] = (char)('0' + (value / 100));
        buffer[offset++] = (char)('0' + ((value / 10) % 10));
        buffer[offset] = (char)('0' + (value % 10));
    }
}
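
The second pair of timings above was taken with the file IO removed, while the listing writes to files. One possible way to reproduce the no-IO variant is sketched below: the writers wrap `Stream.Null`, which discards everything written to it. The class `NullSinkTiming` and its method names are hypothetical, and the absolute numbers will depend on the machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class NullSinkTiming
{
    // Times the built-in formatter, writing into a sink that discards output.
    public static long TimeBasic(DateTime when, int loops)
    {
        Stopwatch watch = Stopwatch.StartNew();
        using (TextWriter tw = new StreamWriter(Stream.Null))
        {
            for (int i = 0; i < loops; i++)
            {
                tw.Write(when.ToString("dd.MM.yy HH:mm:ss:fff"));
            }
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    // Times the hand-written formatter against the same kind of sink.
    public static long TimeCustom(DateTime when, int loops)
    {
        char[] buffer = new char[21];
        Stopwatch watch = Stopwatch.StartNew();
        using (TextWriter tw = new StreamWriter(Stream.Null))
        {
            for (int i = 0; i < loops; i++)
            {
                WriteDateTime(tw, when, buffer);
            }
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    // Same layout as the listing above: dd.MM.yy HH:mm:ss:fff in 21 chars.
    public static void WriteDateTime(TextWriter output, DateTime when, char[] buffer)
    {
        buffer[2] = buffer[5] = '.';
        buffer[8] = ' ';
        buffer[11] = buffer[14] = buffer[17] = ':';
        Write2(buffer, when.Day, 0);
        Write2(buffer, when.Month, 3);
        Write2(buffer, when.Year % 100, 6);
        Write2(buffer, when.Hour, 9);
        Write2(buffer, when.Minute, 12);
        Write2(buffer, when.Second, 15);
        buffer[18] = (char)('0' + when.Millisecond / 100);
        Write2(buffer, when.Millisecond % 100, 19);
        output.Write(buffer, 0, 21);
    }

    static void Write2(char[] buffer, int value, int offset)
    {
        buffer[offset] = (char)('0' + value / 10);
        buffer[offset + 1] = (char)('0' + value % 10);
    }

    static void Main()
    {
        DateTime when = DateTime.Now;
        // One short untimed pass first, so JIT compilation is not measured.
        TimeBasic(when, 1000);
        TimeCustom(when, 1000);
        Console.WriteLine("Basic: " + TimeBasic(when, 1000000) + "ms");
        Console.WriteLine("Custom: " + TimeCustom(when, 1000000) + "ms");
    }
}
```

Comparing the file-backed and `Stream.Null` runs gives a rough idea of how much of the total is formatting and how much is IO.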

#3


This is not an answer in itself, but rather an addendum to Jon Skeet's excellent answer, offering a variant for the "s" (ISO) format:

    /// <summary>
    ///     Implements a fast method to write a DateTime value to string, in the ISO "s" format.
    /// </summary>
    /// <param name="dateTime">The date time.</param>
    /// <returns>The formatted string, or an empty string when <paramref name="dateTime"/> is null.</returns>
    /// <devdoc>
    ///     This implementation exists just for performance reasons; it is semantically identical to
    ///     <code>
    /// text = dateTime.HasValue ? dateTime.Value.ToString("s") : string.Empty;
    /// </code>
    ///     However, it runs about 3 times as fast. (Measured using the VS2015 performance profiler)
    /// </devdoc>
    public static string ToIsoStringFast(DateTime? dateTime) {
        if (!dateTime.HasValue) {
            return string.Empty;
        }
        DateTime dt = dateTime.Value;
        char[] chars = new char[19];
        Write4Chars(chars, 0, dt.Year);
        chars[4] = '-';
        Write2Chars(chars, 5, dt.Month);
        chars[7] = '-';
        Write2Chars(chars, 8, dt.Day);
        chars[10] = 'T';
        Write2Chars(chars, 11, dt.Hour);
        chars[13] = ':';
        Write2Chars(chars, 14, dt.Minute);
        chars[16] = ':';
        Write2Chars(chars, 17, dt.Second);
        return new string(chars);
    }

With the 4-digit serializer as follows (Write2Chars and Digit are the helpers from answer #1):

    private static void Write4Chars(char[] chars, int offset, int value) {
        chars[offset] = Digit(value / 1000);
        chars[offset + 1] = Digit(value / 100 % 10);
        chars[offset + 2] = Digit(value / 10 % 10);
        chars[offset + 3] = Digit(value % 10);
    }

This runs about 3 times as fast. (Measured using the VS2015 performance profiler)
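
Since the snippets in this answer rely on helpers defined in another answer, a self-contained version may be easier to try out. The sketch below gathers the pieces into one hypothetical class, `IsoCheck`, and adds a small `Matches` method (also hypothetical) that compares the result with `ToString("s")`. It makes no claim about the speed-up, which should be measured separately.

```csharp
using System;

static class IsoCheck
{
    // Formats as yyyy-MM-ddTHH:mm:ss, or returns "" for a null value.
    public static string ToIsoStringFast(DateTime? dateTime)
    {
        if (!dateTime.HasValue)
        {
            return string.Empty;
        }
        DateTime dt = dateTime.Value;
        char[] chars = new char[19];
        Write4Chars(chars, 0, dt.Year);
        chars[4] = '-';
        Write2Chars(chars, 5, dt.Month);
        chars[7] = '-';
        Write2Chars(chars, 8, dt.Day);
        chars[10] = 'T';
        Write2Chars(chars, 11, dt.Hour);
        chars[13] = ':';
        Write2Chars(chars, 14, dt.Minute);
        chars[16] = ':';
        Write2Chars(chars, 17, dt.Second);
        return new string(chars);
    }

    static void Write4Chars(char[] chars, int offset, int value)
    {
        Write2Chars(chars, offset, value / 100);
        Write2Chars(chars, offset + 2, value % 100);
    }

    static void Write2Chars(char[] chars, int offset, int value)
    {
        chars[offset] = (char)('0' + value / 10);
        chars[offset + 1] = (char)('0' + value % 10);
    }

    // True when the fast path agrees with the built-in "s" format.
    public static bool Matches(DateTime dt)
    {
        return ToIsoStringFast(dt) == dt.ToString("s");
    }

    static void Main()
    {
        DateTime now = DateTime.Now;
        Console.WriteLine(ToIsoStringFast(now));
        Console.WriteLine("Matches ToString(\"s\"): " + Matches(now));
    }
}
```

Here `Write4Chars` is expressed as two `Write2Chars` calls, which is equivalent to the digit-by-digit version above for years 0 to 9999.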

#4


Do you know how big each record in the binary and text logs is going to be? If so, you can split the processing of the log file across a number of threads, which would make better use of a multi-core or multi-processor PC. If you don't mind the result being in separate files, it would be a good idea to have one hard disk per core; that way you reduce the amount the disk heads have to move.
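
As a rough illustration of the splitting step, the sketch below divides a file into contiguous byte ranges, one per worker. It assumes every record has the same fixed size, which the question does not confirm, and the names `ByteRange` and `LogPartitioner` are hypothetical. Each worker thread would then open its own stream, seek to its offset, and decode only its range.

```csharp
using System;

struct ByteRange
{
    public readonly long Offset;
    public readonly long Length;

    public ByteRange(long offset, long length)
    {
        Offset = offset;
        Length = length;
    }
}

static class LogPartitioner
{
    // Splits a file of fixed-size records into at most 'workers' ranges,
    // each starting and ending on a record boundary.
    public static ByteRange[] Split(long fileLength, int recordSize, int workers)
    {
        if (recordSize <= 0) throw new ArgumentOutOfRangeException("recordSize");
        if (workers <= 0) throw new ArgumentOutOfRangeException("workers");

        // A trailing partial record, if any, is ignored.
        long records = fileLength / recordSize;
        int parts = (int)Math.Min(workers, Math.Max(records, 1));

        ByteRange[] ranges = new ByteRange[parts];
        long perPart = records / parts;
        long extra = records % parts;
        long offset = 0;
        for (int i = 0; i < parts; i++)
        {
            // The first 'extra' ranges take one additional record each.
            long count = perPart + (i < extra ? 1 : 0);
            long length = count * recordSize;
            ranges[i] = new ByteRange(offset, length);
            offset += length;
        }
        return ranges;
    }

    static void Main()
    {
        foreach (ByteRange range in Split(1000, 100, 3))
        {
            Console.WriteLine(range.Offset + " +" + range.Length);
        }
    }
}
```

If the records are variable-length, the ranges cannot be computed from the file size alone, and a first pass to locate record boundaries would be needed.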
