Investigating mktime Performance Problems

Date: 2021-05-16 16:57:13

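I. Problem Statement

In a meeting, some colleagues mentioned problems they had hit with mktime: 1) it becomes very slow once tm_isdst is set; 2) setting the TZ environment variable speeds it up dramatically. So I wanted to find out what is actually going on.

Is mktime really that slow? If so, why?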

II. Tests and Verification

Environment (results may differ greatly between environments; everything below holds only for this one)

$ cat /proc/version
Linux version --tlinux2-.tl2 (mockbuild@TENCENT64.site)
(gcc version   (Red Hat -) (GCC) )
# SMP Fri Apr  :: CST 

$ getconf -a | grep glibc -i
GNU_LIBC_VERSION                   glibc 2.17

First, I wrote a simple mktime test.

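#include <sys/time.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>

typedef int64_t timestamp_t;

// Current wall-clock time in microseconds.
static timestamp_t get_timestamp()
{
    struct timeval tv = {};
    gettimeofday(&tv, NULL);
    return (timestamp_t)tv.tv_sec * 1000000 + (timestamp_t)tv.tv_usec;
}

static void call_mktime(int isdst)
{
    // NOTE: the date literals were lost from the original listing.
    // 2016-01-01 02:00:00 is assumed here because it is one of the
    // dates in the results table and is consistent with the output below.
    struct tm tm = {};
    tm.tm_year = 2016 - 1900;   // years since 1900
    tm.tm_mon  = 1 - 1;         // months are 0-based
    tm.tm_mday = 1;
    tm.tm_hour = 2;
    tm.tm_min  = 0;
    tm.tm_sec  = 0;
    tm.tm_isdst = isdst;
    if ((time_t)-1 == mktime(&tm)) {
        abort();
    }
}

int main()
{
    const int isdsts[] = {1, 0};

    for (const auto &isdst: isdsts) {
        timestamp_t t1 = get_timestamp();
        const int N = 100;
        for (int i = 0; i < N; ++i) {
            call_mktime(isdst);
        }
        timestamp_t t2 = get_timestamp();
        printf("isdst=%d rounds %d avg cost %4.2f us\n", isdst, N, 1.0*(t2-t1)/N);
    }

    return 0;
}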

Running it gives the following result:

$ TZ="Asia/Shanghai" ./app_test_mktime 
isdst=1 rounds 100 avg cost 484.32 us 
isdst=0 rounds 100 avg cost 2.17 us

It really is slow! Painfully slow!

But is that really the whole story?

Why not try another time zone? Let's use US Eastern time.

$ TZ="US/Eastern" ./app_test_mktime 
isdst=1 rounds 100 avg cost 2.31 us 
isdst=0 rounds 100 avg cost 0.20 us

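Surprise: it is not slow at all. The cost is only a few microseconds. isdst=1 is still somewhat slower than isdst=0, but nowhere near as dramatically.

III. Basic Conclusion

Following this line of thought, I widened the set of parameters being varied to include the following factors:

  1. Calendar time
  2. Time zone configuration
  3. Input isdst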

The final results (the source code is given in the appendix):

Calendar time     Time zone      DST in effect  isdst=1  isdst=0  isdst=-1
1688-06-01 02:00  Asia/Shanghai  N               103.95     0.23      0.23
1688-06-01 02:00  US/Eastern     N               114.6      0.26      0.23
1688-06-01 02:00  America/Jujuy  N               125.26     0.27      0.25
1960-06-01 02:00  Asia/Shanghai  N               158.9      0.3       0.29
1960-06-01 02:00  US/Eastern     Y                 0.6      4.3       0.39
1960-06-01 02:00  America/Jujuy  Y                 0.48    67.24      0.34
1986-06-01 02:00  Asia/Shanghai  Y                 0.48     2.47      0.3
1986-06-01 02:00  US/Eastern     Y                 0.44     2.73      0.3
1986-06-01 02:00  America/Jujuy  N                54.67     0.34      0.32
2016-01-01 02:00  Asia/Shanghai  N               501.6      0.76      0.76
2016-01-01 02:00  US/Eastern     N                 4.49     0.32      0.31
2016-01-01 02:00  America/Jujuy  N               507.45     0.81      0.78
2016-05-01 02:00  Asia/Shanghai  N               505.49     0.7       0.7
2016-05-01 02:00  US/Eastern     Y                 0.64     3.77      0.31
2016-05-01 02:00  America/Jujuy  N               514.04     0.76      0.77

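The last three columns give the time consumed by one mktime call, in microseconds, for each input isdst value. Note the America/Jujuy time zone, which is particularly interesting.

One basic conclusion can be drawn from this table: when tm_isdst does not match the DST state of the calendar time, mktime takes more time.

  • If DST is in effect at the given calendar time, isdst=1 is faster than isdst=0;
  • If standard time is in effect at the given calendar time, isdst=1 is slower than isdst=0;
  • mktime can be called with isdst=-1 to let glibc determine the DST flag automatically from the time zone.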

The mktime() function converts a broken-down time structure, expressed as local time, to calendar time representation. The function ignores the values supplied by the caller in the tm_wday and tm_yday fields. The value specified in the tm_isdst field informs mktime() whether or not daylight saving time (DST) is in effect for the time supplied in the tm structure: a positive value means DST is in effect; zero means that DST is not in effect; and a negative value means that mktime() should (use timezone information and system databases to) attempt to determine whether DST is in effect at the specified time.

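As for the performance difference caused by setting the TZ environment variable, as mentioned in the KM article, it is very small. Anyone interested can run a test (see test_setenv.cpp in the appendix).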

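IV. Further Investigation

Why does the speed differ so much when isdst is set inappropriately? The answer lies in how mktime is implemented.

mktime converts a broken-down (year/month/day) time into a timestamp in several steps:

  1. In at most 6 iterations, guess a timestamp t1 for which localtime(t1) yields the requested year/month/day representation. Its DST flag (isdst), however, may differ from the one passed in by the caller.
  2. If the DST flag differs from the isdst passed by the caller (0 vs 1, or 1 vs 0), search before and after t1 for a suitable calendar time. If one is found whose DST flag matches, the DST setting in effect at that time is used. If nothing suitable is found, t1 is returned as-is.
  3. Search idea: starting from the current time, look in both directions for a DST or standard-time period that matches the input parameter, and use that period's settings to correct t1.
  4. Search algorithm:
    • Step: 601200 seconds. This is the shortest of all DST periods: about 7 days, in time zone America/Recife.
    • Range: a 536454000-second window centered on t1. This is the longest of all DST periods: about 17 years, in time zone America/Jujuy. The range is [t1 - (536454000/2 + 601200), t1 + (536454000/2 + 601200)], so there are at most 894 iterations.
    • Iteration: call localtime(tx) on the current probe timestamp tx. If the isdst of the result equals the input isdst, it is a hit and the loop exits. Otherwise continue.
    • Hit: using the DST setting at the hit, compute the correct timestamp t3. The conversion succeeds.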

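The more iterations, the more time is spent. If the POSIX time obtained in step 1 is far from the nearest matching period (DST or standard time), the search takes a long time.

Asia/Shanghai has had no DST since 1991, whereas US/Eastern has DST every year. In most cases, therefore, the former needs far more iterations than the latter, which explains the table above well.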
 
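Use zdump to look at the time zone database entries for the different zones, and note why 1688 and 1986 were chosen as two of the test calendar times.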

$ zdump -v Asia/Shanghai
Asia/Shanghai -9223372036854775808 = NULL
Asia/Shanghai -9223372036854689408 = NULL
Asia/Shanghai Mon Dec 31 15:54:16 1900 UTC = Mon Dec 31 23:59:59 1900 LMT isdst=0 gmtoff=29143
Asia/Shanghai Mon Dec 31 15:54:17 1900 UTC = Mon Dec 31 23:54:17 1900 CST isdst=0 gmtoff=28800
Asia/Shanghai Sun Jun 2 15:59:59 1940 UTC = Sun Jun 2 23:59:59 1940 CST isdst=0 gmtoff=28800
Asia/Shanghai Sun Jun 2 16:00:00 1940 UTC = Mon Jun 3 01:00:00 1940 CDT isdst=1 gmtoff=32400
Asia/Shanghai Mon Sep 30 14:59:59 1940 UTC = Mon Sep 30 23:59:59 1940 CDT isdst=1 gmtoff=32400
Asia/Shanghai Mon Sep 30 15:00:00 1940 UTC = Mon Sep 30 23:00:00 1940 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Mar 15 15:59:59 1941 UTC = Sat Mar 15 23:59:59 1941 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Mar 15 16:00:00 1941 UTC = Sun Mar 16 01:00:00 1941 CDT isdst=1 gmtoff=32400
Asia/Shanghai Tue Sep 30 14:59:59 1941 UTC = Tue Sep 30 23:59:59 1941 CDT isdst=1 gmtoff=32400
Asia/Shanghai Tue Sep 30 15:00:00 1941 UTC = Tue Sep 30 23:00:00 1941 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat May 3 15:59:59 1986 UTC = Sat May 3 23:59:59 1986 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat May 3 16:00:00 1986 UTC = Sun May 4 01:00:00 1986 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 13 14:59:59 1986 UTC = Sat Sep 13 23:59:59 1986 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 13 15:00:00 1986 UTC = Sat Sep 13 23:00:00 1986 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Apr 11 15:59:59 1987 UTC = Sat Apr 11 23:59:59 1987 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Apr 11 16:00:00 1987 UTC = Sun Apr 12 01:00:00 1987 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 12 14:59:59 1987 UTC = Sat Sep 12 23:59:59 1987 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 12 15:00:00 1987 UTC = Sat Sep 12 23:00:00 1987 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Apr 9 15:59:59 1988 UTC = Sat Apr 9 23:59:59 1988 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Apr 9 16:00:00 1988 UTC = Sun Apr 10 01:00:00 1988 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 10 14:59:59 1988 UTC = Sat Sep 10 23:59:59 1988 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 10 15:00:00 1988 UTC = Sat Sep 10 23:00:00 1988 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Apr 15 15:59:59 1989 UTC = Sat Apr 15 23:59:59 1989 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Apr 15 16:00:00 1989 UTC = Sun Apr 16 01:00:00 1989 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 16 14:59:59 1989 UTC = Sat Sep 16 23:59:59 1989 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 16 15:00:00 1989 UTC = Sat Sep 16 23:00:00 1989 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Apr 14 15:59:59 1990 UTC = Sat Apr 14 23:59:59 1990 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Apr 14 16:00:00 1990 UTC = Sun Apr 15 01:00:00 1990 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 15 14:59:59 1990 UTC = Sat Sep 15 23:59:59 1990 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 15 15:00:00 1990 UTC = Sat Sep 15 23:00:00 1990 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Apr 13 15:59:59 1991 UTC = Sat Apr 13 23:59:59 1991 CST isdst=0 gmtoff=28800
Asia/Shanghai Sat Apr 13 16:00:00 1991 UTC = Sun Apr 14 01:00:00 1991 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 14 14:59:59 1991 UTC = Sat Sep 14 23:59:59 1991 CDT isdst=1 gmtoff=32400
Asia/Shanghai Sat Sep 14 15:00:00 1991 UTC = Sat Sep 14 23:00:00 1991 CST isdst=0 gmtoff=28800
Asia/Shanghai 9223372036854689407 = NULL
Asia/Shanghai 9223372036854775807 = NULL 

$ zdump -v US/Eastern
US/Eastern -9223372036854775808 = NULL
US/Eastern -9223372036854689408 = NULL
US/Eastern Sun Nov 18 16:59:59 1883 UTC = Sun Nov 18 12:03:57 1883 LMT isdst=0 gmtoff=-17762
US/Eastern Sun Nov 18 17:00:00 1883 UTC = Sun Nov 18 12:00:00 1883 EST isdst=0 gmtoff=-18000
US/Eastern Sun Mar 31 06:59:59 1918 UTC = Sun Mar 31 01:59:59 1918 EST isdst=0 gmtoff=-18000
US/Eastern Sun Mar 31 07:00:00 1918 UTC = Sun Mar 31 03:00:00 1918 EDT isdst=1 gmtoff=-14400
US/Eastern Sun Oct 27 05:59:59 1918 UTC = Sun Oct 27 01:59:59 1918 EDT isdst=1 gmtoff=-14400
US/Eastern Sun Oct 27 06:00:00 1918 UTC = Sun Oct 27 01:00:00 1918 EST isdst=0 gmtoff=-18000
US/Eastern Sun Mar 30 06:59:59 1919 UTC = Sun Mar 30 01:59:59 1919 EST isdst=0 gmtoff=-18000
US/Eastern Sun Mar 30 07:00:00 1919 UTC = Sun Mar 30 03:00:00 1919 EDT isdst=1 gmtoff=-14400
… (several lines omitted) …
US/Eastern Sun Mar 9 06:59:59 2498 UTC = Sun Mar 9 01:59:59 2498 EST isdst=0 gmtoff=-18000
US/Eastern Sun Mar 9 07:00:00 2498 UTC = Sun Mar 9 03:00:00 2498 EDT isdst=1 gmtoff=-14400
US/Eastern Sun Nov 2 05:59:59 2498 UTC = Sun Nov 2 01:59:59 2498 EDT isdst=1 gmtoff=-14400
US/Eastern Sun Nov 2 06:00:00 2498 UTC = Sun Nov 2 01:00:00 2498 EST isdst=0 gmtoff=-18000
US/Eastern Sun Mar 8 06:59:59 2499 UTC = Sun Mar 8 01:59:59 2499 EST isdst=0 gmtoff=-18000
US/Eastern Sun Mar 8 07:00:00 2499 UTC = Sun Mar 8 03:00:00 2499 EDT isdst=1 gmtoff=-14400
US/Eastern Sun Nov 1 05:59:59 2499 UTC = Sun Nov 1 01:59:59 2499 EDT isdst=1 gmtoff=-14400
US/Eastern Sun Nov 1 06:00:00 2499 UTC = Sun Nov 1 01:00:00 2499 EST isdst=0 gmtoff=-18000
US/Eastern 9223372036854689407 = NULL
US/Eastern 9223372036854775807 = NULL

V. Appendix

1. test_timezone.cpp

Measures mktime performance in different time zones.

#include <sys/time.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>

typedef int64_t timestamp_t;

// Current wall-clock time in microseconds.
static timestamp_t get_timestamp()
{
    struct timeval tv = {};
    gettimeofday(&tv, NULL);
    return (timestamp_t)tv.tv_sec * 1000000 + (timestamp_t)tv.tv_usec;
}

struct calendar_time {
    int year;
    int month;
    int day;
    int hour;
    int minute;
    int second;
};

static void call_mktime(
        const calendar_time &calendar,
        int isdst)
{
    struct tm tm = {};
    tm.tm_year = calendar.year - 1900;  // years since 1900
    tm.tm_mon  = calendar.month- 1;     // months are 0-based
    tm.tm_mday = calendar.day;
    tm.tm_hour = calendar.hour;
    tm.tm_min  = calendar.minute;
    tm.tm_sec  = calendar.second;
    tm.tm_isdst = isdst;
    if ((time_t)-1 == mktime(&tm)) {
        abort();
    }
}

int main()
{
    const char *timezones[] = { "Asia/Shanghai", "US/Eastern", "America/Jujuy"};

    // NOTE: the original listing had seven entries whose values were lost.
    // Only the five dates that appear in the results table are listed here.
    calendar_time times[] = {
        {1688, 6, 1, 2, 0, 0},
        {1960, 6, 1, 2, 0, 0},
        {1986, 6, 1, 2, 0, 0},
        {2016, 1, 1, 2, 0, 0},
        {2016, 5, 1, 2, 0, 0},
    };

    const int isdsts[] = {1, 0, -1};

    for (const auto &calendar: times) {
        for (const auto &tz: timezones) {
            for (const auto &isdst: isdsts) {
                setenv("TZ", tz, 1);
                timestamp_t t1 = get_timestamp();
                const int N = 100;  // assumed: the original value was lost
                for (int i = 0; i < N; ++i) {
                    call_mktime(calendar, isdst);
                }
                timestamp_t t2 = get_timestamp();

                printf("calendar: %04d-%02d-%02d %02d:%02d:%02d ",
                        calendar.year, calendar.month, calendar.day,
                        calendar.hour, calendar.minute, calendar.second);

                printf("%-20s isdst=%2d rounds %d avg cost %4.2f us\n", tz, isdst, N, 1.0*(t2-t1)/N);
            }
            printf("\n");
        }
        printf("-------------------------------------------\n");
    }

    return 0;
}

2. test_setenv.cpp

Measures the performance difference between calling setenv("TZ") and leaving TZ unset.

#include <sys/time.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>

typedef int64_t timestamp_t;

// Current wall-clock time in microseconds.
static timestamp_t get_timestamp()
{
    struct timeval tv = {};
    gettimeofday(&tv, NULL);
    return (timestamp_t)tv.tv_sec * 1000000 + (timestamp_t)tv.tv_usec;
}

struct calendar_time {
    int year;
    int month;
    int day;
    int hour;
    int minute;
    int second;
};

static void call_mktime(
        const calendar_time &calendar,
        int isdst)
{
    struct tm tm = {};
    tm.tm_year = calendar.year - 1900;  // years since 1900
    tm.tm_mon  = calendar.month- 1;     // months are 0-based
    tm.tm_mday = calendar.day;
    tm.tm_hour = calendar.hour;
    tm.tm_min  = calendar.minute;
    tm.tm_sec  = calendar.second;
    tm.tm_isdst = isdst;
    if ((time_t)-1 == mktime(&tm)) {
        abort();
    }
}

int main()
{
    const char *tz = "Asia/Shanghai";

    // NOTE: the four original entries, the isdst argument and N were
    // lost from the listing. The values below are stand-ins (dates taken
    // from the results table), not the author's originals.
    calendar_time times[] = {
        {1960, 6, 1, 2, 0, 0},
        {1986, 6, 1, 2, 0, 0},
        {2016, 1, 1, 2, 0, 0},
        {2016, 5, 1, 2, 0, 0},
    };
    const int isdst = -1;  // assumed
    const int N = 100;     // assumed

    for (const auto &calendar: times) {
        printf("calendar: %04d-%02d-%02d %02d:%02d:%02d\n",
                calendar.year, calendar.month, calendar.day,
                calendar.hour, calendar.minute, calendar.second);

        {
            // TZ unset: the system default time zone is used.
            unsetenv("TZ");
            timestamp_t t1 = get_timestamp();
            for (int i = 0; i < N; ++i) {
                call_mktime(calendar, isdst);
            }
            timestamp_t t2 = get_timestamp();
            printf("unsetenv %-20s rounds %d avg cost %4ld us\n", tz, N, (t2-t1)/N);
        }

        {
            // TZ set explicitly.
            setenv("TZ", tz, 1);

            timestamp_t t1 = get_timestamp();
            for (int i = 0; i < N; ++i) {
                call_mktime(calendar, isdst);
            }
            timestamp_t t2 = get_timestamp();
            printf("setenv   %-20s rounds %d avg cost %4ld us\n", tz, N, (t2-t1)/N);
        }
        printf("\n");
    }

    return 0;
}

VI. References

  1. https://github.com/lattera/glibc/blob/master/time/mktime.c