I wrote the following program, using both gcc's __get_cpuid and inline assembly, to get the cache info of my laptop, but I cannot match the output against the table (Encoding of Cache and TLB Descriptors) that I found online.
#include <stdio.h>
#include <stdint.h>
#include <cpuid.h>

static inline void cpuid(uint32_t *eax, uint32_t *ebx,
                         uint32_t *ecx, uint32_t *edx);

int main(void) {
    uint32_t a, b, c, d;
    uint32_t eax, ebx, ecx, edx;
    eax = 2; /* leaf 2: cache and TLB descriptors */
    ecx = 0; /* subleaf input; initialized because cpuid() passes it in */
    uint32_t command = 2;
    cpuid(&eax, &ebx, &ecx, &edx);
    __get_cpuid(command, &a, &b, &c, &d);
    printf("eax: %08x\n", eax);
    printf("ebx: %08x\n", ebx);
    printf("ecx: %08x\n", ecx);
    printf("edx: %08x\n", edx);
    printf("a: %08x\n", a);
    printf("b: %08x\n", b);
    printf("c: %08x\n", c);
    printf("d: %08x\n", d);
    return 0;
}

static inline void cpuid(uint32_t *eax, uint32_t *ebx,
                         uint32_t *ecx, uint32_t *edx)
{
    /* ecx is often an input as well as an output, so pass it in too. */
    asm volatile("cpuid"
        : "=a" (*eax),
          "=b" (*ebx),
          "=c" (*ecx),
          "=d" (*edx)
        : "0" (*eax), "2" (*ecx));
}
My output:
eax: 76036301
ebx: 00f0b5ff
ecx: 00000000
edx: 00c10000
a: 76036301
b: 00f0b5ff
c: 00000000
d: 00c10000
I found this table here.
Using sysctl hw.cachesize, I find:
L1 cache: 32KB
L2 cache: 256KB
L3 cache: 6MB
My environment:
System: OS X 10.10.1
Compiler: clang-602.0.53
CPU: i7-4850HQ, 2.3 GHz
What's wrong with my program? It should work, since both methods give the same result, so I am confused by the output. Thank you!
EDIT: I tried what Mats suggested and got the following output:
gcc intrinsic
a: 76036301
b: 00f0b5ff
c: 00000000
d: 00c10000
eax: 2
eax: 76036301
ebx: 00f0b5ff
ecx: 00000000
edx: 00c10000
eax: 4, ecx: 0
eax: 1c004121
ebx: 01c0003f
ecx: 0000003f
edx: 00000000
eax: 4, ecx: 1
eax: 1c004122
ebx: 01c0003f
ecx: 0000003f
edx: 00000000
eax: 4, ecx: 2
eax: 1c004143
ebx: 01c0003f
ecx: 000001ff
edx: 00000000
eax: 4, ecx: 3
eax: 1c03c163
ebx: 02c0003f
ecx: 00001fff
edx: 00000006
eax: 4, ecx: 4
eax: 1c03c183
ebx: 03c0f03f
ecx: 00001fff
edx: 00000004
eax: 4, ecx: 5
eax: 00000000
ebx: 00000000
ecx: 00000000
edx: 00000000
I looked up the table here:
static cpuid_cache_descriptor_t intel_cpuid_leaf2_descriptor_table[] = {
// -------------------------------------------------------
// value type level ways size entries
// -------------------------------------------------------
{ 0x00, _NULL_, NA, NA, NA, NA },
{ 0x01, TLB, INST, 4, SMALL, 32 },
{ 0x02, TLB, INST, FULLY, LARGE, 2 },
{ 0x03, TLB, DATA, 4, SMALL, 64 },
{ 0x04, TLB, DATA, 4, LARGE, 8 },
{ 0x05, TLB, DATA1, 4, LARGE, 32 },
{ 0x06, CACHE, L1_INST, 4, 8*K, 32 },
{ 0x08, CACHE, L1_INST, 4, 16*K, 32 },
{ 0x09, CACHE, L1_INST, 4, 32*K, 64 },
{ 0x0A, CACHE, L1_DATA, 2, 8*K, 32 },
{ 0x0B, TLB, INST, 4, LARGE, 4 },
{ 0x0C, CACHE, L1_DATA, 4, 16*K, 32 },
{ 0x0D, CACHE, L1_DATA, 4, 16*K, 64 },
{ 0x0E, CACHE, L1_DATA, 6, 24*K, 64 },
{ 0x21, CACHE, L2, 8, 256*K, 64 },
{ 0x22, CACHE, L3_2LINESECTOR, 4, 512*K, 64 },
{ 0x23, CACHE, L3_2LINESECTOR, 8, 1*M, 64 },
{ 0x25, CACHE, L3_2LINESECTOR, 8, 2*M, 64 },
{ 0x29, CACHE, L3_2LINESECTOR, 8, 4*M, 64 },
{ 0x2C, CACHE, L1_DATA, 8, 32*K, 64 },
{ 0x30, CACHE, L1_INST, 8, 32*K, 64 },
{ 0x40, CACHE, L2, NA, 0, NA },
{ 0x41, CACHE, L2, 4, 128*K, 32 },
{ 0x42, CACHE, L2, 4, 256*K, 32 },
{ 0x43, CACHE, L2, 4, 512*K, 32 },
{ 0x44, CACHE, L2, 4, 1*M, 32 },
{ 0x45, CACHE, L2, 4, 2*M, 32 },
{ 0x46, CACHE, L3, 4, 4*M, 64 },
{ 0x47, CACHE, L3, 8, 8*M, 64 },
{ 0x48, CACHE, L2, 12, 3*M, 64 },
{ 0x49, CACHE, L2, 16, 4*M, 64 },
{ 0x4A, CACHE, L3, 12, 6*M, 64 },
{ 0x4B, CACHE, L3, 16, 8*M, 64 },
{ 0x4C, CACHE, L3, 12, 12*M, 64 },
{ 0x4D, CACHE, L3, 16, 16*M, 64 },
{ 0x4E, CACHE, L2, 24, 6*M, 64 },
{ 0x4F, TLB, INST, NA, SMALL, 32 },
{ 0x50, TLB, INST, NA, BOTH, 64 },
{ 0x51, TLB, INST, NA, BOTH, 128 },
{ 0x52, TLB, INST, NA, BOTH, 256 },
{ 0x55, TLB, INST, FULLY, BOTH, 7 },
{ 0x56, TLB, DATA0, 4, LARGE, 16 },
{ 0x57, TLB, DATA0, 4, SMALL, 16 },
{ 0x59, TLB, DATA0, FULLY, SMALL, 16 },
{ 0x5A, TLB, DATA0, 4, LARGE, 32 },
{ 0x5B, TLB, DATA, NA, BOTH, 64 },
{ 0x5C, TLB, DATA, NA, BOTH, 128 },
{ 0x5D, TLB, DATA, NA, BOTH, 256 },
{ 0x60, CACHE, L1, 16*K, 8, 64 },
{ 0x61, CACHE, L1, 4, 8*K, 64 },
{ 0x62, CACHE, L1, 4, 16*K, 64 },
{ 0x63, CACHE, L1, 4, 32*K, 64 },
{ 0x70, CACHE, TRACE, 8, 12*K, NA },
{ 0x71, CACHE, TRACE, 8, 16*K, NA },
{ 0x72, CACHE, TRACE, 8, 32*K, NA },
{ 0x78, CACHE, L2, 4, 1*M, 64 },
{ 0x79, CACHE, L2_2LINESECTOR, 8, 128*K, 64 },
{ 0x7A, CACHE, L2_2LINESECTOR, 8, 256*K, 64 },
{ 0x7B, CACHE, L2_2LINESECTOR, 8, 512*K, 64 },
{ 0x7C, CACHE, L2_2LINESECTOR, 8, 1*M, 64 },
{ 0x7D, CACHE, L2, 8, 2*M, 64 },
{ 0x7F, CACHE, L2, 2, 512*K, 64 },
{ 0x80, CACHE, L2, 8, 512*K, 64 },
{ 0x82, CACHE, L2, 8, 256*K, 32 },
{ 0x83, CACHE, L2, 8, 512*K, 32 },
{ 0x84, CACHE, L2, 8, 1*M, 32 },
{ 0x85, CACHE, L2, 8, 2*M, 32 },
{ 0x86, CACHE, L2, 4, 512*K, 64 },
{ 0x87, CACHE, L2, 8, 1*M, 64 },
{ 0xB0, TLB, INST, 4, SMALL, 128 },
{ 0xB1, TLB, INST, 4, LARGE, 8 },
{ 0xB2, TLB, INST, 4, SMALL, 64 },
{ 0xB3, TLB, DATA, 4, SMALL, 128 },
{ 0xB4, TLB, DATA1, 4, SMALL, 256 },
{ 0xBA, TLB, DATA1, 4, BOTH, 64 },
{ 0xCA, STLB, DATA1, 4, BOTH, 512 },
{ 0xD0, CACHE, L3, 4, 512*K, 64 },
{ 0xD1, CACHE, L3, 4, 1*M, 64 },
{ 0xD2, CACHE, L3, 4, 2*M, 64 },
{ 0xD3, CACHE, L3, 4, 4*M, 64 },
{ 0xD4, CACHE, L3, 4, 8*M, 64 },
{ 0xD6, CACHE, L3, 8, 1*M, 64 },
{ 0xD7, CACHE, L3, 8, 2*M, 64 },
{ 0xD8, CACHE, L3, 8, 4*M, 64 },
{ 0xD9, CACHE, L3, 8, 8*M, 64 },
{ 0xDA, CACHE, L3, 8, 12*M, 64 },
{ 0xDC, CACHE, L3, 12, 1536*K, 64 },
{ 0xDD, CACHE, L3, 12, 3*M, 64 },
{ 0xDE, CACHE, L3, 12, 6*M, 64 },
{ 0xDF, CACHE, L3, 12, 12*M, 64 },
{ 0xE0, CACHE, L3, 12, 18*M, 64 },
{ 0xE2, CACHE, L3, 16, 2*M, 64 },
{ 0xE3, CACHE, L3, 16, 4*M, 64 },
{ 0xE4, CACHE, L3, 16, 8*M, 64 },
{ 0xE5, CACHE, L3, 16, 16*M, 64 },
{ 0xE6, CACHE, L3, 16, 24*M, 64 },
{ 0xF0, PREFETCH, NA, NA, 64, NA },
{ 0xF1, PREFETCH, NA, NA, 128, NA }
};
The problem now is that I still cannot get the correct size of my L3 cache (when ecx=1 I get 22, i.e. 512K, but the correct value is 6MB). There also seems to be a conflict over the size of my L2 cache: 43 (when ecx=2) versus 21 (when ecx=0).
2 answers
#1
So, your data seems to be reasonably correct; you are just using an old reference. Unfortunately, Intel's website is either broken at present or it doesn't like Firefox and/or Linux.
76036301 breaks down as:
76 is an instruction TLB for 2M/4M pages, fully associative, 8 entries, according to Intel's current descriptor table (older tables do not list it).
03 means a 4-way data TLB with 64 entries.
63 is a 32KB L1 cache according to the source here, a value that is not in your docs; Intel's current table lists 63 as a data TLB descriptor instead.
01 is the low byte of EAX, which always reads 01 for this leaf. It is not a descriptor and should be ignored.
00f0b5ff breaks down as:
00 is the null descriptor ("nothing").
f0 means 64-byte prefetching.
b5 is not documented even on that link; Intel's current table lists it as an 8-way instruction TLB for 4K pages with 64 entries.
ff means that leaf 2 does not report cache descriptors on this CPU, and leaf 4 has to be used instead.
To get the L2 and L3 cache sizes, you need to use CPUID with EAX=4 and set ECX to 0, 1, 2, ... to step through the caches one at a time. The linked code shows this, and Intel's docs have details on which bits mean what.
#2
Intel's Instruction Set Reference has all the relevant information you need (at around page 263), and is actually up to date, unlike every other source I have found.
That reference describes what is probably the best way to get the cache info.
When eax = 4 and ecx selects the cache (an index starting at 0, not the cache level itself),
Ways = EBX[31:22]
Partitions = EBX[21:12]
LineSize = EBX[11:0]
Sets = ECX
Total Size = (Ways + 1) * (Partitions + 1) * (LineSize + 1) * (Sets + 1)
So when CPUID is called with eax = 4 and ecx = 3 (the index that reports the L3 cache in the OP's data), you can get your L3 cache size by doing the computation above. Using the OP's posted data:
ebx: 02c0003f
ecx: 00001fff
Ways = 11
Partitions = 0
LineSize = 63
Sets = 8191
Total L3 cache size = (11 + 1) * (0 + 1) * (63 + 1) * (8191 + 1) = 6291456 bytes
That is 6MB, which is what was expected.