I have a 32-bit (hexadecimal) word 0xaabbccdd and have to swap the 2nd and 3rd bytes. In the end it should look like 0xaaccbbdd.
How can I "mask" the 2nd and 3rd bytes, so that I can first load them into registers r1 and r2 and then swap them? I also know that I have to work with the lsl and lsr instructions, but I don't know how to start.
Sorry for my bad English. I hope someone can help me out!
Regards, Sebastian
5 solutions
#1
8
Back in the day we used to rely heavily on EOR for this kind of trickery.
You can do it in 4 cycles.
First off, we need the fact that A ^ (A ^ B) = B.
We start with 0xAABBCCDD, and we want 0xAACCBBDD. To get there, we need 0x00EEEE00 ^ 0xAABBCCDD, where EE = BB ^ CC: XORing BB with EE gives CC, and XORing CC with EE gives BB.
Now, we need a few cycles to build 0x00EEEE00:
eor r1,r0,r0,lsr #8
and r1,r1,#0xFF00
orr r1,r1,r1,lsl #8
eor r0,r0,r1
In C:
t=x^(x>>8);
t=t&0xFF00;
t=t|(t<<8);
x^=t;
Starting with AABBCCDD, the result after each instruction is:
eor XXXXEEXX
and 0000EE00
orr 00EEEE00
eor AACCBBDD
This will work on any 32-bit ARM core.
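For experimenting on a host machine, the four steps can be wrapped in a small C function. This is only a sketch: the name swap_mid_xor is made up for illustration, and uint32_t is assumed so that the right shift never drags in a sign bit.

```c
#include <stdint.h>

/* Swap the two middle bytes of a 32-bit word with the XOR trick.
   swap_mid_xor is a hypothetical name, not part of the original answer. */
uint32_t swap_mid_xor(uint32_t x)
{
    uint32_t t = x ^ (x >> 8);   /* byte 1 of t now holds BB ^ CC     */
    t &= 0x0000FF00u;            /* keep only that byte:  0000EE00    */
    t |= t << 8;                 /* copy it into byte 2:  00EEEE00    */
    return x ^ t;                /* BB ^ EE = CC and CC ^ EE = BB     */
}
```

Calling swap_mid_xor(0xAABBCCDD) should give 0xAACCBBDD, and applying the function twice returns the original word, since the swap is its own inverse.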
#2
6
That's not a simple task in ARM assembly because you can't easily use 32-bit constants. You have to break up every operation that masks out bytes so that it uses an 8-bit constant (these constants can also be rotated).
You mask out bytes 2 and 3 using the AND instruction and do the shift later. In ARM assembler most instructions give you one shift for free, so shifting into position and merging with the other bits often ends up being a single instruction.
Here is some untested code that does the middle byte swap (ARMv4, not the Thumb instruction set):
.text
swap_v4:
AND R2, R0, #0x00ff0000 @ R2=0x00BB0000 get byte 2
AND R3, R0, #0x0000ff00 @ R3=0x0000CC00 get byte 1
BIC R0, R0, #0x00ff0000 @ R0=0xAA00CCDD clear byte 2
BIC R0, R0, #0x0000ff00 @ R0=0xAA0000DD clear byte 1
ORR R0, R0, R2, LSR #8 @ R0=0xAA00BBDD merge and shift byte 2
ORR R0, R0, R3, LSL #8 @ R0=0xAACCBBDD merge and shift byte 1
MOV PC, LR @ return; on ARMv4T and later BX LR also works
That translates line by line into the following C code:
unsigned int swap (unsigned int R0)
{
unsigned int R2,R3;
R2 = R0 & 0x00ff0000;
R3 = R0 & 0x0000ff00;
R0 = R0 & 0xff00ffff;
R0 = R0 & 0xffff00ff;
R0 |= (R2>>8);
R0 |= (R3<<8);
return R0;
}
As you can see, that is a lot of lines for such a simple task. Not even the ARMv6 architecture helps much here.
EDIT: ARMv6 version (also untested, but two instructions shorter)
swap_v6:
@ bits in R0: aabbccdd
ROR R0, R0, #8 @ r0 = ddaabbcc
REV R1, R0 @ r1 = ccbbaadd
PKHTB R0, R0, R1, ASR #16 @ r0 = ddaaccbb (top half of r0, top half of r1 shifted down)
ROR R0, R0, #24 @ r0 = aaccbbdd
BX LR
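Since the ARMv6 version is marked as untested, its register trace can be checked on a host machine with a small C model of the three instructions involved. This is a sketch under assumptions: ror32, rev32, pkhtb and swap_mid_v6 are names invented here, and the PKHTB model only covers shift amounts from 0 to 16. Note that PKHTB takes the bottom halfword of its (optionally shifted) second operand, so the model passes a shift of 16 to bring "ccbb" down into the low half.

```c
#include <stdint.h>

/* ROR: rotate right by n bits, valid for 0 < n < 32. */
uint32_t ror32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32u - n));
}

/* REV: reverse the byte order of a word. */
uint32_t rev32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* PKHTB Rd, Rn, Rm, ASR #sh: top halfword of Rn combined with the
   bottom halfword of (Rm >> sh). For sh in 0..16 a logical shift
   yields the same low 16 bits as the arithmetic shift. */
uint32_t pkhtb(uint32_t rn, uint32_t rm, unsigned sh)
{
    return (rn & 0xFFFF0000u) | ((rm >> sh) & 0x0000FFFFu);
}

/* Hypothetical model of the four-instruction sequence. */
uint32_t swap_mid_v6(uint32_t r0)
{
    uint32_t r1;
    r0 = ror32(r0, 8);        /* aabbccdd -> ddaabbcc */
    r1 = rev32(r0);           /* r1 = ccbbaadd        */
    r0 = pkhtb(r0, r1, 16);   /* r0 = ddaaccbb        */
    return ror32(r0, 24);     /* r0 = aaccbbdd        */
}
```

With a shift of 0 instead of 16, pkhtb would pick up "aadd" from r1 and the final result would not be the desired word.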
#3
2
Hmmm, I don't know what happened; it submitted my answer before I had really started.
At first I didn't think I could do it with only two registers, but then I decided I could, and did. These solutions are register-only, with no memory access (other than the ldr r0,= in the test code, which you can replace with four instructions). If you use memory and two registers, you can perhaps cut down the number of instructions: str, bic, bic, ldrb, orr lsl, ldrb, orr lsl. That is one instruction fewer, but you then need a memory location, and the stores and loads cost cycles, so it ends up being the same amount of memory and more cycles. Someone else may have some good tricks. I think some of the newer cores have an endian swap instruction, which would make it even easier.
.globl midswap
midswap:
mov r2,r0,lsl #8 ;@ r2 = BBCCDD00
mov r3,r0,lsr #8 ;@ r3 = 00AABBCC (lsr shifts in zeros, so no sign bit is dragged in)
and r2,r2,#0x00FF0000 ;@ r2 = 00CC0000
and r3,r3,#0x0000FF00 ;@ r3 = 0000BB00
bic r0,r0,#0x00FF0000 ;@ r0 = AA00CCDD
bic r0,r0,#0x0000FF00 ;@ r0 = AA0000DD
orr r0,r0,r2 ;@ r0 = AACC00DD
orr r0,r0,r3 ;@ r0 = AACCBBDD
bx lr ;@ or mov pc,lr for older arm cores
.globl tworegs
tworegs:
mov r2,r0,ror #8 ;@ r2 = DDAABBCC
bic r2,r2,#0xFF000000 ;@ r2 = 00AABBCC
bic r2,r2,#0x00FF0000 ;@ r2 = 0000BBCC
orr r2,r2,ror #16 ;@ r2 = BBCCBBCC
bic r2,r2,#0xFF000000 ;@ r2 = 00CCBBCC
bic r2,r2,#0x000000FF ;@ r2 = 00CCBB00
bic r0,r0,#0x00FF0000 ;@ r0 = AA00CCDD
bic r0,r0,#0x0000FF00 ;@ r0 = AA0000DD
orr r0,r0,r2 ;@ r0 = AACCBBDD
bx lr
testfun:
ldr r0,=0xAABBCCDD
bl midswap
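The memory-based variant mentioned above (str, bic, bic, ldrb, orr lsl, ldrb, orr lsl) is only listed as instruction names, so the following C model is a guess at what it could look like. The assumptions are mine: the word is stored little-endian, the shift amounts are 16 and 8, and swap_mid_mem is a made-up name.

```c
#include <stdint.h>

/* Hypothetical model of the memory-based variant. The store is written
   out byte by byte so that the model behaves as little-endian memory
   regardless of the host. */
uint32_t swap_mid_mem(uint32_t r0)
{
    uint8_t mem[4];
    uint32_t r2;

    /* str r0,[mem] */
    mem[0] = (uint8_t)(r0);
    mem[1] = (uint8_t)(r0 >> 8);
    mem[2] = (uint8_t)(r0 >> 16);
    mem[3] = (uint8_t)(r0 >> 24);

    r0 &= ~0x00FF0000u;   /* bic r0,r0,#0x00FF0000  -> AA00CCDD */
    r0 &= ~0x0000FF00u;   /* bic r0,r0,#0x0000FF00  -> AA0000DD */
    r2 = mem[1];          /* ldrb r2,[mem,#1]       -> 000000CC */
    r0 |= r2 << 16;       /* orr r0,r0,r2,lsl #16   -> AACC00DD */
    r2 = mem[2];          /* ldrb r2,[mem,#2]       -> 000000BB */
    r0 |= r2 << 8;        /* orr r0,r0,r2,lsl #8    -> AACCBBDD */
    return r0;
}
```

On a big-endian memory layout the two byte offsets would hold BB and CC the other way round, so the two shift amounts would have to be exchanged.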
#4
1
Can you use BFI and UBFX? They will make your job easier.
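BFI (bit field insert) and UBFX (unsigned bit field extract) were introduced with ARMv6T2, so they are not available on older cores. The answer gives no code, so the sketch below models the two instructions in C and shows one possible four-instruction sequence in the comments. The helper names and the sequence itself are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `width` bits set, for 1 <= width <= 32. */
static uint32_t low_mask(unsigned width)
{
    return width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

/* UBFX Rd, Rn, #lsb, #width: extract a bit field, zero-extended. */
uint32_t ubfx(uint32_t rn, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    return (rn >> lsb) & low_mask(width);
}

/* BFI Rd, Rn, #lsb, #width: copy the low `width` bits of rn into rd,
   starting at bit `lsb`; all other bits of rd are kept. */
uint32_t bfi(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    uint32_t mask = low_mask(width) << lsb;
    return (rd & ~mask) | ((rn << lsb) & mask);
}

/* Hypothetical middle-byte swap built from the two models. */
uint32_t swap_mid_bfi(uint32_t r0)
{
    uint32_t r1 = ubfx(r0, 8, 8);    /* ubfx r1,r0,#8,#8   -> 000000CC */
    uint32_t r2 = ubfx(r0, 16, 8);   /* ubfx r2,r0,#16,#8  -> 000000BB */
    r0 = bfi(r0, r1, 16, 8);         /* bfi  r0,r1,#16,#8  -> AACCCCDD */
    r0 = bfi(r0, r2, 8, 8);          /* bfi  r0,r2,#8,#8   -> AACCBBDD */
    return r0;
}
```

Compared with the AND/BIC/ORR versions, no separate clearing step is needed, because BFI overwrites the destination field as it inserts.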
#5
0
You could just use pointers to swap the two bytes (in C; BYTE, WORD and DWORD are assumed to be the usual 8-, 16- and 32-bit unsigned typedefs):
static union {
BYTE BBuf[4];
WORD WWBuf[2];
DWORD DWBuf;
} swap;
unsigned char *a;
unsigned char *b;
swap.DWBuf = 0xaabbccdd;
a = &swap.BBuf[1];
b = &swap.BBuf[2];
*a ^= *b;
*b ^= *a;
*a ^= *b;
And now the result is
swap.DWBuf == 0xaaccbbdd;