Hello Josh and Peter,
As mentioned in the cover letter of my series "powerpc/objtool: uaccess
validation for PPC32 (v3)" [1], a few switch table lookups fail, and it
would help if you had ideas on how to handle them.
The first one is as follows. The first switch is properly detected, the second is not.
0000 00003818 <vsnprintf>:
...
0054 386c: 3f 40 00 00 lis r26,0 386e: R_PPC_ADDR16_HA .rodata+0x6c
0058 3870: 3f 20 00 00 lis r25,0 3872: R_PPC_ADDR16_HA .rodata+0x4c
005c 3874: 7f be eb 78 mr r30,r29
0060 3878: 3b 5a 00 00 addi r26,r26,0 387a: R_PPC_ADDR16_LO
.rodata+0x6c <== First switch table address loaded in r26 register
0064 387c: 3b 39 00 00 addi r25,r25,0 387e: R_PPC_ADDR16_LO
.rodata+0x4c <== Second switch table address loaded in r25 register
...
009c 38b4: 41 9d 02 64 bgt cr7,3b18 <vsnprintf+0x300> <==
Conditional jump to where second switch is
00a0 38b8: 55 29 10 3a slwi r9,r9,2
00a4 38bc: 7d 39 48 2e lwzx r9,r25,r9
00a8 38c0: 7d 29 ca 14 add r9,r9,r25
00ac 38c4: 7d 29 03 a6 mtctr r9
00b0 38c8: 4e 80 04 20 bctr <== Dynamic switch branch based on r25
register
...
0300 3b18: 39 29 ff f8 addi r9,r9,-8
0304 3b1c: 55 2a 06 3e clrlwi r10,r9,24
0308 3b20: 2b 8a 00 0a cmplwi cr7,r10,10
030c 3b24: 89 3f 00 00 lbz r9,0(r31)
0310 3b28: 41 9d 01 88 bgt cr7,3cb0 <vsnprintf+0x498>
0314 3b2c: 55 4a 10 3a slwi r10,r10,2
0318 3b30: 7d 5a 50 2e lwzx r10,r26,r10
031c 3b34: 7d 4a d2 14 add r10,r10,r26
0320 3b38: 7d 49 03 a6 mtctr r10
0324 3b3c: 4e 80 04 20 bctr <== Dynamic switch branch based on r26
register
...
Here is the second one. The first two switches are properly detected. The
third one fails: objtool stops looking for the table address when it finds
the previous switch's dynamic jump instruction.
0000 000004d4 <__nla_validate_parse>:
...
0084 558: 3e 40 00 00 lis r18,0 55a: R_PPC_ADDR16_HA .rodata+0xc8
...
0094 568: 3a 52 00 00 addi r18,r18,0 56a: R_PPC_ADDR16_LO
.rodata+0xc8 <== Loading r18 with switch table address
...
00b8 58c: 40 9c 00 dc bge cr7,668 <__nla_validate_parse+0x194>
<== Conditional jump to 668
...
0190 664: 48 00 07 24 b d88 <__nla_validate_parse+0x8b4> <==
Unconditional jump away
0194 668: a1 3e 00 02 lhz r9,2(r30)
...
01a8 67c: 40 9d 00 1c ble cr7,698 <__nla_validate_parse+0x1c4>
<== Conditional jump to 698
...
01c0 694: 48 00 05 b4 b c48 <__nla_validate_parse+0x774> <==
Unconditional jump away
01c4 698: 7d 09 a8 50 subf r8,r9,r21
...
0238 70c: 41 82 00 40 beq 74c <__nla_validate_parse+0x278> <==
Conditional jump to 74c
...
0274 748: 4b ff fe fc b 644 <__nla_validate_parse+0x170> <==
Unconditional jump away
0278 74c: 2b 87 00 11 cmplwi cr7,r7,17
027c 750: 41 9d 01 fc bgt cr7,94c <__nla_validate_parse+0x478>
0280 754: 3d 00 00 00 lis r8,0 756: R_PPC_ADDR16_HA .rodata+0x64
0284 758: 39 08 00 00 addi r8,r8,0 75a: R_PPC_ADDR16_LO
.rodata+0x64 <== Loading r8 with switch table address
0288 75c: 54 ea 10 3a slwi r10,r7,2
028c 760: 7d 48 50 2e lwzx r10,r8,r10
0290 764: 7d 0a 42 14 add r8,r10,r8
0294 768: 7d 09 03 a6 mtctr r8
0298 76c: 4e 80 04 20 bctr <== Switch jump based on register r8
...
02cc 7a0: 2f 93 00 00 cmpwi cr7,r19,0 <== Arrived here through
a switch
02d0 7a4: 40 be 04 f8 bne cr7,c9c <__nla_validate_parse+0x7c8>
02d4 7a8: 89 3c 00 01 lbz r9,1(r28)
02d8 7ac: 39 29 ff ff addi r9,r9,-1
02dc 7b0: 55 29 06 3e clrlwi r9,r9,24
02e0 7b4: 2b 89 00 06 cmplwi cr7,r9,6
02e4 7b8: 41 9d 05 14 bgt cr7,ccc <__nla_validate_parse+0x7f8>
02e8 7bc: 3c e0 00 00 lis r7,0 7be: R_PPC_ADDR16_HA .rodata+0xac
02ec 7c0: 38 e7 00 00 addi r7,r7,0 7c2: R_PPC_ADDR16_LO
.rodata+0xac <== Loading r7 with switch table address
02f0 7c4: 55 29 10 3a slwi r9,r9,2
02f4 7c8: 7d 27 48 2e lwzx r9,r7,r9
02f8 7cc: 7d 29 3a 14 add r9,r9,r7
02fc 7d0: 7d 29 03 a6 mtctr r9
0300 7d4: 4e 80 04 20 bctr <== Switch jump based on register r7
...
04a0 974: 7d 3a b8 ae lbzx r9,r26,r23 <== Arrived here through
a switch
...
04c0 994: 40 82 00 10 bne 9a4 <__nla_validate_parse+0x4d0> <==
Conditional jump to 9a4
04c4 998: 71 48 f0 00 andi. r8,r10,61440
04c8 99c: 40 a2 01 74 bne b10 <__nla_validate_parse+0x63c>
04cc 9a0: 48 00 02 28 b bc8 <__nla_validate_parse+0x6f4> <==
Unconditional jump away
04d0 9a4: 39 29 ff ff addi r9,r9,-1
04d4 9a8: 55 29 06 3e clrlwi r9,r9,24
04d8 9ac: 2b 89 00 12 cmplwi cr7,r9,18
04dc 9b0: 41 bd fc b0 bgt cr7,660 <__nla_validate_parse+0x18c>
04e0 9b4: 55 29 10 3a slwi r9,r9,2
04e4 9b8: 7d 32 48 2e lwzx r9,r18,r9
04e8 9bc: 7d 29 92 14 add r9,r9,r18
04ec 9c0: 7d 29 03 a6 mtctr r9
04f0 9c4: 4e 80 04 20 bctr <== Switch jump based on register r18
...
The third example is rather similar to the first one but with a lot more
switches. It fails on the first switch because it doesn't use the
correct address: it uses the one from r25 instead of the one from r30.
0000 000013c8 <filter_match_preds>:
...
0028 13f0: 3f c0 00 00 lis r30,0 13f2: R_PPC_ADDR16_HA .rodata+0x18
002c 13f4: 3f a0 00 00 lis r29,0 13f6: R_PPC_ADDR16_HA
.rodata+0x108
0030 13f8: 3f 80 00 00 lis r28,0 13fa: R_PPC_ADDR16_HA .rodata+0xf4
0034 13fc: 3f 60 00 00 lis r27,0 13fe: R_PPC_ADDR16_HA .rodata+0xe0
0038 1400: 3f 40 00 00 lis r26,0 1402: R_PPC_ADDR16_HA .rodata+0xcc
003c 1404: 3f 20 00 00 lis r25,0 1406: R_PPC_ADDR16_HA .rodata+0xb8
...
0048 1410: 3b de 00 00 addi r30,r30,0 1412: R_PPC_ADDR16_LO
.rodata+0x18 <== loading r30 with table address
004c 1414: 3b bd 00 00 addi r29,r29,0 1416: R_PPC_ADDR16_LO
.rodata+0x108 <== loading r29 with table address
...
0054 141c: 3b 9c 00 00 addi r28,r28,0 141e: R_PPC_ADDR16_LO
.rodata+0xf4 <== loading r28 with table address
0058 1420: 3b 7b 00 00 addi r27,r27,0 1422: R_PPC_ADDR16_LO
.rodata+0xe0 <== loading r27 with table address
005c 1424: 3b 5a 00 00 addi r26,r26,0 1426: R_PPC_ADDR16_LO
.rodata+0xcc <== loading r26 with table address
0060 1428: 3b 39 00 00 addi r25,r25,0 142a: R_PPC_ADDR16_LO
.rodata+0xb8 <== loading r25 with table address
...
008c 1454: 55 29 10 3a slwi r9,r9,2
0090 1458: 7d 3e 48 2e lwzx r9,r30,r9
0094 145c: 7d 29 f2 14 add r9,r9,r30
0098 1460: 7d 29 03 a6 mtctr r9
009c 1464: 4e 80 04 20 bctr <== Switch based on register r30
...
00dc 14a4: 3d 20 00 00 lis r9,0 14a6: R_PPC_ADDR16_HA .rodata+0x68
00e0 14a8: 39 29 00 00 addi r9,r9,0 14aa: R_PPC_ADDR16_LO
.rodata+0x68 <== loading r9 with table address
00e4 14ac: 55 4a 10 3a slwi r10,r10,2
00e8 14b0: 7d 49 50 2e lwzx r10,r9,r10
00ec 14b4: 81 13 00 08 lwz r8,8(r19)
00f0 14b8: 7d 4a 4a 14 add r10,r10,r9
00f4 14bc: 7d 49 03 a6 mtctr r10
00f8 14c0: 81 33 01 2c lwz r9,300(r19)
00fc 14c4: 4e 80 04 20 bctr <== Switch based on register r9
...
01c8 1590: 3d 20 00 00 lis r9,0 1592: R_PPC_ADDR16_HA .rodata+0x7c
01cc 1594: 39 29 00 00 addi r9,r9,0 1596: R_PPC_ADDR16_LO
.rodata+0x7c <== loading r9 with table address
01d0 1598: 55 4a 10 3a slwi r10,r10,2
01d4 159c: 7d 09 50 2e lwzx r8,r9,r10
01d8 15a0: 81 53 00 08 lwz r10,8(r19)
01dc 15a4: 7d 08 4a 14 add r8,r8,r9
01e0 15a8: 7d 09 03 a6 mtctr r8
01e4 15ac: 81 33 01 2c lwz r9,300(r19)
01e8 15b0: 7d 14 4a 14 add r8,r20,r9
01ec 15b4: 4e 80 04 20 bctr <== Switch based on register r9
...
02e0 16a8: 3d 20 00 00 lis r9,0 16aa: R_PPC_ADDR16_HA .rodata+0x90
02e4 16ac: 39 29 00 00 addi r9,r9,0 16ae: R_PPC_ADDR16_LO
.rodata+0x90 <== loading r9 with table address
02e8 16b0: 55 4a 10 3a slwi r10,r10,2
02ec 16b4: 7d 49 50 2e lwzx r10,r9,r10
02f0 16b8: 81 13 01 2c lwz r8,300(r19)
02f4 16bc: 7d 4a 4a 14 add r10,r10,r9
02f8 16c0: 7d 49 03 a6 mtctr r10
02fc 16c4: 81 33 00 0c lwz r9,12(r19)
0300 16c8: 7c 74 40 2e lwzx r3,r20,r8
0304 16cc: 4e 80 04 20 bctr <== Switch based on register r9
...
0354 171c: 3d 20 00 00 lis r9,0 171e: R_PPC_ADDR16_HA
.rodata+0xa4 <== loading r9 with table address
0358 1720: 39 29 00 00 addi r9,r9,0 1722: R_PPC_ADDR16_LO
.rodata+0xa4
035c 1724: 55 4a 10 3a slwi r10,r10,2
0360 1728: 7d 49 50 2e lwzx r10,r9,r10
0364 172c: 81 13 01 2c lwz r8,300(r19)
0368 1730: 7d 4a 4a 14 add r10,r10,r9
036c 1734: 7d 49 03 a6 mtctr r10
0370 1738: 81 33 00 0c lwz r9,12(r19)
0374 173c: 7c 74 40 2e lwzx r3,r20,r8
0378 1740: 4e 80 04 20 bctr <== Switch based on register r9
...
03c8 1790: 55 29 10 3a slwi r9,r9,2
03cc 1794: 7d 59 48 2e lwzx r10,r25,r9
03d0 1798: a0 73 00 0e lhz r3,14(r19)
03d4 179c: 7d 4a ca 14 add r10,r10,r25
03d8 17a0: 7d 49 03 a6 mtctr r10
03dc 17a4: 81 33 01 2c lwz r9,300(r19)
03e0 17a8: 4e 80 04 20 bctr <== Switch based on register r25
...
0444 180c: 7d 5a 48 2e lwzx r10,r26,r9
0448 1810: 81 33 01 2c lwz r9,300(r19)
044c 1814: 7d 4a d2 14 add r10,r10,r26
0450 1818: 7d 49 03 a6 mtctr r10
0454 181c: 4e 80 04 20 bctr <== Switch based on register r26
...
04b0 1878: 7d 5b 48 2e lwzx r10,r27,r9
04b4 187c: 88 73 00 0f lbz r3,15(r19)
04b8 1880: 7d 4a da 14 add r10,r10,r27
04bc 1884: 7d 49 03 a6 mtctr r10
04c0 1888: 81 33 01 2c lwz r9,300(r19)
04c4 188c: 4e 80 04 20 bctr <== Switch based on register r27
...
0558 1920: 7d 5c 48 2e lwzx r10,r28,r9
055c 1924: 81 33 01 2c lwz r9,300(r19)
0560 1928: 7d 4a e2 14 add r10,r10,r28
0564 192c: 7d 49 03 a6 mtctr r10
0568 1930: 4e 80 04 20 bctr <== Switch based on register r28
...
06ac 1a74: 7d 5d 50 2e lwzx r10,r29,r10
06b0 1a78: 7d 4a ea 14 add r10,r10,r29
06b4 1a7c: 7d 49 03 a6 mtctr r10
06b8 1a80: 4e 80 04 20 bctr <== Switch based on register r29
...
So all ideas to overcome those are very welcome.
Thanks
Christophe
[1]
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=361048&state=*
On Sat, Jun 24, 2023 at 10:06:23AM +0000, Christophe Leroy wrote:
> Hello Josh and Peter,
>
> As mentionned in the cover letter of my series "powerpc/objtool: uaccess
> validation for PPC32 (v3)" [1], a few switch table lookup fail, and it
> would help if you had ideas on how to handle them.
>
> First one is as follows. First switch is properly detected, second is not.
>
> 0000 00003818 <vsnprintf>:
> ...
> 0054 386c: 3f 40 00 00 lis r26,0 386e: R_PPC_ADDR16_HA .rodata+0x6c
> 0058 3870: 3f 20 00 00 lis r25,0 3872: R_PPC_ADDR16_HA .rodata+0x4c
> 005c 3874: 7f be eb 78 mr r30,r29
> 0060 3878: 3b 5a 00 00 addi r26,r26,0 387a: R_PPC_ADDR16_LO
> .rodata+0x6c <== First switch table address loaded in r26 register
> 0064 387c: 3b 39 00 00 addi r25,r25,0 387e: R_PPC_ADDR16_LO
> .rodata+0x4c <== Second switch table address loaded in r25 register
> ...
> 009c 38b4: 41 9d 02 64 bgt cr7,3b18 <vsnprintf+0x300> <==
> Conditional jump to where second switch is
> 00a0 38b8: 55 29 10 3a slwi r9,r9,2
> 00a4 38bc: 7d 39 48 2e lwzx r9,r25,r9
> 00a8 38c0: 7d 29 ca 14 add r9,r9,r25
> 00ac 38c4: 7d 29 03 a6 mtctr r9
> 00b0 38c8: 4e 80 04 20 bctr <== Dynamic switch branch based on r25
> register
> ...
> 0300 3b18: 39 29 ff f8 addi r9,r9,-8
> 0304 3b1c: 55 2a 06 3e clrlwi r10,r9,24
> 0308 3b20: 2b 8a 00 0a cmplwi cr7,r10,10
> 030c 3b24: 89 3f 00 00 lbz r9,0(r31)
> 0310 3b28: 41 9d 01 88 bgt cr7,3cb0 <vsnprintf+0x498>
> 0314 3b2c: 55 4a 10 3a slwi r10,r10,2
> 0318 3b30: 7d 5a 50 2e lwzx r10,r26,r10
> 031c 3b34: 7d 4a d2 14 add r10,r10,r26
> 0320 3b38: 7d 49 03 a6 mtctr r10
> 0324 3b3c: 4e 80 04 20 bctr <== Dynamic switch branch based on r26
> register
> ...
Josh is the one that knows most about the jump table stuff, but I think
he's traveling or something like that atm so he might be a little slow.
Is the problem above that both .rodata references come before the
conditional jump, such that objtool fails to correlate the indirect jump
with its .rodata table?
Looking at mark_func_jump_table(), it only seems to consider
unconditional jumps wrt jump-tables, and the above doesn't match this
pattern.
Worse, the two jump tables are interleaved, which means the only
way to untangle things is to actually track the register state :/
Specifically, if GCC wanted it could flip the r25 and r26 loads and then
objtool wouldn't be able to match any of them I think. Because at that
point the first jump-table would match the r26 jump-table or so (I
think, I've not fully considered the current code).
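To make the interleaving problem concrete, here is a small Python sketch
built from the vsnprintf listing quoted above. It compares a purely
positional lookup (take the last .rodata load before the bctr) with a
lookup keyed on the base register used by the lwzx. Both helpers and their
names are hypothetical and deliberately simplified; this is not objtool's
actual logic, only an illustration of why position alone cannot tell the
two tables apart.

```python
# (address, register, relocation) of each table-address load in vsnprintf
LOADS = [
    (0x3878, "r26", ".rodata+0x6c"),
    (0x387c, "r25", ".rodata+0x4c"),
]

# (address of bctr, base register used by the preceding lwzx)
SWITCHES = [
    (0x38c8, "r25"),
    (0x3b3c, "r26"),
]


def nearest_preceding(loads, bctr_addr):
    """Hypothetical positional heuristic: last table load before the bctr."""
    prior = [load for load in loads if load[0] < bctr_addr]
    if not prior:
        return None
    return max(prior, key=lambda load: load[0])[2]


def by_register(loads, base_reg):
    """Register-aware lookup: the table loaded into the lwzx base register."""
    for _addr, reg, reloc in loads:
        if reg == base_reg:
            return reloc
    return None


if __name__ == "__main__":
    for bctr_addr, base_reg in SWITCHES:
        print(hex(bctr_addr),
              "positional:", nearest_preceding(LOADS, bctr_addr),
              "by register:", by_register(LOADS, base_reg))
```

With this data the positional lookup returns .rodata+0x4c for both
switches, which is only right for the one based on r25, while the
register-keyed lookup separates them.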
Ho-humm... what a tangle.
So for AARGH64 we also had trouble with jump-tables, but LLVM-BOLT
managed to get that working:
https://github.com/llvm/llvm-project/blob/main/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp#L458
perhaps we can glean a clue there, but I don't immediately see the same
patterns there.
I can't seem to come up with anything better than tracking the register
state, and effectively working back from 'ctr' to a .rodata. That's
going to be a bit of effort though...
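A rough Python sketch of what "working back from ctr" could look like, as
an illustration only. The Insn type, its fields and find_jump_table() are
made up for this example and are not objtool structures. It also assumes a
linear backward scan over the listing (no control-flow awareness), the
exact mtctr <- add <- lwzx <- addi shape seen in the dumps above, and that
the relocation on the addi (R_PPC_ADDR16_LO) is enough to identify the
table.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Insn:
    op: str                       # mnemonic, e.g. "lwzx"
    dest: Optional[str] = None    # register written, if any
    srcs: Tuple[str, ...] = ()    # registers read
    reloc: Optional[str] = None   # relocation target, e.g. ".rodata+0x6c"


def find_jump_table(insns: List[Insn], bctr_idx: int) -> Optional[str]:
    """Walk back from a bctr to the relocation that built its table base."""
    want = {"ctr"}  # registers whose defining instruction we are looking for
    stage = 0       # 0: mtctr, 1: add, 2: lwzx, 3: table base definition
    for insn in reversed(insns[:bctr_idx]):
        if insn.dest not in want:
            continue  # unrelated instruction, e.g. an interleaved lwz
        if stage == 0 and insn.op == "mtctr":
            want, stage = {insn.srcs[0]}, 1
        elif stage == 1 and insn.op == "add":
            # target = table entry + table base; which is which is decided
            # by the lwzx found next
            want, stage = set(insn.srcs), 2
        elif stage == 2 and insn.op == "lwzx":
            base = want - {insn.dest}
            if len(base) != 1 or not base <= set(insn.srcs):
                return None
            want, stage = base, 3
        elif stage == 3:
            return insn.reloc  # None if the base did not come from a reloc
        else:
            return None  # the tracked register was defined some other way
    return None


# The two vsnprintf switches from the listing, in address order.
VSNPRINTF = [
    Insn("lis", "r26", (), ".rodata+0x6c"),
    Insn("lis", "r25", (), ".rodata+0x4c"),
    Insn("addi", "r26", ("r26",), ".rodata+0x6c"),
    Insn("addi", "r25", ("r25",), ".rodata+0x4c"),
    Insn("slwi", "r9", ("r9",)),
    Insn("lwzx", "r9", ("r25", "r9")),
    Insn("add", "r9", ("r9", "r25")),
    Insn("mtctr", "ctr", ("r9",)),
    Insn("bctr"),
    Insn("slwi", "r10", ("r10",)),
    Insn("lwzx", "r10", ("r26", "r10")),
    Insn("add", "r10", ("r10", "r26")),
    Insn("mtctr", "ctr", ("r10",)),
    Insn("bctr"),
]

if __name__ == "__main__":
    print(find_jump_table(VSNPRINTF, 8))
    print(find_jump_table(VSNPRINTF, 13))
```

Because the scan follows register definitions rather than positions, it
skips instructions scheduled between mtctr and bctr, such as the lwz loads
in filter_match_preds. A real implementation would still have to follow
the control flow instead of scanning the listing linearly.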