From: Thomas Garnier
Subject: Re: x86: PIE support and option to extend KASLR randomization
Date: Tue, 29 Aug 2017 12:34:04 -0700
References: <20170810172615.51965-1-thgarnie@google.com>
 <20170811124127.kkb5pnkljz4umxuj@gmail.com>
 <20170815075609.mmzbfwritjzvrpsn@gmail.com>
 <20170816151235.oamkdva6cwpc4cex@gmail.com>
 <20170817080920.5ljlkktngw2cisfg@gmail.com>
 <20170825080443.tvvr6wzs362cjcuu@gmail.com>
Cc: Herbert Xu, "David S . Miller", Thomas Gleixner, Ingo Molnar,
 "H . Peter Anvin", Peter Zijlstra, Josh Poimboeuf, Arnd Bergmann,
 Matthias Kaehlcke, Boris Ostrovsky, Juergen Gross, Paolo Bonzini,
 Radim Krčmář, Joerg Roedel, Tom Lendacky, Andy Lutomirski,
 Borislav Petkov, Brian Gerst, "Kirill A . Shutemov",
 "Rafael J . Wysocki", Len Brown, Pavel Machek, Tejun Heo, Christoph La
To: Ingo Molnar
List-Id: linux-crypto.vger.kernel.org

On Fri, Aug 25, 2017 at 8:05 AM, Thomas Garnier wrote:
> On Fri, Aug 25, 2017 at 1:04 AM, Ingo Molnar wrote:
>>
>> * Thomas Garnier wrote:
>>
>>> With the fix for function tracing, the hackbench results have an
>>> average of +0.8 to +1.4% (from +8% to +10% before). With a default
>>> configuration, the numbers are closer to 0.8%.
>>>
>>> On the .text size, with gcc 4.9 I see +0.8% on default configuration
>>> and +1.180% on the ubuntu configuration.
>>
>> A 1% text size increase is still significant. Could you look at the disassembly,
>> where does the size increase come from?
>
> I will take a look, in this current iteration I added the .got and
> .got.plt so removing them will remove a big (even if they are small,
> we don't use them to increase perf).
>
> What do you think about the perf numbers in general so far?

I looked at the size increase. I could identify two common cases:

1) PIE sometimes needs two instructions to represent a single
instruction on mcmodel=kernel.

For example, this instruction relies on sign extension (mcmodel=kernel):

mov    r9,QWORD PTR [r11*8-0x7e3da060] (8 bytes)

The address 0xffffffff81c25fa0 can be represented as -0x7e3da060 using
a 32S relocation.

With PIE:

lea    rbx,[rip+] (7 bytes)
mov    r9,QWORD PTR [rbx+r11*8] (6 bytes)

2) GCC does not optimize switches in PIE in order to reduce relocations:

For example the switch in phy_modes [1]:

static inline const char *phy_modes(phy_interface_t interface)
{
        switch (interface) {
        case PHY_INTERFACE_MODE_NA:
                return "";
        case PHY_INTERFACE_MODE_INTERNAL:
                return "internal";
        case PHY_INTERFACE_MODE_MII:
                return "mii";

Without PIE (gcc 7.2.0), the whole table is optimized to be one instruction:

   0x000000000040045b <+27>:    mov    rdi,QWORD PTR [rax*8+0x400660]

With PIE (gcc 7.2.0):

   0x0000000000000641 <+33>:    movsxd rax,DWORD PTR [rdx+rax*4]
   0x0000000000000645 <+37>:    add    rax,rdx
   0x0000000000000648 <+40>:    jmp    rax
   ....
   0x000000000000065d <+61>:    lea    rdi,[rip+0x264]        # 0x8c8
   0x0000000000000664 <+68>:    jmp    0x651
   0x0000000000000666 <+70>:    lea    rdi,[rip+0x2bc]        # 0x929
   0x000000000000066d <+77>:    jmp    0x651
   0x000000000000066f <+79>:    lea    rdi,[rip+0x2a8]        # 0x91e
   0x0000000000000676 <+86>:    jmp    0x651
   0x0000000000000678 <+88>:    lea    rdi,[rip+0x294]        # 0x913
   0x000000000000067f <+95>:    jmp    0x651

That's a deliberate choice; clang is able to optimize it (clang-3.8):

   0x0000000000000963 <+19>:    lea    rcx,[rip+0x200406]        # 0x200d70
   0x000000000000096a <+26>:    mov    rdi,QWORD PTR [rcx+rax*8]

I checked gcc, and the code deciding whether to fold the switch
basically does not do it for pic, to reduce relocations [2].

The switches are the biggest increase on small functions but I don't
think they represent a large portion of the difference (case 1 does).

A side note: while testing gcc 7.2.0 on hackbench I have seen the PIE
kernel being faster by 1% across multiple runs (comparing 50 runs done
across 5 reboots, twice). I don't think PIE is faster than
mcmodel=kernel, but recent versions of gcc make them fairly similar.

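To make the two cases above concrete, here is a minimal C sketch. It is
not kernel code: demo_mode, the MODE_* values, the "unknown" fallback
and every function name are hypothetical stand-ins chosen for
illustration. The first half checks the sign-extension arithmetic
behind the 32S relocation from case 1. The second half shows the
hand-written table form that the non-PIE gcc output in case 2 is
roughly equivalent to.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Case 1: a 32-bit displacement is sign-extended to 64 bits by the CPU. */
static inline uint64_t sign_extend_disp32(int32_t disp)
{
    return (uint64_t)(int64_t)disp;
}

/*
 * True if addr survives a round trip through a signed 32-bit value, i.e.
 * it could be encoded as a sign-extended 32-bit immediate/displacement.
 * The narrowing cast is implementation-defined in ISO C; gcc and clang
 * truncate modulo 2^32, which is what this sketch assumes.
 */
static inline int fits_in_disp32(uint64_t addr)
{
    return (uint64_t)(int64_t)(int32_t)addr == addr;
}

/* Case 2: hypothetical stand-in for phy_interface_t (three values only). */
typedef enum {
    MODE_NA,
    MODE_INTERNAL,
    MODE_MII,
    MODE_COUNT
} demo_mode;

/* Switch form, mirroring the shape of the quoted phy_modes() snippet. */
static inline const char *mode_name_switch(demo_mode m)
{
    switch (m) {
    case MODE_NA:
        return "";
    case MODE_INTERNAL:
        return "internal";
    case MODE_MII:
        return "mii";
    default:
        return "unknown"; /* assumed fallback for this demo */
    }
}

/*
 * Table form: one indexed load instead of a jump table plus per-case
 * code. Each entry is an absolute pointer, so a position-independent
 * image needs one relocation per entry.
 */
static const char *const mode_names[MODE_COUNT] = {
    [MODE_NA]       = "",
    [MODE_INTERNAL] = "internal",
    [MODE_MII]      = "mii",
};

static inline const char *mode_name_table(demo_mode m)
{
    return ((unsigned)m < (unsigned)MODE_COUNT) ? mode_names[m] : "unknown";
}

/* Small self-check tying both halves back to the numbers in the mail. */
static inline void demo_self_check(void)
{
    assert(sign_extend_disp32(-0x7e3da060) == 0xffffffff81c25fa0ULL);
    assert(fits_in_disp32(0xffffffff81c25fa0ULL));
    assert(strcmp(mode_name_switch(MODE_MII), mode_name_table(MODE_MII)) == 0);
}
```

Calling demo_self_check() from any test harness exercises both halves.
The per-entry relocation noted in the table comment is presumably the
cost gcc is avoiding when it declines to fold the switch for pic.
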
[1] http://elixir.free-electrons.com/linux/v4.13-rc7/source/include/linux/phy.h#L113
[2] https://github.com/gcc-mirror/gcc/blob/7977b0509f07e42fbe0f06efcdead2b7e4a5135f/gcc/tree-switch-conversion.c#L828

>
>>
>> Thanks,
>>
>> Ingo
>
>
>
> --
> Thomas

--
Thomas