Message-ID: <554B290D.6000005@redhat.com>
Date: Thu, 07 May 2015 10:57:49 +0200
From: Denys Vlasenko
To: "H. Peter Anvin", Ingo Molnar
CC: Steven Rostedt, Borislav Petkov, Andy Lutomirski, Frederic Weisbecker,
    Alexei Starovoitov, Will Drewry, Kees Cook, x86@kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86: Deinline cpuid_eax and friends
References: <1430932057-26574-1-git-send-email-dvlasenk@redhat.com>
    <554A649C.8070605@zytor.com> <554A6706.8010709@redhat.com>
    <554A7C8B.4030007@zytor.com>
In-Reply-To: <554A7C8B.4030007@zytor.com>

On 05/06/2015 10:41 PM, H. Peter Anvin wrote:
> On 05/06/2015 12:09 PM, Denys Vlasenko wrote:
>>>
>>> How on Earth does it make 44 bytes?  Is this due to paravirt_fail?
>>
>> No, just this construct
>>
>>         unsigned int eax, ebx, ecx, edx;
>>         cpuid(op, &eax, &ebx, &ecx, &edx);
>>
>> is not really that cheap to set up.
>> You need to allocate variables on stack and take address of each:
>>
>> ffffffff81063668 :
>> ffffffff81063668:  55                    push   %rbp
>> ffffffff81063669:  48 89 e5              mov    %rsp,%rbp
>> ffffffff8106366c:  48 83 ec 10           sub    $0x10,%rsp
>> ffffffff81063670:  48 8d 4d fc           lea    -0x4(%rbp),%rcx
>> ffffffff81063674:  89 7d f0              mov    %edi,-0x10(%rbp)
>> ffffffff81063677:  48 8d 55 f8           lea    -0x8(%rbp),%rdx
>> ffffffff8106367b:  48 8d 75 f4           lea    -0xc(%rbp),%rsi
>> ffffffff8106367f:  48 8d 7d f0           lea    -0x10(%rbp),%rdi
>> ffffffff81063683:  c7 45 f8 00 00 00 00  movl   $0x0,-0x8(%rbp)
>> ffffffff8106368a:  e8 3c ff ff ff        callq  ffffffff810635cb <__cpuid>
>> ffffffff8106368f:  8b 45 f0              mov    -0x10(%rbp),%eax
>> ffffffff81063692:  c9                    leaveq
>> ffffffff81063693:  c3                    retq
>>
>
> That almost certainly is due to paravirt_fail, because otherwise cpuid
> would be inline, and gcc actually knows how to optimize around the cpuid
> instruction to the point of eliminating the temporaries.

Yes, with HYPERVISOR_GUEST off cpuid_eax() is smaller:

ffffffff81055a66 :
ffffffff81055a66:  55                    push   %rbp
ffffffff81055a67:  89 f8                 mov    %edi,%eax
ffffffff81055a69:  31 c9                 xor    %ecx,%ecx
ffffffff81055a6b:  48 89 e5              mov    %rsp,%rbp
ffffffff81055a6e:  53                    push   %rbx
ffffffff81055a6f:  0f a2                 cpuid
ffffffff81055a71:  5b                    pop    %rbx
ffffffff81055a72:  5d                    pop    %rbp
ffffffff81055a73:  c3                    retq

However, even this inlined version is not small enough for deinlining
to make vmlinux grow:

    text     data      bss       dec     hex filename
81746530 13978160 20066304 115790994 6e6d492 vmlinux.before
81746509 13978160 20066304 115790973 6e6d47d vmlinux

To recap: with this patch, code is smaller both with and without
HYPERVISOR_GUEST. Slowdown per cpuid_REG() call is at worst 4%.