Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751722AbdCNTCs (ORCPT ); Tue, 14 Mar 2017 15:02:48 -0400
Received: from terminus.zytor.com ([65.50.211.136]:45710 "EHLO mail.zytor.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750861AbdCNTCq (ORCPT ); Tue, 14 Mar 2017 15:02:46 -0400
Subject: Re: [PATCH v10 6/7] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID
To: Kyle Huey, "Robert O'Callahan", Thomas Gleixner, Andy Lutomirski,
	Ingo Molnar, x86@kernel.org, Paolo Bonzini, Radim Krčmář,
	Jeff Dike, Richard Weinberger, Alexander Viro, Shuah Khan,
	Dave Hansen, Borislav Petkov, Peter Zijlstra, Boris Ostrovsky,
	Len Brown, "Rafael J. Wysocki", Dmitry Safonov, David Matlack
References: <20161108183956.4521-1-khuey@kylehuey.com>
	<20161108183956.4521-7-khuey@kylehuey.com>
Cc: linux-kernel@vger.kernel.org,
	user-mode-linux-devel@lists.sourceforge.net,
	user-mode-linux-user@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	kvm@vger.kernel.org
From: "H. Peter Anvin"
Message-ID: <30f2ec3e-d0c8-8dd2-837f-3380237d843c@zytor.com>
Date: Tue, 14 Mar 2017 12:01:27 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
	Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <20161108183956.4521-7-khuey@kylehuey.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1514
Lines: 50

On 11/08/16 10:39, Kyle Huey wrote:
>  	}
>
> +	if (test_tsk_thread_flag(prev_p, TIF_NOCPUID) ^
> +	    test_tsk_thread_flag(next_p, TIF_NOCPUID)) {
> +		set_cpuid_faulting(test_tsk_thread_flag(next_p, TIF_NOCPUID));
> +	}
> +
>  	if (test_tsk_thread_flag(prev_p, TIF_NOTSC) ^
>  	    test_tsk_thread_flag(next_p, TIF_NOTSC)) {
>  		/* prev and next are different */
>  		if (test_tsk_thread_flag(next_p, TIF_NOTSC))
>  			hard_disable_TSC();
>  		else
>  			hard_enable_TSC();
>  	}

I'm unhappy about this part: we already do two XORs on these after bit
extraction, which is quite inefficient; and at least theoretically we
could be indirecting through the ->stack pointer for every one if gcc
can't tell it won't have changed (we really need to get thread_info
moved into the task_struct allocation and away from the kernel stack,
especially since on x86 the pointer is the same size as the vestigial
structure it points to.)

It would be so much saner to do one xor and then go onto a common slow
path:

	struct thread_info *prev_ti = task_thread_info(prev_p);
	struct thread_info *next_ti = task_thread_info(next_p);

	tif_flipped = prev_ti->flags ^ next_ti->flags;

	if (unlikely(tif_flipped &
		     (_TIF_BLOCKSTEP | _TIF_NOTSC | _TIF_NOCPUID))) {
		if (tif_flipped & _TIF_BLOCKSTEP) {
			...
		}
		if (tif_flipped & _TIF_NOTSC) {
			...
		}
		if (tif_flipped & _TIF_NOCPUID) {
			...
		}
	}

Then we can also replace test_tsk_thread_flag() with
test_ti_thread_flag() in other places in this function.

	-hpa
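
For readers who want to experiment with the single-XOR idea outside the kernel, here is a minimal userspace sketch. It is not kernel code and not the patch that was eventually merged. Every demo_* and DEMO_* name, the bit positions, and the demo_cpu_state struct (a stand-in for the per-CPU hardware state that the "..." branches above would program) are assumptions made purely for illustration.

```c
#include <assert.h>

/* Hypothetical flag bits; the real _TIF_* values are defined by the kernel. */
#define DEMO_TIF_BLOCKSTEP	(1UL << 0)
#define DEMO_TIF_NOTSC		(1UL << 1)
#define DEMO_TIF_NOCPUID	(1UL << 2)
#define DEMO_TIF_OTHER		(1UL << 3)	/* a flag the switch path ignores */

/* All flags whose change requires work at context-switch time. */
#define DEMO_SLOW_MASK \
	(DEMO_TIF_BLOCKSTEP | DEMO_TIF_NOTSC | DEMO_TIF_NOCPUID)

/* Stand-in for struct thread_info: only the flags word matters here. */
struct demo_thread_info {
	unsigned long flags;
};

/* Stand-in for the CPU state the real code would program via MSRs/CR4. */
struct demo_cpu_state {
	int blockstep;
	int tsc_disabled;
	int cpuid_faulting;
	unsigned int slow_path_hits;
};

/*
 * One XOR over the whole flags word finds every bit that differs between
 * the outgoing and incoming task. A single masked test then decides
 * whether the slow path runs at all.
 */
void demo_switch_flags(struct demo_cpu_state *cpu,
		       const struct demo_thread_info *prev,
		       const struct demo_thread_info *next)
{
	unsigned long tif_flipped;

	assert(cpu && prev && next);

	tif_flipped = prev->flags ^ next->flags;

	/* Fast path: none of the interesting bits changed. */
	if (!(tif_flipped & DEMO_SLOW_MASK))
		return;

	cpu->slow_path_hits++;

	/* Each branch applies the incoming task's setting for that bit. */
	if (tif_flipped & DEMO_TIF_BLOCKSTEP)
		cpu->blockstep = !!(next->flags & DEMO_TIF_BLOCKSTEP);
	if (tif_flipped & DEMO_TIF_NOTSC)
		cpu->tsc_disabled = !!(next->flags & DEMO_TIF_NOTSC);
	if (tif_flipped & DEMO_TIF_NOCPUID)
		cpu->cpuid_faulting = !!(next->flags & DEMO_TIF_NOCPUID);
}
```

In this sketch, switching between two tasks whose flags differ only in DEMO_TIF_OTHER never enters the slow path, which is the property the suggestion above is after: the common case costs one XOR, one AND and one branch, no matter how many flags are being tracked.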