Date: Tue, 22 Jun 2021 09:59:37 +0800
From: Oliver Sang
To: Thomas Gleixner
Cc: LKML, Andy Lutomirski, Dave Hansen, "Yu, Fenghua", "Luck, Tony", "Yu, Yu-cheng", Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra, Kan Liang, "Li, Aubrey", "Xing, Zhengjun", "Tang, Feng", "Liu, Yujie", "Si, Beibei", "Li, Philip", "Du, Julie"
Subject: Re:
 [patch V3 00/66] x86/fpu: Spring cleaning and PKRU sanitizing
Message-ID: <20210622015937.GB687@xsang-OptiPlex-9020>
References: <20210618141823.161158090@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20210618141823.161158090@linutronix.de>
User-Agent: Mutt/1.9.4 (2018-02-28)
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Thomas,

On Fri, Jun 18, 2021 at 10:18:23PM +0800, Thomas Gleixner wrote:
> The main parts of this series are:
>
>   - Yet more bug fixes
>
>   - Simplification and removal/replacement of redundant and/or
>     overengineered code.
>
>   - Name space cleanup as the existing names were just a permanent source
>     of confusion.
>
>   - Clear separation of user ABI and kernel internal state handling.
>
>   - Removal of PKRU from being XSTATE managed in the kernel because PKRU
>     has to be eagerly restored on context switch and keeping it in sync
>     in the xstate buffer is just pointless overhead and fragile.
>
>     The kernel still XSAVEs PKRU on context switch but the value in the
>     buffer is no longer used and never restored from the buffer.
>
>     This still needs to be cleaned up, but the series is already 40+
>     patches large and the cleanup of this is not a functional problem.
>
>     The functional issues of PKRU management are fully addressed with the
>     series as is.
>
>   - Cleanup of fpu signal restore
>
>     - Make the fast path self contained. Handle #PF directly and skip
>       the slow path on any other exception as that will just end up
>       with the same result that the frame is invalid. This allows
>       the compiler to optimize the slow path out for 64bit kernels
>       w/o ia32 emulation.
>
>     - Reduce code duplication and unnecessary operations
>
>
> It applies on top of
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
>
> and is also available via git:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/fpu

0-Day kernel CI tested this branch from a performance point of view, using a
selection of sub-tests from will-it-scale (details below). Our reasoning was
that if the branch affects FPU operations, will-it-scale should be able to
catch it.

We also plan to add stress-ng in the next test round. Could you suggest any
other suitable test suites, and which sub-tests in will-it-scale and
stress-ng would be the most appropriate?

Test Summary
============
No obvious will-it-scale performance changes found so far.

Test Environment
================
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git/log/?h=x86/fpu

* 0619677ee36c3 (tglx-devel/x86/fpu) x86/fpu/signal: Let xrstor handle the features to init   <----- the tip we tested
* a114fd9946c28 x86/fpu/signal: Handle #PF in the direct restore path
* 73e26fdd0cf1c x86/fpu: Return proper error codes from user access functions
...
* 63bf804bfa6b0 x86/fpu: Make init_fpstate correct with optimized XSAVE
* 6db8e02d5e932 x86/fpu: x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate()
* 4fe93c2272dbb Merge branch 'x86/fpu' of ../tip into x86/fpu   <----- the base we compared
|\
| * b7c11876d24bd (tip/x86/fpu, peterz-queue/x86/fpu) selftests/x86: Test signal frame XSTATE header corruption handling

64bit kernel testing, on the platform below:
  model: Cascade Lake
  Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz
  nr_node: 2
  nr_cpu: 88
  memory: 128G

32bit kernel testing, on the platform below:
  Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
  model: Ivy Bridge
  nr_node: 1
  nr_cpu: 8
  memory: 16G

We ran the following test suites:
  will-it-scale-performance-context_switch1
  will-it-scale-performance-page_fault1
  will-it-scale-performance-poll1
  will-it-scale-performance-pthread_mutex1
  will-it-scale-performance-writeseek1

>
> This is a follow up to V2 which can be found here:
>
>      https://lore.kernel.org/r/20210614154408.673478623@linutronix.de
>
> Changes vs. V2:
>
>   - Fixed the testing fallout (Dave, Kan)
>
>   - Fixed a few issues found by myself when going through the lot
>     with a fine comb, especially MXCSR handling
>
>   - Drop the FNSAVE optimizations
>
>   - Cleanup of signal restore
>
>   - Addressed review comments, mostly comments and a hopefully better
>     naming scheme which now just uses the instruction names and
>     consolidates everything else on save/restore so it's close to the way
>     how the hardware works.
>
>   - A few cleanups and simplifications on the way (mostly regset related).
>
>   - Picked up tags
>
> With the above I'm not intending to do any further surgery on that
> code at the moment, though there is still room for improvement which
> can and has to be worked on when new bits are added.
>
> Thanks,
>
>         tglx
> ---
>  arch/x86/events/intel/lbr.c          |    6
>  arch/x86/include/asm/fpu/internal.h  |  211 +++-------
>  arch/x86/include/asm/fpu/xstate.h    |   70 ++-
>  arch/x86/include/asm/pgtable.h       |   57 --
>  arch/x86/include/asm/pkeys.h         |    9
>  arch/x86/include/asm/pkru.h          |   62 +++
>  arch/x86/include/asm/processor.h     |    9
>  arch/x86/include/asm/special_insns.h |   14
>  arch/x86/kernel/cpu/common.c         |   34 -
>  arch/x86/kernel/fpu/core.c           |  276 +++++++------
>  arch/x86/kernel/fpu/init.c           |   15
>  arch/x86/kernel/fpu/regset.c         |  220 ++++++-----
>  arch/x86/kernel/fpu/signal.c         |  423 +++++++++------------
>  arch/x86/kernel/fpu/xstate.c         |  693 ++++++++++++++---------------------
>  arch/x86/kernel/process.c            |   22 -
>  arch/x86/kernel/process_64.c         |   28 +
>  arch/x86/kernel/traps.c              |    5
>  arch/x86/kvm/svm/sev.c               |    1
>  arch/x86/kvm/x86.c                   |   56 +-
>  arch/x86/mm/extable.c                |    2
>  arch/x86/mm/fault.c                  |    2
>  arch/x86/mm/pkeys.c                  |   22 -
>  include/linux/pkeys.h                |    4
>  23 files changed, 1060 insertions(+), 1181 deletions(-)
>
>
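
For reference, the base-vs-tip comparison behind the Test Summary above
(base 4fe93c2272dbb, tip 0619677ee36c3) could be sketched roughly as below.
This is only a minimal illustration, assuming each sub-test yields one
throughput figure (ops/sec) per commit. The function names, the dictionary
layout and the 5% threshold are hypothetical and are not taken from 0-Day CI
or will-it-scale.

```python
def percent_change(base, tip):
    """Relative change of the tip result versus the base result, in percent."""
    if base == 0:
        raise ValueError("base throughput must be non-zero")
    return (tip - base) / base * 100.0


def flag_changes(base_results, tip_results, threshold_pct=5.0):
    """Return {sub_test: pct_change} for sub-tests that moved noticeably.

    base_results and tip_results map a sub-test name (for example
    "context_switch1") to an ops/sec figure for the base and tip commit.
    threshold_pct is an arbitrary illustrative cut-off (an assumption),
    applied in both directions so improvements are reported as well.
    """
    flagged = {}
    for test, base in base_results.items():
        if test not in tip_results:
            continue  # sub-test was not run on the tip commit
        change = percent_change(base, tip_results[test])
        if abs(change) > threshold_pct:
            flagged[test] = change
    return flagged
```

An empty result from flag_changes() would correspond to "no obvious
performance changes". In practice the threshold would need to be chosen
relative to the run-to-run noise of each sub-test.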