From: Juergen Gross
To: linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org, x86@kernel.org
Cc: boris.ostrovsky@oracle.com, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, Juergen Gross
Subject: [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
Date: Tue, 21 Aug 2018 17:37:55 +0200
Message-Id: <20180821153755.30462-3-jgross@suse.com>
X-Mailer: git-send-email 2.13.7
In-Reply-To: <20180821153755.30462-1-jgross@suse.com>
References: <20180821153755.30462-1-jgross@suse.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Using only 32-bit writes for the pte will result in an intermediate
L1TF-vulnerable PTE. When running as a Xen PV guest this will at once
switch the guest to shadow mode, resulting in a loss of performance.

Use arch_atomic64_xchg() instead, which will perform the requested
operation atomically with all 64 bits.

Some performance considerations according to:

https://software.intel.com/sites/default/files/managed/ad/dc/Intel-Xeon-Scalable-Processor-throughput-latency.pdf

The main number should be the latency, as there is no tight loop around
native_ptep_get_and_clear().

"lock cmpxchg8b" has a latency of 20 cycles, while "lock xchg" (with a
memory operand) isn't mentioned in that document. "lock xadd" (with xadd
having 3 cycles less latency than xchg) has a latency of 11, so we can
assume a latency of 14 for "lock xchg".

Signed-off-by: Juergen Gross
---
In case adding about 6 cycles (20 for "lock cmpxchg8b" versus the assumed
14 for "lock xchg") for native_ptep_get_and_clear() is believed to be too
bad I can modify the patch to add a paravirt function for that purpose in
order to add the overhead for Xen guests only (in fact the overhead for
Xen guests will be less, as only one instruction writing to the PTE has
to be emulated by the hypervisor).
---
 arch/x86/include/asm/pgtable-3level.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
index a564084c6141..f8b1ad2c3828 100644
--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_X86_PGTABLE_3LEVEL_H
 #define _ASM_X86_PGTABLE_3LEVEL_H
 
+#include
+
 /*
  * Intel Physical Address Extension (PAE) Mode - three-level page
  * tables on PPro+ CPUs.
@@ -150,10 +152,7 @@ static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
 {
 	pte_t res;
 
-	/* xchg acts as a barrier before the setting of the high bits */
-	res.pte_low = xchg(&ptep->pte_low, 0);
-	res.pte_high = ptep->pte_high;
-	ptep->pte_high = 0;
+	res.pte = (pteval_t)arch_atomic64_xchg((atomic64_t *)ptep, 0);
 
 	return res;
 }
-- 
2.13.7
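
For readers who want to see the intermediate state the commit message
refers to, here is a minimal userspace sketch in C11. It is not kernel
code: struct split_pte, PTE_PRESENT, and all three function names are
hypothetical stand-ins, and "not present but still carrying address bits"
is used as a simplified proxy for the L1TF-vulnerable condition. The first
helper returns the value visible between the two 32-bit steps of a split
clear; the last one shows a single 64-bit exchange, which has no such
intermediate value.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: bit 0 of the low word marks the entry present. */
#define PTE_PRESENT 0x1u

/* Split view of a 64-bit PAE entry, as two 32-bit halves. */
struct split_pte {
	uint32_t low;	/* flags (including present) and low address bits */
	uint32_t high;	/* upper address bits */
};

/*
 * Value an observer could see between the two 32-bit steps of a split
 * clear: the low half is already zero, the high half is still populated.
 */
static uint64_t split_clear_intermediate(struct split_pte pte)
{
	pte.low = 0;	/* step 1: clear the low half */
	/* step 2 (clearing the high half) has not happened yet */
	return ((uint64_t)pte.high << 32) | pte.low;
}

/* Simplified check: entry is not present but still holds address bits. */
static bool not_present_with_address(uint64_t val)
{
	return !(val & PTE_PRESENT) && val != 0;
}

/* One 64-bit exchange: observers see either the old value or zero. */
static uint64_t clear_atomic(_Atomic uint64_t *pte)
{
	return atomic_exchange(pte, 0);
}
```

With an entry of 0x0000000100001067, split_clear_intermediate() yields
0x0000000100000000, which not_present_with_address() flags, while
clear_atomic() returns the old value and leaves the entry at zero in one
step. On a 32-bit target the compiler may need libatomic or a native
8-byte compare-and-exchange to implement the 64-bit atomic_exchange().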