Received: by 2002:a25:e74b:0:0:0:0:0 with SMTP id e72csp2349924ybh; Fri, 24 Jul 2020 10:32:06 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx9/qJaslQ/ytHavXUVfkFNXmPxcdtkq9Bt5ISHkvOu3H7I0vvtBiblaMJ8Z52RDhx7U4NJ X-Received: by 2002:a17:906:6959:: with SMTP id c25mr9973265ejs.375.1595611926773; Fri, 24 Jul 2020 10:32:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1595611926; cv=none; d=google.com; s=arc-20160816; b=V9Pii/eN0OBQQoO/vk4Ybd1bRQfTNjdVgg9O41oIqLiGM8fXQxQZeom3N3DQbRkh4+ LF7RaJeBWK3BbU+er7XCOlFbRnO6zLCPm7Xho6ujRmpMMgZZFQs/EMFicfKQke66uI+n FayNRLYs80IScSyWsA0Q5MyhBJVWMmy2I76p13b6kNKk7YGp4t23Eu0Njb0SIjAERSMm B/5YjZxa3NpcUCp1Sns3jlZIszh3EdvFjcWGSYwDbDU4EUd5AxZAVFWlaWxSvkh/uq7C S3OYNvbfZ+YBUfrUgcXp81eBIh6iS2mq5X9omEHNf+xYaSvQJaL0Y8jthoDUk0PfvoXP ifew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:to:in-reply-to:cc:references:message-id :date:subject:mime-version:from:content-transfer-encoding :dkim-signature; bh=xVJb6QB7yJwnn1Z6Ovzuc7SiH2/mFZI/eEetibZkjak=; b=cP1r6FPmAl7l+VOS2K6HQ4GckTEy25RYCR+sFJUkt3fpOiyRd94aYYJGE2GYrHOrDF WsFrqIYznb8LT1GMD4blR1Jn/RWqYJsrRcDTwYwkaE7Ark7v5+Ba22TDqe1+xhVuQmgq WR7bVkleYj0r1kS/hZpyyLYK/ks5jpuBtOSMSok5wilvypVnzFvzsrQTpHDwadw2qW81 uv0e2QywzfacAR1iIe8MoIO1npw3a9hy7yjXi0ziKcTlBbJsOhUs1oS7YGoCmFtbY1ZM KUqE+Jv5s1bn83os72RHHWGniEJ0ydNBhywaDfK4HrFdp5tBX8Kwj1xA7mYcIHOayLYo 8wfg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@amacapital-net.20150623.gappssmtp.com header.s=20150623 header.b=oCF2bACf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id i8si869955edx.405.2020.07.24.10.31.43; Fri, 24 Jul 2020 10:32:06 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@amacapital-net.20150623.gappssmtp.com header.s=20150623 header.b=oCF2bACf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726704AbgGXR33 (ORCPT + 99 others); Fri, 24 Jul 2020 13:29:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47092 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726366AbgGXR32 (ORCPT ); Fri, 24 Jul 2020 13:29:28 -0400 Received: from mail-pl1-x643.google.com (mail-pl1-x643.google.com [IPv6:2607:f8b0:4864:20::643]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4855DC0619E4 for ; Fri, 24 Jul 2020 10:29:28 -0700 (PDT) Received: by mail-pl1-x643.google.com with SMTP id t6so4862370plo.3 for ; Fri, 24 Jul 2020 10:29:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amacapital-net.20150623.gappssmtp.com; s=20150623; h=content-transfer-encoding:from:mime-version:subject:date:message-id :references:cc:in-reply-to:to; bh=xVJb6QB7yJwnn1Z6Ovzuc7SiH2/mFZI/eEetibZkjak=; b=oCF2bACf0J3wi10Mkevj6a1lx9ZJ2XYqftXol/Y5C3BUVETOi1ThoRhsL2OxKFCyb4 LGJ3ZE9QuRoT+dYLR9CdhNjE9P+KA1F0J34lD0z4Rjmsdl1uULCFsHQh/kjIrw/3yYIN MIxEx6ObahD1W2wfYbyu5J04MgrODyT6hzuLM3mpaqgWNY1btI1HYMmRRqfrG327H9oM z5+CURUIEaZafgf40maNhBKcXiwP+sIOTIqvFURURI9WNSOzDt9L7YckftzfASDKb67X c6S0yxsINqy7Fs9MCqKwbe5ZznH0YewS4KMFQ8V9Fo2+D3qH4LKoiOb9hjdKp/ooZMns 71UA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:content-transfer-encoding:from:mime-version :subject:date:message-id:references:cc:in-reply-to:to; bh=xVJb6QB7yJwnn1Z6Ovzuc7SiH2/mFZI/eEetibZkjak=; b=q51pGQQ1L6ab44si/yp1oeuM4PJCcd2nEKziDwUnTQqyrF02e+OU13htdf9RD7c9ct QNk7OGEoEnmW5OWB4f9GIEOKbzNZpS1lk+lg9dc6FclRuTUhXg55WBxjy930A8dEUws2 rnnopmUpSV7Aecuy+7NjQHQ3jKN4IUWvxBfb5M2eiOYC/9pru26YT6VsN/VsGfWQZZMJ 0eBmhqnBX6+pLsITSg6rc4QoTSNZAbkCsIBkrcKtMejO68N9D5/s7QiHyw2EcEglOoXP CKwtyQlI/lN23FNWNxtJYPzdMsFs7PN1NEquW0HyWp277puXAHr9e7jr2wpeUeslQH9Y cLRg== X-Gm-Message-State: AOAM5335qrPFhLFzsZ17+dd5eNxU7pLNULxTxWyc8mFFF0+yZp6tGR5d pRb8ETGCTj6tp0a3BuzvjmGGig== X-Received: by 2002:a17:90a:1b2c:: with SMTP id q41mr6495292pjq.195.1595611767681; Fri, 24 Jul 2020 10:29:27 -0700 (PDT) Received: from ?IPv6:2601:646:c200:1ef2:cc1b:c95e:6185:cfc7? ([2601:646:c200:1ef2:cc1b:c95e:6185:cfc7]) by smtp.gmail.com with ESMTPSA id x66sm6882326pgb.12.2020.07.24.10.29.26 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 24 Jul 2020 10:29:27 -0700 (PDT) Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable From: Andy Lutomirski Mime-Version: 1.0 (1.0) Subject: Re: [PATCH RFC V2 17/17] x86/entry: Preserve PKRS MSR across exceptions Date: Fri, 24 Jul 2020 10:29:23 -0700 Message-Id: References: <20200724172344.GO844235@iweiny-DESK2.sc.intel.com> Cc: Thomas Gleixner , Peter Zijlstra , Ingo Molnar , Borislav Petkov , Andy Lutomirski , Dave Hansen , x86@kernel.org, Dan Williams , Vishal Verma , Andrew Morton , Fenghua Yu , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org In-Reply-To: <20200724172344.GO844235@iweiny-DESK2.sc.intel.com> To: Ira Weiny X-Mailer: iPhone Mail (17F80) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > On Jul 24, 2020, at 10:23 AM, Ira Weiny wrote: >=20 > =EF=BB=BFOn Thu, Jul 23, 2020 at 10:15:17PM +0200, Thomas Gleixner wrote: >> Thomas Gleixner writes: >>=20 >>> Ira Weiny writes: >>>> On Fri, Jul 17, 2020 at 12:06:10PM +0200, Peter Zijlstra wrote: >>>>>> On Fri, Jul 17, 2020 at 12:20:56AM -0700, ira.weiny@intel.com wrote: >>>>> I've been really digging into this today and I'm very concerned that I= 'm >>>>> completely missing something WRT idtentry_enter() and idtentry_exit().= >>>>>=20 >>>>> I've instrumented idt_{save,restore}_pkrs(), and __dev_access_{en,dis}= able() >>>>> with trace_printk()'s. >>>>>=20 >>>>> With this debug code, I have found an instance where it seems like >>>>> idtentry_enter() is called without a corresponding idtentry_exit(). T= his has >>>>> left the thread ref counter at 0 which results in very bad things happ= ening >>>>> when __dev_access_disable() is called and the ref count goes negative.= >>>>>=20 >>>>> Effectively this seems to be happening: >>>>>=20 >>>>> ... >>>>> // ref =3D=3D 0 >>>>> dev_access_enable() // ref +=3D 1 =3D=3D> disable protection >>>>> // exception (which one I don't know) >>>>> idtentry_enter() >>>>> // ref =3D 0 >>>>> _handler() // or whatever code... >>>>> // *_exit() not called [at least there is no trace_printk()= output]... >>>>> // Regardless of trace output, the ref is left at 0 >>>>> dev_access_disable() // ref -=3D 1 =3D=3D> -1 =3D=3D> does not enab= le protection >>>>> (Bad stuff is bound to happen now...) >>>=20 >>> Well, if any exception which calls idtentry_enter() would return without= >>> going through idtentry_exit() then lots of bad stuff would happen even >>> without your patches. >>>=20 >>>> Also is there any chance that the process could be getting scheduled an= d that >>>> is causing an issue? >>>=20 >>> Only from #PF, but after the fault has been resolved and the tasks is >>> scheduled in again then the task returns through idtentry_exit() to the >>> place where it took the fault. That's not guaranteed to be on the same >>> CPU. If schedule is not aware of the fact that the exception turned off >>> stuff then you surely get into trouble. So you really want to store it >>> in the task itself then the context switch code can actually see the >>> state and act accordingly. >>=20 >> Actually thats nasty as well as you need a stack of PKRS values to >> handle nested exceptions. But it might be still the most reasonable >> thing to do. 7 PKRS values plus an index should be really sufficient, >> that's 32bytes total, not that bad. >=20 > I've thought about this a bit more and unless I'm wrong I think the > idtentry_state provides for that because each nested exception has it's ow= n > idtentry_state doesn't it? Only the ones that use idtentry_enter() instead of, say, nmi_enter(). >=20 > Ira