Date: Fri, 24 Jul 2020 12:43:56 -0700
From: Ira Weiny
To: Andy Lutomirski
Cc: Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Borislav Petkov,
 Andy Lutomirski, Dave Hansen, x86@kernel.org, Dan Williams,
 Vishal Verma, Andrew Morton, Fenghua Yu, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH RFC V2 17/17] x86/entry: Preserve PKRS MSR across exceptions
Message-ID: <20200724194355.GA844234@iweiny-DESK2.sc.intel.com>
References: <20200724172344.GO844235@iweiny-DESK2.sc.intel.com>

On Fri, Jul 24, 2020 at 10:29:23AM -0700, Andy Lutomirski wrote:
>
> > On Jul 24, 2020, at 10:23 AM, Ira Weiny wrote:
> >
> > On Thu, Jul 23, 2020 at 10:15:17PM +0200, Thomas Gleixner wrote:
> >> Thomas Gleixner writes:
> >>
> >>> Ira Weiny writes:
> >>>> On Fri, Jul 17, 2020 at 12:06:10PM +0200, Peter Zijlstra wrote:
> >>>>>> On Fri, Jul 17, 2020 at 12:20:56AM -0700, ira.weiny@intel.com wrote:
> >>>>> I've been really digging into this today and I'm very concerned that I'm
> >>>>> completely missing something WRT idtentry_enter() and idtentry_exit().
> >>>>>
> >>>>> I've instrumented idt_{save,restore}_pkrs(), and __dev_access_{en,dis}able()
> >>>>> with trace_printk()'s.
> >>>>>
> >>>>> With this debug code, I have found an instance where it seems like
> >>>>> idtentry_enter() is called without a corresponding idtentry_exit(). This has
> >>>>> left the thread ref counter at 0 which results in very bad things happening
> >>>>> when __dev_access_disable() is called and the ref count goes negative.
> >>>>>
> >>>>> Effectively this seems to be happening:
> >>>>>
> >>>>> ...
> >>>>> // ref == 0
> >>>>> dev_access_enable() // ref += 1 ==> disable protection
> >>>>> // exception (which one I don't know)
> >>>>> idtentry_enter()
> >>>>> // ref = 0
> >>>>> _handler() // or whatever code...
> >>>>> // *_exit() not called [at least there is no trace_printk() output]...
> >>>>> // Regardless of trace output, the ref is left at 0
> >>>>> dev_access_disable() // ref -= 1 ==> -1 ==> does not enable protection
> >>>>> (Bad stuff is bound to happen now...)
> >>>
> >>> Well, if any exception which calls idtentry_enter() would return without
> >>> going through idtentry_exit() then lots of bad stuff would happen even
> >>> without your patches.
> >>>
> >>>> Also is there any chance that the process could be getting scheduled and that
> >>>> is causing an issue?
> >>>
> >>> Only from #PF, but after the fault has been resolved and the task is
> >>> scheduled in again then the task returns through idtentry_exit() to the
> >>> place where it took the fault. That's not guaranteed to be on the same
> >>> CPU. If schedule is not aware of the fact that the exception turned off
> >>> stuff then you surely get into trouble. So you really want to store it
> >>> in the task itself then the context switch code can actually see the
> >>> state and act accordingly.
> >>
> >> Actually that's nasty as well as you need a stack of PKRS values to
> >> handle nested exceptions. But it might be still the most reasonable
> >> thing to do. 7 PKRS values plus an index should be really sufficient,
> >> that's 32 bytes total, not that bad.
> >
> > I've thought about this a bit more and unless I'm wrong I think the
> > idtentry_state provides for that because each nested exception has its own
> > idtentry_state, doesn't it?
>
> Only the ones that use idtentry_enter() instead of, say, nmi_enter().

Oh agreed...

But with this patch we are still better off than just preserving during context
switch.

I need to update the commit message here to make this clear though.

Thanks,
Ira
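
For reference, the per-task stack Thomas describes above (7 PKRS values plus an
index) could be modeled roughly as follows. This is only a user-space sketch,
not code from the series: struct pkrs_stack, pkrs_push() and pkrs_pop() are
hypothetical names, and storing each saved value in 32 bits is an assumption
inferred from the 32-byte total he quotes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; a user-space model, not kernel code. */
#define PKRS_STACK_DEPTH 7

struct pkrs_stack {
	uint32_t saved[PKRS_STACK_DEPTH]; /* value saved at each nesting level */
	uint32_t idx;                     /* number of slots currently in use */
};

/* 7 * 4 bytes of saved values + 4 bytes of index = 32 bytes. */
static_assert(sizeof(struct pkrs_stack) == 32,
	      "7 values plus an index should be 32 bytes");

/* Exception entry: remember the current value; -1 if nesting is too deep. */
static inline int pkrs_push(struct pkrs_stack *s, uint32_t cur)
{
	if (s->idx >= PKRS_STACK_DEPTH)
		return -1;
	s->saved[s->idx++] = cur;
	return 0;
}

/* Exception exit: return the value saved by the matching entry. */
static inline int pkrs_pop(struct pkrs_stack *s, uint32_t *out)
{
	if (s->idx == 0)
		return -1; /* exit without a matching entry */
	*out = s->saved[--s->idx];
	return 0;
}
```

Because the model lives in one structure, it could be embedded in the task, so
a nested entry/exit pair restores the value its own entry saved regardless of
which CPU the task resumes on. An unmatched pop is reported as an error instead
of letting a counter go negative.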