Subject: Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf
From: Andy Lutomirski
Date: Fri, 8 Sep 2017 18:39:28 -0700
Cc: Andy Lutomirski, Borislav Petkov, Markus Trippelsdorf, Ingo Molnar,
 Thomas Gleixner, Peter Zijlstra, LKML, Ingo Molnar, Tom Lendacky
Message-Id: <8D582966-08B6-46F2-B12A-BC33F7EF0EB6@amacapital.net>
References: <20170908080536.ninspvplibd37fj2@pd.tnic>
 <20170908091614.nmdxjnukxowlsjja@pd.tnic> <20170908094815.GA278@x4>
 <20170908103513.npjmb2kcjt2zljb2@gmail.com> <20170908103906.GB278@x4>
 <20170908113039.GA285@x4> <20170908171633.GA279@x4>
 <20170908215656.qw66lgfsfgpoqrdm@pd.tnic>
To: Linus Torvalds
X-Mailing-List: linux-kernel@vger.kernel.org

> On Sep 8, 2017, at 6:05 PM, Linus Torvalds wrote:
>
>> On Fri, Sep 8, 2017 at 5:00 PM, Andy Lutomirski wrote:
>>
>> I'm not convinced. The SDM says (Vol 3, 11.3, under WC):
>>
>> If the WC buffer is partially filled, the writes may be delayed until
>> the next occurrence of a serializing event; such as, an SFENCE or
>> MFENCE instruction, CPUID execution, a read or write to uncached
>> memory, an interrupt occurrence, or a LOCK instruction execution.
>>
>> Thanks, Intel, for defining "serializing event" differently here than
>> anywhere else in the whole manual.
>
> Yeah, it's really badly defined. Ok, maybe a locked instruction does
> actually wait for it.. It should be invisible to anything, regardless.
>
>> 1. The kernel wants to reclaim a page of normal memory, so it unmaps
>> it and flushes. Another CPU has an entry for that page in its WC
>> buffer. I don't think we care whether the flush causes the WC write
>> to really hit RAM because it's unobservable -- we just need to make
>> sure it is ordered, as seen by software, before the flush operation
>> completes. From the quote above, I think we're okay here.
>
> Agreed.
>
>> 2. The kernel is unmapping some IO memory (e.g. a GPU command buffer).
>> It wants a guarantee that, when flush_tlb_mm_range returns, all CPUs
>> are really done writing to it. Here I'm less convinced. The SDM
>> quote certainly suggests to me that we have a promise that the WC
>> write has *started* before flush_tlb_mm_range returns, but I'm not
>> sure I believe that it's guaranteed to have retired.
>
> If others have writable TLB entries, what keeps them from just
> continuing to write for a long time afterwards?

Whoever unmaps the resource by kicking out their drm fd? I admit I'm just trying to think of the worst case.

>
>> I'd prefer to leave it as is except on the buggy AMD CPUs, though,
>> since the current code is nice and fast.
>
> So is there a patch to detect the 383 erratum and serialize for those?
> I may have missed that part.
>

The patch is in my head. It's imaginarily attached to this email.

> Linus