From: Andy Lutomirski
Date: Fri, 8 Sep 2017 16:07:53 -0700
Subject: Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf
To: Borislav Petkov, Linus Torvalds
Cc: Markus Trippelsdorf, Andy Lutomirski, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, LKML, Ingo Molnar, Tom Lendacky
In-Reply-To: <20170908215656.qw66lgfsfgpoqrdm@pd.tnic>
References: <20170908080536.ninspvplibd37fj2@pd.tnic>
 <20170908091614.nmdxjnukxowlsjja@pd.tnic>
 <20170908094815.GA278@x4>
 <20170908103513.npjmb2kcjt2zljb2@gmail.com>
 <20170908103906.GB278@x4>
 <20170908113039.GA285@x4>
 <20170908171633.GA279@x4>
 <20170908215656.qw66lgfsfgpoqrdm@pd.tnic>
X-Mailing-List: linux-kernel@vger.kernel.org

[Linus, I added you to get your opinion on whether the last bit here
is a problem.]

On Fri, Sep 8, 2017 at 2:56 PM, Borislav Petkov wrote:
> On Fri, Sep 08, 2017 at 02:47:00PM -0700, Andy Lutomirski wrote:
>> Any chance you could test with CONFIG_DEBUG_VM=y? There are lots of
>> potentially useful assertions in that code.
>>
>> Can you also post your /proc/cpuinfo?
>> And can you re-confirm that a problematic guest kernel is causing
>> problems in the *host*?
>
> Also, have you seen any MCEs during early boot, after the freezes?
>
> You probably wouldn't have because we don't log them on F10h due to
> broken BIOSen. So add "mce=bootlog" to your grub and warm-reset your box
> after one of those freezes and send me dmesg. It should have an MCE in
> there, if it happens what I think it happens.
>

Here's my theory as to what's happening. Before my patch,
flush_tlb_mm_range() guaranteed that the range would be flushed on all
CPUs prior to returning. With the patch, it only promises that the
range will be flushed on each CPU before anyone tries to access it on
that CPU. This has two consequences:

1. A kernel thread that accidentally reads or writes a user address
   could hit a stale TLB entry. This seems harmless in the sense that
   it can only happen if we already have a bug.

2. The CPU itself could see the stale TLB entry and do nefarious,
   architecturally invisible things with it.

I bet that #2 dramatically increases the chance that we hit erratum 383.

I can imagine a case where we have a problem even in the absence of an
erratum. Specifically, suppose we have some page mapped. CPU A writes
to it using write combining (it's mapped WC or an explicit streaming
write is done). CPU B removes the TLB entry and does
flush_tlb_mm_range(). CPU B would expect that all writes to the page
are done, but CPU A's write is still sitting in the streaming buffers.
I *think* this is impossible because CPU A's mm_cpumask manipulations
are atomic and should therefore force out the streaming write buffers,
but maybe there's some other scenario where this matters.

--Andy
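
For readers following the thread, the difference between the two
guarantees described above can be sketched as a toy model. This is an
illustration only, not the kernel's implementation: every name here
(toy_mm, toy_cpu, toy_flush_eager, toy_flush_lazy, toy_switch_in,
toy_stale_cpus) is made up, and the "active" flag is an assumed
stand-in for whatever the real code uses to decide which CPUs must be
flushed immediately.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy model, NOT kernel code. Each CPU records which flush generation
 * its TLB reflects and whether it is actively running this mm. */
struct toy_cpu {
    bool active;            /* currently running user code of this mm */
    unsigned long tlb_gen;  /* generation this CPU's TLB is known to reflect */
};

struct toy_mm {
    unsigned long tlb_gen;  /* bumped once per flush request */
    struct toy_cpu cpu[NR_CPUS];
};

/* Old guarantee: every CPU has been flushed by the time we return. */
void toy_flush_eager(struct toy_mm *mm)
{
    mm->tlb_gen++;
    for (int i = 0; i < NR_CPUS; i++)
        mm->cpu[i].tlb_gen = mm->tlb_gen;
}

/* New guarantee: only CPUs actively using the mm are flushed now;
 * the others keep possibly stale entries until they switch back in. */
void toy_flush_lazy(struct toy_mm *mm)
{
    mm->tlb_gen++;
    for (int i = 0; i < NR_CPUS; i++)
        if (mm->cpu[i].active)
            mm->cpu[i].tlb_gen = mm->tlb_gen;
}

/* An inactive CPU catches up before it may access the mm again. */
void toy_switch_in(struct toy_mm *mm, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    mm->cpu[cpu].tlb_gen = mm->tlb_gen;
    mm->cpu[cpu].active = true;
}

/* Number of CPUs whose TLB may still hold stale entries for this mm. */
int toy_stale_cpus(const struct toy_mm *mm)
{
    int n = 0;
    for (int i = 0; i < NR_CPUS; i++)
        if (mm->cpu[i].tlb_gen != mm->tlb_gen)
            n++;
    return n;
}
```

In this model, toy_flush_eager() always leaves toy_stale_cpus() at
zero, while toy_flush_lazy() can return with inactive CPUs still
behind. Those are the CPUs on which consequence #2 (the hardware
itself touching a stale entry) could occur in the window before
toy_switch_in() runs.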