Message-ID: <1505142497.21121.36.camel@redhat.com>
Subject: Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf
From: Rik van Riel
To: Andy Lutomirski
Cc: Borislav Petkov, Linus Torvalds, Markus Trippelsdorf, Ingo Molnar,
 Thomas Gleixner, Peter Zijlstra, LKML, Tom Lendacky
Date: Mon, 11 Sep 2017 11:08:17 -0400
References: <20170909143335.ja2iwjsbeyfxz4ez@pd.tnic> <20170909144350.GA290@x4>
 <20170909163225.GA290@x4> <20170909170537.6xmxtzwripplhhwi@pd.tnic>
 <20170909172352.GA290@x4> <20170909173633.4ttfk7maooxkcwum@pd.tnic>
 <20170909181445.GA281@x4> <20170909182952.itqad4ryngjwrgqf@pd.tnic>
 <20170909190948.xydyega7i2rjnlqt@pd.tnic>
 <1505092341.21121.34.camel@redhat.com>
Organization: Red Hat, Inc
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 2017-09-10 at 18:46 -0700, Andy Lutomirski wrote:
>
> No, nothing stops the problematic speculative load.  Here's the
> issue.
>
> One CPU removes a reference to a page table from a higher-level page
> table, flushes, and then frees the page table.  Then it re-allocates
> it and writes something unrelated there.  Another CPU that has CR3
> pointing to the page hierarchy in question could have a reference to
> the freed table in its paging structure cache.  Even if it's
> guaranteed to not try to access the addresses in question (because
> they're user addresses and the other CPU is in kernel mode, etc), but
> there is never a guarantee that the CPU doesn't randomly try to fill
> its TLB for the affected addresses.  This results in invalid PTEs in
> the TLB, possible accesses using bogus memory types, and maybe even
> reads from IO space.

Good point, I had forgotten all about memory accesses
that do not originate with software behavior.
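
For illustration only, the sequence Andy describes can be walked through
in a toy model. This is not kernel code and does not model real hardware:
the Cpu class, the two-level "page tables" stored in a dict, and the
flush_remote switch are all made up for this sketch. The only point is
that a walk which software never asked for can still go through a stale
paging-structure-cache entry and load whatever now lives in the reused
page.

```python
# Toy model of the stale paging-structure-cache race described above.
# Everything here is hypothetical and heavily simplified.

class Cpu:
    def __init__(self, mem, root):
        self.mem = mem        # shared "physical memory": frame -> list of slots
        self.root = root      # plays the role of CR3
        self.ps_cache = {}    # upper-level index -> frame of the lower table
        self.tlb = {}         # (upper, lower) -> cached PTE value

    def walk(self, upper, lower):
        """Fill the TLB for one address, as a speculative walk might."""
        table = self.ps_cache.get(upper)
        if table is None:
            table = self.mem[self.root][upper]
            if table is None:
                return None               # entry not present, nothing to fill
            self.ps_cache[upper] = table  # remember the intermediate entry
        pte = self.mem[table][lower]
        self.tlb[(upper, lower)] = pte
        return pte

    def flush(self):
        self.ps_cache.clear()
        self.tlb.clear()


def run(flush_remote):
    # Frame 0 is the top-level table, frame 1 the lower-level page table.
    mem = {0: [1, None], 1: ["pte-A", "pte-B"]}
    cpu_a = Cpu(mem, root=0)
    cpu_b = Cpu(mem, root=0)

    cpu_b.walk(0, 0)                # CPU B caches "upper slot 0 -> frame 1"

    mem[0][0] = None                # CPU A removes the reference
    cpu_a.flush()                   # CPU A flushes itself
    if flush_remote:
        cpu_b.flush()               # assumption: CPU B is flushed before reuse
    mem[1] = ["unrelated", "data"]  # frame 1 is freed and reused

    # A fill on CPU B that no software access asked for.
    return cpu_b.walk(0, 1)
```

In this model, run(flush_remote=False) returns "data", i.e. CPU B has
loaded unrelated contents as if they were a PTE, while
run(flush_remote=True) returns None because the walk starts over from
the top-level table and finds the entry gone.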