Date: Fri, 20 Jul 2018 21:53:41 +0200 (CEST)
From: Thomas Gleixner
To: Andy Lutomirski
cc: Joerg Roedel, Ingo Molnar, "H. Peter Anvin", X86 ML, LKML, Linux-MM,
    Linus Torvalds, Dave Hansen, Josh Poimboeuf, Juergen Gross,
    Peter Zijlstra, Borislav Petkov, Jiri Kosina, Boris Ostrovsky,
    Brian Gerst, David Laight, Denys Vlasenko, Eduardo Valentin, Greg KH,
    Will Deacon, "Liguori, Anthony", Daniel Gruss, Hugh Dickins, Kees Cook,
    Andrea Arcangeli, Waiman Long, Pavel Machek, "David H. Gutteridge",
    Joerg Roedel, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
    Namhyung Kim
Subject: Re: [PATCH 1/3] perf/core: Make sure the ring-buffer is mapped in all page-tables
References: <1532103744-31902-1-git-send-email-joro@8bytes.org> <1532103744-31902-2-git-send-email-joro@8bytes.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 20 Jul 2018, Thomas Gleixner wrote:
> On Fri, 20 Jul 2018, Andy Lutomirski wrote:
> > On Fri, Jul 20, 2018 at 12:27 PM, Thomas Gleixner wrote:
> > > On Fri, 20 Jul 2018, Andy Lutomirski wrote:
> > >> > On Jul 20, 2018, at 6:22 AM, Joerg Roedel wrote:
> > >> >
> > >> > From: Joerg Roedel
> > >> >
> > >> > The ring-buffer is accessed in the NMI handler, so we better
> > >> > avoid faulting on it. Sync the vmalloc range with all
> > >> > page-tables in system to make sure everyone has it mapped.
> > >> >
> > >> > This fixes a WARN_ON_ONCE() that can be triggered with PTI
> > >> > enabled on x86-32:
> > >> >
> > >> > WARNING: CPU: 4 PID: 0 at arch/x86/mm/fault.c:320 vmalloc_fault+0x220/0x230
> > >> >
> > >> > This triggers because with PTI enabled on an PAE kernel the
> > >> > PMDs are no longer shared between the page-tables, so the
> > >> > vmalloc changes do not propagate automatically.
> > >>
> > >> It seems like it would be much more robust to fix the vmalloc_fault()
> > >> code instead.
> > >
> > > Right, but now the obvious fix for the issue at hand is this. We surely
> > > should revisit this.
> >
> > If you commit this under this reasoning, then please at least make it say:
> >
> > /* XXX: The vmalloc_fault() code is buggy on PTI+PAE systems, and this
> > is a workaround. */
> >
> > Let's not have code in the kernel that pretends to make sense but is
> > actually voodoo magic that works around bugs elsewhere. It's no fun
> > to maintain down the road.
>
> Fair enough. Lemme amend it. Joerg is looking into it, but I surely want to
> get that stuff some exposure in next ASAP.

Delta patch below.

Thanks.

	tglx

8<-------------
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -815,8 +815,12 @@ static void rb_free_work(struct work_str
 	vfree(base);
 	kfree(rb);
 
-	/* Make sure buffer is unmapped in all page-tables */
-	vmalloc_sync_all();
+	/*
+	 * FIXME: PAE workaround for vmalloc_fault(): Make sure buffer is
+	 * unmapped in all page-tables.
+	 */
+	if (IS_ENABLED(CONFIG_X86_PAE))
+		vmalloc_sync_all();
 }
 
 void rb_free(struct ring_buffer *rb)
@@ -844,11 +848,13 @@ struct ring_buffer *rb_alloc(int nr_page
 		goto fail_all_buf;
 
 	/*
-	 * The buffer is accessed in NMI handlers, make sure it is
-	 * mapped in all page-tables in the system so that we don't
-	 * fault on the range in an NMI handler.
+	 * FIXME: PAE workaround for vmalloc_fault(): The buffer is
+	 * accessed in NMI handlers, make sure it is mapped in all
+	 * page-tables in the system so that we don't fault on the range in
+	 * an NMI handler.
 	 */
-	vmalloc_sync_all();
+	if (IS_ENABLED(CONFIG_X86_PAE))
+		vmalloc_sync_all();
 
 	rb->user_page = all_buf;
 	rb->data_pages[0] = all_buf + PAGE_SIZE;
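
For readers following the reasoning behind the workaround, here is a rough
userspace sketch of the problem. It is a toy model, not kernel code: every
name in it (toy_mm, toy_vmalloc_map, toy_sync_all, toy_would_fault) is
hypothetical and only assumes what the changelog says, namely that with PTI
on a PAE kernel each page-table carries its own copy of the kernel PMD
entries, so a new mapping made in one place is not visible elsewhere until
something copies it over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not kernel code. */
#define TOY_NR_MMS     3
#define TOY_NR_ENTRIES 4

struct toy_mm {
	/* Private copy of the kernel entries; 0 means "not mapped". */
	unsigned long kernel_pmd[TOY_NR_ENTRIES];
};

static struct toy_mm toy_init_mm;         /* reference page-table */
static struct toy_mm toy_mms[TOY_NR_MMS]; /* per-process page-tables */

/* Install a mapping in the reference page-table only. */
void toy_vmalloc_map(size_t slot, unsigned long entry)
{
	assert(slot < TOY_NR_ENTRIES);
	toy_init_mm.kernel_pmd[slot] = entry;
}

/* Copy every reference entry into all per-process page-tables. */
void toy_sync_all(void)
{
	for (size_t i = 0; i < TOY_NR_MMS; i++)
		for (size_t s = 0; s < TOY_NR_ENTRIES; s++)
			toy_mms[i].kernel_pmd[s] = toy_init_mm.kernel_pmd[s];
}

/* An access through this page-table faults if its own entry is empty. */
bool toy_would_fault(const struct toy_mm *mm, size_t slot)
{
	assert(slot < TOY_NR_ENTRIES);
	return mm->kernel_pmd[slot] == 0;
}
```

In this model, calling toy_vmalloc_map() and then checking
toy_would_fault() on any entry of toy_mms[] reports a fault until
toy_sync_all() has run. That is the situation the patch avoids for the
ring-buffer by syncing eagerly, since an NMI handler is a bad place to take
that fault.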