Date: Wed, 6 May 2020 11:17:04 -0400
From: Steven Rostedt
To: Joerg Roedel
Cc: Mathieu Desnoyers, linux-kernel, Ingo Molnar, Thomas Gleixner,
 Peter Zijlstra, Borislav Petkov, Andrew Morton, Shile Zhang,
 Andy Lutomirski, "Rafael J. Wysocki", Dave Hansen, Tzvetomir Stoyanov
Subject: Re: [PATCH] tracing: Call vmalloc_sync_mappings() after alloc_percpu()
Message-ID: <20200506111704.5a97a3e2@gandalf.local.home>
In-Reply-To: <20200505123144.GM8135@suse.de>
References: <20200430141120.GA8135@suse.de>
 <20200430121136.6d7aeb22@gandalf.local.home>
 <20200430191434.GC8135@suse.de>
 <20200430211308.74a994dc@oasis.local.home>
 <1902703609.78863.1588300015661.JavaMail.zimbra@efficios.com>
 <20200430223919.50861011@gandalf.local.home>
 <20200504151236.GI8135@suse.de>
 <20200504134042.178409c3@gandalf.local.home>
 <20200504183832.GL8135@suse.de>
 <20200504151006.69d2a16c@gandalf.local.home>
 <20200505123144.GM8135@suse.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 5 May 2020 14:31:44 +0200
Joerg Roedel wrote:

> On Mon, May 04, 2020 at 03:10:06PM -0400, Steven Rostedt wrote:
> > I'm fine with adding it to the tracing code (with that ridiculous
> > comment! ;-)
> >
> > I'll even tag is as stable, but again, it's uncertain what commit that it
> > "fixes".
>
> Okay, so here it is. It fixes the issue for me and doesn't cause any
> lockdep splats on my machine. I am not sure how far this needs to be
> backported via stable, so I didn't specify it. Same with a Fixes tag.
> Oh, and I added your comment :-) (which makes up most of the added
> lines, so feel free to change authorship to you).

OK, I did take authorship.
At least it will be me who gets the blame ;-)

Here's the patch:

From 685fc0b545f39bf9bfafa0396113c852ef3c81fc Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)"
Date: Wed, 6 May 2020 10:36:18 -0400
Subject: [PATCH] tracing: Add a vmalloc_sync_mappings() for safe measure

x86_64 lazily maps in the vmalloc pages, and the way this works with per_cpu
areas can be complex, to say the least. Mappings may happen at boot up, and
if nothing synchronizes the page tables, those page mappings may not be
synced till they are used. This causes issues for anything that might touch
one of those mappings in the path of the page fault handler. When one of
those unmapped mappings is touched in the page fault handler, it will cause
another page fault, which in turn will cause a page fault, and leave us in
a loop of page faults.

Commit 763802b53a42 ("x86/mm: split vmalloc_sync_all()") split
vmalloc_sync_all() into vmalloc_sync_unmappings() and
vmalloc_sync_mappings(), as on system exit, it did not need to do a full
sync on x86_64 (although it still needed to be done on x86_32). By chance,
the vmalloc_sync_all() would synchronize the page mappings done at boot up
and prevent the per cpu area from being a problem for tracing in the page
fault handler. But when that synchronization in the exit of a task became a
nop, it caused the problem to appear.
Link: https://lore.kernel.org/r/20200429054857.66e8e333@oasis.local.home

Cc: stable@vger.kernel.org
Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code")
Reported-by: "Tzvetomir Stoyanov (VMware)"
Suggested-by: Joerg Roedel
Signed-off-by: Steven Rostedt (VMware)
---
 kernel/trace/trace.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8d2b98812625..9ed6d92768af 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -8525,6 +8525,19 @@ static int allocate_trace_buffers(struct trace_array *tr, int size)
 	 */
 	allocate_snapshot = false;
 #endif
+
+	/*
+	 * Because of some magic with the way alloc_percpu() works on
+	 * x86_64, we need to synchronize the pgd of all the tables,
+	 * otherwise the trace events that happen in x86_64 page fault
+	 * handlers can't cope with accessing the chance that a
+	 * alloc_percpu()'d memory might be touched in the page fault trace
+	 * event. Oh, and we need to audit all other alloc_percpu() and vmalloc()
+	 * calls in tracing, because something might get triggered within a
+	 * page fault trace event!
+	 */
+	vmalloc_sync_mappings();
+
 	return 0;
 }
 
-- 
2.20.1
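
For readers who want the fault loop from the commit message spelled out, here is a small user-space toy model in C. It is not kernel code and makes simplifying assumptions: a "page table" is just an array of booleans, there is one reference table and one task table, and every name in it (toy_pgd, toy_alloc_percpu(), toy_sync_mappings(), toy_page_fault(), toy_run(), TRACE_SLOT, MAX_FAULT_DEPTH) is hypothetical and made up for illustration. The only thing it tries to show is why a trace event that touches unsynced memory at fault-handler entry keeps faulting, and why an eager sync after allocation avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_PGD_SLOTS    4   /* toy: a handful of top-level entries */
#define TRACE_SLOT      1   /* toy: slot backing the per-CPU trace buffer */
#define MAX_FAULT_DEPTH 8   /* toy stand-in for "the machine never recovers" */

/* Toy top-level page table: true means the slot is populated. */
struct toy_pgd {
	bool present[NR_PGD_SLOTS];
};

static struct toy_pgd reference_pgd;	/* models the kernel's master table */
static struct toy_pgd task_pgd;		/* models one task's private copy */

/* Models an allocation that only updates the reference table. */
static void toy_alloc_percpu(int slot)
{
	assert(slot >= 0 && slot < NR_PGD_SLOTS);
	reference_pgd.present[slot] = true;
}

/* Models an eager sync: copy every populated entry to the task table. */
static void toy_sync_mappings(void)
{
	for (int i = 0; i < NR_PGD_SLOTS; i++)
		if (reference_pgd.present[i])
			task_pgd.present[i] = true;
}

/*
 * Models taking a page fault with tracing enabled. The trace event runs
 * at handler entry and touches trace_slot. If that slot is not populated
 * in the task table, the touch itself faults, and the nested handler
 * fires the same trace event again. Returns the nesting depth reached.
 */
static int toy_page_fault(int trace_slot, int depth)
{
	if (depth >= MAX_FAULT_DEPTH)
		return depth;	/* give up: the real system would loop */

	if (!task_pgd.present[trace_slot])
		return toy_page_fault(trace_slot, depth + 1);

	return depth;		/* trace event ran without faulting */
}

/* Run one scenario from a clean state, with or without the eager sync. */
int toy_run(bool sync_first)
{
	memset(&reference_pgd, 0, sizeof(reference_pgd));
	memset(&task_pgd, 0, sizeof(task_pgd));

	toy_alloc_percpu(TRACE_SLOT);
	if (sync_first)
		toy_sync_mappings();

	return toy_page_fault(TRACE_SLOT, 1);
}
```

Calling toy_run(true) models the patched path (allocate, then sync) and stops at a nesting depth of 1. Calling toy_run(false) models the unsynced path and runs into the MAX_FAULT_DEPTH cap, which is the toy's stand-in for the loop of page faults.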