Date: Mon, 20 May 2019 16:19:31 -0500
From: Josh Poimboeuf
To: Johannes Erdfelt
Cc: Joe Lawrence, Jessica Yu, Jiri Kosina, Miroslav Benes, Steven Rostedt, Ingo Molnar, live-patching@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Oops caused by race between livepatch and ftrace
Message-ID: <20190520211931.vokbqxkx5kb6k2bz@treble>
References: <20190520194915.GB1646@sventech.com> <90f78070-95ec-ce49-1641-19d061abecf4@redhat.com> <20190520210905.GC1646@sventech.com>
In-Reply-To: <20190520210905.GC1646@sventech.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 20, 2019 at 02:09:05PM -0700, Johannes Erdfelt wrote:
> On Mon, May 20, 2019, Joe Lawrence wrote:
> > [ fixed jeyu's email address ]
>
> Thank you, the bounce message made it seem like my mail server was
> blocked and not that the address didn't exist.
>
> I think MAINTAINERS needs an update since it still has the @redhat.com
> address.

I think you must have been looking at an old version.

[(v5.2-rc1)] ~/git/linux $ grep jeyu MAINTAINERS
M:	Jessica Yu

> > On 5/20/19 3:49 PM, Johannes Erdfelt wrote:
> > > [ ... snip ... ]
> > >
> > > I have put together a test case that can reproduce the crash using
> > > KVM. The tarball includes a minimal kernel and initramfs, along with
> > > a script to run qemu and the .config used to build the kernel. By
> > > default it will attempt to reproduce by loading multiple livepatches
> > > at the same time. Passing 'test=ftrace' to the script will attempt to
> > > reproduce by racing with ftrace.
> > >
> > > My test setup reproduces the race and oops more reliably by loading
> > > multiple livepatches at the same time than with the ftrace method. It's
> > > not 100% reproducible, so the test case may need to be run multiple
> > > times.
> > >
> > > It can be found here (not attached because of its size):
> > > http://johannes.erdfelt.com/5.2.0-rc1-a188339ca5-livepatch-race.tar.gz
> >
> > Hi Johannes,
> >
> > This is cool way to distribute the repro kernel, modules, etc!
>
> This oops was common in our production environment and was particularly
> annoying since livepatches would load at boot and early enough to happen
> before networking and SSH were started.
>
> Unfortunately it was difficult to reproduce on other hardware (changing
> the timing just enough) and our production environment is very
> complicated.
>
> I spent more time than I'd like to admit trying to reproduce this fairly
> reliably. I knew that I needed to help make it as easy as possible to
> reproduce to root cause it and for others to take a look at it as well.

Can you try this patch (completely untested)?

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 91cd519756d3..2d17e6e364b5 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "core.h"
 #include "patch.h"
@@ -730,16 +731,21 @@ static int klp_init_object_loaded(struct klp_patch *patch,
 	struct klp_func *func;
 	int ret;
 
+	mutex_lock(&text_mutex);
+
 	module_disable_ro(patch->mod);
 	ret = klp_write_object_relocations(patch->mod, obj);
 	if (ret) {
 		module_enable_ro(patch->mod, true);
+		mutex_unlock(&text_mutex);
 		return ret;
 	}
 
 	arch_klp_init_object_loaded(patch, obj);
 	module_enable_ro(patch->mod, true);
 
+	mutex_unlock(&text_mutex);
+
 	klp_for_each_func(obj, func) {
 		ret = klp_find_object_symbol(obj->name, func->old_name,
 					     func->old_sympos,
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a12aff849c04..8259d4ba8b00 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -2610,10 +2611,12 @@ static void ftrace_run_update_code(int command)
 {
 	int ret;
 
+	mutex_lock(&text_mutex);
+
 	ret = ftrace_arch_code_modify_prepare();
 	FTRACE_WARN_ON(ret);
 	if (ret)
-		return;
+		goto out_unlock;
 
 	/*
 	 * By default we use stop_machine() to modify the code.
@@ -2625,6 +2628,9 @@ static void ftrace_run_update_code(int command)
 
 	ret = ftrace_arch_code_modify_post_process();
 	FTRACE_WARN_ON(ret);
+
+out_unlock:
+	mutex_unlock(&text_mutex);
 }
 
 static void ftrace_run_modify_code(struct ftrace_ops *ops, int command,
@@ -5776,6 +5782,7 @@ void ftrace_module_enable(struct module *mod)
 	struct ftrace_page *pg;
 
 	mutex_lock(&ftrace_lock);
+	mutex_lock(&text_mutex);
 
 	if (ftrace_disabled)
 		goto out_unlock;
@@ -5837,6 +5844,7 @@ void ftrace_module_enable(struct module *mod)
 	ftrace_arch_code_modify_post_process();
 
  out_unlock:
+	mutex_unlock(&text_mutex);
 	mutex_unlock(&ftrace_lock);
 
 	process_cached_mods(mod->name);
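
For readers following the thread, the locking idea in the patch above can be modelled roughly in userspace. This is only a sketch under stated assumptions, not kernel code: `text_lock`, `text_writable`, `faults`, `patch_text_once()` and `run_demo()` are hypothetical names invented for illustration. The flag stands in for the read-only state that module_disable_ro()/module_enable_ro() toggle, and the mutex plays the role text_mutex has in the patch. Each thread opens the write window, "writes", and closes the window again, all under one lock, so no other thread can close the window in the middle.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
static pthread_mutex_t text_lock = PTHREAD_MUTEX_INITIALIZER;
static bool text_writable;	/* models the RO/RW state of module text */
static long faults;		/* "writes" attempted while text was read-only */

/* One patching pass: open the write window, write, close the window. */
static void patch_text_once(void)
{
	pthread_mutex_lock(&text_lock);

	text_writable = true;		/* stands in for module_disable_ro() */
	if (!text_writable)		/* the "write"; would fault if text were RO */
		faults++;
	text_writable = false;		/* stands in for module_enable_ro() */

	pthread_mutex_unlock(&text_lock);
}

/* Thread body: repeat the patching pass the requested number of times. */
static void *worker(void *arg)
{
	long n = *(long *)arg;

	for (long i = 0; i < n; i++)
		patch_text_once();
	return NULL;
}

/*
 * Run two concurrent "patchers" and return the number of faulted writes,
 * or -1 if a thread could not be created.
 */
long run_demo(long iterations)
{
	pthread_t a, b;

	faults = 0;
	if (pthread_create(&a, NULL, worker, &iterations))
		return -1;
	if (pthread_create(&b, NULL, worker, &iterations)) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return faults;
}
```

Because the whole open/write/close sequence sits inside one critical section, run_demo() should report zero faulted writes however the two threads interleave. Dropping the lock in this model would let one thread's "enable RO" step land between another thread's "disable RO" and its write, which is the kind of window the patch tries to close.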