Subject: Re: Oops caused by race between livepatch and ftrace
To: Johannes Erdfelt
Cc: Josh Poimboeuf, Jessica Yu, Jiri Kosina, Miroslav Benes,
 Steven Rostedt, Ingo Molnar, live-patching@vger.kernel.org,
 linux-kernel@vger.kernel.org
References: <20190520194915.GB1646@sventech.com>
 <90f78070-95ec-ce49-1641-19d061abecf4@redhat.com>
 <20190520210905.GC1646@sventech.com>
From: Joe Lawrence
Message-ID: <1802c0d2-702f-08ec-6a85-c7f887eb6d14@redhat.com>
Date: Mon, 20 May 2019 17:19:59 -0400
In-Reply-To: <20190520210905.GC1646@sventech.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 5/20/19 5:09 PM, Johannes Erdfelt wrote:
> On Mon, May 20, 2019, Joe Lawrence wrote:
>> [ fixed jeyu's email address ]
>
> Thank you, the bounce message made it seem like my mail server was
> blocked and not that the address didn't exist.
>
> I think MAINTAINERS needs an update since it still has the @redhat.com
> address.
>

Here's how it looks on my end:

  % git describe HEAD
  v5.1-12317-ga6a4b66bd8f4

  % grep M:.*jeyu MAINTAINERS
  M: Jessica Yu

>> On 5/20/19 3:49 PM, Johannes Erdfelt wrote:
>>> [ ... snip ... ]
>>>
>>> I have put together a test case that can reproduce the crash using
>>> KVM. The tarball includes a minimal kernel and initramfs, along with
>>> a script to run qemu and the .config used to build the kernel. By
>>> default it will attempt to reproduce by loading multiple livepatches
>>> at the same time. Passing 'test=ftrace' to the script will attempt to
>>> reproduce by racing with ftrace.
>>>
>>> My test setup reproduces the race and oops more reliably by loading
>>> multiple livepatches at the same time than with the ftrace method. It's
>>> not 100% reproducible, so the test case may need to be run multiple
>>> times.
>>>
>>> It can be found here (not attached because of its size):
>>> http://johannes.erdfelt.com/5.2.0-rc1-a188339ca5-livepatch-race.tar.gz
>>
>> Hi Johannes,
>>
>> This is cool way to distribute the repro kernel, modules, etc!
>
> This oops was common in our production environment and was particularly
> annoying since livepatches would load at boot and early enough to happen
> before networking and SSH were started.
>
> Unfortunately it was difficult to reproduce on other hardware (changing
> the timing just enough) and our production environment is very
> complicated.
>
> I spent more time than I'd like to admit trying to reproduce this fairly
> reliably. I knew that I needed to help make it as easy as possible to
> reproduce to root cause it and for others to take a look at it as well.
>

Thanks for building this test image -- it repro'd on the first try for me.

Hmmm, I wonder then how reproducible it would be if we simply extracted
the .ko's and test scripts out of your initramfs and ran them on
arbitrary machines.

I think the rcutorture self-tests use qemu/kvm to fire up test VMs, but
I dunno if the livepatch self-tests are ready for that level of
sophistication yet :)  Will need to think on that a bit.

>> These two testing scenarios might be interesting to add to our selftests
>> suite. Can you post or add the source(s) to livepatch-test.ko to the
>> tarball?
>
> I made the livepatches using kpatch-build and this simple patch:
>
> diff --git a/fs/proc/version.c b/fs/proc/version.c
> index 94901e8e700d..6b8a3449f455 100644
> --- a/fs/proc/version.c
> +++ b/fs/proc/version.c
> @@ -12,6 +12,7 @@ static int version_proc_show(struct seq_file *m, void *v)
>  		utsname()->sysname,
>  		utsname()->release,
>  		utsname()->version);
> +	seq_printf(m, "example livepatch\n");
>  	return 0;
>  }
>
> I just created enough livepatches with the same source patch so that I
> could reproduce the issue somewhat reliably.
>
> I'll see if I can make something that uses klp directly.

Ah ok great, I was hoping it was a relatively simple livepatch. We could
probably reuse lib/livepatch/test_klp_livepatch.c to do this (patching
cmdline_proc_show instead).
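
For the "load multiple livepatches at the same time" scenario, a loader
along these lines might be enough outside of qemu. This is an untested
sketch, not the script from the tarball: the klp_load_all name, the
KLP_DIR and DRY_RUN variables and the one-directory-of-.ko's layout are
all made up for illustration.

```shell
#!/bin/sh
# Sketch only: start one insmod per livepatch module without waiting in
# between, so the module loads overlap and the race window gets wider.
# KLP_DIR and DRY_RUN are hypothetical knobs invented for this example.

klp_load_all() {
	dir=$1
	for ko in "$dir"/*.ko; do
		[ -e "$ko" ] || continue	# glob matched nothing
		if [ "${DRY_RUN:-0}" = 1 ]; then
			echo "insmod $ko"	# only show what would be run
		else
			insmod "$ko" &		# background, so loads run concurrently
		fi
	done
	wait	# reap every background insmod before returning
}

# Only act when a module directory was passed in via the environment.
if [ -n "${KLP_DIR:-}" ]; then
	klp_load_all "$KLP_DIR"
fi
```

Saved under some name of your choosing, it would be run as root with
KLP_DIR pointing at the extracted modules, probably in a loop with the
patches removed in between, since the oops is not 100% reproducible.
Setting DRY_RUN=1 only prints the insmod commands.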
> The rest of the userspace in the initramfs is really straight forward
> with the only interesting parts being a couple of shell scripts.

Yup. I'll be on PTO later this week, but I'll see about extracting the
scripts and building a pile of livepatch .ko's to see how easily it
reproduces without qemu.

Thanks,

-- Joe