Date: Tue, 5 Apr 2016 15:19:07 -0400
From: Jessica Yu
To: Miroslav Benes
Cc: Josh Poimboeuf, Jiri Kosina, Chris J Arges,
    eugene.shatokhin@rosalab.ru, live-patching@vger.kernel.org,
    Linux Kernel Mailing List, pmladek@suse.cz
Subject: Re: Bug with paravirt ops and livepatches
Message-ID: <20160405191907.GC10567@packer-debian-8-amd64.digitalocean.com>
References: <20160329120518.GA21252@canonical.com>
    <20160401190704.GB7837@canonical.com>
    <20160404161428.3qap2i4vpgda66iw@treble.redhat.com>

+++ Miroslav Benes [05/04/16 15:07 +0200]:
>On Mon, 4 Apr 2016, Josh Poimboeuf wrote:
>
>> So I think this doesn't fix the problem.  Dynamic relocations are
>> applied to the "patch module", whereas the above code deals with the
>> initialization order of the "patched module".  This distinction
>> originally confused me as well, until Jessica set me straight.
>>
>> Let me try to illustrate the problem with an example.  Imagine you have
>> a patch module P which applies a patch to module M.  P replaces M's
>> function F with a new function F', which uses paravirt ops.
>>
>> 1) Patch P is loaded before module M.
>>    P's new function F' has an
>>    instruction which is patched by apply_paravirt(), even though the
>>    patch hasn't been applied yet.
>>
>> 2) Module M is loaded.  Before applying the patch, livepatch tries to
>>    apply a klp_reloc to the instruction in F' which was already patched
>>    by apply_paravirt() in step 1.  This results in undefined behavior
>>    because it tries to patch the original instruction but instead
>>    patches the new paravirt instruction.
>>
>> So the above patch makes no difference because the paravirt module
>> loading order doesn't really matter.
>
>Hi,
>
>we are trying really hard to understand the actual culprit here and as it
>is quite confusing I have several questions/comments...

I don't have a 100% clear understanding of the whole picture either, but
I'll try to help clarify some things.

>1. can you provide dynrela sections of the patch module from
>https://github.com/dynup/kpatch/issues/580? What is interesting is that
>kvm_arch_vm_ioctl() function contains rela records only for trivial (==
>exported) symbols from the first look. The problem should be there only if
>you want to patch a function which reference some paravirt_ops unexported
>symbol. For that symbol dynrela should be created.

Just to dispel some confusion over this, kpatch isn't "smart" enough yet
to differentiate between exported and non-exported symbols, as Evgenii
already mentioned. It only distinguishes global from local symbols, and
whether the symbol belongs to a module or vmlinux. So that means dynrelas
are indeed being created for the pv_*_ops symbols, despite the fact they
are exported.

So this is part of the problem: apply_paravirt() runs first (as part of
module_finalize()) and dynrelas are written second, so livepatch is
clobbering/overwriting those paravirt patch sites that apply_paravirt()
had already fixed up.
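To make the ordering concrete, here is a rough userspace sketch of the
clobbering. It is only a toy model and not kernel code: the toy_*
functions, the byte values and the offset are all made up for
illustration, and the real apply_paravirt() and dynrela code are far
more involved. It just shows a relocation that was computed for the
original instruction layout being written over bytes that were already
rewritten.

```c
/*
 * Toy userspace model of the ordering problem; NOT kernel code.
 * All names, byte values and offsets here are illustrative assumptions.
 */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SITE_LEN 7

/* A 7-byte indirect call whose 32-bit address field (bytes 3..6) is
 * still zero because its relocation has not been written yet. */
static const uint8_t unresolved_site[SITE_LEN] = {
    0xff, 0x14, 0x25, 0x00, 0x00, 0x00, 0x00
};

/* Stand-in for apply_paravirt(): rewrite the site as a 5-byte direct
 * call followed by two NOPs. */
static void toy_apply_paravirt(uint8_t *site)
{
    static const uint8_t direct_call[SITE_LEN] = {
        0xe8, 0x11, 0x22, 0x33, 0x44, 0x90, 0x90
    };
    memcpy(site, direct_call, SITE_LEN);
}

/* Stand-in for writing a dynrela: store a 32-bit value at the offset
 * that was recorded for the ORIGINAL instruction layout. */
static void toy_write_dynrela(uint8_t *site, size_t offset, uint32_t value)
{
    assert(offset + sizeof(value) <= SITE_LEN);
    memcpy(site + offset, &value, sizeof(value));
}

/* Run both steps in the problematic order and report whether the
 * paravirt fixup survived. Returns 1 if the site was clobbered. */
static int toy_site_clobbered(void)
{
    uint8_t site[SITE_LEN], expected[SITE_LEN];

    memcpy(site, unresolved_site, SITE_LEN);
    toy_apply_paravirt(site);               /* runs first */
    memcpy(expected, site, SITE_LEN);       /* what paravirt intended */
    toy_write_dynrela(site, 3, 0xdeadbeef); /* runs second */

    return memcmp(site, expected, SITE_LEN) != 0;
}
```

In this model toy_site_clobbered() returns 1: the 32-bit store lands on
the tail of the rewritten call and on both NOPs, so the site is no
longer what the paravirt step intended.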
If you skip generating dynrelas that affect those paravirt patch sites,
I can verify that the crash disappears, since in that case we're not
stepping over those paravirt patch sites with our dynrelas. (Just to test
and verify the problem, I kind of cheated and forced kpatch to not
generate dynrelas for pv_*_ops symbols, and the panics disappear.)
Another temporary workaround was to not apply the dynrela if the target
memory is not all zeros. These approaches aren't reliable enough to serve
as a permanent solution, but they do give us a better idea of what's
happening.

Here are some more relevant comments -
https://github.com/dynup/kpatch/issues/580#issuecomment-199314636
https://github.com/dynup/kpatch/issues/580#issuecomment-202395452

Jessica