References: <20190828194226.GA29967@swahl-linux> <20190828221048.GB29967@swahl-linux>
In-Reply-To: <20190828221048.GB29967@swahl-linux>
From: Nick Desaulniers
Date: Wed, 28 Aug 2019 15:22:13 -0700
Subject: Re: Purgatory compile flag changes apparently causing Kexec relocation overflows
To: Steve Wahl
Cc: Thomas Gleixner, LKML, russ.anderson@hpe.com, dimitri.sivanich@hpe.com, mike.travis@hpe.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 28, 2019 at 3:14 PM Steve Wahl wrote:
>
> On Wed, Aug 28, 2019 at 02:51:21PM -0700, Nick Desaulniers wrote:
> > On Wed, Aug 28, 2019 at 12:42 PM Steve Wahl wrote:
> > >
> > > Please CC me on responses to this.
> > >
> > > I normally would do more diligence on this, but the timing is such
> > > that I think it's better to get this out sooner.
> > >
> > > With the tip of the tree from https://github.com/torvalds/linux.git (a
> > > few days old, most recent commit fetched is
> > > bb7ba8069de933d69cb45dd0a5806b61033796a3), I'm seeing "kexec: Overflow
> > > in relocation type 11 value 0x11fffd000" when I try to load a crash
> > > kernel with kdump.
> > > This seems to be caused by commit
> > > b059f801a937d164e03b33c1848bb3dca67c0b04, which changed the compiler
> > > flags used to compile purgatory.ro, apparently creating 32 bit
> > > relocations for things that aren't necessarily reachable with a 32 bit
> > > reference. My guess is this only occurs when the crash kernel is
> > > located outside 32-bit addressable physical space.
> > >
> > > I have so far verified that the problem occurs with that commit, and
> > > does not occur with the previous commit. For this commit, Thomas
> > > Gleixner mentioned a few of the changed flags should have been looked
> > > at twice. I have not gone so far as to figure out which flags cause
> > > the problem.
> > >
> > > The hardware in use is a HPE Superdome Flex with 48 * 32GiB dimms
> > > (total 1536 GiB).
> > >
> > > One example of the exact error messages seen:
> > >
> > > 2019-08-28T13:42:39.308110-05:00 uv4test14 kernel: [ 45.137743] kexec: Overflow in relocation type 11 value 0x17f7affd000
> > > 2019-08-28T13:42:39.308123-05:00 uv4test14 kernel: [ 45.137749] kexec-bzImage64: Loading purgatory failed
> >
> > Thanks for the report and sorry for the breakage. Can you please send
> > me more information for how to precisely reproduce the issue? I'm
> > happy to look into fixing it.
>
> Here's the details I know might be important:
>
> Since this appears to be a problem with the result of a relocation not
> fitting within 32 bits, I think the location chosen to place the crash
> kernel needs to be above 4GiB; so you need a machine with more memory
> than that.
>
> At the moment I'm running SLES 12 sp 4 as the rest of the
> environment. rpm says kdump is kdump-0.8.16-9.2.x86_64. I've fetched
> the kernel sources and compiled directly on this system. I believe I
> copied the kernel config from the SLES kernel and did a make
> olddefconfig for configuration. Made and installed the kernel from
> the kernel tree.
>
> crashkernel=512M,high is set on the command line.
>
> As the system boots, and systemd initializes kdump, it tries to load
> the crash kernel, I believe through
> /usr/lib/systemd/system/kdump.service running /lib/kdump/load.sh
> --update.
>
> Once that completes, 'systemctl status kdump' indicates a failure, and
> dmesg | grep kexec shows the error messages mentioned above.
>
> > Let me go dig up the different listed flags. Steve, it may be fastest
> > for you to test re-adding them in your setup to see which one is
> > important.
>
> I will work through that tomorrow and let you know what I find.
>
> > Tglx, if you want to revert the above patches, I'm ok with that. It's
> > important that we fix the issue eventually that my patches were meant
> > to address, but precisely *when* it's solved isn't critical; our
> > kernels can carry out of tree patches for now until the issue is
> > completely resolved worst case.

One point that might be more useful first would be, is a revert of:
commit b059f801a937 ("x86/purgatory: Use CFLAGS_REMOVE rather than
reset KBUILD_CFLAGS")
good enough, or must:
commit 4ce97317f41d ("x86/purgatory: Do not use __builtin_memcpy and
__builtin_memset")
be reverted additionally? They were part of a 2 patch patchset.

I would prefer tglx to revert as few patches as necessary if possible
(to avoid "revert of revert" soup), and I doubt the latter patch needs
to be reverted.
(Even more preferential would be a fix, with no reverts, but whichever).
--
Thanks,
~Nick Desaulniers
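
For reference, relocation type 11 in the quoted error corresponds to
R_X86_64_32S in the x86-64 psABI numbering, which stores the value in a
sign-extended 32-bit field. The sketch below is only an illustration of
that range check applied to the two values from the report; the helper
name fits_in_signed_32 is hypothetical and this is not the kernel's
relocation code.

```python
# Illustrative only: checks whether a relocation value fits in a
# sign-extended 32-bit field, as an R_X86_64_32S-style relocation needs.
# fits_in_signed_32 is a hypothetical helper, not a kernel function.

INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def fits_in_signed_32(value: int) -> bool:
    """Return True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


# The two values quoted in the error messages in this thread.
reported_values = (0x11fffd000, 0x17f7affd000)

for value in reported_values:
    status = "fits" if fits_in_signed_32(value) else "overflows"
    print(f"{value:#x}: {status} a signed 32-bit field")
```

Both reported values lie above 4 GiB, so neither fits, which is
consistent with the guess in the thread that the failure needs the
crash kernel to be placed outside 32-bit addressable physical space.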