Date: Fri, 24 May 2019 13:43:29 +0200
From: Oleksandr Natalenko
To: Mel Gorman
Cc: Justin Piszcz, LKML
Subject: Re: 5.1 and 5.1.1: BUG: unable to handle kernel paging request at ffffea0002030000
Message-ID: <20190524114329.hujd3qvtusz6uyfk@butterfly.localdomain>
References: <20190520115608.GK18914@techsingularity.net> <20190521124310.GM18914@techsingularity.net>
In-Reply-To: <20190521124310.GM18914@techsingularity.net>

On Tue, May 21, 2019 at 01:43:10PM +0100, Mel Gorman wrote:
> On Tue, May 21, 2019 at 05:01:06AM -0400, Justin Piszcz wrote:
> > On Mon, May 20, 2019 at 7:56 AM Mel Gorman wrote:
> > >
> > > On Sun, May 12, 2019 at 04:27:45AM -0400, Justin Piszcz wrote:
> > > > Hello,
> > > >
> > > > I've turned off zram/zswap and I am still seeing the following during
> > > > periods of heavy I/O, I am returning to 5.0.xx in the meantime.
> > > >
> > > > Kernel: 5.1.1
> > > > Arch: x86_64
> > > > Dist: Debian x86_64
> > > >
> > > > [29967.019411] BUG: unable to handle kernel paging request at ffffea0002030000
> > > > [29967.019414] #PF error: [normal kernel read fault]
> > > > [29967.019415] PGD 103ffee067 P4D 103ffee067 PUD 103ffed067 PMD 0
> > > > [29967.019417] Oops: 0000 [#1] SMP PTI
> > > > [29967.019419] CPU: 10 PID: 77 Comm: khugepaged Tainted: G
> > > > T 5.1.1 #4
> > > > [29967.019420] Hardware name: Supermicro X9SRL-F/X9SRL-F, BIOS 3.2 01/16/2015
> > > > [29967.019424] RIP: 0010:isolate_freepages_block+0xb9/0x310
> > > > [29967.019425] Code: 24 28 48 c1 e0 06 40 f6 c5 1f 48 89 44 24 20 49
> > > > 8d 45 79 48 89 44 24 18 44 89 f0 4d 89 ee 45 89 fd 41 89 c7 0f 84 ef
> > > > 00 00 00 <48> 8b 03 41 83 c4 01 a9 00 00 01 00 75 0c 48 8b 43 08 a8 01
> > > > 0f 84
> > >
> > > If you have debugging symbols installed, can you translate the faulting
> > > address with the following?
> > >
> > > ADDR=`nm /path/to/vmlinux-or-debuginfo-file | grep "t isolate_freepages_block\$" | awk '{print $1}'`
> > > addr2line -i -e vmlinux `printf "0x%lX" $((0x$ADDR+0xb9))`
> >
> > Another event this morning, this occurred when copying a single ~25GB
> > backup file from one block device device (3ware HW RAID) to a SW
> > RAID-1 (mdadm):
> >
> > With this event, it was a fault and khugepaged is not stuck at 100%
> > but this may be related as the stack trace is similar where
> > compaction_alloc is utilizing most of the CPU:
> > https://lkml.org/lkml/2019/5/9/225
> >
> > # ADDR=`nm /usr/src/linux/vmlinux | grep "t isolate_freepages_block\$"
> > | awk '{print $1}'`
> > # echo $ADDR
> > ffffffff812274f0
> > # addr2line -i -e /usr/src/linux/vmlinux `printf "0x%lX" $((0x$ADDR+0x83d))`
> > compaction.c:?
> > # addr2line -i -e /usr/src/linux/vmlinux `printf "0x%lX" $((0x$ADDR+0x8d0))`
> > compaction.c:?
> >
>
> Please use the offset 0xb9
>
> addr2line -i -e /usr/src/linux/vmlinux `printf "0x%lX" $((0x$ADDR+0xb9))`
>
> -- 
> Mel Gorman
> SUSE Labs

Cc'ing myself since I sometimes observe this behaviour right after a KVM VM
is launched. No luck with reproducing it on demand so far, though.

-- 
  Best regards,
    Oleksandr Natalenko (post-factum)
    Senior Software Maintenance Engineer