Date: Tue, 17 Apr 2018 14:12:07 +0200
From: Jan Kara
To: Pavlos Parissis
Cc: Jan Kara, Guillaume Morin, stable@vger.kernel.org, decui@microsoft.com,
 jack@suse.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 mszeredi@redhat.com
Subject: Re: kernel panics with 4.14.X versions
Message-ID: <20180417121207.cs7eijrndovbplgz@quack2.suse.cz>
References: <20180416132550.d25jtdntdvpy55l3@bender.morinfr.org>
 <20180416144041.t2mt7ugzwqr56ka3@quack2.suse.cz>
 <9b11cfba-4bdc-8a3e-cd33-2f7e8d513bdf@gmail.com>
In-Reply-To: <9b11cfba-4bdc-8a3e-cd33-2f7e8d513bdf@gmail.com>
X-Mailing-List:
 linux-kernel@vger.kernel.org

On Tue 17-04-18 01:31:24, Pavlos Parissis wrote:
> On 16/04/2018 04:40 μμ, Jan Kara wrote:
> > How easily can you hit this?
>
> Very easily, I only need to wait 1-2 days for a crash to occur.

I wouldn't call that very easily but opinions may differ :). Anyway it's good
(at least for debugging) that it's reproducible.

> > Are you able to run debug kernels
>
> Well, I was under the impression I do as I have:
> grep -E 'DEBUG_KERNEL|DEBUG_INFO' /boot/config-4.14.32-1.el7.x86_64
> CONFIG_DEBUG_INFO=y
> # CONFIG_DEBUG_INFO_REDUCED is not set
> # CONFIG_DEBUG_INFO_SPLIT is not set
> # CONFIG_DEBUG_INFO_DWARF4 is not set
> CONFIG_DEBUG_KERNEL=y
>
> Do you think that my kernel doesn't produce a proper crash dump?
> I have a production cluster where I can run any kernel we need, so if I need
> to compile again with different settings I can certainly do that.

OK, good. So please try running 4.16 as you mention below to verify whether
this is just a -stable regression or also a problem in the current upstream
kernel. Based on your results with 4.16 I'll prepare a debug patch for you to
apply on top of 4.14.32 so that we can debug this further.

> > / inspect
> > crash dumps when the issue occurs?
>
> I can't do that as the server isn't responsive and I can only power cycle it.

Well, kernel crash dumps work in that situation as well - when the kernel
panics, it will kexec into a new kernel and dump memory of the old kernel to
disk. It can then be investigated with the 'crash' utility. But obviously you
don't have this set up and don't have experience with this so let's go via a
standard 'debug patch' route.

> > Also testing with the latest mainline
> > kernel (4.16) would be welcome whether this isn't just an issue with the
> > backport of fsnotify fixes from Miklos.
>
> I can try the kernel-ml-4.16.2 from elrepo (we use CentOS 7).

Yes, that would be good.

								Honza
-- 
Jan Kara
SUSE Labs, CR