Date: Mon, 5 Mar 2018 21:48:36 +0100
From: Jan Kara
To: Dexuan Cui
Cc: "linux-fsdevel@vger.kernel.org", Jan Kara, Amir Goldstein, Miklos Szeredi, Haiyang Zhang, "'linux-kernel@vger.kernel.org'", Jork Loeser
Subject: Re: Any known soft lockup issue with vfs_write()->fsnotify()?
Message-ID: <20180305204836.qznlcm6uwurfs2n4@quack2.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi!

On Fri 02-03-18 22:28:50, Dexuan Cui wrote:
> Recently people are getting a soft lock issue with vfs_write()->fsnotify().
> The detailed calltrace is available at:
> https://github.com/coreos/bugs/issues/2356
> https://github.com/coreos/bugs/issues/2364

I haven't seen them yet.

> The kernel versions showing up the issue are:
> 4.14.11-coreos
> 4.14.19-coreos
> 4.13.0-1009 -- this is the kernel with which I'm personally seeing the lockup.
>
> I have not got a chance to try the latest mainline kernel yet.

It would be good to try a 4.15 kernel to see whether the recent fixes from
Miklos fix your problem. They should be present in the 4.14.11/19 kernels as
well, but one never knows...

> Before the lockup error message suddenly appears, Linux has been running
> fine for many hours. I have NOT found a consistent way to reproduce the
> lockup yet.
>
> Looks the kernel is stuck in fsnotify(), when it tries to get the
> fsnotify_mark_srcu lock.

It is not possible that we would 'hang' in srcu_read_lock() - that is just
a read of one variable and an increment of another. We'd have to be looping
somewhere, and the watchdog would have to happen to hit us always at that
place. Weird. Are you sure RIP points to srcu_read_lock?

> "git log fs/notify/fsnotify.c" on the latest mainline shows that some
> recent patches might help.
>
> I'd like to check if this is a known issue.

As I've mentioned above, so far I haven't seen reports like this...

								Honza
-- 
Jan Kara
SUSE Labs, CR