Date: Tue, 28 Mar 2017 15:30:40 +0200
From: Michal Hocko
To: Ilya Dryomov
Cc: Greg Kroah-Hartman, "linux-kernel@vger.kernel.org",
    stable@vger.kernel.org, Sergey Jerusalimov, Jeff Layton
Subject: Re: [PATCH 4.4 48/76] libceph: force GFP_NOIO for socket allocations
Message-ID: <20170328133040.GJ18241@dhcp22.suse.cz>
References: <20170328122559.966310440@linuxfoundation.org>
    <20170328122601.905696872@linuxfoundation.org>
    <20170328124312.GE18241@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 28-03-17 15:23:58, Ilya Dryomov wrote:
> On Tue, Mar 28, 2017 at 2:43 PM, Michal Hocko wrote:
> > On Tue 28-03-17 14:30:45, Greg KH wrote:
> >> 4.4-stable review patch. If anyone has any objections, please let me know.
> >
> > I haven't seen the original patch but the changelog makes me worried.
> > How exactly this is a problem? Where do we lockup? Does rbd/libceph take
> > any xfs locks?
>
> No, it doesn't. This is just another instance of "using GFP_KERNEL on
> the writeback path may lead to a deadlock" with nothing extra to it.
>
> XFS is writing out data, libceph messenger worker tries to open
> a socket and recurses back into XFS because the sockfs inode is
> allocated with GFP_KERNEL. The message with some of the data never
> goes out and eventually we get a deadlock.
>
> I've only included the offending stack trace. I guess I should have
> stressed that ceph-msgr workqueue is used for reclaim.

Could you be more specific about the lockup scenario? I still do not get
how this would lead to a deadlock.
-- 
Michal Hocko
SUSE Labs