Date: Tue, 15 Mar 2016 10:25:43 +0100
From: Jan Kara
To: Alan Stern
Cc: Jan Kara, Tejun Heo, Peter Chen, florian@mickler.org,
	linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
	usb-storage@lists.one-eyed-alien.net, Jan Kara, jkosina@suse.cz
Subject: Re: Freezable workqueue blocks non-freezable workqueue during the
	system resume process
Message-ID: <20160315092543.GD17942@quack.suse.cz>
References: <20160314072234.GC5213@quack.suse.cz>

On Mon 14-03-16 10:37:22, Alan Stern wrote:
> On Mon, 14 Mar 2016, Jan Kara wrote:
>
> > On Fri 11-03-16 12:56:10, Tejun Heo wrote:
> > > Hello, Jan.
> > >
> > > On Thu, Mar 03, 2016 at 10:33:10AM +0100, Jan Kara wrote:
> > > > > Ugh... that's nasty. I wonder whether the right thing to do is making
> > > > > writeback workers non-freezable. IOs are supposed to be blocked from
> > > > > lower layer anyway. Jan, what do you think?
> > > >
> > > > Well no, at least currently IO is not blocked in lower layers AFAIK - for
> > > > that you'd need to freeze block devices & filesystems and there are issues
> > >
> > > At least libata does and I think SCSI does too, but yeah, there
> > > probably are drivers which depend on block layer blocking IOs, which
> > > btw is a pretty fragile way to go about as upper layers might not be
> > > the only source of activities.
> > >
> > > > with that (Jiri Kosina was the last one which was trying to make this work
> > > > AFAIR). And I think you need to stop writeback (and generally any IO) to be
> > > > generated so that it doesn't interact in a strange way with device drivers
> > > > being frozen. So IMO until suspend freezes filesystems & devices properly
> > > > you have to freeze writeback workqueue.
>
> What do you mean by "freezes ... devices"? Only a piece of code can be
> frozen -- not a device.

By that I meant block device and filesystem freezing. That way filesystem
is frozen so that it doesn't submit any more IO to the device.

> The kernel does suspend device drivers; that is, it invokes their
> suspend callbacks. But it doesn't "freeze" them in any sense. Once a
> driver has been suspended, it assumes it won't receive any I/O requests
> until it has been resumed. Therefore the kernel first has to prevent
> all the upper layers from generating such requests and/or sending them
> to the low-level drivers.

OK, so Tejun and you should talk together because you both seem to want
something else... If I understand it right, Tejun wants suspended devices
to just queue requests that have been submitted after these devices were
suspended and complete them once they are resumed...

> > > I still think the right thing to do is plugging that block layer or
> > > low level drivers. It's like we're trying to plug multiple sources
> > > when we can plug the point where they come together anyway.
> >
> > I agree that freezing writeback workers is a workaround for real issues at
> > best and ideally we shouldn't have to do that. But at least for now I had
> > the impression that it is needed for suspend to work reasonably reliably.
>
> The design is not to plug low-level drivers, but instead to prevent
> them from receiving any requests by plugging or freezing high-level
> code.
>
> It's pretty clear that we don't want to have ongoing I/O during a
> system suspend, right?
> And that means the I/O has to be prevented (or
> "plugged", if you prefer) somewhere -- either at an upper layer or at a
> lower layer. There was a choice to be made, and the decision was to do
> it at an upper layer.

I agree the IO has to be plugged somewhere. And Tejun seems to want to plug
it at lower layer...

								Honza
--
Jan Kara
SUSE Labs, CR