2016-03-02 16:01:08

by Tejun Heo

[permalink] [raw]
Subject: Re: Freezable workqueue blocks non-freezable workqueue during the system resume process

Hello,

(cc'ing Jan)

On Fri, Feb 26, 2016 at 02:19:20PM +0800, Peter Chen wrote:
> On Thu, Feb 25, 2016 at 05:01:12PM -0500, Tejun Heo wrote:
> > Hello, Peter.
> >
> > On Wed, Feb 24, 2016 at 03:24:30PM +0800, Peter Chen wrote:
> > > > You might want to complain to the block-layer people about this. I
> > > > don't know if anything can be done to fix it.
> > > >
> > > > Or maybe flush_work and flush_delayed_work can be changed to avoid
> > > > blocking if the workqueue is frozen. Tejun?
> > > >
> > >
> > > I have a patch to show the root cause of this issue.
> > >
> > > http://www.spinics.net/lists/linux-usb/msg136815.html
> >
> > I don't get it. Why would it deadlock? Shouldn't things get rolling
> > once the workqueues are thawed?
>
> The workqueue writeback can't be thawed due to driver's resume
> (dpm_complete) is lock nested, and can't be finished.

Ugh... that's nasty. I wonder whether the right thing to do is making
writeback workers non-freezable. IOs are supposed to be blocked from
lower layer anyway. Jan, what do you think?

Thanks.

--
tejun


2016-03-03 09:32:54

by Jan Kara

[permalink] [raw]
Subject: Re: Freezable workqueue blocks non-freezable workqueue during the system resume process

Hello,

On Wed 02-03-16 11:00:58, Tejun Heo wrote:
> On Fri, Feb 26, 2016 at 02:19:20PM +0800, Peter Chen wrote:
> > On Thu, Feb 25, 2016 at 05:01:12PM -0500, Tejun Heo wrote:
> > > Hello, Peter.
> > >
> > > On Wed, Feb 24, 2016 at 03:24:30PM +0800, Peter Chen wrote:
> > > > > You might want to complain to the block-layer people about this. I
> > > > > don't know if anything can be done to fix it.
> > > > >
> > > > > Or maybe flush_work and flush_delayed_work can be changed to avoid
> > > > > blocking if the workqueue is frozen. Tejun?
> > > > >
> > > >
> > > > I have a patch to show the root cause of this issue.
> > > >
> > > > http://www.spinics.net/lists/linux-usb/msg136815.html
> > >
> > > I don't get it. Why would it deadlock? Shouldn't things get rolling
> > > once the workqueues are thawed?
> >
> > The workqueue writeback can't be thawed due to driver's resume
> > (dpm_complete) is lock nested, and can't be finished.
>
> Ugh... that's nasty. I wonder whether the right thing to do is making
> writeback workers non-freezable. IOs are supposed to be blocked from
> lower layer anyway. Jan, what do you think?

Well no, at least currently IO is not blocked in lower layers AFAIK - for
that you'd need to freeze block devices & filesystems and there are issues
with that (Jiri Kosina was the last one which was trying to make this work
AFAIR). And I think you need to stop writeback (and generally any IO) to be
generated so that it doesn't interact in a strange way with device drivers
being frozen. So IMO until suspend freezes filesystems & devices properly
you have to freeze writeback workqueue.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2016-03-11 17:56:17

by Tejun Heo

[permalink] [raw]
Subject: Re: Freezable workqueue blocks non-freezable workqueue during the system resume process

Hello, Jan.

On Thu, Mar 03, 2016 at 10:33:10AM +0100, Jan Kara wrote:
> > Ugh... that's nasty. I wonder whether the right thing to do is making
> > writeback workers non-freezable. IOs are supposed to be blocked from
> > lower layer anyway. Jan, what do you think?
>
> Well no, at least currently IO is not blocked in lower layers AFAIK - for
> that you'd need to freeze block devices & filesystems and there are issues

At least libata does and I think SCSI does too, but yeah, there
probably are drivers which depend on block layer blocking IOs, which
btw is a pretty fragile way to go about as upper layers might not be
the only source of activities.

> with that (Jiri Kosina was the last one which was trying to make this work
> AFAIR). And I think you need to stop writeback (and generally any IO) to be
> generated so that it doesn't interact in a strange way with device drivers
> being frozen. So IMO until suspend freezes filesystems & devices properly
> you have to freeze writeback workqueue.

I still think the right thing to do is plugging that block layer or
low level drivers. It's like we're trying to plug multiple sources
when we can plug the point where they come together anyway.

Thanks.

--
tejun

2016-03-14 07:25:07

by Jan Kara

[permalink] [raw]
Subject: Re: Freezable workqueue blocks non-freezable workqueue during the system resume process

On Fri 11-03-16 12:56:10, Tejun Heo wrote:
> Hello, Jan.
>
> On Thu, Mar 03, 2016 at 10:33:10AM +0100, Jan Kara wrote:
> > > Ugh... that's nasty. I wonder whether the right thing to do is making
> > > writeback workers non-freezable. IOs are supposed to be blocked from
> > > lower layer anyway. Jan, what do you think?
> >
> > Well no, at least currently IO is not blocked in lower layers AFAIK - for
> > that you'd need to freeze block devices & filesystems and there are issues
>
> At least libata does and I think SCSI does too, but yeah, there
> probably are drivers which depend on block layer blocking IOs, which
> btw is a pretty fragile way to go about as upper layers might not be
> the only source of activities.
>
> > with that (Jiri Kosina was the last one which was trying to make this work
> > AFAIR). And I think you need to stop writeback (and generally any IO) to be
> > generated so that it doesn't interact in a strange way with device drivers
> > being frozen. So IMO until suspend freezes filesystems & devices properly
> > you have to freeze writeback workqueue.
>
> I still think the right thing to do is plugging that block layer or
> low level drivers. It's like we're trying to plug multiple sources
> when we can plug the point where they come together anyway.

I agree that freezing writeback workers is a workaround for real issues at
best and ideally we shouldn't have to do that. But at least for now I had
the impression that it is needed for suspend to work reasonably reliably.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2016-03-14 14:37:27

by Alan Stern

[permalink] [raw]
Subject: Re: Freezable workqueue blocks non-freezable workqueue during the system resume process

On Mon, 14 Mar 2016, Jan Kara wrote:

> On Fri 11-03-16 12:56:10, Tejun Heo wrote:
> > Hello, Jan.
> >
> > On Thu, Mar 03, 2016 at 10:33:10AM +0100, Jan Kara wrote:
> > > > Ugh... that's nasty. I wonder whether the right thing to do is making
> > > > writeback workers non-freezable. IOs are supposed to be blocked from
> > > > lower layer anyway. Jan, what do you think?
> > >
> > > Well no, at least currently IO is not blocked in lower layers AFAIK - for
> > > that you'd need to freeze block devices & filesystems and there are issues
> >
> > At least libata does and I think SCSI does too, but yeah, there
> > probably are drivers which depend on block layer blocking IOs, which
> > btw is a pretty fragile way to go about as upper layers might not be
> > the only source of activities.
> >
> > > with that (Jiri Kosina was the last one which was trying to make this work
> > > AFAIR). And I think you need to stop writeback (and generally any IO) to be
> > > generated so that it doesn't interact in a strange way with device drivers
> > > being frozen. So IMO until suspend freezes filesystems & devices properly
> > > you have to freeze writeback workqueue.

What do you mean by "freezes ... devices"? Only a piece of code can be
frozen -- not a device.

The kernel does suspend device drivers; that is, it invokes their
suspend callbacks. But it doesn't "freeze" them in any sense. Once a
driver has been suspended, it assumes it won't receive any I/O requests
until it has been resumed. Therefore the kernel first has to prevent
all the upper layers from generating such requests and/or sending them
to the low-level drivers.

> > I still think the right thing to do is plugging that block layer or
> > low level drivers. It's like we're trying to plug multiple sources
> > when we can plug the point where they come together anyway.
>
> I agree that freezing writeback workers is a workaround for real issues at
> best and ideally we shouldn't have to do that. But at least for now I had
> the impression that it is needed for suspend to work reasonably reliably.

The design is not to plug low-level drivers, but instead to prevent
them from receiving any requests by plugging or freezing high-level
code.

It's pretty clear that we don't want to have ongoing I/O during a
system suspend, right? And that means the I/O has to be prevented (or
"plugged", if you prefer) somewhere -- either at an upper layer or at a
lower layer. There was a choice to be made, and the decision was to do
it at an upper layer.

Alan Stern

2016-03-15 09:25:40

by Jan Kara

[permalink] [raw]
Subject: Re: Freezable workqueue blocks non-freezable workqueue during the system resume process

On Mon 14-03-16 10:37:22, Alan Stern wrote:
> On Mon, 14 Mar 2016, Jan Kara wrote:
>
> > On Fri 11-03-16 12:56:10, Tejun Heo wrote:
> > > Hello, Jan.
> > >
> > > On Thu, Mar 03, 2016 at 10:33:10AM +0100, Jan Kara wrote:
> > > > > Ugh... that's nasty. I wonder whether the right thing to do is making
> > > > > writeback workers non-freezable. IOs are supposed to be blocked from
> > > > > lower layer anyway. Jan, what do you think?
> > > >
> > > > Well no, at least currently IO is not blocked in lower layers AFAIK - for
> > > > that you'd need to freeze block devices & filesystems and there are issues
> > >
> > > At least libata does and I think SCSI does too, but yeah, there
> > > probably are drivers which depend on block layer blocking IOs, which
> > > btw is a pretty fragile way to go about as upper layers might not be
> > > the only source of activities.
> > >
> > > > with that (Jiri Kosina was the last one which was trying to make this work
> > > > AFAIR). And I think you need to stop writeback (and generally any IO) to be
> > > > generated so that it doesn't interact in a strange way with device drivers
> > > > being frozen. So IMO until suspend freezes filesystems & devices properly
> > > > you have to freeze writeback workqueue.
>
> What do you mean by "freezes ... devices"? Only a piece of code can be
> frozen -- not a device.

By that I meant block device and filesystem freezing. That way filesystem
is frozen so that it doesn't submit any more IO to the device.

> The kernel does suspend device drivers; that is, it invokes their
> suspend callbacks. But it doesn't "freeze" them in any sense. Once a
> driver has been suspended, it assumes it won't receive any I/O requests
> until it has been resumed. Therefore the kernel first has to prevent
> all the upper layers from generating such requests and/or sending them
> to the low-level drivers.

OK, so Tejun and you should talk together because you both seem to want
something else... If I understand it right, Tejun wants suspended devices
to just queue requests that have been submitted after these devices were
suspended and complete them once they are resumed...

> > > I still think the right thing to do is plugging that block layer or
> > > low level drivers. It's like we're trying to plug multiple sources
> > > when we can plug the point where they come together anyway.
> >
> > I agree that freezing writeback workers is a workaround for real issues at
> > best and ideally we shouldn't have to do that. But at least for now I had
> > the impression that it is needed for suspend to work reasonably reliably.
>
> The design is not to plug low-level drivers, but instead to prevent
> them from receiving any requests by plugging or freezing high-level
> code.
>
> It's pretty clear that we don't want to have ongoing I/O during a
> system suspend, right? And that means the I/O has to be prevented (or
> "plugged", if you prefer) somewhere -- either at an upper layer or at a
> lower layer. There was a choice to be made, and the decision was to do
> it at an upper layer.

I agree the IO has to be plugged somewhere. And Tejun seems to want to plug
it at lower layer...

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2016-03-16 15:01:00

by Tejun Heo

[permalink] [raw]
Subject: Re: Freezable workqueue blocks non-freezable workqueue during the system resume process

Hello, Jan, Alan.

On Tue, Mar 15, 2016 at 10:25:43AM +0100, Jan Kara wrote:
> > The kernel does suspend device drivers; that is, it invokes their
> > suspend callbacks. But it doesn't "freeze" them in any sense. Once a
> > driver has been suspended, it assumes it won't receive any I/O requests
> > until it has been resumed. Therefore the kernel first has to prevent
> > all the upper layers from generating such requests and/or sending them
> > to the low-level drivers.
>
> OK, so Tejun and you should talk together because you both seem to want
> something else... If I understand it right, Tejun wants suspended devices
> to just queue requests that have been submitted after these devices were
> suspended and complete them once they are resumed...

Yeah, I suppose that's why we have the code base we do now. I don't
think freezing kernel threads is the right mechanism to plug IO
devices during suspend. It's way too error-prone and causes a
dependency nightmare as it acts essentially as a system-wide lock.

More complex drivers already plug themselves which are necessary no
matter what as upper layers or some kthreads aren't the only sources
of commands to devices. We can plug at block layer for IOs coming
down from higher layers. We can even provide a mechanism to plug
certain kthreads if necessary but they should be contained in the
driver - e.g. the suspend callback specifically blocking certain
specific kthreads - instead of the vague "the system is generally
stopped now and it seems to work most of the time" that we're doing
now.

Thanks.

--
tejun