2015-02-25 14:36:22

by Michael S. Tsirkin

Subject: [PATCH v2] virtio-balloon: do not call blocking ops when !TASK_RUNNING

virtio balloon has this code:
	wait_event_interruptible(vb->config_change,
				 (diff = towards_target(vb)) != 0
				 || vb->need_stats_update
				 || kthread_should_stop()
				 || freezing(current));

This is a problem because the towards_target() call might block after
wait_event_interruptible() has set the task state to TASK_INTERRUPTIBLE,
causing the task_struct::state collision typical of nested sleeping
primitives.

See also http://lwn.net/Articles/628628/ or Thomas's
bug report
http://article.gmane.org/gmane.linux.kernel.virtualization/24846
for a fuller explanation.

To fix this, rewrite the wait loop using wait_woken(), so that the
condition is evaluated while the task is still TASK_RUNNING.

Cc: [email protected]
Reported-by: Thomas Huth <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---

changes from v1:
remove wait_event_interruptible
noticed by Cornelia Huck <[email protected]>
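
The state collision described above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: every toy_*
name is hypothetical, the "scheduler" is a plain function, and the
woken flag merely stands in for the flag that wait_woken() keeps on the
wait queue entry. It assumes the condition always performs one blocking
call, as towards_target() might.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for kernel task states; the values are illustrative. */
enum toy_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct toy_task {
	enum toy_state state;
	bool woken;		/* models the "woken" flag on the wait entry */
	int nested_sleeps;	/* blocking calls made while !TASK_RUNNING */
};

/* Models any blocking op: it returns with the task in TASK_RUNNING. */
static void toy_blocking_op(struct toy_task *t)
{
	if (t->state != TASK_RUNNING)
		t->nested_sleeps++;	/* the real kernel would warn here */
	t->state = TASK_RUNNING;
}

/* Models a wait condition that may block (assumption for this sketch). */
static bool toy_condition(struct toy_task *t)
{
	toy_blocking_op(t);
	return false;			/* nothing to do yet, so we must wait */
}

/* Models schedule(): the task sleeps only if its state is not RUNNING. */
static bool toy_schedule(struct toy_task *t)
{
	bool slept = (t->state != TASK_RUNNING);

	t->state = TASK_RUNNING;
	return slept;
}

/* wait_event-style pass: state is set BEFORE the condition is evaluated. */
static bool toy_wait_event_once(struct toy_task *t)
{
	t->state = TASK_INTERRUPTIBLE;
	if (toy_condition(t)) {
		t->state = TASK_RUNNING;
		return false;
	}
	return toy_schedule(t);		/* state was clobbered: no sleep */
}

/* wait_woken-style pass: condition runs in TASK_RUNNING, state set after. */
static bool toy_wait_woken_once(struct toy_task *t)
{
	bool slept = false;

	if (toy_condition(t))
		return false;
	t->state = TASK_INTERRUPTIBLE;
	if (!t->woken)
		slept = toy_schedule(t);
	t->state = TASK_RUNNING;
	t->woken = false;
	return slept;
}
```

In this model, one pass of toy_wait_event_once() records a nested sleep
and then fails to sleep at all, while toy_wait_woken_once() records none
and sleeps as intended.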

drivers/virtio/virtio_balloon.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 0413157..5a6ad6d 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -29,6 +29,7 @@
 #include <linux/module.h>
 #include <linux/balloon_compaction.h>
 #include <linux/oom.h>
+#include <linux/wait.h>

 /*
  * Balloon device works in 4K page units. So each page is pointed to by
@@ -334,17 +335,25 @@ static int virtballoon_oom_notify(struct notifier_block *self,
 static int balloon(void *_vballoon)
 {
 	struct virtio_balloon *vb = _vballoon;
+	DEFINE_WAIT_FUNC(wait, woken_wake_function);

 	set_freezable();
 	while (!kthread_should_stop()) {
 		s64 diff;

 		try_to_freeze();
-		wait_event_interruptible(vb->config_change,
-					 (diff = towards_target(vb)) != 0
-					 || vb->need_stats_update
-					 || kthread_should_stop()
-					 || freezing(current));
+
+		add_wait_queue(&vb->config_change, &wait);
+		for (;;) {
+			if ((diff = towards_target(vb)) != 0 ||
+			    vb->need_stats_update ||
+			    kthread_should_stop() ||
+			    freezing(current))
+				break;
+			wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
+		}
+		remove_wait_queue(&vb->config_change, &wait);
+
 		if (vb->need_stats_update)
 			stats_handle_request(vb);
 		if (diff > 0)
--
MST


2015-02-25 15:11:38

by Cornelia Huck

Subject: Re: [PATCH v2] virtio-balloon: do not call blocking ops when !TASK_RUNNING

On Wed, 25 Feb 2015 15:36:02 +0100
"Michael S. Tsirkin" <[email protected]> wrote:

> virtio balloon has this code:
> wait_event_interruptible(vb->config_change,
> (diff = towards_target(vb)) != 0
> || vb->need_stats_update
> || kthread_should_stop()
> || freezing(current));
>
> Which is a problem because towards_target() call might block after
> wait_event_interruptible sets task state to TASK_INTERRUPTIBLE, causing
> the task_struct::state collision typical of nesting of sleeping
> primitives
>
> See also http://lwn.net/Articles/628628/ or Thomas's
> bug report
> http://article.gmane.org/gmane.linux.kernel.virtualization/24846
> for a fuller explanation.
>
> To fix, rewrite using wait_woken.
>
> Cc: [email protected]
> Reported-by: Thomas Huth <[email protected]>
> Signed-off-by: Michael S. Tsirkin <[email protected]>
> ---
>
> changes from v1:
> remove wait_event_interruptible
> noticed by Cornelia Huck <[email protected]>
>
> drivers/virtio/virtio_balloon.c | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>

I was able to reproduce Thomas' original problem and can confirm that
it is gone with this patch.

Reviewed-by: Cornelia Huck <[email protected]>

2015-02-25 15:38:08

by Thomas Huth

Subject: Re: [PATCH v2] virtio-balloon: do not call blocking ops when !TASK_RUNNING

On Wed, 25 Feb 2015 16:11:27 +0100
Cornelia Huck <[email protected]> wrote:

> On Wed, 25 Feb 2015 15:36:02 +0100
> "Michael S. Tsirkin" <[email protected]> wrote:
>
> > virtio balloon has this code:
> > wait_event_interruptible(vb->config_change,
> > (diff = towards_target(vb)) != 0
> > || vb->need_stats_update
> > || kthread_should_stop()
> > || freezing(current));
> >
> > Which is a problem because towards_target() call might block after
> > wait_event_interruptible sets task state to TASK_INTERRUPTIBLE, causing
> > the task_struct::state collision typical of nesting of sleeping
> > primitives
> >
> > See also http://lwn.net/Articles/628628/ or Thomas's
> > bug report
> > http://article.gmane.org/gmane.linux.kernel.virtualization/24846
> > for a fuller explanation.
> >
> > To fix, rewrite using wait_woken.
> >
> > Cc: [email protected]
> > Reported-by: Thomas Huth <[email protected]>
> > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > ---
> >
> > changes from v1:
> > remove wait_event_interruptible
> > noticed by Cornelia Huck <[email protected]>
> >
> > drivers/virtio/virtio_balloon.c | 19 ++++++++++++++-----
> > 1 file changed, 14 insertions(+), 5 deletions(-)
> >
>
> I was able to reproduce Thomas' original problem and can confirm that
> it is gone with this patch.
>
> Reviewed-by: Cornelia Huck <[email protected]>

Right, I just applied the patch on my system, too, and the problem is
indeed gone! Thanks for the quick fix!

Tested-by: Thomas Huth <[email protected]>