2014-07-16 05:59:58

by Dexuan Cui

Subject: [PATCH] Drivers: hv: hv_fcopy: fix a race condition for SMP guest

We should schedule the 5s "timer work" before starting the data transfer.
Otherwise, the data transfer may occasionally finish so quickly on another
virtual CPU that fcopy_write() tries to cancel the 5s "timer work" before
it has been scheduled. The cancel then has no effect, the "timer work" is
scheduled afterwards, and the fcopy process is wrongly aborted by
fcopy_work_func() 5s later.

Thanks to Liz Zhang <[email protected]> for the initial investigation
of the bug.

This addresses https://bugzilla.redhat.com/show_bug.cgi?id=1118123

Tested-by: Liz Zhang <[email protected]>
Cc: K. Y. Srinivasan <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: [email protected]
Signed-off-by: Dexuan Cui <[email protected]>
---
drivers/hv/hv_fcopy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
index eaaa3d8..23b2ce2 100644
--- a/drivers/hv/hv_fcopy.c
+++ b/drivers/hv/hv_fcopy.c
@@ -246,8 +246,8 @@ void hv_fcopy_onchannelcallback(void *context)
 		/*
 		 * Send the information to the user-level daemon.
 		 */
-		fcopy_send_data();
 		schedule_delayed_work(&fcopy_work, 5*HZ);
+		fcopy_send_data();
 		return;
 	}
 	icmsghdr->icflags = ICMSGHDRFLAG_TRANSACTION | ICMSGHDRFLAG_RESPONSE;
--
1.9.1
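
The ordering problem described in the changelog can be modelled in plain
userspace C. This is only a sketch under stated assumptions, not the driver
code: struct sim_timer, sim_arm(), sim_cancel() and timeout_left_armed() are
hypothetical names invented for illustration, and the "reply on another
virtual CPU" is modelled as a direct function call at the earliest point it
could happen, rather than with real threads or the kernel workqueue API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the 5s delayed work; not a kernel type. */
struct sim_timer {
	bool pending;	/* timeout is armed and has not been cancelled */
};

/* Models arming the 5s timeout. */
static void sim_arm(struct sim_timer *t)
{
	t->pending = true;
}

/*
 * Models the cancel done when the daemon's reply arrives.
 * Returns true only if an armed timeout was actually cancelled.
 */
static bool sim_cancel(struct sim_timer *t)
{
	bool was_pending = t->pending;

	t->pending = false;
	return was_pending;
}

/*
 * Plays out one ordering with the worst-case timing: the reply arrives
 * as soon as the data has been sent. Returns true if the timeout is
 * still armed once everything has run, i.e. the transfer would later
 * be aborted even though it completed.
 */
static bool timeout_left_armed(bool arm_before_send)
{
	struct sim_timer t = { .pending = false };

	if (arm_before_send) {
		sim_arm(&t);	/* new order: arm the timeout first ... */
		sim_cancel(&t);	/* ... then send; the reply cancels it */
	} else {
		sim_cancel(&t);	/* old order: reply finds nothing to cancel */
		sim_arm(&t);	/* the timeout is armed too late */
	}
	return t.pending;
}
```

In this model timeout_left_armed(false) returns true (the old order leaves a
stale timeout behind), while timeout_left_armed(true) returns false. The real
race depends on timing between virtual CPUs, which is why the failure was
only occasional.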


2014-07-16 13:12:54

by KY Srinivasan

Subject: RE: [PATCH] Drivers: hv: hv_fcopy: fix a race condition for SMP guest



> -----Original Message-----
> From: Dexuan Cui [mailto:[email protected]]
> Sent: Wednesday, July 16, 2014 12:01 AM
> To: [email protected]; [email protected]; driverdev-
> [email protected]; [email protected]; [email protected];
> [email protected]
> Cc: KY Srinivasan; Haiyang Zhang
> Subject: [PATCH] Drivers: hv: hv_fcopy: fix a race condition for SMP guest
>
> We should schedule the 5s "timer work" before starting the data transfer,
> otherwise, the data transfer code may finish so fast on another virtual cpu
> that when the code(fcopy_write()) trying to cancel the 5s "timer work" can
> occasionally fail because the "timer work" may haven't been scheduled yet
> and as a result the fcopy process will be aborted wrongly by
> fcopy_work_func() in 5s.
>
> Thank Liz Zhang <[email protected]> for the initial investigation on the
> bug.
>
> This addresses https://bugzilla.redhat.com/show_bug.cgi?id=1118123
>
> Tested-by: Liz Zhang <[email protected]>
> Cc: K. Y. Srinivasan <[email protected]>
> Cc: Haiyang Zhang <[email protected]>
> Cc: [email protected]
> Signed-off-by: Dexuan Cui <[email protected]>

Thanks Dexuan.
Signed-off-by: K. Y. Srinivasan <[email protected]>


K. Y

> ---
> drivers/hv/hv_fcopy.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
> index eaaa3d8..23b2ce2 100644
> --- a/drivers/hv/hv_fcopy.c
> +++ b/drivers/hv/hv_fcopy.c
> @@ -246,8 +246,8 @@ void hv_fcopy_onchannelcallback(void *context)
>  		/*
>  		 * Send the information to the user-level daemon.
>  		 */
> -		fcopy_send_data();
>  		schedule_delayed_work(&fcopy_work, 5*HZ);
> +		fcopy_send_data();
>  		return;
>  	}
>  	icmsghdr->icflags = ICMSGHDRFLAG_TRANSACTION | ICMSGHDRFLAG_RESPONSE;
> --
> 1.9.1