2014-11-27 06:22:12

by Dexuan Cui

Subject: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer failure

If the user-space daemon crashes, hangs, or is killed, we need to down
the semaphore; otherwise, the next time the daemon starts, it will
immediately consume the obsolete data in fcopy_transaction.message or
fcopy_transaction.fcopy_msg.

Reviewed-by: Vitaly Kuznetsov <[email protected]>
Cc: K. Y. Srinivasan <[email protected]>
Signed-off-by: Dexuan Cui <[email protected]>
---

v2: I removed the "FCP" prefix as Greg asked.

I also updated the output message a little:
"FCP: failed to acquire the semaphore" -->
"can not acquire the semaphore: it is benign"

drivers/hv/hv_fcopy.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
index 23b2ce2..c518ad9 100644
--- a/drivers/hv/hv_fcopy.c
+++ b/drivers/hv/hv_fcopy.c
@@ -86,6 +86,15 @@ static void fcopy_work_func(struct work_struct *dummy)
* process the pending transaction.
*/
fcopy_respond_to_host(HV_E_FAIL);
+
+ /* In the case the user-space daemon crashes, hangs or is killed, we
+ * need to down the semaphore, otherwise, after the daemon starts next
+ * time, the obsolete data in fcopy_transaction.message or
+ * fcopy_transaction.fcopy_msg will be used immediately.
+ */
+ if (down_trylock(&fcopy_transaction.read_sema))
+ pr_debug("can not acquire the semaphore: it is benign\n");
+
}

static int fcopy_handle_handshake(u32 version)
--
1.9.1
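
[Editor's illustration, not part of the patch] The idea in the patch can be tried in user space. The sketch below is a minimal model that assumes a POSIX semaphore standing in for fcopy_transaction.read_sema; the names model_init(), post_message(), drop_stale_signal() and reader_would_see_data() are hypothetical and do not exist in the driver. It only shows why a leftover "message ready" signal must be taken back after a failed transfer.

```c
#include <assert.h>
#include <semaphore.h>
#include <stdbool.h>

/* Hypothetical stand-in for fcopy_transaction.read_sema (count starts at 0). */
static sem_t read_sema;

static void model_init(void)
{
	int rc = sem_init(&read_sema, 0, 0);

	assert(rc == 0);
	(void)rc;
}

/* A host message arrives: signal the reader, like up() in the driver. */
static void post_message(void)
{
	sem_post(&read_sema);
}

/*
 * Timeout path: take the signal back if nobody consumed it.
 * sem_trywait() never blocks, like down_trylock(). The return
 * conventions differ: sem_trywait() returns 0 on success and -1 on
 * failure, while down_trylock() returns 0 on success and 1 on failure.
 */
static bool drop_stale_signal(void)
{
	return sem_trywait(&read_sema) == 0;
}

/* Would a freshly started reader get through without waiting? */
static bool reader_would_see_data(void)
{
	int value = 0;

	sem_getvalue(&read_sema, &value);
	return value > 0;
}
```

Calling model_init(), then post_message(), leaves a pending signal that a restarted reader would pick up at once; drop_stale_signal() removes it, and a second call reports that there was nothing left to drop, which is the "benign" failure the debug message refers to.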


2014-11-27 07:14:45

by Jason Wang

Subject: Re: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer failure



----- Original Message -----
> In the case the user-space daemon crashes, hangs or is killed, we
> need to down the semaphore, otherwise, after the daemon starts next
> time, the obsolete data in fcopy_transaction.message or
> fcopy_transaction.fcopy_msg will be used immediately.
>
> Reviewed-by: Vitaly Kuznetsov <[email protected]>
> Cc: K. Y. Srinivasan <[email protected]>
> Signed-off-by: Dexuan Cui <[email protected]>
> ---
>
> v2: I removed the "FCP" prefix as Greg asked.
>
> I also updated the output message a little:
> "FCP: failed to acquire the semaphore" -->
> "can not acquire the semaphore: it is benign"
>
> drivers/hv/hv_fcopy.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
> index 23b2ce2..c518ad9 100644
> --- a/drivers/hv/hv_fcopy.c
> +++ b/drivers/hv/hv_fcopy.c
> @@ -86,6 +86,15 @@ static void fcopy_work_func(struct work_struct *dummy)
> * process the pending transaction.
> */
> fcopy_respond_to_host(HV_E_FAIL);
> +
> + /* In the case the user-space daemon crashes, hangs or is killed, we
> + * need to down the semaphore, otherwise, after the daemon starts next
> + * time, the obsolete data in fcopy_transaction.message or
> + * fcopy_transaction.fcopy_msg will be used immediately.
> + */

This still looks racy: what happens if the daemon starts before down_trylock()
but after fcopy_respond_to_host() here?

> + if (down_trylock(&fcopy_transaction.read_sema))
> + pr_debug("can not acquire the semaphore: it is benign\n");

typo
> +
> }
>
> static int fcopy_handle_handshake(u32 version)
> --
> 1.9.1
>
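
[Editor's illustration, not part of the original message] The window described above can be made concrete with a small single-threaded model in which the interleaving is chosen by the order of the calls. Everything here is an assumption for illustration: struct fcopy_model and its helpers are hypothetical, the semaphore is a plain counter, and the daemon's read is modeled as a non-blocking attempt, whereas the real read blocks.

```c
#include <stdbool.h>

/* Hypothetical model of the state shared by the timeout path and the daemon. */
struct fcopy_model {
	int  sema;              /* pending "message ready" signals */
	bool stale;             /* host has already been told HV_E_FAIL */
	bool daemon_got_stale;  /* daemon consumed an obsolete message */
};

/* Host delivers a message and signals the reader. */
static void host_posts(struct fcopy_model *m)
{
	m->sema++;
	m->stale = false;
}

/* Non-blocking down: true when a signal was taken. */
static bool try_down(struct fcopy_model *m)
{
	if (m->sema > 0) {
		m->sema--;
		return true;
	}
	return false;
}

/* Timeout step 1: fail the transaction towards the host. */
static void timeout_responds_fail(struct fcopy_model *m)
{
	m->stale = true;
}

/* Timeout step 2: drop the leftover signal, if it is still there. */
static void timeout_drains(struct fcopy_model *m)
{
	(void)try_down(m);
}

/* A (re)started daemon tries to read the pending message. */
static void daemon_reads(struct fcopy_model *m)
{
	if (try_down(m) && m->stale)
		m->daemon_got_stale = true;
}
```

With the order host_posts(), timeout_responds_fail(), timeout_drains(), daemon_reads(), the daemon finds nothing to read. Swapping the last two calls reproduces the race: the daemon consumes the obsolete message and the drain then fails harmlessly.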

2014-11-27 08:51:27

by Dexuan Cui

Subject: RE: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer failure

> -----Original Message-----
> From: Jason Wang [mailto:[email protected]]
> Sent: Thursday, November 27, 2014 15:15 PM
> To: Dexuan Cui
> Cc: [email protected]; [email protected]; driverdev-
> [email protected]; [email protected]; [email protected]; KY
> Srinivasan; [email protected]; Haiyang Zhang
> Subject: Re: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer
> failure
> ----- Original Message -----
> > In the case the user-space daemon crashes, hangs or is killed, we
> > need to down the semaphore, otherwise, after the daemon starts next
> > time, the obsolete data in fcopy_transaction.message or
> > fcopy_transaction.fcopy_msg will be used immediately.
> >
> > Reviewed-by: Vitaly Kuznetsov <[email protected]>
> > Cc: K. Y. Srinivasan <[email protected]>
> > Signed-off-by: Dexuan Cui <[email protected]>
> > ---
> >
> > v2: I removed the "FCP" prefix as Greg asked.
> >
> > I also updated the output message a little:
> > "FCP: failed to acquire the semaphore" -->
> > "can not acquire the semaphore: it is benign"
> >
> > drivers/hv/hv_fcopy.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
> > index 23b2ce2..c518ad9 100644
> > --- a/drivers/hv/hv_fcopy.c
> > +++ b/drivers/hv/hv_fcopy.c
> > @@ -86,6 +86,15 @@ static void fcopy_work_func(struct work_struct *dummy)
> > * process the pending transaction.
> > */
> > fcopy_respond_to_host(HV_E_FAIL);
> > +
> > + /* In the case the user-space daemon crashes, hangs or is killed, we
> > + * need to down the semaphore, otherwise, after the daemon starts next
> > + * time, the obsolete data in fcopy_transaction.message or
> > + * fcopy_transaction.fcopy_msg will be used immediately.
> > + */
>
> This still looks racy: what happens if the daemon starts before down_trylock()
> but after fcopy_respond_to_host() here?
Jason,
Thanks for pointing this out!
IMO we can resolve this by adding down_trylock() in fcopy_release().
What's your opinion?

>
> > + if (down_trylock(&fcopy_transaction.read_sema))
> > + pr_debug("can not acquire the semaphore: it is benign\n");
>
> typo
> > +
> > }
Sorry -- what typo do you mean?

Thanks,
-- Dexuan

2014-11-27 09:01:27

by Jason Wang

Subject: RE: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer failure



On Thu, Nov 27, 2014 at 4:50 PM, Dexuan Cui <[email protected]> wrote:
>> -----Original Message-----
>> From: Jason Wang [mailto:[email protected]]
>> Sent: Thursday, November 27, 2014 15:15 PM
>> To: Dexuan Cui
>> Cc: [email protected]; [email protected];
>> driverdev-
>> [email protected]; [email protected]; [email protected]; KY
>> Srinivasan; [email protected]; Haiyang Zhang
>> Subject: Re: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer failure
>> ----- Original Message -----
>> > In the case the user-space daemon crashes, hangs or is killed, we
>> > need to down the semaphore, otherwise, after the daemon starts next
>> > time, the obsolete data in fcopy_transaction.message or
>> > fcopy_transaction.fcopy_msg will be used immediately.
>> >
>> > Reviewed-by: Vitaly Kuznetsov <[email protected]>
>> > Cc: K. Y. Srinivasan <[email protected]>
>> > Signed-off-by: Dexuan Cui <[email protected]>
>> > ---
>> >
>> > v2: I removed the "FCP" prefix as Greg asked.
>> >
>> > I also updated the output message a little:
>> > "FCP: failed to acquire the semaphore" -->
>> > "can not acquire the semaphore: it is benign"
>> >
>> > drivers/hv/hv_fcopy.c | 9 +++++++++
>> > 1 file changed, 9 insertions(+)
>> >
>> > diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
>> > index 23b2ce2..c518ad9 100644
>> > --- a/drivers/hv/hv_fcopy.c
>> > +++ b/drivers/hv/hv_fcopy.c
>> > @@ -86,6 +86,15 @@ static void fcopy_work_func(struct work_struct *dummy)
>> > * process the pending transaction.
>> > */
>> > fcopy_respond_to_host(HV_E_FAIL);
>> > +
>> > + /* In the case the user-space daemon crashes, hangs or is killed, we
>> > + * need to down the semaphore, otherwise, after the daemon starts next
>> > + * time, the obsolete data in fcopy_transaction.message or
>> > + * fcopy_transaction.fcopy_msg will be used immediately.
>> > + */
>>
>> This still looks racy: what happens if the daemon starts before
>> down_trylock() but after fcopy_respond_to_host() here?
> Jason,
> Thanks for pointing this out!
> IMO we can resolve this by adding down_trylock() in fcopy_release().
> What's your opinion?


That looks better. Do we also need to cancel the timeout here?
>
>
>>
>> > + if (down_trylock(&fcopy_transaction.read_sema))
>> > + pr_debug("can not acquire the semaphore: it is benign\n");
>>
>> typo
>> > +
>> > }
> Sorry -- what typo do you mean?

s/benign/begin/ ?
>
> Thanks,
> -- Dexuan

2014-11-27 11:44:56

by Dexuan Cui

Subject: RE: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer failure

> -----Original Message-----
> From: Jason Wang [mailto:[email protected]]
> Sent: Thursday, November 27, 2014 17:01 PM
> To: Dexuan Cui
> Cc: [email protected]; [email protected]; driverdev-
> [email protected]; [email protected]; [email protected]; KY
> Srinivasan; [email protected]; Haiyang Zhang
> Subject: RE: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer
> failure
> On Thu, Nov 27, 2014 at 4:50 PM, Dexuan Cui <[email protected]> wrote:
> >> -----Original Message-----
> >> From: Jason Wang [mailto:[email protected]]
> >> Sent: Thursday, November 27, 2014 15:15 PM
> >> To: Dexuan Cui
> >> Cc: [email protected]; [email protected];
> >> driverdev-
> >> [email protected]; [email protected]; [email protected]; KY
> >> Srinivasan; [email protected]; Haiyang Zhang
> >> Subject: Re: [PATCH v2] hv: hv_fcopy: drop the obsolete message on transfer failure
> >> ----- Original Message -----
> >> > In the case the user-space daemon crashes, hangs or is killed, we
> >> > need to down the semaphore, otherwise, after the daemon starts next
> >> > time, the obsolete data in fcopy_transaction.message or
> >> > fcopy_transaction.fcopy_msg will be used immediately.
> >> >
> >> > Reviewed-by: Vitaly Kuznetsov <[email protected]>
> >> > Cc: K. Y. Srinivasan <[email protected]>
> >> > Signed-off-by: Dexuan Cui <[email protected]>
> >> > ---
> >> >
> >> > v2: I removed the "FCP" prefix as Greg asked.
> >> >
> >> > I also updated the output message a little:
> >> > "FCP: failed to acquire the semaphore" -->
> >> > "can not acquire the semaphore: it is benign"
> >> >
> >> > drivers/hv/hv_fcopy.c | 9 +++++++++
> >> > 1 file changed, 9 insertions(+)
> >> >
> >> > diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
> >> > index 23b2ce2..c518ad9 100644
> >> > --- a/drivers/hv/hv_fcopy.c
> >> > +++ b/drivers/hv/hv_fcopy.c
> >> > @@ -86,6 +86,15 @@ static void fcopy_work_func(struct work_struct *dummy)
> >> > * process the pending transaction.
> >> > */
> >> > fcopy_respond_to_host(HV_E_FAIL);
> >> > +
> >> > + /* In the case the user-space daemon crashes, hangs or is killed, we
> >> > + * need to down the semaphore, otherwise, after the daemon starts next
> >> > + * time, the obsolete data in fcopy_transaction.message or
> >> > + * fcopy_transaction.fcopy_msg will be used immediately.
> >> > + */
> >>
> >> This still looks racy: what happens if the daemon starts before
> >> down_trylock() but after fcopy_respond_to_host() here?
> > Jason,
> > Thanks for pointing this out!
> > IMO we can resolve this by adding down_trylock() in fcopy_release().
> > What's your opinion?
>
>
> That looks better. Do we also need to cancel the timeout here?
OK, let me post a v3.

> >
> >
> >>
> >> > + if (down_trylock(&fcopy_transaction.read_sema))
> >> > + pr_debug("can not acquire the semaphore: it is benign\n");
> >>
> >> typo
> >> > +
> >> > }
> > Sorry -- what typo do you mean?
>
> s/benign/begin/ ?
I meant that the issue (failing to get the semaphore) is benign.

I think we can just remove the message, as KY suggested.
Instead, I'll add a comment for it.

Thanks,
-- Dexuan
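
[Editor's illustration, not part of the original message] The direction agreed on in this thread (drain the semaphore when the device is released, and cancel the pending timeout as well) might look roughly like the user-space model below. This is a sketch under stated assumptions, not the actual v3 patch: struct fcopy_state, try_down() and model_release() are hypothetical names, and the semaphore and the timeout are reduced to a counter and a flag.

```c
#include <stdbool.h>

/* Hypothetical model of the per-device state touched on release. */
struct fcopy_state {
	int  sema;           /* pending "message ready" signals */
	bool timeout_armed;  /* a timeout handler is still scheduled */
	bool opened;         /* the daemon holds the device open */
};

/* Non-blocking down: true when a signal was taken. */
static bool try_down(struct fcopy_state *s)
{
	if (s->sema > 0) {
		s->sema--;
		return true;
	}
	return false;
}

/*
 * Release path: cancel the timeout first so its handler cannot run
 * against a closed device, then drop any leftover signal so the next
 * open does not see obsolete data. Returns true when a stale signal
 * was dropped; false is the benign case where there was none.
 */
static bool model_release(struct fcopy_state *s)
{
	bool dropped;

	s->timeout_armed = false;
	dropped = try_down(s);
	s->opened = false;
	return dropped;
}
```

Doing the drain in the release path closes the window discussed earlier, because no new reader can exist until the device is opened again, by which point the stale signal is already gone.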
