2011-09-28 18:06:34

by Anna Schumaker

[permalink] [raw]
Subject: nfsd grace period open owners

Hi Bruce,

I'm updating my fault injection patches to work with your recent stateid changes, and I had a question about something I'm seeing on the server. When I reboot the server it will enter a grace period and reply to any OPEN requests from the client with NFSERR_GRACE. I used Wireshark to see that NFSERR_GRACE is returned 12 times before the open succeeds, so 13 OPEN requests total. When I send the command to forget all open owners I find that 13 open owners have been forgotten, but I was only expecting to find 1. Is this what you would expect the server to do?

I've pasted the code I use to count and delete open owners below. The variable "num" has the value 0, so all open owners should be deleted.

- Bryan

/* Release up to "num" open owners; num == 0 never matches, so all are released. */
static int nfsd_forget_n_openowners(u64 num)
{
	int i, count = 0;
	struct nfs4_stateowner *sop, *next;

	for (i = 0; i < OPEN_OWNER_HASH_SIZE; i++) {
		list_for_each_entry_safe(sop, next, &open_ownerstr_hashtbl[i], so_strhash) {
			release_openowner(openowner(sop));
			if (++count == num)
				return count;
		}
	}
	return count;
}

void nfsd_forget_openowners(u64 num)
{
	int count;

	/* The owner hash table is walked under the state lock. */
	nfs4_lock_state();
	count = nfsd_forget_n_openowners(num);
	nfs4_unlock_state();

	printk(KERN_INFO "%s %s Forgot %d open owners\n", __FILE__, __func__, count);
}
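
To illustrate why passing 0 for "num" removes every open owner: the counter is incremented before the comparison, so it is at least 1 whenever it is tested and can never equal 0. The following is a minimal userspace sketch of just that counting logic. forget_n() and its "total" parameter are hypothetical stand-ins for the hash-table walk, not nfsd code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the loop in nfsd_forget_n_openowners():
 * "total" is how many owners exist, "num" is the requested limit.
 */
static int forget_n(int total, uint64_t num)
{
	int count = 0;
	int i;

	assert(total >= 0);

	for (i = 0; i < total; i++) {
		/* The real code would call release_openowner() here. */
		if ((uint64_t)++count == num)
			return count;	/* limit reached, stop early */
	}
	/* num == 0 (or num > total) ends up here: everything was released. */
	return count;
}
```

In this model forget_n(13, 0) walks all 13 entries, while forget_n(13, 5) stops after 5.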


2011-09-28 18:23:06

by J. Bruce Fields

[permalink] [raw]
Subject: Re: nfsd grace period open owners

On Wed, Sep 28, 2011 at 02:06:16PM -0400, Bryan Schumaker wrote:
> Hi Bruce,
>
> I'm updating my fault injection patches to work with your recent stateid changes, and I had a question about something I'm seeing on the server. When I reboot the server it will enter a grace period and reply to any OPEN requests from the client with NFSERR_GRACE. I used Wireshark to see that NFSERR_GRACE is returned 12 times before the open succeeds, so 13 OPEN requests total. When I send the command to forget all open owners I find that 13 open owners have been forgotten, but I was only expecting to find 1. Is this what you would expect the server to do?

Huh. No, that doesn't sound right. A failed open doesn't confirm the
openowner, and I don't think there's any reason to keep around an
unconfirmed openowner.

Out of curiosity, is the client using a different openowner for each
open?

--b.

>
> I've pasted the code I use to count and delete open owners below. The variable "num" has the value 0, so all open owners should be deleted.
>
> - Bryan
>
> static int nfsd_forget_n_openowners(u64 num)
> {
> int i, count = 0;
> struct nfs4_stateowner *sop, *next;
>
> for (i = 0; i < OPEN_OWNER_HASH_SIZE; i++) {
> list_for_each_entry_safe(sop, next, &open_ownerstr_hashtbl[i], so_strhash) {
> release_openowner(openowner(sop));
> if (++count == num)
> return count;
> }
> }
> return count;
> }
>
> void nfsd_forget_openowners(u64 num)
> {
> int count;
>
> nfs4_lock_state();
> count = nfsd_forget_n_openowners(num);
> nfs4_unlock_state();
>
> printk(KERN_INFO "%s %s Forgot %d open owners", __FILE__, __func__, count);
> }

2011-09-28 19:09:21

by J. Bruce Fields

[permalink] [raw]
Subject: Re: nfsd grace period open owners

On Wed, Sep 28, 2011 at 02:43:02PM -0400, Bryan Schumaker wrote:
> On 09/28/2011 02:23 PM, J. Bruce Fields wrote:
> > On Wed, Sep 28, 2011 at 02:06:16PM -0400, Bryan Schumaker wrote:
> >> Hi Bruce,
> >>
> >> I'm updating my fault injection patches to work with your recent stateid changes, and I had a question about something I'm seeing on the server. When I reboot the server it will enter a grace period and reply to any OPEN requests from the client with NFSERR_GRACE. I used Wireshark to see that NFSERR_GRACE is returned 12 times before the open succeeds, so 13 OPEN requests total. When I send the command to forget all open owners I find that 13 open owners have been forgotten, but I was only expecting to find 1. Is this what you would expect the server to do?
> >
> > Huh. No, that doesn't sound right. A failed open doesn't confirm the
> > openowner, and I don't think there's any reason to keep around an
> > unconfirmed openowner.
>
> Makes sense to me.
>
> >
> > Out of curiosity, is the client using a different openowner for each
> > open?
>
> Yeah, looks like the client is using a different one for each attempt.

Makes sense. Yes, so it's the obvious problem--we create a new
openowner early in the OPEN processing but don't clean it up on failure.

I'm not sure how to fix it. By the time we fail later on, we've lost
track of whether the openowner is one we created for this open or not.

Well, wait, I suppose in the former case it will still be unconfirmed.
So probably we should just free op_openowner on exit from open if it's
unconfirmed.

--b.
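
One way to picture the leak and the proposed cleanup is a small userspace model. Everything here (struct model_openowner, model_open(), owners_after_grace()) is hypothetical and is not nfsd code. It assumes every OPEN arrives with a new openowner, that opens during the grace period fail, and that the cleanup is applied only on a failed exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified openowner: a confirmed flag and a list link. */
struct model_openowner {
	bool confirmed;
	struct model_openowner *next;
};

static struct model_openowner *owners;	/* stand-in for the owner hash table */

static struct model_openowner *alloc_owner(void)
{
	struct model_openowner *oo = calloc(1, sizeof(*oo));

	if (oo) {
		oo->next = owners;	/* new owners start out unconfirmed */
		owners = oo;
	}
	return oo;
}

static void release_owner(struct model_openowner *oo)
{
	struct model_openowner **pp;

	assert(oo != NULL);
	for (pp = &owners; *pp; pp = &(*pp)->next) {
		if (*pp == oo) {
			*pp = oo->next;	/* unlink, then free */
			free(oo);
			return;
		}
	}
}

static int count_owners(void)
{
	struct model_openowner *oo;
	int n = 0;

	for (oo = owners; oo; oo = oo->next)
		n++;
	return n;
}

/* Model of OPEN: the owner is created early; in_grace makes the open fail. */
static bool model_open(struct model_openowner *existing, bool in_grace,
		       bool cleanup_unconfirmed)
{
	struct model_openowner *oo = existing ? existing : alloc_owner();
	bool ok = !in_grace;

	if (!oo)
		return false;
	/* Proposed cleanup: drop an owner that is still unconfirmed on failure. */
	if (!ok && cleanup_unconfirmed && !oo->confirmed)
		release_owner(oo);
	return ok;
}

/* Replay the reported scenario: N failed opens, then one that succeeds. */
static int owners_after_grace(int failed_opens, bool cleanup_unconfirmed)
{
	int i;

	while (owners)			/* start from an empty table */
		release_owner(owners);
	for (i = 0; i < failed_opens; i++)
		model_open(NULL, true, cleanup_unconfirmed);
	model_open(NULL, false, cleanup_unconfirmed);
	return count_owners();
}
```

With 12 failed opens followed by one success, owners_after_grace(12, false) leaves 13 owners in the model, matching the count reported above, while owners_after_grace(12, true) leaves 1.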

2011-09-28 18:43:04

by Anna Schumaker

[permalink] [raw]
Subject: Re: nfsd grace period open owners

On 09/28/2011 02:23 PM, J. Bruce Fields wrote:
> On Wed, Sep 28, 2011 at 02:06:16PM -0400, Bryan Schumaker wrote:
>> Hi Bruce,
>>
>> I'm updating my fault injection patches to work with your recent stateid changes, and I had a question about something I'm seeing on the server. When I reboot the server it will enter a grace period and reply to any OPEN requests from the client with NFSERR_GRACE. I used Wireshark to see that NFSERR_GRACE is returned 12 times before the open succeeds, so 13 OPEN requests total. When I send the command to forget all open owners I find that 13 open owners have been forgotten, but I was only expecting to find 1. Is this what you would expect the server to do?
>
> Huh. No, that doesn't sound right. A failed open doesn't confirm the
> openowner, and I don't think there's any reason to keep around an
> unconfirmed openowner.

Makes sense to me.

>
> Out of curiosity, is the client using a different openowner for each
> open?

Yeah, looks like the client is using a different one for each attempt.

- Bryan

>
> --b.
>
>>
>> I've pasted the code I use to count and delete open owners below. The variable "num" has the value 0, so all open owners should be deleted.
>>
>> - Bryan
>>
>> static int nfsd_forget_n_openowners(u64 num)
>> {
>> int i, count = 0;
>> struct nfs4_stateowner *sop, *next;
>>
>> for (i = 0; i < OPEN_OWNER_HASH_SIZE; i++) {
>> list_for_each_entry_safe(sop, next, &open_ownerstr_hashtbl[i], so_strhash) {
>> release_openowner(openowner(sop));
>> if (++count == num)
>> return count;
>> }
>> }
>> return count;
>> }
>>
>> void nfsd_forget_openowners(u64 num)
>> {
>> int count;
>>
>> nfs4_lock_state();
>> count = nfsd_forget_n_openowners(num);
>> nfs4_unlock_state();
>>
>> printk(KERN_INFO "%s %s Forgot %d open owners", __FILE__, __func__, count);
>> }