Hi,
I have a customer who is experiencing a problem on an older kernel.
After a server restart (actually a cluster fail-over) the client
recovers all opens and locks correctly, but then sends a (previously
queued) WRITE request with a filehandle/stateid for a LAYOUT that was
provided by the server before the restart. Unsurprisingly this doesn't
work.
I've been hunting through the client code to find where it attempts
to invalidate all layouts as required by section 12.7.4 of RFC 5661,
but I cannot find it.
I'm guessing that adding a call to pnfs_destroy_layout() in
__nfs4_reclaim_open_state() would do the trick but maybe there is a
better way that is already implemented in later kernels, that I cannot
find.
Any pointers?
Thanks,
NeilBrown
On Mon, 20 May 2024, NeilBrown wrote:
>
> Hi,
> I have a customer who is experiencing a problem on an older kernel.
> After a server restart (actually a cluster fail-over) the client
> recovers all opens and locks correctly, but then sends a (previously
> queued) WRITE request with a filehandle/stateid for a LAYOUT that was
> provided by the server before the restart. Unsurprisingly this doesn't
> work.
>
> I've been hunting through the client code to find where it attempts
> to invalidate all layouts as required by section 12.7.4 of RFC 5661,
> but I cannot find it.
>
> I'm guessing that adding a call to pnfs_destroy_layout() in
> __nfs4_reclaim_open_state() would do the trick but maybe there is a
> better way that is already implemented in later kernels, that I cannot
> find.
>
> Any pointers?
I think I found it. nfs4_establish_lease() calls pnfs_destroy_all_layouts().
That call is also present in the old kernel that isn't working, so the
cause must be something else.
Sorry for the noise.
NeilBrown