LinuxLists.cc - [PATCH 1/2] NFS: Fix a bug in nfs_fscache_release

2010-02-05 22:43:15

Subject: [PATCH 1/2] NFS: Fix a bug in nfs_fscache_release_page()

Not having an fscache cookie is perfectly valid if the user didn't mount
with the fscache option.

This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=15234

Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected]
---
fs/nfs/fscache.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index fa58800..534adb8 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -355,11 +355,11 @@ void nfs_fscache_reset_inode_cookie(struct inode *inode)
int nfs_fscache_release_page(struct page *page, gfp_t gfp)
{
struct nfs_inode *nfsi = NFS_I(page->mapping->host);
- struct fscache_cookie *cookie = nfsi->fscache;
-
- BUG_ON(!cookie);

if (PageFsCache(page)) {
+ struct fscache_cookie *cookie = nfsi->fscache;
+
+ BUG_ON(!cookie);
dfprintk(FSCACHE, "NFS: fscache releasepage (0x%p/0x%p/0x%p)\n",
cookie, page, nfsi);

--
1.6.6

2010-02-08 15:02:23

by Myklebust, Trond

[permalink] [raw]

Subject: [PATCH 0/2] Fix for the nfs_release_page() bug

Here are the two proposed patches. The first is needed for both mainline
and [email protected]. The second, being a cleanup, will be send to
mainline only.

Cheers
Trond

---

Trond Myklebust (2):
NFS: Remove a redundant check for PageFsCache in nfs_migrate_page()
NFS: Fix a bug in nfs_fscache_release_page()

fs/nfs/fscache.c | 9 ++++-----
fs/nfs/write.c | 3 +--
2 files changed, 5 insertions(+), 7 deletions(-)

--
Signature

2010-02-05 22:43:15

by Myklebust, Trond

[permalink] [raw]

Subject: [PATCH 2/2] NFS: Fix the mapping of the NFSERR_SERVERFAULT error

It was recently pointed out that the NFSERR_SERVERFAULT error, which is
designed to inform the user of a serious internal error on the server, was
being mapped to an error value that is internal to the kernel.

This patch maps it to the error EREMOTEIO, which is exported to userland
through errno.h.

Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected]
---
fs/nfs/mount_clnt.c | 2 +-
fs/nfs/nfs2xdr.c | 2 +-
fs/nfs/nfs4xdr.c | 6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/mount_clnt.c b/fs/nfs/mount_clnt.c
index 0adefc4..59047f8 100644
--- a/fs/nfs/mount_clnt.c
+++ b/fs/nfs/mount_clnt.c
@@ -120,7 +120,7 @@ static struct {
{ .status = MNT3ERR_INVAL, .errno = -EINVAL, },
{ .status = MNT3ERR_NAMETOOLONG, .errno = -ENAMETOOLONG, },
{ .status = MNT3ERR_NOTSUPP, .errno = -ENOTSUPP, },
- { .status = MNT3ERR_SERVERFAULT, .errno = -ESERVERFAULT, },
+ { .status = MNT3ERR_SERVERFAULT, .errno = -EREMOTEIO, },
};

struct mountres {
diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
index 5e078b2..7bc2da8 100644
--- a/fs/nfs/nfs2xdr.c
+++ b/fs/nfs/nfs2xdr.c
@@ -699,7 +699,7 @@ static struct {
{ NFSERR_BAD_COOKIE, -EBADCOOKIE },
{ NFSERR_NOTSUPP, -ENOTSUPP },
{ NFSERR_TOOSMALL, -ETOOSMALL },
- { NFSERR_SERVERFAULT, -ESERVERFAULT },
+ { NFSERR_SERVERFAULT, -EREMOTEIO },
{ NFSERR_BADTYPE, -EBADTYPE },
{ NFSERR_JUKEBOX, -EJUKEBOX },
{ -1, -EIO }
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index e437fd6..5cd5184 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -4631,7 +4631,7 @@ static int decode_sequence(struct xdr_stream *xdr,
* If the server returns different values for sessionID, slotID or
* sequence number, the server is looney tunes.
*/
- status = -ESERVERFAULT;
+ status = -EREMOTEIO;

if (memcmp(id.data, res->sr_session->sess_id.data,
NFS4_MAX_SESSIONID_LEN)) {
@@ -5774,7 +5774,7 @@ static struct {
{ NFS4ERR_BAD_COOKIE, -EBADCOOKIE },
{ NFS4ERR_NOTSUPP, -ENOTSUPP },
{ NFS4ERR_TOOSMALL, -ETOOSMALL },
- { NFS4ERR_SERVERFAULT, -ESERVERFAULT },
+ { NFS4ERR_SERVERFAULT, -EREMOTEIO },
{ NFS4ERR_BADTYPE, -EBADTYPE },
{ NFS4ERR_LOCKED, -EAGAIN },
{ NFS4ERR_SYMLINK, -ELOOP },
@@ -5801,7 +5801,7 @@ nfs4_stat_to_errno(int stat)
}
if (stat <= 10000 || stat > 10100) {
/* The server is looney tunes. */
- return -ESERVERFAULT;
+ return -EREMOTEIO;
}
/* If we cannot translate the error, the recovery routines should
* handle it.
--
1.6.6

2010-02-05 23:14:24

by Chuck Lever

[permalink] [raw]

Subject: Re: [PATCH 2/2] NFS: Fix the mapping of the NFSERR_SERVERFAULT error

On 02/05/2010 05:43 PM, Trond Myklebust wrote:
> It was recently pointed out that the NFSERR_SERVERFAULT error, which is
> designed to inform the user of a serious internal error on the server, was
> being mapped to an error value that is internal to the kernel.
>
> This patch maps it to the error EREMOTEIO, which is exported to userland
> through errno.h.
>
> Signed-off-by: Trond Myklebust<[email protected]>
> Cc: [email protected]
> ---
> fs/nfs/mount_clnt.c | 2 +-
> fs/nfs/nfs2xdr.c | 2 +-
> fs/nfs/nfs4xdr.c | 6 +++---
> 3 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfs/mount_clnt.c b/fs/nfs/mount_clnt.c
> index 0adefc4..59047f8 100644
> --- a/fs/nfs/mount_clnt.c
> +++ b/fs/nfs/mount_clnt.c
> @@ -120,7 +120,7 @@ static struct {
> { .status = MNT3ERR_INVAL, .errno = -EINVAL, },
> { .status = MNT3ERR_NAMETOOLONG, .errno = -ENAMETOOLONG, },
> { .status = MNT3ERR_NOTSUPP, .errno = -ENOTSUPP, },
> - { .status = MNT3ERR_SERVERFAULT, .errno = -ESERVERFAULT, },
> + { .status = MNT3ERR_SERVERFAULT, .errno = -EREMOTEIO, },
> };
>
> struct mountres {

The decode_status() and decode_fhs_status() functions return -EACCES if
they don't recognize the server's error code. Should they return
-EREMOTEIO instead?

Otherwise, looks good.

> diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
> index 5e078b2..7bc2da8 100644
> --- a/fs/nfs/nfs2xdr.c
> +++ b/fs/nfs/nfs2xdr.c
> @@ -699,7 +699,7 @@ static struct {
> { NFSERR_BAD_COOKIE, -EBADCOOKIE },
> { NFSERR_NOTSUPP, -ENOTSUPP },
> { NFSERR_TOOSMALL, -ETOOSMALL },
> - { NFSERR_SERVERFAULT, -ESERVERFAULT },
> + { NFSERR_SERVERFAULT, -EREMOTEIO },
> { NFSERR_BADTYPE, -EBADTYPE },
> { NFSERR_JUKEBOX, -EJUKEBOX },
> { -1, -EIO }
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index e437fd6..5cd5184 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -4631,7 +4631,7 @@ static int decode_sequence(struct xdr_stream *xdr,
> * If the server returns different values for sessionID, slotID or
> * sequence number, the server is looney tunes.
> */
> - status = -ESERVERFAULT;
> + status = -EREMOTEIO;
>
> if (memcmp(id.data, res->sr_session->sess_id.data,
> NFS4_MAX_SESSIONID_LEN)) {
> @@ -5774,7 +5774,7 @@ static struct {
> { NFS4ERR_BAD_COOKIE, -EBADCOOKIE },
> { NFS4ERR_NOTSUPP, -ENOTSUPP },
> { NFS4ERR_TOOSMALL, -ETOOSMALL },
> - { NFS4ERR_SERVERFAULT, -ESERVERFAULT },
> + { NFS4ERR_SERVERFAULT, -EREMOTEIO },
> { NFS4ERR_BADTYPE, -EBADTYPE },
> { NFS4ERR_LOCKED, -EAGAIN },
> { NFS4ERR_SYMLINK, -ELOOP },
> @@ -5801,7 +5801,7 @@ nfs4_stat_to_errno(int stat)
> }
> if (stat<= 10000 || stat> 10100) {
> /* The server is looney tunes. */
> - return -ESERVERFAULT;
> + return -EREMOTEIO;
> }
> /* If we cannot translate the error, the recovery routines should
> * handle it.

--
chuck[dot]lever[at]oracle[dot]com

2010-02-07 11:26:24

by Suresh Jayaraman

[permalink] [raw]

Subject: Re: [PATCH 1/2] NFS: Fix a bug in nfs_fscache_release_page()

On 02/06/2010 04:13 AM, Trond Myklebust wrote:
> Not having an fscache cookie is perfectly valid if the user didn't mount
> with the fscache option.
>
> This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=15234
>
> Signed-off-by: Trond Myklebust <[email protected]>
> Cc: [email protected]
> ---
> fs/nfs/fscache.c | 6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> index fa58800..534adb8 100644
> --- a/fs/nfs/fscache.c
> +++ b/fs/nfs/fscache.c
> @@ -355,11 +355,11 @@ void nfs_fscache_reset_inode_cookie(struct inode *inode)
> int nfs_fscache_release_page(struct page *page, gfp_t gfp)
> {
> struct nfs_inode *nfsi = NFS_I(page->mapping->host);
> - struct fscache_cookie *cookie = nfsi->fscache;
> -
> - BUG_ON(!cookie);
>
> if (PageFsCache(page)) {
> + struct fscache_cookie *cookie = nfsi->fscache;
> +
> + BUG_ON(!cookie);
> dfprintk(FSCACHE, "NFS: fscache releasepage (0x%p/0x%p/0x%p)\n",
> cookie, page, nfsi);
>

There are only two callers for nfs_fscache_release_page() -
nfs_release_page() and nfs_migrate_page(). nfs_migrate_page already does
this:

if (PageFsCache(page))
nfs_fscache_release_page(page, GFP_KERNEL);

and the assumption in nfs_release_page() is that the page should have
either PG_private set or PG_fscache set and nfs_fscache_release_page
gets called only if PG_private is not set.

I think the idea is that nfs_fscache_release_page should not get called
if fsc option is not used. So it appears to me this patch is fixing the
symptom not the actual issue. Perhaps, this the assumption in
nfs_release_page is wrong or the PageFsCache() check should be moved to
nfs_release_page?

Thanks,

--
Suresh Jayaraman

2010-02-08 13:24:08

by Myklebust, Trond

[permalink] [raw]

Subject: Re: [PATCH 1/2] NFS: Fix a bug in nfs_fscache_release_page()

On Sun, 2010-02-07 at 16:56 +0530, Suresh Jayaraman wrote:
> On 02/06/2010 04:13 AM, Trond Myklebust wrote:
> > Not having an fscache cookie is perfectly valid if the user didn't mount
> > with the fscache option.
> >
> > This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=15234
> >
> > Signed-off-by: Trond Myklebust <[email protected]>
> > Cc: [email protected]
> > ---
> > fs/nfs/fscache.c | 6 +++---
> > 1 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
> > index fa58800..534adb8 100644
> > --- a/fs/nfs/fscache.c
> > +++ b/fs/nfs/fscache.c
> > @@ -355,11 +355,11 @@ void nfs_fscache_reset_inode_cookie(struct inode *inode)
> > int nfs_fscache_release_page(struct page *page, gfp_t gfp)
> > {
> > struct nfs_inode *nfsi = NFS_I(page->mapping->host);
> > - struct fscache_cookie *cookie = nfsi->fscache;
> > -
> > - BUG_ON(!cookie);
> >
> > if (PageFsCache(page)) {
> > + struct fscache_cookie *cookie = nfsi->fscache;
> > +
> > + BUG_ON(!cookie);
> > dfprintk(FSCACHE, "NFS: fscache releasepage (0x%p/0x%p/0x%p)\n",
> > cookie, page, nfsi);
> >
>
> There are only two callers for nfs_fscache_release_page() -
> nfs_release_page() and nfs_migrate_page(). nfs_migrate_page already does
> this:
>
> if (PageFsCache(page))
> nfs_fscache_release_page(page, GFP_KERNEL);
>
> and the assumption in nfs_release_page() is that the page should have
> either PG_private set or PG_fscache set and nfs_fscache_release_page
> gets called only if PG_private is not set.

...or if it gets cleared.

> I think the idea is that nfs_fscache_release_page should not get called
> if fsc option is not used. So it appears to me this patch is fixing the
> symptom not the actual issue. Perhaps, this the assumption in
> nfs_release_page is wrong or the PageFsCache() check should be moved to
> nfs_release_page?

No. We should rather get rid of the redundant check for PageFsCache() in
nfs_migrate_page. PageFsCache() is particular to fscache, so the test
belongs in the fscache code.

Trond

2010-02-08 14:51:23

by Myklebust, Trond

[permalink] [raw]

Subject: Re: [PATCH 1/2] NFS: Fix a bug in nfs_fscache_release_page()

On Mon, 2010-02-08 at 08:23 -0500, Trond Myklebust wrote:
> On Sun, 2010-02-07 at 16:56 +0530, Suresh Jayaraman wrote:
> > There are only two callers for nfs_fscache_release_page() -
> > nfs_release_page() and nfs_migrate_page(). nfs_migrate_page already does
> > this:
> >
> > if (PageFsCache(page))
> > nfs_fscache_release_page(page, GFP_KERNEL);
> >
> > and the assumption in nfs_release_page() is that the page should have
> > either PG_private set or PG_fscache set and nfs_fscache_release_page
> > gets called only if PG_private is not set.
>
> ...or if it gets cleared.

To be more precise, even before we put call to nfs_wb_page() in
nfs_release_page(), it was possible for the PG_private bit to be set
when doing the test in shrink_page_list(), but for an outstanding commit
operation to complete before the second test in nfs_release_page.

In this case, nfs_fscache_release_page would get called with neither
PG_private nor PG_fscache being set, and the Oops could occur.

> > I think the idea is that nfs_fscache_release_page should not get called
> > if fsc option is not used. So it appears to me this patch is fixing the
> > symptom not the actual issue. Perhaps, this the assumption in
> > nfs_release_page is wrong or the PageFsCache() check should be moved to
> > nfs_release_page?
>
> No. We should rather get rid of the redundant check for PageFsCache() in
> nfs_migrate_page. PageFsCache() is particular to fscache, so the test
> belongs in the fscache code.

I've added a cleanup patch (which will not go to [email protected]) to
do this.

Trond

2010-02-08 15:13:32

by Myklebust, Trond

[permalink] [raw]

Subject: Re: [PATCH 2/2] NFS: Fix the mapping of the NFSERR_SERVERFAULT error

On Fri, 2010-02-05 at 18:12 -0500, Chuck Lever wrote:
> On 02/05/2010 05:43 PM, Trond Myklebust wrote:
> > It was recently pointed out that the NFSERR_SERVERFAULT error, which is
> > designed to inform the user of a serious internal error on the server, was
> > being mapped to an error value that is internal to the kernel.
> >
> > This patch maps it to the error EREMOTEIO, which is exported to userland
> > through errno.h.
> >
> > Signed-off-by: Trond Myklebust<[email protected]>
> > Cc: [email protected]
> > ---
> > fs/nfs/mount_clnt.c | 2 +-
> > fs/nfs/nfs2xdr.c | 2 +-
> > fs/nfs/nfs4xdr.c | 6 +++---
> > 3 files changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/nfs/mount_clnt.c b/fs/nfs/mount_clnt.c
> > index 0adefc4..59047f8 100644
> > --- a/fs/nfs/mount_clnt.c
> > +++ b/fs/nfs/mount_clnt.c
> > @@ -120,7 +120,7 @@ static struct {
> > { .status = MNT3ERR_INVAL, .errno = -EINVAL, },
> > { .status = MNT3ERR_NAMETOOLONG, .errno = -ENAMETOOLONG, },
> > { .status = MNT3ERR_NOTSUPP, .errno = -ENOTSUPP, },
> > - { .status = MNT3ERR_SERVERFAULT, .errno = -ESERVERFAULT, },
> > + { .status = MNT3ERR_SERVERFAULT, .errno = -EREMOTEIO, },
> > };
> >
> > struct mountres {
>
> The decode_status() and decode_fhs_status() functions return -EACCES if
> they don't recognize the server's error code. Should they return
> -EREMOTEIO instead?

It might possibly be a bit more enlightening to the user, but that would
be a separate patch. I wouldn't see that fix as being particularly
critical: EACCES is a valid error code that can be returned to userland.

Cheers
Trond

2010-02-08 16:34:05

by Suresh Jayaraman

[permalink] [raw]

Subject: Re: [PATCH 1/2] NFS: Fix a bug in nfs_fscache_release_page()

On 02/08/2010 08:20 PM, Trond Myklebust wrote:
> On Mon, 2010-02-08 at 08:23 -0500, Trond Myklebust wrote:
>> On Sun, 2010-02-07 at 16:56 +0530, Suresh Jayaraman wrote:
>>> There are only two callers for nfs_fscache_release_page() -
>>> nfs_release_page() and nfs_migrate_page(). nfs_migrate_page already does
>>> this:
>>>
>>> if (PageFsCache(page))
>>> nfs_fscache_release_page(page, GFP_KERNEL);
>>>
>>> and the assumption in nfs_release_page() is that the page should have
>>> either PG_private set or PG_fscache set and nfs_fscache_release_page
>>> gets called only if PG_private is not set.
>>
>> ...or if it gets cleared.
>
> To be more precise, even before we put call to nfs_wb_page() in
> nfs_release_page(), it was possible for the PG_private bit to be set

Yes, I have seen a similar bug report before we added nfs_wb_page on a
2.6.32 kernel too.

> when doing the test in shrink_page_list(), but for an outstanding commit
> operation to complete before the second test in nfs_release_page.
>
> In this case, nfs_fscache_release_page would get called with neither
> PG_private nor PG_fscache being set, and the Oops could occur.
>

We seem to ensure that we're holding a page lock in try_to_release_page.
Even if the outstanding commit is complete by the time we are in
nfs_releage_page, page flags should not have been modified, right?

>>> I think the idea is that nfs_fscache_release_page should not get called
>>> if fsc option is not used. So it appears to me this patch is fixing the
>>> symptom not the actual issue. Perhaps, this the assumption in
>>> nfs_release_page is wrong or the PageFsCache() check should be moved to
>>> nfs_release_page?
>>
>> No. We should rather get rid of the redundant check for PageFsCache() in
>> nfs_migrate_page. PageFsCache() is particular to fscache, so the test
>> belongs in the fscache code.

yes, make sense.

Thanks,

--
Suresh Jayaraman

2010-02-08 16:40:44

by Myklebust, Trond

[permalink] [raw]

Subject: Re: [PATCH 1/2] NFS: Fix a bug in nfs_fscache_release_page()

On Mon, 2010-02-08 at 22:03 +0530, Suresh Jayaraman wrote:
> We seem to ensure that we're holding a page lock in try_to_release_page.
> Even if the outstanding commit is complete by the time we are in
> nfs_releage_page, page flags should not have been modified, right?

No. PG_private may be cleared while another process is holding the page
lock (just like PG_writeback may).

Once cleared, PG_private cannot be set again without grabbing the page
lock.

Trond

2010-02-09 06:26:22

by Suresh Jayaraman

[permalink] [raw]

Subject: Re: [PATCH 1/2] NFS: Fix a bug in nfs_fscache_release_page()

On 02/08/2010 10:09 PM, Trond Myklebust wrote:
> On Mon, 2010-02-08 at 22:03 +0530, Suresh Jayaraman wrote:
>> We seem to ensure that we're holding a page lock in try_to_release_page.
>> Even if the outstanding commit is complete by the time we are in
>> nfs_releage_page, page flags should not have been modified, right?
>
> No. PG_private may be cleared while another process is holding the page
> lock (just like PG_writeback may).
>
> Once cleared, PG_private cannot be set again without grabbing the page
> lock.
>

Ah, ok. I got stuck there. Thanks for the clarifying.

--
Suresh Jayaraman