2022-04-06 21:24:12

by J. Bruce Fields

Subject: sporadic hangs on generic/186

In the last couple days I've started getting hangs on xfstests
generic/186 on upstream. I also notice the test completes after 10+
hours (usually it takes about 5 minutes). Sometimes this is accompanied
by "nfs: RPC call returned error 12" on the client.

Test description is:

# Ensuring that copy on write in buffered mode works when free space
# is heavily fragmented.
# - Create two files
# - Reflink the odd blocks of the first file into a third file.
# - Reflink the even blocks of the second file into the third file.
# - Try to fragment the free space by allocating a huge file and
# punching out every other block.
# - CoW across the halfway mark.
# - Check that the files are now different where we say they're
# different.

so maybe it's really some xfs change, I don't know. Or maybe it's a
problem with my particular test filesystem (which doesn't get recreated
for each test run).

The problem doesn't reproduce easily enough to bisect.

I may just turn off that test for now.

--b.


2022-04-07 01:08:05

by NeilBrown

Subject: Re: sporadic hangs on generic/186

On Thu, 07 Apr 2022, Dave Chinner wrote:
> On Wed, Apr 06, 2022 at 03:54:24PM -0400, J. Bruce Fields wrote:
> > In the last couple days I've started getting hangs on xfstests
> > generic/186 on upstream. I also notice the test completes after 10+
> > hours (usually it takes about 5 minutes). Sometimes this is accompanied
> > by "nfs: RPC call returned error 12" on the client.
>
> #define ENOMEM 12 /* Out of memory */
>
> So either the client or the server is running out of memory
> somewhere?

Probably the client. There are a bunch of changes recently which add
__GFP_NORETRY to memory allocations from PF_WQ_WORKERs because that can
result in deadlocks when swapping over NFS.
This means that kmalloc requests that previously never failed (because
GFP_KERNEL never fails for kernel threads, I think) can now fail. This
has tickled one bug that I know of. There are likely to be more.
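
For reference, the shape of that change is roughly the following (a hedged
sketch based on the description above and on rpc_task_gfp_mask() as it
appears in the patches later in this thread, not a quote of any actual
commit):

#include <linux/sched.h>
#include <linux/gfp.h>

/*
 * Sketch: allocations made from a PF_WQ_WORKER (e.g. rpciod) get
 * __GFP_NORETRY so they fail quickly instead of deadlocking when
 * swapping over NFS; everyone else keeps plain GFP_KERNEL.
 */
static gfp_t sunrpc_alloc_mask_sketch(void)
{
	if (current->flags & PF_WQ_WORKER)
		return GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
	return GFP_KERNEL;
}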

The RPC code should simply retry these allocations after a short delay.
HZ/4 is the number that is used in a couple of places. Possibly there
are more places that need to handle -ENOMEM with rpc_delay().
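
Concretely, the retry pattern I mean is something like this (an
illustrative sketch only; call_example/call_example_status are placeholder
names, not functions in net/sunrpc/clnt.c):

#include <linux/sunrpc/sched.h>

static void call_example(struct rpc_task *task);	/* placeholder state */

/*
 * Sketch: on a transient -ENOMEM, pause the task for HZ/4 and re-run the
 * previous state instead of failing the whole RPC.
 */
static void call_example_status(struct rpc_task *task)
{
	if (task->tk_status == -ENOMEM) {
		task->tk_status = 0;
		task->tk_action = call_example;
		rpc_delay(task, HZ >> 2);	/* i.e. HZ/4 */
		return;
	}
	/* other tk_status handling elided */
}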

NeilBrown

2022-04-07 01:19:49

by Dave Chinner

Subject: Re: sporadic hangs on generic/186

On Wed, Apr 06, 2022 at 03:54:24PM -0400, J. Bruce Fields wrote:
> In the last couple days I've started getting hangs on xfstests
> generic/186 on upstream. I also notice the test completes after 10+
> hours (usually it takes about 5 minutes). Sometimes this is accompanied
> by "nfs: RPC call returned error 12" on the client.

#define ENOMEM 12 /* Out of memory */

So either the client or the server is running out of memory
somewhere?

> Test description is:
>
> # Ensuring that copy on write in buffered mode works when free space
> # is heavily fragmented.
> # - Create two files
> # - Reflink the odd blocks of the first file into a third file.
> # - Reflink the even blocks of the second file into the third file.
> # - Try to fragment the free space by allocating a huge file and
> # punching out every other block.
> # - CoW across the halfway mark.
> # - Check that the files are now different where we say they're
> # different.

It's using clone_file_range and hole punch, so I'm assuming that
this is nfsv4 client side testing? In which case, the behaviour is
going to be largely dependent on the server side implementation of
these functions? Does reverting the client or the server to an older
kernel resolve the issue?
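
For reference, the two operations the test leans on look roughly like this
from the client side (a hedged userspace sketch, not the test's actual
code); over NFSv4.2 the clone maps to the CLONE operation and the hole
punch to DEALLOCATE, so both depend on what the server implements:

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/* Reflink a range of src into dst at the same offset (clone_file_range). */
static int reflink_range(int src_fd, int dst_fd, off_t off, off_t len)
{
	struct file_clone_range fcr = {
		.src_fd      = src_fd,
		.src_offset  = off,
		.src_length  = len,
		.dest_offset = off,
	};

	return ioctl(dst_fd, FICLONERANGE, &fcr);
}

/* Punch out one block to fragment free space, keeping the file size. */
static int punch_block(int fd, off_t off, off_t len)
{
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 off, len);
}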

Cheers,

Dave.
--
Dave Chinner
[email protected]

2022-04-07 01:22:55

by Chuck Lever III

Subject: Re: sporadic hangs on generic/186



> On Apr 6, 2022, at 3:54 PM, J. Bruce Fields <[email protected]> wrote:
>
> In the last couple days I've started getting hangs on xfstests
> generic/186 on upstream. I also notice the test completes after 10+
> hours (usually it takes about 5 minutes). Sometimes this is accompanied
> by "nfs: RPC call returned error 12" on the client.
>
> Test description is:
>
> # Ensuring that copy on write in buffered mode works when free space
> # is heavily fragmented.
> # - Create two files
> # - Reflink the odd blocks of the first file into a third file.
> # - Reflink the even blocks of the second file into the third file.
> # - Try to fragment the free space by allocating a huge file and
> # punching out every other block.
> # - CoW across the halfway mark.
> # - Check that the files are now different where we say they're
> # different.
>
> so maybe it's really some xfs change, I don't know. Or maybe it's a
> problem with my particular test filesystem (which doesn't get recreated
> for each test run).
>
> The problem doesn't reproduce easily enough to bisect.
>
> I may just turn off that test for now.

Thanks for the report.

I don't have a SCRATCH_DEV so generic/186 isn't run on my systems.


--
Chuck Lever



2022-04-07 04:43:45

by J. Bruce Fields

Subject: Re: sporadic hangs on generic/186

On Thu, Apr 07, 2022 at 11:19:34AM +1000, NeilBrown wrote:
> I had a look through the various places where alloc can now fail.
>
> I think xdr_alloc_bvec() in xprt_sent_pagedata() is the most likely
> cause of a problem here. I don't think an -ENOMEM from there is caught,
> so it could likely filter up to NFS and result in the message you got.
>
> I don't think we can easily handle failure there. We need to stay with
> GFP_KERNEL rely on PF_MEMALLOC to make forward progress for
> swap-over-NFS.
>
> Bruce: can you change that one line back to GFP_KERNEL and see if the
> problem goes away?

Like this? Sure--might take me a day or two to run the tests and get
results back.--b.

diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
index 05b38bf68316..506627dc9a0f 100644
--- a/net/sunrpc/socklib.c
+++ b/net/sunrpc/socklib.c
@@ -223,7 +223,7 @@ static int xprt_send_pagedata(struct socket *sock, struct msghdr *msg,
 {
 	int err;
 
-	err = xdr_alloc_bvec(xdr, rpc_task_gfp_mask());
+	err = xdr_alloc_bvec(xdr, GFP_KERNEL);
 	if (err < 0)
 		return err;
 
2022-04-07 06:37:07

by NeilBrown

Subject: Re: sporadic hangs on generic/186

On Thu, 07 Apr 2022, J. Bruce Fields wrote:
> On Thu, Apr 07, 2022 at 11:19:34AM +1000, NeilBrown wrote:
> > I had a look through the various places where alloc can now fail.
> >
> > I think xdr_alloc_bvec() in xprt_sent_pagedata() is the most likely
> > cause of a problem here. I don't think an -ENOMEM from there is caught,
> > so it could likely filter up to NFS and result in the message you got.
> >
> > I don't think we can easily handle failure there. We need to stay with
> > GFP_KERNEL rely on PF_MEMALLOC to make forward progress for
> > swap-over-NFS.
> >
> > Bruce: can you change that one line back to GFP_KERNEL and see if the
> > problem goes away?
>
> Like this? Sure--might take me a day or two to run the tests and get
> results back.--b.
>
> diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
> index 05b38bf68316..506627dc9a0f 100644
> --- a/net/sunrpc/socklib.c
> +++ b/net/sunrpc/socklib.c
> @@ -223,7 +223,7 @@ static int xprt_send_pagedata(struct socket *sock, struct msghdr *msg,
> {
> int err;
>
> - err = xdr_alloc_bvec(xdr, rpc_task_gfp_mask());
> + err = xdr_alloc_bvec(xdr, GFP_KERNEL);
> if (err < 0)
> return err;
>
>

That looks right.

I instrumented my kernel to deliberately fail 10% of the time, and I got
lots of

nfs: RPC call returned error 12

so I'm fairly sure this explains that message.
But you say the hangs were only occasionally accompanied by the message,
so it probably doesn't explain the hangs.
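
Something along these lines is enough to trigger that message (a hedged
sketch of the sort of instrumentation involved, not the exact hack; the
helper name is made up):

#include <linux/types.h>
#include <linux/prandom.h>

/* Sketch: make roughly 1 in 10 of the instrumented allocations fail. */
static bool inject_alloc_failure(void)
{
	return prandom_u32() % 10 == 0;
}

/*
 * Hand-placed in xdr_alloc_bvec() (net/sunrpc/xdr.c), before the real
 * allocation, for testing only:
 *
 *	if (inject_alloc_failure())
 *		return -ENOMEM;
 */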

NeilBrown

2022-04-07 07:39:25

by Trond Myklebust

Subject: Re: sporadic hangs on generic/186

On Thu, 2022-04-07 at 11:19 +1000, NeilBrown wrote:
> On Thu, 07 Apr 2022, NeilBrown wrote:
> > On Thu, 07 Apr 2022, Dave Chinner wrote:
> > > On Wed, Apr 06, 2022 at 03:54:24PM -0400, J. Bruce Fields wrote:
> > > > In the last couple days I've started getting hangs on xfstests
> > > > generic/186 on upstream.  I also notice the test completes
> > > > after 10+
> > > > hours (usually it takes about 5 minutes).  Sometimes this is
> > > > accompanied
> > > > by "nfs: RPC call returned error 12" on the client.
> > >
> > > #define ENOMEM          12      /* Out of memory */
> > >
> > > So either the client or the server is running out of memory
> > > somewhere?
> >
> > Probably the client.  There are a bunch of changes recently which
> > add
> > __GFP_NORETRY to memory allocations from PF_WQ_WORKERs because that
> > can
> > result in deadlocks when swapping over NFS.
> > This means that kmalloc request that previously never failed
> > (because
> > GFP_KERNEL never fails for kernel threads I think) can now fail. 
> > This
> > has tickled one bug that I know of.  There are likely to be more.
> >
> > The RPC code should simply retry these allocations after a short
> > delay.
> > HZ/4 is the number that is used in a couple of places.  Possibly
> > there
> > are more places that need to handle -ENOMEM with rpc_delay().
>
> I had a look through the various places where alloc can now fail.
>
> I think xdr_alloc_bvec() in xprt_sent_pagedata() is the most likely
> cause of a problem here.  I don't think an -ENOMEM from there is
> caught,
> so it could likely filter up to NFS and result in the message you
> got.
>
> I don't think we can easily handle failure there.  We need to stay
> with
> GFP_KERNEL rely on PF_MEMALLOC to make forward progress for
> swap-over-NFS.
>
> Bruce: can you change that one line back to GFP_KERNEL and see if the
> problem goes away?
>

Can we please just move the call to xdr_alloc_bvec() out of
xprt_send_pagedata()? Move the client side allocation into
xs_stream_prepare_request() and xs_udp_send_request(), then move the
server side allocation into svc_udp_sendto().

That makes it possible to handle errors.

> The other problem I found is that rpc_alloc_task() can now fail, but
> rpc_new_task assumes that it never will.  If it does, then we get a
> NULL
> deref.
>
> I don't think rpc_new_task() can ever be called from the rpciod work
> queue, so it is safe to just use a mempool with GFP_KERNEL like we
> did
> before.
>
No. We shouldn't ever use mempools with GFP_KERNEL.

Most, if not all of the callers of rpc_run_task() are still capable of
dealing with errors, and ditto for the callers of rpc_run_bc_task().

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2022-04-07 09:00:42

by NeilBrown

Subject: Re: sporadic hangs on generic/186

On Thu, 07 Apr 2022, NeilBrown wrote:
> On Thu, 07 Apr 2022, Dave Chinner wrote:
> > On Wed, Apr 06, 2022 at 03:54:24PM -0400, J. Bruce Fields wrote:
> > > In the last couple days I've started getting hangs on xfstests
> > > generic/186 on upstream. I also notice the test completes after 10+
> > > hours (usually it takes about 5 minutes). Sometimes this is accompanied
> > > by "nfs: RPC call returned error 12" on the client.
> >
> > #define ENOMEM 12 /* Out of memory */
> >
> > So either the client or the server is running out of memory
> > somewhere?
>
> Probably the client. There are a bunch of changes recently which add
> __GFP_NORETRY to memory allocations from PF_WQ_WORKERs because that can
> result in deadlocks when swapping over NFS.
> This means that kmalloc request that previously never failed (because
> GFP_KERNEL never fails for kernel threads I think) can now fail. This
> has tickled one bug that I know of. There are likely to be more.
>
> The RPC code should simply retry these allocations after a short delay.
> HZ/4 is the number that is used in a couple of places. Possibly there
> are more places that need to handle -ENOMEM with rpc_delay().

I had a look through the various places where alloc can now fail.

I think xdr_alloc_bvec() in xprt_send_pagedata() is the most likely
cause of a problem here. I don't think an -ENOMEM from there is caught,
so it could likely filter up to NFS and result in the message you got.

I don't think we can easily handle failure there. We need to stay with
GFP_KERNEL and rely on PF_MEMALLOC to make forward progress for
swap-over-NFS.

Bruce: can you change that one line back to GFP_KERNEL and see if the
problem goes away?

The other problem I found is that rpc_alloc_task() can now fail, but
rpc_new_task assumes that it never will. If it does, then we get a NULL
deref.

I don't think rpc_new_task() can ever be called from the rpciod work
queue, so it is safe to just use a mempool with GFP_KERNEL like we did
before.
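
Handling that would need something like the following guard in
rpc_new_task() (a sketch of the kind of change required, not a tested
patch):

struct rpc_task *rpc_new_task(const struct rpc_task_setup *setup_data)
{
	struct rpc_task *task;

	task = rpc_alloc_task();
	if (task == NULL) {
		/* Don't dereference NULL: drop the calldata and report
		 * the failure to the caller instead. */
		rpc_release_calldata(setup_data->callback_ops,
				     setup_data->callback_data);
		return ERR_PTR(-ENOMEM);
	}

	rpc_init_task(task, setup_data);
	task->tk_flags |= RPC_TASK_DYNAMIC;
	return task;
}

Callers of rpc_new_task() would then have to cope with an ERR_PTR() return.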

NeilBrown

2022-04-07 15:45:38

by NeilBrown

Subject: Re: sporadic hangs on generic/186

On Thu, 07 Apr 2022, Trond Myklebust wrote:
> On Thu, 2022-04-07 at 11:19 +1000, NeilBrown wrote:
> > On Thu, 07 Apr 2022, NeilBrown wrote:
> > > On Thu, 07 Apr 2022, Dave Chinner wrote:
> > > > On Wed, Apr 06, 2022 at 03:54:24PM -0400, J. Bruce Fields wrote:
> > > > > In the last couple days I've started getting hangs on xfstests
> > > > > generic/186 on upstream.  I also notice the test completes
> > > > > after 10+
> > > > > hours (usually it takes about 5 minutes).  Sometimes this is
> > > > > accompanied
> > > > > by "nfs: RPC call returned error 12" on the client.
> > > >
> > > > #define ENOMEM          12      /* Out of memory */
> > > >
> > > > So either the client or the server is running out of memory
> > > > somewhere?
> > >
> > > Probably the client.  There are a bunch of changes recently which
> > > add
> > > __GFP_NORETRY to memory allocations from PF_WQ_WORKERs because that
> > > can
> > > result in deadlocks when swapping over NFS.
> > > This means that kmalloc request that previously never failed
> > > (because
> > > GFP_KERNEL never fails for kernel threads I think) can now fail. 
> > > This
> > > has tickled one bug that I know of.  There are likely to be more.
> > >
> > > The RPC code should simply retry these allocations after a short
> > > delay.
> > > HZ/4 is the number that is used in a couple of places.  Possibly
> > > there
> > > are more places that need to handle -ENOMEM with rpc_delay().
> >
> > I had a look through the various places where alloc can now fail.
> >
> > I think xdr_alloc_bvec() in xprt_sent_pagedata() is the most likely
> > cause of a problem here.  I don't think an -ENOMEM from there is
> > caught,
> > so it could likely filter up to NFS and result in the message you
> > got.
> >
> > I don't think we can easily handle failure there.  We need to stay
> > with
> > GFP_KERNEL rely on PF_MEMALLOC to make forward progress for
> > swap-over-NFS.
> >
> > Bruce: can you change that one line back to GFP_KERNEL and see if the
> > problem goes away?
> >
>
> Can we please just move the call to xdr_alloc_bvec() out of
> xprt_send_pagedata()? Move the client side allocation into
> xs_stream_prepare_request() and xs_udp_send_request(), then move the
> server side allocation into svc_udp_sendto().
>
> That makes it possible to handle errors.

Like the below I guess. Seems sensible, but I don't know the code well
enough to really review it.

>
> > The other problem I found is that rpc_alloc_task() can now fail, but
> > rpc_new_task assumes that it never will.  If it does, then we get a
> > NULL
> > deref.
> >
> > I don't think rpc_new_task() can ever be called from the rpciod work
> > queue, so it is safe to just use a mempool with GFP_KERNEL like we
> > did
> > before.
> >
> No. We shouldn't ever use mempools with GFP_KERNEL.

Why not? mempools with GFP_KERNEL make perfect sense outside of the
rpciod and nfsiod threads.

>
> Most, if not all of the callers of rpc_run_task() are still capable of
> dealing with errors, and ditto for the callers of rpc_run_bc_task().

Yes, they can deal with errors.  But in many cases they do so by passing
the error up the call stack, so we could start getting ENOMEM for
system calls like stat().  I don't think that is a good idea.

Thanks,
NeilBrown


>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]
>
>
>

diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
index 05b38bf68316..71ba4cf513bc 100644
--- a/net/sunrpc/socklib.c
+++ b/net/sunrpc/socklib.c
@@ -221,12 +221,6 @@ static int xprt_send_kvec(struct socket *sock, struct msghdr *msg,
 static int xprt_send_pagedata(struct socket *sock, struct msghdr *msg,
			       struct xdr_buf *xdr, size_t base)
 {
-	int err;
-
-	err = xdr_alloc_bvec(xdr, rpc_task_gfp_mask());
-	if (err < 0)
-		return err;
-
 	iov_iter_bvec(&msg->msg_iter, WRITE, xdr->bvec, xdr_buf_pagecount(xdr),
		      xdr->page_len + xdr->page_base);
 	return xprt_sendmsg(sock, msg, base + xdr->page_base);
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 78af7518f263..2661828f7a85 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -828,6 +828,9 @@ xs_stream_prepare_request(struct rpc_rqst *req)
 	xdr_free_bvec(&req->rq_rcv_buf);
 	req->rq_task->tk_status = xdr_alloc_bvec(
		&req->rq_rcv_buf, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
+	if (req->rq_task->tk_status == 0)
+		req->rq_task->tk_status = xdr_alloc_bvec(
+			&req->rq_snd_buf, GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN);
 }
 
 /*

2022-04-07 16:42:22

by Trond Myklebust

Subject: Re: sporadic hangs on generic/186

On Thu, 2022-04-07 at 14:11 +1000, NeilBrown wrote:
> On Thu, 07 Apr 2022, Trond Myklebust wrote:
> > On Thu, 2022-04-07 at 11:19 +1000, NeilBrown wrote:
> > > On Thu, 07 Apr 2022, NeilBrown wrote:
> > > > On Thu, 07 Apr 2022, Dave Chinner wrote:
> > > > > On Wed, Apr 06, 2022 at 03:54:24PM -0400, J. Bruce Fields
> > > > > wrote:
> > > > > > In the last couple days I've started getting hangs on
> > > > > > xfstests
> > > > > > generic/186 on upstream.  I also notice the test completes
> > > > > > after 10+
> > > > > > hours (usually it takes about 5 minutes).  Sometimes this
> > > > > > is
> > > > > > accompanied
> > > > > > by "nfs: RPC call returned error 12" on the client.
> > > > >
> > > > > #define ENOMEM          12      /* Out of memory */
> > > > >
> > > > > So either the client or the server is running out of memory
> > > > > somewhere?
> > > >
> > > > Probably the client.  There are a bunch of changes recently
> > > > which
> > > > add
> > > > __GFP_NORETRY to memory allocations from PF_WQ_WORKERs because
> > > > that
> > > > can
> > > > result in deadlocks when swapping over NFS.
> > > > This means that kmalloc request that previously never failed
> > > > (because
> > > > GFP_KERNEL never fails for kernel threads I think) can now
> > > > fail. 
> > > > This
> > > > has tickled one bug that I know of.  There are likely to be
> > > > more.
> > > >
> > > > The RPC code should simply retry these allocations after a
> > > > short
> > > > delay.
> > > > HZ/4 is the number that is used in a couple of places. 
> > > > Possibly
> > > > there
> > > > are more places that need to handle -ENOMEM with rpc_delay().
> > >
> > > I had a look through the various places where alloc can now fail.
> > >
> > > I think xdr_alloc_bvec() in xprt_sent_pagedata() is the most
> > > likely
> > > cause of a problem here.  I don't think an -ENOMEM from there is
> > > caught,
> > > so it could likely filter up to NFS and result in the message you
> > > got.
> > >
> > > I don't think we can easily handle failure there.  We need to
> > > stay
> > > with
> > > GFP_KERNEL rely on PF_MEMALLOC to make forward progress for
> > > swap-over-NFS.
> > >
> > > Bruce: can you change that one line back to GFP_KERNEL and see if
> > > the
> > > problem goes away?
> > >
> >
> > Can we please just move the call to xdr_alloc_bvec() out of
> > xprt_send_pagedata()? Move the client side allocation into
> > xs_stream_prepare_request() and xs_udp_send_request(), then move
> > the
> > server side allocation into svc_udp_sendto().
> >
> > That makes it possible to handle errors.
>
> Like the below I guess.  Seems sensible, but I don't know the code
> well
> enough to really review it.
>
> >
> > > The other problem I found is that rpc_alloc_task() can now fail,
> > > but
> > > rpc_new_task assumes that it never will.  If it does, then we get
> > > a
> > > NULL
> > > deref.
> > >
> > > I don't think rpc_new_task() can ever be called from the rpciod
> > > work
> > > queue, so it is safe to just use a mempool with GFP_KERNEL like
> > > we
> > > did
> > > before.
> > >
> > No. We shouldn't ever use mempools with GFP_KERNEL.
>
> Why not?  mempools with GFP_KERNEL make perfect sense outside of the
> rpciod and nfsiod threads.

If you can afford to make it an infinite wait, there is __GFP_NOFAIL,
so why waste the resources of an emergency pool? In my opinion,
however, an infinite uninterruptible sleep is bad policy for almost all
cases because someone will want to break out at some point.

>
> >
> > Most, if not all of the callers of rpc_run_task() are still capable
> > of
> > dealing with errors, and ditto for the callers of
> > rpc_run_bc_task().
>
> Yes, they can deal with errors.  But in many cases that do so by
> passing
> the error up the call stack so we could start getting ENOMEM for
> systemcalls like stat().  I don't think that is a good idea.
>

stat() has always been capable of returning ENOMEM if, for instance,
inode allocation fails. There are almost no calls in NFS (or most other
filesystems for that matter) that can't fail somehow when memory starts
to get really scarce.

The bottom line is that we use ordinary GFP_KERNEL memory allocations
where we can. The new code follows that rule, breaking it only in cases
where the specific rules of rpciod/xprtiod/nfsiod make it impossible to
wait forever in the memory manager.

I am preparing a set of patches to address the issues that you've
identified, plus a case in call_transmit_status/call_bc_transmit_status
where we're not handling ENOMEM. There are also patches to fix up two
cases in the NFS code itself where we're not currently handling errors
from rpc_run_task.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2022-04-08 03:15:58

by NeilBrown

Subject: Re: sporadic hangs on generic/186

On Thu, 07 Apr 2022, Trond Myklebust wrote:
> On Thu, 2022-04-07 at 14:11 +1000, NeilBrown wrote:
> > On Thu, 07 Apr 2022, Trond Myklebust wrote:
> > > On Thu, 2022-04-07 at 11:19 +1000, NeilBrown wrote:
> > > > On Thu, 07 Apr 2022, NeilBrown wrote:
> > > > > On Thu, 07 Apr 2022, Dave Chinner wrote:
> > > > > > On Wed, Apr 06, 2022 at 03:54:24PM -0400, J. Bruce Fields
> > > > > > wrote:
> > > > > > > In the last couple days I've started getting hangs on
> > > > > > > xfstests
> > > > > > > generic/186 on upstream.  I also notice the test completes
> > > > > > > after 10+
> > > > > > > hours (usually it takes about 5 minutes).  Sometimes this
> > > > > > > is
> > > > > > > accompanied
> > > > > > > by "nfs: RPC call returned error 12" on the client.
> > > > > >
> > > > > > #define ENOMEM          12      /* Out of memory */
> > > > > >
> > > > > > So either the client or the server is running out of memory
> > > > > > somewhere?
> > > > >
> > > > > Probably the client.  There are a bunch of changes recently
> > > > > which
> > > > > add
> > > > > __GFP_NORETRY to memory allocations from PF_WQ_WORKERs because
> > > > > that
> > > > > can
> > > > > result in deadlocks when swapping over NFS.
> > > > > This means that kmalloc request that previously never failed
> > > > > (because
> > > > > GFP_KERNEL never fails for kernel threads I think) can now
> > > > > fail. 
> > > > > This
> > > > > has tickled one bug that I know of.  There are likely to be
> > > > > more.
> > > > >
> > > > > The RPC code should simply retry these allocations after a
> > > > > short
> > > > > delay.
> > > > > HZ/4 is the number that is used in a couple of places. 
> > > > > Possibly
> > > > > there
> > > > > are more places that need to handle -ENOMEM with rpc_delay().
> > > >
> > > > I had a look through the various places where alloc can now fail.
> > > >
> > > > I think xdr_alloc_bvec() in xprt_sent_pagedata() is the most
> > > > likely
> > > > cause of a problem here.  I don't think an -ENOMEM from there is
> > > > caught,
> > > > so it could likely filter up to NFS and result in the message you
> > > > got.
> > > >
> > > > I don't think we can easily handle failure there.  We need to
> > > > stay
> > > > with
> > > > GFP_KERNEL rely on PF_MEMALLOC to make forward progress for
> > > > swap-over-NFS.
> > > >
> > > > Bruce: can you change that one line back to GFP_KERNEL and see if
> > > > the
> > > > problem goes away?
> > > >
> > >
> > > Can we please just move the call to xdr_alloc_bvec() out of
> > > xprt_send_pagedata()? Move the client side allocation into
> > > xs_stream_prepare_request() and xs_udp_send_request(), then move
> > > the
> > > server side allocation into svc_udp_sendto().
> > >
> > > That makes it possible to handle errors.
> >
> > Like the below I guess.  Seems sensible, but I don't know the code
> > well
> > enough to really review it.
> >
> > >
> > > > The other problem I found is that rpc_alloc_task() can now fail,
> > > > but
> > > > rpc_new_task assumes that it never will.  If it does, then we get
> > > > a
> > > > NULL
> > > > deref.
> > > >
> > > > I don't think rpc_new_task() can ever be called from the rpciod
> > > > work
> > > > queue, so it is safe to just use a mempool with GFP_KERNEL like
> > > > we
> > > > did
> > > > before.
> > > >
> > > No. We shouldn't ever use mempools with GFP_KERNEL.
> >
> > Why not?  mempools with GFP_KERNEL make perfect sense outside of the
> > rpciod and nfsiod threads.
>
> If you can afford to make it an infinite wait, there is __GFP_NOFAIL,
> so why waste the resources of an emergency pool? In my opinion,
> however, an infinite uninterruptible sleep is bad policy for almost all
> cases because someone will want to break out at some point.

"infinite" isn't a useful description. The important question is "what
will allow the allocation to complete?".

For __GFP_NOFAIL there is no clear answer beyond "reduction of memory
pressure", and sometimes that is enough.
For a mempool we have a much more specific answer. Memory becomes
available as previous requests complete. rpc_task_mempool has a size of
8 so there can always be 8 requests in flight. Waiting on the mempool
will wait at most until there are seven requests in flight, and then
will return a task. This is a much better guarantee than for
__GFP_NOFAIL.
If you ever need an rpc task to relieve memory pressure, then
__GFP_NOFAIL could deadlock, while using the mempool won't.

If you are never going to block waiting on a mempool and would rather
fail instead, then there seems little point in having the mempool.
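
To make that concrete, the task pool is set up along these lines
(condensed from net/sunrpc/sched.c; error handling and unrelated details
trimmed):

#include <linux/mempool.h>
#include <linux/slab.h>
#include <linux/sunrpc/sched.h>

#define RPC_TASK_POOLSIZE	8

static struct kmem_cache *rpc_task_slabp;
static mempool_t *rpc_task_mempool;

static int rpc_task_pool_init(void)
{
	rpc_task_slabp = kmem_cache_create("rpc_tasks",
					   sizeof(struct rpc_task), 0,
					   SLAB_HWCACHE_ALIGN, NULL);
	if (!rpc_task_slabp)
		return -ENOMEM;

	/*
	 * Eight objects are preallocated up front.  mempool_alloc() with a
	 * blocking gfp mask waits for one of those eight to be freed rather
	 * than depending on the page allocator - that is the forward-progress
	 * guarantee described above.
	 */
	rpc_task_mempool = mempool_create_slab_pool(RPC_TASK_POOLSIZE,
						    rpc_task_slabp);
	return rpc_task_mempool ? 0 : -ENOMEM;
}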

>
> >
> > >
> > > Most, if not all of the callers of rpc_run_task() are still capable
> > > of
> > > dealing with errors, and ditto for the callers of
> > > rpc_run_bc_task().
> >
> > Yes, they can deal with errors.  But in many cases that do so by
> > passing
> > the error up the call stack so we could start getting ENOMEM for
> > systemcalls like stat().  I don't think that is a good idea.
> >
>
> stat() has always been capable of returning ENOMEM if, for instance,
> inode allocation fails. There are almost no calls in NFS (or most other
> filesystems for that matter) that can't fail somehow when memory starts
> to get really scarce.

Fair enough. It is writes that are really important.

>
> The bottom line is that we use ordinary GFP_KERNEL memory allocations
> where we can. The new code follows that rule, breaking it only in cases
> where the specific rules of rpciod/xprtiod/nfsiod make it impossible to
> wait forever in the memory manager.

It is not safe to use GFP_KERNEL for an allocation that is needed in
order to free memory - and so any allocation that is needed to write out
data from the page cache.

>
> I am preparing a set of patches to address the issues that you've
> identified, plus a case in call_transmit_status/call_bc_transmit_status
> where we're not handling ENOMEM. There are also patches to fix up two
> cases in the NFS code itself where we're not currently handling errors
> from rpc_run_task.

I look forward to reviewing them.

Thanks,
NeilBrown

2022-04-08 05:36:48

by Dave Chinner

Subject: Re: sporadic hangs on generic/186

On Fri, Apr 08, 2022 at 12:46:08PM +1000, NeilBrown wrote:
> On Thu, 07 Apr 2022, Trond Myklebust wrote:
> > The bottom line is that we use ordinary GFP_KERNEL memory allocations
> > where we can. The new code follows that rule, breaking it only in cases
> > where the specific rules of rpciod/xprtiod/nfsiod make it impossible to
> > wait forever in the memory manager.
>
> It is not safe to use GFP_KERNEL for an allocation that is needed in
> order to free memory - and so any allocation that is needed to write out
> data from the page cache.

Except that same page cache writeback path can be called from
syscall context (e.g. fsync()) which has nothing to do with memory
reclaim. In that case GFP_KERNEL is the correct allocation context
to use because there are no constraints on what memory reclaim can
be performed from this path.

IOWs, if the context initiating data writeback doesn't allow
GFP_KERNEL allocations, then it should be calling
memalloc_nofs_save() or memalloc_noio_save() to constrain all
allocations to the required context. We should not be requiring the
filesystem (or any other subsystem) to magically infer that the IO
is being done in a constrained allocation context and modify the
context they use appropriately.

If we did this, then all filesystems would simply use GFP_NOIO
everywhere because the loop device layers the entire filesystem IO
path under block device context (i.e. requiring GFP_NOIO allocation
context). We don't do this - the loop device sets PF_MEMALLOC_NOIO
instead so all allocations in that path run with at least GFP_NOIO
constraints and filesystems are none the wiser about the constraints
of the calling context.

IOWs, GFP_KERNEL is generally right context to be using in
filesystem IO paths and callers need to restrict allocation contexts
via task flags if they cannot allow certain types of reclaim
recursion to occur...
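
i.e. something like this at the initiating call site (an illustrative
sketch; the work struct and helper are placeholders):

#include <linux/sched/mm.h>

struct writeback_work;					/* placeholder type */
void issue_writeback(struct writeback_work *work);	/* placeholder hook */

/*
 * The caller scopes the constraint: everything called below here behaves
 * as if it had been passed GFP_NOIO, even where a callee uses GFP_KERNEL.
 */
static void start_constrained_writeback(struct writeback_work *work)
{
	unsigned int noio_flags = memalloc_noio_save();

	issue_writeback(work);

	memalloc_noio_restore(noio_flags);
}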

Cheers,

Dave.
--
Dave Chinner
[email protected]

2022-04-08 06:04:03

by NeilBrown

Subject: Re: sporadic hangs on generic/186

On Fri, 08 Apr 2022, Dave Chinner wrote:
> On Fri, Apr 08, 2022 at 12:46:08PM +1000, NeilBrown wrote:
> > On Thu, 07 Apr 2022, Trond Myklebust wrote:
> > > The bottom line is that we use ordinary GFP_KERNEL memory allocations
> > > where we can. The new code follows that rule, breaking it only in cases
> > > where the specific rules of rpciod/xprtiod/nfsiod make it impossible to
> > > wait forever in the memory manager.
> >
> > It is not safe to use GFP_KERNEL for an allocation that is needed in
> > order to free memory - and so any allocation that is needed to write out
> > data from the page cache.
>
> Except that same page cache writeback path can be called from
> syscall context (e.g. fsync()) which has nothing to do with memory
> reclaim. In that case GFP_KERNEL is the correct allocation context
> to use because there are no constraints on what memory reclaim can
> be performed from this path.
>
> IOWs, if the context initiating data writeback doesn't allow
> GFP_KERNEL allocations, then it should be calling
> memalloc_nofs_save() or memalloc_noio_save() to constrain all
> allocations to the required context. We should not be requiring the
> filesystem (or any other subsystem) to magically infer that the IO
> is being done in a constrained allocation context and modify the
> context they use appropriately.
>
> If we this, then all filesystems would simply use GFP_NOIO
> everywhere because the loop device layers the entire filesystem IO
> path under block device context (i.e. requiring GFP_NOIO allocation
> context). We don't do this - the loop device sets PF_MEMALLOC_NOIO
> instead so all allocations in that path run with at least GFP_NOIO
> constraints and filesystems are none the wiser about the constraints
> of the calling context.
>
> IOWs, GFP_KERNEL is generally right context to be using in
> filesystem IO paths and callers need to restrict allocation contexts
> via task flags if they cannot allow certain types of reclaim
> recursion to occur...

NOIO and NOFS are not the issue here. We all agree that
memalloc_noXX_save() is the right thing to do.

The issue is that memalloc can block indefinitely in low-memory
situations, and any code that has to make progress in low-memory
situations - like writeout - needs to be careful.

This is why the block layer uses mempools for request headers etc - so
that progress is guaranteed without depending on alloc_page() to
succeed.

File systems do often get away with using GFP_KERNEL because the
important paths have PF_MEMALLOC and hence __GFP_MEMALLOC in effect, and
that provides access to some shared reserves. Shared reserves are risky
- the other users you are sharing with might steal them all.

File systems tend to survive anyway because there is a limit on the
amount of dirty filesystem data - so there is lots of non-filesystem data
around, and a good chance that some of that can be freed.

I say "tend to" because I believe the is no real guarantee. It seems to
actually work 99.999% of the time, and maybe that is enough.

I suspect you might be able to deadlock filesystem writeout by
memlocking lots of memory while there are lots of dirty pages. It
probably wouldn't be easy though.

Swap-out is different. There is no limit to the amount of dirty anon
data, so it is fairly easy to get a situation where you absolutely must
write out some anon pages before alloc_page() can succeed.
Most filesystems don't handle swap-out directly - they just tell the MM
which disk addresses to use and submit_bio() is used for writing.
The bio is allocated from a mempool, and nothing below submit_bio() uses
GFP_KERNEL to alloc_page() - they all use mempools (or accept failure in
some other way). A separate mempool at each level - they aren't shared
(so they are quite different from __GFP_MEMALLOC).

NFS is different. NFS handles swap using the same paths as other
writes, so it is much more likely to face indefinite waits in
alloc_page() - at least when handling swap. __GFP_MEMALLOC helps to a
degree, but there are lots of levels, and the more levels that have
local reserves (even if the mempool only reserves a single element), the
better.

The networking people refuse to use mempools (or at least, they did once
some years ago) and I cannot entirely blame them as there are lots of
moving parts - lots of different things that might need to be allocated
(arp table entries?) but usually aren't. So for everything in the
socket layer and below we rely on __GFP_MEMALLOC (and recommend
increasing the reserves a bit above the default, just in case).

But in NFS and particularly in SUNRPC we already have the mempool, and
there is only a small number of things that we need to allocate to
ensure forward progress. So using a mempool as designed, rather than
relying on MEMALLOC reserves is the sensible thing to do, and leaves
more of the reserves for the networking layer.

In fact, the allocations that SUNRPC now does before trying a mempool
should really be using __GFP_NOMEMALLOC so that they don't take from the
shared reserves (even when PF_MEMALLOC is set). As it has a private
reserve (the mempool) it should leave the common reserve for other
layers (sockets etc).
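
i.e. something like this (a sketch only - the __GFP_NOMEMALLOC addition is
the suggestion; the rest mirrors the existing rpc_alloc_task() and its
slab/mempool from net/sunrpc/sched.c):

static struct rpc_task *rpc_alloc_task_sketch(void)
{
	struct rpc_task *task;

	/* First try: ordinary allocation, but explicitly refuse to dip
	 * into the shared PF_MEMALLOC emergency reserves. */
	task = kmem_cache_alloc(rpc_task_slabp,
				rpc_task_gfp_mask() | __GFP_NOMEMALLOC);
	if (task)
		return task;

	/* Fall back to the private reserve instead. */
	return mempool_alloc(rpc_task_mempool, GFP_NOWAIT);
}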

NeilBrown

2022-04-11 02:09:57

by Dave Chinner

Subject: Re: sporadic hangs on generic/186

On Fri, Apr 08, 2022 at 03:32:38PM +1000, NeilBrown wrote:
> On Fri, 08 Apr 2022, Dave Chinner wrote:
> > On Fri, Apr 08, 2022 at 12:46:08PM +1000, NeilBrown wrote:
> > > On Thu, 07 Apr 2022, Trond Myklebust wrote:
> > > > The bottom line is that we use ordinary GFP_KERNEL memory allocations
> > > > where we can. The new code follows that rule, breaking it only in cases
> > > > where the specific rules of rpciod/xprtiod/nfsiod make it impossible to
> > > > wait forever in the memory manager.
> > >
> > > It is not safe to use GFP_KERNEL for an allocation that is needed in
> > > order to free memory - and so any allocation that is needed to write out
> > > data from the page cache.
> >
> > Except that same page cache writeback path can be called from
> > syscall context (e.g. fsync()) which has nothing to do with memory
> > reclaim. In that case GFP_KERNEL is the correct allocation context
> > to use because there are no constraints on what memory reclaim can
> > be performed from this path.
> >
> > IOWs, if the context initiating data writeback doesn't allow
> > GFP_KERNEL allocations, then it should be calling
> > memalloc_nofs_save() or memalloc_noio_save() to constrain all
> > allocations to the required context. We should not be requiring the
> > filesystem (or any other subsystem) to magically infer that the IO
> > is being done in a constrained allocation context and modify the
> > context they use appropriately.
> >
> > If we this, then all filesystems would simply use GFP_NOIO
> > everywhere because the loop device layers the entire filesystem IO
> > path under block device context (i.e. requiring GFP_NOIO allocation
> > context). We don't do this - the loop device sets PF_MEMALLOC_NOIO
> > instead so all allocations in that path run with at least GFP_NOIO
> > constraints and filesystems are none the wiser about the constraints
> > of the calling context.
> >
> > IOWs, GFP_KERNEL is generally right context to be using in
> > filesystem IO paths and callers need to restrict allocation contexts
> > via task flags if they cannot allow certain types of reclaim
> > recursion to occur...
>
> NOIO and NOFS are not the issue here. We all agree that
> memalloc_noXX_save() is the right thing to do.
>
> The issue is that memalloc can block indefinitely in low-memory
> situations, and any code that has to make progress in low-memory
> situations - like writeout - needs to be careful.

Yup, and you've missed my point entirely, then explained exactly why
high level memory allocation context needs to be set by mempool
based allocations...

> The bio is allocated from a mempool, and nothing below submit_bio() uses
> GFP_KERNEL to alloc_page() - they all use mempools (or accept failure in
> some other way). A separate mempool at each level - they aren't shared
> (so they are quite different from __GFP_MEMALLOC).

.... because this house of cards using mempools only works if it
is nested mempools all the way down.

> The networking people refuse to use mempools (or at least, they did once

Same as many filesystem people refuse to use mempools, because the
page writeback IO path in a filesystem can have *unbound* memory
demand. mempools just don't work in filesystems that need to run
transactions or read metadata (e.g. for block allocation
transactions). We've been through this many times before; it's why
filesystems like XFS and Btrfs do not allow ->writepage from memory
reclaim contexts.

> But in NFS and particularly in SUNRPC we already have the mempool, and
> there is only a small number of things that we need to allocate to
> ensure forward progress. So using a mempool as designed, rather than
> relying on MEMALLOC reserves is the sensible thing to do, and leaves
> more of the reserves for the networking layer.

Except the mempool now requires everything in the path below it not
to block on memory allocation for it to guarantee forwards progress
i.e. nested allocations need to succeed or error out - either
guarantees *mempool* forwards progress, not necessarily forwards
progress cleaning memory, but this is only needed for the paths
where we actually need to provide a forwards progress guarantee.

The RPC code puts the entire network stack under a mempool, and as
you say the network stack is resistant to using mempools. At which
point, NFS needs memalloc_noreclaim_save() context to be set
explicitly in the path nested inside the mempool allocation so that
non-mempool allocations don't block trying to reclaim memory that
might never become available....

That's the point I was trying to make - GFP_KERNEL is not the
problem here - it's that mempools only work when it's mempools all
the way down. Individual code paths and/or allocations may have no
idea that they are running in a nested mempool allocation context,
and so all allocations in that path - GFP_KERNEL or otherwise - need
to be automatically converted to fail-instead-of-block semantics by
the high level code that uses a mempool based allocation...

> In fact, the allocation that SUNRPC now does before trying a mempool
> should really be using __GFP_NOMEMALLOC so that they don't take from the
> shared reserves (even when PF_MEMALLOC is set). As it has a private
> reserve (the mempool) it should leave the common reserve for other
> layers (sockets etc).

Yup, but that's a high level, pre-mempool allocation optimisation
which is not useful to future allocations done inside the mempool
forwards progress guarantee context....

Cheers,

Dave.
--
Dave Chinner
[email protected]

2022-04-12 04:30:01

by NeilBrown

Subject: Re: sporadic hangs on generic/186


On Mon, 11 Apr 2022, Dave Chinner wrote:

> Yup, and you've missed my point entirely,

Hmm... It seems that I did. Sorry.

Your point is that the flags passed to kmalloc et al should be relevant,
and that we should be focusing on process context - including for
access to reserves in memalloc context (using memalloc_noreclaim_save()
as appropriate). I agree.

My point is that memalloc_noreclaim_save() isn't necessarily sufficient.
It provides access to a shared pool of reserves, which is hard to reason
about. It *might* provide a guarantee that every allocation will
succeed without deadlock - but only if there is a reasonable limit on
the number or parallel allocation chains - and we don't have any
mechanism for tracking those.

So there is still a place for mempools - they complement __GFP_MEMALLOC
reserves, and the two can be used together. Mempools are particularly
appropriate when the allocation commonly (or always) occurs in
PF_MEMALLOC context, as using the mempool provides a better guarantee,
and removes a burden from the shared pool.

This suggests to me that a different interface to mempools could be
useful: one that uses the mempool preallocation instead of the shared
preallocation.
If PF_MEMALLOC or GFP_MEMALLOC are set, then the mempool is used, which
avoids the shared reserves.
Otherwise (or if __GFP_NOMEMALLOC is set), follow the normal allocation
path and don't use the mempool reserves.

Something like the following.

Do you agree?

Thanks,
NeilBrown

diff --git a/include/linux/mempool.h b/include/linux/mempool.h
index 0c964ac107c2..4414644c49d5 100644
--- a/include/linux/mempool.h
+++ b/include/linux/mempool.h
@@ -46,6 +46,7 @@ extern mempool_t *mempool_create_node(int min_nr, mempool_alloc_t *alloc_fn,
 extern int mempool_resize(mempool_t *pool, int new_min_nr);
 extern void mempool_destroy(mempool_t *pool);
 extern void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask) __malloc;
+extern void *mempool_memalloc(mempool_t *pool, gfp_t gfp_mask) __malloc;
 extern void mempool_free(void *element, mempool_t *pool);
 
 /*
diff --git a/mm/mempool.c b/mm/mempool.c
index b933d0fc21b8..1182bd3443cc 100644
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -417,8 +417,12 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask)
 		goto repeat_alloc;
 	}
 
-	/* We must not sleep if !__GFP_DIRECT_RECLAIM */
-	if (!(gfp_mask & __GFP_DIRECT_RECLAIM)) {
+	/*
+	 * We must not sleep if !__GFP_DIRECT_RECLAIM
+	 * and must not retry in __GFP_NORETRY
+	 */
+	if (!(gfp_mask & __GFP_DIRECT_RECLAIM) ||
+	    (gfp_mask & __GFP_NORETRY)) {
 		spin_unlock_irqrestore(&pool->lock, flags);
 		return NULL;
 	}
@@ -440,6 +444,30 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask)
 }
 EXPORT_SYMBOL(mempool_alloc);
 
+/**
+ * mempool_memalloc - allocate, using mempool rather than shared reserves
+ * @pool: mempool to allocate from
+ * @gfp_mask: GFP allocation flags
+ *
+ *
+ * If the process context or gfp flags permit allocation from reserves
+ * (i.e. PF_MEMALLOC or GFP_MEMALLOC), then use the mempool for allocation,
+ * otherwise allocate directly using the underlying allocator
+ *
+ * Return: pointer to allocated element, or %NULL on failure.
+ */
+
+void *mempool_memalloc(mempool_t *pool, gfp_t gfp_mask)
+{
+	if (gfp_mask & __GFP_NOMEMALLOC ||
+	    (!(current->flags & PF_MEMALLOC) &&
+	     !(gfp_mask & __GFP_MEMALLOC)))
+		/* No reserves requested */
+		return pool->alloc(gfp_mask, pool->pool_data);
+	else
+		return mempool_alloc(pool, gfp_mask);
+}
+
 /**
  * mempool_free - return an element to the pool.
  * @element: pool element pointer.
diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index 682fcd24bf43..2ecbe6b89fbb 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -615,8 +615,6 @@ rpcauth_bind_root_cred(struct rpc_task *task, int lookupflags)
 	};
 	struct rpc_cred *ret;
 
-	if (RPC_IS_ASYNC(task))
-		lookupflags |= RPCAUTH_LOOKUP_ASYNC;
 	ret = auth->au_ops->lookup_cred(auth, &acred, lookupflags);
 	put_cred(acred.cred);
 	return ret;
@@ -633,8 +631,6 @@ rpcauth_bind_machine_cred(struct rpc_task *task, int lookupflags)
 
 	if (!acred.principal)
 		return NULL;
-	if (RPC_IS_ASYNC(task))
-		lookupflags |= RPCAUTH_LOOKUP_ASYNC;
 	return auth->au_ops->lookup_cred(auth, &acred, lookupflags);
 }
 
@@ -658,7 +654,7 @@ rpcauth_bindcred(struct rpc_task *task, const struct cred *cred, int flags)
 	};
 
 	if (flags & RPC_TASK_ASYNC)
-		lookupflags |= RPCAUTH_LOOKUP_NEW | RPCAUTH_LOOKUP_ASYNC;
+		lookupflags |= RPCAUTH_LOOKUP_NEW;
 	if (task->tk_op_cred)
 		/* Task must use exactly this rpc_cred */
 		new = get_rpccred(task->tk_op_cred);
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index de7e5b41ab8f..4f68934dbeb5 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -1343,11 +1343,7 @@ gss_hash_cred(struct auth_cred *acred, unsigned int hashbits)
 static struct rpc_cred *
 gss_lookup_cred(struct rpc_auth *auth, struct auth_cred *acred, int flags)
 {
-	gfp_t gfp = GFP_KERNEL;
-
-	if (flags & RPCAUTH_LOOKUP_ASYNC)
-		gfp = GFP_NOWAIT | __GFP_NOWARN;
-	return rpcauth_lookup_credcache(auth, acred, flags, gfp);
+	return rpcauth_lookup_credcache(auth, acred, flags, GFP_KERNEL);
 }
 
 static struct rpc_cred *
diff --git a/net/sunrpc/auth_unix.c b/net/sunrpc/auth_unix.c
index 1e091d3fa607..6170d4d34687 100644
--- a/net/sunrpc/auth_unix.c
+++ b/net/sunrpc/auth_unix.c
@@ -45,14 +45,10 @@ static struct rpc_cred *unx_lookup_cred(struct rpc_auth *auth,
 {
 	struct rpc_cred *ret;
 
-	ret = kmalloc(sizeof(*ret), rpc_task_gfp_mask());
-	if (!ret) {
-		if (!(flags & RPCAUTH_LOOKUP_ASYNC))
-			return ERR_PTR(-ENOMEM);
-		ret = mempool_alloc(unix_pool, GFP_NOWAIT);
-		if (!ret)
-			return ERR_PTR(-ENOMEM);
-	}
+	ret = mempool_memalloc(unix_pool, rpc_task_gfp_mask());
+	if (!ret)
+		return ERR_PTR(-ENOMEM);
+
 	rpcauth_init_cred(ret, acred, auth, &unix_credops);
 	ret->cr_flags = 1UL << RPCAUTH_CRED_UPTODATE;
 	return ret;
diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 7f70c1e608b7..4138aa62d3f3 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -1040,12 +1040,9 @@ int rpc_malloc(struct rpc_task *task)
 	gfp_t gfp = rpc_task_gfp_mask();
 
 	size += sizeof(struct rpc_buffer);
-	if (size <= RPC_BUFFER_MAXSIZE) {
-		buf = kmem_cache_alloc(rpc_buffer_slabp, gfp);
-		/* Reach for the mempool if dynamic allocation fails */
-		if (!buf && RPC_IS_ASYNC(task))
-			buf = mempool_alloc(rpc_buffer_mempool, GFP_NOWAIT);
-	} else
+	if (size <= RPC_BUFFER_MAXSIZE)
+		buf = mempool_memalloc(rpc_buffer_mempool, gfp);
+	else
 		buf = kmalloc(size, gfp);
 
 	if (!buf)
@@ -1110,12 +1107,7 @@ static void rpc_init_task(struct rpc_task *task, const struct rpc_task_setup *ta
 
 static struct rpc_task *rpc_alloc_task(void)
 {
-	struct rpc_task *task;
-
-	task = kmem_cache_alloc(rpc_task_slabp, rpc_task_gfp_mask());
-	if (task)
-		return task;
-	return mempool_alloc(rpc_task_mempool, GFP_NOWAIT);
+	return mempool_memalloc(rpc_task_mempool, rpc_task_gfp_mask());
 }
 
 /*

diff --git a/include/linux/sunrpc/auth.h b/include/linux/sunrpc/auth.h
index 3e6ce288a7fc..98da816b5fc2 100644
--- a/include/linux/sunrpc/auth.h
+++ b/include/linux/sunrpc/auth.h
@@ -99,7 +99,6 @@ struct rpc_auth_create_args {
 
 /* Flags for rpcauth_lookupcred() */
 #define RPCAUTH_LOOKUP_NEW		0x01	/* Accept an uninitialised cred */
-#define RPCAUTH_LOOKUP_ASYNC		0x02	/* Don't block waiting for memory */
 
 /*
  * Client authentication ops