2019-03-04 15:32:48

by Chuck Lever III

[permalink] [raw]
Subject: [PATCH RFC] SUNRPC: Avoid digging into the ATOMIC pool

Page allocation requests made when the SPARSE_PAGES flag is set are
allowed to fail, and are not critical. No need to spend a rare
resource.

Signed-off-by: Chuck Lever <[email protected]>
---
net/sunrpc/socklib.c | 2 +-
net/sunrpc/xprtrdma/rpc_rdma.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
index 7e55cfc..9faea12 100644
--- a/net/sunrpc/socklib.c
+++ b/net/sunrpc/socklib.c
@@ -106,7 +106,7 @@ static size_t xdr_skb_read_and_csum_bits(struct xdr_skb_reader *desc, void *to,
/* ACL likes to be lazy in allocating pages - ACLs
* are small by default but can get huge. */
if ((xdr->flags & XDRBUF_SPARSE_PAGES) && *ppage == NULL) {
- *ppage = alloc_page(GFP_ATOMIC);
+ *ppage = alloc_page(GFP_NOWAIT | __GFP_NOWARN);
if (unlikely(*ppage == NULL)) {
if (copied == 0)
copied = -ENOMEM;
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 6c1fb27..b759b16 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -238,7 +238,7 @@ static bool rpcrdma_results_inline(struct rpcrdma_xprt *r_xprt,
*/
if (unlikely(xdrbuf->flags & XDRBUF_SPARSE_PAGES)) {
if (!*ppages)
- *ppages = alloc_page(GFP_ATOMIC);
+ *ppages = alloc_page(GFP_NOWAIT | __GFP_NOWARN);
if (!*ppages)
return -ENOBUFS;
}



2019-03-04 16:02:17

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH RFC] SUNRPC: Avoid digging into the ATOMIC pool

On Mon, 2019-03-04 at 10:32 -0500, Chuck Lever wrote:
> Page allocation requests made when the SPARSE_PAGES flag is set are
> allowed to fail, and are not critical. No need to spend a rare
> resource.
>
> Signed-off-by: Chuck Lever <[email protected]>
> ---
> net/sunrpc/socklib.c | 2 +-
> net/sunrpc/xprtrdma/rpc_rdma.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
> index 7e55cfc..9faea12 100644
> --- a/net/sunrpc/socklib.c
> +++ b/net/sunrpc/socklib.c
> @@ -106,7 +106,7 @@ static size_t xdr_skb_read_and_csum_bits(struct
> xdr_skb_reader *desc, void *to,
> /* ACL likes to be lazy in allocating pages - ACLs
> * are small by default but can get huge. */
> if ((xdr->flags & XDRBUF_SPARSE_PAGES) && *ppage ==
> NULL) {
> - *ppage = alloc_page(GFP_ATOMIC);
> + *ppage = alloc_page(GFP_NOWAIT | __GFP_NOWARN);
> if (unlikely(*ppage == NULL)) {
> if (copied == 0)
> copied = -ENOMEM;

Hmm... Can't we make this GFP_KERNEL? I can't see that we're holding
any locks here. It does look like svc_udp_recvfrom() is wrapping the
call to csum_partial_copy_to_xdr() in a local_bh_disable()/enable()
pair, but is that actually needed? If so, for what?

> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c
> b/net/sunrpc/xprtrdma/rpc_rdma.c
> index 6c1fb27..b759b16 100644
> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
> @@ -238,7 +238,7 @@ static bool rpcrdma_results_inline(struct
> rpcrdma_xprt *r_xprt,
> */
> if (unlikely(xdrbuf->flags & XDRBUF_SPARSE_PAGES)) {
> if (!*ppages)
> - *ppages = alloc_page(GFP_ATOMIC);
> + *ppages = alloc_page(GFP_NOWAIT |
> __GFP_NOWARN);
> if (!*ppages)
> return -ENOBUFS;
> }
>

Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2019-03-04 16:05:24

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH RFC] SUNRPC: Avoid digging into the ATOMIC pool



> On Mar 4, 2019, at 11:02 AM, Trond Myklebust <[email protected]> wrote:
>
> On Mon, 2019-03-04 at 10:32 -0500, Chuck Lever wrote:
>> Page allocation requests made when the SPARSE_PAGES flag is set are
>> allowed to fail, and are not critical. No need to spend a rare
>> resource.
>>
>> Signed-off-by: Chuck Lever <[email protected]>
>> ---
>> net/sunrpc/socklib.c | 2 +-
>> net/sunrpc/xprtrdma/rpc_rdma.c | 2 +-
>> 2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
>> index 7e55cfc..9faea12 100644
>> --- a/net/sunrpc/socklib.c
>> +++ b/net/sunrpc/socklib.c
>> @@ -106,7 +106,7 @@ static size_t xdr_skb_read_and_csum_bits(struct
>> xdr_skb_reader *desc, void *to,
>> /* ACL likes to be lazy in allocating pages - ACLs
>> * are small by default but can get huge. */
>> if ((xdr->flags & XDRBUF_SPARSE_PAGES) && *ppage ==
>> NULL) {
>> - *ppage = alloc_page(GFP_ATOMIC);
>> + *ppage = alloc_page(GFP_NOWAIT | __GFP_NOWARN);
>> if (unlikely(*ppage == NULL)) {
>> if (copied == 0)
>> copied = -ENOMEM;
>
> Hmm... Can't we make this GFP_KERNEL? I can't see that we're holding
> any locks here.

We're holding the transport send lock. Since this is an order 0 allocation
it's unlikely to fail even with a less aggressive request like NOWAIT.


> It does look like svc_udp_recvfrom() is wrapping the
> call to csum_partial_copy_to_xdr() in a local_bh_disable()/enable()
> pair, but is that actually needed? If so, for what?
>
>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c
>> b/net/sunrpc/xprtrdma/rpc_rdma.c
>> index 6c1fb27..b759b16 100644
>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
>> @@ -238,7 +238,7 @@ static bool rpcrdma_results_inline(struct
>> rpcrdma_xprt *r_xprt,
>> */
>> if (unlikely(xdrbuf->flags & XDRBUF_SPARSE_PAGES)) {
>> if (!*ppages)
>> - *ppages = alloc_page(GFP_ATOMIC);
>> + *ppages = alloc_page(GFP_NOWAIT |
>> __GFP_NOWARN);
>> if (!*ppages)
>> return -ENOBUFS;
>> }
>>
>
> Cheers
> Trond
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]

--
Chuck Lever




2019-03-04 16:09:17

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH RFC] SUNRPC: Avoid digging into the ATOMIC pool

On Mon, 2019-03-04 at 11:05 -0500, Chuck Lever wrote:
> > On Mar 4, 2019, at 11:02 AM, Trond Myklebust <
> > [email protected]> wrote:
> >
> > On Mon, 2019-03-04 at 10:32 -0500, Chuck Lever wrote:
> > > Page allocation requests made when the SPARSE_PAGES flag is set
> > > are
> > > allowed to fail, and are not critical. No need to spend a rare
> > > resource.
> > >
> > > Signed-off-by: Chuck Lever <[email protected]>
> > > ---
> > > net/sunrpc/socklib.c | 2 +-
> > > net/sunrpc/xprtrdma/rpc_rdma.c | 2 +-
> > > 2 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
> > > index 7e55cfc..9faea12 100644
> > > --- a/net/sunrpc/socklib.c
> > > +++ b/net/sunrpc/socklib.c
> > > @@ -106,7 +106,7 @@ static size_t
> > > xdr_skb_read_and_csum_bits(struct
> > > xdr_skb_reader *desc, void *to,
> > > /* ACL likes to be lazy in allocating pages - ACLs
> > > * are small by default but can get huge. */
> > > if ((xdr->flags & XDRBUF_SPARSE_PAGES) && *ppage ==
> > > NULL) {
> > > - *ppage = alloc_page(GFP_ATOMIC);
> > > + *ppage = alloc_page(GFP_NOWAIT | __GFP_NOWARN);
> > > if (unlikely(*ppage == NULL)) {
> > > if (copied == 0)
> > > copied = -ENOMEM;
> >
> > Hmm... Can't we make this GFP_KERNEL? I can't see that we're
> > holding
> > any locks here.
>
> We're holding the transport send lock. Since this is an order 0
> allocation
> it's unlikely to fail even with a less aggressive request like
> NOWAIT.

We should only be holding the receive mutex. That prevents others from
closing the socket (or starting a second receive worker) but that's
all.

OTOH, the only thing using this mechanism on UDP should be the NFSv2/v3
ACL receive code, so I'm not really sure anyone cares.


> > It does look like svc_udp_recvfrom() is wrapping the
> > call to csum_partial_copy_to_xdr() in a local_bh_disable()/enable()
> > pair, but is that actually needed? If so, for what?
> >
> > > diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c
> > > b/net/sunrpc/xprtrdma/rpc_rdma.c
> > > index 6c1fb27..b759b16 100644
> > > --- a/net/sunrpc/xprtrdma/rpc_rdma.c
> > > +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
> > > @@ -238,7 +238,7 @@ static bool rpcrdma_results_inline(struct
> > > rpcrdma_xprt *r_xprt,
> > > */
> > > if (unlikely(xdrbuf->flags & XDRBUF_SPARSE_PAGES)) {
> > > if (!*ppages)
> > > - *ppages = alloc_page(GFP_ATOMIC);
> > > + *ppages = alloc_page(GFP_NOWAIT |
> > > __GFP_NOWARN);
> > > if (!*ppages)
> > > return -ENOBUFS;
> > > }
> > >
> >
> > Cheers
> > Trond
> > --
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > [email protected]
>
> --
> Chuck Lever
>
>
>
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2019-03-04 16:13:15

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH RFC] SUNRPC: Avoid digging into the ATOMIC pool



> On Mar 4, 2019, at 11:09 AM, Trond Myklebust <[email protected]> wrote:
>
> On Mon, 2019-03-04 at 11:05 -0500, Chuck Lever wrote:
>>> On Mar 4, 2019, at 11:02 AM, Trond Myklebust <
>>> [email protected]> wrote:
>>>
>>> On Mon, 2019-03-04 at 10:32 -0500, Chuck Lever wrote:
>>>> Page allocation requests made when the SPARSE_PAGES flag is set
>>>> are
>>>> allowed to fail, and are not critical. No need to spend a rare
>>>> resource.
>>>>
>>>> Signed-off-by: Chuck Lever <[email protected]>
>>>> ---
>>>> net/sunrpc/socklib.c | 2 +-
>>>> net/sunrpc/xprtrdma/rpc_rdma.c | 2 +-
>>>> 2 files changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
>>>> index 7e55cfc..9faea12 100644
>>>> --- a/net/sunrpc/socklib.c
>>>> +++ b/net/sunrpc/socklib.c
>>>> @@ -106,7 +106,7 @@ static size_t
>>>> xdr_skb_read_and_csum_bits(struct
>>>> xdr_skb_reader *desc, void *to,
>>>> /* ACL likes to be lazy in allocating pages - ACLs
>>>> * are small by default but can get huge. */
>>>> if ((xdr->flags & XDRBUF_SPARSE_PAGES) && *ppage ==
>>>> NULL) {
>>>> - *ppage = alloc_page(GFP_ATOMIC);
>>>> + *ppage = alloc_page(GFP_NOWAIT | __GFP_NOWARN);
>>>> if (unlikely(*ppage == NULL)) {
>>>> if (copied == 0)
>>>> copied = -ENOMEM;
>>>
>>> Hmm... Can't we make this GFP_KERNEL? I can't see that we're
>>> holding
>>> any locks here.
>>
>> We're holding the transport send lock. Since this is an order 0
>> allocation
>> it's unlikely to fail even with a less aggressive request like
>> NOWAIT.
>
> We should only be holding the receive mutex. That prevents others from
> closing the socket (or starting a second receive worker) but that's
> all.

OK, I guess the UDP and RPC-over-RDMA cases are different. For RDMA,
this allocation is done in ->send_request.


> OTOH, the only thing using this mechanism on UDP should be the NFSv2/v3
> ACL receive code, so I'm not really sure anyone cares.

True. However, as I've said before, the right fix for this is to
get rid of SPARSE_PAGES and ensure that NFS always allocates enough
receive buffer space during Call encoding. But it's not clear anyone
cares enough.

Will the current patch be acceptable, or do you want me to use a
sharper stick when allocating?


>>> It does look like svc_udp_recvfrom() is wrapping the
>>> call to csum_partial_copy_to_xdr() in a local_bh_disable()/enable()
>>> pair, but is that actually needed? If so, for what?
>>>
>>>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> b/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> index 6c1fb27..b759b16 100644
>>>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> @@ -238,7 +238,7 @@ static bool rpcrdma_results_inline(struct
>>>> rpcrdma_xprt *r_xprt,
>>>> */
>>>> if (unlikely(xdrbuf->flags & XDRBUF_SPARSE_PAGES)) {
>>>> if (!*ppages)
>>>> - *ppages = alloc_page(GFP_ATOMIC);
>>>> + *ppages = alloc_page(GFP_NOWAIT |
>>>> __GFP_NOWARN);
>>>> if (!*ppages)
>>>> return -ENOBUFS;
>>>> }
>>>>
>>>
>>> Cheers
>>> Trond
>>> --
>>> Trond Myklebust
>>> Linux NFS client maintainer, Hammerspace
>>> [email protected]
>>
>> --
>> Chuck Lever
>>
>>
>>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]

--
Chuck Lever




2019-03-04 16:21:32

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH RFC] SUNRPC: Avoid digging into the ATOMIC pool

On Mon, 2019-03-04 at 11:13 -0500, Chuck Lever wrote:
> > On Mar 4, 2019, at 11:09 AM, Trond Myklebust <
> > [email protected]> wrote:
> >
> > On Mon, 2019-03-04 at 11:05 -0500, Chuck Lever wrote:
> > > > On Mar 4, 2019, at 11:02 AM, Trond Myklebust <
> > > > [email protected]> wrote:
> > > >
> > > > On Mon, 2019-03-04 at 10:32 -0500, Chuck Lever wrote:
> > > > > Page allocation requests made when the SPARSE_PAGES flag is
> > > > > set
> > > > > are
> > > > > allowed to fail, and are not critical. No need to spend a
> > > > > rare
> > > > > resource.
> > > > >
> > > > > Signed-off-by: Chuck Lever <[email protected]>
> > > > > ---
> > > > > net/sunrpc/socklib.c | 2 +-
> > > > > net/sunrpc/xprtrdma/rpc_rdma.c | 2 +-
> > > > > 2 files changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
> > > > > index 7e55cfc..9faea12 100644
> > > > > --- a/net/sunrpc/socklib.c
> > > > > +++ b/net/sunrpc/socklib.c
> > > > > @@ -106,7 +106,7 @@ static size_t
> > > > > xdr_skb_read_and_csum_bits(struct
> > > > > xdr_skb_reader *desc, void *to,
> > > > > /* ACL likes to be lazy in allocating pages -
> > > > > ACLs
> > > > > * are small by default but can get huge. */
> > > > > if ((xdr->flags & XDRBUF_SPARSE_PAGES) &&
> > > > > *ppage ==
> > > > > NULL) {
> > > > > - *ppage = alloc_page(GFP_ATOMIC);
> > > > > + *ppage = alloc_page(GFP_NOWAIT |
> > > > > __GFP_NOWARN);
> > > > > if (unlikely(*ppage == NULL)) {
> > > > > if (copied == 0)
> > > > > copied = -ENOMEM;
> > > >
> > > > Hmm... Can't we make this GFP_KERNEL? I can't see that we're
> > > > holding
> > > > any locks here.
> > >
> > > We're holding the transport send lock. Since this is an order 0
> > > allocation
> > > it's unlikely to fail even with a less aggressive request like
> > > NOWAIT.
> >
> > We should only be holding the receive mutex. That prevents others
> > from
> > closing the socket (or starting a second receive worker) but that's
> > all.
>
> OK, I guess the UDP and RPC-over-RDMA cases are different. For RDMA,
> this allocation is done in ->send_request.

Right. My comment was only intended to apply to the UDP case.

> > OTOH, the only thing using this mechanism on UDP should be the
> > NFSv2/v3
> > ACL receive code, so I'm not really sure anyone cares.
>
> True. However, as I've said before, the right fix for this is to
> get rid of SPARSE_PAGES and ensure that NFS always allocates enough
> receive buffer space during Call encoding. But it's not clear anyone
> cares enough.
>
> Will the current patch be acceptable, or do you want me to use a
> sharper stick when allocating?

It should be fine.

>
> > > > It does look like svc_udp_recvfrom() is wrapping the
> > > > call to csum_partial_copy_to_xdr() in a
> > > > local_bh_disable()/enable()
> > > > pair, but is that actually needed? If so, for what?
> > > >
> > > > > diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c
> > > > > b/net/sunrpc/xprtrdma/rpc_rdma.c
> > > > > index 6c1fb27..b759b16 100644
> > > > > --- a/net/sunrpc/xprtrdma/rpc_rdma.c
> > > > > +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
> > > > > @@ -238,7 +238,7 @@ static bool rpcrdma_results_inline(struct
> > > > > rpcrdma_xprt *r_xprt,
> > > > > */
> > > > > if (unlikely(xdrbuf->flags &
> > > > > XDRBUF_SPARSE_PAGES)) {
> > > > > if (!*ppages)
> > > > > - *ppages =
> > > > > alloc_page(GFP_ATOMIC);
> > > > > + *ppages = alloc_page(GFP_NOWAIT
> > > > > |
> > > > > __GFP_NOWARN);
> > > > > if (!*ppages)
> > > > > return -ENOBUFS;
> > > > > }
> > > > >
> > > >
> > > > Cheers
> > > > Trond
> > > > --
> > > > Trond Myklebust
> > > > Linux NFS client maintainer, Hammerspace
> > > > [email protected]
> > >
> > > --
> > > Chuck Lever
> > >
> > >
> > >
> > --
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > [email protected]
>
> --
> Chuck Lever
>
>
>
--
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
http://www.hammer.space