2016-10-23 18:21:16

by J. Bruce Fields

Subject: upstream server crash

I'm getting an intermittent crash in the nfs server as of
68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
pointers for RPC Call and Reply messages".

I haven't tried to understand that commit or why it would be a problem yet, I
don't see an obvious connection--I can take a closer look Monday.

Could even be that I just landed on this commit by chance, the problem is a
little hard to reproduce so I don't completely trust my testing.

--b.

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff816937d2>] __memcpy+0x12/0x20
PGD 0
Oops: 0002 [#1] PREEMPT SMP
Modules linked in: rpcsec_gss_krb5 nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc
CPU: 0 PID: 4437 Comm: nfsd Not tainted 4.9.0-rc1-00075-gae0340c #766
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
task: ffff88006d810d40 task.stack: ffffc90000644000
RIP: 0010:[<ffffffff816937d2>] [<ffffffff816937d2>] __memcpy+0x12/0x20
RSP: 0018:ffffc90000647d60 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff88007b5ca000 RCX: 000000000000000a
RDX: 0000000000000004 RSI: ffff88007bab7000 RDI: 0000000000000000
RBP: ffffc90000647db8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880078535000
R13: ffff880035d02000 R14: ffff88007b4775b0 R15: ffff88007b477000
FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000787de000 CR4: 00000000000406f0
Stack:
ffffffffa00191ab ffff88006d810d40 ffff880001000000 ffff88007b435a00
ffff880078535378 0000000000000004 ffff880078535000 0000000078535000
ffff88007b5ca000 0000000000000000 ffffffffa0028626 ffffc90000647e30
Call Trace:
[<ffffffffa00191ab>] ? svc_tcp_recvfrom+0x6eb/0x820 [sunrpc]
[<ffffffffa0028626>] ? svc_recv+0x1e6/0xf00 [sunrpc]
[<ffffffffa0029240>] svc_recv+0xe00/0xf00 [sunrpc]
[<ffffffffa00b57ff>] nfsd+0x16f/0x280 [nfsd]
[<ffffffffa00b5695>] ? nfsd+0x5/0x280 [nfsd]
[<ffffffffa00b5690>] ? nfsd_destroy+0x190/0x190 [nfsd]
[<ffffffff810a6c00>] kthread+0xf0/0x110
[<ffffffff810a6b10>] ? kthread_park+0x60/0x60
[<ffffffff81b39607>] ret_from_fork+0x27/0x40
Code: c3 e8 53 fb ff ff 48 8b 43 60 48 2b 43 50 88 43 4e 5b 5d eb ea 90 90 90 90 66 66 90 66 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3
RIP [<ffffffff816937d2>] __memcpy+0x12/0x20
RSP <ffffc90000647d60>
CR2: 0000000000000000
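
(Reading the oops: on x86-64, __memcpy takes the destination in %rdi and
the source in %rsi. Here RDI is zero while RSI holds a plausible kernel
address, and error code 0002 indicates a write fault, so this is a copy
*to* a NULL destination rather than a read through a bad source pointer.)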



2016-10-23 20:04:55

by Chuck Lever

Subject: Re: upstream server crash


> On Oct 23, 2016, at 2:21 PM, J. Bruce Fields <[email protected]> wrote:
>
> I'm getting an intermittent crash in the nfs server as of
> 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> pointers for RPC Call and Reply messages".
>
> I haven't tried to understand that commit or why it would be a problem yet, I
> don't see an obvious connection--I can take a closer look Monday.

I don't see anything in the backtrace that connects to the
client. However, if the client was involved indirectly (say,
with NFSv3 NLM callbacks, maybe?) it could have overwritten
memory that was in use by the server.

What happens if you enable SLAB debugging?


> Could even be that I just landed on this commit by chance, the problem is a
> little hard to reproduce so I don't completely trust my testing.

Can you describe what your testing does?


> --b.
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffff816937d2>] __memcpy+0x12/0x20
> PGD 0
> Oops: 0002 [#1] PREEMPT SMP
> Modules linked in: rpcsec_gss_krb5 nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc
> CPU: 0 PID: 4437 Comm: nfsd Not tainted 4.9.0-rc1-00075-gae0340c #766
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
> task: ffff88006d810d40 task.stack: ffffc90000644000
> RIP: 0010:[<ffffffff816937d2>] [<ffffffff816937d2>] __memcpy+0x12/0x20
> RSP: 0018:ffffc90000647d60 EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff88007b5ca000 RCX: 000000000000000a
> RDX: 0000000000000004 RSI: ffff88007bab7000 RDI: 0000000000000000
> RBP: ffffc90000647db8 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff880078535000
> R13: ffff880035d02000 R14: ffff88007b4775b0 R15: ffff88007b477000
> FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 00000000787de000 CR4: 00000000000406f0
> Stack:
> ffffffffa00191ab ffff88006d810d40 ffff880001000000 ffff88007b435a00
> ffff880078535378 0000000000000004 ffff880078535000 0000000078535000
> ffff88007b5ca000 0000000000000000 ffffffffa0028626 ffffc90000647e30
> Call Trace:
> [<ffffffffa00191ab>] ? svc_tcp_recvfrom+0x6eb/0x820 [sunrpc]
> [<ffffffffa0028626>] ? svc_recv+0x1e6/0xf00 [sunrpc]
> [<ffffffffa0029240>] svc_recv+0xe00/0xf00 [sunrpc]
> [<ffffffffa00b57ff>] nfsd+0x16f/0x280 [nfsd]
> [<ffffffffa00b5695>] ? nfsd+0x5/0x280 [nfsd]
> [<ffffffffa00b5690>] ? nfsd_destroy+0x190/0x190 [nfsd]
> [<ffffffff810a6c00>] kthread+0xf0/0x110
> [<ffffffff810a6b10>] ? kthread_park+0x60/0x60
> [<ffffffff81b39607>] ret_from_fork+0x27/0x40
> Code: c3 e8 53 fb ff ff 48 8b 43 60 48 2b 43 50 88 43 4e 5b 5d eb ea 90 90 90 90 66 66 90 66 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3
> RIP [<ffffffff816937d2>] __memcpy+0x12/0x20
> RSP <ffffc90000647d60>
> CR2: 0000000000000000
>

--
Chuck Lever




2016-10-23 20:14:59

by J. Bruce Fields

Subject: Re: upstream server crash

On Sun, Oct 23, 2016 at 04:04:47PM -0400, Chuck Lever wrote:
>
> > On Oct 23, 2016, at 2:21 PM, J. Bruce Fields <[email protected]> wrote:
> >
> > I'm getting an intermittent crash in the nfs server as of
> > 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> > pointers for RPC Call and Reply messages".
> >
> > I haven't tried to understand that commit or why it would be a problem yet, I
> > don't see an obvious connection--I can take a closer look Monday.
>
> I don't see anything in the backtrace that connects to the
> client. However, if the client was involved indirectly (say,
> with NFSv3 NLM callbacks, maybe?) it could have overwritten
> memory that was in use by the server.

The crashes are consistently happening only over 4.1, so backchannel
code seems more likely. (There were also v3 tests run earlier on the
same machine.)

> What happens if you enable SLAB debugging?

I have CONFIG_DEBUG_SLAB set. I haven't seen any warnings.

> > Could even be that I just landed on this commit by chance, the problem is a
> > little hard to reproduce so I don't completely trust my testing.
>
> Can you describe what your testing does?

I've seen this both in some posix locking tests and in what I believe is
an xfs fsstress run. I haven't looked more closely than that.

--b.

>
>
> > --b.
> >
> > BUG: unable to handle kernel NULL pointer dereference at (null)
> > IP: [<ffffffff816937d2>] __memcpy+0x12/0x20
> > PGD 0
> > Oops: 0002 [#1] PREEMPT SMP
> > Modules linked in: rpcsec_gss_krb5 nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc
> > CPU: 0 PID: 4437 Comm: nfsd Not tainted 4.9.0-rc1-00075-gae0340c #766
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
> > task: ffff88006d810d40 task.stack: ffffc90000644000
> > RIP: 0010:[<ffffffff816937d2>] [<ffffffff816937d2>] __memcpy+0x12/0x20
> > RSP: 0018:ffffc90000647d60 EFLAGS: 00010202
> > RAX: 0000000000000000 RBX: ffff88007b5ca000 RCX: 000000000000000a
> > RDX: 0000000000000004 RSI: ffff88007bab7000 RDI: 0000000000000000
> > RBP: ffffc90000647db8 R08: 0000000000000001 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: ffff880078535000
> > R13: ffff880035d02000 R14: ffff88007b4775b0 R15: ffff88007b477000
> > FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000000000000 CR3: 00000000787de000 CR4: 00000000000406f0
> > Stack:
> > ffffffffa00191ab ffff88006d810d40 ffff880001000000 ffff88007b435a00
> > ffff880078535378 0000000000000004 ffff880078535000 0000000078535000
> > ffff88007b5ca000 0000000000000000 ffffffffa0028626 ffffc90000647e30
> > Call Trace:
> > [<ffffffffa00191ab>] ? svc_tcp_recvfrom+0x6eb/0x820 [sunrpc]
> > [<ffffffffa0028626>] ? svc_recv+0x1e6/0xf00 [sunrpc]
> > [<ffffffffa0029240>] svc_recv+0xe00/0xf00 [sunrpc]
> > [<ffffffffa00b57ff>] nfsd+0x16f/0x280 [nfsd]
> > [<ffffffffa00b5695>] ? nfsd+0x5/0x280 [nfsd]
> > [<ffffffffa00b5690>] ? nfsd_destroy+0x190/0x190 [nfsd]
> > [<ffffffff810a6c00>] kthread+0xf0/0x110
> > [<ffffffff810a6b10>] ? kthread_park+0x60/0x60
> > [<ffffffff81b39607>] ret_from_fork+0x27/0x40
> > Code: c3 e8 53 fb ff ff 48 8b 43 60 48 2b 43 50 88 43 4e 5b 5d eb ea 90 90 90 90 66 66 90 66 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 f3
> > RIP [<ffffffff816937d2>] __memcpy+0x12/0x20
> > RSP <ffffc90000647d60>
> > CR2: 0000000000000000
> >
>
> --
> Chuck Lever
>
>

2016-10-24 03:15:23

by Eryu Guan

Subject: Re: upstream server crash

On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
> I'm getting an intermittent crash in the nfs server as of
> 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> pointers for RPC Call and Reply messages".
>
> I haven't tried to understand that commit or why it would be a problem yet, I
> don't see an obvious connection--I can take a closer look Monday.
>
> Could even be that I just landed on this commit by chance, the problem is a
> little hard to reproduce so I don't completely trust my testing.

I've hit the same crash on a 4.9-rc1 kernel, and it's reproduced for me
reliably by running the xfstests generic/013 case on a loopback-mounted
NFSv4.1 (or NFSv4.2) mount; XFS is the underlying exported fs. For more
details, please see

http://marc.info/?l=linux-nfs&m=147714320129362&w=2

Thanks,
Eryu

2016-10-24 13:31:12

by Jeff Layton

Subject: Re: upstream server crash

On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
> On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
> >
> > I'm getting an intermittent crash in the nfs server as of
> > 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> > pointers for RPC Call and Reply messages".
> >
> > I haven't tried to understand that commit or why it would be a problem yet, I
> > don't see an obvious connection--I can take a closer look Monday.
> >
> > Could even be that I just landed on this commit by chance, the problem is a
> > little hard to reproduce so I don't completely trust my testing.
>
> I've hit the same crash on 4.9-rc1 kernel, and it's reproduced for me
> reliably by running xfstests generic/013 case, on a loopback mounted
> NFSv4.1 (or NFSv4.2), XFS is the underlying exported fs. More details
> please see
>
> http://marc.info/?l=linux-nfs&m=147714320129362&w=2
>

Looks like you landed at the same commit as Bruce, so that's probably
legit. That commit is very small though. The only real change that
doesn't affect the new field is this:


@@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
                     req->rq_buffer,
                     req->rq_callsize);
        xdr_buf_init(&req->rq_rcv_buf,
-                    (char *)req->rq_buffer + req->rq_callsize,
+                    req->rq_rbuffer,
                     req->rq_rcvsize);


So I'm guessing this is breaking the callback channel somehow?
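
To make that concrete: before the commit, the Reply buffer was simply the
tail of the single Call allocation; after it, rpc_xdr_encode() trusts a
separate rq_rbuffer pointer that every buffer allocator is now expected to
fill in. A rough sketch of the new contract -- paraphrased, not the exact
upstream code:

    /* Old layout: one allocation, Reply addressed by offset */
    req->rq_buffer = buf;
    /* Reply area computed at encode time: buf + rq_callsize */

    /* New contract: the allocator must also publish the Reply pointer */
    req->rq_buffer  = buf;
    req->rq_rbuffer = (char *)buf + req->rq_callsize;

    /* rpc_xdr_encode() then does: */
    xdr_buf_init(&req->rq_rcv_buf, req->rq_rbuffer, req->rq_rcvsize);

An allocation path that sets rq_buffer but never sets rq_rbuffer leaves
rq_rcv_buf.head[0].iov_base == NULL, and the first copy of reply data into
that buffer is a memcpy() to NULL -- which is exactly the shape of the
oops above.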

--
Jeff Layton <[email protected]>

2016-10-24 13:52:05

by Chuck Lever

Subject: Re: upstream server crash


> On Oct 24, 2016, at 9:31 AM, Jeff Layton <[email protected]> wrote:
>
> On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
>> On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
>>>
>>> I'm getting an intermittent crash in the nfs server as of
>>> 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
>>> pointers for RPC Call and Reply messages".
>>>
>>> I haven't tried to understand that commit or why it would be a problem yet, I
>>> don't see an obvious connection--I can take a closer look Monday.
>>>
>>> Could even be that I just landed on this commit by chance, the problem is a
>>> little hard to reproduce so I don't completely trust my testing.
>>
>> I've hit the same crash on 4.9-rc1 kernel, and it's reproduced for me
>> reliably by running xfstests generic/013 case, on a loopback mounted
>> NFSv4.1 (or NFSv4.2), XFS is the underlying exported fs. More details
>> please see
>>
>> http://marc.info/?l=linux-nfs&m=147714320129362&w=2
>>
>
> Looks like you landed at the same commit as Bruce, so that's probably
> legit. That commit is very small though. The only real change that
> doesn't affect the new field is this:
>
>
> @@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
> req->rq_buffer,
> req->rq_callsize);
> xdr_buf_init(&req->rq_rcv_buf,
> - (char *)req->rq_buffer + req->rq_callsize,
> + req->rq_rbuffer,
> req->rq_rcvsize);
>
>
> So I'm guessing this is breaking the callback channel somehow?

Could be the TCP backchannel code is using rq_buffer in a different
way than RDMA backchannel or the forward channel code.


--
Chuck Lever




2016-10-24 15:19:41

by Jeff Layton

Subject: Re: upstream server crash

On Mon, 2016-10-24 at 09:51 -0400, Chuck Lever wrote:
> >
> > On Oct 24, 2016, at 9:31 AM, Jeff Layton <[email protected]> wrote:
> >
> > On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
> > >
> > > On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
> > > >
> > > >
> > > > I'm getting an intermittent crash in the nfs server as of
> > > > 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> > > > pointers for RPC Call and Reply messages".
> > > >
> > > > I haven't tried to understand that commit or why it would be a problem yet, I
> > > > don't see an obvious connection--I can take a closer look Monday.
> > > >
> > > > Could even be that I just landed on this commit by chance, the problem is a
> > > > little hard to reproduce so I don't completely trust my testing.
> > >
> > > I've hit the same crash on 4.9-rc1 kernel, and it's reproduced for me
> > > reliably by running xfstests generic/013 case, on a loopback mounted
> > > NFSv4.1 (or NFSv4.2), XFS is the underlying exported fs. More details
> > > please see
> > >
> > > http://marc.info/?l=linux-nfs&m=147714320129362&w=2
> > >
> >
> > Looks like you landed at the same commit as Bruce, so that's probably
> > legit. That commit is very small though. The only real change that
> > doesn't affect the new field is this:
> >
> >
> > @@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
> > req->rq_buffer,
> > req->rq_callsize);
> > xdr_buf_init(&req->rq_rcv_buf,
> > - (char *)req->rq_buffer + req->rq_callsize,
> > + req->rq_rbuffer,
> > req->rq_rcvsize);
> >
> >
> > So I'm guessing this is breaking the callback channel somehow?
>
> Could be the TCP backchannel code is using rq_buffer in a different
> way than RDMA backchannel or the forward channel code.
>

Well, it basically allocates a page per rpc_rqst and then maps that.

One thing I notice is that this patch ensures that rq_rbuffer gets set
up in rpc_malloc and xprt_rdma_allocate, but it looks like
xprt_alloc_bc_req didn't get the same treatment.

I suspect that that may be the problem...
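
For reference, the backchannel preallocation path in question looked
roughly like this at the time (an abbreviated sketch reconstructed from
the diff context below, not the verbatim source); note that nothing here
ever assigns rq_rbuffer:

    struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt, gfp_t gfp_flags)
    {
            struct rpc_rqst *req;

            req = kzalloc(sizeof(*req), gfp_flags); /* rq_rbuffer starts out NULL */
            if (req == NULL)
                    return NULL;

            req->rq_xprt = xprt;

            /* Preallocate one XDR receive buffer (a single page) */
            if (xprt_alloc_xdr_buf(&req->rq_rcv_buf, gfp_flags) < 0)
                    goto out_free;
            req->rq_rcv_buf.len = PAGE_SIZE;

            /* ... the send buffer is preallocated the same way ... */
            return req;
    out_free:
            xprt_free_allocation(req);
            return NULL;
    }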

--
Jeff Layton <[email protected]>

2016-10-24 15:24:42

by Jeff Layton

Subject: Re: upstream server crash

On Mon, 2016-10-24 at 11:19 -0400, Jeff Layton wrote:
> On Mon, 2016-10-24 at 09:51 -0400, Chuck Lever wrote:
> >
> > >
> > >
> > > On Oct 24, 2016, at 9:31 AM, Jeff Layton <[email protected]> wrote:
> > >
> > > On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
> > > >
> > > >
> > > > On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
> > > > >
> > > > >
> > > > >
> > > > > I'm getting an intermittent crash in the nfs server as of
> > > > > 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> > > > > pointers for RPC Call and Reply messages".
> > > > >
> > > > > I haven't tried to understand that commit or why it would be a problem yet, I
> > > > > don't see an obvious connection--I can take a closer look Monday.
> > > > >
> > > > > Could even be that I just landed on this commit by chance, the problem is a
> > > > > little hard to reproduce so I don't completely trust my testing.
> > > >
> > > > I've hit the same crash on 4.9-rc1 kernel, and it's reproduced for me
> > > > reliably by running xfstests generic/013 case, on a loopback mounted
> > > > NFSv4.1 (or NFSv4.2), XFS is the underlying exported fs. More details
> > > > please see
> > > >
> > > > http://marc.info/?l=linux-nfs&m=147714320129362&w=2
> > > >
> > >
> > > Looks like you landed at the same commit as Bruce, so that's probably
> > > legit. That commit is very small though. The only real change that
> > > doesn't affect the new field is this:
> > >
> > >
> > > @@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
> > > req->rq_buffer,
> > > req->rq_callsize);
> > > xdr_buf_init(&req->rq_rcv_buf,
> > > - (char *)req->rq_buffer + req->rq_callsize,
> > > + req->rq_rbuffer,
> > > req->rq_rcvsize);
> > >
> > >
> > > So I'm guessing this is breaking the callback channel somehow?
> >
> > Could be the TCP backchannel code is using rq_buffer in a different
> > way than RDMA backchannel or the forward channel code.
> >
>
> Well, it basically allocates a page per rpc_rqst and then maps that.
>
> One thing I notice is that this patch ensures that rq_rbuffer gets set
> up in rpc_malloc and xprt_rdma_allocate, but it looks like
> xprt_alloc_bc_req didn't get the same treatment.
>
> I suspect that that may be the problem...
>
In fact, maybe we just need this here? (untested and probably
whitespace damaged):

diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
index ac701c28f44f..c561aa8ce05b 100644
--- a/net/sunrpc/backchannel_rqst.c
+++ b/net/sunrpc/backchannel_rqst.c
@@ -100,6 +100,7 @@ struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt, gfp_t gfp_flags)
 		goto out_free;
 	}
 	req->rq_rcv_buf.len = PAGE_SIZE;
+	req->rq_rbuffer = req->rq_rcv_buf.head[0].iov_base;
 
 	/* Preallocate one XDR send buffer */
 	if (xprt_alloc_xdr_buf(&req->rq_snd_buf, gfp_flags) < 0) {

2016-10-24 15:55:40

by Chuck Lever

Subject: Re: upstream server crash


> On Oct 24, 2016, at 11:24 AM, Jeff Layton <[email protected]> wrote:
>
> On Mon, 2016-10-24 at 11:19 -0400, Jeff Layton wrote:
>> On Mon, 2016-10-24 at 09:51 -0400, Chuck Lever wrote:
>>>
>>>>
>>>>
>>>> On Oct 24, 2016, at 9:31 AM, Jeff Layton <[email protected]> wrote:
>>>>
>>>> On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
>>>>>
>>>>>
>>>>> On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> I'm getting an intermittent crash in the nfs server as of
>>>>>> 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
>>>>>> pointers for RPC Call and Reply messages".
>>>>>>
>>>>>> I haven't tried to understand that commit or why it would be a problem yet, I
>>>>>> don't see an obvious connection--I can take a closer look Monday.
>>>>>>
>>>>>> Could even be that I just landed on this commit by chance, the problem is a
>>>>>> little hard to reproduce so I don't completely trust my testing.
>>>>>
>>>>> I've hit the same crash on 4.9-rc1 kernel, and it's reproduced for me
>>>>> reliably by running xfstests generic/013 case, on a loopback mounted
>>>>> NFSv4.1 (or NFSv4.2), XFS is the underlying exported fs. More details
>>>>> please see
>>>>>
>>>>> http://marc.info/?l=linux-nfs&m=147714320129362&w=2
>>>>>
>>>>
>>>> Looks like you landed at the same commit as Bruce, so that's probably
>>>> legit. That commit is very small though. The only real change that
>>>> doesn't affect the new field is this:
>>>>
>>>>
>>>> @@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
>>>> req->rq_buffer,
>>>> req->rq_callsize);
>>>> xdr_buf_init(&req->rq_rcv_buf,
>>>> - (char *)req->rq_buffer + req->rq_callsize,
>>>> + req->rq_rbuffer,
>>>> req->rq_rcvsize);
>>>>
>>>>
>>>> So I'm guessing this is breaking the callback channel somehow?
>>>
>>> Could be the TCP backchannel code is using rq_buffer in a different
>>> way than RDMA backchannel or the forward channel code.
>>>
>>
>> Well, it basically allocates a page per rpc_rqst and then maps that.
>>
>> One thing I notice is that this patch ensures that rq_rbuffer gets set
>> up in rpc_malloc and xprt_rdma_allocate, but it looks like
>> xprt_alloc_bc_req didn't get the same treatment.
>>
>> I suspect that that may be the problem...
>>
> In fact, maybe we just need this here? (untested and probably
> whitespace damaged):
>
> diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
> index ac701c28f44f..c561aa8ce05b 100644
> --- a/net/sunrpc/backchannel_rqst.c
> +++ b/net/sunrpc/backchannel_rqst.c
> @@ -100,6 +100,7 @@ struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt, gfp_t gfp_flags)
> goto out_free;
> }
> req->rq_rcv_buf.len = PAGE_SIZE;
> + req->rq_rbuffer = req->rq_rcv_buf.head[0].iov_base;

That looks plausible! Basically that is needed after xdr_buf_init()
is done for a backchannel rpc_rqst's receive buffer.

net/sunrpc/xprtrdma/backchannel.c might need a similar change. I saw
crashes with generic/013 at bake-a-thon last week, but as the iommu
was involved with those, I've been looking in a different place. Will
give this a try.


> /* Preallocate one XDR send buffer */
> if (xprt_alloc_xdr_buf(&req->rq_snd_buf, gfp_flags) < 0) {

--
Chuck Lever




2016-10-24 18:08:59

by J. Bruce Fields

Subject: Re: upstream server crash

On Mon, Oct 24, 2016 at 11:24:40AM -0400, Jeff Layton wrote:
> On Mon, 2016-10-24 at 11:19 -0400, Jeff Layton wrote:
> > On Mon, 2016-10-24 at 09:51 -0400, Chuck Lever wrote:
> > >
> > > >
> > > >
> > > > On Oct 24, 2016, at 9:31 AM, Jeff Layton <[email protected]> wrote:
> > > >
> > > > On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
> > > > >
> > > > >
> > > > > On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > I'm getting an intermittent crash in the nfs server as of
> > > > > > 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> > > > > > pointers for RPC Call and Reply messages".
> > > > > >
> > > > > > I haven't tried to understand that commit or why it would be a problem yet, I
> > > > > > don't see an obvious connection--I can take a closer look Monday.
> > > > > >
> > > > > > Could even be that I just landed on this commit by chance, the problem is a
> > > > > > little hard to reproduce so I don't completely trust my testing.
> > > > >
> > > > > I've hit the same crash on 4.9-rc1 kernel, and it's reproduced for me
> > > > > reliably by running xfstests generic/013 case, on a loopback mounted
> > > > > NFSv4.1 (or NFSv4.2), XFS is the underlying exported fs. More details
> > > > > please see
> > > > >
> > > > > http://marc.info/?l=linux-nfs&m=147714320129362&w=2
> > > > >
> > > >
> > > > Looks like you landed at the same commit as Bruce, so that's probably
> > > > legit. That commit is very small though. The only real change that
> > > > doesn't affect the new field is this:
> > > >
> > > >
> > > > @@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
> > > > req->rq_buffer,
> > > > req->rq_callsize);
> > > > xdr_buf_init(&req->rq_rcv_buf,
> > > > - (char *)req->rq_buffer + req->rq_callsize,
> > > > + req->rq_rbuffer,
> > > > req->rq_rcvsize);
> > > >
> > > >
> > > > So I'm guessing this is breaking the callback channel somehow?
> > >
> > > Could be the TCP backchannel code is using rq_buffer in a different
> > > way than RDMA backchannel or the forward channel code.
> > >
> >
> > Well, it basically allocates a page per rpc_rqst and then maps that.
> >
> > One thing I notice is that this patch ensures that rq_rbuffer gets set
> > up in rpc_malloc and xprt_rdma_allocate, but it looks like
> > xprt_alloc_bc_req didn't get the same treatment.
> >
> > I suspect that that may be the problem...
> >
> In fact, maybe we just need this here? (untested and probably
> whitespace damaged):

No change in results for me.

--b.
>
> diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
> index ac701c28f44f..c561aa8ce05b 100644
> --- a/net/sunrpc/backchannel_rqst.c
> +++ b/net/sunrpc/backchannel_rqst.c
> @@ -100,6 +100,7 @@ struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt, gfp_t gfp_flags)
> goto out_free;
> }
> req->rq_rcv_buf.len = PAGE_SIZE;
> + req->rq_rbuffer = req->rq_rcv_buf.head[0].iov_base;
>
> /* Preallocate one XDR send buffer */
> if (xprt_alloc_xdr_buf(&req->rq_snd_buf, gfp_flags) < 0) {

2016-10-24 19:17:39

by Jeff Layton

Subject: Re: upstream server crash

On Mon, 2016-10-24 at 14:08 -0400, J. Bruce Fields wrote:
> On Mon, Oct 24, 2016 at 11:24:40AM -0400, Jeff Layton wrote:
> >
> > On Mon, 2016-10-24 at 11:19 -0400, Jeff Layton wrote:
> > >
> > > On Mon, 2016-10-24 at 09:51 -0400, Chuck Lever wrote:
> > > >
> > > >
> > > > >
> > > > >
> > > > >
> > > > > On Oct 24, 2016, at 9:31 AM, Jeff Layton <[email protected]> wrote:
> > > > >
> > > > > On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > I'm getting an intermittent crash in the nfs server as of
> > > > > > > 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> > > > > > > pointers for RPC Call and Reply messages".
> > > > > > >
> > > > > > > I haven't tried to understand that commit or why it would be a problem yet, I
> > > > > > > don't see an obvious connection--I can take a closer look Monday.
> > > > > > >
> > > > > > > Could even be that I just landed on this commit by chance, the problem is a
> > > > > > > little hard to reproduce so I don't completely trust my testing.
> > > > > >
> > > > > > I've hit the same crash on 4.9-rc1 kernel, and it's reproduced for me
> > > > > > reliably by running xfstests generic/013 case, on a loopback mounted
> > > > > > NFSv4.1 (or NFSv4.2), XFS is the underlying exported fs. More details
> > > > > > please see
> > > > > >
> > > > > > http://marc.info/?l=linux-nfs&m=147714320129362&w=2
> > > > > >
> > > > >
> > > > > Looks like you landed at the same commit as Bruce, so that's probably
> > > > > legit. That commit is very small though. The only real change that
> > > > > doesn't affect the new field is this:
> > > > >
> > > > >
> > > > > @@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
> > > > > req->rq_buffer,
> > > > > req->rq_callsize);
> > > > > xdr_buf_init(&req->rq_rcv_buf,
> > > > > - (char *)req->rq_buffer + req->rq_callsize,
> > > > > + req->rq_rbuffer,
> > > > > req->rq_rcvsize);
> > > > >
> > > > >
> > > > > So I'm guessing this is breaking the callback channel somehow?
> > > >
> > > > Could be the TCP backchannel code is using rq_buffer in a different
> > > > way than RDMA backchannel or the forward channel code.
> > > >
> > >
> > > Well, it basically allocates a page per rpc_rqst and then maps that.
> > >
> > > One thing I notice is that this patch ensures that rq_rbuffer gets set
> > > up in rpc_malloc and xprt_rdma_allocate, but it looks like
> > > xprt_alloc_bc_req didn't get the same treatment.
> > >
> > > I suspect that that may be the problem...
> > >
> > In fact, maybe we just need this here? (untested and probably
> > whitespace damaged):
>
> No change in results for me.
>
> --b.
> >
> >
> > diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
> > index ac701c28f44f..c561aa8ce05b 100644
> > --- a/net/sunrpc/backchannel_rqst.c
> > +++ b/net/sunrpc/backchannel_rqst.c
> > @@ -100,6 +100,7 @@ struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt, gfp_t gfp_flags)
> > goto out_free;
> > }
> > req->rq_rcv_buf.len = PAGE_SIZE;
> > + req->rq_rbuffer = req->rq_rcv_buf.head[0].iov_base;
> >
> > /* Preallocate one XDR send buffer */
> > if (xprt_alloc_xdr_buf(&req->rq_snd_buf, gfp_flags) < 0) {

Ahh ok, I think I see.

We probably also need to set rq_rbuffer in bc_malloc and
xprt_rdma_bc_allocate.

My guess is that we're ending up in rpc_xdr_encode with a NULL
rq_rbuffer pointer, so the right fix would seem to be to ensure that it
is properly set whenever rq_buffer is set.

So I think this may be what we want, actually. I'll plan to test it out
but may not get to it before tomorrow.
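
The propagation is mechanical once rq_rbuffer is NULL: xdr_buf_init()
(include/linux/sunrpc/xdr.h) is essentially

    static inline void
    xdr_buf_init(struct xdr_buf *buf, void *start, size_t len)
    {
            memset(buf, 0, sizeof(*buf));

            buf->head[0].iov_base = start;
            buf->head[0].iov_len = len;
            buf->buflen = len;
    }

so rpc_xdr_encode() installs a NULL head iovec in rq_rcv_buf, and the
first memcpy() of reply data into it faults on a NULL destination --
matching the RDI=0 __memcpy oops at the top of the thread.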

--
Jeff Layton <[email protected]>


Attachments:
0001-sunrpc-fix-some-missing-rq_rbuffer-assignments.patch (1.26 kB)

2016-10-24 20:40:25

by J. Bruce Fields

Subject: Re: upstream server crash

On Mon, Oct 24, 2016 at 03:17:34PM -0400, Jeff Layton wrote:
> On Mon, 2016-10-24 at 14:08 -0400, J. Bruce Fields wrote:
> > On Mon, Oct 24, 2016 at 11:24:40AM -0400, Jeff Layton wrote:
> > >
> > > On Mon, 2016-10-24 at 11:19 -0400, Jeff Layton wrote:
> > > >
> > > > On Mon, 2016-10-24 at 09:51 -0400, Chuck Lever wrote:
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Oct 24, 2016, at 9:31 AM, Jeff Layton <[email protected]> wrote:
> > > > > >
> > > > > > On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > I'm getting an intermittent crash in the nfs server as of
> > > > > > > > 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> > > > > > > > pointers for RPC Call and Reply messages".
> > > > > > > >
> > > > > > > > I haven't tried to understand that commit or why it would be a problem yet, I
> > > > > > > > don't see an obvious connection--I can take a closer look Monday.
> > > > > > > >
> > > > > > > > Could even be that I just landed on this commit by chance, the problem is a
> > > > > > > > little hard to reproduce so I don't completely trust my testing.
> > > > > > >
> > > > > > > I've hit the same crash on 4.9-rc1 kernel, and it's reproduced for me
> > > > > > > reliably by running xfstests generic/013 case, on a loopback mounted
> > > > > > > NFSv4.1 (or NFSv4.2), XFS is the underlying exported fs. More details
> > > > > > > please see
> > > > > > >
> > > > > > > http://marc.info/?l=linux-nfs&m=147714320129362&w=2
> > > > > > >
> > > > > >
> > > > > > Looks like you landed at the same commit as Bruce, so that's probably
> > > > > > legit. That commit is very small though. The only real change that
> > > > > > doesn't affect the new field is this:
> > > > > >
> > > > > >
> > > > > > @@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
> > > > > > req->rq_buffer,
> > > > > > req->rq_callsize);
> > > > > > xdr_buf_init(&req->rq_rcv_buf,
> > > > > > - (char *)req->rq_buffer + req->rq_callsize,
> > > > > > + req->rq_rbuffer,
> > > > > > req->rq_rcvsize);
> > > > > >
> > > > > >
> > > > > > So I'm guessing this is breaking the callback channel somehow?
> > > > >
> > > > > Could be the TCP backchannel code is using rq_buffer in a different
> > > > > way than RDMA backchannel or the forward channel code.
> > > > >
> > > >
> > > > Well, it basically allocates a page per rpc_rqst and then maps that.
> > > >
> > > > One thing I notice is that this patch ensures that rq_rbuffer gets set
> > > > up in rpc_malloc and xprt_rdma_allocate, but it looks like
> > > > xprt_alloc_bc_req didn't get the same treatment.
> > > >
> > > > I suspect that that may be the problem...
> > > >
> > > In fact, maybe we just need this here? (untested and probably
> > > whitespace damaged):
> >
> > No change in results for me.
> >
> > --b.
> > >
> > >
> > > diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
> > > index ac701c28f44f..c561aa8ce05b 100644
> > > --- a/net/sunrpc/backchannel_rqst.c
> > > +++ b/net/sunrpc/backchannel_rqst.c
> > > @@ -100,6 +100,7 @@ struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt, gfp_t gfp_flags)
> > > goto out_free;
> > > }
> > > req->rq_rcv_buf.len = PAGE_SIZE;
> > > + req->rq_rbuffer = req->rq_rcv_buf.head[0].iov_base;
> > >
> > > /* Preallocate one XDR send buffer */
> > > if (xprt_alloc_xdr_buf(&req->rq_snd_buf, gfp_flags) < 0) {
>
> Ahh ok, I think I see.
>
> We probably also need to set rq_rbuffer in bc_malloc and
> xprt_rdma_bc_allocate.
>
> My guess is that we're ending up in rpc_xdr_encode with a NULL
> rq_rbuffer pointer, so the right fix would seem to be to ensure that it
> is properly set whenever rq_buffer is set.
>
> So I think this may be what we want, actually. I'll plan to test it out
> but may not get to it before tomorrow.

It passes here.

--b.

>
> --
> Jeff Layton <[email protected]>

> From ef2a391bc4d8f6b729aacee7cde8d9baf86767c3 Mon Sep 17 00:00:00 2001
> From: Jeff Layton <[email protected]>
> Date: Mon, 24 Oct 2016 15:13:40 -0400
> Subject: [PATCH] sunrpc: fix some missing rq_rbuffer assignments
>
> I think we basically need to set rq_rbuffer whenever rq_buffer is set.
>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 1 +
> net/sunrpc/xprtsock.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> index 2d8545c34095..fc4535ead7c2 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
> @@ -182,6 +182,7 @@ xprt_rdma_bc_allocate(struct rpc_task *task)
>  		return -ENOMEM;
>  
>  	rqst->rq_buffer = page_address(page);
> +	rqst->rq_rbuffer = (char *)rqst->rq_buffer + rqst->rq_callsize;
>  	return 0;
>  }
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 0137af1c0916..e01c825bc683 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2563,6 +2563,7 @@ static int bc_malloc(struct rpc_task *task)
>  	buf->len = PAGE_SIZE;
>  
>  	rqst->rq_buffer = buf->data;
> +	rqst->rq_rbuffer = (char *)rqst->rq_buffer + rqst->rq_callsize;
>  	return 0;
>  }
>
> --
> 2.7.4
>


2016-10-24 21:39:01

by Chuck Lever

Subject: Re: upstream server crash


> On Oct 24, 2016, at 4:40 PM, J. Bruce Fields <[email protected]> wrote:
>
> On Mon, Oct 24, 2016 at 03:17:34PM -0400, Jeff Layton wrote:
>> On Mon, 2016-10-24 at 14:08 -0400, J. Bruce Fields wrote:
>>> On Mon, Oct 24, 2016 at 11:24:40AM -0400, Jeff Layton wrote:
>>>>
>>>> On Mon, 2016-10-24 at 11:19 -0400, Jeff Layton wrote:
>>>>>
>>>>> On Mon, 2016-10-24 at 09:51 -0400, Chuck Lever wrote:
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Oct 24, 2016, at 9:31 AM, Jeff Layton <[email protected]> wrote:
>>>>>>>
>>>>>>> On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm getting an intermittent crash in the nfs server as of
>>>>>>>>> 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
>>>>>>>>> pointers for RPC Call and Reply messages".
>>>>>>>>>
>>>>>>>>> I haven't tried to understand that commit or why it would be a problem yet, I
>>>>>>>>> don't see an obvious connection--I can take a closer look Monday.
>>>>>>>>>
>>>>>>>>> Could even be that I just landed on this commit by chance, the problem is a
>>>>>>>>> little hard to reproduce so I don't completely trust my testing.
>>>>>>>>
>>>>>>>> I've hit the same crash on 4.9-rc1 kernel, and it's reproduced for me
>>>>>>>> reliably by running xfstests generic/013 case, on a loopback mounted
>>>>>>>> NFSv4.1 (or NFSv4.2), XFS is the underlying exported fs. More details
>>>>>>>> please see
>>>>>>>>
>>>>>>>> http://marc.info/?l=linux-nfs&m=147714320129362&w=2
>>>>>>>>
>>>>>>>
>>>>>>> Looks like you landed at the same commit as Bruce, so that's probably
>>>>>>> legit. That commit is very small though. The only real change that
>>>>>>> doesn't affect the new field is this:
>>>>>>>
>>>>>>>
>>>>>>> @@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
>>>>>>> req->rq_buffer,
>>>>>>> req->rq_callsize);
>>>>>>> xdr_buf_init(&req->rq_rcv_buf,
>>>>>>> - (char *)req->rq_buffer + req->rq_callsize,
>>>>>>> + req->rq_rbuffer,
>>>>>>> req->rq_rcvsize);
>>>>>>>
>>>>>>>
>>>>>>> So I'm guessing this is breaking the callback channel somehow?
>>>>>>
>>>>>> Could be the TCP backchannel code is using rq_buffer in a different
>>>>>> way than RDMA backchannel or the forward channel code.
>>>>>>
>>>>>
>>>>> Well, it basically allocates a page per rpc_rqst and then maps that.
>>>>>
>>>>> One thing I notice is that this patch ensures that rq_rbuffer gets set
>>>>> up in rpc_malloc and xprt_rdma_allocate, but it looks like
>>>>> xprt_alloc_bc_req didn't get the same treatment.
>>>>>
>>>>> I suspect that that may be the problem...
>>>>>
>>>> In fact, maybe we just need this here? (untested and probably
>>>> whitespace damaged):
>>>
>>> No change in results for me.
>>>
>>> --b.
>>>>
>>>>
>>>> diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
>>>> index ac701c28f44f..c561aa8ce05b 100644
>>>> --- a/net/sunrpc/backchannel_rqst.c
>>>> +++ b/net/sunrpc/backchannel_rqst.c
>>>> @@ -100,6 +100,7 @@ struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt, gfp_t gfp_flags)
>>>> goto out_free;
>>>> }
>>>> req->rq_rcv_buf.len = PAGE_SIZE;
>>>> + req->rq_rbuffer = req->rq_rcv_buf.head[0].iov_base;
>>>>
>>>> /* Preallocate one XDR send buffer */
>>>> if (xprt_alloc_xdr_buf(&req->rq_snd_buf, gfp_flags) < 0) {
>>
>> Ahh ok, I think I see.
>>
>> We probably also need to set rq_rbuffer in bc_malloc and
>> xprt_rdma_bc_allocate.
>>
>> My guess is that we're ending up in rpc_xdr_encode with a NULL
>> rq_rbuffer pointer, so the right fix would seem to be to ensure that it
>> is properly set whenever rq_buffer is set.
>>
>> So I think this may be what we want, actually. I'll plan to test it out
>> but may not get to it before tomorrow.
>
> It passes here.

Without Jeff's patch, my server locks up during generic/013 with NFS/RDMA
and NFSv4.1. With it, I get all the way to generic/089, and then encounter
this:

Oct 24 17:31:11 klimt kernel: general protection fault: 0000 [#1] SMP
Oct 24 17:31:11 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iTCO_wdt iTCO_vendor_support aesni_intel lrw gf128mul glue_helper ablk_helper cryptd xor raid6_pq rpcrdma pcspkr i2c_i801 lpc_ich ib_ipoib i2c_smbus mfd_core mei_me sg mei rdma_ucm shpchp ioatdma wmi ib_ucm ipmi_si ipmi_msghandler ib_uverbs ib_umad rdma_cm ib_cm iw_cm acpi_pad acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib ib_core mlx4_en sd_mod sr_mod cdrom ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm mlx4_core ahci libahci igb crc32c_intel ptp libata pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
Oct 24 17:31:11 klimt kernel: CPU: 3 PID: 1681 Comm: nfsd Not tainted 4.9.0-rc2-00006-g0e8f985 #2
Oct 24 17:31:11 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
Oct 24 17:31:11 klimt kernel: task: ffff88083feb15c0 task.stack: ffffc900058a0000
Oct 24 17:31:11 klimt kernel: RIP: 0010:[<ffffffff811ed1f0>] [<ffffffff811ed1f0>] __kmalloc_track_caller+0x1b0/0x220
Oct 24 17:31:11 klimt kernel: RSP: 0018:ffffc900058a3b88 EFLAGS: 00010286
Oct 24 17:31:11 klimt kernel: RAX: 0000000000000000 RBX: 00000000024000c0 RCX: 0000000000293b6b
Oct 24 17:31:11 klimt kernel: RDX: 0000000000293b6a RSI: 0000000000000000 RDI: 0000000000000002
Oct 24 17:31:11 klimt kernel: RBP: ffffc900058a3bc8 R08: 000000000001c560 R09: ffff88085f003a40
Oct 24 17:31:11 klimt kernel: R10: ffff88085f003a40 R11: ffff88083fc9b3c0 R12: 00000000024000c0
Oct 24 17:31:11 klimt kernel: R13: 0000000000000018 R14: ffff88081993d600 R15: c0d435750e5f8620
Oct 24 17:31:11 klimt kernel: FS: 0000000000000000(0000) GS:ffff88087fcc0000(0000) knlGS:0000000000000000
Oct 24 17:31:11 klimt kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 24 17:31:11 klimt kernel: CR2: 00007fca9725a298 CR3: 0000000001c06000 CR4: 00000000001406e0
Oct 24 17:31:11 klimt kernel: Stack:
Oct 24 17:31:11 klimt kernel: ffff88085f003a40 ffff88085f003a40 ffffffff812c227b 0000000000000018
Oct 24 17:31:11 klimt kernel: ffff880852ee6b00 ffff88085583a000 ffff88081993d600 ffff880839a95840
Oct 24 17:31:11 klimt kernel: ffffc900058a3be8 ffffffff811a65d0 ffff88081993d600 00000000024000c0
Oct 24 17:31:11 klimt kernel: Call Trace:
Oct 24 17:31:11 klimt kernel: [<ffffffff812c227b>] ? selinux_cred_prepare+0x1b/0x30
Oct 24 17:31:11 klimt kernel: [<ffffffff811a65d0>] kmemdup+0x20/0x50
Oct 24 17:31:11 klimt kernel: [<ffffffff812c227b>] selinux_cred_prepare+0x1b/0x30
Oct 24 17:31:11 klimt kernel: [<ffffffff812bd3f9>] security_prepare_creds+0x39/0x60
Oct 24 17:31:11 klimt kernel: [<ffffffff810a486f>] prepare_creds+0x12f/0x150
Oct 24 17:31:11 klimt kernel: [<ffffffffa043ee6c>] nfsd_setuser+0x8c/0x250 [nfsd]
Oct 24 17:31:11 klimt kernel: [<ffffffffa0438077>] nfsd_setuser_and_check_port+0x77/0xa0 [nfsd]
Oct 24 17:31:11 klimt kernel: [<ffffffff812bd3f9>] ? security_prepare_creds+0x39/0x60
Oct 24 17:31:11 klimt kernel: [<ffffffffa04a33f6>] ? write_bytes_to_xdr_buf+0xa6/0xd0 [sunrpc]
Oct 24 17:31:11 klimt kernel: [<ffffffffa0438dcc>] fh_verify+0x5cc/0x610 [nfsd]
Oct 24 17:31:11 klimt kernel: [<ffffffffa044659b>] nfsd4_getattr+0x1b/0x80 [nfsd]
Oct 24 17:31:11 klimt kernel: [<ffffffffa04481fd>] nfsd4_proc_compound+0x40d/0x690 [nfsd]
Oct 24 17:31:11 klimt kernel: [<ffffffffa04352b4>] nfsd_dispatch+0xd4/0x1d0 [nfsd]
Oct 24 17:31:11 klimt kernel: [<ffffffffa049b249>] svc_process_common+0x3d9/0x700 [sunrpc]
Oct 24 17:31:11 klimt kernel: [<ffffffffa049c0c1>] svc_process+0xf1/0x1d0 [sunrpc]
Oct 24 17:31:11 klimt kernel: [<ffffffffa0434d4f>] nfsd+0xff/0x160 [nfsd]
Oct 24 17:31:11 klimt kernel: [<ffffffffa0434c50>] ? nfsd_destroy+0x60/0x60 [nfsd]
Oct 24 17:31:11 klimt kernel: [<ffffffff810a2235>] kthread+0xe5/0xf0
Oct 24 17:31:11 klimt kernel: [<ffffffff810a2150>] ? kthread_stop+0x120/0x120
Oct 24 17:31:11 klimt kernel: [<ffffffff816aa795>] ret_from_fork+0x25/0x30
Oct 24 17:31:11 klimt kernel: Code: 85 d2 75 dd eb d3 48 89 c3 66 90 eb 6e 81 e3 00 00 10 00 0f 84 e4 fe ff ff 66 90 e9 e7 fe ff ff 49 63 42 20 48 8d 4a 01 4d 8b 02 <49> 8b 1c 07 4c 89 f8 65 49 0f c7 08 0f 94 c0 84 c0 0f 85 3b ff
Oct 24 17:31:11 klimt kernel: RIP [<ffffffff811ed1f0>] __kmalloc_track_caller+0x1b0/0x220
Oct 24 17:31:11 klimt kernel: RSP <ffffc900058a3b88>
Oct 24 17:31:11 klimt kernel: ---[ end trace af2b44f9db7f5a36 ]---

This appears to be unrelated. Maybe.

Bruce, do you have CONFIG_VMAP_STACK set? I took the default, just
wondering if that was an especially safe thing to try.


> --b.
>
>>
>> --
>> Jeff Layton <[email protected]>
>
>> From ef2a391bc4d8f6b729aacee7cde8d9baf86767c3 Mon Sep 17 00:00:00 2001
>> From: Jeff Layton <[email protected]>
>> Date: Mon, 24 Oct 2016 15:13:40 -0400
>> Subject: [PATCH] sunrpc: fix some missing rq_rbuffer assignments
>>
>> I think we basically need to set rq_rbuffer whenever rq_buffer is set.
>>
>> Signed-off-by: Jeff Layton <[email protected]>
>> ---
>> net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 1 +
>> net/sunrpc/xprtsock.c | 1 +
>> 2 files changed, 2 insertions(+)
>>
>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>> index 2d8545c34095..fc4535ead7c2 100644
>> --- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>> +++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
>> @@ -182,6 +182,7 @@ xprt_rdma_bc_allocate(struct rpc_task *task)
>> return -ENOMEM;
>>
>> rqst->rq_buffer = page_address(page);
>> + rqst->rq_rbuffer = (char *)rqst->rq_buffer + rqst->rq_callsize;
>> return 0;
>> }
>>
>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>> index 0137af1c0916..e01c825bc683 100644
>> --- a/net/sunrpc/xprtsock.c
>> +++ b/net/sunrpc/xprtsock.c
>> @@ -2563,6 +2563,7 @@ static int bc_malloc(struct rpc_task *task)
>> buf->len = PAGE_SIZE;
>>
>> rqst->rq_buffer = buf->data;
>> + rqst->rq_rbuffer = (char *)rqst->rq_buffer + rqst->rq_callsize;
>> return 0;
>> }
>>
>> --
>> 2.7.4

--
Chuck Lever




2016-10-25 00:57:04

by Jeff Layton

Subject: Re: upstream server crash

On Mon, 2016-10-24 at 17:38 -0400, Chuck Lever wrote:
> >
> > On Oct 24, 2016, at 4:40 PM, J. Bruce Fields <[email protected]> wrote:
> >
> > On Mon, Oct 24, 2016 at 03:17:34PM -0400, Jeff Layton wrote:
> > >
> > > On Mon, 2016-10-24 at 14:08 -0400, J. Bruce Fields wrote:
> > > >
> > > > On Mon, Oct 24, 2016 at 11:24:40AM -0400, Jeff Layton wrote:
> > > > >
> > > > >
> > > > > On Mon, 2016-10-24 at 11:19 -0400, Jeff Layton wrote:
> > > > > >
> > > > > >
> > > > > > On Mon, 2016-10-24 at 09:51 -0400, Chuck Lever wrote:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Oct 24, 2016, at 9:31 AM, Jeff Layton <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On Mon, 2016-10-24 at 11:15 +0800, Eryu Guan wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sun, Oct 23, 2016 at 02:21:15PM -0400, J. Bruce Fields wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I'm getting an intermittent crash in the nfs server as of
> > > > > > > > > > 68778945e46f143ed7974b427a8065f69a4ce944 "SUNRPC: Separate buffer
> > > > > > > > > > pointers for RPC Call and Reply messages".
> > > > > > > > > >
> > > > > > > > > > I haven't tried to understand that commit or why it would be a problem yet, I
> > > > > > > > > > don't see an obvious connection--I can take a closer look Monday.
> > > > > > > > > >
> > > > > > > > > > Could even be that I just landed on this commit by chance, the problem is a
> > > > > > > > > > little hard to reproduce so I don't completely trust my testing.
> > > > > > > > >
> > > > > > > > > I've hit the same crash on 4.9-rc1 kernel, and it's reproduced for me
> > > > > > > > > reliably by running xfstests generic/013 case, on a loopback mounted
> > > > > > > > > NFSv4.1 (or NFSv4.2), XFS is the underlying exported fs. More details
> > > > > > > > > please see
> > > > > > > > >
> > > > > > > > > http://marc.info/?l=linux-nfs&m=147714320129362&w=2
> > > > > > > > >
> > > > > > > >
> > > > > > > > Looks like you landed at the same commit as Bruce, so that's probably
> > > > > > > > legit. That commit is very small though. The only real change that
> > > > > > > > doesn't affect the new field is this:
> > > > > > > >
> > > > > > > >
> > > > > > > > @@ -1766,7 +1766,7 @@ rpc_xdr_encode(struct rpc_task *task)
> > > > > > > > req->rq_buffer,
> > > > > > > > req->rq_callsize);
> > > > > > > > xdr_buf_init(&req->rq_rcv_buf,
> > > > > > > > - (char *)req->rq_buffer + req->rq_callsize,
> > > > > > > > + req->rq_rbuffer,
> > > > > > > > req->rq_rcvsize);
> > > > > > > >
> > > > > > > >
> > > > > > > > So I'm guessing this is breaking the callback channel somehow?
> > > > > > >
> > > > > > > Could be the TCP backchannel code is using rq_buffer in a different
> > > > > > > way than RDMA backchannel or the forward channel code.
> > > > > > >
> > > > > >
> > > > > > Well, it basically allocates a page per rpc_rqst and then maps that.
> > > > > >
> > > > > > One thing I notice is that this patch ensures that rq_rbuffer gets set
> > > > > > up in rpc_malloc and xprt_rdma_allocate, but it looks like
> > > > > > xprt_alloc_bc_req didn't get the same treatment.
> > > > > >
> > > > > > I suspect that that may be the problem...
> > > > > >
> > > > > In fact, maybe we just need this here? (untested and probably
> > > > > whitespace damaged):
> > > >
> > > > No change in results for me.
> > > >
> > > > --b.
> > > > >
> > > > >
> > > > >
> > > > > diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
> > > > > index ac701c28f44f..c561aa8ce05b 100644
> > > > > --- a/net/sunrpc/backchannel_rqst.c
> > > > > +++ b/net/sunrpc/backchannel_rqst.c
> > > > > @@ -100,6 +100,7 @@ struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt, gfp_t gfp_flags)
> > > > > goto out_free;
> > > > > }
> > > > > req->rq_rcv_buf.len = PAGE_SIZE;
> > > > > + req->rq_rbuffer = req->rq_rcv_buf.head[0].iov_base;
> > > > >
> > > > > /* Preallocate one XDR send buffer */
> > > > > if (xprt_alloc_xdr_buf(&req->rq_snd_buf, gfp_flags) < 0) {
> > >
> > > Ahh ok, I think I see.
> > >
> > > We probably also need to set rq_rbuffer in bc_malloc and
> > > xprt_rdma_bc_allocate.
> > >
> > > My guess is that we're ending up in rpc_xdr_encode with a NULL
> > > rq_rbuffer pointer, so the right fix would seem to be to ensure that it
> > > is properly set whenever rq_buffer is set.
> > >
> > > So I think this may be what we want, actually. I'll plan to test it out
> > > but may not get to it before tomorrow.
> >
> > It passes here.
>
> Without Jeff's patch, my server locks up during generic/013 with NFS/RDMA
> and NFSv4.1. With it, I get all the way to generic/089, and then encounter
> this:
>

Thanks for testing. I just sent an "official" patch submission with the
same patch.


> Oct 24 17:31:11 klimt kernel: general protection fault: 0000 [#1] SMP
> Oct 24 17:31:11 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iTCO_wdt iTCO_vendor_support aesni_intel lrw gf128mul glue_helper ablk_helper cryptd xor raid6_pq rpcrdma pcspkr i2c_i801 lpc_ich ib_ipoib i2c_smbus mfd_core mei_me sg mei rdma_ucm shpchp ioatdma wmi ib_ucm ipmi_si ipmi_msghandler ib_uverbs ib_umad rdma_cm ib_cm iw_cm acpi_pad acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib ib_core mlx4_en sd_mod sr_mod cdrom ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm mlx4_core ahci libahci igb crc32c_intel ptp libata pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
> Oct 24 17:31:11 klimt kernel: CPU: 3 PID: 1681 Comm: nfsd Not tainted 4.9.0-rc2-00006-g0e8f985 #2
> Oct 24 17:31:11 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
> Oct 24 17:31:11 klimt kernel: task: ffff88083feb15c0 task.stack: ffffc900058a0000
> Oct 24 17:31:11 klimt kernel: RIP: 0010:[<ffffffff811ed1f0>] [<ffffffff811ed1f0>] __kmalloc_track_caller+0x1b0/0x220
> Oct 24 17:31:11 klimt kernel: RSP: 0018:ffffc900058a3b88 EFLAGS: 00010286
> Oct 24 17:31:11 klimt kernel: RAX: 0000000000000000 RBX: 00000000024000c0 RCX: 0000000000293b6b
> Oct 24 17:31:11 klimt kernel: RDX: 0000000000293b6a RSI: 0000000000000000 RDI: 0000000000000002
> Oct 24 17:31:11 klimt kernel: RBP: ffffc900058a3bc8 R08: 000000000001c560 R09: ffff88085f003a40
> Oct 24 17:31:11 klimt kernel: R10: ffff88085f003a40 R11: ffff88083fc9b3c0 R12: 00000000024000c0
> Oct 24 17:31:11 klimt kernel: R13: 0000000000000018 R14: ffff88081993d600 R15: c0d435750e5f8620
> Oct 24 17:31:11 klimt kernel: FS: 0000000000000000(0000) GS:ffff88087fcc0000(0000) knlGS:0000000000000000
> Oct 24 17:31:11 klimt kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Oct 24 17:31:11 klimt kernel: CR2: 00007fca9725a298 CR3: 0000000001c06000 CR4: 00000000001406e0
> Oct 24 17:31:11 klimt kernel: Stack:
> Oct 24 17:31:11 klimt kernel: ffff88085f003a40 ffff88085f003a40 ffffffff812c227b 0000000000000018
> Oct 24 17:31:11 klimt kernel: ffff880852ee6b00 ffff88085583a000 ffff88081993d600 ffff880839a95840
> Oct 24 17:31:11 klimt kernel: ffffc900058a3be8 ffffffff811a65d0 ffff88081993d600 00000000024000c0
> Oct 24 17:31:11 klimt kernel: Call Trace:
> Oct 24 17:31:11 klimt kernel: [<ffffffff812c227b>] ? selinux_cred_prepare+0x1b/0x30
> Oct 24 17:31:11 klimt kernel: [<ffffffff811a65d0>] kmemdup+0x20/0x50
> Oct 24 17:31:11 klimt kernel: [<ffffffff812c227b>] selinux_cred_prepare+0x1b/0x30
> Oct 24 17:31:11 klimt kernel: [<ffffffff812bd3f9>] security_prepare_creds+0x39/0x60
> Oct 24 17:31:11 klimt kernel: [<ffffffff810a486f>] prepare_creds+0x12f/0x150
> Oct 24 17:31:11 klimt kernel: [<ffffffffa043ee6c>] nfsd_setuser+0x8c/0x250 [nfsd]
> Oct 24 17:31:11 klimt kernel: [<ffffffffa0438077>] nfsd_setuser_and_check_port+0x77/0xa0 [nfsd]
> Oct 24 17:31:11 klimt kernel: [<ffffffff812bd3f9>] ? security_prepare_creds+0x39/0x60
> Oct 24 17:31:11 klimt kernel: [<ffffffffa04a33f6>] ? write_bytes_to_xdr_buf+0xa6/0xd0 [sunrpc]
> Oct 24 17:31:11 klimt kernel: [<ffffffffa0438dcc>] fh_verify+0x5cc/0x610 [nfsd]
> Oct 24 17:31:11 klimt kernel: [<ffffffffa044659b>] nfsd4_getattr+0x1b/0x80 [nfsd]
> Oct 24 17:31:11 klimt kernel: [<ffffffffa04481fd>] nfsd4_proc_compound+0x40d/0x690 [nfsd]
> Oct 24 17:31:11 klimt kernel: [<ffffffffa04352b4>] nfsd_dispatch+0xd4/0x1d0 [nfsd]
> Oct 24 17:31:11 klimt kernel: [<ffffffffa049b249>] svc_process_common+0x3d9/0x700 [sunrpc]
> Oct 24 17:31:11 klimt kernel: [<ffffffffa049c0c1>] svc_process+0xf1/0x1d0 [sunrpc]
> Oct 24 17:31:11 klimt kernel: [<ffffffffa0434d4f>] nfsd+0xff/0x160 [nfsd]
> Oct 24 17:31:11 klimt kernel: [<ffffffffa0434c50>] ? nfsd_destroy+0x60/0x60 [nfsd]
> Oct 24 17:31:11 klimt kernel: [<ffffffff810a2235>] kthread+0xe5/0xf0
> Oct 24 17:31:11 klimt kernel: [<ffffffff810a2150>] ? kthread_stop+0x120/0x120
> Oct 24 17:31:11 klimt kernel: [<ffffffff816aa795>] ret_from_fork+0x25/0x30
> Oct 24 17:31:11 klimt kernel: Code: 85 d2 75 dd eb d3 48 89 c3 66 90 eb 6e 81 e3 00 00 10 00 0f 84 e4 fe ff ff 66 90 e9 e7 fe ff ff 49 63 42 20 48 8d 4a 01 4d 8b 02 <49> 8b 1c 07 4c 89 f8 65 49 0f c7 08 0f 94 c0 84 c0 0f 85 3b ff
> Oct 24 17:31:11 klimt kernel: RIP [<ffffffff811ed1f0>] __kmalloc_track_caller+0x1b0/0x220
> Oct 24 17:31:11 klimt kernel: RSP <ffffc900058a3b88>
> Oct 24 17:31:11 klimt kernel: ---[ end trace af2b44f9db7f5a36 ]---
>
> This appears to be unrelated. Maybe.
>
> Bruce, do you have CONFIG_VMAP_STACK set? I took the default, just
> wondering if that was an especially safe thing to try.
>
>

Yeah, looks unrelated to me too. It died down in the kmalloc code, so
I'd have to wonder if there is some slab corruption or something?

Chuck, can you do something like this on the kernel where you saw this
panic?

    $ gdb vmlinux
    gdb> list *(__kmalloc_track_caller+0x1b0)

It'd be good to see where this crashed.
--
Jeff Layton <[email protected]>

2016-10-25 01:00:49

by Chuck Lever

Subject: Re: upstream server crash


> On Oct 24, 2016, at 8:57 PM, Jeff Layton <[email protected]> wrote:
>
> [...]
>
>
> Yeah, looks unrelated to me too. It died down in the kmalloc code, so
> I'd have to wonder if there is some slab corruption or something?
>
> Chuck, can you do something like this on the kernel where you saw this
> panic?
>
> $ gdb vmlinux
> gdb> list *(__kmalloc_track_caller+0x1b0)
>
> It'd be good to see where this crashed.

[cel@manet 4.9.0-rc2-00012-gf3e49aa]$ gdb vmlinux
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/lib/modules/4.9.0-rc2-00012-gf3e49aa/vmlinux...done.
(gdb) list *(__kmalloc_track_caller+0x1b0)
0xffffffff811ecee0 is in __kmalloc_track_caller (/home/cel/src/linux/linux-2.6/mm/slub.c:241).
236 * Core slab cache functions
237 *******************************************************************/
238
239 static inline void *get_freepointer(struct kmem_cache *s, void *object)
240 {
241 return *(void **)(object + s->offset);
242 }
243
244 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
245 {
(gdb)

--
Chuck Lever




2016-10-25 01:46:08

by Jeff Layton

[permalink] [raw]
Subject: Re: upstream server crash

On Mon, 2016-10-24 at 21:00 -0400, Chuck Lever wrote:
> > [...]
> >
> > Yeah, looks unrelated to me too. It died down in the kmalloc code, so
> > I'd have to wonder if there is some slab corruption or something?
> >
> > Chuck, can you do something like this on the kernel where you saw this
> > panic?
> >
> > $ gdb vmlinux
> > gdb> list *(__kmalloc_track_caller+0x1b0)
> >
> > It'd be good to see where this crashed.
>
> [cel@manet 4.9.0-rc2-00012-gf3e49aa]$ gdb vmlinux
> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /usr/lib/modules/4.9.0-rc2-00012-gf3e49aa/vmlinux...done.
> (gdb) list *(__kmalloc_track_caller+0x1b0)
> 0xffffffff811ecee0 is in __kmalloc_track_caller (/home/cel/src/linux/linux-2.6/mm/slub.c:241).
> 236 * Core slab cache functions
> 237 *******************************************************************/
> 238
> 239 static inline void *get_freepointer(struct kmem_cache *s, void *object)
> 240 {
> 241 return *(void **)(object + s->offset);
> 242 }
> 243
> 244 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
> 245 {
> (gdb)
>
> --
> Chuck Lever
>
>
>

Thanks -- I'm guessing that means that "s" is bogus there, so
yeah...slab corruption.

I hit some panics in the same function about a year or so ago when I
was developing some fsnotify patches, and it turned out to be a double-
free. Wonder what's sharing that slab?
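
For what it's worth, there is a way to answer that: with SLUB, merged
caches are visible as aliases under /sys/kernel/slab, and the slabinfo
tool in the kernel tree can list them. A sketch, assuming SLUB and
sysfs are available:

    $ cd tools/vm && make slabinfo
    $ ./slabinfo -a

Booting with slub_nomerge would also rule out cross-cache sharing
entirely.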

--
Jeff Layton <[email protected]>

2016-10-25 02:02:17

by Chuck Lever

[permalink] [raw]
Subject: Re: upstream server crash


> On Oct 24, 2016, at 9:46 PM, Jeff Layton <[email protected]> wrote:
>
> [...]
>
> Thanks -- I'm guessing that means that "s" is bogus there, so
> yeah...slab corruption.
>
> I hit some panics in the same function about a year or so ago when I
> was developing some fsnotify patches, and it turned out to be double-
> free. Wonder what's sharing that slab?

Could be anything: that's called from kmemdup(), so it's going to be
one of the generic slabs.
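
One way to confirm that theory, assuming SLUB, would be to boot with
full slab debugging so a stray write to a freed object is caught at
the offending free or the next allocation, with the slab named:

    slub_debug=FZPU

(or build with CONFIG_SLUB_DEBUG_ON=y, which should be roughly
equivalent). Poisoning in particular turns a silent use-after-free
into an explicit "Poison overwritten" report.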

I've disabled CONFIG_VMAP_STACK and SELinux, and so far everything is
stable with NFSv4.1 on RDMA and v4.9-rc2.


--
Chuck Lever




2016-10-28 01:20:55

by Chuck Lever

[permalink] [raw]
Subject: Re: upstream server crash


> On Oct 24, 2016, at 10:02 PM, Chuck Lever <[email protected]> wrote:
>
>>
>> On Oct 24, 2016, at 9:46 PM, Jeff Layton <[email protected]> wrote:
>> [...]
>>
>> Thanks -- I'm guessing that means that "s" is bogus there, so
>> yeah...slab corruption.
>>
>> I hit some panics in the same function about a year or so ago when I
>> was developing some fsnotify patches, and it turned out to be double-
>> free. Wonder what's sharing that slab?
>
> Could be anything: that's called from kmemdup(), so it's going to be
> one of the generic slabs.
>
> I've disabled CONFIG_VMAP_STACK and SELinux, and so far everything is
> stable with NFSv4.1 on RDMA and v4.9-rc2.

Just hit this on the server while running xfstests generic/089 on
NFSv4.0 / RDMA. Still v4.9-rc2 with a few NFS/RDMA patches, but
no kernel debugging enabled yet.

Oct 27 21:08:42 klimt kernel: general protection fault: 0000 [#1] SMP
Oct 27 21:08:42 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp btrfs kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel xor lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support raid6_pq pcspkr lpc_ich i2c_i801 mfd_core i2c_smbus mei_me mei rpcrdma sg ipmi_si shpchp ioatdma wmi ipmi_msghandler ib_ipoib acpi_pad acpi_power_meter rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib ib_core mlx4_en sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm igb libahci libata mlx4_core ptp crc32c_intel pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
Oct 27 21:08:42 klimt kernel: CPU: 3 PID: 1649 Comm: nfsd Not tainted 4.9.0-rc2-00004-ga75a35c #3
Oct 27 21:08:42 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
Oct 27 21:08:42 klimt kernel: task: ffff880841474140 task.stack: ffff880841798000
Oct 27 21:08:42 klimt kernel: RIP: 0010:[<ffffffff811e9a99>] [<ffffffff811e9a99>] kmem_cache_alloc+0x149/0x1b0
Oct 27 21:08:42 klimt kernel: RSP: 0018:ffff88084179bc98 EFLAGS: 00010282
Oct 27 21:08:42 klimt kernel: RAX: 0000000000000000 RBX: 00000000024000c0 RCX: 00000000095755fa
Oct 27 21:08:42 klimt kernel: RDX: 00000000095755f9 RSI: 00000000024000c0 RDI: ffff88085f007400
Oct 27 21:08:42 klimt kernel: RBP: ffff88084179bcc8 R08: 000000000001ce30 R09: ffff8808416a1070
Oct 27 21:08:42 klimt kernel: R10: 0000000000000003 R11: ffff8808416a0220 R12: 00000000024000c0
Oct 27 21:08:42 klimt kernel: R13: e748f37c723b66c0 R14: ffff88085f007400 R15: ffff88085f007400
Oct 27 21:08:42 klimt kernel: FS: 0000000000000000(0000) GS:ffff88087fcc0000(0000) knlGS:0000000000000000
Oct 27 21:08:42 klimt kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 27 21:08:42 klimt kernel: CR2: 00007f6822890000 CR3: 0000000001c06000 CR4: 00000000001406e0
Oct 27 21:08:42 klimt kernel: Stack:
Oct 27 21:08:42 klimt kernel: ffffffff810a4456 0000000011270000 ffff880841474140 ffff880841484000
Oct 27 21:08:42 klimt kernel: 0000000000000000 ffff88084cbc4a00 ffff88084179bce8 ffffffff810a4456
Oct 27 21:08:42 klimt kernel: 0000000011270000 ffff8808416a1068 ffff88084179bd58 ffffffffa04c09ed
Oct 27 21:08:42 klimt kernel: Call Trace:
Oct 27 21:08:42 klimt kernel: [<ffffffff810a4456>] ? prepare_creds+0x26/0x150
Oct 27 21:08:42 klimt kernel: [<ffffffff810a4456>] prepare_creds+0x26/0x150
Oct 27 21:08:42 klimt kernel: [<ffffffffa04c09ed>] fh_verify+0x1ed/0x610 [nfsd]
Oct 27 21:08:42 klimt kernel: [<ffffffffa04ce6f9>] nfsd4_putfh+0x49/0x50 [nfsd]
Oct 27 21:08:42 klimt kernel: [<ffffffffa04d01fd>] nfsd4_proc_compound+0x40d/0x690 [nfsd]
Oct 27 21:08:42 klimt kernel: [<ffffffffa04bd2b4>] nfsd_dispatch+0xd4/0x1d0 [nfsd]
Oct 27 21:08:42 klimt kernel: [<ffffffffa0430249>] svc_process_common+0x3d9/0x700 [sunrpc]
Oct 27 21:08:42 klimt kernel: [<ffffffffa04310c1>] svc_process+0xf1/0x1d0 [sunrpc]
Oct 27 21:08:42 klimt kernel: [<ffffffffa04bcd4f>] nfsd+0xff/0x160 [nfsd]
Oct 27 21:08:42 klimt kernel: [<ffffffffa04bcc50>] ? nfsd_destroy+0x60/0x60 [nfsd]
Oct 27 21:08:42 klimt kernel: [<ffffffff810a1f25>] kthread+0xe5/0xf0
Oct 27 21:08:42 klimt kernel: [<ffffffff810a1e40>] ? kthread_stop+0x120/0x120
Oct 27 21:08:42 klimt kernel: [<ffffffff816aa795>] ret_from_fork+0x25/0x30
Oct 27 21:08:42 klimt kernel: Code: d0 41 ff d2 4d 8b 55 00 4d 85 d2 75 dc eb d1 81 e3 00 00 10 00 0f 84 0a ff ff ff e9 0f ff ff ff 49 63 47 20 48 8d 4a 01 4d 8b 07 <49> 8b 5c 05 00 4c 89 e8 65 49 0f c7 08 0f 94 c0 84 c0 0f 85 45
Oct 27 21:08:42 klimt kernel: RIP [<ffffffff811e9a99>] kmem_cache_alloc+0x149/0x1b0
Oct 27 21:08:42 klimt kernel: RSP <ffff88084179bc98>
Oct 27 21:08:42 klimt kernel: ---[ end trace 0bf398a5b035df79 ]---

Looks rather similar:

(gdb) list *(kmem_cache_alloc+0x149)
0xffffffff811e9a99 is in kmem_cache_alloc (/home/cel/src/linux/linux-2.6/mm/slub.c:241).
236 * Core slab cache functions
237 *******************************************************************/
238
239 static inline void *get_freepointer(struct kmem_cache *s, void *object)
240 {
241 return *(void **)(object + s->offset);
242 }
243
244 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
245 {
(gdb)
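
(Worth noting: in both oopses the faulting instruction is that
freelist load, and the register that appears to hold the object
pointer, R15 in the earlier trace and R13 here, contains a
non-canonical value. That is what you'd expect if the free pointer
stored inside a freed object had been overwritten by a stray write.
Conceptually, this is the load that blows up:

    /* SLUB keeps the next-free pointer inside each free object at
     * s->offset; if something scribbles on a freed object, the next
     * allocation from that cache loads garbage here and faults */
    void *next = *(void **)(object + s->offset);

so the corruption itself likely happened earlier, in whatever wrote
to the freed object.)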


--
Chuck Lever




2016-10-28 20:50:57

by J. Bruce Fields

[permalink] [raw]
Subject: Re: upstream server crash

On Thu, Oct 27, 2016 at 09:20:41PM -0400, Chuck Lever wrote:
> Just hit this on the server while running xfstests generic/089 on
> NFSv4.0 / RDMA. Still v4.9-rc2 with a few NFS/RDMA patches, but
> no kernel debugging enabled yet.

Weird, I wouldn't even know where to start. It's not even obvious that
it's an NFS or RDMA bug at all.
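
If it keeps reproducing, building the server kernel with

    CONFIG_KASAN=y

(or booting with slub_debug=FZPU, along the lines of the slab
debugging suggested earlier in the thread) might at least catch the
corrupting access and name the slab involved.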

--b.

>
> [...]

2016-10-28 21:45:11

by Chuck Lever

[permalink] [raw]
Subject: Re: upstream server crash


> On Oct 24, 2016, at 3:17 PM, Jeff Layton <[email protected]> wrote:
>
> [...]
>
> Ahh ok, I think I see.
>
> We probably also need to set rq_rbuffer in bc_malloc and
> xprt_rdma_bc_allocate.
>
> My guess is that we're ending up in rpc_xdr_encode with a NULL
> rq_rbuffer pointer, so the right fix would seem to be to ensure that it
> is properly set whenever rq_buffer is set.
>
> So I think this may be what we want, actually. I'll plan to test it out
> but may not get to it before tomorrow.
>
> --
> Jeff Layton <[email protected]><0001-sunrpc-fix-some-missing-rq_rbuffer-assignments.patch>

This may not be working as well as I thought (at least for NFS/RDMA).
xprt_rdma_bc_send_request releases the page allocated by
xprt_rdma_bc_allocate before the reply arrives. call_decode
then tries to dereference rq_rbuffer, but that's now a pointer
to freed memory.
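
If that's what's happening, one possible direction (an untested
sketch; it assumes rq_rbuffer points into the page set up by
xprt_rdma_bc_allocate) would be to stop freeing the buffer in the
send path and reclaim it from the ->buf_free callback instead, after
call_decode is done with it:

    /* untested sketch: leave the send buffer alone in
     * xprt_rdma_bc_send_request, and release it only when the
     * backchannel RPC is torn down */
    static void xprt_rdma_bc_free(struct rpc_task *task)
    {
    	struct rpc_rqst *rqst = task->tk_rqstp;

    	free_page((unsigned long)rqst->rq_buffer);
    }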


--
Chuck Lever