2016-06-17 13:38:24

by Benjamin Coddington

Subject: Re: slab-out-of-bounds in rpc/nfs

On 16 Jun 2016, at 13:52, Calvin Owens wrote:

> On Tuesday 03/08 at 11:37 +0100, Dmitry Vyukov wrote:
>> On Tue, Mar 8, 2016 at 11:27 AM, Benjamin Coddington
>> <[email protected]> wrote:
>>> Adding [email protected] ..
>>>
>>> On Mon, 7 Mar 2016, Alexei Starovoitov wrote:
>>>
>>>> seeing a ton of these errors on net-next with kasan on.
>>>> Likely old bug though.
>>>>
>>>> [ 373.705691] BUG: KASAN: slab-out-of-bounds in memcpy+0x28/0x40 at addr ffff8811ada62cb0
>>>> [ 373.707137] Write of size 28 by task bash/7059
>>>> [ 373.708177] =============================================================================
>>>> [ 373.709711] BUG kmalloc-4096 (Tainted: G W ): kasan: bad access detected
>>>> [ 373.711185] -----------------------------------------------------------------------------
>>>> [ 373.711185]
>>>> [ 373.721461] INFO: Allocated in rpc_malloc+0x58/0xd0 age=21 cpu=5 pid=7059
>>>> [ 373.727158] ___slab_alloc+0x4e2/0x500
>>>> [ 373.728469] __slab_alloc+0x43/0x70
>>>> [ 373.729222] __kmalloc+0x286/0x350
>>>> [ 373.729978] rpc_malloc+0x58/0xd0
>>>> [ 373.730590] call_allocate+0x333/0x690
>>>> [ 373.731428] __rpc_execute+0x187/0xad0
>>>> [ 373.734395] rpc_execute+0xe1/0x2c0
>>>> [ 373.735020] rpc_run_task+0x1ce/0x250
>>>> [ 373.735706] rpc_call_sync+0x93/0x150
>>>> [ 373.736387] nfs3_rpc_wrapper.constprop.12+0x9b/0x240
>>>> [ 373.742818] nfs3_proc_readdir+0x230/0x390
>>>> [ 373.750157] nfs_readdir_xdr_to_array+0x501/0x9b0
>>>> [ 373.753520] nfs_readdir_filler+0x68/0x160
>>>> [ 373.758455] do_read_cache_page+0x8c/0x3c0
>>>> [ 373.761745] read_cache_page+0x46/0x70
>>>> [ 373.763269] nfs_readdir+0x420/0x1380
>>>> [ 373.764078] INFO: Freed in rpc_free+0x41/0x70 age=64 cpu=5 pid=7059
>>>> [ 373.765335] __slab_free+0x175/0x280
>>>> [ 373.766106] kfree+0x25c/0x2a0
>>>> [ 373.766809] rpc_free+0x41/0x70
>>>> [ 373.767629] xprt_release+0x2c5/0x8f0
>>>> [ 373.768430] rpc_release_resources_task+0x14/0x80
>>>> [ 373.769403] __rpc_execute+0x547/0xad0
>>>> [ 373.770249] rpc_execute+0xe1/0x2c0
>>>> [ 373.770995] rpc_run_task+0x1ce/0x250
>>>> [ 373.771786] rpc_call_sync+0x93/0x150
>>>> [ 373.772672] nfs3_rpc_wrapper.constprop.12+0x9b/0x240
>>>> [ 373.773704] nfs3_proc_access+0x1f1/0x330
>>>> [ 373.774544] nfs_do_access+0x94f/0x12d0
>>>> [ 373.775572] nfs_permission+0x469/0x580
>>>> [ 373.776465] __inode_permission+0x151/0x230
>>>> [ 373.780764] inode_permission+0x21/0xf0
>>>> [ 373.791392] may_open+0x14b/0x260
>>>>
>>
>> The report misses the most interesting part -- the out-of-bounds
>> access stack. It should be at the bottom of the report. If you still
>> have the full report, please post it.
>
> I'm triggering this as well on 4.7-rc3. I can reproduce it as far back as 4.0,
> can't easily test any further back because that's when KASAN was merged.
>
> Logs and Kconfig follow. I can trigger this 100% of the time.

Hi Calvin, how are you triggering this? I would guess this is getdents or a
readdir that's been signaled before the server replies..

Ben


2016-06-17 17:36:59

by Calvin Owens

Subject: Re: slab-out-of-bounds in rpc/nfs

On Friday 06/17 at 09:38 -0400, Benjamin Coddington wrote:
> On 16 Jun 2016, at 13:52, Calvin Owens wrote:
>
> > On Tuesday 03/08 at 11:37 +0100, Dmitry Vyukov wrote:
> > > On Tue, Mar 8, 2016 at 11:27 AM, Benjamin Coddington
> > > <[email protected]> wrote:
> > > > Adding [email protected] ..
> > > >
> > > > On Mon, 7 Mar 2016, Alexei Starovoitov wrote:
> > > >
> > > > > > seeing a ton of these errors on net-next with kasan on.
> > > > > Likely old bug though.
> > > > >
> > > > > > [ 373.705691] BUG: KASAN: slab-out-of-bounds in memcpy+0x28/0x40 at addr ffff8811ada62cb0
> > > > > > [ 373.707137] Write of size 28 by task bash/7059
> > > > > > [ 373.708177] =============================================================================
> > > > > > [ 373.709711] BUG kmalloc-4096 (Tainted: G W ): kasan: bad access detected
> > > > > > [ 373.711185] -----------------------------------------------------------------------------
> > > > > > [ 373.711185]
> > > > > > [ 373.721461] INFO: Allocated in rpc_malloc+0x58/0xd0 age=21 cpu=5 pid=7059
> > > > > [ 373.727158] ___slab_alloc+0x4e2/0x500
> > > > > [ 373.728469] __slab_alloc+0x43/0x70
> > > > > [ 373.729222] __kmalloc+0x286/0x350
> > > > > [ 373.729978] rpc_malloc+0x58/0xd0
> > > > > [ 373.730590] call_allocate+0x333/0x690
> > > > > [ 373.731428] __rpc_execute+0x187/0xad0
> > > > > [ 373.734395] rpc_execute+0xe1/0x2c0
> > > > > [ 373.735020] rpc_run_task+0x1ce/0x250
> > > > > [ 373.735706] rpc_call_sync+0x93/0x150
> > > > > [ 373.736387] nfs3_rpc_wrapper.constprop.12+0x9b/0x240
> > > > > [ 373.742818] nfs3_proc_readdir+0x230/0x390
> > > > > [ 373.750157] nfs_readdir_xdr_to_array+0x501/0x9b0
> > > > > [ 373.753520] nfs_readdir_filler+0x68/0x160
> > > > > [ 373.758455] do_read_cache_page+0x8c/0x3c0
> > > > > [ 373.761745] read_cache_page+0x46/0x70
> > > > > [ 373.763269] nfs_readdir+0x420/0x1380
> > > > > > [ 373.764078] INFO: Freed in rpc_free+0x41/0x70 age=64 cpu=5 pid=7059
> > > > > [ 373.765335] __slab_free+0x175/0x280
> > > > > [ 373.766106] kfree+0x25c/0x2a0
> > > > > [ 373.766809] rpc_free+0x41/0x70
> > > > > [ 373.767629] xprt_release+0x2c5/0x8f0
> > > > > [ 373.768430] rpc_release_resources_task+0x14/0x80
> > > > > [ 373.769403] __rpc_execute+0x547/0xad0
> > > > > [ 373.770249] rpc_execute+0xe1/0x2c0
> > > > > [ 373.770995] rpc_run_task+0x1ce/0x250
> > > > > [ 373.771786] rpc_call_sync+0x93/0x150
> > > > > [ 373.772672] nfs3_rpc_wrapper.constprop.12+0x9b/0x240
> > > > > [ 373.773704] nfs3_proc_access+0x1f1/0x330
> > > > > [ 373.774544] nfs_do_access+0x94f/0x12d0
> > > > > [ 373.775572] nfs_permission+0x469/0x580
> > > > > [ 373.776465] __inode_permission+0x151/0x230
> > > > > [ 373.780764] inode_permission+0x21/0xf0
> > > > > [ 373.791392] may_open+0x14b/0x260
> > > > >
> > >
> > > The report misses the most interesting part -- the out-of-bounds
> > > access stack. It should be at the bottom of the report. If you still
> > > have the full report, please post it.
> >
> > I'm triggering this as well on 4.7-rc3. I can reproduce it as far back as 4.0,
> > can't easily test any further back because that's when KASAN was merged.
> >
> > Logs and Kconfig follow. I can trigger this 100% of the time.
>
> Hi Calvin, how are you triggering this? I would guess this is getdents or a
> readdir that's been signaled before the server replies..

Unfortunately my current repro is "boot a specific server type at Facebook". I'll
drill down and see if I can get a minimal repro to send along.

Thanks,
Calvin
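
For anyone who wants to probe Ben's guess (a getdents/readdir call that is signaled before the server replies), a userspace sketch along these lines might be a starting point. It is not a confirmed reproducer: the helper name `interrupted_listing`, the delay value, and the choice of SIGKILL are all assumptions made for illustration, and client-side readdir caching may mean that most iterations never reach the server at all.

```python
import signal
import subprocess
import sys
import time

# Child program: list the directory given as argv[1] in a tight loop.
# os.listdir() walks the directory via readdir/getdents underneath.
CHILD_CODE = "import os, sys\nwhile True:\n    os.listdir(sys.argv[1])\n"


def interrupted_listing(path, delay=0.05, sig=signal.SIGKILL):
    """Hypothetical helper: list `path` in a child, then signal the child.

    Returns the child's exit status; a negative value means the child
    was terminated by that signal number.
    """
    child = subprocess.Popen([sys.executable, "-c", CHILD_CODE, path])
    time.sleep(delay)       # let the child get into its listing loop
    child.send_signal(sig)  # interrupt it, ideally while a READDIR is in flight
    return child.wait()
```

Calling `interrupted_listing("/path/to/nfs/dir")` repeatedly with different delays, on a KASAN-enabled kernel and against a large or uncached directory, would be one way to test the hypothesis; the path here is a placeholder.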