2022-09-12 06:17:22

by Ben Luo

[permalink] [raw]
Subject: [PATCH] mm/slub: return 0 when object pointer is NULL

NULL is definitly not a valid address

Signed-off-by: Ben Luo <[email protected]>
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index 862dbd9..50fad18 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -674,7 +674,7 @@ static inline int check_valid_pointer(struct kmem_cache *s,
void *base;

if (!object)
- return 1;
+ return 0;

base = slab_address(slab);
object = kasan_reset_tag(object);
--
1.8.3.1


2022-09-12 07:24:37

by Hyeonggon Yoo

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: return 0 when object pointer is NULL

On Mon, Sep 12, 2022 at 01:59:39PM +0800, Ben Luo wrote:
> NULL is definitly not a valid address
>
> Signed-off-by: Ben Luo <[email protected]>
> ---
> mm/slub.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 862dbd9..50fad18 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -674,7 +674,7 @@ static inline int check_valid_pointer(struct kmem_cache *s,
> void *base;
>
> if (!object)
> - return 1;
> + return 0;
>
> base = slab_address(slab);
> object = kasan_reset_tag(object);
> --
> 1.8.3.1
>

Hello Ben.

The return value is used to check if the @object has valid pointer
in @slab. (used for debugging) the return value is 0 if valid, 1 if invalid.

It does not return a pointer. So changing it to 0 because 1 is invalid
address does not make sense.

--
Thanks,
Hyeonggon

2022-09-12 08:17:43

by Ben Luo

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: return 0 when object pointer is NULL

Hello Hyeonggon,

Thanks for replying :)

在 2022/9/12 15:18, Hyeonggon Yoo 写道:
> On Mon, Sep 12, 2022 at 01:59:39PM +0800, Ben Luo wrote:
>> NULL is definitly not a valid address
>>
>> Signed-off-by: Ben Luo <[email protected]>
>> ---
>> mm/slub.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 862dbd9..50fad18 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -674,7 +674,7 @@ static inline int check_valid_pointer(struct kmem_cache *s,
>> void *base;
>>
>> if (!object)
>> - return 1;
>> + return 0;
>>
>> base = slab_address(slab);
>> object = kasan_reset_tag(object);
>> --
>> 1.8.3.1
>>
> Hello Ben.
>
> The return value is used to check if the @object has valid pointer
> in @slab. (used for debugging) the return value is 0 if valid, 1 if invalid.
>
> It does not return a pointer. So changing it to 0 because 1 is invalid
> address does not make sense.

I know the meaning of this return value, but I think this function was
expected by returning 0 if invalid ,1 if valid

Check this original code:

        if (object < base || object >= base + slab->objects * s->size ||
                (object - base) % s->size) {
                return 0;
        }

Object not in range of [base, base+length) is an invalid slab address,
and it will return 0


--

Thanks,

Ben

>

2022-09-12 08:42:12

by Hyeonggon Yoo

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: return 0 when object pointer is NULL

On Mon, Sep 12, 2022 at 03:29:30PM +0800, Ben Luo wrote:
> Hello Hyeonggon,
>
> Thanks for replying :)
>
> 在 2022/9/12 15:18, Hyeonggon Yoo 写道:
> > On Mon, Sep 12, 2022 at 01:59:39PM +0800, Ben Luo wrote:
> > > NULL is definitly not a valid address
> > >
> > > Signed-off-by: Ben Luo <[email protected]>
> > > ---
> > > mm/slub.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 862dbd9..50fad18 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -674,7 +674,7 @@ static inline int check_valid_pointer(struct kmem_cache *s,
> > > void *base;
> > > if (!object)
> > > - return 1;
> > > + return 0;
> > > base = slab_address(slab);
> > > object = kasan_reset_tag(object);
> > > --
> > > 1.8.3.1
> > >
> > Hello Ben.
> >
> > The return value is used to check if the @object has valid pointer
> > in @slab. (used for debugging) the return value is 0 if valid, 1 if invalid.
> >
> > It does not return a pointer. So changing it to 0 because 1 is invalid
> > address does not make sense.
>
> I know the meaning of this return value, but I think this function was
> expected by returning 0 if invalid ,1 if valid
>

Ah, right. Sorry for my silly mistake. I misread your patch and the code :D

Hmm then the question is: Why does check_valid_pointer() allow passing NULL?

So I just compiled and ran it with debugging enabled.

And I got:

[ 0.195507] =============================================================================
[ 0.195507] BUG ftrace_event_field (Tainted: G B ): Freechain corrupt
[ 0.195507] -----------------------------------------------------------------------------
[ 0.195507]
[ 0.195507] Slab 0xffffea000400d380 objects=28 used=28 fp=0x0000000000000000 flags=0x17ffffc0000200(slab|node=0|zone=2|lastcpu)
[ 0.195508] Object 0xffff88810034eab8 @offset=2744 fp=0x0000000000000000
[ 0.195508]
[ 0.195508] Redzone ffff88810034eab0: bb bb bb bb bb bb bb bb ........
[ 0.195509] Object ffff88810034eab8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 0.195509] Object ffff88810034eac8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 0.195510] Object ffff88810034ead8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
[ 0.195510] Redzone ffff88810034eae8: bb bb bb bb bb bb bb bb ........
[ 0.195510] Padding ffff88810034eb38: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
[ 0.195511] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B 6.0.0-rc3+ #1765
[ 0.195511] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[ 0.195511] Call Trace:
[ 0.195512] <TASK>
[ 0.195512] dump_stack_lvl+0x45/0x5e
[ 0.195513] object_err+0x30/0x44
[ 0.195513] deactivate_slab.cold+0x15/0x2c
[ 0.195514] ? __trace_early_add_events+0x3e/0x60
[ 0.195515] ? trace_event_init+0xaf/0x2cd
[ 0.195516] ? start_kernel+0x705/0x97d
[ 0.195517] ? check_object+0x138/0x250
[ 0.195518] ? init_object+0x6f/0x90
[ 0.195518] ? trace_define_field+0x50/0xe0
[ 0.195519] ___slab_alloc+0x4a0/0x6c0
[ 0.195520] ? trace_define_field+0x50/0xe0
[ 0.195521] ? trace_create_new_event+0x31/0xf0
[ 0.195522] ? trace_define_field+0x50/0xe0
[ 0.195523] kmem_cache_alloc+0x2a1/0x2f0
[ 0.195524] trace_define_field+0x50/0xe0
[ 0.195525] event_define_fields+0x8a/0xc0
[ 0.195527] __trace_early_add_events+0x3e/0x60
[ 0.195528] trace_event_init+0xaf/0x2cd
[ 0.195529] start_kernel+0x705/0x97d
[ 0.195529] ? x86_family+0x5/0x30
[ 0.195530] secondary_startup_64_no_verify+0xe5/0xeb
[ 0.195532] </TASK>
[ 0.195532] FIX ftrace_event_field: Isolate corrupted freechain

The call chain is:

deactivate_slab()
-> freelist_corrupted()
check_valid_pointer()

Hmm check_valid_pointer() is used to to check if the free pointer is valid.
and as NULL is used to mark that there is no next object, I think
check_valid_pointer() shuold return 1 if object is NULL.

Thanks!

> Check this original code:
>
>         if (object < base || object >= base + slab->objects * s->size ||
>                 (object - base) % s->size) {
>                 return 0;
>         }
>
> Object not in range of [base, base+length) is an invalid slab address, and
> it will return 0
>
>
> --
>
> Thanks,
>
> Ben
>
> >

--
Thanks,
Hyeonggon

2022-09-14 02:56:19

by kernel test robot

[permalink] [raw]
Subject: [mm/slub] fb670abe87: BUG_kmem_cache_node(Not_tainted):Freechain_corrupt

Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: fb670abe87296c7b214b6d9f29e9c7380d8d621c ("[PATCH] mm/slub: return 0 when object pointer is NULL")
url: https://github.com/intel-lab-lkp/linux/commits/Ben-Luo/mm-slub-return-0-when-object-pointer-is-NULL/20220912-140234
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/linux-mm/[email protected]

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+----------------------------------------------------------+------------+------------+
| | 2558c2ced7 | fb670abe87 |
+----------------------------------------------------------+------------+------------+
| BUG_kmem_cache_node(Not_tainted):Freechain_corrupt | 0 | 24 |
| BUG_kmem_cache(Tainted:G_B):Freechain_corrupt | 0 | 24 |
| BUG_kmem_cache_node(Tainted:G_B):Freechain_corrupt | 0 | 24 |
| BUG_kmem_cache_node(Tainted:G_B):Freepointer_corrupt | 0 | 24 |
| BUG_debug_objects_cache(Tainted:G_B):Freechain_corrupt | 0 | 24 |
| BUG_debug_objects_cache(Tainted:G_B):Freepointer_corrupt | 0 | 24 |
| BUG_vmap_area(Tainted:G_B):Freechain_corrupt | 0 | 20 |
| BUG_kmalloc-#(Tainted:G_B):Freechain_corrupt | 0 | 20 |
| BUG_kmalloc-#k(Tainted:G_B):Freechain_corrupt | 0 | 20 |
| BUG_kmalloc-#(Tainted:G_B):Freepointer_corrupt | 0 | 20 |
| BUG_radix_tree_node(Tainted:G_B):Freechain_corrupt | 0 | 14 |
| BUG_pool_workqueue(Tainted:G_B):Freechain_corrupt | 0 | 14 |
| BUG_trace_event_file(Tainted:G_B):Freechain_corrupt | 0 | 13 |
| BUG_ftrace_event_field(Tainted:G_B):Freechain_corrupt | 0 | 13 |
+----------------------------------------------------------+------------+------------+


[ 2.980173][ T0] =============================================================================
[ 2.981207][ T0] BUG kmem_cache_node (Not tainted): Freechain corrupt
[ 2.981954][ T0] -----------------------------------------------------------------------------
[ 2.981954][ T0]
[ 2.983185][ T0] Slab 0xea3fe800 objects=21 used=21 fp=0x00000000 flags=0x200(slab|zone=0)
[ 2.984205][ T0] Object 0xc0100f40 @offset=3904 fp=0x00000000
[ 2.984205][ T0]
[ 2.985174][ T0] Redzone c0100f00: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 2.986268][ T0] Redzone c0100f10: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 2.987368][ T0] Redzone c0100f20: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 2.988420][ T0] Redzone c0100f30: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 2.989412][ T0] Object c0100f40: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 2.990420][ T0] Object c0100f50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 2.991436][ T0] Object c0100f60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 2.992505][ T0] Object c0100f70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
[ 2.993539][ T0] Redzone c0100f80: bb bb bb bb ....
[ 2.994476][ T0] Padding c0100fb0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 2.995639][ T0] CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-rc3-00584-gfb670abe8729 #6
[ 2.996635][ T0] Call Trace:
[ 2.996998][ T0] ? show_stack+0x35/0x3b
[ 2.997511][ T0] dump_stack_lvl+0x55/0x79
[ 2.997986][ T0] dump_stack+0xd/0x10
[ 2.998412][ T0] print_trailer+0x104/0x10c
[ 2.998914][ T0] object_err+0x2b/0x3f
[ 2.999367][ T0] deactivate_slab.cold+0x13/0x2e
[ 2.999925][ T0] ? __kmem_cache_create+0x16/0xa0
[ 3.000519][ T0] ? kmem_cache_init+0x73/0xe9
[ 3.001069][ T0] ? start_kernel+0x1b8/0x413
[ 3.001585][ T0] ? i386_start_kernel+0x43/0x45
[ 3.002122][ T0] ? alloc_debug_processing+0x41/0x150
[ 3.002722][ T0] ? pcpu_alloc+0x734/0x9d0
[ 3.003235][ T0] ___slab_alloc+0x753/0xc00
[ 3.003853][ T0] ? init_kmem_cache_nodes+0x31/0x210
[ 3.004446][ T0] ? __mutex_unlock_slowpath+0x20/0x290
[ 3.005080][ T0] ? init_kmem_cache_nodes+0x31/0x210
[ 3.005676][ T0] ? rcu_read_lock_sched_held+0xe/0x70
[ 3.006282][ T0] ? pcpu_alloc+0x49f/0x9d0
[ 3.006772][ T0] kmem_cache_alloc+0x3b0/0x480
[ 3.007302][ T0] ? init_kmem_cache_nodes+0x31/0x210
[ 3.007874][ T0] init_kmem_cache_nodes+0x31/0x210
[ 3.008436][ T0] kmem_cache_open+0xf6/0x290
[ 3.008933][ T0] ? kmem_cache_open+0x192/0x290
[ 3.009454][ T0] __kmem_cache_create+0x16/0xa0
[ 3.009996][ T0] create_boot_cache+0x63/0x83
[ 3.010526][ T0] kmem_cache_init+0x73/0xe9
[ 3.011186][ T0] start_kernel+0x1b8/0x413
[ 3.011611][ T0] ? idt_setup_early_handler+0x39/0x4c
[ 3.012209][ T0] i386_start_kernel+0x43/0x45
[ 3.012728][ T0] startup_32_smp+0x161/0x164
[ 3.013254][ T0] Disabling lock debugging due to kernel taint
[ 3.013911][ T0] FIX kmem_cache_node: Isolate corrupted freechain
...


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


To reproduce:

# build kernel
cd linux
cp config-6.0.0-rc3-00584-gfb670abe8729 .config
make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.


--
0-DAY CI Kernel Test Service
https://01.org/lkp


Attachments:
(No filename) (6.52 kB)
config-6.0.0-rc3-00584-gfb670abe8729 (131.72 kB)
job-script (4.72 kB)
dmesg.xz (203.20 kB)
Download all attachments