Date: Mon, 9 Oct 2023 14:43:34 -0700
From: Roman Gushchin
To: Naresh Kamboju
Cc: "open list:KERNEL SELFTEST FRAMEWORK", open list, Cgroups,
 lkft-triage@lists.linaro.org, Lucas Karpinski, Shakeel Butt,
 Johannes Weiner, Michal Hocko, Muchun Song, Shuah Khan, Tejun Heo,
 Zefan Li, Andrew Morton, Mark Brown, Dan Carpenter, David Rientjes,
 Dennis Zhou, Vlastimil Babka
Subject: Re: selftests: cgroup: test_core - Unable to handle kernel NULL pointer dereference at virtual address
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 09, 2023 at 04:08:13PM +0530, Naresh Kamboju wrote:
> On Sun, 8 Oct 2023 at 21:09, Roman Gushchin wrote:
> >
> > On Sun, Oct 08, 2023 at 11:30:52AM +0530, Naresh Kamboju wrote:
> > > While running selftests: cgroup: test_kmem on FVP following kernel crash
> > > noticed on Linux next 6.6.0-rc4-next-20231006.
> >
> > Hi Naresh!
> >
> > Thank you for the report!
> >
> > I've tried to reproduce it, but wasn't successful so far: I've run test_kmem
> > for several hundred times and haven't seen the crash.
>
> If you look at the problematic test case is
> selftests: cgroup: test_core

Ah, got it, and immediately reproduced (and fixed).
Thank you once again for all your effort!

The problem happens because some kernel allocations happen after
mem_cgroup_exit(), which was dropping the reference to task->objcg,
but not zeroing the pointer, so it eventually caused a double-free.

I'm going to post an updated version of my patchset, which introduced
the issue, with the fix merged (and some other minor changes).

Thanks!

>
> I recommend to run
> ./run_kselftest.sh -c cgroup
>
> >
> > Can you, please, provide some additional information?
> > How easy to reproduce it?
>
> It is always reproducible on qemu-arm64, FVP, x86 with a given config.
>
> > Do you see it every time or with some probability?
>
> Always with selftests: cgroup: test_core.
> I generally run all selftests: cgroup: *
>
> > Do you see it on other platforms?
> qemu-arm64, FVP, x86.
>
> > Can you, please, check where exactly the crash happens using addr2line?
>
> I have provided links to build, test logs and steps to reproduce scripts.
>
> Is this an interesting log ?
>
> # selftests: cgroup: test_core
> # ok 1 test_cgcore_internal_process_constraint
> # ok 2 test_cgcore_top_down_constraint_enable
> # ok 3 test_cgcore_top_down_constraint_disable
> # ok 4 test_cgcore_no_internal_process_constraint_on_threads
> # ok 5 test_cgcore_parent_becomes_threaded
> # ok 6 test_cgcore_invalid_domain
> # ok 7 test_cgcore_populated
> # ok 8 test_cgcore_proc_migration
> # ok 9 test_cgcore_thread_migration
> # ok 10 test_cgcore_destroy
> # ok 11 test_cgcore_lesser_euid_open
> # ok 12 test_cgcore_lesser_ns_open
> <1>[ 188.504854] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000000
> <1>[ 188.505724] Mem abort info:
> <1>[ 188.506036] ESR = 0x0000000096000044
> <1>[ 188.507539] EC = 0x25: DABT (current EL), IL = 32 bits
> <1>[ 188.508268] SET = 0, FnV = 0
> <1>[ 188.508895] EA = 0, S1PTW = 0
> <1>[ 188.509517] FSC = 0x04: level 0 translation fault
> <1>[ 188.510279] Data abort info:
> <1>[ 188.511513] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
> <1>[ 188.512757] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> <1>[ 188.513333] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> <1>[ 188.514316] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104934000
> <1>[ 188.516277] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> <0>[ 188.517886] Internal error: Oops: 0000000096000044 [#1] PREEMPT SMP
> <4>[ 188.518838] Modules linked in: crct10dif_ce sm3_ce sm3 sha3_ce
> sha512_ce sha512_arm64 fuse drm backlight dm_mod ip_tables x_tables
> <4>[ 188.521105] CPU: 0 PID: 57 Comm: kworker/0:2 Not tainted
> 6.6.0-rc4-next-20231006 #1
> <4>[ 188.521750] Hardware name: linux,dummy-virt (DT)
> <4>[ 188.522915] Workqueue: cgroup_destroy css_free_rwork_fn
> <4>[ 188.523572] pstate: 03400009 (nzcv daif +PAN -UAO +TCO +DIT
> -SSBS BTYPE=--)
> <4>[ 188.524569] pc : percpu_ref_put_many.constprop.0
> (arch/arm64/include/asm/atomic_lse.h:169 (discriminator 1)
> arch/arm64/include/asm/atomic_lse.h:184 (discriminator 1)
> arch/arm64/include/asm/atomic_lse.h:204 (discriminator 1)
> arch/arm64/include/asm/atomic.h:92 (discriminator 1)
> include/linux/atomic/atomic-arch-fallback.h:2886 (discriminator 1)
> include/linux/atomic/atomic-arch-fallback.h:4351 (discriminator 1)
> include/linux/atomic/atomic-long.h:1543 (discriminator 1)
> include/linux/atomic/atomic-instrumented.h:4486 (discriminator 1)
> include/linux/percpu-refcount.h:334 (discriminator 1))
> <4>[ 188.525451] lr : percpu_ref_put_many.constprop.0
> (include/linux/percpu-refcount.h:174 (discriminator 2)
> include/linux/percpu-refcount.h:332 (discriminator 2))
> <4>[ 188.526230] sp : ffff8000803d3ca0
> <4>[ 188.526902] x29: ffff8000803d3ca0 x28: 0000000000000000 x27:
> 0000000000000000
> <4>[ 188.527976] x26: 0000000000000000 x25: ffff0000c1112e80 x24:
> ffff0000c002da05
> <4>[ 188.528965] x23: 0000000000000030 x22: ffffafd0c976a000 x21:
> ffffafd0c976cb70
> <4>[ 188.529927] x20: ffff0000d67fa000 x19: ffff0000d32f44c0 x18:
> 0000000000000000
> <4>[ 188.531096] x17: 0000000000000000 x16: 0000000000000000 x15:
> 0000000000000000
> <4>[ 188.532244] x14: 0000000000000004 x13: ffffafd0c9787258 x12:
> 0000000000000000
> <4>[ 188.533548] x11: ffff0000c0402e58 x10: ffff0000c0402db0 x9 :
> ffffafd0c6f86a90
> <4>[ 188.535274] x8 : ffff8000803d3b68 x7 : 0000000000000000 x6 :
> 0000000000000001
> <4>[ 188.536318] x5 : ffffafd0c976a000 x4 : ffffafd0c976a288 x3 :
> 0000000000000000
> <4>[ 188.537144] x2 : ffff0000c0acbe00 x1 : 0000000000000000 x0 :
> ffffffffffffffff
> <4>[ 188.538091] Call trace:
> <4>[ 188.538883] percpu_ref_put_many.constprop.0
> (arch/arm64/include/asm/atomic_lse.h:169 (discriminator 1)
> arch/arm64/include/asm/atomic_lse.h:184 (discriminator 1)
> arch/arm64/include/asm/atomic_lse.h:204 (discriminator 1)
> arch/arm64/include/asm/atomic.h:92 (discriminator 1)
> include/linux/atomic/atomic-arch-fallback.h:2886 (discriminator 1)
> include/linux/atomic/atomic-arch-fallback.h:4351 (discriminator 1)
> include/linux/atomic/atomic-long.h:1543 (discriminator 1)
> include/linux/atomic/atomic-instrumented.h:4486 (discriminator 1)
> include/linux/percpu-refcount.h:334 (discriminator 1))
> <4>[ 188.539208] __mem_cgroup_free (include/linux/find.h:203
> include/linux/nodemask.h:266 mm/memcontrol.c:5446)
> <4>[ 188.539559] mem_cgroup_css_free (mm/memcontrol.c:5682)
> <4>[ 188.543093] css_free_rwork_fn (include/linux/spinlock.h:356
> kernel/cgroup/cgroup.c:350 kernel/cgroup/cgroup.c:5377)
> <4>[ 188.544324] process_one_work (kernel/workqueue.c:2635)
> <4>[ 188.545045] worker_thread (kernel/workqueue.c:2697 (discriminator
> 2) kernel/workqueue.c:2784 (discriminator 2))
> <4>[ 188.545710] kthread (kernel/kthread.c:388)
> <4>[ 188.546690] ret_from_fork (arch/arm64/kernel/entry.S:858)
> <0>[ 188.547769] Code: d65f03c0 f9400661 d503201f 92800000 (f8e00020)
> All code
> ========
> 0: d65f03c0 ret
> 4: f9400661 ldr x1, [x19, #8]
> 8: d503201f nop
> c: 92800000 mov x0, #0xffffffffffffffff // #-1
> 10:* f8e00020 ldaddal x0, x0, [x1] <-- trapping instruction
>
> Code starting with the faulting instruction
> ===========================================
> 0: f8e00020 ldaddal x0, x0, [x1]
> <4>[ 188.549059] ---[ end trace 0000000000000000 ]---
> ok 3 selftests: cgroup: test_core
>
> Reported-by: Linux Kernel Functional Testing
> Reported-by: Naresh Kamboju
>
> Links:
> logs: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/naresh/tests/2WWLlrB6hfNAc0btHy8LPxuQefp
>
> Build: https://storage.tuxsuite.com/public/linaro/naresh/builds/2WWGnN6pGnGdueSyID8ZTdS5EVv/
> Config: https://storage.tuxsuite.com/public/linaro/naresh/builds/2WWGnN6pGnGdueSyID8ZTdS5EVv/config
>
> Steps to reproduce:
> - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/naresh/tests/2WWLlrB6hfNAc0btHy8LPxuQefp/reproducer
>
> - Naresh
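
The reply above attributes the crash to mem_cgroup_exit() dropping the
reference held in task->objcg without zeroing the pointer. The following
is only a simplified user-space model of that pattern, not the kernel
code and not the actual fix: struct objcg_model, struct task_model,
model_put(), exit_buggy(), exit_fixed() and late_put() are all
hypothetical names invented for illustration. It assumes some later code
path still looks at the cached pointer after the exit hook has run.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object (not the kernel's obj_cgroup). */
struct objcg_model {
	int refs;     /* current reference count */
	int bad_puts; /* puts that arrived after the count already hit zero */
};

/* Hypothetical stand-in for the per-task state that caches the pointer. */
struct task_model {
	struct objcg_model *objcg;
};

/* Drop one reference; record the event instead of crashing on an extra put. */
void model_put(struct objcg_model *o)
{
	if (o->refs == 0) {
		o->bad_puts++; /* in a real kernel: a put on an already-freed object */
		return;
	}
	o->refs--;
}

/* Buggy exit hook: drops the reference but leaves the stale pointer in place. */
void exit_buggy(struct task_model *t)
{
	if (t->objcg)
		model_put(t->objcg);
}

/* Fixed exit hook: drops the reference and clears the pointer. */
void exit_fixed(struct task_model *t)
{
	if (t->objcg) {
		model_put(t->objcg);
		t->objcg = NULL;
	}
}

/* Assumed late code path that runs after the exit hook and trusts the pointer. */
void late_put(struct task_model *t)
{
	if (t->objcg)
		model_put(t->objcg);
}
```

With exit_buggy() followed by late_put(), the same reference is dropped
twice; with exit_fixed(), the late path sees NULL and does nothing.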
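
The "Mem abort info" lines in the quoted log are a decode of the ESR
value. As a rough sketch of how those fields fall out of
0x0000000096000044, assuming the architectural AArch64 ESR_ELx layout
(EC in bits [31:26], IL in bit 25, ISS in bits [24:0], and for data
aborts WnR in ISS bit 6 and the fault status code in ISS bits [5:0]);
struct esr_fields and esr_decode() are hypothetical helper names, not
kernel functions:

```c
#include <stdint.h>

/* Selected fields of an AArch64 ESR_ELx value. */
struct esr_fields {
	unsigned ec;  /* exception class, bits [31:26] */
	unsigned il;  /* instruction length, bit 25 (1 = 32-bit instruction) */
	uint32_t iss; /* instruction specific syndrome, bits [24:0] */
	unsigned wnr; /* data aborts only: ISS bit 6, 1 = fault on a write */
	unsigned fsc; /* data aborts only: fault status code, ISS bits [5:0] */
};

/* Split an ESR value into the fields listed above. */
struct esr_fields esr_decode(uint64_t esr)
{
	struct esr_fields f;

	f.ec  = (unsigned)((esr >> 26) & 0x3f);
	f.il  = (unsigned)((esr >> 25) & 0x1);
	f.iss = (uint32_t)(esr & 0x1ffffff);
	f.wnr = (unsigned)((f.iss >> 6) & 0x1);
	f.fsc = (unsigned)(f.iss & 0x3f);
	return f;
}
```

For the value in the log this gives EC = 0x25, IL = 1, ISS = 0x44,
WnR = 1 and FSC = 0x04, matching the decoded lines: a write to address 0
that faulted at translation level 0.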