From: Chris Clayton
Subject: Re: linux-5.13.2: warning from kernel/rcu/tree_plugin.h:359
Date: Thu, 22 Jul 2021 08:30:10 +0100
To: Zhouyi Zhou, Oleksandr Natalenko
Cc: Matthew Wilcox, Miaohe Lin, Boqun Feng, paulmck@kernel.org, linux-kernel, stable@vger.kernel.org, Chris Rankin, Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu, Andrew Morton, Linux-MM, "Huang, Ying", Greg KH
References: <5812280.fcLxn8YiTP@natalenko.name> <2237123.PRLUojbHBq@natalenko.name>
List-ID: linux-kernel@vger.kernel.org

On 19/07/2021 17:56, Zhouyi Zhou wrote:
> Attached are my kernel configuration file (config-5.13.2) and dmesg (dmesg.txt).
> My qemu box's memory layout is:
>
>                total        used        free      shared  buff/cache   available
> Mem:            3915          92        3764           0          58        3676
> Swap:           2044          39        2005
>
> The memory-exhausting program:
>
> #include <stdlib.h>
> #include <string.h>
>
> int main()
> {
>         while (1)
>         {
>                 void *ptr = malloc(1024*1024);
>                 memset(ptr, 0, 1024*1024);
>         }
>         return 0;
> }

I'm not sure what I'm supposed to make of this. Is it that my laptop simply ran out of memory? That would be odd because, as I said earlier, I was simply doing a weekly archive of my system. That involved using the dar backup utility to create the incremental backup (which in this instance resulted in a single archive file), compressing that archive with xz, and then copying (again, one at a time) the compressed archive and a directory containing about 7000 source and binary rpm files from an internal HDD to an external HDD connected via USB.

My laptop, which I've had for a little over 3 months, has 32GB of memory. I have been using dar for a few years now (and the version currently installed since May) and have had no trouble with it. Similarly, I've used xz to compress the backup archives for years, and the version currently installed since December 2020. The (incremental) archive file is about 7GB, and 1.2GB when compressed. If serially copying files, the largest of which is 1.2GB, can exhaust 32GB of RAM, then we are all up the creek. So I don't see where in my normal archiving process the memory exhaustion would arise (but I'm happy to be educated on this).

Or is the memory exhaustion likely to be a side effect of the fact that two patches applied in 5.13.2 are missing a prerequisite? If that's the case, there seems to be no disagreement on whether the two patches (which had not been tagged for stable) should be reverted or the missing prerequisite should be applied (along with another, related patch that was in the patch set). Perhaps the stable and mm teams are resolving this issue behind the scenes, but in the meantime I have backported the missing patches to 5.13.4 and 5.10.52 and am currently running the former as my default kernel. I also have a patch that reverts the two patches that were applied to stable, so I could run with that applied if it would be more helpful. It would, of course, leave open the races that the patches are designed to close, but if I've manually run swapoff more than once or twice in the twenty years I've been using Linux-based systems, I'd be very surprised.
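Having stared at the two splats for a bit, here is my (layman's, quite possibly imperfect) reading of what lockdep is objecting to. The fragment below is an illustration only, not actual kernel source - the mutex is a hypothetical stand-in I've named after cache->alloc_lock from the trace:

    /*
     * Illustration only, not kernel code. In 5.13.2 the backported
     * 2799e77529c2a makes do_swap_page() call get_swap_device(), which
     * (without the prerequisite 63d8620ecf93) enters an RCU read-side
     * critical section. The swapin path under it can then both schedule
     * (the rcu_note_context_switch warning) and take a sleeping lock
     * (the "Invalid wait context" splat), roughly like this:
     */
    #include <linux/mutex.h>
    #include <linux/rcupdate.h>

    static DEFINE_MUTEX(alloc_lock);        /* stand-in for cache->alloc_lock */

    static void invalid_nesting_sketch(void)
    {
            rcu_read_lock();                /* old-style get_swap_device() */

            /* reclaim: get_swap_page() takes a {3:3} mutex while the
             * {1:2} rcu_read_lock is held -> invalid wait context */
            mutex_lock(&alloc_lock);
            mutex_unlock(&alloc_lock);

            rcu_read_unlock();              /* put_swap_device() */
    }

If that reading is right, it would also explain why the warning only shows up under memory pressure: the illegal nesting is only reached when the fault path has to push pages out to swap.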
> On Tue, Jul 20, 2021 at 12:47 AM Zhouyi Zhou wrote:
>>
>> I downloaded linux-5.13.2, configured and compiled the kernel with
>> CONFIG_LOCKDEP=y
>> CONFIG_LOCK_DEBUGGING_SUPPORT=y
>> CONFIG_PROVE_LOCKING=y
>> CONFIG_DEBUG_SPINLOCK=y
>> CONFIG_DEBUG_LOCK_ALLOC=y
>> installed the kernel on a qemu box, then ran the following C program:
>>
>> #include <stdlib.h>
>>
>> int main()
>> {
>>         while (1)
>>         {
>>                 malloc(1024*1024);
>>         }
>>         return 0;
>> }
>>
>> And the following is the dmesg:
>>
>> [ 96.155017] ------------[ cut here ]------------
>> [ 96.155030] WARNING: CPU: 10 PID: 770 at kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x91/0x610
>> [ 96.155074] Modules linked in: ppdev intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel bochs_drm aesni_intel drm_vram_helper evdev crypto_simd drm_ttm_helper snd_pcm cryptd ttm snd_timer parport_pc serio_raw drm_kms_helper sg snd parport soundcore drm pcspkr button ip_tables autofs4 psmouse sr_mod i2c_piix4 sd_mod crc32c_intel t10_pi cdrom i2c_core e1000 ata_generic floppy
>> [ 96.155180] CPU: 10 PID: 770 Comm: containerd Not tainted 5.13.2 #1
>> [ 96.155185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
>> [ 96.155189] RIP: 0010:rcu_note_context_switch+0x91/0x610
>> [ 96.155195] Code: ff 74 0f 65 8b 05 a7 10 e5 7e 85 c0 0f 84 22 01 00 00 45 84 ed 75 15 65 48 8b 04 25 80 7f 01 00 8b b0 44 03 00 00 85 f6 7e 02 <0f> 0b 65 48 8b 04 25 80 7f 01 00 8b 88 44 03 00 00 85 c9 7e 0f 41
>> [ 96.155200] RSP: 0000:ffffc90000a8bc50 EFLAGS: 00010002
>> [ 96.155204] RAX: ffff88810830c300 RBX: ffff88813bcae680 RCX: 0000000000000000
>> [ 96.155208] RDX: 0000000000000002 RSI: 0000000000000001 RDI: 0000000000000001
>> [ 96.155210] RBP: ffffc90000a8bcd0 R08: 0000000000000001 R09: 0000000000000001
>> [ 96.155213] R10: 0000000000000000 R11: ffffffff81319b93 R12: ffff88810830c300
>> [ 96.155216] R13: 0000000000000000 R14: ffff88813bcad958 R15: 0000000000004970
>> [ 96.155220] FS: 00007fc09cff9700(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
>> [ 96.155223] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 96.155226] CR2: 00005594d2ff0530 CR3: 0000000103be4001 CR4: 0000000000060ee0
>> [ 96.155233] Call Trace:
>> [ 96.155243]  __schedule+0xda/0xa30
>> [ 96.155283]  schedule+0x46/0xf0
>> [ 96.155289]  io_schedule+0x12/0x40
>> [ 96.155295]  __lock_page_or_retry+0x1f9/0x510
>> [ 96.155319]  ? __page_cache_alloc+0x140/0x140
>> [ 96.155345]  do_swap_page+0x33f/0x930
>> [ 96.155364]  __handle_mm_fault+0xa54/0x1550
>> [ 96.155390]  handle_mm_fault+0x17f/0x420
>> [ 96.155400]  do_user_addr_fault+0x1be/0x770
>> [ 96.155422]  exc_page_fault+0x69/0x280
>> [ 96.155435]  ? asm_exc_page_fault+0x8/0x30
>> [ 96.155443]  asm_exc_page_fault+0x1e/0x30
>> [ 96.155448] RIP: 0033:0x5594d15ec98f
>> [ 96.155454] Code: 44 24 58 48 85 c0 48 b9 00 e4 0b 54 02 00 00 00 48 0f 44 c1 48 89 44 24 58 eb 05 48 8b 44 24 58 48 89 04 24 e8 e2 c9 9c ff 90 <48> 8b 05 9a 3b a0 01 48 8b 4c 24 50 48 89 0c 24 48 8d 15 9a c4 ec
>> [ 96.155457] RSP: 002b:000000c0001f7f80 EFLAGS: 00010206
>> [ 96.155462] RAX: 0000000000000000 RBX: 00005594d0f9cf55 RCX: fffffffffffffff8
>> [ 96.155465] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005594d0fa2d01
>> [ 96.155467] RBP: 000000c0001f7fc0 R08: 0000000000000000 R09: 0000000000000000
>> [ 96.155470] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000004
>> [ 96.155472] R13: 0000000000000013 R14: 00005594d1ee96e6 R15: 0000000000000039
>> [ 96.155495] irq event stamp: 10464
>> [ 96.155497] hardirqs last enabled at (10463): [] _raw_spin_unlock_irq+0x24/0x50
>> [ 96.155508] hardirqs last disabled at (10464): [] __schedule+0x412/0xa30
>> [ 96.155512] softirqs last enabled at (8668): [] __do_softirq+0x401/0x51b
>> [ 96.155517] softirqs last disabled at (8657): [] irq_exit_rcu+0x142/0x150
>> [ 96.155531] ---[ end trace 165ff31fd86ffc12 ]---
>>
>> [ 96.177669] =============================
>> [ 96.177693] [ BUG: Invalid wait context ]
>> [ 96.177717] 5.13.2 #1 Tainted: G        W
>> [ 96.177743] -----------------------------
>> [ 96.177765] containerd/770 is trying to lock:
>> [ 96.177790] ffff88813ba69b30 (&cache->alloc_lock){+.+.}-{3:3}, at: get_swap_page+0x126/0x200
>> [ 96.177867] other info that might help us debug this:
>> [ 96.177894] context-{4:4}
>> [ 96.177910] 3 locks held by containerd/770:
>> [ 96.177934]  #0: ffff88810815ea28 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x115/0x770
>> [ 96.177999]  #1: ffffffff82915020 (rcu_read_lock){....}-{1:2}, at: get_swap_device+0x33/0x140
>> [ 96.178057]  #2: ffffffff82955ba0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
>> [ 96.178115] stack backtrace:
>> [ 96.178133] CPU: 1 PID: 770 Comm: containerd Tainted: G        W  5.13.2 #1
>> [ 96.178183] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
>> [ 96.178254] Call Trace:
>> [ 96.178274]  dump_stack+0x82/0xa4
>> [ 96.178314]  __lock_acquire+0x9a5/0x20a0
>> [ 96.178347]  ? find_held_lock+0x3a/0xb0
>> [ 96.178381]  lock_acquire+0xe9/0x320
>> [ 96.178405]  ? get_swap_page+0x126/0x200
>> [ 96.178433]  ? _raw_spin_unlock+0x29/0x40
>> [ 96.178461]  ? page_vma_mapped_walk+0x3a3/0x960
>> [ 96.178491]  __mutex_lock+0x99/0x980
>> [ 96.178515]  ? get_swap_page+0x126/0x200
>> [ 96.178541]  ? find_held_lock+0x3a/0xb0
>> [ 96.178577]  ? get_swap_page+0x126/0x200
>> [ 96.178603]  ? page_referenced+0xd5/0x170
>> [ 96.178637]  ? lock_release+0x1b4/0x300
>> [ 96.178663]  ? get_swap_page+0x126/0x200
>> [ 96.178698]  get_swap_page+0x126/0x200
>> [ 96.178725]  add_to_swap+0x14/0x60
>> [ 96.178749]  shrink_page_list+0xb13/0xe70
>> [ 96.178787]  shrink_inactive_list+0x243/0x550
>> [ 96.178819]  shrink_lruvec+0x4fd/0x780
>> [ 96.178849]  ? shrink_node+0x257/0x7c0
>> [ 96.178873]  shrink_node+0x257/0x7c0
>> [ 96.178900]  do_try_to_free_pages+0xdd/0x410
>> [ 96.178929]  try_to_free_pages+0x110/0x300
>> [ 96.178966]  __alloc_pages_slowpath.constprop.126+0x2ae/0xfa0
>> [ 96.179002]  ? lock_release+0x1b4/0x300
>> [ 96.179028]  __alloc_pages+0x37d/0x400
>> [ 96.179054]  alloc_pages_vma+0x73/0x1d0
>> [ 96.179878]  __read_swap_cache_async+0xb8/0x280
>> [ 96.180684]  swap_cluster_readahead+0x194/0x270
>> [ 96.181459]  ? swapin_readahead+0x62/0x530
>> [ 96.182008]  swapin_readahead+0x62/0x530
>> [ 96.182558]  ? find_held_lock+0x3a/0xb0
>> [ 96.183109]  ? lookup_swap_cache+0x5c/0x1c0
>> [ 96.183657]  ? lock_release+0x1b4/0x300
>> [ 96.184207]  ? do_swap_page+0x232/0x930
>> [ 96.184753]  do_swap_page+0x232/0x930
>> [ 96.185326]  __handle_mm_fault+0xa54/0x1550
>> [ 96.185850]  handle_mm_fault+0x17f/0x420
>> [ 96.186361]  do_user_addr_fault+0x1be/0x770
>> [ 96.186880]  exc_page_fault+0x69/0x280
>> [ 96.187382]  ? asm_exc_page_fault+0x8/0x30
>> [ 96.187879]  asm_exc_page_fault+0x1e/0x30
>> [ 96.188363] RIP: 0033:0x5594d0f78da4
>> [ 96.188829] Code: cc cc cc cc cc cc 48 8b 0d 39 d2 01 02 64 48 8b 09 48 3b 61 10 76 3d 48 83 ec 28 48 89 6c 24 20 48 8d 6c 24 20 48 8b 44 24 30 <48> 8b 08 48 89 0c 24 48 89 44 24 08 c6 44 24 10 01 e8 76 f4 ff ff
>> [ 96.189894] RSP: 002b:000000c0001f7de8 EFLAGS: 00010216
>> [ 96.190406] RAX: 00005594d2308160 RBX: 0000000000000000 RCX: 000000c0004ea480
>> [ 96.190902] RDX: 000000c0002a4270 RSI: 0000000000000010 RDI: 0000000000000011
>> [ 96.191394] RBP: 000000c0001f7e08 R08: 0000000000000002 R09: 0000000000000011
>> [ 96.191891] R10: 00005594d22f6ce0 R11: 00005594d1ee96e4 R12: ffffffffffffffff
>> [ 96.192396] R13: 0000000000000028 R14: 0000000000000027 R15: 0000000000000200
>> [ 115.344546] exaust invoked oom-killer: gfp_mask=0x100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
>> [ 115.346019] CPU: 1 PID: 969 Comm: exaust Tainted: G        W  5.13.2 #1
>> [ 115.346569] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
>> [ 115.347712] Call Trace:
>> [ 115.348287]  dump_stack+0x82/0xa4
>> [ 115.348867]  dump_header+0x55/0x3f0
>> [ 115.349491]  oom_kill_process+0x160/0x210
>> [ 115.350068]  out_of_memory+0x10b/0x630
>> [ 115.350646]  __alloc_pages_slowpath.constprop.126+0xec7/0xfa0
>> [ 115.351255]  __alloc_pages+0x37d/0x400
>> [ 115.351861]  alloc_pages_vma+0x73/0x1d0
>> [ 115.352449]  __handle_mm_fault+0xe1b/0x1550
>> [ 115.353042]  handle_mm_fault+0x17f/0x420
>> [ 115.353646]  do_user_addr_fault+0x1be/0x770
>> [ 115.354239]  exc_page_fault+0x69/0x280
>> [ 115.354833]  ? asm_exc_page_fault+0x8/0x30
>> [ 115.355428]  asm_exc_page_fault+0x1e/0x30
>> [ 115.356021] RIP: 0033:0x7fe8ee633543
>> [ 115.356617] Code: Unable to access opcode bytes at RIP 0x7fe8ee633519.
>> [ 115.357245] RSP: 002b:00007fff742c68c8 EFLAGS: 00010206
>> [ 115.357857] RAX: 00007fe798041010 RBX: 0000000000000000 RCX: 00007fe7980fd000
>> [ 115.358469] RDX: 00007fe798141000 RSI: 0000000000000000 RDI: 00007fe798041010
>> [ 115.359077] RBP: 00007fff742c68e0 R08: 00000000ffffffff R09: 0000000000000000
>> [ 115.359686] R10: 0000000000000022 R11: 0000000000000246 R12: 0000561abe3fa060
>> [ 115.360300] R13: 00007fff742c69c0 R14: 0000000000000000 R15: 0000000000000000
>> [ 115.361313] Mem-Info:
>> [ 115.362285] active_anon:188386 inactive_anon:764572 isolated_anon:64
>>                active_file:61 inactive_file:0 isolated_file:0
>>                unevictable:0 dirty:0 writeback:2
>>                slab_reclaimable:6676 slab_unreclaimable:6200
>>                mapped:155 shmem:180 pagetables:3124 bounce:0
>>                free:25647 free_pcp:0 free_cma:0
>> [ 115.366889] Node 0 active_anon:782092kB inactive_anon:3029744kB active_file:244kB inactive_file:0kB unevictable:0kB isolated(anon):256kB isolated(file):0kB mapped:620kB dirty:0kB writeback:8kB shmem:720kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:3904kB pagetables:12496kB all_unreclaimable? no
>> [ 115.369378] Node 0 DMA free:15296kB min:260kB low:324kB high:388kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> [ 115.372121] lowmem_reserve[]: 0 2925 3874 3874
>> [ 115.373195] Node 0 DMA32 free:54436kB min:50824kB low:63528kB high:76232kB reserved_highatomic:0KB active_anon:819596kB inactive_anon:2135980kB active_file:48kB inactive_file:24kB unevictable:0kB writepending:8kB present:3129212kB managed:3021488kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> [ 115.375958] lowmem_reserve[]: 0 0 949 949
>> [ 115.377064] Node 0 Normal free:32856kB min:32880kB low:37004kB high:41128kB reserved_highatomic:0KB active_anon:2232kB inactive_anon:853384kB active_file:80kB inactive_file:76kB unevictable:0kB writepending:0kB present:1048576kB managed:972400kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> [ 115.380050] lowmem_reserve[]: 0 0 0 0
>> [ 115.380908] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB (M) = 15296kB
>> [ 115.382728] Node 0 DMA32: 86*4kB (UM) 52*8kB (UME) 33*16kB (UM) 28*32kB (UME) 11*64kB (U) 2*128kB (UE) 0*256kB 1*512kB (U) 2*1024kB (UM) 2*2048kB (ME) 11*4096kB (M) = 54856kB
>> [ 115.384560] Node 0 Normal: 597*4kB (UME) 904*8kB (UME) 414*16kB (UME) 187*32kB (UME) 64*64kB (UME) 19*128kB (UME) 3*256kB (U) 3*512kB (UE) 2*1024kB (M) 0*2048kB 0*4096kB = 33108kB
>> [ 115.386558] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
>> [ 115.387543] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
>> [ 115.388503] 61467 total pagecache pages
>> [ 115.389532] 61242 pages in swap cache
>> [ 115.390450] Swap cache stats: add 523756, delete 462538, find 69/171
>> [ 115.391383] Free swap = 0kB
>> [ 115.392312] Total swap = 2094076kB
>> [ 115.393506] 1048445 pages RAM
>> [ 115.394538] 0 pages HighMem/MovableOnly
>> [ 115.395514] 46133 pages reserved
>> [ 115.396525] 0 pages hwpoisoned
>> [ 115.397824] Tasks state (memory values in pages):
>> [ 115.398778] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
>> [ 115.399792] [    312]     0   312     6603      217    98304      181             0 systemd-journal
>> [ 115.400749] [    339]     0   339     5562        1    69632      395         -1000 systemd-udevd
>> [ 115.401869] [    379]   101   379    23271       17    81920      197             0 systemd-timesyn
>> [ 115.402873] [    388]   107   388     1707        7    53248      101             0 rpcbind
>> [ 115.403869] [    479]     0   479     2373       39    53248      284             0 dhclient
>> [ 115.404849] [    485]     0   485     2120        8    53248       43             0 cron
>> [ 115.406291] [    487]     0   487     4869       44    77824      197             0 systemd-logind
>> [ 115.407536] [    494]   104   494     2247       58    57344       80          -900 dbus-daemon
>> [ 115.408772] [    501]     0   501    56457        0    86016      226             0 rsyslogd
>> [ 115.410041] [    503]     0   503   315093      769   311296     3943             0 containerd
>> [ 115.411289] [    507]     0   507     1404        2    45056       26             0 agetty
>> [ 115.412563] [    521]     0   521    27588      235   110592     1678             0 unattended-upgr
>> [ 115.413834] [    860]     0   860     3964       28    73728      187         -1000 sshd
>> [ 115.415062] [    866]   106   866     5015       19    81920      187             0 exim4
>> [ 115.416286] [    901]     0   901     4233        1    73728      281             0 sshd
>> [ 115.417534] [    928]     0   928     5287       57    73728      280             0 systemd
>> [ 115.418755] [    929]     0   929     5710       77    90112      488             0 (sd-pam)
>> [ 115.419921] [    943]     0   943     2013        1    53248      407             0 bash
>> [ 115.421061] [    969]     0   969  1402762   888995 11276288   513120             0 exaust
>> [ 115.421973] [    970]     0   970     4233      272    77824        5             0 sshd
>> [ 115.422840] [    976]     0   976     1980      297    53248       64             0 bash
>> [ 115.423664] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/session-1.scope,task=exaust,pid=969,uid=0
>> [ 115.425344] Out of memory: Killed process 969 (exaust) total-vm:5611048kB, anon-rss:3555976kB, file-rss:4kB, shmem-rss:0kB, UID:0 pgtables:11012kB oom_score_adj:0
>> [ 115.912696] oom_reaper: reaped process 969 (exaust), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
>>
>> I can tell from the above that:
>> [ 96.177999]  #1: ffffffff82915020 (rcu_read_lock){....}-{1:2}, at: get_swap_device+0x33/0x140
>> i.e. get_swap_device did hold the rcu_read_lock.
>>
>> Thanks
>> Zhouyi
>>
>> On Mon, Jul 19, 2021 at 8:23 PM Oleksandr Natalenko wrote:
>>>
>>> On Monday 19 July 2021 14:16:04 CEST Matthew Wilcox wrote:
>>>> On Mon, Jul 19, 2021 at 02:12:15PM +0200, Oleksandr Natalenko wrote:
>>>>> On Monday 19 July 2021 14:08:37 CEST Miaohe Lin wrote:
>>>>>> On 2021/7/19 19:59, Oleksandr Natalenko wrote:
>>>>>>> On Monday 19 July 2021 13:50:07 CEST Miaohe Lin wrote:
>>>>>>>> On 2021/7/19 19:22, Matthew Wilcox wrote:
>>>>>>>>> On Mon, Jul 19, 2021 at 07:12:58PM +0800, Miaohe Lin wrote:
>>>>>>>>>> In commit 2799e77529c2a, we're using the percpu_ref to serialize against concurrent swapoff, i.e. there's a percpu_ref inside get_swap_device() instead of rcu_read_lock(). Please see commit 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against concurrent swapoff") for detail.
>>>>>>>>>
>>>>>>>>> Oh, so this is a backport problem. 2799e77529c2 was backported without its prerequisite 63d8620ecf93. Greg, probably best to just drop
>>>>>>>>
>>>>>>>> Yes, they're posted as a patch set:
>>>>>>>>
>>>>>>>> https://lkml.kernel.org/r/20210426123316.806267-1-linmiaohe@huawei.com
>>>>>>>>
>>>>>>>>> 2799e77529c2 from all stable trees; the race described is not very important (swapoff vs reading a page back from that swap device).
>>>>>>>>
>>>>>>>> The swapoff races with reading a page back from that swap device should be really uncommon, as most users only do swapoff when the system is going to shut down.
>>>>>>>>
>>>>>>>> Sorry for the trouble!
>>>>>>>
>>>>>>> git log --oneline v5.13..v5.13.3 --author="Miaohe Lin"
>>>>>>> 11ebc09e50dc mm/zswap.c: fix two bugs in zswap_writeback_entry()
>>>>>>> 95d192da198d mm/z3fold: use release_z3fold_page_locked() to release locked z3fold page
>>>>>>> ccb7848e2344 mm/z3fold: fix potential memory leak in z3fold_destroy_pool()
>>>>>>> 9f7229c901c1 mm/huge_memory.c: don't discard hugepage if other processes are mapping it
>>>>>>> f13259175e4f mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()
>>>>>>> afafd371e7de mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK
>>>>>>> a533a21b692f mm/shmem: fix shmem_swapin() race with swapoff
>>>>>>> c3b39134bbd0 swap: fix do_swap_page() race with swapoff
>>>>>>>
>>>>>>> Do you suggest reverting "mm/shmem: fix shmem_swapin() race with swapoff" as well?
>>>>>>
>>>>>> This patch also relies on its prerequisite 63d8620ecf93. I think we should either revert the commits from this series or just backport the entire series.
>>>>>
>>>>> Then why not just pick up 2 more patches instead of dropping 2 patches? Greg, could you please make sure the whole series from [1] gets pulled?
>>>>
>>>> Because none of these patches should have been backported in the first place. It's just not worth the destabilisation.
>>>
>>> What about the rest then?
>>>
>>> git log --oneline v5.13..v5.13.3 -- mm/ | wc -l
>>> 18
>>>
>>> Those look to be fixes, these ones too.
>>>
>>> --
>>> Oleksandr Natalenko (post-factum)
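
P.S. For completeness, my understanding of what the missing prerequisite actually changes: 63d8620ecf93 ("mm/swapfile: use percpu_ref to serialize against concurrent swapoff") replaces the RCU pinning in get_swap_device() with a percpu_ref, and that is what makes it legal for callers to block while they hold the device. The sketch below is paraphrased from memory and heavily simplified - it is not the exact kernel source:

    #include <linux/percpu-refcount.h>
    #include <linux/rcupdate.h>
    #include <linux/swap.h>

    /* Before 63d8620ecf93 (what 5.13.2 stable still ships): the device is
     * only pinned by an RCU read-side section, so the caller must not
     * sleep until put_swap_device(). */
    static struct swap_info_struct *get_swap_device_before(swp_entry_t entry)
    {
            struct swap_info_struct *si = swp_swap_info(entry);

            if (!si)
                    return NULL;
            rcu_read_lock();        /* pins si against a concurrent swapoff */
            /* ... entry validity checks elided ... */
            return si;
    }

    /* After 63d8620ecf93: a live percpu_ref pins the device instead, so
     * the caller may block, e.g. on the swap-slots cache mutex. */
    static struct swap_info_struct *get_swap_device_after(swp_entry_t entry)
    {
            struct swap_info_struct *si = swp_swap_info(entry);

            if (!si || !percpu_ref_tryget_live(&si->users))
                    return NULL;
            /* ... entry validity checks elided ... */
            return si;
    }

With only the "before" variant in the tree, the get_swap_device() call that 2799e77529c2a adds to do_swap_page() leaves the whole swapin path inside an RCU section, which would match both of the warnings above.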