From: Jiexun Wang
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, falcon@tinylab.org
Subject: BUG: swap: significant latency caused by folio_batch_move_lru()
Date: Sat, 23 Sep 2023 22:24:20 +0800
Message-Id: <20230923142420.120397-1-wangjiexun@tinylab.org>

Hi everyone,

I found that folio_batch_move_lru() causes significant latency under memory
pressure. I tried to add cond_resched() to it, but got a BUG message:

  BUG: sleeping function called from invalid context at mm/swap.c

I tested on a LicheePi 4A board, using cyclictest for latency measurement and
ftrace for latency tracing. The board uses the TH1520 processor and has 8GB of
memory. The kernel version is 6.5.0 with the PREEMPT_RT patch applied.

The tracing results of the wakeup_rt tracer are:

# tracer: wakeup_rt
#
# wakeup_rt latency trace v1.1.5 on 6.5.0-rt6-r1208-00004-gca3876fc69a6-dirty
# --------------------------------------------------------------------
# latency: 1689 us, #6/6, CPU#0 | (M:preempt_rt VP:0, KP:0, SP:0 HP:0 #P:4)
#    -----------------
#    | task: cyclictest-217 (uid:0 nice:0 policy:1 rt_prio:99)
#    -----------------
#
#                    _--------=> CPU#
#                  / _-------=> irqs-off/BH-disabled
#                 | / _------=> need-resched
#                 || / _-----=> need-resched-lazy
#                 ||| / _----=> hardirq/softirq
#                 |||| / _---=> preempt-depth
#                 ||||| / _--=> preempt-lazy-depth
#                 |||||| / _-=> migrate-disable
#                 ||||||| /     delay
#  cmd      pid   |||||||| time    |   caller
#     \    /      ||||||||   \     |    /
stress-n-232      0dn.h413    1us+:      232:120:R   + [000]   217:  0:R cyclictest
stress-n-232      0dn.h413   12us :
 => __ftrace_trace_stack
 => __trace_stack
 => probe_wakeup
 => ttwu_do_activate
 => try_to_wake_up
 => wake_up_process
 => hrtimer_wakeup
 => __hrtimer_run_queues
 => hrtimer_interrupt
 => riscv_timer_interrupt
 => handle_percpu_devid_irq
 => generic_handle_domain_irq
 => riscv_intc_irq
 => handle_riscv_irq
 => do_irq
stress-n-232      0dn.h413   19us#: 0
stress-n-232      0d...3.. 1671us : __schedule
stress-n-232      0d...3.. 1676us+:      232:120:R ==> [000]   217:  0:R cyclictest
stress-n-232      0d...3.. 1687us :
 => __ftrace_trace_stack
 => __trace_stack
 => probe_wakeup_sched_switch
 => __schedule
 => preempt_schedule
 => migrate_enable
 => free_unref_page_list
 => release_pages
 => free_pages_and_swap_cache
 => tlb_batch_pages_flush
 => tlb_flush_mmu
 => unmap_page_range
 => unmap_vmas
 => unmap_region
 => do_vmi_align_munmap.constprop.0
 => do_vmi_munmap
 => __vm_munmap
 => sys_munmap
 => do_trap_ecall_u
 => ret_from_exception
I obtained more detailed information with the display-graph option:

 ...
 1168 us | 0) stress--204 | .n..... |              | unmap_region() {
 1171 us | 0) stress--204 | .n..... |              |   lru_add_drain() {
 1175 us | 0) stress--204 | .n..1.. |              |     lru_add_drain_cpu() {
 1184 us | 0) stress--204 | .n..1.. |              |       folio_batch_move_lru() {
 1187 us | 0) stress--204 | .n..1.. |              |         folio_lruvec_lock_irqsave() {
 1190 us | 0) stress--204 | .n..1.. |   3.000 us   |           _raw_spin_lock_irqsave();
 1196 us | 0) stress--204 | dn..2.. |   8.667 us   |         }
 1200 us | 0) stress--204 | dn..2.. |              |         lru_add_fn() {
 1204 us | 0) stress--204 | dn..3.. |   2.667 us   |           folio_mapping();
 1213 us | 0) stress--204 | dn..2.. |              |           __mod_lruvec_state() {
 1215 us | 0) stress--204 | dn..2.. |   2.667 us   |             __mod_node_page_state();
 1221 us | 0) stress--204 | dn..2.. |              |             __mod_memcg_lruvec_state() {
 1224 us | 0) stress--204 | dn..2.. |   2.334 us   |               cgroup_rstat_updated();
 1229 us | 0) stress--204 | dn..2.. |   9.000 us   |             }
 1232 us | 0) stress--204 | dn..2.. | + 19.000 us  |           }
 1234 us | 0) stress--204 | dn..2.. |   2.667 us   |           __mod_zone_page_state();
 1239 us | 0) stress--204 | dn..2.. |   4.334 us   |           mem_cgroup_update_lru_size();
 1246 us | 0) stress--204 | dn..2.. | + 46.333 us  |         }
 1249 us | 0) stress--204 | dn..2.. |              |         lru_add_fn() {
 1252 us | 0) stress--204 | dn..3.. |   2.333 us   |           folio_mapping();
 1258 us | 0) stress--204 | dn..2.. |              |           __mod_lruvec_state() {
 1261 us | 0) stress--204 | dn..2.. |   2.333 us   |             __mod_node_page_state();
 1266 us | 0) stress--204 | dn..2.. |              |             __mod_memcg_lruvec_state() {
 1270 us | 0) stress--204 | dn..2.. |   2.333 us   |               cgroup_rstat_updated();
 1276 us | 0) stress--204 | dn..2.. | + 10.333 us  |             }
 1278 us | 0) stress--204 | dn..2.. | + 20.000 us  |           }
 1281 us | 0) stress--204 | dn..2.. |   4.000 us   |           __mod_zone_page_state();
 1287 us | 0) stress--204 | dn..2.. |   4.000 us   |           mem_cgroup_update_lru_size();
 ...
 1696 us | 0) stress--204 | dn..2.. |              |         lru_add_fn() {
 1700 us | 0) stress--204 | dn..3.. |   2.666 us   |           folio_mapping();
 1706 us | 0) stress--204 | dn..2.. |              |           __mod_lruvec_state() {
 1710 us | 0) stress--204 | dn..2.. |   2.667 us   |             __mod_node_page_state();
 1716 us | 0) stress--204 | dn..2.. |              |             __mod_memcg_lruvec_state() {
 1720 us | 0) stress--204 | dn..2.. |   2.334 us   |               cgroup_rstat_updated();
 1726 us | 0) stress--204 | dn..2.. |   9.667 us   |             }
 1728 us | 0) stress--204 | dn..2.. | + 22.000 us  |           }
 1731 us | 0) stress--204 | dn..2.. |   2.666 us   |           __mod_zone_page_state();
 1736 us | 0) stress--204 | dn..2.. |   2.333 us   |           mem_cgroup_update_lru_size();
 1740 us | 0) stress--204 | dn..2.. | + 44.666 us  |         }
 1744 us | 0) stress--204 | dn..2.. |   3.333 us   |         _raw_spin_unlock_irqrestore();
 1750 us | 0) stress--204 | .n..1.. |              |         release_pages() {
 1756 us | 0) stress--204 | .n..1.. |   3.667 us   |           __mem_cgroup_uncharge_list();
 1762 us | 0) stress--204 | .n..1.. |   2.667 us   |           free_unref_page_list();
 1769 us | 0) stress--204 | .n..1.. | + 18.667 us  |         }
 1772 us | 0) stress--204 | .n..1.. | ! 588.000 us |       }
 1777 us | 0) stress--204 | .n..1.. | ! 600.000 us |     }
 1780 us | 0) stress--204 | .n..... |   2.667 us   |     mlock_drain_local();
 1785 us | 0) stress--204 | .n..... | ! 614.000 us |   }
 ...

I attempted to drop the lock and call cond_resched() within the loop:

@@ -217,6 +217,12 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 		move_fn(lruvec, folio);
 		folio_set_lru(folio);
+
+		if (i % 4 == 0) {
+			if (lruvec)
+				unlock_page_lruvec_irqrestore(lruvec, flags);
+			cond_resched();
+			lruvec = folio_lruvec_relock_irqsave(folio, lruvec, &flags);
+		}
 	}
 
 	if (lruvec)

Then I received BUG messages continuously:

  BUG: sleeping function called from invalid context at ...

I noticed that locks are also taken in the callers of this function. I
attempted to release those as well, but still received the same BUG message.
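For reference, my understanding of the relevant code (paraphrased and heavily
simplified from my reading of mm/swap.c on v6.5, so the snippet below is only
a sketch and not the exact source) is that the whole batch is processed inside
a single lruvec critical section, and the drain path wraps it in a per-CPU
local lock:

/* Simplified paraphrase of mm/swap.c (v6.5) as I read it; not the exact code. */
static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
{
	struct lruvec *lruvec = NULL;
	unsigned long flags = 0;
	int i;

	for (i = 0; i < folio_batch_count(fbatch); i++) {
		struct folio *folio = fbatch->folios[i];

		/* takes (or re-takes) the lruvec lock with IRQs saved */
		lruvec = folio_lruvec_relock_irqsave(folio, lruvec, &flags);
		move_fn(lruvec, folio);		/* e.g. lru_add_fn(), as in the trace */
		folio_set_lru(folio);
	}
	/* the lock is only dropped after the whole batch has been processed */
	if (lruvec)
		unlock_page_lruvec_irqrestore(lruvec, flags);
	release_pages(fbatch->folios, folio_batch_count(fbatch));
	folio_batch_reinit(fbatch);
}

/* ...and the caller drains the per-CPU batches under a local lock */
void lru_add_drain(void)
{
	local_lock(&cpu_fbatches.lock);
	lru_add_drain_cpu(smp_processor_id());
	local_unlock(&cpu_fbatches.lock);
	mlock_drain_local();
}

If that reading is correct, it matches the graph above: every lru_add_fn() in
the batch runs between the single _raw_spin_lock_irqsave() and
_raw_spin_unlock_irqrestore(), which is where most of the ~588 us spent in
folio_batch_move_lru() goes.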
Is the body of this function inherently non-preemptible? Or is there any way to
reduce the latency it causes?

Best regards,
Jiexun Wang