From: "Huang, Ying"
To: Chen Wandun , Andrew Morton
Subject: Re: [RFC PATCH] swapfile: fix soft lockup in scan_swap_map_slots
Date: Mon, 21 Nov 2022 09:53:35 +0800
Message-ID: <87ilj9ul7k.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <20221118132741.aaf6f9081b5a1018cc9a5402@linux-foundation.org> (Andrew Morton's message of "Fri, 18 Nov 2022 13:27:41 -0800")
References: <20221118133850.3360369-1-chenwandun@huawei.com> <20221118132741.aaf6f9081b5a1018cc9a5402@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Andrew Morton writes:

> On Fri, 18 Nov 2022 21:38:50 +0800 Chen Wandun wrote:
>
>> A soft lockup occurs while scanning for a free swap slot under
>> huge memory pressure.
>> The test scenario is: 64 CPU cores, 64GB memory, and 28
>> zram devices; the disksize of each zram device is 50MB.
>>
>> LATENCY_LIMIT is used to prevent soft lockup in function
>> scan_swap_map_slots, but the real number of loop iterations can
>> exceed LATENCY_LIMIT because "goto checks" and "goto scan" can
>> repeat without latency_ration being decremented.
>>
>> In order to fix it, move the latency_ration decrement ahead of
>> the availability check.
>>
>> There is also a suspicious place that may cause soft lockup in
>> function get_swap_pages: in this function, the "goto start_over"
>> may result in continuous scanning of the swap partition, and if
>> there is no cond_resched in scan_swap_map_slots, it would cause
>> soft lockup (I am not sure about this).
>>
>> ...
>>
>
> Looks sensible.

Yes.  LGTM.

Reviewed-by: "Huang, Ying"

>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -972,23 +972,23 @@ static int scan_swap_map_slots(struct swap_info_struct *si,
>>  scan:
>>  	spin_unlock(&si->lock);
>>  	while (++offset <= READ_ONCE(si->highest_bit)) {
>> -		if (swap_offset_available_and_locked(si, offset))
>> -			goto checks;
>>  		if (unlikely(--latency_ration < 0)) {
>>  			cond_resched();
>>  			latency_ration = LATENCY_LIMIT;
>>  			scanned_many = true;
>>  		}
>> +		if (swap_offset_available_and_locked(si, offset))
>> +			goto checks;
>>  	}
>>  	offset = si->lowest_bit;
>>  	while (offset < scan_base) {
>> -		if (swap_offset_available_and_locked(si, offset))
>> -			goto checks;
>>  		if (unlikely(--latency_ration < 0)) {
>>  			cond_resched();
>>  			latency_ration = LATENCY_LIMIT;
>>  			scanned_many = true;
>>  		}
>> +		if (swap_offset_available_and_locked(si, offset))
>> +			goto checks;
>>  		offset++;
>>  	}
>>  	spin_lock(&si->lock);
>
> But this does somewhat alter the `scanned_many' logic.  We'll now set
> `scanned_many' earlier.  What are the effects of this?
>
> The ed43af10975eef7e changelog outlines tests which could be performed
> to ensure we aren't regressing from this.

As I understand it, this will not change the `scanned_many' logic much,
because the flag will be set only slightly earlier (one slot sooner).

Best Regards,
Huang, Ying
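
To make the ordering argument concrete, here is a minimal user-space sketch. It is not the kernel code. It assumes the worst case described in the changelog: every slot passes the availability check but is then rejected, so the scan resumes ("goto checks" followed by "goto scan"). The helpers slot_looks_available(), slot_passes_checks() and longest_run_without_yield() are hypothetical stand-ins, and the point where cond_resched() would be called is modeled by resetting a counter.

```c
#include <assert.h>
#include <stdbool.h>

#define LATENCY_LIMIT 256	/* assumed to match the value in mm/swapfile.c */

/* Hypothetical stand-in: every slot looks free to the first check... */
static bool slot_looks_available(int offset) { (void)offset; return true; }

/* ...but is rejected afterwards, which models "goto checks; goto scan". */
static bool slot_passes_checks(int offset) { (void)offset; return false; }

/*
 * Walk nr_slots slots and return the longest run of slots visited
 * without reaching the point where cond_resched() would be called.
 * count_first == false models the old order (check, then count);
 * count_first == true models the patched order (count, then check).
 */
static int longest_run_without_yield(int nr_slots, bool count_first)
{
	int latency_ration = LATENCY_LIMIT;
	int run = 0, longest = 0;

	assert(nr_slots >= 0);

	for (int offset = 0; offset < nr_slots; offset++) {
		run++;
		if (run > longest)
			longest = run;

		/* Old order: a bouncing slot skips the decrement entirely. */
		if (!count_first && slot_looks_available(offset)) {
			if (slot_passes_checks(offset))
				break;
			continue;
		}

		if (--latency_ration < 0) {
			/* cond_resched() would run here in the kernel. */
			latency_ration = LATENCY_LIMIT;
			run = 0;
		}

		/* Patched order: the slot was already counted above. */
		if (count_first && slot_looks_available(offset)) {
			if (slot_passes_checks(offset))
				break;
			continue;
		}
	}
	return longest;
}
```

In this model, longest_run_without_yield(10000, false) visits all 10000 slots without ever reaching the yield point, while longest_run_without_yield(10000, true) never goes more than LATENCY_LIMIT + 1 slots between yields. The sketch only illustrates the counting order; it says nothing about locking or about how often slots bounce in practice.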