From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Guo Ziliang, Zeal Robot, Ran Xiaokai, Jiang Xuexin, Yang Yang, Hugh Dickins, Naoya Horiguchi, Michal Hocko, Minchan Kim, Johannes Weiner, Roger Quadros, Andrew Morton, Linus Torvalds
Subject: [PATCH 5.15 03/32] mm: swap: get rid of livelock in swapin readahead
Date: Mon, 21 Mar 2022 14:52:39 +0100
Message-Id: <20220321133220.662265760@linuxfoundation.org>
In-Reply-To: <20220321133220.559554263@linuxfoundation.org>
References: <20220321133220.559554263@linuxfoundation.org>

From: Guo Ziliang

commit 029c4628b2eb2ca969e9bf979b05dc18d8d5575e upstream.

In our testing, a livelock task was found.
Through sysrq printing, the same stack was found every time, as follows:

  __swap_duplicate+0x58/0x1a0
  swapcache_prepare+0x24/0x30
  __read_swap_cache_async+0xac/0x220
  read_swap_cache_async+0x58/0xa0
  swapin_readahead+0x24c/0x628
  do_swap_page+0x374/0x8a0
  __handle_mm_fault+0x598/0xd60
  handle_mm_fault+0x114/0x200
  do_page_fault+0x148/0x4d0
  do_translation_fault+0xb0/0xd4
  do_mem_abort+0x50/0xb0

The reason for the livelock is that swapcache_prepare() always returns
-EEXIST, indicating that SWAP_HAS_CACHE has not been cleared, so the task
cannot break out of the loop. We suspect that the task that clears the
SWAP_HAS_CACHE flag never gets a chance to run. We tried lowering the
priority of the task stuck in the livelock so that the task that clears
the SWAP_HAS_CACHE flag could run. The system returned to normal after
the priority was lowered.

In our testing, multiple real-time tasks are bound to the same core, and
the task in the livelock is the highest-priority task on that core, so
the livelocked task cannot be preempted.

Although cond_resched() is used by __read_swap_cache_async(), it is an
empty function on a preemptible kernel and does not release the CPU. A
high-priority task cannot release the CPU unless it is preempted by a
higher-priority task. When this task is already the highest-priority
task on the core, no other task can be scheduled.

So we think we should replace cond_resched() with
schedule_timeout_uninterruptible(1). schedule_timeout_uninterruptible()
first sets the task state to TASK_UNINTERRUPTIBLE, so the task is removed
from the run queue. This gives up the CPU and prevents the task from
running in kernel mode for too long.

(akpm: ugly hack becomes uglier.
But it fixes the issue in a backportable-to-stable fashion while we
hopefully work on something better)

Link: https://lkml.kernel.org/r/20220221111749.1928222-1-cgel.zte@gmail.com
Signed-off-by: Guo Ziliang
Reported-by: Zeal Robot
Reviewed-by: Ran Xiaokai
Reviewed-by: Jiang Xuexin
Reviewed-by: Yang Yang
Acked-by: Hugh Dickins
Cc: Naoya Horiguchi
Cc: Michal Hocko
Cc: Minchan Kim
Cc: Johannes Weiner
Cc: Roger Quadros
Cc: Ziliang Guo
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
---
 mm/swap_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -478,7 +478,7 @@ struct page *__read_swap_cache_async(swp
 		 * __read_swap_cache_async(), which has set SWAP_HAS_CACHE
 		 * in swap_map, but not yet added its page to swap cache.
 		 */
-		cond_resched();
+		schedule_timeout_uninterruptible(1);
 	}
 
 	/*
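
Illustration only, not part of the patch: the reasoning in the commit
message can be sketched as a toy user-space model. It assumes one core,
one high-priority task that retries while a flag is set, and one
low-priority task that would clear the flag as soon as it gets the CPU.
The names simulate(), WAIT_YIELD and WAIT_SLEEP are hypothetical and do
not exist in the kernel; the model only mimics "stay runnable" versus
"leave the run queue for one tick".

```c
#include <stdbool.h>

/* Hypothetical toy model of the livelock; this is not kernel code. */
enum wait_strategy {
	WAIT_YIELD,	/* stay runnable, like cond_resched() with no higher-priority task */
	WAIT_SLEEP	/* leave the run queue for one tick, like schedule_timeout_uninterruptible(1) */
};

/*
 * Simulate up to max_ticks scheduler ticks on a single core.
 * Returns the tick at which the high-priority task sees the flag
 * cleared, or -1 if it is still retrying when the budget runs out.
 */
int simulate(enum wait_strategy strategy, int max_ticks)
{
	bool has_cache = true;	/* stands in for SWAP_HAS_CACHE */
	int high_sleep = 0;	/* ticks the high-priority task stays asleep */

	for (int tick = 0; tick < max_ticks; tick++) {
		if (high_sleep > 0) {
			/* High-priority task is off the run queue, so the
			 * low-priority task runs and clears the flag. */
			has_cache = false;
			high_sleep--;
			continue;
		}

		/* High-priority task is runnable, so it always gets the core. */
		if (!has_cache)
			return tick;

		if (strategy == WAIT_SLEEP)
			high_sleep = 1;
		/* WAIT_YIELD: task stays runnable and nothing else can run. */
	}

	return -1;
}
```

In this model simulate(WAIT_YIELD, 1000) returns -1 because the
low-priority task never runs, while simulate(WAIT_SLEEP, 1000) returns 2:
one tick to retry and go to sleep, one tick for the other task to clear
the flag, and the retry then succeeds.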