From: Kairui Song
Date: Mon, 20 Nov 2023 19:17:12 +0800
Subject: Re: [PATCH 08/24] mm/swap: check readahead policy per entry
To: "Huang, Ying"
Cc: linux-mm@kvack.org, Andrew Morton, David Hildenbrand, Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko, linux-kernel@vger.kernel.org
References: <20231119194740.94101-1-ryncsn@gmail.com> <20231119194740.94101-9-ryncsn@gmail.com> <87r0klarjp.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87r0klarjp.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Mon, Nov 20, 2023 at 14:07, Huang, Ying wrote:
>
> Kairui Song writes:
>
> > From: Kairui Song
> >
> > Currently VMA readahead is globally disabled when any rotate disk is
> > used as swap backend. So multiple swap devices are enabled, if a slower
> > hard disk is set as a low priority fallback, and a high performance SSD
> > is used and high priority swap device, vma readahead is disabled globally.
> > The SSD swap device performance will drop by a lot.
> >
> > Check readahead policy per entry to avoid such problem.
> >
> > Signed-off-by: Kairui Song
> > ---
> >  mm/swap_state.c | 12 +++++++-----
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > index ff6756f2e8e4..fb78f7f18ed7 100644
> > --- a/mm/swap_state.c
> > +++ b/mm/swap_state.c
> > @@ -321,9 +321,9 @@ static inline bool swap_use_no_readahead(struct swap_info_struct *si, swp_entry_
> >          return data_race(si->flags & SWP_SYNCHRONOUS_IO) && __swap_count(entry) == 1;
> >  }
> >
> > -static inline bool swap_use_vma_readahead(void)
> > +static inline bool swap_use_vma_readahead(struct swap_info_struct *si)
> >  {
> > -        return READ_ONCE(enable_vma_readahead) && !atomic_read(&nr_rotate_swap);
> > +        return data_race(si->flags & SWP_SOLIDSTATE) && READ_ONCE(enable_vma_readahead);
> >  }
> >
> >  /*
> > @@ -341,7 +341,7 @@ struct folio *swap_cache_get_folio(swp_entry_t entry,
> >
> >          folio = filemap_get_folio(swap_address_space(entry), swp_offset(entry));
> >          if (!IS_ERR(folio)) {
> > -                bool vma_ra = swap_use_vma_readahead();
> > +                bool vma_ra = swap_use_vma_readahead(swp_swap_info(entry));
> >                  bool readahead;
> >
> >                  /*
> > @@ -920,16 +920,18 @@ static struct page *swapin_no_readahead(swp_entry_t entry, gfp_t gfp_mask,
> >  struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
> >                                struct vm_fault *vmf, bool *swapcached)
> >  {
> > +        struct swap_info_struct *si;
> >          struct mempolicy *mpol;
> >          struct page *page;
> >          pgoff_t ilx;
> >          bool cached;
> >
> > +        si = swp_swap_info(entry);
> >          mpol = get_vma_policy(vmf->vma, vmf->address, 0, &ilx);
> > -        if (swap_use_no_readahead(swp_swap_info(entry), entry)) {
> > +        if (swap_use_no_readahead(si, entry)) {
> >                  page = swapin_no_readahead(entry, gfp_mask, mpol, ilx, vmf->vma->vm_mm);
> >                  cached = false;
> > -        } else if (swap_use_vma_readahead()) {
> > +        } else if (swap_use_vma_readahead(si)) {
>
> It's possible that some pages are swapped out to SSD while others are
> swapped out to HDD in a
> readahead window.
>
> I suspect that there are practical requirements to use swap on SSD and
> HDD at the same time.

Hi Ying,

Thanks for the review!

For the first issue, the "fragmented readahead window", I was planning
to add an extra check in the readahead path that skips readahead entries
located on a different swap device. That is not hard to do, but this
series is already growing too long, so I thought it would be better done
later.

For the second issue, "is there any practical use for multiple swap
devices", I think there actually is. For example, we are trying to use
multi-layer swap to offload memory of different hotness on servers. We
also tried to implement a mechanism that migrates long-sleeping swap
entries from high-performance SSD/RAMDISK swap to a cheap HDD swap
device, with more than two layers of swap. That worked, apart from this
upstream issue: the readahead policy no longer works as expected.

>
> >                  page = swap_vma_readahead(entry, gfp_mask, mpol, ilx, vmf);
> >                  cached = true;
> >          } else {
>
> --
> Best Regards,
> Huang, Ying
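
To make the two points in the reply concrete, here is a rough userspace
sketch, not kernel code. It models (a) the per-device policy check that
the patch moves into swap_use_vma_readahead(si), and (b) the follow-up
idea of skipping readahead entries that sit on a different swap device
than the faulting entry. Every name prefixed with model_ or MODEL_ is
hypothetical and exists only for this illustration; the real series may
implement the skip quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's SWP_SOLIDSTATE flag. */
enum { MODEL_SOLIDSTATE = 1u << 0 };

/* Hypothetical, heavily reduced model of a swap device. */
struct model_swap_dev {
	unsigned int flags;
};

/* Hypothetical model of a swap entry: device index plus offset. */
struct model_swap_entry {
	size_t dev;
	unsigned long offset;
};

/*
 * Per-device policy: VMA readahead is used only when the device that
 * backs this entry is solid state and the global knob is enabled.
 * A rotating disk elsewhere in the system no longer matters.
 */
static bool model_use_vma_readahead(const struct model_swap_dev *dev,
				    bool enable_vma_readahead)
{
	return (dev->flags & MODEL_SOLIDSTATE) && enable_vma_readahead;
}

/*
 * Assumed shape of the "extra check": copy into out[] only the window
 * entries that live on the same device as the faulting entry, and
 * return how many were kept. out[] must have room for n entries.
 */
static size_t model_filter_window(const struct model_swap_entry *window,
				  size_t n, size_t fault_idx,
				  struct model_swap_entry *out)
{
	size_t kept = 0;

	assert(fault_idx < n);
	for (size_t i = 0; i < n; i++) {
		if (window[i].dev != window[fault_idx].dev)
			continue; /* different swap device: skip it */
		out[kept++] = window[i];
	}
	return kept;
}
```

In this model a window that mixes SSD and HDD entries is simply thinned
out to the faulting entry's device, so the policy chosen for that device
applies to everything that is actually read ahead.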