Subject: Re: [v5 PATCH 1/2] mm: swap: check if swap backing device is congested or not
To: Andrew Morton
Cc: ying.huang@intel.com, tim.c.chen@intel.com, minchan@kernel.org, daniel.m.jordan@oracle.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <1546543673-108536-1-git-send-email-yang.shi@linux.alibaba.com> <20190110153147.1baf4c88bf0dd3b8a78aad08@linux-foundation.org>
In-Reply-To: <20190110153147.1baf4c88bf0dd3b8a78aad08@linux-foundation.org>
From: Yang Shi
Date: Thu, 10 Jan 2019 16:56:29 -0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/10/19 3:31 PM, Andrew Morton wrote:
> On Fri, 4 Jan 2019 03:27:52 +0800 Yang Shi wrote:
>
>> Swap readahead would read in a few pages regardless if the underlying
>> device is busy or not. It may incur long waiting time if the device is
>> congested, and it may also exacerbate the congestion.
>>
>> Use inode_read_congested() to check if the underlying device is busy or
>> not like what file page readahead does. Get inode from swap_info_struct.
>> Although we can add inode information in swap_address_space
>> (address_space->host), it may lead some unexpected side effect, i.e.
>> it may break mapping_cap_account_dirty(). Using inode from
>> swap_info_struct seems simple and good enough.
>>
>> ...
>>
>> --- a/mm/swap_state.c
>> +++ b/mm/swap_state.c
>> @@ -538,11 +538,18 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
>>  	bool do_poll = true, page_allocated;
>>  	struct vm_area_struct *vma = vmf->vma;
>>  	unsigned long addr = vmf->address;
>> +	struct inode *inode = NULL;
>>
>>  	mask = swapin_nr_pages(offset) - 1;
>>  	if (!mask)
>>  		goto skip;
>>
>> +	if (si->flags & (SWP_BLKDEV | SWP_FS)) {
> I re-read your discussion with Tim and I must say the reasoning behind
> this test remain foggy.
>
> What goes wrong if we just remove it?

I saw Tim already answered this.

>
> What is the status of shmem swap readahead?

shmem swap readahead will be skipped too if the underlying device is
congested.

>
> Can we at least get a comment in here which explains the reasoning?

How about something like this:

diff --git a/mm/swap_state.c b/mm/swap_state.c
index 3f63bb7..85245fd 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -543,7 +543,8 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
        if (!mask)
                goto skip;

-       if (si->flags & (SWP_BLKDEV | SWP_FS)) {
+       /* Test swap type to make sure the dereference is safe */
+       if (likely(si->flags & (SWP_BLKDEV | SWP_FS))) {
                struct inode *inode = si->swap_file->f_mapping->host;
                if (inode_read_congested(inode))
                        goto skip;

Tim is worried that the dereference might not be safe in some corner
cases, but those corner cases look unlikely from code inspection. So I
added "likely" to the if statement.

Thanks,
Yang

>
> Thanks.
>
>> +		inode = si->swap_file->f_mapping->host;
>> +		if (inode_read_congested(inode))
>> +			goto skip;
>> +	}
>> +
>>  	do_poll = false;
>>  	/* Read a page_cluster sized and aligned cluster around offset. */
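
For readers following the thread outside the kernel tree, the guard under discussion can be sketched in plain userspace C. This is only an illustration under stated assumptions: the mock_* types, the flag values, and should_skip_readahead() are hypothetical stand-ins, not the kernel's definitions, and likely() is approximated with the GCC/Clang __builtin_expect builtin. It shows the two points made above: the flags test keeps the inode dereference from running for swap types that may not have one, and readahead is skipped only when the backing device reports read congestion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values, chosen for this sketch only. */
#define SWP_BLKDEV 0x1u
#define SWP_FS     0x2u

/* Userspace approximation of the kernel's likely() hint. */
#if defined(__GNUC__) || defined(__clang__)
#define likely(x) __builtin_expect(!!(x), 1)
#else
#define likely(x) (x)
#endif

/* Mock of the backing inode: just records whether reads are congested. */
struct mock_inode {
    bool read_congested;
};

/* Mock of swap_info_struct: flags plus a possibly-NULL inode pointer. */
struct mock_swap_info {
    unsigned int flags;
    struct mock_inode *inode;
};

/* Stand-in for inode_read_congested(); assumes inode is non-NULL. */
static bool mock_inode_read_congested(const struct mock_inode *inode)
{
    return inode->read_congested;
}

/*
 * Returns true when readahead should be skipped.
 * The flags test guards the inode dereference: the inode is only
 * touched for block-device or filesystem-backed swap.
 */
static bool should_skip_readahead(const struct mock_swap_info *si)
{
    assert(si != NULL);

    if (likely(si->flags & (SWP_BLKDEV | SWP_FS))) {
        if (mock_inode_read_congested(si->inode))
            return true;
    }
    return false;
}
```

With a congested block-device-backed entry the function returns true; with an idle device, or with neither flag set (even if the inode pointer is NULL), it returns false and readahead proceeds as before.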