Date: Wed, 22 Jul 2020 19:04:25 +0100
From: Matthew Wilcox
To: Andrea Righi
Cc: Andrew
 Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: swap: do not wait for lock_page() in unuse_pte_range()
Message-ID: <20200722180425.GP15516@casper.infradead.org>
References: <20200722174436.GB841369@xps-13>
In-Reply-To: <20200722174436.GB841369@xps-13>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 22, 2020 at 07:44:36PM +0200, Andrea Righi wrote:
> Waiting for lock_page() with mm->mmap_sem held in unuse_pte_range() can
> lead to stalls while running swapoff (i.e., not being able to ssh into
> the system, inability to execute simple commands like 'ps', etc.).
>
> Replace lock_page() with trylock_page() and release mm->mmap_sem if we
> fail to lock it, giving other tasks a chance to continue and prevent
> the stall.

I think you've removed the warning at the expense of turning a stall
into a potential livelock.

> @@ -1977,7 +1977,11 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>  			return -ENOMEM;
>  		}
>
> -		lock_page(page);
> +		if (!trylock_page(page)) {
> +			ret = -EAGAIN;
> +			put_page(page);
> +			goto out;
> +		}

If you look at the patterns we have elsewhere in the MM for doing
this kind of thing (eg truncate_inode_pages_range()), we iterate over the
entire range, take care of the easy cases, then go back and deal with the
hard cases later.

So that would argue for skipping any page that we can't trylock, but
continue over at least the VMA, and quite possibly the entire MM until
we're convinced that we have unused all of the required pages.

Another thing we could do is drop the MM semaphore _here_, sleep on this
page until it's unlocked, then go around again.

		if (!trylock_page(page)) {
			mmap_read_unlock(mm);
			lock_page(page);
			unlock_page(page);
			put_page(page);
			ret = -EAGAIN;
			goto out;
		}

(I haven't checked the call paths; maybe you can't do this because
sometimes it's called with the mmap sem held for write)

Also, if we're trying to scale this better, there are some fun
workloads where readers block writers who block subsequent readers
and we shouldn't wait for I/O in swapin_readahead().  See patches like
6b4c9f4469819a0c1a38a0a4541337e0f9bf6c11 for more on this kind of thing.
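
For illustration only, here is a rough userspace sketch of the "take care
of the easy cases, then go back for the hard cases" pattern mentioned
above. It is not kernel code: pthread mutexes stand in for page locks,
and struct item, process_easy() and process_hard() are hypothetical names
invented for this example. It also assumes a single worker thread owns
the done flags.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page: a lock plus a "handled" flag. */
struct item {
	pthread_mutex_t lock;
	bool done;	/* only read and written by the worker thread */
};

/*
 * Pass 1: handle every item whose lock is free, skip the rest.
 * Never blocks.  Returns the number of items skipped because
 * their lock was held by someone else.
 */
size_t process_easy(struct item *items, size_t n)
{
	size_t skipped = 0;

	assert(items != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (items[i].done)
			continue;
		if (pthread_mutex_trylock(&items[i].lock) != 0) {
			skipped++;	/* contended: revisit in pass 2 */
			continue;
		}
		items[i].done = true;
		pthread_mutex_unlock(&items[i].lock);
	}
	return skipped;
}

/*
 * Pass 2: go back over the range and block on whatever pass 1
 * could not lock.  Only the leftover items can make us sleep.
 */
void process_hard(struct item *items, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (items[i].done)
			continue;
		pthread_mutex_lock(&items[i].lock);
		items[i].done = true;
		pthread_mutex_unlock(&items[i].lock);
	}
}
```

Built with something like `cc -pthread`, a caller would run
process_easy() over the whole range first and call process_hard() only
if it reports skipped items. In the kernel case the second pass would
also have to decide which locks may be held while sleeping, which this
sketch does not model.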