Received: by 2002:a25:e74b:0:0:0:0:0 with SMTP id e72csp737455ybh; Wed, 22 Jul 2020 11:51:22 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzFz3uS0M3wZFg1EXEkPiezcX+f3nInKBYa3L0iKiff83jfrEi3Wd8mxFxTArp3Je1SSqub X-Received: by 2002:a17:906:d92a:: with SMTP id rn10mr895850ejb.169.1595443881820; Wed, 22 Jul 2020 11:51:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1595443881; cv=none; d=google.com; s=arc-20160816; b=VPpVlHp6TktFI+f7Q0oR3Ja6ad8rav7vg/dtx/hfAnhSBKP3pa880UpOBCVSqsaDg9 cBKBIJ0zl+ojDAU2EQD/yQhSBTTZDtDi+pzvHssP5urTxP0aszJvWh9rhYyl9BZDTsmF MZpskSYTR3bMqH8ylPH4kNSo7S/M/v7PVhMvQkJhX63XM9jShtwHTRtclw8M9VrSWFFv jvS0s/CdgrkCpC9r14VOiCbqQ80UT1D2ltuFbsQdOgh1Mn3OZKBBg1GRo0QPYjpRlh4O Tng0u1QnYPcdj7cmx7WxCAlyfLOKulqz40LoQpx1otN3ztkvZkVSgX9cmIanRvW7ANmv opUg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=Fj5tDWbuTCkQYGvkoKjLGsFFqP8S1vhifunocsfi/nc=; b=Ce2o3H9GIB6Fz/q0ZcOfxUVgW4M3nYciN/jFFgJskegoFIrfMsKy713IPWDyZ6S7Zj 6TGYxwoYzHVUH1ZhhNmRimRTBn/RNNJ6U23lkOzkql/nj+LoOu4POxIWQ6RBA9CC2b2L 8EkFCHu+r5kgVO34aHg4rC0eOwDV6GY/G5Yo7Ge6y2LaYFMDgEo0u3JeitrgOmO0AS4A HiHMBX35SqVmCPIb3DlrupasXWo1sxHbj8I7HSm9iO7VJNyaeTwzbeZxgSCOaRmTZrmG CIO/BmV1sLJe7dmJ0lj8WwwAdh82VJnT1FgyurSSP6MD6xxmql6VMw9H250KYfD743vG oEMw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=canonical.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id bz20si538579ejc.90.2020.07.22.11.50.59; Wed, 22 Jul 2020 11:51:21 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=canonical.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726888AbgGVSsr (ORCPT + 99 others); Wed, 22 Jul 2020 14:48:47 -0400 Received: from youngberry.canonical.com ([91.189.89.112]:47464 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726390AbgGVSsr (ORCPT ); Wed, 22 Jul 2020 14:48:47 -0400 Received: from mail-wm1-f69.google.com ([209.85.128.69]) by youngberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1jyJnI-0003lc-N5 for linux-kernel@vger.kernel.org; Wed, 22 Jul 2020 18:48:44 +0000 Received: by mail-wm1-f69.google.com with SMTP id x8so840925wmk.2 for ; Wed, 22 Jul 2020 11:48:44 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=Fj5tDWbuTCkQYGvkoKjLGsFFqP8S1vhifunocsfi/nc=; b=Ox/qmvrNnQukwiYQRjKfFc58Sn1JzdCs+NGuQd8P9q9+Pz1K0lxMROUuAkUS0M1qUK E3ENbuBfrWBL7DO0tLs5UCwM2+X552sDukTQsWvRdB2wzFpNQ8YiN0H0d68LXrcvl9EN BS1LqA79Y8i0a1NYqFmah0vXrV2vDij3dyWvuJvZdhbqc8Zwuwxiq8g29vdkvfUEGX/s s1wLoQWr99ly2S6R6rwNKxsYjbEnWY+DKiAUzR2uNDLWrbl65+olzz9JUDgMU5SbE0o7 xg+Hyjk9Yb/FagWpxikw4DuqUADVmgYjJAvbK+SKjZQAv2qf6T11jzIxMSPQ9Z5uUdqr 65uQ== X-Gm-Message-State: AOAM533IaDz8m56T5IFkR65m4MiO7oLFiQf1lkYwlCJCMxMKRrgyOV2s qpZ+Ygb4rSrcFRYjiLBeCef5lKH9QSw0neWBUk+M+2Zl7Mal352r12g2qBK4X+VHWw+JS9ny29e EN737YKLGAihrpc/47bd9mVka3xIIN+d/N7zcIliLpg== X-Received: by 2002:a5d:618e:: with SMTP id j14mr833271wru.374.1595443723932; Wed, 22 Jul 2020 11:48:43 -0700 (PDT) X-Received: by 2002:a5d:618e:: with SMTP id j14mr833244wru.374.1595443723488; Wed, 22 Jul 2020 11:48:43 -0700 (PDT) Received: from localhost (host-87-11-131-192.retail.telecomitalia.it. [87.11.131.192]) by smtp.gmail.com with ESMTPSA id z63sm725561wmb.2.2020.07.22.11.48.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 22 Jul 2020 11:48:42 -0700 (PDT) Date: Wed, 22 Jul 2020 20:48:41 +0200 From: Andrea Righi To: Matthew Wilcox Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] mm: swap: do not wait for lock_page() in unuse_pte_range() Message-ID: <20200722184841.GC841369@xps-13> References: <20200722174436.GB841369@xps-13> <20200722180425.GP15516@casper.infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200722180425.GP15516@casper.infradead.org> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jul 22, 2020 at 07:04:25PM +0100, Matthew Wilcox wrote: > On Wed, Jul 22, 2020 at 07:44:36PM +0200, Andrea Righi wrote: > > Waiting for lock_page() with mm->mmap_sem held in unuse_pte_range() can > > lead to stalls while running swapoff (i.e., not being able to ssh into > > the system, inability to execute simple commands like 'ps', etc.). > > > > Replace lock_page() with trylock_page() and release mm->mmap_sem if we > > fail to lock it, giving other tasks a chance to continue and prevent > > the stall. > > I think you've removed the warning at the expense of turning a stall > into a potential livelock. > > > @@ -1977,7 +1977,11 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd, > > return -ENOMEM; > > } > > > > - lock_page(page); > > + if (!trylock_page(page)) { > > + ret = -EAGAIN; > > + put_page(page); > > + goto out; > > + } > > If you look at the patterns we have elsewhere in the MM for doing > this kind of thing (eg truncate_inode_pages_range()), we iterate over the > entire range, take care of the easy cases, then go back and deal with the > hard cases later. > > So that would argue for skipping any page that we can't trylock, but > continue over at least the VMA, and quite possibly the entire MM until > we're convinced that we have unused all of the required pages. > > Another thing we could do is drop the MM semaphore _here_, sleep on this > page until it's unlocked, then go around again. > > if (!trylock_page(page)) { > mmap_read_unlock(mm); > lock_page(page); > unlock_page(page); > put_page(page); > ret = -EAGAIN; > goto out; > } > > (I haven't checked the call paths; maybe you can't do this because > sometimes it's called with the mmap sem held for write) > > Also, if we're trying to scale this better, there are some fun > workloads where readers block writers who block subsequent readers > and we shouldn't wait for I/O in swapin_readahead(). See patches like > 6b4c9f4469819a0c1a38a0a4541337e0f9bf6c11 for more on this kind of thing. Thanks for the review, Matthew. I'll see if I can find a better solution following your useful hints! -Andrea