In-Reply-To: <20200624141604.GH6578@ziepe.ca>
References: <20200624080248.3701-1-chris@chris-wilson.co.uk>
 <20200624121053.GD6578@ziepe.ca>
 <159300126338.4527.3968787379471939056@build.alporthouse.com>
 <20200624123910.GA3178169@ziepe.ca>
 <159300796224.4527.2014771396582759689@build.alporthouse.com>
 <20200624141604.GH6578@ziepe.ca>
From: Chris Wilson
Subject: Re: [PATCH 1/2] mm/mmu_notifier: Mark up direct reclaim paths with MAYFAIL
To: Jason Gunthorpe
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, intel-gfx@lists.freedesktop.org, Andrew Morton
Message-ID: <159300850942.4527.8335506003268197914@build.alporthouse.com>
User-Agent: alot/0.8.1
Date: Wed, 24 Jun 2020 15:21:49 +0100
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List:
 linux-kernel@vger.kernel.org

Quoting Jason Gunthorpe (2020-06-24 15:16:04)
> On Wed, Jun 24, 2020 at 03:12:42PM +0100, Chris Wilson wrote:
> > Quoting Jason Gunthorpe (2020-06-24 13:39:10)
> > > On Wed, Jun 24, 2020 at 01:21:03PM +0100, Chris Wilson wrote:
> > > > Quoting Jason Gunthorpe (2020-06-24 13:10:53)
> > > > > On Wed, Jun 24, 2020 at 09:02:47AM +0100, Chris Wilson wrote:
> > > > > > When direct reclaim enters the shrinker and tries to reclaim pages, it
> > > > > > has to opportunistically unmap them [try_to_unmap_one]. For direct
> > > > > > reclaim, the calling context is unknown and may include attempts to
> > > > > > unmap one page of a dma object while attempting to allocate more pages
> > > > > > for that object. Pass the information along that we are inside an
> > > > > > opportunistic unmap that can allow that page to remain referenced and
> > > > > > mapped, and let the callback opt in to avoiding a recursive wait.
> > > > >
> > > > > i915 should already not be holding locks shared with the notifiers
> > > > > across allocations that can trigger reclaim. This is already required
> > > > > to use notifiers correctly anyhow - why do we need something in the
> > > > > notifiers?
> > > >
> > > > for (n = 0; n < num_pages; n++)
> > > >         pin_user_page()
> > > >
> > > > may call try_to_unmap_page from the lru shrinker for [0, n-1].
> > >
> > > Yes, of course you can't hold any locks that intersect with notifiers
> > > across pin_user_page()/get_user_page()
> >
> > What lock though? It's just the page refcount, shrinker asks us to drop
> > it [via mmu], we reply we would like to keep using that page as freeing
> > it for the current allocation is "robbing Peter to pay Paul".
>
> Maybe I'm unclear what this series is actually trying to fix?
>
> You said "avoiding a recursive wait" which sounds like some locking
> deadlock to me.

It's the shrinker being called while we are allocating for/on behalf of
the object.
As we are actively using the object, we don't want to free it -- the
partial object allocation being the clearest example: if the object
consists of 2 pages, trying to free page 0 in order to allocate page 1
has to fail (and the shrinker should find another candidate to reclaim,
or fail the allocation).
-Chris
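
To make the 2-page case concrete, here is a rough userspace toy model of
the behaviour described above. It is only a sketch under loud
assumptions: nothing in it is kernel API, and every name (toy_object,
toy_page, toy_try_unmap, toy_shrink, toy_alloc_page, toy_populate,
toy_reset, POOL_PAGES) is hypothetical. The "populating" flag merely
stands in for the idea of an opportunistic unmap that is allowed to
fail; it is not how the patch itself is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_PAGES 2 /* toy "system memory": only two pages exist */

/* Hypothetical stand-in for a dma object backed by user pages. */
struct toy_object {
	int id;
	bool populating; /* true while pages are being allocated for it */
};

struct toy_page {
	struct toy_object *owner; /* NULL when the page is free */
};

static struct toy_page pool[POOL_PAGES];

/* Release every page in the toy pool. */
void toy_reset(void)
{
	for (size_t i = 0; i < POOL_PAGES; i++)
		pool[i].owner = NULL;
}

/*
 * Opportunistic unmap that is allowed to fail: an object that is in the
 * middle of being populated keeps its pages instead of waiting on itself.
 */
bool toy_try_unmap(struct toy_page *page)
{
	if (page->owner && page->owner->populating)
		return false; /* in active use: shrinker must look elsewhere */
	page->owner = NULL;
	return true;
}

/* Toy shrinker: return the first page it manages to free, else NULL. */
struct toy_page *toy_shrink(void)
{
	for (size_t i = 0; i < POOL_PAGES; i++)
		if (pool[i].owner && toy_try_unmap(&pool[i]))
			return &pool[i];
	return NULL; /* no candidate: the allocation has to fail */
}

/* Take a free page, falling back to "direct reclaim" via the shrinker. */
struct toy_page *toy_alloc_page(struct toy_object *obj)
{
	struct toy_page *page = NULL;

	assert(obj != NULL);
	for (size_t i = 0; i < POOL_PAGES; i++) {
		if (!pool[i].owner) {
			page = &pool[i];
			break;
		}
	}
	if (!page)
		page = toy_shrink();
	if (page)
		page->owner = obj;
	return page;
}

/* Mirrors the for (n = 0; n < num_pages; n++) loop; returns pages obtained. */
int toy_populate(struct toy_object *obj, int num_pages)
{
	int n;

	obj->populating = true;
	for (n = 0; n < num_pages; n++)
		if (!toy_alloc_page(obj))
			break;
	obj->populating = false;
	return n;
}
```

In this model, when an idle object holds one of the two pages, a second
object asking for 2 pages gets the idle object's page reclaimed for it.
When a single object asks for 3 pages, the third request fails and the
object keeps the 2 pages it already had, rather than page 0 being freed
to satisfy a later page of the same object.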