From: Miklos Szeredi
Date: Thu, 4 Apr 2019 10:01:23 +0200
Subject: Re: [RFC PATCH v2 14/14] dcache: Implement object migration
To: Al Viro
Cc: Christopher Lameter, "Tobin C. Harding", Andrew Morton, Roman Gushchin,
    Alexander Viro, Christoph Hellwig, Pekka Enberg, David Rientjes,
    Joonsoo Kim, Matthew Wilcox, Miklos Szeredi, Andreas Dilger,
    Waiman Long, Tycho Andersen, "Theodore Ts'o", Andi Kleen,
    David Chinner, Nick Piggin, Rik van Riel, Hugh Dickins,
    linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, Linus Torvalds

On Wed, Apr 3, 2019 at 9:05 PM Al Viro wrote:
>
> On Wed, Apr 03, 2019 at 07:24:54PM +0100, Al Viro wrote:
>
> > If by "how to do it right" you mean "expedit kicking out something with
> > non-zero refcount" - there's no way to do that.  Nothing even remotely
> > sane.
> >
> > If you mean "kick out everything in this page with zero refcount" - that
> > can be done (see further in the thread).
> >
> > Look, dentries and inodes are really, really not relocatable.  If they
> > can be evicted by memory pressure - sure, we can do that for a given
> > set (e.g. "everything in that page").  But that's it - if memory
> > pressure would _not_ get rid of that one, there's nothing to be done.
> > Again, all VM can do is to simulate shrinker hitting hard on given
> > bunch (rather than buggering the entire cache).  If filesystem (or
> > something in VFS) says "it's busy", it bloody well _is_ busy and
> > won't be going away until it ceases to be such.
>
> FWIW, some theory: the only kind of long-term reference that can
> be killed off by memory pressure is that from child to parent.
> Anything else (e.g. an opened file, current directory, mountpoint,
> etc.) is out of limits - it either won't be going away until
> the thing is not pinned anymore (close, chdir, etc.) *or*
> it really shouldn't be ("VM wants this mountpoint dentry freed,
> so just dissolve the mount" is a bloody bad idea for obvious
> reasons).

Well, theoretically we could do two levels of references, where the
long-term reference is stable and contains an RCU-protected unstable
reference to the real object.  In the likely case when only read-only
access to the object is needed (d_lookup), the cost is an extra
dereference and the associated additional cache usage.  If read-write
access to the object is needed, then extra locking is needed to
protect against concurrent migration.

So there's a non-trivial cost in addition to the added complexity, and I
don't see it actually making sense in practice.  But maybe someone can
expand this idea into something practicable...

Thanks,
Miklos