From: Luis Henriques
To: "Yan, Zheng", Sage Weil, Ilya Dryomov
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques
Subject: [PATCH v3 0/2] fix quota subdir mounts
Date: Thu, 21 Mar 2019 10:20:08 +0000
Message-Id: <20190321102010.26958-1-lhenriques@suse.com>

Hi,

As recently reported in the ceph-users mailing list[1], the kernel
client behaves differently from the fuse client when mounting subdirs
where quotas are in effect.  I've also created a bug to track this
issue[2].  The following patches are a possible way of fixing it.

The performance impact should be close to zero if the mount is done at
the CephFS root inode.  When mounting subdirs, we may need extra
queries to the MDSs, depending on how many extra realms we have to
loop through.
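To make those "extra queries" concrete, here is a minimal sketch of
the idea.  The struct and the helpers mds_lookup_inode() and
inode_over_quota() are made-up stand-ins for illustration, not the
actual fs/ceph code:

#include <linux/fs.h>
#include <linux/types.h>

/* Illustrative sketch only; not the actual fs/ceph implementation. */
struct realm {
	struct realm *parent;	/* NULL at the root realm */
	struct inode *inode;	/* NULL if not in the local cache */
	u64 ino;
};

static struct inode *mds_lookup_inode(u64 ino);	  /* hypothetical */
static bool inode_over_quota(struct inode *inode); /* hypothetical */

static bool quota_exceeded(struct realm *r)
{
	for (; r; r = r->parent) {
		/*
		 * On a root mount every realm inode is already cached,
		 * so this branch is never taken; on a subdir mount each
		 * uncached ancestor costs one round-trip to the MDS.
		 */
		if (!r->inode)
			r->inode = mds_lookup_inode(r->ino);
		if (r->inode && inode_over_quota(r->inode))
			return true;
	}
	return false;
}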
Changes since v2:
- Replaced the inodes list in mdsc with an rbtree.  This is because we
  need to be able to keep track of errors in lookupino, so that we
  don't keep sending the same useless request for inodes that have
  already failed.  This also resulted in reworking the locking in
  lookup_quotarealm_inode, so that two threads can look up the same
  inode at the same time.  (A rough sketch of the rbtree idea follows
  the diffstat below.)
- No need to set realm->inode in lookup_quotarealm_inode(), as
  lookupino has already set it.

Changes since v1:
- The loop that frees the mdsc->quotarealms_inodes_list list was moved
  further down, where it's not possible to race with insertions.  This
  way there's no need to hold the spinlock anymore.
- Clarified the comments regarding the get_quota_realm function's
  'retry' parameter, both in the function itself and in
  ceph_quota_is_same_realm, where that param is set to 'false'.
- Distinguish between 'realm->inode is NULL' and igrab failures, both
  in get_quota_realm and check_quota_exceeded.

Changes since RFC:

The 1st patch hasn't been changed since the initial RFC.  The 2nd patch
has been refactored to include the following changes:
- Zheng Yan's suggestions, i.e., move the inode references from the
  realms to the ceph_mds_client instance
- It now also handles other cases where an MDS lookup may need to be
  performed:
  * statfs when there are quotas
  * renames, to forbid cross-quota renames (see the second sketch
    after the diffstat)

[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-February/033357.html
[2] https://tracker.ceph.com/issues/38482

Cheers,
--
Luís

Luis Henriques (2):
  ceph: factor out ceph_lookup_inode()
  ceph: quota: fix quota subdir mounts

 fs/ceph/export.c     |  14 +++-
 fs/ceph/mds_client.c |  19 ++++++
 fs/ceph/mds_client.h |  18 +++++
 fs/ceph/quota.c      | 158 ++++++++++++++++++++++++++++++++++++++++---
 fs/ceph/super.h      |   1 +
 5 files changed, 199 insertions(+), 11 deletions(-)
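As promised in the "changes since v2" list above, here's a minimal
sketch of caching lookup results in an rbtree so a failed lookup isn't
retried forever.  It uses the kernel's rbtree API, but the node struct,
its fields, and the function names are illustrative, not the actual
patch:

#include <linux/fs.h>
#include <linux/rbtree.h>
#include <linux/types.h>

/* Illustrative sketch only; names do not match the real patch. */
struct quotarealm_inode {
	struct rb_node node;
	u64 ino;
	int lookup_err;		/* 0, or the error a past lookup returned */
	struct inode *inode;	/* non-NULL once the lookup succeeded */
};

static struct quotarealm_inode *qri_find(struct rb_root *root, u64 ino)
{
	struct rb_node *n = root->rb_node;

	while (n) {
		struct quotarealm_inode *qri =
			rb_entry(n, struct quotarealm_inode, node);

		if (ino < qri->ino)
			n = n->rb_left;
		else if (ino > qri->ino)
			n = n->rb_right;
		else
			return qri;	/* hit: reuse cached inode/error */
	}
	return NULL;			/* miss: caller inserts a new node */
}

static void qri_insert(struct rb_root *root, struct quotarealm_inode *new)
{
	struct rb_node **p = &root->rb_node, *parent = NULL;

	while (*p) {
		struct quotarealm_inode *qri;

		parent = *p;
		qri = rb_entry(parent, struct quotarealm_inode, node);
		p = new->ino < qri->ino ? &(*p)->rb_left : &(*p)->rb_right;
	}
	rb_link_node(&new->node, parent, p);
	rb_insert_color(&new->node, root);
}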
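And the cross-quota rename restriction boils down to something like
the check below.  quota_realm_of() is a hypothetical stand-in for the
real realm resolution (which, as above, may itself need an MDS lookup
on subdir mounts), and returning -EXDEV is one plausible way to refuse
the rename:

#include <linux/errno.h>
#include <linux/fs.h>

/* Illustrative sketch only: forbid renames across quota realms. */
static void *quota_realm_of(struct inode *inode);	/* hypothetical */

static int quota_check_rename(struct inode *old_dir,
			      struct inode *new_dir)
{
	if (quota_realm_of(old_dir) != quota_realm_of(new_dir))
		return -EXDEV;	/* refuse cross-quota renames */
	return 0;
}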