Subject: Re: [RFC PATCH V2 06/22] x86/intel_rdt: Create pseudo-locked regions
To: Thomas Gleixner
Cc: fenghua.yu@intel.com, tony.luck@intel.com, gavin.hindman@intel.com,
    vikas.shivappa@linux.intel.com, dave.hansen@intel.com, mingo@redhat.com,
    hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org
References: <17cceeb077f3ac5f50b110285b36905091a345b0.1518443616.git.reinette.chatre@intel.com>
From: Reinette Chatre
Message-ID: <97148038-6af3-aa49-d5ac-35741867dd29@intel.com>
Date: Mon, 19 Feb 2018 15:02:51 -0800

Hi Thomas,

On 2/19/2018 12:57 PM, Thomas Gleixner wrote:
> On Tue, 13 Feb 2018, Reinette Chatre wrote:
>
>> System administrator creates/removes pseudo-locked regions by
>> creating/removing directories in the pseudo-lock subdirectory of the
>> resctrl filesystem. Here we add directory creation and removal support.
>>
>> A "pseudo-lock region" is introduced, which represents an
>> instance of a pseudo-locked cache region. During mkdir a new region is
>> created but since we do not know which cache it belongs to at that time
>> we maintain a global pointer to it from where it will be moved to the cache
>> (rdt_domain) it belongs to after initialization. This implies that
>> we only support one uninitialized pseudo-locked region at a time.
>
> Whats the reason for this restriction? If there are uninitialized
> directories, so what?

I was thinking about a problematic scenario where an application attempts
to create an unbounded number of directories. All of these uninitialized
directories would need to be tracked until they are initialized as
pseudo-locked regions. It seemed simpler to require that only one
pseudo-locked region be set up at a time.

>> Signed-off-by: Reinette Chatre
>> ---
>>  arch/x86/kernel/cpu/intel_rdt.h             |   3 +
>>  arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 220 +++++++++++++++++++++++++++-
>>  2 files changed, 222 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
>> index 8f5ded384e19..55f085985072 100644
>> --- a/arch/x86/kernel/cpu/intel_rdt.h
>> +++ b/arch/x86/kernel/cpu/intel_rdt.h
>> @@ -352,6 +352,7 @@ extern struct mutex rdtgroup_mutex;
>>  extern struct rdt_resource rdt_resources_all[];
>>  extern struct rdtgroup rdtgroup_default;
>>  DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
>> +extern struct kernfs_node *pseudo_lock_kn;
>>
>>  int __init rdtgroup_init(void);
>>
>> @@ -457,5 +458,7 @@ bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
>>  void __check_limbo(struct rdt_domain *d, bool force_free);
>>  int rdt_pseudo_lock_fs_init(struct kernfs_node *root);
>>  void rdt_pseudo_lock_fs_remove(void);
>> +int rdt_pseudo_lock_mkdir(const char *name, umode_t mode);
>> +int rdt_pseudo_lock_rmdir(struct kernfs_node *kn);
>>
>>  #endif /* _ASM_X86_INTEL_RDT_H */
>> diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
>> index a787a103c432..7a22e367b82f 100644
>> --- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
>> +++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
>> @@ -20,11 +20,142 @@
>>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>
>>  #include
>> +#include
>>  #include
>>  #include
>> +#include
>>  #include "intel_rdt.h"
>>
>> -static struct kernfs_node *pseudo_lock_kn;
>> +struct kernfs_node *pseudo_lock_kn;
>> +
>> +/*
>> + * Protect the pseudo_lock_region access. Since we will link to
>> + * pseudo_lock_region from rdt domains rdtgroup_mutex should be obtained
>> + * first if needed.
>> + */
>> +static DEFINE_MUTEX(rdt_pseudo_lock_mutex);
>> +
>> +/**
>> + * struct pseudo_lock_region - pseudo-lock region information
>> + * @kn:        kernfs node representing this region in the resctrl
>> + *             filesystem
>> + * @cbm:       bitmask of the pseudo-locked region
>> + * @cpu:       core associated with the cache on which the setup code
>> + *             will be run
>> + * @minor:     minor number of character device associated with this
>> + *             region
>> + * @locked:    state indicating if this region has been locked or not
>> + * @refcount:  how many are waiting to access this pseudo-lock
>> + *             region via kernfs
>> + * @deleted:   user requested removal of region via rmdir on kernfs
>> + */
>> +struct pseudo_lock_region {
>> +	struct kernfs_node	*kn;
>> +	u32			cbm;
>> +	int			cpu;
>> +	unsigned int		minor;
>> +	bool			locked;
>> +	struct kref		refcount;
>> +	bool			deleted;
>> +};
>> +
>> +/*
>> + * Only one uninitialized pseudo-locked region can exist at a time. An
>> + * uninitialized pseudo-locked region is created when the user creates a
>> + * new directory within the pseudo_lock subdirectory of the resctrl
>> + * filsystem. The user will initialize the pseudo-locked region by writing
>> + * to its schemata file at which point this structure will be moved to the
>> + * cache domain it belongs to.
>> + */
>> +static struct pseudo_lock_region *new_plr;
>
> Why isn't the struct pointer not stored in the corresponding kernfs's node->priv?

It is. The life cycle of the uninitialized pseudo-locked region is managed
with this pointer though. The initialized pseudo-locked regions are managed
through their pointers within struct rdt_domain to which they belong. The
uninitialized pseudo-locked region is tracked here until moved to the cache
domain to which it belongs.

>> +static void __pseudo_lock_region_release(struct pseudo_lock_region *plr)
>> +{
>> +	bool is_new_plr = (plr == new_plr);
>> +
>> +	WARN_ON(!plr->deleted);
>> +	if (!plr->deleted)
>> +		return;
>
> 	if (WARN_ON(...))
> 		return;
>

Will fix.

>> +
>> +	kfree(plr);
>> +	if (is_new_plr)
>> +		new_plr = NULL;
>> +}
>> +
>> +static void pseudo_lock_region_release(struct kref *ref)
>> +{
>> +	struct pseudo_lock_region *plr = container_of(ref,
>> +						      struct pseudo_lock_region,
>> +						      refcount);
>
> You simply can avoid those line breaks by:
>
> 	struct pseudo_lock_region *plr;
>
> 	plr = container_of(ref, struct pseudo_lock_region, refcount);
>
> Hmm?
>

Absolutely. Will fix.

>> +	mutex_lock(&rdt_pseudo_lock_mutex);
>> +	__pseudo_lock_region_release(plr);
>> +	mutex_unlock(&rdt_pseudo_lock_mutex);
>> +}
>> +
>> +/**
>> + * pseudo_lock_region_kn_lock - Obtain lock to pseudo-lock region kernfs node
>> + *
>> + * This is called from the kernfs related functions which are called with
>> + * an active reference to the kernfs_node that contains a valid pointer to
>> + * the pseudo-lock region it represents. We can thus safely take an active
>> + * reference to the pseudo-lock region before dropping the reference to the
>> + * kernfs_node.
>> + *
>> + * We need to handle the scenarios where the kernfs directory representing
>> + * this pseudo-lock region can be removed while an application still has an
>> + * open handle to one of the directory's files and operations on this
>> + * handle are attempted.
>> + * To support this we allow a file operation to drop its reference to the
>> + * kernfs_node so that the removal can proceed, while using the mutex to
>> + * ensure these operations on the pseudo-lock region are serialized. At the
>> + * time an operation does obtain access to the region it may thus have been
>> + * deleted.
>> + */
>> +static struct pseudo_lock_region *pseudo_lock_region_kn_lock(
>> +						struct kernfs_node *kn)
>> +{
>> +	struct pseudo_lock_region *plr = (kernfs_type(kn) == KERNFS_DIR) ?
>> +						kn->priv : kn->parent->priv;
>
> See above.

Will fix.

>
>> +int rdt_pseudo_lock_mkdir(const char *name, umode_t mode)
>> +{
>> +	struct pseudo_lock_region *plr;
>> +	struct kernfs_node *kn;
>> +	int ret = 0;
>> +
>> +	mutex_lock(&rdtgroup_mutex);
>> +	mutex_lock(&rdt_pseudo_lock_mutex);
>> +
>> +	if (new_plr) {
>> +		ret = -ENOSPC;
>> +		goto out;
>> +	}
>> +
>> +	plr = kzalloc(sizeof(*plr), GFP_KERNEL);
>> +	if (!plr) {
>> +		ret = -ENOSPC;
>
> ENOMEM is the proper error code here.

Will fix.

>
>> +		goto out;
>> +	}
>> +
>> +	kn = kernfs_create_dir(pseudo_lock_kn, name, mode, plr);
>> +	if (IS_ERR(kn)) {
>> +		ret = PTR_ERR(kn);
>> +		goto out_free;
>> +	}
>> +
>> +	plr->kn = kn;
>> +	ret = rdtgroup_kn_set_ugid(kn);
>> +	if (ret)
>> +		goto out_remove;
>> +
>> +	kref_init(&plr->refcount);
>> +	kernfs_activate(kn);
>> +	new_plr = plr;
>> +	ret = 0;
>> +	goto out;
>> +
>> +out_remove:
>> +	kernfs_remove(kn);
>> +out_free:
>> +	kfree(plr);
>> +out:
>> +	mutex_unlock(&rdt_pseudo_lock_mutex);
>> +	mutex_unlock(&rdtgroup_mutex);
>> +	return ret;
>> +}
>> +
>> +/*
>> + * rdt_pseudo_lock_rmdir - Remove pseudo-lock region
>> + *
>> + * LOCKING:
>> + * Since the pseudo-locked region can be associated with a RDT domain at
>> + * removal we take both rdtgroup_mutex and rdt_pseudo_lock_mutex to protect
>> + * the rdt_domain access as well as the pseudo_lock_region access.
>
> Is there a real reason / benefit for having this second mutex?

Some interactions with the pseudo-locked region currently take place without
needing rdtgroup_mutex: for example, interaction with the character device
associated with the pseudo-locked region (the mmap() call), as well as the
debugfs operations.

Reinette
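
For reference, a minimal sketch (not taken from the patch itself) of how the
release path discussed above could look once the two review comments are
folded in: the combined WARN_ON check and the container_of call on a single
line. Names, locking, and the new_plr bookkeeping follow the quoted code;
the surrounding file context (struct definition, new_plr, the mutex) is
assumed from the hunks quoted above.

static void __pseudo_lock_region_release(struct pseudo_lock_region *plr)
{
	bool is_new_plr = (plr == new_plr);

	/* Only a region already marked deleted via rmdir may be freed. */
	if (WARN_ON(!plr->deleted))
		return;

	kfree(plr);
	if (is_new_plr)
		new_plr = NULL;
}

static void pseudo_lock_region_release(struct kref *ref)
{
	struct pseudo_lock_region *plr;

	/* Recover the region from its embedded refcount. */
	plr = container_of(ref, struct pseudo_lock_region, refcount);

	mutex_lock(&rdt_pseudo_lock_mutex);
	__pseudo_lock_region_release(plr);
	mutex_unlock(&rdt_pseudo_lock_mutex);
}

In the usual kref pattern the second function is the callback passed to
kref_put(), e.g. kref_put(&plr->refcount, pseudo_lock_region_release), so
the region is freed only when the last reference is dropped.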