Subject: Re: [GIT PULL] Ceph fixes for 5.1-rc7
From: Jeff Layton
To: Al Viro
Cc: Linus Torvalds, Ilya Dryomov, ceph-devel@vger.kernel.org,
    Linux List Kernel Mailing
Date: Fri, 26 Apr 2019 13:30:53 -0400
In-Reply-To: <20190426165055.GY2217@ZenIV.linux.org.uk>
References: <20190425174739.27604-1-idryomov@gmail.com>
    <342ef35feb1110197108068d10e518742823a210.camel@kernel.org>
    <20190425200941.GW2217@ZenIV.linux.org.uk>
    <86674e79e9f24e81feda75bc3c0dd4215604ffa5.camel@kernel.org>
    <20190426165055.GY2217@ZenIV.linux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2019-04-26 at 17:50 +0100, Al Viro wrote:
> On Fri, Apr 26, 2019 at 12:25:03PM -0400, Jeff Layton wrote:
>
> > It turns out though that using name_snapshot from ceph is a bit more
> > tricky. In some cases, we have to call ceph_mdsc_build_path to build up
> > a full path string. We can't easily populate a name_snapshot from there
> > because struct external_name is only defined in fs/dcache.c.
>
> Explain, please.  For ceph_mdsc_build_path() you don't need name
> snapshots at all and existing code is, AFAICS, just fine, except
> for pointless pr_err() there.
>

Eventually we have to pass back the result of all the
build_dentry_path() shenanigans to create_request_message(), and then
free whatever that result is after using it.

Today we can get back a string+length from ceph_mdsc_build_path or
clone_dentry_name, or we might get direct pointers into the dentry if
the situation allows for it.

Now we want to rip out clone_dentry_name() and start using
take_dentry_name_snapshot(). That returns a name_snapshot that we'll
need to pass back to create_request_message. It will need to deal with
the fact that it could get one of those instead of just a
string+length.

My original thought was to always pass back a name_snapshot, but that
turns out to be difficult because its innards are not public. The other
potential solutions that I've tried make this code look even worse than
it already is.

> I _probably_ would take allocation out of the loop (e.g. make it
> __getname(), called unconditionally) and turned it into the
> d_path.c-style read_seqbegin_or_lock()/need_seqretry()/done_seqretry()
> loop, so that the first pass would go under rcu_read_lock(), while
> the second (if needed) would just hold rename_lock exclusive (without
> bumping the refcount).  But that's a matter of (theoretical) livelock
> avoidance, not the locking correctness for ->d_name accesses.
>

Yeah, that does sound better. I want to think about this code a bit

> Oh, and
>         *base = ceph_ino(d_inode(temp));
>         *plen = len;
> probably belongs in critical section - _that_ might be a correctness
> issue, since temp is not held by anything once you are out of there.
>

Good catch. I'll fix that up.

> > I could add some routines to do this, but it feels a lot like I'm
> > abusing internal dcache interfaces. I'll keep thinking about it though.
> >
> > While we're on the subject though:
> >
> > struct external_name {
> >         union {
> >                 atomic_t count;
> >                 struct rcu_head head;
> >         } u;
> >         unsigned char name[];
> > };
> >
> > Is it really ok to union the count and rcu_head there?
> >
> > I haven't trawled through all of the code yet, but what prevents someone
> > from trying to access the count inside an RCU critical section, after
> > call_rcu has been called on it?
>
> The fact that no lockless accesses to ->count are ever done?

Thanks,
-- 
Jeff Layton
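
For illustration, a rough userspace sketch of the ownership problem
described above: a path handed back to the caller may be either a direct
pointer into someone else's storage or a private copy that has to be
freed after use. This is only a model under stated assumptions. Every
name here (path_ref, path_borrow, path_copy, path_release) is
hypothetical; none of them are kernel or ceph interfaces, and a
name_snapshot would be a third case not modelled here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum path_kind {
	PATH_BORROWED,	/* points into storage owned by someone else */
	PATH_ALLOCATED,	/* heap copy that the holder must free */
};

struct path_ref {
	enum path_kind kind;
	const char *str;
	size_t len;
};

/* Borrow an existing string without copying it. */
static struct path_ref path_borrow(const char *s)
{
	struct path_ref p;

	assert(s != NULL);
	p.kind = PATH_BORROWED;
	p.str = s;
	p.len = strlen(s);
	return p;
}

/* Take a private copy; on allocation failure str is NULL and len is 0. */
static struct path_ref path_copy(const char *s)
{
	struct path_ref p = { PATH_ALLOCATED, NULL, 0 };
	size_t len;
	char *buf;

	assert(s != NULL);
	len = strlen(s);
	buf = malloc(len + 1);
	if (buf) {
		memcpy(buf, s, len + 1);	/* include the trailing NUL */
		p.str = buf;
		p.len = len;
	}
	return p;
}

/* Single release point: frees only what this reference owns. */
static void path_release(struct path_ref *p)
{
	assert(p != NULL);
	if (p->kind == PATH_ALLOCATED)
		free((char *)p->str);
	p->str = NULL;
	p->len = 0;
}
```

The point of the tag is that the consumer calls one release routine no
matter which producer built the path, rather than remembering how each
result was obtained.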