From: Kay Sievers
Date: Fri, 17 Jun 2011 01:04:55 +0200
Subject: Re: [PATCH 1/3] [RFC] genhd: add a new attribute in device structure
To: James Bottomley
Cc: Greg KH, Nao Nishijima, linux-scsi@vger.kernel.org,
 linux-kernel@vger.kernel.org, jcm@redhat.com, hare@suse.de,
 stefanr@s5r6.in-berlin.de, yrl.pp-manager.tt@hitachi.com
In-Reply-To: <1308264321.2436.161.camel@mulgrave>
References: <20110615081610.2237.44767.stgit@ltc233.sdl.hitachi.co.jp>
 <20110615081627.2237.9620.stgit@ltc233.sdl.hitachi.co.jp>
 <20110615153337.GA10160@kroah.com> <4DF9F11F.705@hitachi.com>
 <20110616154129.GA31498@kroah.com> <1308239454.2436.34.camel@mulgrave>
 <20110616161442.GA32113@kroah.com> <1308241506.2436.44.camel@mulgrave>
 <20110616181943.GB1439@kroah.com> <1308256290.2436.143.camel@mulgrave>
 <1308264321.2436.161.camel@mulgrave>

On Fri, Jun 17, 2011 at 00:45, James Bottomley wrote:
> On Fri, 2011-06-17 at 00:05 +0200, Kay Sievers wrote:
>> On Thu, Jun 16, 2011 at 22:31, James Bottomley wrote:
>> > On Thu, 2011-06-16 at 11:19 -0700, Greg KH wrote:
>> >> On Thu, Jun 16, 2011 at 12:25:06PM -0400, James Bottomley wrote:
>> >> > On Thu, 2011-06-16 at 09:14 -0700, Greg KH wrote:
>> >> >
>> >> > > No, udev can not create such a link after the preferred name is set, as
>> >> > > it has no way of knowing that the name was set.
>> >> >
>> >> > It can if we trigger a uevent.  Note: I'm not advocating this ... I'd be
>> >> > equally happy having whatever sets the kernel name create the link (or
>> >> > tickle udev to create it).  We definitely require device links, though,
>> >> > to get this to work.
>> >>
>> >> And no, I don't want to trigger a uevent, Kay pointed out where this
>> >> will go very wrong very quickly if this is done.
>> >
>> > As I said: we just need a by-preferred type of link.
>>
>> And if the user changes the name, the link and all earlier uses will
>> be dangling, even /proc/mounts might show non-existing device names.
>
> I don't understand this.  If a user decides to call sda "fred" by doing
> an echo to the preferred named file, then there should be a link
>
> /dev/disk/by-preferred/fred -> ../../sda
>
> Even if the name later changes to "angela", /dev/disk/by-preferred/fred
> will be valid if we don't clean it up.

Not currently. udev needs to keep track of all symlinks, and removes
the ones not specified in rules. There is no way for udev to track
names that are no longer valid. We can not just add stuff to /dev
without a udev database entry; it would never get removed on device
unplug and would leave a real mess behind.

>> I honestly don't think there will ever be _the_ name for a device.
>> We've been there, stuff seems not to work that way in the real world.
>
> So that's not really the point ... all we do with it in-kernel is use it
> as the device name for log prints and some basic /proc files (again,
> mainly as prints); nothing more.
>
>> I really like to hear how stuff is supposed to compose _the_ name in a
>> real world use-case like a multipath setup, in initramfs, in a heavy
>> hotplug setup, and so on ... And with more details than "udev will set
>> _the_ name", I really fail to see how that is supposed to happen to be
>> useful.
>
> All that really has to happen is that we get a database of 1:1
> correspondence between preferred name and actual name (with device
> links).
>
>> I think any solution that assumes the name can change later after it
>> is possibly already used, is just wishful thinking.
>
> The ability to change on the fly isn't part of the original hitachi
> proposal, but I don't really see why it can't ... it just alters the way
> the kernel prints out the name, nothing more.

And creates a huge problem for everything that uses that name.

>> We need many names, and we need all of them from the very beginning,
>> and they should not change during device lifetime unless the device
>> state changes.
>
> So that's actually an argument for leaving the links, surely?  We can
> have many inbound links, but the kernel can only print one name in
> messages, which would be the preferred name that was currently set.

I really question any concept of _the_ name. My take on it: It will
never work in reality.

>> >> > > So as userspace tools will still need to be fixed, I don't see how
>> >> > > adding a kernel file for this is going to help any.  Well, a bit in that
>> >> > > the kernel log files will look "different", but again, that really isn't
>> >> > > a problem that userspace couldn't also solve with no kernel changes
>> >> > > needed.
>> >> >
>> >> > This is true, but I think for the small effort it takes to implement the
>> >> > feature in-kernel compared with what we'd have to do to the
>> >> > distributions to get it implemented in userspace (we'd need klogd to do
>> >> > the conversion for dmesg ... I'm entirely unclear what we need to modify
>> >> > for /proc/partitions, etc.) the benefit outweighs the cost.
>> >> >
>> >> > Additionally, since renaming is something users seem to want (just look
>> >> > at net interfaces), if we can make this work, we now have a definitive
>> >> > answer to point people at.
>> >>
>> >> Renaming is something that we do NOT want to do, as we have learned our
>> >> lesson of the network device renaming mess.  And as Kay pointed out, we
>> >> already have an "alias" name there, which no one uses.
>> >
>> > Look at this as an opportunity to get it right.  The original proposal
>> > was for renaming.  By iterating over the actual requirements, we have it
>> > reduced to simply having the kernel print a preferred name.  I think
>> > that's a nice achievement which we can point other proponents of
>> > renaming to as they arise.
>>
>> Sure, we absolutely don't want renaming, and we can provide countless
>> solid technical reasons why we should not allow it to happen. But I'm
>> also pretty sure, we also don't want just-another-single-name to put
>> somewhere in the kernel.
>
> I understand why we don't want renaming.  However, the technical reason
> why we want a preferred name is that it's often associated with a name
> printed somewhere on the box (say a label on the disk enclosure, or
> ethernet port).  Not being able to use this name to address the device
> is a usability issue which annoys the enterprise enormously.
>
> So if we stop there, regardless of solution (in-kernel or fix all
> userspace), does everyone see what the actual problem is?

I don't think that solves the problem, no. We need _smart_ userspace
with a debug/error message channel from the kernel to userspace that
pops out _structured_ data. Userspace needs to index the data, and
merge a lot of userspace information into it. Adding just another
single-name to the kernel just makes the much-too-dumb free-text
printk() a bit more readable, but still does not sound like a solution.
Pimping up syslog is not the solution to this problem, and it can't be
solved in the kernel alone.

>> >> So again, I really don't like this, just fix the userspace tools to map
>> >> the proper device name that the kernel is using to the userspace name
>> >> the tool used, and all is fine.  This has been done already today,
>> >> successfully, by many of the big "enterprise" monitoring systems that
>> >> work quite well on Linux, proving that this is not something that the
>> >> kernel needs to provide to implement properly.
>> >
>> > Well, it's expediency.  Sure we could try to patch the world, but I
>> > think the simple patch of getting the kernel to print a preferred name
>> > solves 90% of the problem.  Sure there is a long tail of userspace
>> > components that needs fixing, but that can be done gradually if we take
>> > the kernel route.  If we go the userspace route, it will be a long while
>> > before we even get to 50% coverage.
>>
>> I need to ask again for an explanation why logging all symlinks at
>> device discovery from udev, does not solve exactly this problem. With
>> that tag in the syslog message stream, all later kernel names can be
>> safely associated with _all_ the current device names in question,
>> until the next tag from udev is found.
>
> So if the user has one preferred name, us logging all the names (and we
> have quite a few for disks) doesn't really help because the user might
> want to choose a different name.  However, even if we assume they choose
> one of the current names, they still have to do the mapping manually;
> even if they have all the information, they can't just cut and paste
> from dmesg say, they have to cut, edit the buffer to put in the
> preferred name and then paste ... that's just one annoying step too far
> for most users.  I agree that all the output tools within reason can be
> fixed to do this automatically, but fixing cat say, just so
> cat /proc/partitions works would never be acceptable upstream.
>
> The reason for storing this in the kernel is just that it's easier than
> trying to update all the tools, and it solves 90% of the problem, which
> makes the solution usable, even if we have to update tools to get to
> 100%.

I don't think we can even solve 10% of the problems that way. It's
just a hack that makes stuff a bit more pretty, but doesn't provide
any reasonable solution to the problem. I doubt we can even come up
with a simple use case for it, such as what name to put into that
field for a multipath setup.

Kay