Return-Path:
Received: from acsinet11.oracle.com ([141.146.126.233]:64865 "EHLO
        acsinet11.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752320AbZIJV0q (ORCPT );
        Thu, 10 Sep 2009 17:26:46 -0400
Cc: Trond Myklebust, Jeff Layton, steved@redhat.com,
        linux-nfs@vger.kernel.org
Message-Id:
From: Chuck Lever
To: "J. Bruce Fields"
In-Reply-To: <20090910204959.GN14406@fieldses.org>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Subject: Re: [PATCH 1/4] nfs-utils: introduce new statd implementation
        (1st part)
Date: Thu, 10 Sep 2009 17:26:11 -0400
References: <1249515004.5428.34.camel@heimdal.trondhjem.org>
        <20090909142945.755da393@tlielax.poochiereds.net>
        <1252521599.8722.53.camel@heimdal.trondhjem.org>
        <20B7C2F0-E566-4292-91E9-41A3FA6C9D4C@oracle.com>
        <1252525327.8722.81.camel@heimdal.trondhjem.org>
        <20090910150319.GA10704@fieldses.org>
        <20090910162327.GE11858@fieldses.org>
        <20090910204959.GN14406@fieldses.org>
Sender: linux-nfs-owner@vger.kernel.org
List-ID:
MIME-Version: 1.0

On Sep 10, 2009, at 4:49 PM, J. Bruce Fields wrote:

> On Thu, Sep 10, 2009 at 04:39:51PM -0400, Chuck Lever wrote:
>> On Sep 10, 2009, at 12:23 PM, J. Bruce Fields wrote:
>>> On Thu, Sep 10, 2009 at 12:14:27PM -0400, Chuck Lever wrote:
>>>> On Sep 10, 2009, at 11:03 AM, J. Bruce Fields wrote:
>>>>> On Wed, Sep 09, 2009 at 06:18:11PM -0400, Chuck Lever wrote:
>>>>>> IDNs are UTF16.  /var therefore has to support UTF16 filenames;
>>>>>> either byte in a double-byte character can be '/' or '\0'.
>>>>>> That means the underlying fs implementation has to support
>>>>>> UTF16 (FAT32 anyone?), and the system's locale has to be
>>>>>> configured correctly.  If we decide not to depend on the file
>>>>>> system to support UTF16 filenames, then statd has to be
>>>>>> intelligent enough to figure out how to deal with converting
>>>>>> UTF16 hostnames before storing them as filenames.
>>>>>> Then, we have to teach matchhostname() and friends how to deal
>>>>>> with double-byte character strings...
>>>>>
>>>>> Googling around.... Is this accurate?:
>>>>>
>>>>> http://en.wikipedia.org/wiki/Internationalized_domain_name
>>>>>
>>>>> That makes it sound like domain names are staying ascii, and
>>>>> they're just adding something on top to allow encoding unicode
>>>>> using ascii, which may optionally be used by applications.
>>>>
>>>> There is a mechanism that provides an ASCII-ized version of domain
>>>> names that may contain non-ASCII characters, expressly for
>>>> applications that need to perform DNS queries but can't be easily
>>>> converted to handle double-byte character strings.  This can be
>>>> adapted for statd, though I'm not sure if the converted ASCII
>>>> version of such names specifically exclude '/'.
>>>>
>>>> Internationalized domain names themselves are still expressed in
>>>> UTF16, as far as I understand it.
>>>
>>> From a quick skim of http://www.ietf.org/rfc/rfc3490.txt, it appears
>>> to me that protocols (at the very least, any preexisting protocols)
>>> are all expected to use the ascii representation on the wire, and
>>> that the translation to unicode is meant by use for applications.
>>>
>>> So in our case we'd continue to expect ascii domain names on the
>>> wire, and I believe that's also what we should store in any
>>> database.  But if someone were to write a gui administrative
>>> interface to that data, for example, they might choose to use idna
>>> for display.
>>
>> That's a reasonable and specific objection to my claim that our
>> current host record storage format is inadequate to support IDNA.
>> I've also confirmed that ToAscii with the UseSTD3ASCIIRules flag set
>> is not supposed to generate a domain label string with a '/' in it.
>> My remaining concern here is that we could possibly see hostnames
>> that are too long to be stored in directory entries of some file
>> systems, especially considering that the ASCII-fied Unicode names
>> will be longer than typical ASCII names we normally encounter today.
>
> Googling around some more.... the normal limits for dns appear to be
> 63 bytes per component, and 255 for the whole string, and those
> limits are still in force on the output of that mapping.  I suspect
> this isn't a huge deal.

I bring this up because NI_MAXHOST, declared in /usr/include/netdb.h,
is 1025, not 255.  You are probably correct, practically speaking.

>> What about multi-homed host support?  The same mon_name can be used
>> with more than one my_name, for multi-homed hosts.  Using the
>> current on-disk scheme, statd turns that SM_MON request into a
>> no-op.
>
> I don't know what we're supposed to do in that case.  You want to
> store them all so you can send notifies to them all on reboot?

Something like that.  Basically I think we want to send SM_NOTIFY to
the monitored host _from_ every registered my_name we have for that
mon_name.  The kernel will probably use a separate nsm_host for each
one of these, so statd should probably keep track of each of the
cookies as well.

Remembering which my_names have been used is important because the
remote often uses the sender's name or IP address to identify which
monitored host has rebooted.

We could throw up our hands and just keep track of all the my_names
that were used during the last reboot, and notify each mon_name from
all of those.  That doesn't help with remembering the cookies, though,
and makes sending all notifications at reboot take longer.

>> So additional records for the same hostname can't be stored, or we
>> have to resort to adding multiple lines in the same file.
>> This is possible to do with just POSIX file system calls, but it
>> does add complexity to manage several lines in each hostname file
>> without increasing the risk of corruption if a file update
>> (especially the deletion of one record in the middle) is
>> interrupted.
>>
>> --
>> Chuck Lever
>> chuck[dot]lever[at]oracle[dot]com

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
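
As a footnote to the length and '/' concerns discussed in this thread,
here is a minimal sketch in C of the kind of check that could be
applied before a hostname is used as a file name.  The function name
hostname_fits_dirent() and the two constant names are hypothetical and
are not existing nfs-utils code.  The limits are the 63-byte and
255-byte figures quoted in the thread, and name_max is assumed to come
from something like pathconf(dir, _PC_NAME_MAX).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Limits quoted in the thread; the constant names are hypothetical. */
#define DNS_LABEL_MAX	63
#define DNS_NAME_MAX	255

/*
 * Return 1 if "name" could be stored as a single directory entry:
 * it contains no '/', every dot-separated label is 1..63 bytes, and
 * the whole string fits both the DNS limit and the file system's
 * name_max.  A negative name_max is treated as "no limit".
 */
int hostname_fits_dirent(const char *name, long name_max)
{
	size_t total, label = 0;
	const char *p;

	assert(name != NULL);

	total = strlen(name);
	if (total == 0 || total > DNS_NAME_MAX)
		return 0;
	if (name_max >= 0 && total > (size_t)name_max)
		return 0;

	for (p = name; *p != '\0'; p++) {
		if (*p == '/')
			return 0;	/* would escape the directory */
		if (*p == '.') {
			if (label == 0)
				return 0;	/* empty label */
			label = 0;
			continue;
		}
		if (++label > DNS_LABEL_MAX)
			return 0;	/* label too long */
	}
	return 1;
}
```

A caller might pass the pathconf() result for whatever directory holds
the monitor records.  pathconf() returns -1 without changing errno when
the file system reports no limit, which is why the sketch treats a
negative name_max as unlimited.  A real caller would still need to tell
that case apart from an error return.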