From: Kyle Moffett
To: "Ahmed S. Darwish"
Cc: Adrian Bunk, Casey Schaufler, akpm@osdl.org, torvalds@osdl.org, linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Smackv10: Smack rules grammar + their stateful parser
Date: Tue, 6 Nov 2007 07:49:26 -0500
Message-Id: <46B300CD-7A35-4E81-A0AE-EC2BB72016DD@mac.com>

On Nov 06, 2007, at 07:23:36, Ahmed S. Darwish wrote:
> On 11/6/07, Adrian Bunk wrote:
>> On Tue, Nov 06, 2007 at 01:34:05PM +0200, Ahmed S. Darwish wrote:
>>> As far as I understand the problem now, isspace() accepts the
>>> 0xa0 character which might collide with some of UTF-8 encoded
>>> characters cause the high bit is set.
>
> I admit I'm not experienced in such encoding stuff, but shouldn't
> the ASCII and the ASCII-compatible UTF-8 encodings be enough for
> the labels?
>
>> It would not work if someone would e.g. give you UTF-16 encoded
>> strings, but I don't see this happening in practice.
>
> Won't this complicate the code too much ?

Well, the VFS (for example) certainly doesn't support any encodings
other than the various extended-ASCII forms (which include UTF-8).
Something like UTF-16 puts a null byte alongside every ASCII
character, and as such would fail completely if passed to the VFS.

Personally I think that isspace() accepting character 0xA0 is a bug,
as there are several variants of extended ASCII, only one of which has
that character as a space. Others have it as an accented A, etc.

In addition, the "canonical" internal text format of the kernel is
UTF-8, as that encoding can represent any character in any other
encoding and is backwards-compatible with traditional ASCII.

Cheers,
Kyle Moffett