Date: Mon, 20 Jan 2020 12:32:15 -0500
From: "Theodore Y. Ts'o"
To: OGAWA Hirofumi
Cc: Pali Rohár, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Namjae Jeon, Gabriel Krisman Bertazi
Subject: Re: vfat: Broken case-insensitive support for UTF-8
Message-ID: <20200120173215.GF15860@mit.edu>
References: <20200119221455.bac7dc55g56q2l4r@pali> <87sgkan57p.fsf@mail.parknet.co.jp>
In-Reply-To: <87sgkan57p.fsf@mail.parknet.co.jp>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 20, 2020 at 01:04:42PM +0900, OGAWA Hirofumi wrote:
>
> To be perfect, the table would have to emulate what Windows use. It can
> be unicode standard, or something other. And other fs can use different
> what Windows use.

The big question is *which* version of Windows. vfat has been in use
for over two decades, and vfat predates Windows starting to use Unicode
in 2001. Before that, vfat would have been using whatever code page
its local Windows installation was set to use; and I'm not sure if
there was space in the FAT headers to indicate the code page in use.

It would be entertaining for someone with ancient versions of Windows
9x to create some floppy images using codepage 437 and 450, and then
see what a modern Windows system does with those VFAT images --- would
it break horribly when it tries to interpret them as UTF-16? Or would
it figure it out? And if so, how? Inquiring minds want to know....

Bonus points if the lack of forwards compatibility causes older
versions of Windows to Blue Screen. :-)

					- Ted

P.S. And of course, then there's the question of how older versions
of Windows handle versions of Unicode which postdate the release date
of that particular version of Windows. After all, Unicode adds new
code points with potential revisions to the case folding table every
6-12 months.
(The most recent version of Unicode was released in April 2019 to
accommodate the new Japanese kanji character "Rei" for the current era
name with the elevation of the new reigning emperor of Japan.)
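
As a rough illustration of the code page question raised above (this is not from the original thread), the following Python sketch shows how the same stored name bytes read differently depending on which encoding the reader assumes. The file name, the helper `reinterpret`, and the choice of code page 850 as the "other" code page are all assumptions made for the example; it says nothing about what any version of Windows actually does with such an image.

```python
# Illustrative only: the file name and the choice of code pages are assumptions,
# and this says nothing about what any version of Windows actually does.

def reinterpret(raw: bytes, encoding: str) -> str:
    """Decode raw name bytes under `encoding`; undecodable input becomes U+FFFD."""
    return raw.decode(encoding, errors="replace")

# A hypothetical name as a system configured for code page 437 might store it.
original = "CAFÉ¢.TXT"
raw = original.encode("cp437")

as_cp437 = reinterpret(raw, "cp437")      # the writer's own code page
as_cp850 = reinterpret(raw, "cp850")      # a reader configured for another code page
as_utf16 = reinterpret(raw, "utf-16-le")  # a reader that assumes UTF-16

for label, text in [("cp437", as_cp437), ("cp850", as_cp850), ("utf-16-le", as_utf16)]:
    print(f"{label:10} {text!r}")
```

Only the first decoding recovers the original name. Nothing in the raw bytes says which code page wrote them, so the other two readers silently produce a different name, and any case-insensitive comparison would then be applied to the wrong characters.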