Date: Mon, 20 Jan 2020 12:37:49 -0500
From: "Theodore Y. Ts'o"
To: David Laight
Cc: "'Pali Rohár'", OGAWA Hirofumi, "linux-kernel@vger.kernel.org",
    "linux-fsdevel@vger.kernel.org", Namjae Jeon, Gabriel Krisman Bertazi
Subject: Re: vfat: Broken case-insensitive support for UTF-8
Message-ID: <20200120173749.GG15860@mit.edu>
References: <20200119221455.bac7dc55g56q2l4r@pali>
    <87sgkan57p.fsf@mail.parknet.co.jp>
    <20200120110438.ak7jpyy66clx5v6x@pali>
    <89eba9906011446f8441090f496278d2@AcuMS.aculab.com>
In-Reply-To: <89eba9906011446f8441090f496278d2@AcuMS.aculab.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 20, 2020 at 03:07:20PM +0000, David Laight wrote:
> What happens if the filesystem has filenames that invalid UTF8 sequences
> or multiple filenames that decode from UTF8 to the same 'wchar' value.
> Never mind ones that are just case-differences for the same filename.
>
> UTF8 is just so broken it should never have been allowed to become
> a standard.

Internationalization is an overconstrained problem which is impacted
and influenced by human politics, including from the Cold War and who
attended which internal standards bodies meetings.

So much so that an I18N expert (very knowledgeable about the problems
in this domain) has been known to have said (in a bar, late at night,
and after much alcohol) that it would be simpler to teach the entire
human race English.

Unfortunately, that's not going to happen, and if we are going to deal
with the market of "everyone who doesn't speak English", we're going
to have to live with Unicode, warts and all.

Seriously speaking, UTF-8 is the worst encoding, except for all of the
others.  :-)

					- Ted