Date: Mon, 20 Jan 2020 12:19:16 +0100
From: Pali Rohár
To: Al Viro
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, "Theodore Y. Ts'o", OGAWA Hirofumi, Namjae Jeon, Gabriel Krisman Bertazi
Subject: Re: vfat: Broken case-insensitive support for UTF-8
Message-ID: <20200120111916.pc2ml2farnga3yen@pali>
In-Reply-To: <20200120000931.GX8904@ZenIV.linux.org.uk>
References: <20200119221455.bac7dc55g56q2l4r@pali> <20200119230809.GW8904@ZenIV.linux.org.uk> <20200119233348.es5m63kapdvyesal@pali> <20200120000931.GX8904@ZenIV.linux.org.uk>

On Monday 20 January 2020 00:09:31 Al Viro wrote:
> On Mon, Jan 20, 2020 at 12:33:48AM +0100, Pali Rohár wrote:
>
> > > Does the behaviour match how Windows handles that thing?
> >
> > Linux behavior does not match Windows behavior.
> >
> > On Windows is FAT32 (fastfat.sys) case insensitive and file names "č"
> > and "Č" are treated as same file. Windows does not allow you to create
> > both files. It says that file already exists.
>
> So how is the mapping specified in their implementation? That's
> obviously the mapping we have to match.

The FAT specification (fatgen103.doc) is just a parody of a
specification. E.g. it requires you to use pencil and paper during
implementation...

About case insensitivity, I found these parts in the specification:

"The UNICODE name passed to the file system is converted to upper case."

"UNICODE solves the case mapping problem prevalent in some OEM code
pages by always providing a translation for lower case characters to
a single, unique upper case character."

Which basically says nothing... I can deduce from it that the Unicode
standard should be used for the mapping table.

But we already know that there are mistakes in that specification, and
what is relevant is the Microsoft FAT implementation (fastfat.sys). It
is now open source on GitHub, so we can inspect how it implements upper
case conversion.

> > > That's the only reason to support that garbage at all...
> >
> > What do you mean by garbage?
>
> Case-insensitive anything... the only reason to have that crap at all
> is that native implementations are basically forcing it as fs
> image correctness issue.

You are right. But we need to deal with it.

> It's worthless on its own merits, but
> we can't do something that amounts to corrupting fs image when
> we access it for write.

If we implement the same upper case conversion as the reference
implementation (fastfat.sys), then we prevent "corrupting fs".

-- 
Pali Rohár
pali.rohar@gmail.com
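
To illustrate what "a single, unique upper case character" would mean
for name comparison, here is a minimal userspace sketch. It is not kernel
code and it is not taken from fastfat.sys. The names demo_upcase,
demo_toupper and demo_names_equal are hypothetical, and the two-entry
table is only a placeholder: a real table would have to match whatever
the reference implementation does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, deliberately tiny upcase table (assumption for the demo).
 * A real table would cover far more characters and would have to match
 * the reference implementation. */
struct upcase_pair {
	uint16_t lower;
	uint16_t upper;
};

static const struct upcase_pair demo_upcase[] = {
	{ 0x00E1, 0x00C1 },	/* U+00E1 (a with acute) -> U+00C1 */
	{ 0x010D, 0x010C },	/* U+010D (c with caron) -> U+010C */
};

/* Map one UTF-16 code unit to exactly one upper case code unit. */
static uint16_t demo_toupper(uint16_t c)
{
	size_t i;

	if (c >= 'a' && c <= 'z')
		return (uint16_t)(c - ('a' - 'A'));
	for (i = 0; i < sizeof(demo_upcase) / sizeof(demo_upcase[0]); i++)
		if (demo_upcase[i].lower == c)
			return demo_upcase[i].upper;
	return c;	/* no mapping known: leave the code unit unchanged */
}

/* Return 1 if both UTF-16 names are equal after upper-casing, else 0. */
static int demo_names_equal(const uint16_t *a, size_t alen,
			    const uint16_t *b, size_t blen)
{
	size_t i;

	assert(a != NULL && b != NULL);
	if (alen != blen)
		return 0;
	for (i = 0; i < alen; i++)
		if (demo_toupper(a[i]) != demo_toupper(b[i]))
			return 0;
	return 1;
}
```

With such a mapping the names "č" (U+010D) and "Č" (U+010C) compare as
equal, so creating the second file would fail as it does on Windows. A
comparison that folds only ASCII letters would treat them as two
different names.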