From: OGAWA Hirofumi
To: Pali Rohár
Cc: linux-fsdevel@vger.kernel.org, linux-ntfs-dev@lists.sourceforge.net,
    linux-cifs@vger.kernel.org, jfs-discussion@lists.sourceforge.net,
    linux-kernel@vger.kernel.org, Alexander Viro, Jan Kara,
    "Theodore Y . Ts'o", Anton Altaparmakov, Luis de Bethencourt,
    Salah Triki, Steve French, Paulo Alcantara, Ronnie Sahlberg,
    Shyam Prasad N, Tom Talpey, Dave Kleikamp, Andrew Morton,
    Pavel Machek, Christoph Hellwig, Kari Argillander, Viacheslav Dubeyko
Subject: Re: [RFC PATCH v2 01/18] fat: Fix iocharset=utf8 mount option
In-Reply-To: <20221226142150.13324-2-pali@kernel.org> ("Pali Rohár"'s message of "Mon, 26 Dec 2022 15:21:33 +0100")
References: <20221226142150.13324-1-pali@kernel.org> <20221226142150.13324-2-pali@kernel.org>
Date: Tue, 10 Jan 2023 18:17:05 +0900
Message-ID: <874jsyvje6.fsf@mail.parknet.co.jp>
X-Mailing-List: linux-kernel@vger.kernel.org

Pali Rohár writes:

> Currently iocharset=utf8 mount option is broken and error is printed to
> dmesg when it is used. To use UTF-8 as iocharset, it is required to use
> utf8=1 mount option.
>
> Fix iocharset=utf8 mount option to use be equivalent to the utf8=1 mount
> option and remove printing error from dmesg.

[...]

> -
> -	There is also an option of doing UTF-8 translations
> -	with the utf8 option.
> -
> -.. note:: ``iocharset=utf8`` is not recommended. If unsure, you should consider
> -	the utf8 option instead.
> +	**utf8** is supported too and recommended to use.
>
>  **utf8=**
> -	UTF-8 is the filesystem safe version of Unicode that
> -	is used by the console. It can be enabled or disabled
> -	for the filesystem with this option.
> -	If 'uni_xlate' gets set, UTF-8 gets disabled.
> -	By default, FAT_DEFAULT_UTF8 setting is used.
> +	Alias for ``iocharset=utf8`` mount option.
>
>  **uni_xlate=**
>  	Translate unhandled Unicode characters to special
> diff --git a/fs/fat/Kconfig b/fs/fat/Kconfig
> index 238cc55f84c4..e98aaa3bb55b 100644
> --- a/fs/fat/Kconfig
> +++ b/fs/fat/Kconfig
> @@ -93,29 +93,12 @@ config FAT_DEFAULT_IOCHARSET
>  	  like FAT to use. It should probably match the character set
>  	  that most of your FAT filesystems use, and can be overridden
>  	  with the "iocharset" mount option for FAT filesystems.
> -	  Note that "utf8" is not recommended for FAT filesystems.
> -	  If unsure, you shouldn't set "utf8" here - select the next option
> -	  instead if you would like to use UTF-8 encoded file names by default.
> +	  "utf8" is supported too and recommended to use.

This patch only partially fixes the UTF-8 issue. I don't think we can
recommend an option that still only partially works.

[...]

> -	opts->utf8 = IS_ENABLED(CONFIG_FAT_DEFAULT_UTF8) && is_vfat;
> -
>  	if (!options)
>  		goto out;
>
> @@ -1318,10 +1316,14 @@ static int parse_options(struct super_block *sb, char *options, int is_vfat,
>  						| VFAT_SFN_CREATE_WIN95;
>  			break;
>  		case Opt_utf8_no:		/* 0 or no or false */
> -			opts->utf8 = 0;
> +			fat_reset_iocharset(opts);

This changes the behavior of, for example, "iocharset=iso8859-1,utf8=no".
Do we need this user-visible change here?
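To make the concern concrete, here is a minimal userspace sketch, not
kernel code. It assumes that fat_reset_iocharset() puts opts->iocharset
back to the compiled-in default, and that the default is not "utf8". All
names in it (model_opts, model_step, apply_old, apply_new,
MODEL_DEFAULT_IOCHARSET) are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_FAT_DEFAULT_IOCHARSET (assumption). */
#define MODEL_DEFAULT_IOCHARSET "ascii"

/* Hypothetical, reduced model of the two fields under discussion. */
struct model_opts {
	const char *iocharset;
	int utf8;
};

enum model_opt { OPT_IOCHARSET, OPT_UTF8_NO, OPT_UTF8_YES };

struct model_step {
	enum model_opt opt;
	const char *arg;	/* only used by OPT_IOCHARSET */
};

/* Models the pre-patch handling: utf8= only touches the utf8 flag. */
static void apply_old(struct model_opts *o, const struct model_step *s, size_t n)
{
	size_t i;

	assert(o && s);
	o->iocharset = MODEL_DEFAULT_IOCHARSET;
	o->utf8 = 0;	/* assumes CONFIG_FAT_DEFAULT_UTF8 is disabled */
	for (i = 0; i < n; i++) {
		switch (s[i].opt) {
		case OPT_IOCHARSET:
			o->iocharset = s[i].arg;
			break;
		case OPT_UTF8_NO:
			o->utf8 = 0;
			break;
		case OPT_UTF8_YES:
			o->utf8 = 1;
			break;
		}
	}
}

/* Models the quoted patch: utf8= rewrites iocharset, the flag is derived. */
static void apply_new(struct model_opts *o, const struct model_step *s, size_t n)
{
	size_t i;

	assert(o && s);
	o->iocharset = MODEL_DEFAULT_IOCHARSET;
	for (i = 0; i < n; i++) {
		switch (s[i].opt) {
		case OPT_IOCHARSET:
			o->iocharset = s[i].arg;
			break;
		case OPT_UTF8_NO:
			/* assumed effect of fat_reset_iocharset() */
			o->iocharset = MODEL_DEFAULT_IOCHARSET;
			break;
		case OPT_UTF8_YES:
			o->iocharset = "utf8";
			break;
		}
	}
	o->utf8 = !strcmp(o->iocharset, "utf8");
}
```

Feeding both functions the sequence { OPT_IOCHARSET "iso8859-1",
OPT_UTF8_NO } leaves iocharset at "iso8859-1" in the old model, while
the new model falls back to the default, so the explicitly requested
charset is silently dropped.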
>  			break;
>  		case Opt_utf8_yes:		/* empty or 1 or yes or true */
> -			opts->utf8 = 1;
> +			fat_reset_iocharset(opts);
> +			iocharset = kstrdup("utf8", GFP_KERNEL);
> +			if (!iocharset)
> +				return -ENOMEM;
> +			opts->iocharset = iocharset;
>  			break;
>  		case Opt_uni_xl_no:		/* 0 or no or false */
>  			opts->unicode_xlate = 0;
> @@ -1359,18 +1361,11 @@ static int parse_options(struct super_block *sb, char *options, int is_vfat,
>  	}
>
>  out:
> -	/* UTF-8 doesn't provide FAT semantics */
> -	if (!strcmp(opts->iocharset, "utf8")) {
> -		fat_msg(sb, KERN_WARNING, "utf8 is not a recommended IO charset"
> -		       " for FAT filesystems, filesystem will be "
> -		       "case sensitive!");
> -	}
> +	opts->utf8 = !strcmp(opts->iocharset, "utf8") && is_vfat;

This is still broken, so I think we still need the warning here (with
tweaked wording).

>  	/* If user doesn't specify allow_utime, it's initialized from dmask. */
>  	if (opts->allow_utime == (unsigned short)-1)
>  		opts->allow_utime = ~opts->fs_dmask & (S_IWGRP | S_IWOTH);
> -	if (opts->unicode_xlate)
> -		opts->utf8 = 0;

The unicode_xlate option is mutually exclusive with utf8, so this needs
to be handled somewhere. (With this patch, both unicode_xlate and utf8
will be shown by show_options().)

> +	else if (utf8)
> +		return fat_utf8_strnicmp(name->name, str, alen);
> +	else
> +		return nls_strnicmp(t, name->name, str, alen);
>  }

Not a strong opinion, but maybe we should consolidate this into an
(inline) function? (FWIW, it may be better to refactor and provide some
filename helper functions that hide the details of nls/utf8 handling.)

Thanks.
-- 
OGAWA Hirofumi