From: Pali Rohár
To: linux-fsdevel@vger.kernel.org, linux-ntfs-dev@lists.sourceforge.net,
    linux-cifs@vger.kernel.org, jfs-discussion@lists.sourceforge.net,
    linux-kernel@vger.kernel.org, Alexander Viro, Jan Kara, OGAWA Hirofumi,
    "Theodore Y . Ts'o", Luis de Bethencourt, Salah Triki, Andrew Morton,
    Dave Kleikamp, Anton Altaparmakov, Pavel Machek, Marek Behún,
    Christoph Hellwig
Subject: [RFC PATCH 16/20] jfs: Do not use broken utf8 NLS table for iocharset=utf8 mount option
Date: Sun, 8 Aug 2021 18:24:49 +0200
Message-Id: <20210808162453.1653-17-pali@kernel.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20210808162453.1653-1-pali@kernel.org>
References: <20210808162453.1653-1-pali@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The NLS table for utf8 is broken and cannot be fixed.

So instead of the broken utf8 NLS functions char2uni() and uni2char(), use
the functions utf8s_to_utf16s() and utf16s_to_utf8s(), which implement
correct conversion between UTF-16 and UTF-8.

These functions also process UTF-16 surrogate pairs correctly, so after
this change the jfs driver is able to correctly handle file names that
contain 4-byte UTF-8 sequences.

When iocharset=utf8 is used, set sbi->nls_tab to NULL and use that to
distinguish whether the NLS table or the native UTF-8 functions should be
used.

Signed-off-by: Pali Rohár
---
 fs/jfs/jfs_unicode.c | 17 +++++++++++++++--
 fs/jfs/super.c       | 24 +++++++++++++++---------
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/fs/jfs/jfs_unicode.c b/fs/jfs/jfs_unicode.c
index 2db923872bf1..4c39b6b65bca 100644
--- a/fs/jfs/jfs_unicode.c
+++ b/fs/jfs/jfs_unicode.c
@@ -46,6 +46,9 @@ int jfs_strfromUCS_le(char *to, int maxlen, const __le16 * from,
 				}
 			}
 		}
+	} else {
+		outlen = utf16s_to_utf8s(from, len,
+				UTF16_LITTLE_ENDIAN, to, maxlen-1);
 	}
 	to[outlen] = 0;
 	return outlen;
@@ -61,6 +64,7 @@ static int jfs_strtoUCS(wchar_t * to, const unsigned char *from, int len,
 		struct nls_table *codepage)
 {
 	int charlen;
+	int outlen;
 	int i;
 
 	if (codepage) {
@@ -75,10 +79,19 @@ static int jfs_strtoUCS(wchar_t * to, const unsigned char *from, int len,
 				return charlen;
 			}
 		}
+		outlen = i;
+	} else {
+		outlen = utf8s_to_utf16s(from, len, UTF16_LITTLE_ENDIAN,
+				to, len);
+		if (outlen < 1) {
+			jfs_err("jfs_strtoUCS: utf8s_to_utf16s returned %d.",
+				outlen);
+			return outlen;
+		}
 	}
 
-	to[i] = 0;
-	return i;
+	to[outlen] = 0;
+	return outlen;
 }
 
 /*
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 8ba2ac032292..f449fdd56654 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -261,16 +261,20 @@ static int parse_options(char *options, struct super_block *sb, s64 *newLVSize,
 			/* Don't do anything ;-) */
 			break;
 		case Opt_iocharset:
-			if (nls_map && nls_map != (void *) -1)
+			if (nls_map && nls_map != (void *) -1) {
 				unload_nls(nls_map);
-			/* compatibility alias none means ISO-8859-1 */
-			if (strcmp(args[0].from, "none") == 0)
-				nls_map = load_nls("iso8859-1");
-			else
-				nls_map = load_nls(args[0].from);
-			if (!nls_map) {
-				pr_err("JFS: charset not found\n");
-				goto cleanup;
+				nls_map = NULL;
+			}
+			if (strcmp(args[0].from, "utf8") != 0) {
+				/* compatibility alias none means ISO-8859-1 */
+				if (strcmp(args[0].from, "none") == 0)
+					nls_map = load_nls("iso8859-1");
+				else
+					nls_map = load_nls(args[0].from);
+				if (!nls_map) {
+					pr_err("JFS: charset not found\n");
+					goto cleanup;
+				}
 			}
 			break;
 		case Opt_resize:
@@ -718,6 +722,8 @@ static int jfs_show_options(struct seq_file *seq, struct dentry *root)
 		seq_printf(seq, ",discard=%u", sbi->minblks_trim);
 	if (sbi->nls_tab)
 		seq_printf(seq, ",iocharset=%s", sbi->nls_tab->charset);
+	else
+		seq_puts(seq, ",iocharset=utf8");
 	if (sbi->flag & JFS_ERR_CONTINUE)
 		seq_printf(seq, ",errors=continue");
 	if (sbi->flag & JFS_ERR_PANIC)
-- 
2.20.1
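
For readers unfamiliar with why surrogate pairs matter here, the conversion
described in the commit message can be illustrated with a small userspace
sketch. This is not the kernel's utf8s_to_utf16s() implementation; the helper
names sketch_utf8_decode() and sketch_utf16_encode() are hypothetical and
exist only for this illustration. The sketch shows that a 4-byte UTF-8
sequence encodes a code point above U+FFFF, which needs two UTF-16 code units
and so cannot be expressed by a conversion that yields one 16-bit unit per
character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper, not the kernel's utf8s_to_utf16s().
 * Decodes one UTF-8 sequence from s (at most len bytes) into *cp.
 * Returns the number of bytes consumed, or -1 on malformed input.
 * For brevity it does not reject overlong encodings or encoded surrogates.
 */
static int sketch_utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
	int n, i;

	if (len == 0)
		return -1;
	if (s[0] < 0x80) {			/* 1 byte: plain ASCII */
		*cp = s[0];
		return 1;
	} else if ((s[0] & 0xE0) == 0xC0) {	/* 2-byte lead */
		*cp = s[0] & 0x1F;
		n = 2;
	} else if ((s[0] & 0xF0) == 0xE0) {	/* 3-byte lead */
		*cp = s[0] & 0x0F;
		n = 3;
	} else if ((s[0] & 0xF8) == 0xF0) {	/* 4-byte lead */
		*cp = s[0] & 0x07;
		n = 4;
	} else {
		return -1;
	}
	if (len < (size_t)n)
		return -1;
	for (i = 1; i < n; i++) {
		if ((s[i] & 0xC0) != 0x80)	/* must be a continuation byte */
			return -1;
		*cp = (*cp << 6) | (s[i] & 0x3F);
	}
	return n;
}

/*
 * Hypothetical helper: encodes code point cp as UTF-16 code units.
 * Returns the number of units written to out (1 or 2), or -1 if cp is
 * outside the Unicode range.
 */
static int sketch_utf16_encode(uint32_t cp, uint16_t out[2])
{
	if (cp < 0x10000) {
		out[0] = (uint16_t)cp;
		return 1;
	}
	if (cp > 0x10FFFF)
		return -1;
	cp -= 0x10000;
	out[0] = (uint16_t)(0xD800 | (cp >> 10));	/* high surrogate */
	out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));	/* low surrogate */
	return 2;
}
```

For example, the 4-byte sequence F0 9F 98 80 decodes to U+1F600, which the
second helper encodes as the surrogate pair D83D DE00.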