From: Gabriel Krisman Bertazi
To: linux-fsdevel@vger.kernel.org
Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi, kernel@collabora.com, Theodore Ts'o, Jaegeuk Kim
Subject: [PATCH] unicode: Expose available encodings in sysfs
Date: Sat, 11 Apr 2020 19:58:23 -0400
Message-Id: <20200411235823.2967193-1-krisman@collabora.com>

A filesystem configuration utility has no way to detect which filename
encodings are supported by the running kernel.  This means, for
instance, mkfs has no way to tell if the generated filesystem will be
mountable in the current kernel or not.  Also, users have no easy way to
know if they can update the encoding in their filesystems and still have
something functional in the end.

This exposes details of the encodings available in the unicode
subsystem, to fill that gap.

Cc: Theodore Ts'o
Cc: Jaegeuk Kim
Signed-off-by: Gabriel Krisman Bertazi
---
 Documentation/ABI/testing/sysfs-fs-unicode | 13 +++++
 fs/unicode/utf8-core.c                     | 64 ++++++++++++++++++++++
 fs/unicode/utf8-norm.c                     | 18 ++++++
 fs/unicode/utf8n.h                         |  5 ++
 4 files changed, 100 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-fs-unicode

diff --git a/Documentation/ABI/testing/sysfs-fs-unicode b/Documentation/ABI/testing/sysfs-fs-unicode
new file mode 100644
index 000000000000..15c63367bb8e
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-fs-unicode
@@ -0,0 +1,13 @@
+What:		/sys/fs/unicode/latest
+Date:		April 2020
+Contact:	Gabriel Krisman Bertazi
+Description:
+		The latest version of the Unicode Standard supported by
+		this kernel
+
+What:		/sys/fs/unicode/encodings
+Date:		April 2020
+Contact:	Gabriel Krisman Bertazi
+Description:
+		List of encodings and corresponding versions supported
+		by this kernel
diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
index 2a878b739115..7e0282707435 100644
--- a/fs/unicode/utf8-core.c
+++ b/fs/unicode/utf8-core.c
@@ -6,6 +6,7 @@
 #include
 #include
 #include
+#include
 
 #include "utf8n.h"
 
@@ -212,4 +213,67 @@ void utf8_unload(struct unicode_map *um)
 }
 EXPORT_SYMBOL(utf8_unload);
 
+static ssize_t latest_show(struct kobject *kobj,
+			   struct kobj_attribute *attr, char *buf)
+{
+	int l = utf8version_latest();
+
+	return snprintf(buf, PAGE_SIZE, "UTF-8 %d.%d.%d\n", UNICODE_AGE_MAJ(l),
+			UNICODE_AGE_MIN(l), UNICODE_AGE_REV(l));
+
+}
+static ssize_t encodings_show(struct kobject *kobj,
+			      struct kobj_attribute *attr, char *buf)
+{
+	int n;
+
+	n = snprintf(buf, PAGE_SIZE, "UTF-8:");
+	n += utf8version_list(buf + n, PAGE_SIZE - n);
+	n += snprintf(buf+n, PAGE_SIZE-n, "\n");
+
+	return n;
+}
+
+#define UCD_ATTR(x) \
+	static struct kobj_attribute x ## _attr = __ATTR_RO(x)
+
+UCD_ATTR(latest);
+UCD_ATTR(encodings);
+
+static struct attribute *ucd_attrs[] = {
+	&latest_attr.attr,
+	&encodings_attr.attr,
+	NULL,
+};
+static const struct attribute_group ucd_attr_group = {
+	.attrs	= ucd_attrs,
+};
+static struct kobject *ucd_root;
+
+int __init ucd_init(void)
+{
+	int ret;
+
+	ucd_root = kobject_create_and_add("unicode", fs_kobj);
+	if (!ucd_root)
+		return -ENOMEM;
+
+	ret = sysfs_create_group(ucd_root, &ucd_attr_group);
+	if (ret) {
+		kobject_put(ucd_root);
+		ucd_root = NULL;
+		return ret;
+	}
+
+	return 0;
+}
+
+void __exit ucd_exit(void)
+{
+	kobject_put(ucd_root);
+}
+
+module_init(ucd_init);
+module_exit(ucd_exit);
+
 MODULE_LICENSE("GPL v2");
diff --git a/fs/unicode/utf8-norm.c b/fs/unicode/utf8-norm.c
index 1d2d2e5b906a..f9ebba89a138 100644
--- a/fs/unicode/utf8-norm.c
+++ b/fs/unicode/utf8-norm.c
@@ -35,6 +35,24 @@ int utf8version_latest(void)
 }
 EXPORT_SYMBOL(utf8version_latest);
 
+int utf8version_list(char *buf, int len)
+{
+	int i = ARRAY_SIZE(utf8agetab) - 1;
+	int ret = 0;
+
+	/*
+	 * Print most relevant (latest) first.  No filesystem uses
+	 * unicode < 12.0.0, so don't expose them to userspace.
+	 */
+	for (; utf8agetab[i] >= UNICODE_AGE(12, 0, 0); i--) {
+		ret += snprintf(buf+ret, len-ret, " %d.%d.%d",
+				UNICODE_AGE_MAJ(utf8agetab[i]),
+				UNICODE_AGE_MIN(utf8agetab[i]),
+				UNICODE_AGE_REV(utf8agetab[i]));
+	}
+	return ret;
+}
+
 /*
  * UTF-8 valid ranges.
  *
diff --git a/fs/unicode/utf8n.h b/fs/unicode/utf8n.h
index 0acd530c2c79..5dea2c4af1f3 100644
--- a/fs/unicode/utf8n.h
+++ b/fs/unicode/utf8n.h
@@ -21,9 +21,14 @@
 	 ((unsigned int)(MIN) << UNICODE_MIN_SHIFT) |	\
 	 ((unsigned int)(REV)))
 
+#define UNICODE_AGE_MAJ(x) ((x) >> UNICODE_MAJ_SHIFT & 0xff)
+#define UNICODE_AGE_MIN(x) ((x) >> UNICODE_MIN_SHIFT & 0xff)
+#define UNICODE_AGE_REV(x) ((x) & 0xff)
+
 /* Highest unicode version supported by the data tables. */
 extern int utf8version_is_supported(u8 maj, u8 min, u8 rev);
 extern int utf8version_latest(void);
+extern int utf8version_list(char *buf, int len);
 
 /*
  * Look for the correct const struct utf8data for a unicode version.
-- 
2.26.0
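
For illustration only, and not part of the patch: a userspace tool such
as mkfs might consume the proposed /sys/fs/unicode/encodings file
roughly as sketched below.  The helper name encoding_listed() is
hypothetical, and the sketch assumes the single-line "UTF-8: X.Y.Z ..."
format that encodings_show() emits in the diff above; if that format
changes during review, the parsing would have to change with it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace helper (not part of the patch).
 *
 * Returns 1 if `line`, assumed to be in the "<enc>: X.Y.Z X.Y.Z ...\n"
 * format produced by encodings_show(), lists version maj.min.rev for
 * encoding `enc`; returns 0 otherwise.
 */
static int encoding_listed(const char *line, const char *enc,
			   int maj, int min, int rev)
{
	size_t enc_len = strlen(enc);
	const char *p;
	int a, b, c, used;

	/* The line must start with "<enc>:". */
	if (strncmp(line, enc, enc_len) != 0 || line[enc_len] != ':')
		return 0;

	p = line + enc_len + 1;

	/*
	 * Each entry is " %d.%d.%d".  %n stores how many characters were
	 * consumed so the next iteration starts at the following entry.
	 */
	while (sscanf(p, " %d.%d.%d%n", &a, &b, &c, &used) == 3) {
		if (a == maj && b == min && c == rev)
			return 1;
		p += used;
	}

	return 0;
}
```

A caller would read the first line of /sys/fs/unicode/encodings with
fopen()/fgets() and pass it to the helper, treating a missing file as
"kernel does not expose this interface" rather than as "no encodings
supported".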