Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp4783686ybb; Tue, 14 Apr 2020 14:07:02 -0700 (PDT) X-Google-Smtp-Source: APiQypIgtZaz0+XFikWsFFX6yw0mCbySbIBwJjr9rIfpuFhj8+cNP/GxQb10LyXfnNVMVdLFCCQE X-Received: by 2002:a17:906:1387:: with SMTP id f7mr1911022ejc.333.1586898422254; Tue, 14 Apr 2020 14:07:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1586898422; cv=none; d=google.com; s=arc-20160816; b=Ul6nqi/mHYKR0T8rjiAzYiqbEKM1Qs2AnoA4fhayDz7oamII6Dq0mFDdYfonfCEnUQ NZyhDVPujd55Gun7kKfNX+kxzuI+BAblaSnVGRRo/uqIcX/VKg91yxmn+wUvf4v0vsiL XJ5rTBkIJKjvPdyl7JzLl/7T68x/9GXTbO9bBvhOSRwaaF6WrHrcTPfNhKhc9bhzLhx9 6/he2S8IKDw8zV3YWABPq453dYT7/264ylbh0o4WaIQPec369eO/Z0g2LGWNw/uv3qaf LFZLWQJt9Xalv/Tr8rXxy3l79L7IdvFHTvVcsyRUpvWgCttkT8D3EmqSfk5quepBUu7m CGZg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:message-id :in-reply-to:date:references:organization:subject:cc:to:from; bh=mFaHrVjmSMqBgLfEKRQ6xJ0HhnHIOt/stXnf0HKCXaM=; b=sKoNN/feJrqfMdTVIjRW3wgG71ZtIU/IAJOudGXMdSzom8TbSxU1L+b3jOefW28dnW wlsn6N3V3CtpkJ2lZ3JMmdDYN79LgHD2+paZU7hJK5cK+8LTfTOr1EUDskYdLVALYkxX x6r3NubhABqyQH36sx6LzliQehE92ov5e4WHG+/Whx71WDz00B0D7ifXMYNNY0UR+SlQ jPIvkOOWweZLKrV/waZYLHrYSW08f/WuzOjjV7SgcVwdJoqyZ5cJg5PcU79gjDLXjPft 6PUC3c6EmaUtE3lYe6brYo6QiXOcIgdp702AGXL0WCBlKWmdxBaVTiiuJLccCq8/c/uW NXdA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=collabora.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id 13si6799370ejz.499.2020.04.14.14.06.26; Tue, 14 Apr 2020 14:07:02 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=collabora.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387576AbgDMSDC (ORCPT + 99 others); Mon, 13 Apr 2020 14:03:02 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:33152 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387498AbgDMSDB (ORCPT ); Mon, 13 Apr 2020 14:03:01 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id 181402A010D From: Gabriel Krisman Bertazi To: Ezequiel Garcia Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, kernel@collabora.com, Theodore Ts'o , Jaegeuk Kim Subject: Re: [PATCH v2] unicode: Expose available encodings in sysfs Organization: Collabora References: <20200413165352.598877-1-krisman@collabora.com> <2672d7de2033fa14b9835b17b6fc59ba2cc4cd14.camel@collabora.com> Date: Mon, 13 Apr 2020 14:02:56 -0400 In-Reply-To: <2672d7de2033fa14b9835b17b6fc59ba2cc4cd14.camel@collabora.com> (Ezequiel Garcia's message of "Mon, 13 Apr 2020 14:50:13 -0300") Message-ID: <857dyjw9f3.fsf@collabora.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org Ezequiel Garcia writes: > Hey Gabriel, > > On Mon, 2020-04-13 at 12:53 -0400, Gabriel Krisman Bertazi wrote: >> A filesystem configuration utility has no way to detect which filename >> encodings are supported by the running kernel. This means, for >> instance, mkfs has no way to tell if the generated filesystem will be >> mountable in the current kernel or not. Also, users have no easy way to >> know if they can update the encoding in their filesystems and still have >> something functional in the end. >> >> This exposes details of the encodings available in the unicode >> subsystem, to fill that gap. >> >> Cc: Theodore Ts'o >> Cc: Jaegeuk Kim >> Signed-off-by: Gabriel Krisman Bertazi >> >> --- >> Changes since v1: >> - Make init functions static. (lkp) >> >> Documentation/ABI/testing/sysfs-fs-unicode | 13 +++++ >> fs/unicode/utf8-core.c | 64 ++++++++++++++++++++++ >> fs/unicode/utf8-norm.c | 18 ++++++ >> fs/unicode/utf8n.h | 5 ++ >> 4 files changed, 100 insertions(+) >> create mode 100644 Documentation/ABI/testing/sysfs-fs-unicode >> >> diff --git a/Documentation/ABI/testing/sysfs-fs-unicode b/Documentation/ABI/testing/sysfs-fs-unicode >> new file mode 100644 >> index 000000000000..15c63367bb8e >> --- /dev/null >> +++ b/Documentation/ABI/testing/sysfs-fs-unicode >> @@ -0,0 +1,13 @@ >> +What: /sys/fs/unicode/latest >> +Date: April 2020 >> +Contact: Gabriel Krisman Bertazi >> +Description: >> + The latest version of the Unicode Standard supported by >> + this kernel >> + > > Missing stop at the end of the sentence? > >> +What: /sys/fs/unicode/encodings >> +Date: April 2020 >> +Contact: Gabriel Krisman Bertazi >> +Description: >> + List of encodings and corresponding versions supported >> + by this kernel > > Ditto. > >> diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c >> index 2a878b739115..b48e13e823a5 100644 >> --- a/fs/unicode/utf8-core.c >> +++ b/fs/unicode/utf8-core.c >> @@ -6,6 +6,7 @@ >> #include >> #include >> #include >> +#include >> >> #include "utf8n.h" >> >> @@ -212,4 +213,67 @@ void utf8_unload(struct unicode_map *um) >> } >> EXPORT_SYMBOL(utf8_unload); >> >> +static ssize_t latest_show(struct kobject *kobj, >> + struct kobj_attribute *attr, char *buf) >> +{ >> + int l = utf8version_latest(); >> + >> + return snprintf(buf, PAGE_SIZE, "UTF-8 %d.%d.%d\n", UNICODE_AGE_MAJ(l), >> + UNICODE_AGE_MIN(l), UNICODE_AGE_REV(l)); >> + >> +} >> +static ssize_t encodings_show(struct kobject *kobj, >> + struct kobj_attribute *attr, char *buf) >> +{ >> + int n; >> + >> + n = snprintf(buf, PAGE_SIZE, "UTF-8:"); >> + n += utf8version_list(buf + n, PAGE_SIZE - n); >> + n += snprintf(buf+n, PAGE_SIZE-n, "\n"); >> + > > I was wondering how sysfs-compliant this was, > in terms of one value per attribute. > > Although, we do seem to break this on a few cases. I thought about making it more one-value-per-attribute, by considering each tuple (encoding,version) a different object. So the structure would look like: /sys/fs/unicode/utf-8// But it felt completely over-engineered, considering we don't support anything other than utf-8, and I don't see versions having specific attributes. But maybe this way is more future-proof. > >> + return n; >> +} >> + >> +#define UCD_ATTR(x) \ >> + static struct kobj_attribute x ## _attr = __ATTR_RO(x) >> + >> +UCD_ATTR(latest); >> +UCD_ATTR(encodings); >> + >> +static struct attribute *ucd_attrs[] = { >> + &latest_attr.attr, >> + &encodings_attr.attr, >> + NULL, >> +}; >> +static const struct attribute_group ucd_attr_group = { >> + .attrs = ucd_attrs, >> +}; >> +static struct kobject *ucd_root; >> + >> +static int __init ucd_init(void) >> +{ >> + int ret; >> + >> + ucd_root = kobject_create_and_add("unicode", fs_kobj); >> + if (!ucd_root) >> + return -ENOMEM; >> + >> + ret = sysfs_create_group(ucd_root, &ucd_attr_group); >> + if (ret) { >> + kobject_put(ucd_root); >> + ucd_root = NULL; >> + return ret; >> + } >> + >> + return 0; >> +} >> + >> +static void __exit ucd_exit(void) >> +{ >> + kobject_put(ucd_root); >> +} >> + >> +module_init(ucd_init); > > This code is not a module, so how about fs_initcall? Right. for the record, I'm planing to make part of it build as a module in a future patch to reduce the large footprint of the decoding table on systems that don't care about it. Will fix on v3. > >> +module_exit(ucd_exit) >> + > > I can be wrong, but I see no way to remove it :-) > > Thanks, > Ezequiel -- Gabriel Krisman Bertazi