Date: Wed, 8 Jul 2020 02:12:34 -0700
In-Reply-To: <20200708091237.3922153-1-drosen@google.com>
Message-Id: <20200708091237.3922153-2-drosen@google.com>
References: <20200708091237.3922153-1-drosen@google.com>
X-Mailer: git-send-email 2.27.0.383.g050319c2ae-goog
Subject: [PATCH v12 1/4] unicode: Add utf8_casefold_hash
From: Daniel Rosenberg
To: "Theodore Ts'o", linux-ext4@vger.kernel.org, Jaegeuk Kim, Chao Yu,
    linux-f2fs-devel@lists.sourceforge.net, Eric Biggers,
    linux-fscrypt@vger.kernel.org, Alexander Viro
Cc: Andreas Dilger, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Gabriel Krisman Bertazi,
    kernel-team@android.com, Daniel Rosenberg, Eric Biggers
X-Mailing-List: linux-kernel@vger.kernel.org

This adds a case-insensitive hash function that computes the hash
without needing to allocate a casefolded copy of the string.

The existing d_hash implementations for casefolding allocate memory
within rcu-walk; by avoiding that allocation we can be more efficient
and no longer need to worry about it failing.

Signed-off-by: Daniel Rosenberg
Reviewed-by: Gabriel Krisman Bertazi
Reviewed-by: Eric Biggers
---
 fs/unicode/utf8-core.c  | 23 ++++++++++++++++++++++-
 include/linux/unicode.h |  3 +++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
index 2a878b739115..dc25823bfed9 100644
--- a/fs/unicode/utf8-core.c
+++ b/fs/unicode/utf8-core.c
@@ -6,6 +6,7 @@
 #include
 #include
 #include
+#include
 
 #include "utf8n.h"
 
@@ -122,9 +123,29 @@ int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
 	}
 	return -EINVAL;
 }
-
 EXPORT_SYMBOL(utf8_casefold);
 
+int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
+		       struct qstr *str)
+{
+	const struct utf8data *data = utf8nfdicf(um->version);
+	struct utf8cursor cur;
+	int c;
+	unsigned long hash = init_name_hash(salt);
+
+	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
+		return -EINVAL;
+
+	while ((c = utf8byte(&cur))) {
+		if (c < 0)
+			return -EINVAL;
+		hash = partial_name_hash((unsigned char)c, hash);
+	}
+	str->hash = end_name_hash(hash);
+	return 0;
+}
+EXPORT_SYMBOL(utf8_casefold_hash);
+
 int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
 		   unsigned char *dest, size_t dlen)
 {
diff --git a/include/linux/unicode.h b/include/linux/unicode.h
index 990aa97d8049..74484d44c755 100644
--- a/include/linux/unicode.h
+++ b/include/linux/unicode.h
@@ -27,6 +27,9 @@ int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
 int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
 		  unsigned char *dest, size_t dlen);
 
+int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
+		       struct qstr *str);
+
 struct unicode_map *utf8_load(const char *version);
 void utf8_unload(struct unicode_map *um);
 
-- 
2.27.0.383.g050319c2ae-goog
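
For readers without a kernel tree at hand, the allocation-free idea in
utf8_casefold_hash() can be tried out in userspace. What follows is a
minimal sketch under loud assumptions, not the kernel code: the
fold_cursor type and fold_next_byte() are hypothetical stand-ins for
utf8cursor/utf8byte() and fold ASCII only (no Unicode normalization),
the salt is a plain integer rather than a pointer, and FNV-1a stands in
for init_name_hash()/partial_name_hash()/end_name_hash(). The only point
is the shape of the loop: each folded byte goes straight into the
running hash, so no casefolded copy of the name is ever built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's utf8cursor (ASCII only). */
struct fold_cursor {
	const unsigned char *s;
	size_t len;
};

static void fold_cursor_init(struct fold_cursor *cur, const char *s, size_t len)
{
	cur->s = (const unsigned char *)s;
	cur->len = len;
}

/*
 * Returns the next folded byte, 0 at end of input (or at a NUL byte),
 * and -1 for any byte >= 0x80, which this sketch does not handle.
 */
static int fold_next_byte(struct fold_cursor *cur)
{
	unsigned char c;

	if (cur->len == 0 || *cur->s == 0)
		return 0;
	c = *cur->s++;
	cur->len--;
	if (c >= 0x80)
		return -1;
	if (c >= 'A' && c <= 'Z')
		c = (unsigned char)(c - 'A' + 'a');
	return c;
}

/*
 * Hashes the folded form of name[0..len) without allocating.
 * FNV-1a is an assumption made for this sketch; it is not the
 * kernel's name hash. Returns 0 on success, -1 on unsupported input.
 */
static int casefold_hash_sketch(uint32_t salt, const char *name, size_t len,
				uint32_t *out)
{
	struct fold_cursor cur;
	uint32_t hash = 2166136261u ^ salt;	/* FNV offset basis, salted */
	int c;

	assert(out != NULL);
	fold_cursor_init(&cur, name, len);
	while ((c = fold_next_byte(&cur))) {
		if (c < 0)
			return -1;
		hash = (hash ^ (uint32_t)c) * 16777619u;	/* FNV prime */
	}
	*out = hash;
	return 0;
}
```

With this sketch, "FooBar" and "foobar" hash to the same value for a
given salt, and an input containing non-ASCII bytes is rejected rather
than hashed, loosely mirroring the -EINVAL paths in the patch.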