Date: Tue, 23 Jun 2020 21:33:38 -0700
In-Reply-To: <20200624043341.33364-1-drosen@google.com>
Message-Id: <20200624043341.33364-2-drosen@google.com>
References: <20200624043341.33364-1-drosen@google.com>
X-Mailer: git-send-email 2.27.0.111.gc72c7da667-goog
Subject: [PATCH v9 1/4] unicode: Add utf8_casefold_hash
From: Daniel Rosenberg
To: "Theodore Ts'o", linux-ext4@vger.kernel.org, Jaegeuk Kim, Chao Yu,
    linux-f2fs-devel@lists.sourceforge.net, Eric Biggers,
    linux-fscrypt@vger.kernel.org, Alexander Viro, Richard Weinberger
Cc: linux-mtd@lists.infradead.org, Andreas Dilger, Jonathan Corbet,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Gabriel Krisman Bertazi,
    kernel-team@android.com, Daniel Rosenberg
X-Mailing-List: linux-ext4@vger.kernel.org

This adds a case-insensitive hash function, utf8_casefold_hash(), so that
the hash of a name can be taken without needing to allocate a casefolded
copy of the string.

Signed-off-by: Daniel Rosenberg
---
 fs/unicode/utf8-core.c  | 23 ++++++++++++++++++++++-
 include/linux/unicode.h |  3 +++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
index 2a878b739115d..90656b9980720 100644
--- a/fs/unicode/utf8-core.c
+++ b/fs/unicode/utf8-core.c
@@ -6,6 +6,7 @@
 #include
 #include
 #include
+#include
 
 #include "utf8n.h"
 
@@ -122,9 +123,29 @@ int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
 	}
 	return -EINVAL;
 }
-
 EXPORT_SYMBOL(utf8_casefold);
 
+int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
+		       struct qstr *str)
+{
+	const struct utf8data *data = utf8nfdicf(um->version);
+	struct utf8cursor cur;
+	int c;
+	unsigned long hash = init_name_hash(salt);
+
+	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
+		return -EINVAL;
+
+	while ((c = utf8byte(&cur))) {
+		if (c < 0)
+			return c;
+		hash = partial_name_hash((unsigned char)c, hash);
+	}
+	str->hash = end_name_hash(hash);
+	return 0;
+}
+EXPORT_SYMBOL(utf8_casefold_hash);
+
 int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
 		   unsigned char *dest, size_t dlen)
 {
diff --git a/include/linux/unicode.h b/include/linux/unicode.h
index 990aa97d80496..74484d44c7554 100644
--- a/include/linux/unicode.h
+++ b/include/linux/unicode.h
@@ -27,6 +27,9 @@ int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
 int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
 		  unsigned char *dest, size_t dlen);
 
+int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
+		       struct qstr *str);
+
 struct unicode_map *utf8_load(const char *version);
 void utf8_unload(struct unicode_map *um);
 
-- 
2.27.0.111.gc72c7da667-goog
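
For readers who want to experiment with the idea outside the kernel, the
following is a rough userspace sketch of the same pattern: fold each byte
as it comes off a cursor and feed it straight into the running hash, so no
folded copy of the name is ever allocated. Everything named sketch_* is
hypothetical and is not part of this patch. The cursor folds ASCII only
(it stands in for utf8ncursor()/utf8byte() and rejects non-ASCII bytes),
and the mixing arithmetic is a simplified stand-in for init_name_hash(),
partial_name_hash() and end_name_hash(), not necessarily what the kernel
computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel name-hash helpers. */
static unsigned long sketch_init_hash(const void *salt)
{
	return (unsigned long)(uintptr_t)salt;
}

static unsigned long sketch_partial_hash(unsigned char c, unsigned long prev)
{
	/* Illustrative mixing step; unsigned overflow wraps by definition. */
	return (prev + ((unsigned long)c << 4) + (c >> 4)) * 11;
}

static unsigned int sketch_end_hash(unsigned long hash)
{
	return (unsigned int)hash;
}

/* Hypothetical cursor: yields one folded byte per call. */
struct sketch_cursor {
	const unsigned char *p;
	size_t left;
};

/*
 * Returns the next folded byte, 0 at end of input (or at a NUL byte),
 * or -1 for input this ASCII-only sketch cannot fold.
 */
static int sketch_next_byte(struct sketch_cursor *cur)
{
	unsigned char c;

	if (cur->left == 0)
		return 0;
	cur->left--;
	c = *cur->p++;
	if (c >= 0x80)
		return -1;
	if (c >= 'A' && c <= 'Z')
		c += 'a' - 'A';
	return c;
}

/* Hash the folded form of name without building a folded copy. */
int sketch_casefold_hash(const void *salt, const char *name, size_t len,
			 unsigned int *out)
{
	struct sketch_cursor cur = { (const unsigned char *)name, len };
	unsigned long hash = sketch_init_hash(salt);
	int c;

	assert(name != NULL || len == 0);

	while ((c = sketch_next_byte(&cur))) {
		if (c < 0)
			return c;
		hash = sketch_partial_hash((unsigned char)c, hash);
	}
	*out = sketch_end_hash(hash);
	return 0;
}
```

With this sketch, sketch_casefold_hash(NULL, "README.txt", 10, &h) and
sketch_casefold_hash(NULL, "readme.TXT", 10, &h) store the same value in
h, while a name containing a byte the cursor cannot fold returns a
negative error and leaves h untouched, much as the patch returns early
without setting str->hash.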