From: linux@treblig.org
To: smfrench@gmail.com, dave.kleikamp@oracle.com, tom@talpey.com,
    pc@manguebit.com
Cc: linkinjeon@kernel.org, linux-cifs@vger.kernel.org,
    jfs-discussion@lists.sourceforge.net, linux-kernel@vger.kernel.org,
    krisman@collabora.com, "Dr. David Alan Gilbert"
Subject: [PATCH v4 0/4] dedupe smb unicode files
Date: Sun, 13 Aug 2023 01:53:40 +0100
Message-ID: <20230813005344.112955-1-linux@treblig.org>

From: "Dr. David Alan Gilbert"

The smb client and server code have (mostly) duplicated code for
unicode manipulation, in particular upper case handling.

Flatten this lot into shared code.

There's some code that's slightly different between the two, and
I've not attempted to share that - this should be strictly a no
behaviour change set.

In addition, the same tables and code are shared in jfs, however
there's very little testing available for the unicode in there, so
just share the raw data tables.

I suspect there's more UCS-2 code that can be shared, in the NLS code
and in the UCS-2 code used by the EFI interfaces.

Lightly tested with a module and a monolithic build, and just mounting
itself.

This dupe was found using PMD:
  https://pmd.github.io/pmd/pmd_userdocs_cpd.html

Dave

Version 4
  Put SPDX tag back to the way the tools like it, thanks to
  Paulo Alcantara for explaining

Version 3
  History comments simplification (Feedback from Tom Talpey)

Dr. David Alan Gilbert (4):
  fs/smb: Remove unicode 'lower' tables
  fs/smb: Swing unicode common code from smb->NLS
  fs/smb/client: Use common code in client
  fs/jfs: Use common ucs2 upper case table

 fs/jfs/Kconfig                                |   1 +
 fs/jfs/Makefile                               |   2 +-
 fs/jfs/jfs_unicode.h                          |  17 +-
 fs/jfs/jfs_uniupr.c                           | 121 -------
 fs/nls/Kconfig                                |   8 +
 fs/nls/Makefile                               |   1 +
 fs/nls/nls_ucs2_data.h                        |  15 +
 .../server/uniupr.h => nls/nls_ucs2_utils.c}  | 156 +--------
 fs/nls/nls_ucs2_utils.h                       | 285 +++++++++++++++
 fs/smb/client/Kconfig                         |   1 +
 fs/smb/client/cifs_unicode.c                  |   1 -
 fs/smb/client/cifs_unicode.h                  | 330 +-----------------
 fs/smb/client/cifs_uniupr.h                   | 239 -------------
 fs/smb/server/Kconfig                         |   1 +
 fs/smb/server/unicode.c                       |   1 -
 fs/smb/server/unicode.h                       | 325 +----------------
 16 files changed, 340 insertions(+), 1164 deletions(-)
 delete mode 100644 fs/jfs/jfs_uniupr.c
 create mode 100644 fs/nls/nls_ucs2_data.h
 rename fs/{smb/server/uniupr.h => nls/nls_ucs2_utils.c} (50%)
 create mode 100644 fs/nls/nls_ucs2_utils.h
 delete mode 100644 fs/smb/client/cifs_uniupr.h

-- 
2.41.0