From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: Ard Biesheuvel, Herbert Xu, Eric Biggers, Kees Cook, Haren Myneni,
    Nick Terrell, Minchan Kim, Sergey Senozhatsky, Jens Axboe,
    Giovanni Cabiddu, Richard Weinberger, David Ahern, Eric Dumazet,
    Jakub Kicinski, Paolo Abeni, Steffen Klassert,
    linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
    qat-linux@intel.com, linuxppc-dev@lists.ozlabs.org,
    linux-mtd@lists.infradead.org, netdev@vger.kernel.org
Subject: [RFC PATCH 21/21] crypto: scompress - Drop the use of per-cpu scratch buffers
Date: Tue, 18 Jul 2023 14:58:47 +0200
Message-Id: <20230718125847.3869700-22-ardb@kernel.org>
In-Reply-To: <20230718125847.3869700-1-ardb@kernel.org>
References: <20230718125847.3869700-1-ardb@kernel.org>

The scomp to acomp adaptation layer allocates 256k of scratch buffers per
CPU in order to be able to present the input provided by the caller via
scatterlists as linear byte arrays to the underlying synchronous
compression drivers, most of which are thin wrappers around the various
compression algorithm library implementations we have in the kernel.

This sucks. With high core counts and SMT, this easily adds up to
multiple megabytes that are permanently tied up for this purpose, and
given that all acomp users pass either single pages or contiguous
buffers in lowmem, we can optimize for this pattern and just pass the
buffer directly if we can. This removes the need for scratch buffers,
and along with it, the arbitrary 128k upper bound on the input and
output size of the acomp API when the implementation happens to be scomp
based.

So add a scomp_map_sg() helper to try and obtain the virtual addresses
associated with the scatterlists, which is guaranteed to be successful
100% of the time given the existing users, which all fit the
prerequisite pattern. And as a fallback for other cases, use kvmalloc
with GFP_KERNEL to allocate buffers on the fly and free them again right
after. This puts the burden on future callers to either use a contiguous
buffer, or deal with the potentially blocking nature of GFP_KERNEL.

For IPcomp in particular, the only relevant compression algorithm is
'deflate' which is no longer implemented as an scomp, and so this change
will not affect it even if we decide to convert it to take advantage of
the ability to pass discontiguous scatterlists.

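To put a number on "multiple megabytes", here is a minimal userspace
sketch of the arithmetic, not kernel code. It assumes only what the patch
itself shows: SCOMP_SCRATCH_SIZE is 131072 bytes and the old code kept one
src and one dst buffer for every possible CPU. The SCRATCH_SIZE macro and
the scratch_bytes() helper are hypothetical names used for illustration.

```c
#include <stddef.h>

/* Same value as the SCOMP_SCRATCH_SIZE constant removed by this patch. */
#define SCRATCH_SIZE 131072u

/*
 * Bytes permanently tied up by the old scheme: one src and one dst
 * scratch buffer for each possible CPU.
 */
static size_t scratch_bytes(unsigned int possible_cpus)
{
	return (size_t)possible_cpus * 2 * SCRATCH_SIZE;
}
```

Under these assumptions, scratch_bytes(64) comes to 16 MiB and
scratch_bytes(256) to 64 MiB, held for as long as any scomp based acomp
transform exists.
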
Signed-off-by: Ard Biesheuvel
---
 crypto/scompress.c                  | 159 ++++++++++----------
 include/crypto/internal/scompress.h |   2 -
 2 files changed, 76 insertions(+), 85 deletions(-)

diff --git a/crypto/scompress.c b/crypto/scompress.c
index 3155cdce9116e092..1c050aa864bd604d 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -18,24 +18,11 @@
 #include
 #include
 #include
-#include
 #include
 
 #include "compress.h"
 
-struct scomp_scratch {
-	spinlock_t	lock;
-	void		*src;
-	void		*dst;
-};
-
-static DEFINE_PER_CPU(struct scomp_scratch, scomp_scratch) = {
-	.lock = __SPIN_LOCK_UNLOCKED(scomp_scratch.lock),
-};
-
 static const struct crypto_type crypto_scomp_type;
-static int scomp_scratch_users;
-static DEFINE_MUTEX(scomp_lock);
 
 static int __maybe_unused crypto_scomp_report(
 	struct sk_buff *skb, struct crypto_alg *alg)
@@ -58,56 +45,45 @@ static void crypto_scomp_show(struct seq_file *m, struct crypto_alg *alg)
 	seq_puts(m, "type : scomp\n");
 }
 
-static void crypto_scomp_free_scratches(void)
-{
-	struct scomp_scratch *scratch;
-	int i;
-
-	for_each_possible_cpu(i) {
-		scratch = per_cpu_ptr(&scomp_scratch, i);
-
-		vfree(scratch->src);
-		vfree(scratch->dst);
-		scratch->src = NULL;
-		scratch->dst = NULL;
-	}
-}
-
-static int crypto_scomp_alloc_scratches(void)
-{
-	struct scomp_scratch *scratch;
-	int i;
-
-	for_each_possible_cpu(i) {
-		void *mem;
-
-		scratch = per_cpu_ptr(&scomp_scratch, i);
-
-		mem = vmalloc_node(SCOMP_SCRATCH_SIZE, cpu_to_node(i));
-		if (!mem)
-			goto error;
-		scratch->src = mem;
-		mem = vmalloc_node(SCOMP_SCRATCH_SIZE, cpu_to_node(i));
-		if (!mem)
-			goto error;
-		scratch->dst = mem;
-	}
-	return 0;
-error:
-	crypto_scomp_free_scratches();
-	return -ENOMEM;
-}
-
 static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)
 {
-	int ret = 0;
+	return 0;
+}
 
-	mutex_lock(&scomp_lock);
-	if (!scomp_scratch_users++)
-		ret = crypto_scomp_alloc_scratches();
-	mutex_unlock(&scomp_lock);
+/**
+ * scomp_map_sg - Return virtual address of memory described by a scatterlist
+ *
+ * @sg:		The address of the scatterlist in memory
+ * @len:	The length of the buffer described by the scatterlist
+ *
+ * If the memory region described by scatterlist @sg consists of @len
+ * contiguous bytes in memory and is accessible via the linear mapping or via a
+ * single kmap(), return its virtual address. Otherwise, return NULL.
+ */
+static void *scomp_map_sg(struct scatterlist *sg, unsigned int len)
+{
+	struct page *page;
+	unsigned int offset;
 
-	return ret;
+	while (sg_is_chain(sg))
+		sg = sg_next(sg);
+
+	if (!sg || sg_nents_for_len(sg, len) != 1)
+		return NULL;
+
+	page = sg_page(sg) + (sg->offset >> PAGE_SHIFT);
+	offset = offset_in_page(sg->offset);
+
+	if (PageHighMem(page) && (offset + sg->length) > PAGE_SIZE)
+		return NULL;
+
+	return kmap_local_page(page) + offset;
+}
+
+static void scomp_unmap_sg(const void *addr)
+{
+	if (is_kmap_addr(addr))
+		kunmap_local(addr);
 }
 
 static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
@@ -116,30 +92,52 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 	void **tfm_ctx = acomp_tfm_ctx(tfm);
 	struct crypto_scomp *scomp = *tfm_ctx;
 	void **ctx = acomp_request_ctx(req);
-	struct scomp_scratch *scratch;
+	void *src_alloc = NULL;
+	void *dst_alloc = NULL;
+	const u8 *src;
+	u8 *dst;
 	int ret;
 
-	if (!req->src || !req->slen || req->slen > SCOMP_SCRATCH_SIZE)
+	if (!req->src || !req->slen || !req->dst || !req->dlen)
 		return -EINVAL;
 
-	if (!req->dst || !req->dlen || req->dlen > SCOMP_SCRATCH_SIZE)
-		return -EINVAL;
-
-	scratch = raw_cpu_ptr(&scomp_scratch);
-	spin_lock(&scratch->lock);
-
-	scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0);
-	if (dir)
-		ret = crypto_scomp_compress(scomp, scratch->src, req->slen,
-					    scratch->dst, &req->dlen, *ctx);
-	else
-		ret = crypto_scomp_decompress(scomp, scratch->src, req->slen,
-					      scratch->dst, &req->dlen, *ctx);
-	if (!ret) {
-		scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
-					 1);
+	dst = scomp_map_sg(req->dst, req->dlen);
+	if (!dst) {
+		dst = dst_alloc = kvmalloc(req->dlen, GFP_KERNEL);
+		if (!dst_alloc)
+			return -ENOMEM;
 	}
-	spin_unlock(&scratch->lock);
+
+	src = scomp_map_sg(req->src, req->slen);
+	if (!src) {
+		src = src_alloc = kvmalloc(req->slen, GFP_KERNEL);
+		if (!src_alloc) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		scatterwalk_map_and_copy(src_alloc, req->src, 0, req->slen, 0);
+	}
+
+	if (dir)
+		ret = crypto_scomp_compress(scomp, src, req->slen, dst,
+					    &req->dlen, *ctx);
+	else
+		ret = crypto_scomp_decompress(scomp, src, req->slen, dst,
+					      &req->dlen, *ctx);
+
+	if (src_alloc)
+		kvfree(src_alloc);
+	else
+		scomp_unmap_sg(src);
+
+	if (!ret && dst == dst_alloc)
+		scatterwalk_map_and_copy(dst, req->dst, 0, req->dlen, 1);
+out:
+	if (dst_alloc)
+		kvfree(dst_alloc);
+	else
+		scomp_unmap_sg(dst);
+
 	return ret;
 }
 
@@ -158,11 +156,6 @@ static void crypto_exit_scomp_ops_async(struct crypto_tfm *tfm)
 	struct crypto_scomp **ctx = crypto_tfm_ctx(tfm);
 
 	crypto_free_scomp(*ctx);
-
-	mutex_lock(&scomp_lock);
-	if (!--scomp_scratch_users)
-		crypto_scomp_free_scratches();
-	mutex_unlock(&scomp_lock);
 }
 
 int crypto_init_scomp_ops_async(struct crypto_tfm *tfm)
diff --git a/include/crypto/internal/scompress.h b/include/crypto/internal/scompress.h
index 858fe3965ae347ef..69e593d72cbdaa99 100644
--- a/include/crypto/internal/scompress.h
+++ b/include/crypto/internal/scompress.h
@@ -12,8 +12,6 @@
 #include
 #include
 
-#define SCOMP_SCRATCH_SIZE	131072
-
 struct acomp_req;
 
 struct crypto_scomp {
-- 
2.39.2
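
The control flow of the new scomp_acomp_comp_decomp() (use the caller's
buffer directly when it is contiguous, otherwise allocate a temporary
linear copy and free it right after) can be illustrated in plain
userspace C. This is only a sketch under loud assumptions: struct seg,
map_segs(), gather() and sum_input() are hypothetical names, the segment
array merely stands in for a scatterlist, malloc() stands in for
kvmalloc(), summing bytes stands in for the compression call, and highmem
mapping via kmap_local_page() is not modelled at all.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a scatterlist: an array of memory segments. */
struct seg {
	unsigned char *base;
	size_t len;
};

/* Fast path: a direct pointer if the first segment alone covers len bytes. */
static unsigned char *map_segs(const struct seg *segs, size_t nsegs, size_t len)
{
	if (nsegs == 0 || segs[0].len < len)
		return NULL;
	return segs[0].base;
}

/* Slow path helper: copy len bytes from the segments into a linear buffer. */
static void gather(unsigned char *dst, const struct seg *segs, size_t nsegs,
		   size_t len)
{
	for (size_t i = 0; i < nsegs && len > 0; i++) {
		size_t n = segs[i].len < len ? segs[i].len : len;

		memcpy(dst, segs[i].base, n);
		dst += n;
		len -= n;
	}
}

/*
 * Sum the first len input bytes (placeholder for the compression call).
 * Returns 0 on success, -1 on empty or short input or allocation failure.
 * *used_bounce reports whether the fallback copy was needed.
 */
static int sum_input(const struct seg *segs, size_t nsegs, size_t len,
		     unsigned long *sum, int *used_bounce)
{
	const unsigned char *src;
	unsigned char *bounce = NULL;
	size_t total = 0;

	if (len == 0)
		return -1;

	src = map_segs(segs, nsegs, len);
	if (!src) {
		/* Not contiguous: fall back to a temporary linear copy. */
		for (size_t i = 0; i < nsegs; i++)
			total += segs[i].len;
		if (total < len)
			return -1;
		bounce = malloc(len);
		if (!bounce)
			return -1;
		gather(bounce, segs, nsegs, len);
		src = bounce;
	}

	*used_bounce = bounce != NULL;
	*sum = 0;
	for (size_t i = 0; i < len; i++)
		*sum += src[i];

	free(bounce);	/* free(NULL) is a no-op, so no branch is needed */
	return 0;
}
```

With a single 4-byte segment the sketch takes the direct path; with two
4-byte segments and len of 8 it takes the copy path. The kernel code
differs in the details that matter there: it accepts only a scatterlist
that resolves to exactly one entry, and it must pair every successful
kmap_local_page() with kunmap_local() in reverse order.
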