Date: Mon, 18 Apr 2022 17:44:37 +0100
From: Catalin Marinas
To: Herbert Xu
Cc: Ard Biesheuvel, Will Deacon, Marc Zyngier, Arnd Bergmann,
    Greg Kroah-Hartman, Andrew Morton, Linus Torvalds,
    Linux Memory Management List, Linux ARM,
    Linux Kernel Mailing List, "David S. Miller"
Subject: Re: [PATCH 07/10] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN

On Mon, Apr 18, 2022 at 04:37:17PM +0800, Herbert Xu wrote:
> On Sun, Apr 17, 2022 at 05:30:27PM +0100, Catalin Marinas wrote:
> > Do you mean as per Ard's proposal here:
> >
> > https://lore.kernel.org/r/CAMj1kXH0x5Va7Wgs+mU1ONDwwsazOBuN4z4ihVzO2uG-n41Kbg@mail.gmail.com
> >
> >     struct crypto_request {
> >         union {
> >             struct {
> >                 ... fields ...
> >             };
> >             u8 __padding[ARCH_DMA_MINALIGN];
> >         };
> >         void __ctx[] __aligned(CRYPTO_MINALIGN);
> >     };
> >
> > If CRYPTO_MINALIGN is lowered to, say, 8 (to be the same as lowest
> > ARCH_KMALLOC_MINALIGN), the __alignof__(req->__ctx) would be 8.
> > Functions like crypto_tfm_ctx_alignment() will return 8 when what you
> > need is 128. We can change those functions to return ARCH_DMA_MINALIGN
> > instead or always bump cra_alignmask to ARCH_DMA_MINALIGN-1.
>
> OK, at this point I think we need to let the code do the talking :)
>
> I've seen Ard's patches already and I think I understand what your
> needs are. So let me whip up some code to show you guys what I
> think needs to be done.

BTW before you have a go at this, there's also Linus' idea that does not
change the crypto code (at least not functionally). Of course, you and
Ard can still try to figure out how to reduce the padding but if we go
with Linus' idea of a new GFP_NODMA flag, there won't be any changes to
the crypto code as long as it doesn't pass such a flag. So, the options:

1. Change ARCH_KMALLOC_MINALIGN to 8 (or ARCH_SLAB_MINALIGN if higher)
   while keeping ARCH_DMA_MINALIGN at 128. By default kmalloc() will
   honour the 128-byte alignment, unless GFP_NODMA is passed. This still
   requires changing CRYPTO_MINALIGN to ARCH_DMA_MINALIGN but there is
   no functional change: kmalloc() without the new flag will return
   CRYPTO_MINALIGN-aligned pointers.

2. Leave ARCH_KMALLOC_MINALIGN as ARCH_DMA_MINALIGN (128) and introduce
   a new GFP_PACKED (I think it fits better than 'NODMA') flag that
   reduces the minimum kmalloc() alignment below ARCH_KMALLOC_MINALIGN
   (and probably to at least ARCH_SLAB_MINALIGN). It's equivalent to (1)
   but does not touch the crypto code at all.

(1) and (2) are the same, with just a minor naming difference. Happy to
go with either of them. They still have the downside that we need to add
the new GFP_ flag to those hotspots that allocate small objects (Arnd
provided an idea on how to find them with ftrace) but at least we know
it won't inadvertently break anything.

-- 
Catalin
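
To make the semantics of option (2) concrete, here is a rough userspace
sketch, not kernel code. Every SKETCH_/sketch_ name is made up for
illustration, the flag value is arbitrary, and aligned_alloc() only
stands in for the slab allocator. The 128 and 8 values are the ones
discussed above, with ARCH_SLAB_MINALIGN assumed to be 8. The point is
only that callers which do not pass the flag keep the DMA-safe
alignment, while flagged callers get a smaller minimum:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel constants discussed in the thread. */
#define SKETCH_DMA_MINALIGN  128u  /* plays the role of ARCH_DMA_MINALIGN */
#define SKETCH_SLAB_MINALIGN   8u  /* plays the role of ARCH_SLAB_MINALIGN (assumed 8) */
#define SKETCH_GFP_PACKED    0x1u  /* made-up flag bit, not a real GFP value */

/* aligned_alloc() needs power-of-two alignments; check at compile time. */
static_assert((SKETCH_DMA_MINALIGN & (SKETCH_DMA_MINALIGN - 1)) == 0,
	      "DMA alignment must be a power of two");
static_assert((SKETCH_SLAB_MINALIGN & (SKETCH_SLAB_MINALIGN - 1)) == 0,
	      "slab alignment must be a power of two");

/* Minimum alignment promised for a given set of flags. */
static size_t sketch_min_align(unsigned int flags)
{
	return (flags & SKETCH_GFP_PACKED) ? SKETCH_SLAB_MINALIGN
					   : SKETCH_DMA_MINALIGN;
}

/* Bytes actually consumed once the request is rounded up to the alignment. */
static size_t sketch_alloc_size(size_t size, unsigned int flags)
{
	size_t align = sketch_min_align(flags);

	return (size + align - 1) / align * align;
}

/* Userspace stand-in for kmalloc(); the caller releases with free(). */
static void *sketch_kmalloc(size_t size, unsigned int flags)
{
	size_t align = sketch_min_align(flags);
	size_t rounded = sketch_alloc_size(size, flags);

	/* Never request zero bytes from aligned_alloc(). */
	return aligned_alloc(align, rounded ? rounded : align);
}
```

In this model a 16-byte request without the flag is rounded up to 128
bytes, while the same request with the hypothetical flag stays at 16
bytes. That is where the saving for small-object hotspots would come
from, and unflagged callers such as the crypto code see no change.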