References: <20230920222231.686275-1-dhowells@redhat.com> <591a70bf016b4317add2d936696abc0f@AcuMS.aculab.com> <1173637.1695384067@warthog.procyon.org.uk>
In-Reply-To: <1173637.1695384067@warthog.procyon.org.uk>
From: Willem de Bruijn
Date: Sat, 23 Sep 2023 08:59:25 +0200
Subject: Re: [PATCH v5 00/11] iov_iter: Convert the iterator macros into inline funcs
To: David Howells
Cc: David Laight, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni, Jens Axboe, Al Viro, Linus Torvalds, Christoph Hellwig,
    Christian Brauner, Matthew Wilcox, Jeff Layton,
    linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
    linux-mm@kvack.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org

On Fri, Sep 22, 2023 at 2:01 PM David Howells wrote:
>
> David Laight wrote:
>
> > >  (8) Move the copy-and-csum code to net/ where it can be in proximity with
> > >      the code that uses it.  This eliminates the code if CONFIG_NET=n and
> > >      allows for the slim possibility of it being inlined.
> > >
> > >  (9) Fold memcpy_and_csum() in to its two users.
> > >
> > > (10) Move csum_and_copy_from_iter_full() out of line and merge in
> > >      csum_and_copy_from_iter() since the former is the only caller of the
> > >      latter.
> >
> > I thought that the real idea behind these was to do the checksum
> > at the same time as the copy to avoid loading the data into the L1
> > data-cache twice - especially for long buffers.
> > I wonder how often there are multiple iov[] that actually make
> > it better than just check summing the linear buffer?
>
> It also reduces the overhead for finding the data to checksum in the case the
> packet gets split since we're doing the checksumming as we copy - but with a
> linear buffer, that's negligible.
>
> > I had a feeling that check summing of udp data was done during
> > copy_to/from_user, but the code can't be the copy-and-csum here
> > for that because it is missing support for odd-length buffers.
>
> Is there a bug there?
>
> > Intel x86 desktop chips can easily checksum at 8 bytes/clock
> > (But probably not with the current code!).
> > (I've got ~12 bytes/clock using adox and adcx but that loop
> > is entirely horrid and it would need run-time patching.
> > Especially since I think some AMD CPUs execute them very slowly.)
> >
> > OTOH 'rep movs[bq]' copy will copy 16 bytes/clock (32 if the
> > destination is 32 byte aligned - it pretty much won't be).
> >
> > So you'd need a csum-and-copy loop that did 16 bytes every
> > three clocks to get the same throughput for long buffers.
> > In principle splitting the 'adc memory' into two instructions
> > is the same number of u-ops - but I'm sure I've tried to do
> > that and failed and the extra memory write can happen in
> > parallel with everything else.
> > So I don't think you'll get 16 bytes in two clocks - but you
> > might get it in three.
> >
> > OTOH for a cpu where memcpy is a code loop summing the data in
> > the copy loop is likely to be a gain.
> >
> > But I suspect doing the checksum and copy at the same time
> > got 'all too complicated' to actually implement fully.
> > With most modern ethernet chips checksumming receive packets
> > does it really get used enough for the additional complexity?
>
> You may be right.  That's more a question for the networking folks than for
> me.  It's entirely possible that the checksumming code is just not used on
> modern systems these days.
>
> Maybe Willem can comment since he's the UDP maintainer?

Perhaps these days it is more relevant to embedded systems than high-end
servers.
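
For readers following the thread, here is a minimal user-space sketch of the idea under discussion: summing the data in the same loop that copies it, so each source byte is read only once. It is not the kernel's csum_and_copy_*() code. The name sketch_csum_and_copy() is hypothetical, the loop is deliberately byte-at-a-time rather than tuned, and it assumes an RFC 1071 style 16-bit ones'-complement sum with a trailing odd byte padded with zero.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only, not the kernel implementation.
 * Copies len bytes from src to dst and returns the folded 16-bit
 * ones'-complement sum of the data, treating it as big-endian 16-bit words.
 * The caller would still complement the result to get the checksum field.
 */
static uint16_t sketch_csum_and_copy(void *dst, const void *src, size_t len)
{
	const uint8_t *s = src;
	uint8_t *d = dst;
	uint64_t sum = 0;	/* wide accumulator, so no mid-loop folding */
	size_t i;

	/* Copy and sum one 16-bit word per iteration. */
	for (i = 0; i + 1 < len; i += 2) {
		d[i] = s[i];
		d[i + 1] = s[i + 1];
		sum += ((uint32_t)s[i] << 8) | s[i + 1];
	}

	/* Odd-length buffer: the last byte is the high half of a padded word. */
	if (i < len) {
		d[i] = s[i];
		sum += (uint32_t)s[i] << 8;
	}

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

For the eight bytes 00 01 f2 03 f4 f5 f6 f7 this returns 0xddf2, matching the worked example in RFC 1071. On David's figures, separate passes would cost roughly 1/16 + 1/8 = 3/16 of a clock per byte, which is presumably where the target of 16 bytes every three clocks for a fused loop comes from.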