Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd,
	Amberley Place, 107-111 Peascod Street, Windsor, Berkshire,
	SI4 1TE, United Kingdom.
	Registered in England and Wales under Company Registration No. 3798903
From: David Howells
In-Reply-To: <591a70bf016b4317add2d936696abc0f@AcuMS.aculab.com>
References: <591a70bf016b4317add2d936696abc0f@AcuMS.aculab.com>
	<20230920222231.686275-1-dhowells@redhat.com>
To: David Laight, Willem de Bruijn
Cc: dhowells@redhat.com, "David S. Miller", Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Jens Axboe, Al Viro, Linus Torvalds,
	Christoph Hellwig, Christian Brauner, Matthew Wilcox, Jeff Layton,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-mm@kvack.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 00/11] iov_iter: Convert the iterator macros into inline funcs
Date: Fri, 22 Sep 2023 13:01:07 +0100
Message-ID: <1173637.1695384067@warthog.procyon.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

David Laight wrote:

> >  (8) Move the copy-and-csum code to net/ where it can be in proximity with
> >      the code that uses it.  This eliminates the code if CONFIG_NET=n and
> >      allows for the slim possibility of it being inlined.
> >
> >  (9) Fold memcpy_and_csum() in to its two users.
> >
> > (10) Move csum_and_copy_from_iter_full() out of line and merge in
> >      csum_and_copy_from_iter() since the former is the only caller of the
> >      latter.
>
> I thought that the real idea behind these was to do the checksum
> at the same time as the copy to avoid loading the data into the L1
> data-cache twice - especially for long buffers.
> I wonder how often there are multiple iov[] that actually make
> it better than just check summing the linear buffer?

It also reduces the overhead for finding the data to checksum in the case
the packet gets split since we're doing the checksumming as we copy - but
with a linear buffer, that's negligible.

> I had a feeling that check summing of udp data was done during
> copy_to/from_user, but the code can't be the copy-and-csum here
> for that because it is missing support for odd-length buffers.

Is there a bug there?

> Intel x86 desktop chips can easily checksum at 8 bytes/clock
> (But probably not with the current code!).
> (I've got ~12 bytes/clock using adox and adcx but that loop
> is entirely horrid and it would need run-time patching.
> Especially since I think some AMD CPUs execute them very slowly.)
>
> OTOH 'rep movs[bq]' copy will copy 16 bytes/clock (32 if the
> destination is 32 byte aligned - it pretty much won't be).
>
> So you'd need a csum-and-copy loop that did 16 bytes every
> three clocks to get the same throughput for long buffers.
> In principle splitting the 'adc memory' into two instructions
> is the same number of u-ops - but I'm sure I've tried to do
> that and failed and the extra memory write can happen in
> parallel with everything else.
> So I don't think you'll get 16 bytes in two clocks - but you
> might get it in three.
>
> OTOH for a cpu where memcpy is a code loop, summing the data in
> the copy loop is likely to be a gain.
>
> But I suspect doing the checksum and copy at the same time
> got 'all too complicated' to actually implement fully.
> With most modern ethernet chips checksumming receive packets
> does it really get used enough for the additional complexity?

You may be right.  That's more a question for the networking folks than for
me.  It's entirely possible that the checksumming code is just not used on
modern systems these days.

Maybe Willem can comment since he's the UDP maintainer?

David
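
For illustration only, here is a minimal userspace sketch of the idea being
discussed above: folding the 16-bit ones'-complement (Internet) checksum into
the copy loop so that each source byte is read once.  It is not the kernel's
csum_and_copy code and makes no attempt at the throughput figures quoted
above; the name csum_and_copy_sketch() is hypothetical.  It assumes the data
is summed as big-endian 16-bit words and that an odd trailing byte is padded
with zero, as RFC 1071 describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: copy len bytes from src to dst
 * and return the folded (not inverted) 16-bit ones'-complement sum of
 * the data, read as big-endian 16-bit words.
 */
static uint16_t csum_and_copy_sketch(void *dst, const void *src, size_t len)
{
	const uint8_t *s = src;
	uint8_t *d = dst;
	uint64_t sum = 0;	/* wide accumulator, folded once at the end */
	size_t i;

	assert(len == 0 || (dst && src));

	/* Copy and sum one 16-bit word per iteration. */
	for (i = 0; i + 1 < len; i += 2) {
		d[i] = s[i];
		d[i + 1] = s[i + 1];
		sum += ((uint32_t)s[i] << 8) | s[i + 1];
	}

	/* Odd-length buffer: the last byte is the high half of a padded word. */
	if (i < len) {
		d[i] = s[i];
		sum += (uint32_t)s[i] << 8;
	}

	/* Fold the carries back in until the sum fits in 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

A real implementation would work in wider units and carry the partial sum
across segments of an iterator; the byte-pair loop here is only meant to show
where the odd-length case has to be handled.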