Date: Tue, 5 Dec 2023 02:21:00 +0000
From: Al Viro
To: linux-arch@vger.kernel.org
Cc: gus Gusenleitner Klaus, Al Viro, Thomas Gleixner, lkml, Ingo Molnar,
	"bp@alien8.de", "dave.hansen@linux.intel.com", "x86@kernel.org",
	"David S. Miller", "dsahern@kernel.org", "kuba@kernel.org",
	Paolo Abeni, Eric Dumazet
Subject: [RFC][PATCHES v2] checksum stuff
Message-ID: <20231205022100.GB1674809@ZenIV>
References: <20231018154205.GT800259@ZenIV>
	<20231019050250.GV800259@ZenIV>
	<20231019061427.GW800259@ZenIV>
	<20231019063925.GX800259@ZenIV>
	<20231019080615.GY800259@ZenIV>
	<20231021071525.GA789610@ZenIV>
	<20231021222203.GA800259@ZenIV>
	<20231022194020.GA972254@ZenIV>
In-Reply-To: <20231022194020.GA972254@ZenIV>

We need a way for csum_and_copy_{from,to}_user() to report faults.

The approach taken back in 2020 (avoid 0 as return value by starting
summing from ~0U, use 0 to report faults) had been broken; it does
yield the right value modulo 2^16-1, but the case when data is
entirely zero-filled is not handled right.

It almost works, since for most of the codepaths we have a non-zero
value added in, and there 0 is not different from anything divisible
by 0xffff. However, there are cases (ICMPv4 replies, for example)
where we are not guaranteed that. In other words, we really need to
have those primitives return 0 on filled-with-zeroes input.

So let's make them return a 64bit value instead; we can do that
cheaply (all supported architectures return such a value in a register
or a pair of registers) and we can use that to report faults without
disturbing the 32bit csum.

New type: __wsum_fault. 64bit, returned by csum_and_copy_..._user().
Primitives (sketched below):
	* CSUM_FAULT representing the fault
	* to_wsum_fault() folding a __wsum value into that
	* from_wsum_fault() extracting the __wsum value
	* wsum_is_fault() checking if it's a fault value

Representation depends upon the target:
	CSUM_FAULT:         ~0ULL
	to_wsum_fault(v32): (u64)v32 for 64bit and 32bit l-e,
	                    (u64)v32 << 32 for 32bit b-e

Rationale: the relationship between the calling conventions for
returning 64bit values and those for returning 32bit values. On 64bit
architectures the same register is used; on 32bit l-e the lower half
of the value goes in the same register that is used for returning
32bit values and the upper half goes into an additional register. On
32bit b-e the opposite happens - the upper 32 bits go into the
register used for returning 32bit values and the lower 32 bits get
stuffed into an additional register.

So with this choice of representation we need minimal changes on the
asm side (zero an extra register in the 32bit case, nothing in the
64bit case), and from_wsum_fault() is as cheap as it gets. Sum
calculation is back to "start from 0".

The rest of the series consists of cleaning up assorted
asm/checksum.h.

Branch lives in
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #work.csum

Individual patches in followups. Help with review and testing would
be very welcome.
Al Viro (18):
  make net/checksum.h self-contained
  get rid of asm/checksum.h includes outside of include/net/checksum.h and arch
  make net/checksum.h the sole user of asm/checksum.h
  Fix the csum_and_copy_..._user() idiocy
  bits missing from csum_and_copy_{from,to}_user() unexporting.
  consolidate csum_tcpudp_magic(), take default variant into net/checksum.h
  consolidate default ip_compute_csum()
  alpha: pull asm-generic/checksum.h
  mips: pull include of asm-generic/checksum.h out of #if
  nios2: pull asm-generic/checksum.h
  x86: merge csum_fold() for 32bit and 64bit
  x86: merge ip_fast_csum() for 32bit and 64bit
  x86: merge csum_tcpudp_nofold() for 32bit and 64bit
  amd64: saner handling of odd address in csum_partial()
  x86: optimized csum_add() is the same for 32bit and 64bit
  x86: lift the extern for csum_partial() into checksum.h
  x86_64: move csum_ipv6_magic() from csum-wrappers_64.c to csum-partial_64.c
  uml/x86: use normal x86 checksum.h

 arch/alpha/include/asm/asm-prototypes.h   |   2 +-
 arch/alpha/include/asm/checksum.h         |  68 ++----------
 arch/alpha/lib/csum_partial_copy.c        |  74 ++++++-------
 arch/arm/include/asm/checksum.h           |  27 +----
 arch/arm/kernel/armksyms.c                |   3 +-
 arch/arm/lib/csumpartialcopygeneric.S     |   3 +-
 arch/arm/lib/csumpartialcopyuser.S        |   8 +-
 arch/hexagon/include/asm/checksum.h       |   4 +-
 arch/hexagon/kernel/hexagon_ksyms.c       |   1 -
 arch/hexagon/lib/checksum.c               |   1 +
 arch/m68k/include/asm/checksum.h          |  24 +---
 arch/m68k/lib/checksum.c                  |   8 +-
 arch/microblaze/kernel/microblaze_ksyms.c |   2 +-
 arch/mips/include/asm/asm-prototypes.h    |   2 +-
 arch/mips/include/asm/checksum.h          |  32 ++----
 arch/mips/lib/csum_partial.S              |  12 +-
 arch/nios2/include/asm/checksum.h         |  13 +--
 arch/openrisc/kernel/or32_ksyms.c         |   2 +-
 arch/parisc/include/asm/checksum.h        |  21 ----
 arch/powerpc/include/asm/asm-prototypes.h |   2 +-
 arch/powerpc/include/asm/checksum.h       |  27 +----
 arch/powerpc/lib/checksum_32.S            |   6 +-
 arch/powerpc/lib/checksum_64.S            |   4 +-
 arch/powerpc/lib/checksum_wrappers.c      |  14 +--
 arch/s390/include/asm/checksum.h          |  18 ---
 arch/s390/kernel/ipl.c                    |   2 +-
 arch/s390/kernel/os_info.c                |   2 +-
 arch/sh/include/asm/checksum_32.h         |  32 +-----
 arch/sh/kernel/sh_ksyms_32.c              |   2 +-
 arch/sh/lib/checksum.S                    |   6 +-
 arch/sparc/include/asm/asm-prototypes.h   |   2 +-
 arch/sparc/include/asm/checksum_32.h      |  63 ++++++-----
 arch/sparc/include/asm/checksum_64.h      |  21 +---
 arch/sparc/lib/checksum_32.S              |   2 +-
 arch/sparc/lib/csum_copy.S                |   4 +-
 arch/sparc/lib/csum_copy_from_user.S      |   2 +-
 arch/sparc/lib/csum_copy_to_user.S        |   2 +-
 arch/x86/include/asm/asm-prototypes.h     |   2 +-
 arch/x86/include/asm/checksum.h           | 177 ++++++++++++++++++++++++++++++
 arch/x86/include/asm/checksum_32.h        | 141 ++----------------------
 arch/x86/include/asm/checksum_64.h        | 172 +----------------------------
 arch/x86/lib/checksum_32.S                |  20 +++-
 arch/x86/lib/csum-copy_64.S               |   6 +-
 arch/x86/lib/csum-partial_64.c            |  41 ++++---
 arch/x86/lib/csum-wrappers_64.c           |  43 ++------
 arch/x86/um/asm/checksum.h                | 119 --------------------
 arch/x86/um/asm/checksum_32.h             |  38 -------
 arch/x86/um/asm/checksum_64.h             |  19 ----
 arch/xtensa/include/asm/asm-prototypes.h  |   2 +-
 arch/xtensa/include/asm/checksum.h        |  33 +----
 arch/xtensa/lib/checksum.S                |   6 +-
 drivers/net/ethernet/brocade/bna/bnad.h   |   2 -
 drivers/net/ethernet/lantiq_etop.c        |   2 -
 drivers/net/vmxnet3/vmxnet3_int.h         |   1 -
 drivers/s390/char/zcore.c                 |   2 +-
 include/asm-generic/checksum.h            |  15 +--
 include/net/checksum.h                    |  81 ++++++++++++--
 include/net/ip6_checksum.h                |   1 -
 lib/checksum_kunit.c                      |   2 +-
 net/core/datagram.c                       |   8 +-
 net/core/skbuff.c                         |   8 +-
 net/ipv6/ip6_checksum.c                   |   1 -
 62 files changed, 501 insertions(+), 959 deletions(-)

Part 1: sorting out the includes.

We have asm/checksum.h and net/checksum.h; the latter pulls the
former. A lot of things would become easier if we could move things
from asm/checksum.h to net/checksum.h; for that we need to make
net/checksum.h the only file that pulls asm/checksum.h.

1/18)	make net/checksum.h self-contained
	Right now it has an implicit dependency upon linux/bitops.h
(for the sake of ror32()).

2/18)	get rid of asm/checksum.h includes outside of
include/net/checksum.h and arch
	In almost all cases the include is redundant; zcore.c and
checksum_kunit.c are the sole exceptions, and those got switched to
net/checksum.h.

3/18)	make net/checksum.h the sole user of asm/checksum.h
	All other users (all in arch/* by now) can pull net/checksum.h.

Part 2: fix the fault reporting.

4/18)	Fix the csum_and_copy_..._user() idiocy
	Fix the breakage introduced back in 2020 - see above for
details.

Part 3: trimming related crap.

5/18)	bits missing from csum_and_copy_{from,to}_user() unexporting.

6/18)	consolidate csum_tcpudp_magic(), take default variant into
net/checksum.h

7/18)	consolidate default ip_compute_csum()
	... and take it into include/net/checksum.h.

8/18)	alpha: pull asm-generic/checksum.h

9/18)	mips: pull include of asm-generic/checksum.h out of #if

10/18)	nios2: pull asm-generic/checksum.h

Part 4: trimming x86 crap.

11/18)	x86: merge csum_fold() for 32bit and 64bit
	Identical...

12/18)	x86: merge ip_fast_csum() for 32bit and 64bit
	Identical, except that the 32bit version uses asm volatile
where the 64bit one uses plain asm. The former had become pointless
when the memory clobber got added to both versions...

13/18)	x86: merge csum_tcpudp_nofold() for 32bit and 64bit
	Identical...

14/18)	amd64: saner handling of odd address in csum_partial()
	All we want there is to have the return value congruent to
result * 256 modulo 0xffff; there is no need to convert from 32bit to
16bit (i.e. take it modulo 0xffff) first - a cyclic shift of the 32bit
value by 8 bits (in either direction) will work. Kills the
from32to16() helper and yields better code... (the arithmetic is
illustrated in the sketch at the end of this mail)

15/18)	x86: optimized csum_add() is the same for 32bit and 64bit

16/18)	x86: lift the extern for csum_partial() into checksum.h

17/18)	x86_64: move csum_ipv6_magic() from csum-wrappers_64.c to
csum-partial_64.c
	... and make uml/amd64 use it.

18/18)	uml/x86: use normal x86 checksum.h
	The only difference left is that UML really does *NOT* want the
csum-and-uaccess combinations; leave those in
arch/x86/include/asm/checksum_{32,64}.h, move the rest into
arch/x86/include/asm/checksum.h (under ifdefs) and that's pretty much
it.
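PS: the promised footnote on 14/18, since this kind of modular
arithmetic is easy to get wrong. This is not code from the series -
just a standalone plain-C illustration (rol32() redefined locally,
example value arbitrary) of why a cyclic shift by 8 bits already gives
a value congruent to sum * 256 modulo 0xffff:

#include <stdint.h>
#include <stdio.h>
#include <assert.h>

/* cyclic left shift of a 32bit value; well-defined for 0 < n < 32 */
static uint32_t rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

int main(void)
{
	/*
	 * Rotation by 8 is multiplication by 2^8 modulo 2^32 - 1:
	 * sum * 2^8 == hi * 2^32 + lo with hi = sum >> 24 and
	 * lo = sum << 8, and hi + lo is exactly rol32(sum, 8).
	 * Since 2^32 - 1 == 0xffff * 0x10001, any congruence modulo
	 * 2^32 - 1 also holds modulo 0xffff.  Rotating right works
	 * just as well: ror32(sum, 8) == sum * 2^24 (mod 2^32 - 1)
	 * and 2^24 == 2^16 * 2^8 == 2^8 (mod 0xffff) - hence the
	 * "in either direction" above.  Either way, no folding to
	 * 16 bits (from32to16()) is needed before the shift.
	 */
	uint32_t sum = 0x89abcdef;	/* arbitrary example value */
	uint64_t folded = ((uint64_t)(sum % 0xffff) * 256) % 0xffff;

	assert(rol32(sum, 8) % 0xffff == folded);
	printf("0x%x\n", (unsigned)rol32(sum, 8));	/* 0xabcdef89 */
	return 0;
}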