Date: Tue,
 27 Jun 2023 21:52:06 -0700
From: Eric Biggers
To: Pedro Falcato
Cc: linux-ext4@vger.kernel.org, "Darrick J. Wong"
Subject: Re: Question regarding the use of CRC32c for checksumming
Message-ID: <20230628045206.GA1908@sol.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: linux-ext4@vger.kernel.org

Hi Pedro,

On Mon, Jun 26, 2023 at 09:17:10PM +0100, Pedro Falcato wrote:
> Hi,
>
> (+CC the original author, Darrick)
> I've been investigating (in the context of my EFI ext4 driver) why all
> ext4 checksums appear inverted. After making sure my CRC32c
> implementation was correct and up-to-par with other ones, I looked at
> the fs/ext4 checksumming code, which took me to the implementation of
> ext4_chksum in ext4.h (excuse the gmail whitespace damage):
>
> >static inline u32 ext4_chksum(struct ext4_sb_info *sbi, u32 crc,
> >                              const void *address, unsigned int length)
> >{
> >        struct {
> >                struct shash_desc shash;
> >                char ctx[4];
> >        } desc;
>
> Open coding the crc32c crypto driver's internal state, seemingly to save a call?
> >
> >        BUG_ON(crypto_shash_descsize(sbi->s_chksum_driver)!=sizeof(desc.ctx));
> >
> >        desc.shash.tfm = sbi->s_chksum_driver;
> >        *(u32 *)desc.ctx = crc;
>
> ...we set the starting CRC
> >
> >        BUG_ON(crypto_shash_update(&desc.shash, address, length));
>
> then call update, which keeps the current internal state in ctx[4]
> >
> >        return *(u32 *)desc.ctx;
>
> and then we never call ->final() (nor ->finup()), which for crc32c would do:
>
>         put_unaligned_le32(~ctx->crc, out);
>
> and as such get me the properly "inverted" crc32c I would expect.
> FreeBSD never found this issue as their calculate_crc32c seems borked
> too, and never inverts the result.
>
> Is my assessment correct? Was ->final() never called on purpose, or is
> it an accident? Or is this merely a CRC32c variation I'm unaware of?
>
> I'd like to make sure I get all the context on this, before sending
> any kind of documentation patch :)
>
> Thanks,
> Pedro

As far as I can tell, you are correct that ext4's CRC32C is just a raw CRC.  It
doesn't do the bitwise inversion at either the beginning or end.

IMO, this is a mistake.  In the design of CRCs, doing these inversions is
recommended to strengthen the CRC slightly.  However, it's also a common
"mistake" to leave them out, and not too important, especially if many of the
messages checksummed are fixed-length structures.

Yes, if ext4 had used the kernel crypto API "properly", with
crypto_shash_init() + crypto_shash_update() + crypto_shash_final(), it would
have gotten the inversion at the beginning and end.  (Note, this is true for
"crc32c" but not "crc32".  The crypto API isn't consistent about its CRC
conventions.)

But I'd also think of ext4's direct use of crypto_shash_update() as less of ext4
taking a shortcut or hack, and more of ext4 just having to work around the
kernel crypto API being very clunky and inefficient for use cases like this...

- Eric
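
For anyone comparing implementations, here is a minimal userspace sketch of
the difference discussed in this thread.  It is not the kernel's code: the
names crc32c_raw_update() and crc32c_standard() are made up for illustration,
and the bit-at-a-time loop is an assumption chosen for brevity rather than
speed.  The raw function only folds data into whatever seed the caller
passes, which is the behaviour described for the update step; the standard
function adds the inversion at the beginning and at the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the Castagnoli polynomial used by CRC32C. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Hypothetical helper: raw CRC32C update.  It folds the data into the
 * caller-supplied seed and performs no inversion on entry or on exit.
 */
uint32_t crc32c_raw_update(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++) {
			/* mask is all ones when the low bit is set, else zero */
			uint32_t mask = 0u - (crc & 1u);

			crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
		}
	}
	return crc;
}

/*
 * Hypothetical helper: conventional CRC32C, i.e. the raw update wrapped
 * in an initial and a final bitwise inversion.
 */
uint32_t crc32c_standard(const void *data, size_t len)
{
	return ~crc32c_raw_update(~(uint32_t)0, data, len);
}
```

With these definitions, crc32c_standard("123456789", 9) gives 0xE3069283,
the usual CRC32C check value.  The raw update with a seed of 0 cannot see
leading zero bytes at all: crc32c_raw_update(0, "\0\0abc", 5) equals
crc32c_raw_update(0, "abc", 3), which is one reason the initial inversion is
recommended.  A driver that has to match a raw on-disk checksum would call
the raw function with the same seed the writer used and skip the final
inversion.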