Date: Fri, 15 Oct 2021 15:10:41 -0400
From: "Theodore Ts'o"
To: Avi Deitcher
Cc: linux-ext4@vger.kernel.org
Subject: Re: algorithm for half-md4 used in htree directories
References: <3A493D20-568A-4D63-A575-5DEEBFAAF41A@dilger.ca>
X-Mailing-List: linux-ext4@vger.kernel.org

On Fri, Oct 15, 2021 at 11:43:07AM -0700, Avi Deitcher wrote:
> I am absolutely stumped. I tried the seed as four u32 as is on disk
> (i.e. big-endian); four u32 little-endian; one long little-endian
> array of bytes (I have no idea why that would make sense, but worth
> trying); zeroed out so it gets the default.
> No one gives a consistent
> solution.
>
> As far as I can tell: hash tells you which intermediate block to look
> in, minor hash tells you which leaf block to look in, and then you
> scan. So it is pretty easy to see in what range the minor and major
> hash should be, but no luck.
>
> I put up a gist with debugfs and source and output.
> https://gist.github.com/deitch/53b01a90635449e7674babfe7e7dd002
>
> Anyone who feels like a look-see, I would much appreciate it (and if
> they figure it out, owe a beer if ever in the same city).

I'm really curious *why* you are trying to reverse engineer the
implementation. What are you trying to do?

In any case, you're mostly right about what hash and minor_hash are
for. The 32-bit hash value is the only thing that we use in the hashed
B+ tree which is used for hash directories. The 32-bit minor hash is
used to form a 64-bit number that gets used when we need to support
things like NFSv3 directory cursors, and POSIX telldir/seekdir (which
is a massive headache for modern file systems, since it assumes that a
64-bit "offset" is all you get to reliably provide the POSIX
telldir/seekdir/readdir guarantees even when the directory is getting
a large number of directory entries added and deleted without limit
between the telldir and the seekdir call).

As far as what you are doing wrong, I'll point out that *this* program
(attached below) provides the correct result. Running this through a
debugger and comparing it with your implementation is left as an
exercise for the reader --- but why do you want to try to make your
own implementation, when you could just pull down e2fsprogs, compile
it, and then *use* the provided ext2_fs library for whatever the heck
you are trying to do?
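- Ted

/* The header names were stripped from this copy of the message; these
 * six are a likely reconstruction of what the program needs. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <uuid/uuid.h>
#include <ext2fs/ext2_fs.h>
#include <ext2fs/ext2fs.h>

int main(int argc, char **argv)
{
	uuid_t buf;
	unsigned int *p;
	int i;
	ext2_dirhash_t hash, minor_hash;
	errcode_t retval;

	uuid_parse("d64563bc-ea93-4aaf-a943-4657711ed153", buf);
	p = (unsigned int *) buf;
	for (i=0; i < 4; i++) {
		printf("buf[%d] = 0x%08x\n", i, p[i]);
	}
	/* Hash version 1 is half-MD4; p is the 16-byte seed viewed as
	 * four host-order words. */
	retval = ext2fs_dirhash(1, "dir478", strlen("dir478"), p,
				&hash, &minor_hash);
	/* As posted, this passes i (4 after the loop) rather than retval,
	 * which is why the output below reads retval=4. */
	printf("dirhash results: retval=%u, hash=0x%08x, minor_hash=0x%08x\n",
	       i, hash, minor_hash);
	exit(0);
}

% gcc -g -o /tmp/foo /tmp/foo.c -luuid -lext2fs -lcom_err
% /tmp/foo
buf[0] = 0xbc6345d6
buf[1] = 0xaf4a93ea
buf[2] = 0x574643a9
buf[3] = 0x53d11e71
dirhash results: retval=4, hash=0x012225e2, minor_hash=0x3f08755d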