On Wed, Feb 17, 2021 at 2:48 PM Andreas Dilger <[email protected]> wrote:
>
> On Feb 17, 2021, at 9:08 AM, Theodore Ts'o <[email protected]> wrote:
> >
> > On Tue, Feb 16, 2021 at 08:01:11PM -0800, Daniel Rosenberg wrote:
> >> I'm not sure what the conflict is, at least format-wise. Naturally,
> >> there would need to be some work to reconcile the two patches, but my
> >> patch only alters the format for directories which are encrypted and
> >> casefolded, which always must have the additional hash field. In the
> >> case of dirdata along with encryption and casefolding, couldn't we
> >> have the dirdata simply follow after the existing data? Since we
> >> always already know the length, it'd be unambiguous where that would
> >> start. Casefolding can only be altered on an empty directory, and you
> >> can only enable encryption for an empty directory, so I'm not too
> >> concerned there. I feel like having it swapping between the different
> >> methods makes it more prone to bugs, although it would be doable. I've
> >> started rebasing the dirdata patch on my end to see how easy it is to
> >> mix the two. At a glance, they touch a lot of the same areas in
> >> similar ways, so it shouldn't be too hard. It's more of a question of
> >> which way we want to resolve that, and which patch goes first.
> >>
> >> I've been trying to figure out how many devices in the field are using
> >> casefolded encryption, but haven't found out yet. The code is
> >> definitely available though, so I would not be surprised if it's being
> >> used, or is about to be.
> >
> > The problem is in how the space after the filename in a directory is
> > encoded. The dirdata format is (mildly) expandable, supporting up to
> > 4 different metadata chunks after the filename, using a very
> > compatctly encoded TLV (or moral equivalent) scheme. For directory
> > inodes that have both the encyption and compression flags set, we have
> > a single blob which gets used as the IV for the crypto.
> >
> > So it's the difference between a simple blob that is only used for one
> > thing in this particular case, and something which is the moral
> > equivalent of simple ASN.1 or protobuf encoding.
> >
> > Currently, datadata has defined uses for 2 of the 4 "chunks", which is
> > used in Lustre servers. The proposal which Andreas has suggested is
> > if the dirdata feature is supported, then the 3rd dirdata chunk would
> > be used for the case where we currently used by the
> > encrypted-casefolded extension, and the 4th would get reserved for a
> > to-be-defined extension mechanism.
> >
> > If there ext4 encrypted/casefold is not yet in use, and we can get the
> > changes out to all potential users before they release products out
> > into the field, then one approach would be to only support
> > encrypted/casefold when dirdata is also enabled.
> >
> > If ext4 encrypted/casefold is in use, my suggestion is that we support
> > both encrypted/casefold && !dirdata as you have currently implemented
> > it, and encrypted/casefold && dirdata as Andreas has proposed.
> >
> > IIRC, supporting that Andreas's scheme essentially means that we use
> > the top four bits in the rec_len field to indicate which chunks are
> > present, and then for each chunk which is present, there is a 1 byte
> > length followed by payload. So that means in the case where it's
> > encrypted/casefold && dirdata, the required storage of the directory
> > entry would take one additional byte, plus setting a bit indicating
> > that the encrypted/casefold dirdata chunk was present.
>
> I think your email already covers pretty much all of the points.
>
> One small difference between current "raw" encrypted/casefold hash vs.
> dirdata is that the former is 4-byte aligned within the dirent, while
> dirdata is packed. So in 3/4 cases dirdata would take the same amount
> of space (the 1-byte length would use one of the 1-3 bytes of padding
> vs. the raw format), since the next dirent needs to be aligned anyway.
>
> The other implication here is that the 8-byte hash may need to be
> copied out of the dirent into a local variable before use, due to
> alignment issues, but I'm not sure if that is actually needed or not.
>
> > So, no, they aren't incompatible ultimatly, but it might require a
> > tiny bit more work to integrate the combined support for dirdata plus
> > encrypted/casefold. One way we can do this, if we have to support the
> > current encrypted/casefold format because it's out there in deployed
> > implementations already, is to integrate encrypted/casefold &&
> > !dirdata first upstream, and then when we integrate dirdata into
> > upstream, we'll have to add support for the encrypted/casefold &&
> > dirdata case. This means that we'll have two variants of the on-disk
> > format to test and support, but I don't think it's the going to be
> > that difficult.
>
> It would be possible to detect if the encrypted/casefold+dirdata
> variant is in use, because the dirdata variant would have the 0x40
> bit set in the file_type byte. It isn't possible to positively
> identify the "raw" non-dirdata variant, but the assumption would be
> if (rec_len >= round_up(name_len, 4) + 8) in an encrypted+casefold
> directory that the "raw" hash must be present in the dirent.
>
> Cheers, Andreas
>
>
>
>
>
So sounds like we're going with the combined version. Andreas, do you
have any suggestions for changes to the casefolding patch to ease the
eventual merging with dirdata? A bunch of the changes are already
pretty similar, so some of it is just calling essentially the same
functions different things.
-Daniel