2021-02-10 08:25:01

by Andreas Dilger

Subject: Re: [PATCH 1/2] ext4: Handle casefolding with encryption

On Feb 9, 2021, at 4:22 PM, Theodore Ts'o wrote:
>
> On Wed, Feb 03, 2021 at 11:31:28AM -0500, Theodore Ts'o wrote:
>> On Wed, Feb 03, 2021 at 03:55:06AM -0700, Andreas Dilger wrote:
>>>
>>> It looks like this change will break the dirdata feature, which is similarly
>>> storing a data field beyond the end of the dirent. However, that feature also
>>> provides for flags stored in the high bits of the type field to indicate
>>> which of the fields are in use there. The first byte of each field stores
>>> the length, so it can be skipped even if the content is not understood.
>>
>> Daniel, for context, the dirdata field is an out-of-tree feature which
>> is used by Lustre, and so has a fairly large deployed base. So if there
>> is a way that we can accommodate not breaking dirdata, that would be
>> good.
>>
>> Did the ext4 casefold+encryption implementation escape out to any
>> Android handsets?
>
> So from an OOB chat with Daniel, it appears that the ext4
> casefold+encryption implementation did in fact escape out to Android
> handsets. So I think what we will need to do, ultimately, is support
> one way of supporting the casefold IV in the case where "encryption &&
> casefold", and another way when "encryption && casefold && dirdata".
>
> That's going to be a bit sucky, but I don't think it should be that
> complex. Daniel, Andreas, does that make sense to you?

I was just going to ping you about this, whether it made sense to remove
this feature addition from the "maint" branch (i.e. make a 1.45.8 without
it), and keep it only in 1.46 or "next" to reduce its spread?

Depending on the size of the "escape", it probably makes sense to move
toward having e2fsck migrate from the current mechanism to using dirdata
for all deployments. In the current implementation, tools don't really
know for sure if there is data beyond the filename in the dirent or not.

I guess it is implicit with the casefold+encryption case for dirents in
directories that have the encryption flag set in a filesystem that also
has casefold enabled, but it's definitely not friendly to these features
being enabled on an existing filesystem.

For example, what if casefold is enabled on an existing filesystem that
already has an encrypted directory? Does the code _assume_ that there is
a hash beyond the name if the rec_len is long enough for this? There will
definitely be some pre-existing dirents that will have a large rec_len
(e.g. those at the end of the block, or with deleted entries immediately
following), that do *not* have the proper hash stored in them. There may
be random garbage at the end of the dirent, and since every value in the
hash is valid, there is no way to know whether it is good or bad.

With the dirdata mechanism, there would be a bit set in the "file_type"
field that will indicate if the hash was present, as well as a length
field (0x08) that is a second confirmation that this field is valid.
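
As a rough illustration of why that self-describing layout is easier on
tools, here is a minimal sketch in C of walking such fields. The function
name dirdata_find(), the flag-bit range, and the exact layout (fields
stored in flag-bit order, each starting with a length byte that counts the
whole field, itself included) are assumptions made for this example, not
the actual Lustre or ext4 on-disk definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration: flag bits live in the high nibble of file_type. */
#define DIRENT_FLAG_LOW  0x10u
#define DIRENT_FLAG_HIGH 0x80u

/*
 * Hypothetical helper: locate the field selected by `flag`.
 * `tail` points at the first field after the name, `tail_len` is the number
 * of bytes left inside rec_len. Returns NULL if the flag is not set in
 * file_type or if a length byte runs past the end of the record.
 */
static const uint8_t *dirdata_find(uint8_t file_type, const uint8_t *tail,
                                   size_t tail_len, uint8_t flag,
                                   uint8_t *len)
{
    size_t off = 0;

    assert(tail != NULL && len != NULL);

    for (unsigned bit = DIRENT_FLAG_LOW; bit <= DIRENT_FLAG_HIGH; bit <<= 1) {
        uint8_t flen;

        if (!(file_type & bit))
            continue;                /* this field is not present */
        if (off >= tail_len)
            return NULL;             /* flag set but no room for the field */
        flen = tail[off];
        if (flen == 0 || flen > tail_len - off)
            return NULL;             /* length byte is inconsistent */
        if (bit == flag) {
            *len = flen;
            return tail + off;
        }
        off += flen;                 /* skip a field we may not understand */
    }
    return NULL;
}
```

The point of the sketch is only that a reader can skip unknown fields and
can reject a dirent whose flag bits and length bytes disagree, which is
not possible when the trailing bytes carry no marker at all.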

Cheers, Andreas







2021-02-10 08:28:58

by Theodore Ts'o

Subject: Re: [PATCH 1/2] ext4: Handle casefolding with encryption

On Tue, Feb 09, 2021 at 08:03:10PM -0700, Andreas Dilger wrote:
> Depending on the size of the "escape", it probably makes sense to move
> toward having e2fsck migrate from the current mechanism to using dirdata
> for all deployments. In the current implementation, tools don't really
> know for sure if there is data beyond the filename in the dirent or not.

It's actually quite well defined. If dirdata is enabled, then we
follow the dirdata rules. If dirdata is *not* enabled, then if a
directory inode has the case folding and encryption flags set, then
there will be cryptographic data immediately following the filename.
Otherwise, there is no valid data after the filename.

> For example, what if casefold is enabled on an existing filesystem that
> already has an encrypted directory? Does the code _assume_ that there is
> a hash beyond the name if the rec_len is long enough for this?

No, we will only expect there to be a hash beyond the name if the
EXT4_CASEFOLD_FL and EXT4_ENCRYPT_FL flags are both set on the inode.
(And if the rec_len is not large enough, then that's a corrupted
directory entry.)
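
To make the rule above concrete, here is a minimal sketch in C, not the
code from the patch. The flag and feature values match the kernel's
fs/ext4/ext4.h; the enum, the two helper names, and the assumption that the
hash occupies 8 bytes directly after the name are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the kernel's fs/ext4/ext4.h. */
#define EXT4_ENCRYPT_FL               0x00000800u
#define EXT4_CASEFOLD_FL              0x40000000u
#define EXT4_FEATURE_INCOMPAT_DIRDATA 0x1000u

static_assert((EXT4_ENCRYPT_FL & EXT4_CASEFOLD_FL) == 0,
              "inode flags must be distinct bits");

/* Hypothetical names for this sketch. */
#define DIRENT_HEADER_LEN 8u   /* inode + rec_len + name_len + file_type */
#define DIRENT_HASH_LEN   8u   /* assumed: hash + minor hash, 4 bytes each */

enum tail_layout { TAIL_NONE, TAIL_HASH, TAIL_DIRDATA };

/* Decide what, if anything, follows the name in a dirent. */
static enum tail_layout dirent_tail_layout(uint32_t sb_incompat,
                                           uint32_t inode_flags)
{
    if (sb_incompat & EXT4_FEATURE_INCOMPAT_DIRDATA)
        return TAIL_DIRDATA;               /* follow the dirdata rules */
    if ((inode_flags & EXT4_CASEFOLD_FL) && (inode_flags & EXT4_ENCRYPT_FL))
        return TAIL_HASH;                  /* hash right after the name */
    return TAIL_NONE;                      /* no valid data after the name */
}

/* A too-small rec_len for the expected layout means a corrupted dirent. */
static bool dirent_rec_len_ok(uint16_t rec_len, uint8_t name_len,
                              enum tail_layout layout)
{
    uint32_t need = DIRENT_HEADER_LEN + name_len;

    if (layout == TAIL_HASH)
        need += DIRENT_HASH_LEN;
    need = (need + 3u) & ~3u;              /* dirents are 4-byte aligned */
    return rec_len >= need;
}
```

Note that a large rec_len alone never selects TAIL_HASH here; only the
inode flags do, so slack space at the end of a block is not misread as a
hash.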

> I guess it is implicit with the casefold+encryption case for dirents in
> directories that have the encryption flag set in a filesystem that also
> has casefold enabled, but it's definitely not friendly to these features
> being enabled on an existing filesystem.

No, it's fine. The EXT4_CASEFOLD_FL inode flag can only be set if
EXT4_FEATURE_INCOMPAT_CASEFOLD is set in the superblock, and the
EXT4_ENCRYPT_FL inode flag can only be set if
EXT4_FEATURE_INCOMPAT_ENCRYPT is set in the superblock. This is why it
will be safe to enable these features: merely enabling the file system
features only allows new directories to be created with both
CASEFOLD_FL and ENCRYPT_FL set.

The only restriction we would have is that if a file system has both the
case folding and encryption features, it will *not* be safe to set the
dirdata feature flag without first scanning all of the directories to
see if any of them have both the casefold and encrypt flags set on the
inode, and if so, converting all of their directory entries to use
dirdata. I don't think this is going to be a significant restriction in
practice, though.
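
The pre-check described above could look roughly like the following C
sketch. The flag and feature values match the kernel's fs/ext4/ext4.h, but
struct dir_inode and dirs_needing_conversion() are hypothetical; a real
tool would walk the inode table rather than take an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the kernel's fs/ext4/ext4.h. */
#define EXT4_ENCRYPT_FL                0x00000800u
#define EXT4_CASEFOLD_FL               0x40000000u
#define EXT4_FEATURE_INCOMPAT_ENCRYPT  0x10000u
#define EXT4_FEATURE_INCOMPAT_CASEFOLD 0x20000u

/* Hypothetical summary of one directory inode. */
struct dir_inode {
    uint32_t ino;
    uint32_t flags;
};

/*
 * Count the directories whose entries would have to be converted before
 * the dirdata feature flag could be set safely.
 */
static size_t dirs_needing_conversion(uint32_t sb_incompat,
                                      const struct dir_inode *dirs, size_t n)
{
    const uint32_t both_feat = EXT4_FEATURE_INCOMPAT_ENCRYPT |
                               EXT4_FEATURE_INCOMPAT_CASEFOLD;
    const uint32_t both_fl = EXT4_ENCRYPT_FL | EXT4_CASEFOLD_FL;
    size_t count = 0;

    assert(dirs != NULL || n == 0);

    /* Without both features, no inode can carry both flags: skip the scan. */
    if ((sb_incompat & both_feat) != both_feat)
        return 0;

    for (size_t i = 0; i < n; i++) {
        if ((dirs[i].flags & both_fl) == both_fl)
            count++;
    }
    return count;
}
```

A result of zero would mean the dirdata flag can be set right away; any
other value means those directories need their entries rewritten first.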

- Ted


2021-02-17 04:06:21

by Daniel Rosenberg

Subject: Re: [PATCH 1/2] ext4: Handle casefolding with encryption

I'm not sure what the conflict is, at least format-wise. Naturally,
there would need to be some work to reconcile the two patches, but my
patch only alters the format for directories which are encrypted and
casefolded, which always must have the additional hash field. In the
case of dirdata along with encryption and casefolding, couldn't we
have the dirdata simply follow after the existing data? Since we
always already know the length, it'd be unambiguous where that would
start. Casefolding can only be altered on an empty directory, and you
can only enable encryption for an empty directory, so I'm not too
concerned there. I feel like having it swapping between the different
methods makes it more prone to bugs, although it would be doable. I've
started rebasing the dirdata patch on my end to see how easy it is to
mix the two. At a glance, they touch a lot of the same areas in
similar ways, so it shouldn't be too hard. It's more of a question of
which way we want to resolve that, and which patch goes first.

I've been trying to figure out how many devices in the field are using
casefolded encryption, but haven't found out yet. The code is
definitely available though, so I would not be surprised if it's being
used, or is about to be.

-Daniel