2016-03-21 18:11:34

by Richard Yao

[permalink] [raw]
Subject: Making an interface for alternative data streams

I am thinking of implementing Solaris-style alternative data streams in
the ZFSOnLinux driver via an ioctl and writing a compatibility shim so
that software written to use O_XATTR can be trivially adapted to use the
interface.

I sketched out the fine details on github:

https://github.com/zfsonlinux/zfs/issues/4437

I would be much happier if the VFS gave filesystem drivers the ability
to implement O_XATTR. That would avoid the need to (ab)use an ioctl for
this and eliminate the risk of using a bit that would be defined to mean
something else. The former risks permissions checks becoming stale while
the latter is a situation that I would be happy to avoid.

Since this sort of interface is applicable to NFS too, I wanted to ask
what various mainline developers think about it before I tried doing an
initial implementation.


Attachments:
signature.asc (819.00 B)
OpenPGP digital signature

2016-03-21 18:18:26

by Christoph Hellwig

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Mon, Mar 21, 2016 at 02:11:17PM -0400, Richard Yao wrote:
> I am thinking of implementing Solaris-style alternative data streams in
> the ZFSOnLinux driver via an ioctl and writing a compatibility shim so
> that software written to use O_XATTR can be trivially adapted to use the
> interface.

Even if alternate data streams weren't braindead to start with we are
not going to support your vioation of our copyrights Richard. Please go
away from our mailing lists.

2016-03-21 19:25:05

by Richard Yao

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On 03/21/2016 02:51 PM, Richard Yao wrote:
> On 03/21/2016 02:18 PM, Christoph Hellwig wrote:
>> On Mon, Mar 21, 2016 at 02:11:17PM -0400, Richard Yao wrote:
>>> I am thinking of implementing Solaris-style alternative data streams in
>>> the ZFSOnLinux driver via an ioctl and writing a compatibility shim so
>>> that software written to use O_XATTR can be trivially adapted to use the
>>> interface.
>>
>> Even if alternate data streams weren't braindead to start with we are
>> not going to support your vioation of our copyrights Richard. Please go
>> away from our mailing lists.
>
> I have not violated your copyrights. So far, both the SFC and SFLC are
> in agreement on that.
>

On second thought, this deserves a more detailed response. While I have
spoken to both the SFLC and SFC in the past month, I have spoken to
others in the past. They include an attorney at the DoJ, the FSFE at
LinuxCon Europe 2014 and the Linux Foundation's general consul at
Linuxcon Europe 2014. Not once has any of these organizations ever
accused me of violating Linux copyrights.

All of my work (sans one userland thing I did to help the elderly while
I was at SUNYSB) is under open source licenses. I even have some patches
in the mainline tree. I have done the best that I can to respect the
license of every project to which I have contributed and until today,
there has never been a single claim that I personally violated them,
even in matters of disagreement on things like distribution of binary LKMs.

I understand that the SFC is representing you against VMWare. I have
been in touch with them. They have told me that they do not believe that
I have violated your copyright. If your attorneys have changed their
minds, please let me know so that I can forward your complaint to my
lawyers. Otherwise, please refrain from making such remarks.

That said, while my mainline contributions have been small because much
of my work being both out-of-tree and subject to the limitations of
older kernels, I intend to contribute more in the future and I have no
intention of going away.


Attachments:
signature.asc (819.00 B)
OpenPGP digital signature

2016-03-21 19:30:02

by Richard Yao

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On 03/21/2016 02:18 PM, Christoph Hellwig wrote:
> On Mon, Mar 21, 2016 at 02:11:17PM -0400, Richard Yao wrote:
>> I am thinking of implementing Solaris-style alternative data streams in
>> the ZFSOnLinux driver via an ioctl and writing a compatibility shim so
>> that software written to use O_XATTR can be trivially adapted to use the
>> interface.
>
> Even if alternate data streams weren't braindead to start with we are
> not going to support your vioation of our copyrights Richard. Please go
> away from our mailing lists.

I have not violated your copyrights. So far, both the SFC and SFLC are
in agreement on that.


Attachments:
signature.asc (819.00 B)
OpenPGP digital signature

2016-03-21 20:19:31

by Richard Yao

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On 03/21/2016 02:11 PM, Richard Yao wrote:
> I am thinking of implementing Solaris-style alternative data streams in
> the ZFSOnLinux driver via an ioctl and writing a compatibility shim so
> that software written to use O_XATTR can be trivially adapted to use the
> interface.
>
> I sketched out the fine details on github:
>
> https://github.com/zfsonlinux/zfs/issues/4437
>
> I would be much happier if the VFS gave filesystem drivers the ability
> to implement O_XATTR. That would avoid the need to (ab)use an ioctl for
> this and eliminate the risk of using a bit that would be defined to mean
> something else. The former risks permissions checks becoming stale while
> the latter is a situation that I would be happy to avoid.
>
> Since this sort of interface is applicable to NFS too, I wanted to ask
> what various mainline developers think about it before I tried doing an
> initial implementation.
>

Maybe I should clarify that the idea is to allow read/write/list of
extended attributes via read/write/readdir so that those that want
extended attributes that are alternative data streams can have them. I
do not want to see extended attributes and alternative data streams be
different things. Alternative data streams are in the NFSv4
specification, so I thought that the developers of the NFS client driver
would want something like this.

If it went into the VFS, then existing in-tree filesystems could have it
mapped to the existing interface, which would allow it to work
everywhere extended attributes are implemented. If they are not
interested, then I could go ahead with my ioctl idea. I just wanted to
try to implement this in a way everyone who can use it would like so
that we can avoid a XKCD #927 situation in the future:

https://xkcd.com/927/


Attachments:
signature.asc (819.00 B)
OpenPGP digital signature

2016-03-21 20:40:42

by J. Bruce Fields

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Mon, Mar 21, 2016 at 04:19:17PM -0400, Richard Yao wrote:
> Maybe I should clarify that the idea is to allow read/write/list of
> extended attributes via read/write/readdir so that those that want
> extended attributes that are alternative data streams can have them. I
> do not want to see extended attributes and alternative data streams be
> different things.

I think there are differences between the two that make this awkward.
Does anyone actually use alternative data stream for anything that makes
the effort worthwhile?

> Alternative data streams are in the NFSv4 specification, so I thought
> that the developers of the NFS client driver would want something like
> this.

Somebody would have to make a convincing argument that it's worthwhile.

Note there's also a proposal to add extended attributes to the NFSv4
protocol:

https://tools.ietf.org/html/draft-ietf-nfsv4-xattrs-02

That also has some more discussion of mismatches between the named- and
extended- attribute interfaces:

https://tools.ietf.org/html/draft-ietf-nfsv4-xattrs-02#section-5

--b.

2016-03-21 22:36:45

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Mon, Mar 21, 2016 at 04:40:41PM -0400, J. Bruce Fields wrote:
> On Mon, Mar 21, 2016 at 04:19:17PM -0400, Richard Yao wrote:
> > Maybe I should clarify that the idea is to allow read/write/list of
> > extended attributes via read/write/readdir so that those that want
> > extended attributes that are alternative data streams can have them. I
> > do not want to see extended attributes and alternative data streams be
> > different things.
>
> I think there are differences between the two that make this awkward.
> Does anyone actually use alternative data stream for anything that makes
> the effort worthwhile?

Windows malware authors *love* to use alternate data streams as a
place to hide their malware where many security scanners weren't
looking, and certainly most users won't find.

Does that count? :-)

- Ted

2016-03-21 22:48:06

by Cedric Blancher

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On 21 March 2016 at 23:36, Theodore Ts'o <[email protected]> wrote:
> On Mon, Mar 21, 2016 at 04:40:41PM -0400, J. Bruce Fields wrote:
>> On Mon, Mar 21, 2016 at 04:19:17PM -0400, Richard Yao wrote:
>> > Maybe I should clarify that the idea is to allow read/write/list of
>> > extended attributes via read/write/readdir so that those that want
>> > extended attributes that are alternative data streams can have them. I
>> > do not want to see extended attributes and alternative data streams be
>> > different things.
>>
>> I think there are differences between the two that make this awkward.
>> Does anyone actually use alternative data stream for anything that makes
>> the effort worthwhile?
>
> Windows malware authors *love* to use alternate data streams as a
> place to hide their malware where many security scanners weren't
> looking, and certainly most users won't find.
>
> Does that count? :-)

Old invalid argument, and Sophos and Symatec look there as well.

If it was a bad idea, why has Linux fs attributes which are almost the
same as O_XATTR except that they use a custom api? Why does Macos have
alternate streams (called forks)? Why did Solaris adopt it long ago
(and still gets support questions about it - just saying before
someone argues that no one uses THAT)?

Ced
--
Cedric Blancher <[email protected]>
Institute Pasteur

2016-03-22 00:12:56

by J. Bruce Fields

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Mon, Mar 21, 2016 at 11:48:04PM +0100, Cedric Blancher wrote:
> Old invalid argument, and Sophos and Symatec look there as well.
>
> If it was a bad idea, why has Linux fs attributes which are almost the
> same as O_XATTR except that they use a custom api? Why does Macos have
> alternate streams (called forks)? Why did Solaris adopt it long ago
> (and still gets support questions about it - just saying before
> someone argues that no one uses THAT)?

Could you point us at some of those users?

--b.

2016-03-22 01:02:51

by Richard Yao

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams


> On Mar 21, 2016, at 8:12 PM, J. Bruce Fields <[email protected]> wrote:
>
>> On Mon, Mar 21, 2016 at 11:48:04PM +0100, Cedric Blancher wrote:
>> Old invalid argument, and Sophos and Symatec look there as well.
>>
>> If it was a bad idea, why has Linux fs attributes which are almost the
>> same as O_XATTR except that they use a custom api? Why does Macos have
>> alternate streams (called forks)? Why did Solaris adopt it long ago
>> (and still gets support questions about it - just saying before
>> someone argues that no one uses THAT)?
>
> Could you point us at some of those users?

I am told that Samba users would love this functionality.
>
> --b.

2016-03-22 02:01:40

by Richard Yao

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams


> On Mar 21, 2016, at 4:40 PM, J. Bruce Fields <[email protected]> wrote:
>
>> On Mon, Mar 21, 2016 at 04:19:17PM -0400, Richard Yao wrote:
>> Maybe I should clarify that the idea is to allow read/write/list of
>> extended attributes via read/write/readdir so that those that want
>> extended attributes that are alternative data streams can have them. I
>> do not want to see extended attributes and alternative data streams be
>> different things.
>
> I think there are differences between the two that make this awkward.
> Does anyone actually use alternative data stream for anything that makes
> the effort worthwhile?

For particularly large extended attributes, avoiding having to read the entire value into userspace, modify it and write it back is nice. I believe that XFS goes up to 64KB.

I have had multiple user inquiries regarding this each year. They tend to fall into two camps. One is the Samba camp while the other wants saner fsync semantics. When your disk format uses tar files for extended attributes, doing fsync either omits them or iterates through the extended attribute directory. Neither is very nice. At the moment, the answer in the driver that I develop is it works as long as they are not too big and a non-cross platform compatible option is enabled to store small ones the IRIX way that most in-tree file drivers use. Maybe the right answer is to make it user configurable, although having an interface that lets users target a specific extended attribute would be nice.

I already have the idea of implementing it through an ioctl without any changes to the VFS. I imagine if I do that userland software will begin adopting it and then people will start pinging other filesystem driver developers requesting support, like what happened with reflinks (although the majority of people using them seem to be using them in internal things that never see the light of day). At that point, we could be stuck with a significant body of userland software that expects an interface that probably is less agreeable than what it would have been had people sat down to discuss it in the first place.

>> Alternative data streams are in the NFSv4 specification, so I thought
>> that the developers of the NFS client driver would want something like
>> this.
>
> Somebody would have to make a convincing argument that it's worthwhile.
>
> Note there's also a proposal to add extended attributes to the NFSv4
> protocol:
>
> https://tools.ietf.org/html/draft-ietf-nfsv4-xattrs-02

I had not known about that. If that happens and we start seeing separate namespaces, we will need alternative data streams from the perspective of being able to archive the contents of a file system on one POSIX system and restore it on another. There is just no way that userland tools like rsync would be able to keep treating the two as equivalent when separate namespaces come into play and migrating to Linux would guarantee data loss when both are in use.

> That also has some more discussion of mismatches between the named- and
> extended- attribute interfaces:
>
> https://tools.ietf.org/html/draft-ietf-nfsv4-xattrs-02#section-5

That describes the exact problems with the current approach in the ZoL driver.

>
> --b.

2016-03-22 02:21:44

by Richard Yao

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On 03/21/2016 09:02 PM, Richard Yao wrote:
>
>> On Mar 21, 2016, at 8:12 PM, J. Bruce Fields <[email protected]> wrote:
>>
>>> On Mon, Mar 21, 2016 at 11:48:04PM +0100, Cedric Blancher wrote:
>>> Old invalid argument, and Sophos and Symatec look there as well.
>>>
>>> If it was a bad idea, why has Linux fs attributes which are almost the
>>> same as O_XATTR except that they use a custom api? Why does Macos have
>>> alternate streams (called forks)? Why did Solaris adopt it long ago
>>> (and still gets support questions about it - just saying before
>>> someone argues that no one uses THAT)?
>>
>> Could you point us at some of those users?
>
> I am told that Samba users would love this functionality.

Someone pinged me in IRC to let me know that Steve French was talking
about this earlier this month:

> Have there been any suggestions on how to list alternate data streams
> on a file other than using a pseudo-xattr as ntfs-3g does (querying
> xattr ntfs.streams.list - see http://linux.die.net/man/8/ntfs-3g)?

http://permalink.gmane.org/gmane.linux.kernel.cifs/11681

Windows appears to have name-value pair attributes and alternative data
streams in separate name-spaces simultaneously.


Attachments:
signature.asc (819.00 B)
OpenPGP digital signature

2016-03-22 16:15:45

by Richard Sharpe

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Mon, Mar 21, 2016 at 7:21 PM, Richard Yao <[email protected]> wrote:
> On 03/21/2016 09:02 PM, Richard Yao wrote:
>>
>>> On Mar 21, 2016, at 8:12 PM, J. Bruce Fields <[email protected]> wrote:
>>>
>>>> On Mon, Mar 21, 2016 at 11:48:04PM +0100, Cedric Blancher wrote:
>>>> Old invalid argument, and Sophos and Symatec look there as well.
>>>>
>>>> If it was a bad idea, why has Linux fs attributes which are almost the
>>>> same as O_XATTR except that they use a custom api? Why does Macos have
>>>> alternate streams (called forks)? Why did Solaris adopt it long ago
>>>> (and still gets support questions about it - just saying before
>>>> someone argues that no one uses THAT)?
>>>
>>> Could you point us at some of those users?
>>
>> I am told that Samba users would love this functionality.
>
> Someone pinged me in IRC to let me know that Steve French was talking
> about this earlier this month:
>
>> Have there been any suggestions on how to list alternate data streams
>> on a file other than using a pseudo-xattr as ntfs-3g does (querying
>> xattr ntfs.streams.list - see http://linux.die.net/man/8/ntfs-3g)?
>
> http://permalink.gmane.org/gmane.linux.kernel.cifs/11681
>
> Windows appears to have name-value pair attributes and alternative data
> streams in separate name-spaces simultaneously.

And it is more convoluted than that. If you use reparse points on a
file you cannot use name-value pair attributes (Windows Extended
Attributes) and vice versa, because they overloaded one of the fields
use to report on those things.

Of course, pretty much no one uses Windows Extended Attributes these
days, I believe.

However, plenty of Samba users store Windows ACLs in Linux XATTRs ...

--
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)

2016-03-22 20:08:03

by J. Bruce Fields

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Tue, Mar 22, 2016 at 09:15:44AM -0700, Richard Sharpe wrote:
> On Mon, Mar 21, 2016 at 7:21 PM, Richard Yao <[email protected]> wrote:
> > On 03/21/2016 09:02 PM, Richard Yao wrote:
> >>
> >>> On Mar 21, 2016, at 8:12 PM, J. Bruce Fields <[email protected]> wrote:
> >>>
> >>>> On Mon, Mar 21, 2016 at 11:48:04PM +0100, Cedric Blancher wrote:
> >>>> Old invalid argument, and Sophos and Symatec look there as well.
> >>>>
> >>>> If it was a bad idea, why has Linux fs attributes which are almost the
> >>>> same as O_XATTR except that they use a custom api? Why does Macos have
> >>>> alternate streams (called forks)? Why did Solaris adopt it long ago
> >>>> (and still gets support questions about it - just saying before
> >>>> someone argues that no one uses THAT)?
> >>>
> >>> Could you point us at some of those users?
> >>
> >> I am told that Samba users would love this functionality.
> >
> > Someone pinged me in IRC to let me know that Steve French was talking
> > about this earlier this month:
> >
> >> Have there been any suggestions on how to list alternate data streams
> >> on a file other than using a pseudo-xattr as ntfs-3g does (querying
> >> xattr ntfs.streams.list - see http://linux.die.net/man/8/ntfs-3g)?
> >
> > http://permalink.gmane.org/gmane.linux.kernel.cifs/11681
> >
> > Windows appears to have name-value pair attributes and alternative data
> > streams in separate name-spaces simultaneously.
>
> And it is more convoluted than that. If you use reparse points on a
> file you cannot use name-value pair attributes (Windows Extended
> Attributes) and vice versa, because they overloaded one of the fields
> use to report on those things.
>
> Of course, pretty much no one uses Windows Extended Attributes these
> days, I believe.

But you do see people using "named attributes"/"alternative data
streams"?

This comes up at the LSF/MM summit every now and then and Jeremy Allison
inevitably says "hah, only malware writers use those", and that's the
end of the discussion. Sounds like Richard Yao has heard otherwise, but
it'd be nice to have actual examples of users.

--b.

>
> However, plenty of Samba users store Windows ACLs in Linux XATTRs ...
>
> --
> Regards,
> Richard Sharpe
> (何以解憂?唯有杜康。--曹操)

2016-03-22 20:13:50

by Richard Sharpe

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Tue, Mar 22, 2016 at 1:08 PM, J. Bruce Fields <[email protected]> wrote:
> On Tue, Mar 22, 2016 at 09:15:44AM -0700, Richard Sharpe wrote:
>> On Mon, Mar 21, 2016 at 7:21 PM, Richard Yao <[email protected]> wrote:
>> > On 03/21/2016 09:02 PM, Richard Yao wrote:
>> >>
>> >>> On Mar 21, 2016, at 8:12 PM, J. Bruce Fields <[email protected]> wrote:
>> >>>
>> >>>> On Mon, Mar 21, 2016 at 11:48:04PM +0100, Cedric Blancher wrote:
>> >>>> Old invalid argument, and Sophos and Symatec look there as well.
>> >>>>
>> >>>> If it was a bad idea, why has Linux fs attributes which are almost the
>> >>>> same as O_XATTR except that they use a custom api? Why does Macos have
>> >>>> alternate streams (called forks)? Why did Solaris adopt it long ago
>> >>>> (and still gets support questions about it - just saying before
>> >>>> someone argues that no one uses THAT)?
>> >>>
>> >>> Could you point us at some of those users?
>> >>
>> >> I am told that Samba users would love this functionality.
>> >
>> > Someone pinged me in IRC to let me know that Steve French was talking
>> > about this earlier this month:
>> >
>> >> Have there been any suggestions on how to list alternate data streams
>> >> on a file other than using a pseudo-xattr as ntfs-3g does (querying
>> >> xattr ntfs.streams.list - see http://linux.die.net/man/8/ntfs-3g)?
>> >
>> > http://permalink.gmane.org/gmane.linux.kernel.cifs/11681
>> >
>> > Windows appears to have name-value pair attributes and alternative data
>> > streams in separate name-spaces simultaneously.
>>
>> And it is more convoluted than that. If you use reparse points on a
>> file you cannot use name-value pair attributes (Windows Extended
>> Attributes) and vice versa, because they overloaded one of the fields
>> use to report on those things.
>>
>> Of course, pretty much no one uses Windows Extended Attributes these
>> days, I believe.
>
> But you do see people using "named attributes"/"alternative data
> streams"?
>
> This comes up at the LSF/MM summit every now and then and Jeremy Allison
> inevitably says "hah, only malware writers use those", and that's the
> end of the discussion. Sounds like Richard Yao has heard otherwise, but
> it'd be nice to have actual examples of users.

I have worked on three products that run on UNIX (one on FreeBSD, two
on Linux) that require Alternate Data Streams because customers demand
them. In each case we used Samba and its streams_depot VFS module. The
fact that Richard is discussing exposing streams in ZoL will be
interesting to some users, possibly especially to those Taiwanese
cheap NAS companies

Named Attributes are pretty rare these days. I think that only virus
checkers ever used them to store the fact that they had checked that
file.

--
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)

2016-03-22 20:32:05

by J. Bruce Fields

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Tue, Mar 22, 2016 at 01:13:49PM -0700, Richard Sharpe wrote:
> On Tue, Mar 22, 2016 at 1:08 PM, J. Bruce Fields <[email protected]> wrote:
> > On Tue, Mar 22, 2016 at 09:15:44AM -0700, Richard Sharpe wrote:
> >> On Mon, Mar 21, 2016 at 7:21 PM, Richard Yao <[email protected]> wrote:
> >> > On 03/21/2016 09:02 PM, Richard Yao wrote:
> >> >>
> >> >>> On Mar 21, 2016, at 8:12 PM, J. Bruce Fields <[email protected]> wrote:
> >> >>>
> >> >>>> On Mon, Mar 21, 2016 at 11:48:04PM +0100, Cedric Blancher wrote:
> >> >>>> Old invalid argument, and Sophos and Symatec look there as well.
> >> >>>>
> >> >>>> If it was a bad idea, why has Linux fs attributes which are almost the
> >> >>>> same as O_XATTR except that they use a custom api? Why does Macos have
> >> >>>> alternate streams (called forks)? Why did Solaris adopt it long ago
> >> >>>> (and still gets support questions about it - just saying before
> >> >>>> someone argues that no one uses THAT)?
> >> >>>
> >> >>> Could you point us at some of those users?
> >> >>
> >> >> I am told that Samba users would love this functionality.
> >> >
> >> > Someone pinged me in IRC to let me know that Steve French was talking
> >> > about this earlier this month:
> >> >
> >> >> Have there been any suggestions on how to list alternate data streams
> >> >> on a file other than using a pseudo-xattr as ntfs-3g does (querying
> >> >> xattr ntfs.streams.list - see http://linux.die.net/man/8/ntfs-3g)?
> >> >
> >> > http://permalink.gmane.org/gmane.linux.kernel.cifs/11681
> >> >
> >> > Windows appears to have name-value pair attributes and alternative data
> >> > streams in separate name-spaces simultaneously.
> >>
> >> And it is more convoluted than that. If you use reparse points on a
> >> file you cannot use name-value pair attributes (Windows Extended
> >> Attributes) and vice versa, because they overloaded one of the fields
> >> use to report on those things.
> >>
> >> Of course, pretty much no one uses Windows Extended Attributes these
> >> days, I believe.
> >
> > But you do see people using "named attributes"/"alternative data
> > streams"?
> >
> > This comes up at the LSF/MM summit every now and then and Jeremy Allison
> > inevitably says "hah, only malware writers use those", and that's the
> > end of the discussion. Sounds like Richard Yao has heard otherwise, but
> > it'd be nice to have actual examples of users.
>
> I have worked on three products that run on UNIX (one on FreeBSD, two
> on Linux) that require Alternate Data Streams because customers demand
> them. In each case we used Samba and its streams_depot VFS module. The
> fact that Richard is discussing exposing streams in ZoL will be
> interesting to some users, possibly especially to those Taiwanese
> cheap NAS companies

Any clues on where we might look for example end users? (So, not Samba,
or somebody else that's just passing along the interface, but an
application that's actually consuming it?)

> Named Attributes are pretty rare these days. I think that only virus
> checkers ever used them to store the fact that they had checked that
> file.

By the way, the language here cause me no end of confusion, because RFC
3530 (NFSv4) uses "named attributes" to mean alternate data streams. So
I've tended to use "named" vs "extended" attributes to distinguish the
two cases, whereas you seem to use them as synonyms.

--b.

2016-03-22 21:29:34

by Dave Chinner

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Mon, Mar 21, 2016 at 10:01:35PM -0400, Richard Yao wrote:
>
> > On Mar 21, 2016, at 4:40 PM, J. Bruce Fields
> > <[email protected]> wrote:
> >
> >> On Mon, Mar 21, 2016 at 04:19:17PM -0400, Richard Yao wrote:
> >> Maybe I should clarify that the idea is to allow
> >> read/write/list of extended attributes via read/write/readdir
> >> so that those that want extended attributes that are
> >> alternative data streams can have them. I do not want to see
> >> extended attributes and alternative data streams be different
> >> things.
> >
> > I think there are differences between the two that make this
> > awkward. Does anyone actually use alternative data stream for
> > anything that makes the effort worthwhile?
>
> For particularly large extended attributes, avoiding having to
> read the entire value into userspace, modify it and write it back
> is nice. I believe that XFS goes up to 64KB.

Sorry, but XFS xattrs are cannot be partially overwritten due to the
write atomicity requirement of xattrs (i.e. either the entire change
or none of the change is present after a crash). not to mention that
we'd have to completely re-implement extended attributes in XFS to
support them being used as ADS. That's simply not going to happen.

Extended attributes are *not data streams*. Stop trying to make them
data streams - the APIs and the implementations in the fileystems
are simply not designed to be used as seekable data streams.

If you want additional seekable data streams, then come up with a
filesystem namespace method of addressing these alternate data
streams as *separate files containing data*. That's all an ADS is -
a namespace hack to address multiple data files through a single
file name. That's the problem that needs solving and it has nothing
to do with xattrs.

Cheers,

Dave.
--
Dave Chinner
[email protected]

2016-03-22 21:43:05

by Jeremy Allison

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Tue, Mar 22, 2016 at 04:08:01PM -0400, J. Bruce Fields wrote:
>
> But you do see people using "named attributes"/"alternative data
> streams"?
>
> This comes up at the LSF/MM summit every now and then and Jeremy Allison
> inevitably says "hah, only malware writers use those", and that's the
> end of the discussion. Sounds like Richard Yao has heard otherwise, but
> it'd be nice to have actual examples of users.

The only use I know of other than malware writers is
the :Zone.Identifier stream used by Internet Explorer.

http://woshub.com/how-windows-determines-that-the-file-has-been-downloaded-from-the-internet/

Not sure if the new Microsoft browser still uses them
(I haven't used desktop Windows in over 10 years).

2016-03-22 21:52:34

by J. Bruce Fields

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Wed, Mar 23, 2016 at 08:29:31AM +1100, Dave Chinner wrote:
> If you want additional seekable data streams, then come up with a
> filesystem namespace method of addressing these alternate data
> streams as *separate files containing data*. That's all an ADS is -
> a namespace hack to address multiple data files through a single
> file name. That's the problem that needs solving and it has nothing
> to do with xattrs.

There was some thread about this years ago. Looking... "silent semantic
changes with reiser4" or "possible design issues for hybrids" look like
the relevant threads.

One good starting point might be:

http://mid.gmane.org/[email protected]

Don't know how out of date that might be now.

--b.

2016-03-22 22:50:21

by Dave Chinner

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Tue, Mar 22, 2016 at 05:52:32PM -0400, J. Bruce Fields wrote:
> On Wed, Mar 23, 2016 at 08:29:31AM +1100, Dave Chinner wrote:
> > If you want additional seekable data streams, then come up with a
> > filesystem namespace method of addressing these alternate data
> > streams as *separate files containing data*. That's all an ADS is -
> > a namespace hack to address multiple data files through a single
> > file name. That's the problem that needs solving and it has nothing
> > to do with xattrs.
>
> There was some thread about this years ago. Looking... "silent semantic
> changes with reiser4" or "possible design issues for hybrids" look like
> the relevant threads.
>
> One good starting point might be:
>
> http://mid.gmane.org/[email protected]
>
> Don't know how out of date that might be now.

Not sure it is the same issue - that looks like auto-bind-mount
stuff. There are some relevant questions though, like "directory
over file" semantics and implications...

But we have covered this ground before, and at LSF/MM summits, too.
historically it has been made pretty clear that xattrs are not
something that can be used for alternate data streams. I think ADS
is primarily a namespace and API problem - that needs to be sorted
out first, and then what filesytems need to implement will be
obvious.

Perhaps overlayfs is a good place to start prototyping such
functionality as there is an abstraction between the presented
namespace and the underlying storage of the data....

Cheers,

Dave.
--
Dave Chinner
[email protected]

2016-03-23 04:13:51

by Steve French

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Tue, Mar 22, 2016 at 4:42 PM, Jeremy Allison <[email protected]> wrote:
> On Tue, Mar 22, 2016 at 04:08:01PM -0400, J. Bruce Fields wrote:
>>
>> But you do see people using "named attributes"/"alternative data
>> streams"?
>>
>> This comes up at the LSF/MM summit every now and then and Jeremy Allison
>> inevitably says "hah, only malware writers use those", and that's the
>> end of the discussion. Sounds like Richard Yao has heard otherwise, but
>> it'd be nice to have actual examples of users.
>
> The only use I know of other than malware writers is
> the :Zone.Identifier stream used by Internet Explorer.
>
> http://woshub.com/how-windows-determines-that-the-file-has-been-downloaded-from-the-internet/
>
> Not sure if the new Microsoft browser still uses them
> (I haven't used desktop Windows in over 10 years).

Yes, the browser still uses it (at least on the system I tried
yesterday), and so do a few important subsystems (the file resource
manager for example). Presumably streams are used even more on Mac.

I was experimenting with some patches in the last few weeks to list
streams (either via an xattr as ntfs-3g does, but I am leaning toward
an ioctl for cifs.ko). They are needed for backup (at least), and not
just for accessing Macs (which use resource forks extensively), but
since Windows stores the zone identifier (where a file came from is
stored when internet explorer downloads anything) in an alternate data
stream, and also "FCI" (file classification information) is stored
there.

--
Thanks,

Steve

2016-03-23 04:20:12

by Steve French

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Tue, Mar 22, 2016 at 11:13 PM, Steve French <[email protected]> wrote:
> On Tue, Mar 22, 2016 at 4:42 PM, Jeremy Allison <[email protected]> wrote:
>> On Tue, Mar 22, 2016 at 04:08:01PM -0400, J. Bruce Fields wrote:
>>>
>>> But you do see people using "named attributes"/"alternative data
>>> streams"?
>>>
>>> This comes up at the LSF/MM summit every now and then and Jeremy Allison
>>> inevitably says "hah, only malware writers use those", and that's the
>>> end of the discussion. Sounds like Richard Yao has heard otherwise, but
>>> it'd be nice to have actual examples of users.
>>
>> The only use I know of other than malware writers is
>> the :Zone.Identifier stream used by Internet Explorer.
>>
>> http://woshub.com/how-windows-determines-that-the-file-has-been-downloaded-from-the-internet/
>>
>> Not sure if the new Microsoft browser still uses them
>> (I haven't used desktop Windows in over 10 years).
>
> Yes, the browser still uses it (at least on the system I tried
> yesterday), and so do a few important subsystems (the file resource
> manager for example). Presumably streams are used even more on Mac.
>
> I was experimenting with some patches in the last few weeks to list
> streams (either via an xattr as ntfs-3g does, but I am leaning toward
> an ioctl for cifs.ko). They are needed for backup (at least), and not
> just for accessing Macs (which use resource forks extensively), but
> since Windows stores the zone identifier (where a file came from is
> stored when internet explorer downloads anything) in an alternate data
> stream, and also "FCI" (file classification information) is stored
> there.

I should also note that since SMB3 operations are handle based
(except open/create itself), I prefer using an ioctl rather than xattr
query to list streams. In addition, by overlapping the alternate
data stream name space, with the EAs name space they are
harder to tell apart (xattrs are used less frequently on Windows
than in the past but they do show up from time to time,
e.g. in their Services for Unix). Seems wrong to make it easy
to confuse streams and EAs (extended attributes).


--
Thanks,

Steve

2016-03-23 14:45:43

by Steve French

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

One of the arguments in favor of additional interfaces (ioctl or
openat) for accessing alternate data streams which may not be obvious
to Windows users is that while alternate data streams can be opened
just like regular files in Windows (and thus over SMB3 mounts), in
Linux it is hard to allow opening a stream and still support files
with the ':' (colon) character in their file name since colon is used
a separator for the stream name in Windows (and is a reserved
character), but is a valid character in POSIX. When we use a cifs
or smb3 mount to Windows or Mac we typically map characters (into the
Unicode remap range just above 0xF000) like ':' the same way the Mac
does (and Windows services for Mac does as well). This is enabled
with mount option "mapposix"

So without an ioctl to query the stream contents (or a new syscall),
you have to choose whether to either allow : in a filename or allow
opening streams.

There is some additional information on some of the more important
uses in Windows for alternate data streams at the end of the article
in this link: https://blogs.technet.microsoft.com/askcore/2013/03/24/alternate-data-streams-in-ntfs/

On Tue, Mar 22, 2016 at 11:19 PM, Steve French <[email protected]> wrote:
> On Tue, Mar 22, 2016 at 11:13 PM, Steve French <[email protected]> wrote:
>> On Tue, Mar 22, 2016 at 4:42 PM, Jeremy Allison <[email protected]> wrote:
>>> On Tue, Mar 22, 2016 at 04:08:01PM -0400, J. Bruce Fields wrote:
>>>>
>>>> But you do see people using "named attributes"/"alternative data
>>>> streams"?
>>>>
>>>> This comes up at the LSF/MM summit every now and then and Jeremy Allison
>>>> inevitably says "hah, only malware writers use those", and that's the
>>>> end of the discussion. Sounds like Richard Yao has heard otherwise, but
>>>> it'd be nice to have actual examples of users.
>>>
>>> The only use I know of other than malware writers is
>>> the :Zone.Identifier stream used by Internet Explorer.
>>>
>>> http://woshub.com/how-windows-determines-that-the-file-has-been-downloaded-from-the-internet/
>>>
>>> Not sure if the new Microsoft browser still uses them
>>> (I haven't used desktop Windows in over 10 years).
>>
>> Yes, the browser still uses it (at least on the system I tried
>> yesterday), and so do a few important subsystems (the file resource
>> manager for example). Presumably streams are used even more on Mac.
>>
>> I was experimenting with some patches in the last few weeks to list
>> streams (either via an xattr as ntfs-3g does, but I am leaning toward
>> an ioctl for cifs.ko). They are needed for backup (at least), and not
>> just for accessing Macs (which use resource forks extensively), but
>> since Windows stores the zone identifier (where a file came from is
>> stored when internet explorer downloads anything) in an alternate data
>> stream, and also "FCI" (file classification information) is stored
>> there.
>
> I should also note that since SMB3 operations are handle based
> (except open/create itself), I prefer using an ioctl rather than xattr
> query to list streams. In addition, by overlapping the alternate
> data stream name space, with the EAs name space they are
> harder to tell apart (xattrs are used less frequently on Windows
> than in the past but they do show up from time to time,
> e.g. in their Services for Unix). Seems wrong to make it easy
> to confuse streams and EAs (extended attributes).
>
>
> --
> Thanks,
>
> Steve



--
Thanks,

Steve

2016-03-23 15:16:11

by J. Bruce Fields

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Tue, Mar 22, 2016 at 11:19:51PM -0500, Steve French wrote:
> On Tue, Mar 22, 2016 at 11:13 PM, Steve French <[email protected]> wrote:
> > Yes, the browser still uses it (at least on the system I tried
> > yesterday), and so do a few important subsystems (the file resource
> > manager for example). Presumably streams are used even more on Mac.
> >
> > I was experimenting with some patches in the last few weeks to list
> > streams (either via an xattr as ntfs-3g does, but I am leaning toward
> > an ioctl for cifs.ko). They are needed for backup (at least), and not
> > just for accessing Macs (which use resource forks extensively), but
> > since Windows stores the zone identifier (where a file came from is
> > stored when internet explorer downloads anything) in an alternate data
> > stream, and also "FCI" (file classification information) is stored
> > there.

Sounds like there are important user, then.

> I should also note that since SMB3 operations are handle based
> (except open/create itself), I prefer using an ioctl rather than xattr
> query to list streams.

So on Linux you'd want that ioctl to return a file descriptor for a
given stream?

> In addition, by overlapping the alternate
> data stream name space, with the EAs name space they are
> harder to tell apart (xattrs are used less frequently on Windows
> than in the past but they do show up from time to time,
> e.g. in their Services for Unix). Seems wrong to make it easy
> to confuse streams and EAs (extended attributes).

Sounds like everyone's agreed that the two should be kept distinct.

--b.

2016-03-23 17:02:09

by Jeremy Allison

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Wed, Mar 23, 2016 at 09:45:07AM -0500, Steve French wrote:
> One of the arguments in favor of additional interfaces (ioctl or
> openat) for accessing alternate data streams which may not be obvious
> to Windows users is that while alternate data streams can be opened
> just like regular files in Windows (and thus over SMB3 mounts), in
> Linux it is hard to allow opening a stream and still support files
> with the ':' (colon) character in their file name since colon is used
> a separator for the stream name in Windows (and is a reserved
> character), but is a valid character in POSIX. When we use a cifs
> or smb3 mount to Windows or Mac we typically map characters (into the
> Unicode remap range just above 0xF000) like ':' the same way the Mac
> does (and Windows services for Mac does as well). This is enabled
> with mount option "mapposix"
>
> So without an ioctl to query the stream contents (or a new syscall),
> you have to choose whether to either allow : in a filename or allow
> opening streams.
>
> There is some additional information on some of the more important
> uses in Windows for alternate data streams at the end of the article
> in this link: https://blogs.technet.microsoft.com/askcore/2013/03/24/alternate-data-streams-in-ntfs/

Sorry Steve, but none of the uses in there can be called "important".

I personally have an intense dislike for streams in a filesystem,
and was very disappointed when Microsoft re-added them to the
previously streamless ReFS (probably for backwards compatibility
stuff like this).

There's no way to transfer stream-riddled files over the Internet,
and the amount of code complexity we have in Samba having to deal
with them is nasty and has lead to more than one security hole in
the past.

Please don't add this to Linux.

2016-03-23 17:17:15

by Steve French

[permalink] [raw]
Subject: Re: Making an interface for alternative data streams

On Wed, Mar 23, 2016 at 12:01 PM, Jeremy Allison <[email protected]> wrote:
> On Wed, Mar 23, 2016 at 09:45:07AM -0500, Steve French wrote:
>> One of the arguments in favor of additional interfaces (ioctl or
>> openat) for accessing alternate data streams which may not be obvious
>> to Windows users is that while alternate data streams can be opened
>> just like regular files in Windows (and thus over SMB3 mounts), in
>> Linux it is hard to allow opening a stream and still support files
>> with the ':' (colon) character in their file name since colon is used
>> a separator for the stream name in Windows (and is a reserved
>> character), but is a valid character in POSIX. When we use a cifs
>> or smb3 mount to Windows or Mac we typically map characters (into the
>> Unicode remap range just above 0xF000) like ':' the same way the Mac
>> does (and Windows services for Mac does as well). This is enabled
>> with mount option "mapposix"
>>
>> So without an ioctl to query the stream contents (or a new syscall),
>> you have to choose whether to either allow : in a filename or allow
>> opening streams.
>>
>> There is some additional information on some of the more important
>> uses in Windows for alternate data streams at the end of the article
>> in this link: https://blogs.technet.microsoft.com/askcore/2013/03/24/alternate-data-streams-in-ntfs/
>
> Sorry Steve, but none of the uses in there can be called "important".
>
> I personally have an intense dislike for streams in a filesystem,
> and was very disappointed when Microsoft re-added them to the
> previously streamless ReFS (probably for backwards compatibility
> stuff like this).
>
> There's no way to transfer stream-riddled files over the Internet,
> and the amount of code complexity we have in Samba having to deal
> with them is nasty and has lead to more than one security hole in
> the past.
>
> Please don't add this to Linux.

Well, I can avoid setting them, but I do have to be able to query them
for backup.

--
Thanks,

Steve