On Tue, Jan 12, 2021 at 10:48:00PM +0800, 吴异 wrote:
> Telling users how to configure the exported file system in the most secure
> way does
> mitigate the problem to some extent, but this does not seem to address the
> security risks posed by no_ subtree_ check in the code. In my opinion,when
> the generated filehandle does not contain the inode information of the
> parent directory,the nfsd_acceptable function can also recursively
> determine whether the request file exceeds the export path dentry.Enabling
> subtree_check to add parent directory information only brings some troubles.
Filesystems don't necessarily provide us with an efficient way to find
parent directories from any given file. (And note a single file may
have multiple parent directories.)
(I do wonder if we could do better in the directory case, though. We
already reconnect directories all the way back up to the root.)
> I have a bold idea, why not directly remove the file handle modification in
> subtree_check, and then normalize the judgment of whether dentry exceeds
> the export point directory in nfsd_acceptable (line 38 to 54 in
> /fs/nfsd/nfsfh.c) .
>
> As far as I understand it, the reason why subtree_check is not turned on by
> default is that it will cause problems when reading and writing files,
> rather than it wastes more time when nfsd_acceptable.
>
> In short,I think it's open to question whether the security of the system
> depends on the user's complete correct configuration(the system does not
> prohibit the export of a subdirectory).
> Enabling subtree_check to add parent directoryinformation only brings
> some troubles.
>
> In short,I think it's open to question whether the security of the
> system depends on the user's complete correct configuration(the system
> does not prohibit the export of a subdirectory).
I'd love to replace the export interface by one that prohibited
subdirectory exports (or at least made it more obvious where they're
being used.)
But given the interface we already have, that would be a disruptive and
time-consuming change.
Another approach is to add more entropy to filehandles so they're harder
to guess; see e.g.:
https://www.fsl.cs.stonybrook.edu/docs/nfscrack-tr/index.html
In the end none of these change the fact that a filehandle has an
infinite lifetime, so once it's leaked, there's nothing you can do. The
authors suggest NFSv4 volatile filehandles as a solution to that
problem, but I don't think they've thought through the obstacles to
making volatile filehandles work.
--b.
On Tue, 2021-01-12 at 10:32 -0500, J. Bruce Fields wrote:
> On Tue, Jan 12, 2021 at 10:48:00PM +0800, 吴异 wrote:
> > Telling users how to configure the exported file system in the most
> > secure
> > way does
> > mitigate the problem to some extent, but this does not seem to
> > address the
> > security risks posed by no_ subtree_ check in the code. In my
> > opinion,when
> > the generated filehandle does not contain the inode information of
> > the
> > parent directory,the nfsd_acceptable function can also recursively
> > determine whether the request file exceeds the export path
> > dentry.Enabling
> > subtree_check to add parent directory information only brings some
> > troubles.
>
> Filesystems don't necessarily provide us with an efficient way to
> find
> parent directories from any given file. (And note a single file may
> have multiple parent directories.)
>
> (I do wonder if we could do better in the directory case, though. We
> already reconnect directories all the way back up to the root.)
>
> > I have a bold idea, why not directly remove the file handle
> > modification in
> > subtree_check, and then normalize the judgment of whether dentry
> > exceeds
> > the export point directory in nfsd_acceptable (line 38 to 54 in
> > /fs/nfsd/nfsfh.c) .
> >
> > As far as I understand it, the reason why subtree_check is not
> > turned on by
> > default is that it will cause problems when reading and writing
> > files,
> > rather than it wastes more time when nfsd_acceptable.
> >
> > In short,I think it's open to question whether the security of the
> > system
> > depends on the user's complete correct configuration(the system
> > does not
> > prohibit the export of a subdirectory).
>
> > Enabling subtree_check to add parent directoryinformation only
> > brings
> > some troubles.
> >
> > In short,I think it's open to question whether the security of the
> > system depends on the user's complete correct configuration(the
> > system
> > does not prohibit the export of a subdirectory).
>
> I'd love to replace the export interface by one that prohibited
> subdirectory exports (or at least made it more obvious where they're
> being used.)
>
> But given the interface we already have, that would be a disruptive
> and
> time-consuming change.
>
> Another approach is to add more entropy to filehandles so they're
> harder
> to guess; see e.g.:
>
> https://www.fsl.cs.stonybrook.edu/docs/nfscrack-tr/index.html
>
> In the end none of these change the fact that a filehandle has an
> infinite lifetime, so once it's leaked, there's nothing you can do.
> The
> authors suggest NFSv4 volatile filehandles as a solution to that
> problem, but I don't think they've thought through the obstacles to
> making volatile filehandles work.
>
> --b.
The point is that there is no good solution to the 'I want to export a
subtree of a filesystem' problem, and so it is plainly wrong to try to
make a default of those solutions, which break the one sane case of
exporting the whole filesystem.
Just a reminder that we kicked out subtree_check not only because a
trivial rename of a file breaks the client's ability to perform I/O by
invalidating the filehandle. In addition, that option causes filehandle
aliasing (i.e. multiple filehandles pointing to the same file) which is
a major PITA for clients to try to manage for more or less the same
reason that it is a major PITA to try to manage these files using
paths.
The discussion on volatile filehandles in RFC5661 does try to address
some of the above issues, but ends up concluding that you need to
introduce POSIX-incompatible restrictions, such as trying to ban
renames and deletions of open files in order to make it work.
None of these compromises are necessary if you export a whole
filesystem (or a hierarchy of whole filesystems). That's the sane case.
That's the one that people should default to using.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
I was under the impression that the best practice is to create something
along the lines of
/srv/nfs
and then bind mount everything you plan to export into that folder; e.g.
/etc/fstab:
/data2/xray /srv/nfs/xray none defaults,bind 0
Presumably this becomes a non-issue under these circumstances? Not sure
it's a good idea to attempt to accommodate every wacky use case someone
attempts to implement.
On 1/12/21 10:53 AM, Trond Myklebust wrote:
> On Tue, 2021-01-12 at 10:32 -0500, J. Bruce Fields wrote:
>> On Tue, Jan 12, 2021 at 10:48:00PM +0800, 吴异 wrote:
>>> Telling users how to configure the exported file system in the most
>>> secure
>>> way does
>>> mitigate the problem to some extent, but this does not seem to
>>> address the
>>> security risks posed by no_ subtree_ check in the code. In my
>>> opinion,when
>>> the generated filehandle does not contain the inode information of
>>> the
>>> parent directory,the nfsd_acceptable function can also recursively
>>> determine whether the request file exceeds the export path
>>> dentry.Enabling
>>> subtree_check to add parent directory information only brings some
>>> troubles.
>>
>> Filesystems don't necessarily provide us with an efficient way to
>> find
>> parent directories from any given file. (And note a single file may
>> have multiple parent directories.)
>>
>> (I do wonder if we could do better in the directory case, though. We
>> already reconnect directories all the way back up to the root.)
>>
>>> I have a bold idea, why not directly remove the file handle
>>> modification in
>>> subtree_check, and then normalize the judgment of whether dentry
>>> exceeds
>>> the export point directory in nfsd_acceptable (line 38 to 54 in
>>> /fs/nfsd/nfsfh.c) .
>>>
>>> As far as I understand it, the reason why subtree_check is not
>>> turned on by
>>> default is that it will cause problems when reading and writing
>>> files,
>>> rather than it wastes more time when nfsd_acceptable.
>>>
>>> In short,I think it's open to question whether the security of the
>>> system
>>> depends on the user's complete correct configuration(the system
>>> does not
>>> prohibit the export of a subdirectory).
>>
>>> Enabling subtree_check to add parent directoryinformation only
>>> brings
>>> some troubles.
>>>
>>> In short,I think it's open to question whether the security of the
>>> system depends on the user's complete correct configuration(the
>>> system
>>> does not prohibit the export of a subdirectory).
>>
>> I'd love to replace the export interface by one that prohibited
>> subdirectory exports (or at least made it more obvious where they're
>> being used.)
>>
>> But given the interface we already have, that would be a disruptive
>> and
>> time-consuming change.
>>
>> Another approach is to add more entropy to filehandles so they're
>> harder
>> to guess; see e.g.:
>>
>> https://www.fsl.cs.stonybrook.edu/docs/nfscrack-tr/index.html
>>
>> In the end none of these change the fact that a filehandle has an
>> infinite lifetime, so once it's leaked, there's nothing you can do.
>> The
>> authors suggest NFSv4 volatile filehandles as a solution to that
>> problem, but I don't think they've thought through the obstacles to
>> making volatile filehandles work.
>>
>> --b.
>
> The point is that there is no good solution to the 'I want to export a
> subtree of a filesystem' problem, and so it is plainly wrong to try to
> make a default of those solutions, which break the one sane case of
> exporting the whole filesystem.
>
> Just a reminder that we kicked out subtree_check not only because a
> trivial rename of a file breaks the client's ability to perform I/O by
> invalidating the filehandle. In addition, that option causes filehandle
> aliasing (i.e. multiple filehandles pointing to the same file) which is
> a major PITA for clients to try to manage for more or less the same
> reason that it is a major PITA to try to manage these files using
> paths.
>
> The discussion on volatile filehandles in RFC5661 does try to address
> some of the above issues, but ends up concluding that you need to
> introduce POSIX-incompatible restrictions, such as trying to ban
> renames and deletions of open files in order to make it work.
>
> None of these compromises are necessary if you export a whole
> filesystem (or a hierarchy of whole filesystems). That's the sane case.
> That's the one that people should default to using.
>
On Tue, Jan 12, 2021 at 11:20:28AM -0600, Patrick Goetz wrote:
> I was under the impression that the best practice is to create
> something along the lines of
>
> /srv/nfs
>
> and then bind mount everything you plan to export into that folder; e.g.
>
> /etc/fstab:
> /data2/xray /srv/nfs/xray none defaults,bind 0
You can do that if you'd like. I doesn't make much difference here.
You can think of a filehandle as just a (device number, inode number)
pair. (It's actually more complicated, but ignore that for now.)
So if the server's given a filehandle, it can easily determine the
filehandle is for an object on /dev/sda2. It *cannot* easily determine
whether that object is somewhere underneath /some/directory.
So in your example, if /data2/xray is on the same filesystem as /data2,
then the server will happily allow operations on filehandles anywhere in
/data2.
Every export point should be the root of a filesystem.
--b.
>
> Presumably this becomes a non-issue under these circumstances? Not
> sure it's a good idea to attempt to accommodate every wacky use case
> someone attempts to implement.
>
>
> On 1/12/21 10:53 AM, Trond Myklebust wrote:
> >On Tue, 2021-01-12 at 10:32 -0500, J. Bruce Fields wrote:
> >>On Tue, Jan 12, 2021 at 10:48:00PM +0800, 吴异 wrote:
> >>>Telling users how to configure the exported file system in the most
> >>>secure
> >>>way does
> >>>mitigate the problem to some extent, but this does not seem to
> >>>address the
> >>>security risks posed by no_ subtree_ check in the code. In my
> >>>opinion,when
> >>>the generated filehandle does not contain the inode information of
> >>>the
> >>>parent directory,the nfsd_acceptable function can also recursively
> >>>determine whether the request file exceeds the export path
> >>>dentry.Enabling
> >>>subtree_check to add parent directory information only brings some
> >>>troubles.
> >>
> >>Filesystems don't necessarily provide us with an efficient way to
> >>find
> >>parent directories from any given file. (And note a single file may
> >>have multiple parent directories.)
> >>
> >>(I do wonder if we could do better in the directory case, though. We
> >>already reconnect directories all the way back up to the root.)
> >>
> >>>I have a bold idea, why not directly remove the file handle
> >>>modification in
> >>>subtree_check, and then normalize the judgment of whether dentry
> >>>exceeds
> >>>the export point directory in nfsd_acceptable (line 38 to 54 in
> >>>/fs/nfsd/nfsfh.c) .
> >>>
> >>>As far as I understand it, the reason why subtree_check is not
> >>>turned on by
> >>>default is that it will cause problems when reading and writing
> >>>files,
> >>>rather than it wastes more time when nfsd_acceptable.
> >>>
> >>>In short,I think it's open to question whether the security of the
> >>>system
> >>>depends on the user's complete correct configuration(the system
> >>>does not
> >>>prohibit the export of a subdirectory).
> >>
> >>>Enabling subtree_check to add parent directoryinformation only
> >>>brings
> >>>some troubles.
> >>>
> >>>In short,I think it's open to question whether the security of the
> >>>system depends on the user's complete correct configuration(the
> >>>system
> >>>does not prohibit the export of a subdirectory).
> >>
> >>I'd love to replace the export interface by one that prohibited
> >>subdirectory exports (or at least made it more obvious where they're
> >>being used.)
> >>
> >>But given the interface we already have, that would be a disruptive
> >>and
> >>time-consuming change.
> >>
> >>Another approach is to add more entropy to filehandles so they're
> >>harder
> >>to guess; see e.g.:
> >>
> >> https://www.fsl.cs.stonybrook.edu/docs/nfscrack-tr/index.html
> >>
> >>In the end none of these change the fact that a filehandle has an
> >>infinite lifetime, so once it's leaked, there's nothing you can do.
> >>The
> >>authors suggest NFSv4 volatile filehandles as a solution to that
> >>problem, but I don't think they've thought through the obstacles to
> >>making volatile filehandles work.
> >>
> >>--b.
> >
> >The point is that there is no good solution to the 'I want to export a
> >subtree of a filesystem' problem, and so it is plainly wrong to try to
> >make a default of those solutions, which break the one sane case of
> >exporting the whole filesystem.
> >
> >Just a reminder that we kicked out subtree_check not only because a
> >trivial rename of a file breaks the client's ability to perform I/O by
> >invalidating the filehandle. In addition, that option causes filehandle
> >aliasing (i.e. multiple filehandles pointing to the same file) which is
> >a major PITA for clients to try to manage for more or less the same
> >reason that it is a major PITA to try to manage these files using
> >paths.
> >
> >The discussion on volatile filehandles in RFC5661 does try to address
> >some of the above issues, but ends up concluding that you need to
> >introduce POSIX-incompatible restrictions, such as trying to ban
> >renames and deletions of open files in order to make it work.
> >
> >None of these compromises are necessary if you export a whole
> >filesystem (or a hierarchy of whole filesystems). That's the sane case.
> >That's the one that people should default to using.
> >
FYI, if people really want to use some sort of subtree exports, for XFS
(and probably ext4) we encode the project id into the file handle and
use the hierarchical project ID inheritance that we already use for
project quotas.
On Wed, 2021-01-13 at 08:12 +0000, Christoph Hellwig wrote:
> FYI, if people really want to use some sort of subtree exports, for
> XFS
> (and probably ext4) we encode the project id into the file handle and
> use the hierarchical project ID inheritance that we already use for
> project quotas.
You'd basically need something along the lines of a NetApp qtree.
i.e. a persisted tag that can label all the files and directories in a
subtree, and that is used to enforce a set of rules that are generally
normally associated with filesystems. So no renames from objects inside
the subtree to directories that lie outside it. No hard links that
cross the subtree boundary.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
On Wed, Jan 13, 2021 at 02:34:45PM +0000, Trond Myklebust wrote:
> On Wed, 2021-01-13 at 08:12 +0000, Christoph Hellwig wrote:
> > FYI, if people really want to use some sort of subtree exports, for
> > XFS
> > (and probably ext4) we encode the project id into the file handle and
> > use the hierarchical project ID inheritance that we already use for
> > project quotas.
>
> You'd basically need something along the lines of a NetApp qtree.
>
> i.e. a persisted tag that can label all the files and directories in a
> subtree, and that is used to enforce a set of rules that are generally
> normally associated with filesystems. So no renames from objects inside
> the subtree to directories that lie outside it. No hard links that
> cross the subtree boundary.
That is the XFS project ID, which ext4 has also picked up a few years
ago.
On Wed, 2021-01-13 at 14:40 +0000, [email protected] wrote:
> On Wed, Jan 13, 2021 at 02:34:45PM +0000, Trond Myklebust wrote:
> > On Wed, 2021-01-13 at 08:12 +0000, Christoph Hellwig wrote:
> > > FYI, if people really want to use some sort of subtree exports,
> > > for
> > > XFS
> > > (and probably ext4) we encode the project id into the file handle
> > > and
> > > use the hierarchical project ID inheritance that we already use
> > > for
> > > project quotas.
> >
> > You'd basically need something along the lines of a NetApp qtree.
> >
> > i.e. a persisted tag that can label all the files and directories
> > in a
> > subtree, and that is used to enforce a set of rules that are
> > generally
> > normally associated with filesystems. So no renames from objects
> > inside
> > the subtree to directories that lie outside it. No hard links that
> > cross the subtree boundary.
>
> That is the XFS project ID, which ext4 has also picked up a few years
> ago.
How would that work then? Would you just look at the project ID of the
directory identified by the filehandle as the export point, and then
match to the project ID on the target inode? That sounds like it
doesn't even need to encode anything special in the filehandle.
How do you set a project ID in XFS?
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
On Wed, Jan 13, 2021 at 03:16:52PM +0000, Trond Myklebust wrote:
> How would that work then? Would you just look at the project ID of the
> directory identified by the filehandle as the export point, and then
> match to the project ID on the target inode? That sounds like it
> doesn't even need to encode anything special in the filehandle.
True, we would not even have to encode them.
> How do you set a project ID in XFS?
With the XFS_IOC_SETXFLAGS ioctl.
On the command line side people usually do it using the xfs_quota
tool as part of setting up the tree quotas, but it can also be done
separately using the chproj subcommand of xfs_io.
> On Wed, Jan 13, 2021 at 03:16:52PM +0000, Trond Myklebust wrote:
> > How would that work then? Would you just look at the project ID of the
> > directory identified by the filehandle as the export point, and then
> > match to the project ID on the target inode? That sounds like it
> > doesn't even need to encode anything special in the filehandle.
>
> True, we would not even have to encode them.
>
> > How do you set a project ID in XFS?
>
> With the XFS_IOC_SETXFLAGS ioctl.
>
> On the command line side people usually do it using the xfs_quota tool as
part
> of setting up the tree quotas, but it can also be done separately using
the chproj
> subcommand of xfs_io.
Is this also queried via the ioctl? If this is a viable way of specifying
sub-trees, nfs-ganesha could also use it, though making an ioctl call for
each file system object would add some metadata performance hit.
Frank
hello,
I want to consult you on what is the original intention of designing
subtree_check and whether it is to solve the 'I want to export a
subtree of a filesystem' problem.
As far as I know, when opening subtree_check, the folder's file
handle does not contain the inode information of its parent directory
and
'while (tdentry != exp->ex_path.dentry && !IS_ROOT(tdentry))' in
nfsd_acceptable can work well to Intercept handles beyond the export
point.
This seems to delete code as follows in nfsfh.c could solve the 'I
want to export a subtree of a filesystem' problem and ensure safety:
if (exp->ex_flags & NFSEXP_NOSUBTREECHECK)
return 1;
Or replace by follow:
if (exp->ex_path.dentry == exp->vfs_mount->mnt_root)
return 1;
When I was reading the nfsd code, I was confused about whether the
designer used the file system as a security boundary or an export
point.Since exporting a complete file system is the safest, why not
directly prohibit unsafe practices, but add code like subtree_check to
try to verify the file handle.
I may not understand your design ideas.
Yours sincerely,
Trond Myklebust <[email protected]> 于2021年1月13日周三 上午12:53写道:
>
> On Tue, 2021-01-12 at 10:32 -0500, J. Bruce Fields wrote:
> > On Tue, Jan 12, 2021 at 10:48:00PM +0800, 吴异 wrote:
> > > Telling users how to configure the exported file system in the most
> > > secure
> > > way does
> > > mitigate the problem to some extent, but this does not seem to
> > > address the
> > > security risks posed by no_ subtree_ check in the code. In my
> > > opinion,when
> > > the generated filehandle does not contain the inode information of
> > > the
> > > parent directory,the nfsd_acceptable function can also recursively
> > > determine whether the request file exceeds the export path
> > > dentry.Enabling
> > > subtree_check to add parent directory information only brings some
> > > troubles.
> >
> > Filesystems don't necessarily provide us with an efficient way to
> > find
> > parent directories from any given file. (And note a single file may
> > have multiple parent directories.)
> >
> > (I do wonder if we could do better in the directory case, though. We
> > already reconnect directories all the way back up to the root.)
> >
> > > I have a bold idea, why not directly remove the file handle
> > > modification in
> > > subtree_check, and then normalize the judgment of whether dentry
> > > exceeds
> > > the export point directory in nfsd_acceptable (line 38 to 54 in
> > > /fs/nfsd/nfsfh.c) .
> > >
> > > As far as I understand it, the reason why subtree_check is not
> > > turned on by
> > > default is that it will cause problems when reading and writing
> > > files,
> > > rather than it wastes more time when nfsd_acceptable.
> > >
> > > In short,I think it's open to question whether the security of the
> > > system
> > > depends on the user's complete correct configuration(the system
> > > does not
> > > prohibit the export of a subdirectory).
> >
> > > Enabling subtree_check to add parent directoryinformation only
> > > brings
> > > some troubles.
> > >
> > > In short,I think it's open to question whether the security of the
> > > system depends on the user's complete correct configuration(the
> > > system
> > > does not prohibit the export of a subdirectory).
> >
> > I'd love to replace the export interface by one that prohibited
> > subdirectory exports (or at least made it more obvious where they're
> > being used.)
> >
> > But given the interface we already have, that would be a disruptive
> > and
> > time-consuming change.
> >
> > Another approach is to add more entropy to filehandles so they're
> > harder
> > to guess; see e.g.:
> >
> > https://www.fsl.cs.stonybrook.edu/docs/nfscrack-tr/index.html
> >
> > In the end none of these change the fact that a filehandle has an
> > infinite lifetime, so once it's leaked, there's nothing you can do.
> > The
> > authors suggest NFSv4 volatile filehandles as a solution to that
> > problem, but I don't think they've thought through the obstacles to
> > making volatile filehandles work.
> >
> > --b.
>
> The point is that there is no good solution to the 'I want to export a
> subtree of a filesystem' problem, and so it is plainly wrong to try to
> make a default of those solutions, which break the one sane case of
> exporting the whole filesystem.
>
> Just a reminder that we kicked out subtree_check not only because a
> trivial rename of a file breaks the client's ability to perform I/O by
> invalidating the filehandle. In addition, that option causes filehandle
> aliasing (i.e. multiple filehandles pointing to the same file) which is
> a major PITA for clients to try to manage for more or less the same
> reason that it is a major PITA to try to manage these files using
> paths.
>
> The discussion on volatile filehandles in RFC5661 does try to address
> some of the above issues, but ends up concluding that you need to
> introduce POSIX-incompatible restrictions, such as trying to ban
> renames and deletions of open files in order to make it work.
>
> None of these compromises are necessary if you export a whole
> filesystem (or a hierarchy of whole filesystems). That's the sane case.
> That's the one that people should default to using.
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]
>
>
On Tue, Jan 19, 2021 at 12:29:28AM +0800, 吴异 wrote:
> I want to consult you on what is the original intention of designing
> subtree_check and whether it is to solve the 'I want to export a
> subtree of a filesystem' problem.
>
> As far as I know, when opening subtree_check, the folder's file
> handle does not contain the inode information of its parent directory
> and
> 'while (tdentry != exp->ex_path.dentry && !IS_ROOT(tdentry))' in
> nfsd_acceptable can work well to Intercept handles beyond the export
> point.
>
> This seems to delete code as follows in nfsfh.c could solve the 'I
> want to export a subtree of a filesystem' problem and ensure safety:
> if (exp->ex_flags & NFSEXP_NOSUBTREECHECK)
> return 1;
>
> Or replace by follow:
> if (exp->ex_path.dentry == exp->vfs_mount->mnt_root)
> return 1;
>
> When I was reading the nfsd code, I was confused about whether the
> designer used the file system as a security boundary or an export
> point.Since exporting a complete file system is the safest, why not
> directly prohibit unsafe practices, but add code like subtree_check to
> try to verify the file handle.
Sorry, I honestly don't understand the question.
If you have a specific proposal, perhaps you could send a patch.
--b.
>
> I may not understand your design ideas.
>
> Yours sincerely,
>
> Trond Myklebust <[email protected]> 于2021年1月13日周三 上午12:53写道:
> >
> > On Tue, 2021-01-12 at 10:32 -0500, J. Bruce Fields wrote:
> > > On Tue, Jan 12, 2021 at 10:48:00PM +0800, 吴异 wrote:
> > > > Telling users how to configure the exported file system in the most
> > > > secure
> > > > way does
> > > > mitigate the problem to some extent, but this does not seem to
> > > > address the
> > > > security risks posed by no_ subtree_ check in the code. In my
> > > > opinion,when
> > > > the generated filehandle does not contain the inode information of
> > > > the
> > > > parent directory,the nfsd_acceptable function can also recursively
> > > > determine whether the request file exceeds the export path
> > > > dentry.Enabling
> > > > subtree_check to add parent directory information only brings some
> > > > troubles.
> > >
> > > Filesystems don't necessarily provide us with an efficient way to
> > > find
> > > parent directories from any given file. (And note a single file may
> > > have multiple parent directories.)
> > >
> > > (I do wonder if we could do better in the directory case, though. We
> > > already reconnect directories all the way back up to the root.)
> > >
> > > > I have a bold idea, why not directly remove the file handle
> > > > modification in
> > > > subtree_check, and then normalize the judgment of whether dentry
> > > > exceeds
> > > > the export point directory in nfsd_acceptable (line 38 to 54 in
> > > > /fs/nfsd/nfsfh.c) .
> > > >
> > > > As far as I understand it, the reason why subtree_check is not
> > > > turned on by
> > > > default is that it will cause problems when reading and writing
> > > > files,
> > > > rather than it wastes more time when nfsd_acceptable.
> > > >
> > > > In short,I think it's open to question whether the security of the
> > > > system
> > > > depends on the user's complete correct configuration(the system
> > > > does not
> > > > prohibit the export of a subdirectory).
> > >
> > > > Enabling subtree_check to add parent directoryinformation only
> > > > brings
> > > > some troubles.
> > > >
> > > > In short,I think it's open to question whether the security of the
> > > > system depends on the user's complete correct configuration(the
> > > > system
> > > > does not prohibit the export of a subdirectory).
> > >
> > > I'd love to replace the export interface by one that prohibited
> > > subdirectory exports (or at least made it more obvious where they're
> > > being used.)
> > >
> > > But given the interface we already have, that would be a disruptive
> > > and
> > > time-consuming change.
> > >
> > > Another approach is to add more entropy to filehandles so they're
> > > harder
> > > to guess; see e.g.:
> > >
> > > https://www.fsl.cs.stonybrook.edu/docs/nfscrack-tr/index.html
> > >
> > > In the end none of these change the fact that a filehandle has an
> > > infinite lifetime, so once it's leaked, there's nothing you can do.
> > > The
> > > authors suggest NFSv4 volatile filehandles as a solution to that
> > > problem, but I don't think they've thought through the obstacles to
> > > making volatile filehandles work.
> > >
> > > --b.
> >
> > The point is that there is no good solution to the 'I want to export a
> > subtree of a filesystem' problem, and so it is plainly wrong to try to
> > make a default of those solutions, which break the one sane case of
> > exporting the whole filesystem.
> >
> > Just a reminder that we kicked out subtree_check not only because a
> > trivial rename of a file breaks the client's ability to perform I/O by
> > invalidating the filehandle. In addition, that option causes filehandle
> > aliasing (i.e. multiple filehandles pointing to the same file) which is
> > a major PITA for clients to try to manage for more or less the same
> > reason that it is a major PITA to try to manage these files using
> > paths.
> >
> > The discussion on volatile filehandles in RFC5661 does try to address
> > some of the above issues, but ends up concluding that you need to
> > introduce POSIX-incompatible restrictions, such as trying to ban
> > renames and deletions of open files in order to make it work.
> >
> > None of these compromises are necessary if you export a whole
> > filesystem (or a hierarchy of whole filesystems). That's the sane case.
> > That's the one that people should default to using.
> >
> > --
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > [email protected]
> >
> >
My patch is below:
/fs/nfsd/nfsfh.c
nfsd_acceptable(void expv,struct dentry *dentry)
{
- if(exp->ex_flags & NFSEXP_NOSUBTREEXHECK)
- return 1;
+ if(is_root_export(exp))
+ return 1;
+ /* If not subdirectory export, accept anything*/
}
在 2021年1月19日星期二,[email protected] <[email protected]> 写道:
> On Tue, Jan 19, 2021 at 12:29:28AM +0800, 吴异 wrote:
>> I want to consult you on what is the original intention of designing
>> subtree_check and whether it is to solve the 'I want to export a
>> subtree of a filesystem' problem.
>>
>> As far as I know, when opening subtree_check, the folder's file
>> handle does not contain the inode information of its parent directory
>> and
>> 'while (tdentry != exp->ex_path.dentry && !IS_ROOT(tdentry))' in
>> nfsd_acceptable can work well to Intercept handles beyond the export
>> point.
>>
>> This seems to delete code as follows in nfsfh.c could solve the 'I
>> want to export a subtree of a filesystem' problem and ensure safety:
>> if (exp->ex_flags & NFSEXP_NOSUBTREECHECK)
>> return 1;
>>
>> Or replace by follow:
>> if (exp->ex_path.dentry == exp->vfs_mount->mnt_root)
>> return 1;
>>
>> When I was reading the nfsd code, I was confused about whether the
>> designer used the file system as a security boundary or an export
>> point.Since exporting a complete file system is the safest, why not
>> directly prohibit unsafe practices, but add code like subtree_check to
>> try to verify the file handle.
>
> Sorry, I honestly don't understand the question.
>
> If you have a specific proposal, perhaps you could send a patch.
>
> --b.
>
>>
>> I may not understand your design ideas.
>>
>> Yours sincerely,
>>
>> Trond Myklebust <[email protected]> 于2021年1月13日周三 上午12:53写道:
>> >
>> > On Tue, 2021-01-12 at 10:32 -0500, J. Bruce Fields wrote:
>> > > On Tue, Jan 12, 2021 at 10:48:00PM +0800, 吴异 wrote:
>> > > > Telling users how to configure the exported file system in the most
>> > > > secure
>> > > > way does
>> > > > mitigate the problem to some extent, but this does not seem to
>> > > > address the
>> > > > security risks posed by no_ subtree_ check in the code. In my
>> > > > opinion,when
>> > > > the generated filehandle does not contain the inode information of
>> > > > the
>> > > > parent directory,the nfsd_acceptable function can also recursively
>> > > > determine whether the request file exceeds the export path
>> > > > dentry.Enabling
>> > > > subtree_check to add parent directory information only brings some
>> > > > troubles.
>> > >
>> > > Filesystems don't necessarily provide us with an efficient way to
>> > > find
>> > > parent directories from any given file. (And note a single file may
>> > > have multiple parent directories.)
>> > >
>> > > (I do wonder if we could do better in the directory case, though. We
>> > > already reconnect directories all the way back up to the root.)
>> > >
>> > > > I have a bold idea, why not directly remove the file handle
>> > > > modification in
>> > > > subtree_check, and then normalize the judgment of whether dentry
>> > > > exceeds
>> > > > the export point directory in nfsd_acceptable (line 38 to 54 in
>> > > > /fs/nfsd/nfsfh.c) .
>> > > >
>> > > > As far as I understand it, the reason why subtree_check is not
>> > > > turned on by
>> > > > default is that it will cause problems when reading and writing
>> > > > files,
>> > > > rather than it wastes more time when nfsd_acceptable.
>> > > >
>> > > > In short,I think it's open to question whether the security of the
>> > > > system
>> > > > depends on the user's complete correct configuration(the system
>> > > > does not
>> > > > prohibit the export of a subdirectory).
>> > >
>> > > > Enabling subtree_check to add parent directoryinformation only
>> > > > brings
>> > > > some troubles.
>> > > >
>> > > > In short,I think it's open to question whether the security of the
>> > > > system depends on the user's complete correct configuration(the
>> > > > system
>> > > > does not prohibit the export of a subdirectory).
>> > >
>> > > I'd love to replace the export interface by one that prohibited
>> > > subdirectory exports (or at least made it more obvious where they're
>> > > being used.)
>> > >
>> > > But given the interface we already have, that would be a disruptive
>> > > and
>> > > time-consuming change.
>> > >
>> > > Another approach is to add more entropy to filehandles so they're
>> > > harder
>> > > to guess; see e.g.:
>> > >
>> > > https://www.fsl.cs.stonybrook.edu/docs/nfscrack-tr/index.html
>> > >
>> > > In the end none of these change the fact that a filehandle has an
>> > > infinite lifetime, so once it's leaked, there's nothing you can do.
>> > > The
>> > > authors suggest NFSv4 volatile filehandles as a solution to that
>> > > problem, but I don't think they've thought through the obstacles to
>> > > making volatile filehandles work.
>> > >
>> > > --b.
>> >
>> > The point is that there is no good solution to the 'I want to export a
>> > subtree of a filesystem' problem, and so it is plainly wrong to try to
>> > make a default of those solutions, which break the one sane case of
>> > exporting the whole filesystem.
>> >
>> > Just a reminder that we kicked out subtree_check not only because a
>> > trivial rename of a file breaks the client's ability to perform I/O by
>> > invalidating the filehandle. In addition, that option causes filehandle
>> > aliasing (i.e. multiple filehandles pointing to the same file) which is
>> > a major PITA for clients to try to manage for more or less the same
>> > reason that it is a major PITA to try to manage these files using
>> > paths.
>> >
>> > The discussion on volatile filehandles in RFC5661 does try to address
>> > some of the above issues, but ends up concluding that you need to
>> > introduce POSIX-incompatible restrictions, such as trying to ban
>> > renames and deletions of open files in order to make it work.
>> >
>> > None of these compromises are necessary if you export a whole
>> > filesystem (or a hierarchy of whole filesystems). That's the sane case.
>> > That's the one that people should default to using.
>> >
>> > --
>> > Trond Myklebust
>> > Linux NFS client maintainer, Hammerspace
>> > [email protected]
>> >
>> >
>
On Tue, Jan 19, 2021 at 10:48:22AM +0800, 吴异 wrote:
> My patch is below:
> /fs/nfsd/nfsfh.c
>
> nfsd_acceptable(void expv,struct dentry *dentry)
> {
> - if(exp->ex_flags & NFSEXP_NOSUBTREEXHECK)
> - return 1;
>
> + if(is_root_export(exp))
> + return 1;
So you end up doing the SUBTREECHECK checks in cases where the
filehandle doesn't actually have any parent information. That means we
may be stuck with no way to connect the file back up the export point,
so this check will fail even though the file may be under the export
point. So users probably get spurious ESTALE errors.
To reproduce you probably need to do something like open a file on the
client, reboot the server, then try to access the open file again. The
dentry cache will no longer know how to look up the parents after the
reboot.
--b.
>
> + /* If not subdirectory export, accept anything*/
>
> }
>
>
>
>
> 在 2021年1月19日星期二,[email protected] <[email protected]> 写道:
> > On Tue, Jan 19, 2021 at 12:29:28AM +0800, 吴异 wrote:
> >> I want to consult you on what is the original intention of designing
> >> subtree_check and whether it is to solve the 'I want to export a
> >> subtree of a filesystem' problem.
> >>
> >> As far as I know, when opening subtree_check, the folder's file
> >> handle does not contain the inode information of its parent directory
> >> and
> >> 'while (tdentry != exp->ex_path.dentry && !IS_ROOT(tdentry))' in
> >> nfsd_acceptable can work well to Intercept handles beyond the export
> >> point.
> >>
> >> This seems to delete code as follows in nfsfh.c could solve the 'I
> >> want to export a subtree of a filesystem' problem and ensure safety:
> >> if (exp->ex_flags & NFSEXP_NOSUBTREECHECK)
> >> return 1;
> >>
> >> Or replace by follow:
> >> if (exp->ex_path.dentry == exp->vfs_mount->mnt_root)
> >> return 1;
> >>
> >> When I was reading the nfsd code, I was confused about whether the
> >> designer used the file system as a security boundary or an export
> >> point.Since exporting a complete file system is the safest, why not
> >> directly prohibit unsafe practices, but add code like subtree_check to
> >> try to verify the file handle.
> >
> > Sorry, I honestly don't understand the question.
> >
> > If you have a specific proposal, perhaps you could send a patch.
> >
> > --b.
> >
> >>
> >> I may not understand your design ideas.
> >>
> >> Yours sincerely,
> >>
> >> Trond Myklebust <[email protected]> 于2021年1月13日周三 上午12:53写道:
> >> >
> >> > On Tue, 2021-01-12 at 10:32 -0500, J. Bruce Fields wrote:
> >> > > On Tue, Jan 12, 2021 at 10:48:00PM +0800, 吴异 wrote:
> >> > > > Telling users how to configure the exported file system in the most
> >> > > > secure
> >> > > > way does
> >> > > > mitigate the problem to some extent, but this does not seem to
> >> > > > address the
> >> > > > security risks posed by no_ subtree_ check in the code. In my
> >> > > > opinion,when
> >> > > > the generated filehandle does not contain the inode information of
> >> > > > the
> >> > > > parent directory,the nfsd_acceptable function can also recursively
> >> > > > determine whether the request file exceeds the export path
> >> > > > dentry.Enabling
> >> > > > subtree_check to add parent directory information only brings some
> >> > > > troubles.
> >> > >
> >> > > Filesystems don't necessarily provide us with an efficient way to
> >> > > find
> >> > > parent directories from any given file. (And note a single file may
> >> > > have multiple parent directories.)
> >> > >
> >> > > (I do wonder if we could do better in the directory case, though. We
> >> > > already reconnect directories all the way back up to the root.)
> >> > >
> >> > > > I have a bold idea, why not directly remove the file handle
> >> > > > modification in
> >> > > > subtree_check, and then normalize the judgment of whether dentry
> >> > > > exceeds
> >> > > > the export point directory in nfsd_acceptable (line 38 to 54 in
> >> > > > /fs/nfsd/nfsfh.c) .
> >> > > >
> >> > > > As far as I understand it, the reason why subtree_check is not
> >> > > > turned on by
> >> > > > default is that it will cause problems when reading and writing
> >> > > > files,
> >> > > > rather than it wastes more time when nfsd_acceptable.
> >> > > >
> >> > > > In short,I think it's open to question whether the security of the
> >> > > > system
> >> > > > depends on the user's complete correct configuration(the system
> >> > > > does not
> >> > > > prohibit the export of a subdirectory).
> >> > >
> >> > > > Enabling subtree_check to add parent directoryinformation only
> >> > > > brings
> >> > > > some troubles.
> >> > > >
> >> > > > In short,I think it's open to question whether the security of the
> >> > > > system depends on the user's complete correct configuration(the
> >> > > > system
> >> > > > does not prohibit the export of a subdirectory).
> >> > >
> >> > > I'd love to replace the export interface by one that prohibited
> >> > > subdirectory exports (or at least made it more obvious where they're
> >> > > being used.)
> >> > >
> >> > > But given the interface we already have, that would be a disruptive
> >> > > and
> >> > > time-consuming change.
> >> > >
> >> > > Another approach is to add more entropy to filehandles so they're
> >> > > harder
> >> > > to guess; see e.g.:
> >> > >
> >> > > https://www.fsl.cs.stonybrook.edu/docs/nfscrack-tr/index.html
> >> > >
> >> > > In the end none of these change the fact that a filehandle has an
> >> > > infinite lifetime, so once it's leaked, there's nothing you can do.
> >> > > The
> >> > > authors suggest NFSv4 volatile filehandles as a solution to that
> >> > > problem, but I don't think they've thought through the obstacles to
> >> > > making volatile filehandles work.
> >> > >
> >> > > --b.
> >> >
> >> > The point is that there is no good solution to the 'I want to export a
> >> > subtree of a filesystem' problem, and so it is plainly wrong to try to
> >> > make a default of those solutions, which break the one sane case of
> >> > exporting the whole filesystem.
> >> >
> >> > Just a reminder that we kicked out subtree_check not only because a
> >> > trivial rename of a file breaks the client's ability to perform I/O by
> >> > invalidating the filehandle. In addition, that option causes filehandle
> >> > aliasing (i.e. multiple filehandles pointing to the same file) which is
> >> > a major PITA for clients to try to manage for more or less the same
> >> > reason that it is a major PITA to try to manage these files using
> >> > paths.
> >> >
> >> > The discussion on volatile filehandles in RFC5661 does try to address
> >> > some of the above issues, but ends up concluding that you need to
> >> > introduce POSIX-incompatible restrictions, such as trying to ban
> >> > renames and deletions of open files in order to make it work.
> >> >
> >> > None of these compromises are necessary if you export a whole
> >> > filesystem (or a hierarchy of whole filesystems). That's the sane case.
> >> > That's the one that people should default to using.
> >> >
> >> > --
> >> > Trond Myklebust
> >> > Linux NFS client maintainer, Hammerspace
> >> > [email protected]
> >> >
> >> >
> >
See below.
On 1/12/21 12:03 PM, [email protected] wrote:
> On Tue, Jan 12, 2021 at 11:20:28AM -0600, Patrick Goetz wrote:
>> I was under the impression that the best practice is to create
>> something along the lines of
>>
>> /srv/nfs
>>
>> and then bind mount everything you plan to export into that folder; e.g.
>>
>> /etc/fstab:
>> /data2/xray /srv/nfs/xray none defaults,bind 0
>
> You can do that if you'd like. I doesn't make much difference here.
>
> You can think of a filehandle as just a (device number, inode number)
> pair. (It's actually more complicated, but ignore that for now.)
>
> So if the server's given a filehandle, it can easily determine the
> filehandle is for an object on /dev/sda2. It *cannot* easily determine
> whether that object is somewhere underneath /some/directory.
>
> So in your example, if /data2/xray is on the same filesystem as /data2,
> then the server will happily allow operations on filehandles anywhere in
> /data2.
>
> Every export point should be the root of a filesystem.
>
> --b.
>
I didn't respond to this message immediately, but it's been bothering me
ever since. When I do a bind mount like this in /etc/fstab:
/data2/xray /srv/nfs/xray none defaults,bind 0
it's my understanding that the kernel keeps track of the resulting
/srv/nfs/xray filesystem in it's vfs somehow. Even when directly on the
server I can't "break out" of /srv/nfs/xray to get to the other
directories in /data. Then how on earth would an NFS client do this?
I thought the whole point of doing a bind mount like this is to solve
the problem of exporting leaves of a directory hierarchy. In particular,
"So in your example, if /data2/xray is on the same filesystem as
/data2, then the server will happily allow operations on
filehandles anywhere in /data2."
Yes, sure; but I'm not exporting /data2/xray; I'm exporting
/srv/nfs/xray, a bind mount to the preceding. Am I missing something,
or is NFS too insecure to use in any context requiring differentiated
security settings on different folders in the same directory structure?
It's not practical to making everything you export its own partition;
although I suppose one could do this with ZFS datasets.
>>
>> Presumably this becomes a non-issue under these circumstances? Not
>> sure it's a good idea to attempt to accommodate every wacky use case
>> someone attempts to implement.
>>
>>
>> On 1/12/21 10:53 AM, Trond Myklebust wrote:
>>> On Tue, 2021-01-12 at 10:32 -0500, J. Bruce Fields wrote:
>>>> On Tue, Jan 12, 2021 at 10:48:00PM +0800, 吴异 wrote:
>>>>> Telling users how to configure the exported file system in the most
>>>>> secure
>>>>> way does
>>>>> mitigate the problem to some extent, but this does not seem to
>>>>> address the
>>>>> security risks posed by no_ subtree_ check in the code. In my
>>>>> opinion,when
>>>>> the generated filehandle does not contain the inode information of
>>>>> the
>>>>> parent directory,the nfsd_acceptable function can also recursively
>>>>> determine whether the request file exceeds the export path
>>>>> dentry.Enabling
>>>>> subtree_check to add parent directory information only brings some
>>>>> troubles.
>>>>
>>>> Filesystems don't necessarily provide us with an efficient way to
>>>> find
>>>> parent directories from any given file. (And note a single file may
>>>> have multiple parent directories.)
>>>>
>>>> (I do wonder if we could do better in the directory case, though. We
>>>> already reconnect directories all the way back up to the root.)
>>>>
>>>>> I have a bold idea, why not directly remove the file handle
>>>>> modification in
>>>>> subtree_check, and then normalize the judgment of whether dentry
>>>>> exceeds
>>>>> the export point directory in nfsd_acceptable (line 38 to 54 in
>>>>> /fs/nfsd/nfsfh.c) .
>>>>>
>>>>> As far as I understand it, the reason why subtree_check is not
>>>>> turned on by
>>>>> default is that it will cause problems when reading and writing
>>>>> files,
>>>>> rather than it wastes more time when nfsd_acceptable.
>>>>>
>>>>> In short,I think it's open to question whether the security of the
>>>>> system
>>>>> depends on the user's complete correct configuration(the system
>>>>> does not
>>>>> prohibit the export of a subdirectory).
>>>>
>>>>> Enabling subtree_check to add parent directoryinformation only
>>>>> brings
>>>>> some troubles.
>>>>>
>>>>> In short,I think it's open to question whether the security of the
>>>>> system depends on the user's complete correct configuration(the
>>>>> system
>>>>> does not prohibit the export of a subdirectory).
>>>>
>>>> I'd love to replace the export interface by one that prohibited
>>>> subdirectory exports (or at least made it more obvious where they're
>>>> being used.)
>>>>
>>>> But given the interface we already have, that would be a disruptive
>>>> and
>>>> time-consuming change.
>>>>
>>>> Another approach is to add more entropy to filehandles so they're
>>>> harder
>>>> to guess; see e.g.:
>>>>
>>>> https://www.fsl.cs.stonybrook.edu/docs/nfscrack-tr/index.html
>>>>
>>>> In the end none of these change the fact that a filehandle has an
>>>> infinite lifetime, so once it's leaked, there's nothing you can do.
>>>> The
>>>> authors suggest NFSv4 volatile filehandles as a solution to that
>>>> problem, but I don't think they've thought through the obstacles to
>>>> making volatile filehandles work.
>>>>
>>>> --b.
>>>
>>> The point is that there is no good solution to the 'I want to export a
>>> subtree of a filesystem' problem, and so it is plainly wrong to try to
>>> make a default of those solutions, which break the one sane case of
>>> exporting the whole filesystem.
>>>
>>> Just a reminder that we kicked out subtree_check not only because a
>>> trivial rename of a file breaks the client's ability to perform I/O by
>>> invalidating the filehandle. In addition, that option causes filehandle
>>> aliasing (i.e. multiple filehandles pointing to the same file) which is
>>> a major PITA for clients to try to manage for more or less the same
>>> reason that it is a major PITA to try to manage these files using
>>> paths.
>>>
>>> The discussion on volatile filehandles in RFC5661 does try to address
>>> some of the above issues, but ends up concluding that you need to
>>> introduce POSIX-incompatible restrictions, such as trying to ban
>>> renames and deletions of open files in order to make it work.
>>>
>>> None of these compromises are necessary if you export a whole
>>> filesystem (or a hierarchy of whole filesystems). That's the sane case.
>>> That's the one that people should default to using.
>>>
On Thu, Jan 21, 2021 at 02:01:13PM -0600, Patrick Goetz wrote:
> I didn't respond to this message immediately, but it's been
> bothering me ever since. When I do a bind mount like this in
> /etc/fstab:
>
> /data2/xray /srv/nfs/xray none defaults,bind 0
>
> it's my understanding that the kernel keeps track of the resulting
> /srv/nfs/xray filesystem in it's vfs somehow. Even when directly on
> the server I can't "break out" of /srv/nfs/xray to get to the other
> directories in /data. Then how on earth would an NFS client do
> this?
As I said, NFS allows you to look up objects by filehandle (so,
basically by inode number), not just by path.
Also, note, mounting something over a directory doesn't hide what's
under the mountpoint. And it's unwise to depend on directory
permissions alone to hide contents of anything underneath that
directory.
> I thought the whole point of doing a bind mount like this is to
> solve the problem of exporting leaves of a directory hierarchy. In
> particular,
>
> "So in your example, if /data2/xray is on the same filesystem as
> /data2, then the server will happily allow operations on
> filehandles anywhere in /data2."
>
> Yes, sure; but I'm not exporting /data2/xray; I'm exporting
> /srv/nfs/xray, a bind mount to the preceding. Am I missing
> something, or is NFS too insecure to use in any context requiring
> differentiated security settings on different folders in the same
> directory structure?
Definitely do *not* depend on NFS to enforce different export options on
different subdirectories of the same filesystem.
> It's not practical to making everything you export its own partition;
> although I suppose one could do this with ZFS datasets.
I'd be happy to hear about any use cases where that's not practical.
As Christophe pointed out, xfs/ext4 project ids are another option.
--b.
On 1/21/21 4:04 PM, [email protected] wrote:
> On Thu, Jan 21, 2021 at 02:01:13PM -0600, Patrick Goetz wrote:
>> I didn't respond to this message immediately, but it's been
>> bothering me ever since. When I do a bind mount like this in
>> /etc/fstab:
>>
>> /data2/xray /srv/nfs/xray none defaults,bind 0
>>
>> it's my understanding that the kernel keeps track of the resulting
>> /srv/nfs/xray filesystem in it's vfs somehow. Even when directly on
>> the server I can't "break out" of /srv/nfs/xray to get to the other
>> directories in /data. Then how on earth would an NFS client do
>> this?
>
> As I said, NFS allows you to look up objects by filehandle (so,
> basically by inode number), not just by path
Except surely this doesn't buy you much if you don't have root access to
the system? Is this all only an issue when the filesystems are exported
with no_root_squash?
I feel like I must be missing something, but it seems to me that if I'm
not root, I'm not going to be able to access inodes I don't have
permissions to access even when directly connected to the exporting server.
>
> Also, note, mounting something over a directory doesn't hide what's
> under the mountpoint. And it's unwise to depend on directory
> permissions alone to hide contents of anything underneath that
> directory.
Well, I only ever bind mount over empty directories; but again, "doesn't
hide what's under the mount point" from whom? I'm sure root can get to
this somehow, but can someone with ordinary user access? even if the
user doesn't have permissions to access the stuff that's been mounted over?
>
>> I thought the whole point of doing a bind mount like this is to
>> solve the problem of exporting leaves of a directory hierarchy. In
>> particular,
>>
>> "So in your example, if /data2/xray is on the same filesystem as
>> /data2, then the server will happily allow operations on
>> filehandles anywhere in /data2."
>>
>> Yes, sure; but I'm not exporting /data2/xray; I'm exporting
>> /srv/nfs/xray, a bind mount to the preceding. Am I missing
>> something, or is NFS too insecure to use in any context requiring
>> differentiated security settings on different folders in the same
>> directory structure?
>
> Definitely do *not* depend on NFS to enforce different export options on
> different subdirectories of the same filesystem.
>
>> It's not practical to making everything you export its own partition;
>> although I suppose one could do this with ZFS datasets.
>
> I'd be happy to hear about any use cases where that's not practical.
>
Sure. The xray example is taken from one of my research groups which
collects thousands of very large electron microscopy images, along with
some xray data. I will certainly design this differently in the next
iteration (most likely using ZFS), but our current server has a 519T
attached storage device which presents itself as a single device:
/dev/sdg. Different groups need access to different classes of data,
which I export separately and with are presented on the workstations as
/xray, /EM, etc..
Yes, I could partition the storage device, but then I run into the usual
issues where one partition runs out of space while others are barely
utilized. This is one good reason to switch to ZFS datasets. The other
is that -- with 450T+ of ever changing data, currently rsync backups are
almost impossible. I'm hoping zfs send/receive is going to save me here.
> As Christophe pointed out, xfs/ext4 project ids are another option.
>
> --b.
>
I must have missed this one, but it just leaves me more confused.
Project ID's are filesystem metadata, yet this affords better boundary
enforcement than a bind mount? Also, the only use case for Project ID's
I was able to find are project quotas, so am not even sure how this
would be implemented, and used by NFS.
On Thu, Jan 21, 2021 at 05:19:32PM -0600, Patrick Goetz wrote:
> On 1/21/21 4:04 PM, [email protected] wrote:
> >As I said, NFS allows you to look up objects by filehandle (so,
> >basically by inode number), not just by path
>
> Except surely this doesn't buy you much if you don't have root
> access to the system? Is this all only an issue when the
> filesystems are exported with no_root_squash?
>
> I feel like I must be missing something, but it seems to me that if
> I'm not root, I'm not going to be able to access inodes I don't have
> permissions to access even when directly connected to the exporting
> server.
If an attacker has access to the network (so they can send their own
hand-crafted NFS requests), then filehandle guessing allows them to
bypass the normal process of looking up a file. So if you were
depending on lookup permissions along that path, or on hiding that path
somehow, you're out of luck.
But it doesn't let them bypass the permissions on the file itself once
they get there. If the permissions on the file don't allow read, the
server still won't let them read it.
> >>It's not practical to making everything you export its own partition;
> >>although I suppose one could do this with ZFS datasets.
> >
> >I'd be happy to hear about any use cases where that's not practical.
>
> Sure. The xray example is taken from one of my research groups which
> collects thousands of very large electron microscopy images, along
> with some xray data. I will certainly design this differently in the
> next iteration (most likely using ZFS), but our current server has a
> 519T attached storage device which presents itself as a single
> device: /dev/sdg. Different groups need access to different classes
> of data, which I export separately and with are presented on the
> workstations as /xray, /EM, etc..
>
> Yes, I could partition the storage device, but then I run into the
> usual issues where one partition runs out of space while others are
> barely utilized. This is one good reason to switch to ZFS datasets.
> The other is that -- with 450T+ of ever changing data, currently
> rsync backups are almost impossible. I'm hoping zfs send/receive is
> going to save me here.
>
> >As Christophe pointed out, xfs/ext4 project ids are another option.
>
> I must have missed this one, but it just leaves me more confused.
> Project ID's are filesystem metadata, yet this affords better
> boundary enforcement than a bind mount?
Right. The project ID is stored in the inode, so it's easy to look up
from the filehandle. (Whereas figuring out what paths might lead to
that inode is a little tricker.)
> Also, the only use case for Project ID's I was able to find are
> project quotas, so am not even sure how this would be implemented, and
> used by NFS.
Project ID's were implemented for quotas, but they also have the
characteristics to work well as NFS export boundaries.
That said, I think Christoph was suggesting this is something we *could*
support, not something that we now do. Though looking at it quickly, I
think it shouldn't take much code at all. I'll put it on my list....
Other options for doing this kind of thing might be btrfs subvolumes or
lvm thin provisioning. I haven't personally used either, but they
should both work now.
--b.
Thanks for engaging; this has been informative.
On 1/21/21 7:30 PM, [email protected] wrote:
> On Thu, Jan 21, 2021 at 05:19:32PM -0600, Patrick Goetz wrote:
>> On 1/21/21 4:04 PM, [email protected] wrote:
>>> As I said, NFS allows you to look up objects by filehandle (so,
>>> basically by inode number), not just by path
>>
>> Except surely this doesn't buy you much if you don't have root
>> access to the system? Is this all only an issue when the
>> filesystems are exported with no_root_squash?
>>
>> I feel like I must be missing something, but it seems to me that if
>> I'm not root, I'm not going to be able to access inodes I don't have
>> permissions to access even when directly connected to the exporting
>> server.
>
> If an attacker has access to the network (so they can send their own
> hand-crafted NFS requests), then filehandle guessing allows them to
> bypass the normal process of looking up a file. So if you were
> depending on lookup permissions along that path, or on hiding that path
> somehow, you're out of luck.
>
> But it doesn't let them bypass the permissions on the file itself once
> they get there. If the permissions on the file don't allow read, the
> server still won't let them read it.
>
That's probably good enough. Security through obscurity isn't a good
idea, so file/directory level permissions should be atomically correct
and not rely on directory hierarchies, restricted direct access by
users, or anything like this.
I didn't not know about the filehandle guessing thing and will keep that
in mind for the next NFS server I deploy.
>>>> It's not practical to making everything you export its own partition;
>>>> although I suppose one could do this with ZFS datasets.
>>>
>>> I'd be happy to hear about any use cases where that's not practical.
>>
>> Sure. The xray example is taken from one of my research groups which
>> collects thousands of very large electron microscopy images, along
>> with some xray data. I will certainly design this differently in the
>> next iteration (most likely using ZFS), but our current server has a
>> 519T attached storage device which presents itself as a single
>> device: /dev/sdg. Different groups need access to different classes
>> of data, which I export separately and with are presented on the
>> workstations as /xray, /EM, etc..
>>
>> Yes, I could partition the storage device, but then I run into the
>> usual issues where one partition runs out of space while others are
>> barely utilized. This is one good reason to switch to ZFS datasets.
>> The other is that -- with 450T+ of ever changing data, currently
>> rsync backups are almost impossible. I'm hoping zfs send/receive is
>> going to save me here.
>>
>>> As Christophe pointed out, xfs/ext4 project ids are another option.
>>
>> I must have missed this one, but it just leaves me more confused.
>> Project ID's are filesystem metadata, yet this affords better
>> boundary enforcement than a bind mount?
>
> Right. The project ID is stored in the inode, so it's easy to look up
> from the filehandle. (Whereas figuring out what paths might lead to
> that inode is a little tricker.)
>
>> Also, the only use case for Project ID's I was able to find are
>> project quotas, so am not even sure how this would be implemented, and
>> used by NFS.
>
> Project ID's were implemented for quotas, but they also have the
> characteristics to work well as NFS export boundaries.
>
> That said, I think Christoph was suggesting this is something we *could*
> support, not something that we now do. Though looking at it quickly, I
> think it shouldn't take much code at all. I'll put it on my list....
>
> Other options for doing this kind of thing might be btrfs subvolumes or
> lvm thin provisioning. I haven't personally used either, but they
> should both work now.
>
> --b.
>
On 1/22/2021 8:20 AM, Patrick Goetz wrote:
> Thanks for engaging; this has been informative.
>
> On 1/21/21 7:30 PM, [email protected] wrote:
>> On Thu, Jan 21, 2021 at 05:19:32PM -0600, Patrick Goetz wrote:
>>> On 1/21/21 4:04 PM, [email protected] wrote:
>>>> As I said, NFS allows you to look up objects by filehandle (so,
>>>> basically by inode number), not just by path
>>>
>>> Except surely this doesn't buy you much if you don't have root
>>> access to the system? Is this all only an issue when the
>>> filesystems are exported with no_root_squash?
>>>
>>> I feel like I must be missing something, but it seems to me that if
>>> I'm not root, I'm not going to be able to access inodes I don't have
>>> permissions to access even when directly connected to the exporting
>>> server.
>>
>> If an attacker has access to the network (so they can send their own
>> hand-crafted NFS requests), then filehandle guessing allows them to
>> bypass the normal process of looking up a file. So if you were
>> depending on lookup permissions along that path, or on hiding that path
>> somehow, you're out of luck.
>>
>> But it doesn't let them bypass the permissions on the file itself once
>> they get there. If the permissions on the file don't allow read, the
>> server still won't let them read it.
>>
>
> That's probably good enough. Security through obscurity isn't a good
> idea, so file/directory level permissions should be atomically correct
> and not rely on directory hierarchies, restricted direct access by
> users, or anything like this.
>
> I didn't not know about the filehandle guessing thing and will keep that
> in mind for the next NFS server I deploy.
There are some NFS clients which implement open-by-filehandle, and don't
rely on lookup at all in normal operation. It is highly efficient, and
provides a very low latency file access. I believe Yahoo was the first
to implement it at scale in production, but it's a very straightforward
cache operation.
Tom.
>>>>> It's not practical to making everything you export its own partition;
>>>>> although I suppose one could do this with ZFS datasets.
>>>>
>>>> I'd be happy to hear about any use cases where that's not practical.
>>>
>>> Sure. The xray example is taken from one of my research groups which
>>> collects thousands of very large electron microscopy images, along
>>> with some xray data. I will certainly design this differently in the
>>> next iteration (most likely using ZFS), but our current server has a
>>> 519T attached storage device which presents itself as a single
>>> device: /dev/sdg. Different groups need access to different classes
>>> of data, which I export separately and with are presented on the
>>> workstations as /xray, /EM, etc..
>>>
>>> Yes, I could partition the storage device, but then I run into the
>>> usual issues where one partition runs out of space while others are
>>> barely utilized. This is one good reason to switch to ZFS datasets.
>>> The other is that -- with 450T+ of ever changing data, currently
>>> rsync backups are almost impossible. I'm hoping zfs send/receive is
>>> going to save me here.
>>>
>>>> As Christophe pointed out, xfs/ext4 project ids are another option.
>>>
>>> I must have missed this one, but it just leaves me more confused.
>>> Project ID's are filesystem metadata, yet this affords better
>>> boundary enforcement than a bind mount?
>>
>> Right. The project ID is stored in the inode, so it's easy to look up
>> from the filehandle. (Whereas figuring out what paths might lead to
>> that inode is a little tricker.)
>>
>>> Also, the only use case for Project ID's I was able to find are
>>> project quotas, so am not even sure how this would be implemented, and
>>> used by NFS.
>>
>> Project ID's were implemented for quotas, but they also have the
>> characteristics to work well as NFS export boundaries.
>>
>> That said, I think Christoph was suggesting this is something we *could*
>> support, not something that we now do. Though looking at it quickly, I
>> think it shouldn't take much code at all. I'll put it on my list....
>>
>> Other options for doing this kind of thing might be btrfs subvolumes or
>> lvm thin provisioning. I haven't personally used either, but they
>> should both work now.
>>
>> --b.
>>
>