2021-10-20 15:55:53

by J. Bruce Fields

Subject: server-to-server copy by default

knfsd has supported server-to-server copy for a couple years (since
5.5). You have to set a module parameter to enable it. I'm getting asked
when we could turn that parameter on by default.

I've got a couple vague criteria: one just general maturity, the other a
security question:

1. General maturity: the only reports I recall seeing are from testers.
Is anyone using this? Does it work for them? Do they find a benefit?
Maybe we could turn it on by default in one distro (Fedora?) and promote
it a little and see what that turns up?

2. Security question: with server-to-server copy enabled, you can send
the server a COPY call with any random address, and the server will
mount that address, open a file, and read from it. Is that safe?

Normally we only mount servers that were chosen by root. Here we'll
mount any random server that some client told us to. What's the worst
that random server can do? Do we trust our xdr decoding? Can it DOS us
by throwing the client's state recovery code into some loop with weird
error returns? Etc.

Maybe it's fine. I'm OK with some level of risk. I just want to make
sure somebody's thought this through.

There's also interest in allowing unprivileged NFS mounts, but I don't
think we've turned that on yet, partly for similar reasons. This is a
subset of that problem.

--b.


2021-10-20 16:01:32

by Chuck Lever III

Subject: Re: server-to-server copy by default



> On Oct 20, 2021, at 11:54 AM, J. Bruce Fields <[email protected]> wrote:
>
> knfsd has supported server-to-server copy for a couple years (since
> 5.5). You have to set a module parameter to enable it. I'm getting asked
> when we could turn that parameter on by default.
>
> I've got a couple vague criteria: one just general maturity, the other a
> security question:
>
> 1. General maturity: the only reports I recall seeing are from testers.
> Is anyone using this? Does it work for them? Do they find a benefit?
> Maybe we could turn it on by default in one distro (Fedora?) and promote
> it a little and see what that turns up?

I like the idea of enabling it in one of the technology
preview distributions.

But wrt the maturity question, is the work finished? Or,
perhaps a better question is do we have a minimum viable
product here that can be enabled, or is more work needed
to meet even that bar?

One thing that I recall is missing is support for Kerberos
in the server-to-server copy operation. Is that in plan,
or deemed unimportant?


> 2. Security question: with server-to-server copy enabled, you can send
> the server a COPY call with any random address, and the server will
> mount that address, open a file, and read from it. Is that safe?
>
> Normally we only mount servers that were chosen by root. Here we'll
> mount any random server that some client told us to. What's the worst
> that random server can do? Do we trust our xdr decoding? Can it DOS us
> by throwing the client's state recovery code into some loop with weird
> error returns? Etc.

A basic question is what is in distribution QE test suites
that could exercise this feature? Should upstream be tasked
with providing any missing pieces (as part of, say, pynfs,
or nfstests)?


> Maybe it's fine. I'm OK with some level of risk. I just want to make
> sure somebody's thought this through.
>
> There's also interest in allowing unprivileged NFS mounts, but I don't
> think we've turned that on yet, partly for similar reasons. This is a
> subset of that problem.
>
> --b.

--
Chuck Lever



2021-10-20 16:34:07

by Olga Kornievskaia

Subject: Re: server-to-server copy by default

On Wed, Oct 20, 2021 at 12:00 PM Chuck Lever III <[email protected]> wrote:
>
>
>
> > On Oct 20, 2021, at 11:54 AM, J. Bruce Fields <[email protected]> wrote:
> >
> > knfsd has supported server-to-server copy for a couple years (since
> > 5.5). You have to set a module parameter to enable it. I'm getting asked
> > when we could turn that parameter on by default.
> >
> > I've got a couple vague criteria: one just general maturity, the other a
> > security question:
> >
> > 1. General maturity: the only reports I recall seeing are from testers.
> > Is anyone using this? Does it work for them? Do they find a benefit?
> > Maybe we could turn it on by default in one distro (Fedora?) and promote
> > it a little and see what that turns up?
>
> I like the idea of enabling it in one of the technology
> preview distributions.
>
> But wrt the maturity question, is the work finished? Or,
> perhaps a better question is do we have a minimum viable
> product here that can be enabled, or is more work needed
> to meet even that bar?
>
> One thing that I recall is missing is support for Kerberos
> in the server-to-server copy operation. Is that in plan,
> or deemed unimportant?

Netapp has some code for gssv3 support which is required for
server-to-server and possibly some copy offload pieces (Andy's work
before his retirement). Anna was picking up the gssv3 work but hasn't
had the cycles yet to complete it. We can make it more of a priority
if that is a show stopper.

> > 2. Security question: with server-to-server copy enabled, you can send
> > the server a COPY call with any random address, and the server will
> > mount that address, open a file, and read from it. Is that safe?
> >
> > Normally we only mount servers that were chosen by root. Here we'll
> > mount any random server that some client told us to. What's the worst
> > that random server can do? Do we trust our xdr decoding? Can it DOS us
> > by throwing the client's state recovery code into some loop with weird
> > error returns? Etc.
>
> A basic question is what is in distribution QE test suites
> that could exercise this feature? Should upstream be tasked
> with providing any missing pieces (as part of, say, pynfs,
> or nfstests)?

There are server-to-server tests in the nfstest testing suite. I'm not
sure if any of the xfstest copy_ofload exercise server to server
capability. Anna wrote the tests.

> > Maybe it's fine. I'm OK with some level of risk. I just want to make
> > sure somebody's thought this through.
> >
> > There's also interest in allowing unprivileged NFS mounts, but I don't
> > think we've turned that on yet, partly for similar reasons. This is a
> > subset of that problem.
> >
> > --b.
>
> --
> Chuck Lever
>
>
>

2021-10-20 16:39:48

by Olga Kornievskaia

Subject: Re: server-to-server copy by default

On Wed, Oct 20, 2021 at 11:54 AM J. Bruce Fields <[email protected]> wrote:
>
> knfsd has supported server-to-server copy for a couple years (since
> 5.5). You have to set a module parameter to enable it. I'm getting asked
> when we could turn that parameter on by default.
>
> I've got a couple vague criteria: one just general maturity, the other a
> security question:
>
> 1. General maturity: the only reports I recall seeing are from testers.
> Is anyone using this? Does it work for them? Do they find a benefit?
> Maybe we could turn it on by default in one distro (Fedora?) and promote
> it a little and see what that turns up?
>
> 2. Security question: with server-to-server copy enabled, you can send
> the server a COPY call with any random address, and the server will
> mount that address, open a file, and read from it. Is that safe?

How about adding a piece then on the server (a policy) that would only
control that? The concept behind the server-to-server was that servers
might have a private/fast network between them that they would want to
utilize. A more restrictive policy could be to only allow predefined
network space to do the COPY? I know that's more work. But it sounds like
perhaps it might be something that provides more control to the
server.

But as Chuck pointed out perhaps the kerberos piece would make this
concern irrelevant.

> Normally we only mount servers that were chosen by root. Here we'll
> mount any random server that some client told us to. What's the worst
> that random server can do? Do we trust our xdr decoding? Can it DOS us
> by throwing the client's state recovery code into some loop with weird
> error returns? Etc.

Client code has been modified to know about special copy stateids: if
the client gets BAD_STATEID on one, it knows not to attempt recovery and
instead returns the error to the "application", which here is nfsd.
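
A minimal sketch of that decision, written in Python rather than the
kernel's C and not taken from the actual client code. The function and
type names are hypothetical; only the BAD_STATEID behaviour comes from
the description above.

```python
from enum import Enum


class Action(Enum):
    RECOVER = "run normal state recovery"
    FAIL_TO_CALLER = "return the error to the caller (nfsd)"


def on_bad_stateid(is_copy_stateid: bool) -> Action:
    """Decide what to do when a READ fails with BAD_STATEID.

    Hypothetical model: for the special copy stateid used by a
    server-to-server copy, the error is handed straight back to nfsd
    instead of entering the client's state recovery path.
    """
    if is_copy_stateid:
        return Action.FAIL_TO_CALLER
    return Action.RECOVER
```

The point of the special case is that a misbehaving source server cannot
push the destination into a recovery loop just by returning BAD_STATEID.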

> Maybe it's fine. I'm OK with some level of risk. I just want to make
> sure somebody's thought this through.
>
> There's also interest in allowing unprivileged NFS mounts, but I don't
> think we've turned that on yet, partly for similar reasons. This is a
> subset of that problem.
>
> --b.

2021-10-20 17:25:50

by Steve Dickson

Subject: Re: server-to-server copy by default

Hey!

On 10/20/21 12:00, Chuck Lever III wrote:
>
>
>> On Oct 20, 2021, at 11:54 AM, J. Bruce Fields <[email protected]> wrote:
>>
>> knfsd has supported server-to-server copy for a couple years (since
>> 5.5). You have to set a module parameter to enable it. I'm getting asked
>> when we could turn that parameter on by default.
>>
>> I've got a couple vague criteria: one just general maturity, the other a
>> security question:
>>
>> 1. General maturity: the only reports I recall seeing are from testers.
>> Is anyone using this? Does it work for them? Do they find a benefit?
>> Maybe we could turn it on by default in one distro (Fedora?) and promote
>> it a little and see what that turns up?
>
> I like the idea of enabling it in one of the technology
> preview distributions.
My thoughts on this... we can do this one of two ways:
either do a kernel patch to enable inter_copy_offload_enable
by default, or I could have nfs-utils drop an nfsd.conf file in
/etc/modprobe.d/ enabling it.

The kernel patch is more of a commitment but the
nfs-utils change is easier to back out.
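
For reference, a rough sketch of what the nfs-utils route amounts to.
The drop-in line uses the usual modprobe.d "options <module>
<param>=<value>" form. The Python helper and its name are only an
illustration, and the sysfs path assumes the parameter is exported
under /sys/module/nfsd/parameters/ on the running kernel.

```python
from pathlib import Path
from typing import Optional

# Line that a drop-in such as /etc/modprobe.d/nfsd.conf would contain.
MODPROBE_LINE = "options nfsd inter_copy_offload_enable=1"


def inter_copy_enabled(
    sysfs_root: str = "/sys/module/nfsd/parameters",
) -> Optional[bool]:
    """Return the current setting, or None if the parameter file is absent.

    Assumption: the parameter is a boolean, which sysfs reports as
    "Y" or "N"; "1" is also accepted here to be lenient.
    """
    param = Path(sysfs_root) / "inter_copy_offload_enable"
    try:
        value = param.read_text().strip()
    except OSError:
        return None
    return value in ("Y", "1")
```

Backing the change out is then just a matter of removing the drop-in
file, which is the easy-revert property mentioned above.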

>
> But wrt the maturity question, is the work finished? Or,
> perhaps a better question is do we have a minimum viable
> product here that can be enabled, or is more work needed
> to meet even that bar?
I've been testing it and it seems to be pretty solid.

Question: Olga mentioned Dai did a patch that eliminates
the (rsize*14) file size limit... meaning the file has
to be greater than (rsize*14) for the SSC to happen.
Was that patch committed? I have not looked that hard
but I haven't found it...
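
To make that limit concrete, a tiny sketch based only on the
description above (the file must be larger than rsize*14 for SSC to be
used). The function name is made up for illustration, and this says
nothing about whether the limit is still present in current kernels.

```python
def ssc_would_be_used(file_size: int, rsize: int) -> bool:
    """True if the file is larger than the (rsize * 14) threshold."""
    return file_size > rsize * 14


# With a 1 MiB rsize the threshold works out to 14 MiB.
rsize = 1024 * 1024
print(ssc_would_be_used(14 * rsize, rsize))      # False: not strictly greater
print(ssc_would_be_used(14 * rsize + 1, rsize))  # True
```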

>
> One thing that I recall is missing is support for Kerberos
> in the server-to-server copy operation. Is that in plan,
> or deemed unimportant?
Personally I think we should make sure the technology
is stable before adding things on to it.. IMHO.

>
>
>> 2. Security question: with server-to-server copy enabled, you can send
>> the server a COPY call with any random address, and the server will
>> mount that address, open a file, and read from it. Is that safe?
>>
>> Normally we only mount servers that were chosen by root. Here we'll
>> mount any random server that some client told us to. What's the worst
>> that random server can do? Do we trust our xdr decoding? Can it DOS us
>> by throwing the client's state recovery code into some loop with weird
>> error returns? Etc.
>
> A basic question is what is in distribution QE test suites
> that could exercise this feature? Should upstream be tasked
> with providing any missing pieces (as part of, say, pynfs,
> or nfstests)?
As Olga pointed out... nfstests already has a test...

my two cents,

steved.

>
>
>> Maybe it's fine. I'm OK with some level of risk. I just want to make
>> sure somebody's thought this through.
>>
>> There's also interest in allowing unprivileged NFS mounts, but I don't
>> think we've turned that on yet, partly for similar reasons. This is a
>> subset of that problem.
>>
>> --b.
>
> --
> Chuck Lever
>
>
>

2021-10-20 17:47:26

by Chuck Lever III

Subject: Re: server-to-server copy by default



> On Oct 20, 2021, at 12:37 PM, Olga Kornievskaia <[email protected]> wrote:
>
> On Wed, Oct 20, 2021 at 11:54 AM J. Bruce Fields <[email protected]> wrote:
>>
>> knfsd has supported server-to-server copy for a couple years (since
>> 5.5). You have to set a module parameter to enable it. I'm getting asked
>> when we could turn that parameter on by default.
>>
>> I've got a couple vague criteria: one just general maturity, the other a
>> security question:
>>
>> 1. General maturity: the only reports I recall seeing are from testers.
>> Is anyone using this? Does it work for them? Do they find a benefit?
>> Maybe we could turn it on by default in one distro (Fedora?) and promote
>> it a little and see what that turns up?
>>
>> 2. Security question: with server-to-server copy enabled, you can send
>> the server a COPY call with any random address, and the server will
>> mount that address, open a file, and read from it. Is that safe?
>
> How about adding a piece then on the server (a policy) that would only
> control that? The concept behind the server-to-server was that servers
> might have a private/fast network between them that they would want to
> utilize. A more restrictive policy could be to only allow predefined
> network space to do the COPY? I know that's more work. But it sounds like
> perhaps it might be something that provides more control to the
> server.
>
> But as Chuck pointed out perhaps the kerberos piece would make this
> concern irrelevant.

I like the idea of having a server-side policy setting that
controls whether s2sc is permitted, and maybe establishes a
range of IP addresses allowed to be destination servers.
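
A sketch of what such a policy check could look like, in Python purely
for illustration. No such setting exists in nfsd as far as this thread
says; the function name, the "enabled" switch, and the example networks
are all hypothetical. The check is applied to the server address named
in the COPY.

```python
import ipaddress
from typing import Iterable


def s2sc_peer_allowed(
    addr: str, allowed_networks: Iterable[str], enabled: bool = True
) -> bool:
    """Return True if a server-to-server copy involving addr is permitted.

    Hypothetical policy: s2sc must be enabled, and the peer address
    must fall inside one of the configured networks.
    """
    if not enabled:
        return False
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Unparseable address: refuse rather than guess.
        return False
    for net in allowed_networks:
        network = ipaddress.ip_network(net)
        if ip.version == network.version and ip in network:
            return True
    return False


# Example: only a private inter-server network is allowed.
print(s2sc_peer_allowed("192.0.2.10", ["192.0.2.0/24"]))    # True
print(s2sc_peer_allowed("198.51.100.7", ["192.0.2.0/24"]))  # False
```

An empty network list refuses every peer, so the feature stays off
until an administrator lists at least one network.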


--
Chuck Lever



2021-10-20 17:51:44

by Chuck Lever III

Subject: Re: server-to-server copy by default


> On Oct 20, 2021, at 1:24 PM, Steve Dickson <[email protected]> wrote:
>
> Hey!
>
> On 10/20/21 12:00, Chuck Lever III wrote:
>>> On Oct 20, 2021, at 11:54 AM, J. Bruce Fields <[email protected]> wrote:
>>>
>>> knfsd has supported server-to-server copy for a couple years (since
>>> 5.5). You have to set a module parameter to enable it. I'm getting asked
>>> when we could turn that parameter on by default.
>>>
>>> I've got a couple vague criteria: one just general maturity, the other a
>>> security question:
>>>
>>> 1. General maturity: the only reports I recall seeing are from testers.
>>> Is anyone using this? Does it work for them? Do they find a benefit?
>>> Maybe we could turn it on by default in one distro (Fedora?) and promote
>>> it a little and see what that turns up?
>
>> But wrt the maturity question, is the work finished? Or,
>> perhaps a better question is do we have a minimum viable
>> product here that can be enabled, or is more work needed
>> to meet even that bar?
> I've been testing it and it seems to be pretty solid.

Well I was more interested in knowing if the capability
is in a state where it is now useful to users rather
than just being a prototype for a new protocol element.

Along with having tests that QE teams can run, we probably
also need to know whether there is adequate documentation.


--
Chuck Lever



2021-10-20 18:01:03

by J. Bruce Fields

Subject: Re: server-to-server copy by default

On Wed, Oct 20, 2021 at 12:37:08PM -0400, Olga Kornievskaia wrote:
> On Wed, Oct 20, 2021 at 11:54 AM J. Bruce Fields <[email protected]> wrote:
> >
> > knfsd has supported server-to-server copy for a couple years (since
> > 5.5). You have to set a module parameter to enable it. I'm getting asked
> > when we could turn that parameter on by default.
> >
> > I've got a couple vague criteria: one just general maturity, the other a
> > security question:
> >
> > 1. General maturity: the only reports I recall seeing are from testers.
> > Is anyone using this? Does it work for them? Do they find a benefit?
> > Maybe we could turn it on by default in one distro (Fedora?) and promote
> > it a little and see what that turns up?
> >
> > 2. Security question: with server-to-server copy enabled, you can send
> > the server a COPY call with any random address, and the server will
> > mount that address, open a file, and read from it. Is that safe?

(Whoops, I forgot, there's no open, just reads. And I don't know how
much actual protocol there is involved in the mount.)

> How about adding a piece then on the server (a policy) that would only
> control that? The concept behind the server-to-server was that servers
> might have a private/fast network between them that they would want to
> utilize. A more restrictive policy could be to only allow predefined
> network space to do the COPY? I know that's more work. But it sounds like
> perhaps it might be something that provides more control to the
> server.

That sounds like a step backwards if you're trying to enable it by
default.

But in the case there's a special server-to-server network, the way to
handle that is by configuring the source server to return addresses on
that network in the cnr_source_server field of the COPY_NOTIFY reply,
right?

> But as Chuck pointed out perhaps the kerberos piece would make this
> concern irrelevant.

I don't think kerberos addresses this. (It may increase the attack
surface, in fact.)

> > Normally we only mount servers that were chosen by root. Here we'll
> > mount any random server that some client told us to. What's the worst
> > that random server can do? Do we trust our xdr decoding? Can it DOS us
> > by throwing the client's state recovery code into some loop with weird
> > error returns? Etc.
>
> Client code has been modified to know about special copy stateids: if
> the client gets BAD_STATEID on one, it knows not to attempt recovery and
> instead returns the error to the "application", which here is nfsd.

Good to know, thanks. What is the list of rpc calls that are made to
the source server--is it just READ, or does it at least need to create a
client and a session? Are there any other error handling paths that we
wouldn't want to go down? Etc.

--b.

> > Maybe it's fine. I'm OK with some level of risk. I just want to make
> > sure somebody's thought this through.
> >
> > There's also interest in allowing unprivileged NFS mounts, but I don't
> > think we've turned that on yet, partly for similar reasons. This is a
> > subset of that problem.
> >
> > --b.

2021-10-20 18:15:36

by J. Bruce Fields

Subject: Re: server-to-server copy by default

On Wed, Oct 20, 2021 at 05:45:58PM +0000, Chuck Lever III wrote:
> > On Oct 20, 2021, at 12:37 PM, Olga Kornievskaia <[email protected]> wrote:
> >
> > On Wed, Oct 20, 2021 at 11:54 AM J. Bruce Fields <[email protected]> wrote:
> >>
> >> knfsd has supported server-to-server copy for a couple years (since
> >> 5.5). You have to set a module parameter to enable it. I'm getting asked
> >> when we could turn that parameter on by default.
> >>
> >> I've got a couple vague criteria: one just general maturity, the other a
> >> security question:
> >>
> >> 1. General maturity: the only reports I recall seeing are from testers.
> >> Is anyone using this? Does it work for them? Do they find a benefit?
> >> Maybe we could turn it on by default in one distro (Fedora?) and promote
> >> it a little and see what that turns up?
> >>
> >> 2. Security question: with server-to-server copy enabled, you can send
> >> the server a COPY call with any random address, and the server will
> >> mount that address, open a file, and read from it. Is that safe?
> >
> > How about adding a piece then on the server (a policy) that would only
> > control that? The concept behind the server-to-server was that servers
> > might have a private/fast network between them that they would want to
> > utilize. A more restrictive policy could be to only allow predefined
> > network space to do the COPY? I know that's more work. But it sounds like
> > perhaps it might be something that provides more control to the
> > server.
> >
> > But as Chuck pointed out perhaps the kerberos piece would make this
> > concern irrelevant.
>
> I like the idea of having a server-side policy setting that
> controls whether s2sc is permitted, and maybe establishes a
> range of IP addresses allowed to be destination servers.

Maybe, but:

1) Couldn't you get something awfully close to that with
firewall configuration?

2) I'm getting asked why server-side copy isn't on by default.
So I guess the requirement to set inter_copy_offload_enable is
too much. How does requiring more complicated configuration
answer that concern?

3) There's interest in allowing unprivileged NFS mounts. That's
more of a security risk than this. What's the client
maintainers' judgement about unprivileged NFS mounts? Do they
think that would be safe to allow by default in distros? If so,
then we're certainly fine here.

--b.

2021-10-20 19:04:17

by Dai Ngo

Subject: Re: server-to-server copy by default


On 10/20/21 9:33 AM, Olga Kornievskaia wrote:
> On Wed, Oct 20, 2021 at 12:00 PM Chuck Lever III <[email protected]> wrote:
>>
>>
>>> On Oct 20, 2021, at 11:54 AM, J. Bruce Fields <[email protected]> wrote:
>>>
>>> knfsd has supported server-to-server copy for a couple years (since
>>> 5.5). You have to set a module parameter to enable it. I'm getting asked
>>> when we could turn that parameter on by default.
>>>
>>> I've got a couple vague criteria: one just general maturity, the other a
>>> security question:
>>>
>>> 1. General maturity: the only reports I recall seeing are from testers.
>>> Is anyone using this? Does it work for them? Do they find a benefit?
>>> Maybe we could turn it on by default in one distro (Fedora?) and promote
>>> it a little and see what that turns up?
>> I like the idea of enabling it in one of the technology
>> preview distributions.
>>
>> But wrt the maturity question, is the work finished? Or,
>> perhaps a better question is do we have a minimum viable
>> product here that can be enabled, or is more work needed
>> to meet even that bar?
>>
>> One thing that I recall is missing is support for Kerberos
>> in the server-to-server copy operation. Is that in plan,
>> or deemed unimportant?
> Netapp has some code for gssv3 support which is required for
> server-to-server and possibly some copy offload pieces (Andy's work
> before his retirement). Anna was picking up the gssv3 work but hasn't
> had the cycles yet to complete it. We can make it more of a priority
> if that is a show stopper.

Yes, afaik, the gssv3 support for server-to-server is the only
missing functionality (besides the potential security related issues
mentioned in this thread). It'd be good if we can implement this, or
as Steve suggested we can wait a little while until the technology
is stable before adding gssv3 support.

Do you consider this missing functionality a show stopper?

>
>>> 2. Security question: with server-to-server copy enabled, you can send
>>> the server a COPY call with any random address, and the server will
>>> mount that address, open a file, and read from it. Is that safe?

The client already has write access to the share on the destination
server, it can write any data to the destination file. If the client
sends a COPY with a random source server address, that source server
has to export the share in a way that allows the destination server to
mount it. If the share on the random source server is open to everyone,
then isn't that the same as the client writing random data from its
local file to the destination file without server-to-server copy?

-Dai

>>>
>>> Normally we only mount servers that were chosen by root. Here we'll
>>> mount any random server that some client told us to. What's the worst
>>> that random server can do? Do we trust our xdr decoding? Can it DOS us
>>> by throwing the client's state recovery code into some loop with weird
>>> error returns? Etc.
>> A basic question is what is in distribution QE test suites
>> that could exercise this feature? Should upstream be tasked
>> with providing any missing pieces (as part of, say, pynfs,
>> or nfstests)?
> There are server-to-server tests in the nfstest testing suite. I'm not
> sure if any of the xfstest copy_ofload exercise server to server
> capability. Anna wrote the tests.
>
>>> Maybe it's fine. I'm OK with some level of risk. I just want to make
>>> sure somebody's thought this through.
>>>
>>> There's also interest in allowing unprivileged NFS mounts, but I don't
>>> think we've turned that on yet, partly for similar reasons. This is a
>>> subset of that problem.
>>>
>>> --b.
>> --
>> Chuck Lever
>>
>>
>>

2021-10-20 19:05:50

by Chuck Lever III

Subject: Re: server-to-server copy by default



> On Oct 20, 2021, at 2:15 PM, Bruce Fields <[email protected]> wrote:
>
> On Wed, Oct 20, 2021 at 05:45:58PM +0000, Chuck Lever III wrote:
>>> On Oct 20, 2021, at 12:37 PM, Olga Kornievskaia <[email protected]> wrote:
>>>
>>> On Wed, Oct 20, 2021 at 11:54 AM J. Bruce Fields <[email protected]> wrote:
>>>>
>>>> knfsd has supported server-to-server copy for a couple years (since
>>>> 5.5). You have to set a module parameter to enable it. I'm getting asked
>>>> when we could turn that parameter on by default.
>>>>
>>>> I've got a couple vague criteria: one just general maturity, the other a
>>>> security question:
>>>>
>>>> 1. General maturity: the only reports I recall seeing are from testers.
>>>> Is anyone using this? Does it work for them? Do they find a benefit?
>>>> Maybe we could turn it on by default in one distro (Fedora?) and promote
>>>> it a little and see what that turns up?
>>>>
>>>> 2. Security question: with server-to-server copy enabled, you can send
>>>> the server a COPY call with any random address, and the server will
>>>> mount that address, open a file, and read from it. Is that safe?
>>>
>>> How about adding a piece then on the server (a policy) that would only
>>> control that? The concept behind the server-to-server was that servers
>>> might have a private/fast network between them that they would want to
>>> utilize. A more restrictive policy could be to only allow predefined
>>> network space to do the COPY? I know that's more work. But it sounds like
>>> perhaps it might be something that provides more control to the
>>> server.
>>>
>>> But as Chuck pointed out perhaps the kerberos piece would make this
>>> concern irrelevant.
>>
>> I like the idea of having a server-side policy setting that
>> controls whether s2sc is permitted, and maybe establishes a
>> range of IP addresses allowed to be destination servers.
>
> Maybe, but:
>
> 1) Couldn't you get something awfully close to that with
> firewall configuration?

Not if the s2sc policy setting is on each export.


> 2) I'm getting asked why server-side copy isn't on by default.

And your answer to that was "we haven't figured out how to
guarantee security when it's enabled".


> So I guess the requirement to set inter_copy_offload_enable is
> too much. How does requiring more complicated configuration
> answer that concern?

It answers the concern by letting local administrators choose
to enable or disable s2sc based on their own security needs.


> 3) There's interest in allowing unprivileged NFS mounts. That's
> more of a security risk than this. What's the client
> maintainers' judgement about unprivileged NFS mounts? Do they
> think that would be safe to allow by default in distros? If so,
> then we're certainly fine here.

Unprivileged mounting seems like a different question to me.
Related, possibly, but not the same. I'd rather leave that
discussion to another thread.


--
Chuck Lever



2021-10-20 20:29:59

by J. Bruce Fields

Subject: Re: server-to-server copy by default

On Wed, Oct 20, 2021 at 12:03:46PM -0700, [email protected] wrote:
>
> On 10/20/21 9:33 AM, Olga Kornievskaia wrote:
> >On Wed, Oct 20, 2021 at 12:00 PM Chuck Lever III <[email protected]> wrote:
> >>>2. Security question: with server-to-server copy enabled, you can send
> >>>the server a COPY call with any random address, and the server will
> >>>mount that address, open a file, and read from it. Is that safe?
>
> The client already has write access to the share on the destination
> server, it can write any data to the destination file.

Agreed. Please look back at what I said; I'm not thinking about attacks
on the source server, I'm thinking about attacks on the destination (the
one that receives the COPY).

--b.

2021-10-21 05:01:26

by Dai Ngo

Subject: Re: server-to-server copy by default

On 10/20/21 1:29 PM, Bruce Fields wrote:

> On Wed, Oct 20, 2021 at 12:03:46PM -0700, [email protected] wrote:
>> On 10/20/21 9:33 AM, Olga Kornievskaia wrote:
>>> On Wed, Oct 20, 2021 at 12:00 PM Chuck Lever III <[email protected]> wrote:
>>>>> 2. Security question: with server-to-server copy enabled, you can send
>>>>> the server a COPY call with any random address, and the server will
>>>>> mount that address, open a file, and read from it. Is that safe?
>> The client already has write access to the share on the destination
>> server, it can write any data to the destination file.
> Agreed. Please look back at what I said; I'm not thinking about attacks
> on the source server, I'm thinking about attacks on the destination (the
> one that receives the COPY).

Sorry for missing your point. If I understand correctly, your concern is
that a malicious client can direct the destination server to mount a
malicious source server, which can then mount a DoS attack on the
destination server.

The attack can come from the source server's replies, or from requests
the source server sends to the destination server via the back channel.
One possible attack in a reply is BAD_STATEID, which is already handled
by the client code, as Olga mentioned.

Here is the list of NFS requests made from the destination to the
source server:

EXCHANGE_ID
CREATE_SESSION
RECLAIM_COMPLETE
SEQUENCE
PUTROOTFH
PUTFH
GETFH
GETATTR
READ/READ_PLUS
DESTROY_SESSION
DESTROY_CLIENTID

Do you think we should review all replies from these requests to make
sure error replies do not cause problems for the destination server?

The same question applies to the back channel ops:

OP_CB_GETATTR
OP_CB_RECALL
OP_CB_LAYOUTRECALL
OP_CB_NOTIFY
OP_CB_PUSH_DELEG
OP_CB_RECALL_ANY
OP_CB_RECALLABLE_OBJ_AVAIL
OP_CB_RECALL_SLOT
OP_CB_SEQUENCE
OP_CB_WANTS_CANCELLED
OP_CB_NOTIFY_LOCK
OP_CB_NOTIFY_DEVICEID
OP_CB_OFFLOAD

-Dai

> --b.

2021-10-21 13:45:10

by Steve Dickson

Subject: Re: server-to-server copy by default



On 10/20/21 15:04, Chuck Lever III wrote:
>
>
>> On Oct 20, 2021, at 2:15 PM, Bruce Fields <[email protected]> wrote:
>>
>> On Wed, Oct 20, 2021 at 05:45:58PM +0000, Chuck Lever III wrote:
>>>> On Oct 20, 2021, at 12:37 PM, Olga Kornievskaia <[email protected]> wrote:
>>>>
>>>> On Wed, Oct 20, 2021 at 11:54 AM J. Bruce Fields <[email protected]> wrote:
>>>>>
>>>>> knfsd has supported server-to-server copy for a couple years (since
>>>>> 5.5). You have to set a module parameter to enable it. I'm getting asked
>>>>> when we could turn that parameter on by default.
>>>>>
>>>>> I've got a couple vague criteria: one just general maturity, the other a
>>>>> security question:
>>>>>
>>>>> 1. General maturity: the only reports I recall seeing are from testers.
>>>>> Is anyone using this? Does it work for them? Do they find a benefit?
>>>>> Maybe we could turn it on by default in one distro (Fedora?) and promote
>>>>> it a little and see what that turns up?
>>>>>
>>>>> 2. Security question: with server-to-server copy enabled, you can send
>>>>> the server a COPY call with any random address, and the server will
>>>>> mount that address, open a file, and read from it. Is that safe?
>>>>
>>>> How about adding a piece then on the server (a policy) that would only
>>>> control that? The concept behind the server-to-server was that servers
>>>> might have a private/fast network between them that they would want to
>>>> utilize. A more restrictive policy could be to only allow predefined
>>>> network space to do the COPY? I know that's more work. But it sounds like
>>>> perhaps it might be something that provides more control to the
>>>> server.
>>>>
>>>> But as Chuck pointed out perhaps the kerberos piece would make this
>>>> concern irrelevant.
>>>
>>> I like the idea of having a server-side policy setting that
>>> controls whether s2sc is permitted, and maybe establishes a
>>> range of IP addresses allowed to be destination servers.
>>
>> Maybe, but:
>>
>> 1) Couldn't you get something awfully close to that with
>> firewall configuration?
>
> Not if the s2sc policy setting is on each export.
Is this level of complication really necessary? I just
don't see why people would not want to make copies
on all exports faster.

Is not having this option a showstopper to enabling it?

>
>
>> 2) I'm getting asked why server-side copy isn't on by default.
>
> And your answer to that was "we haven't figured out how to
> guarantee security when it's enabled".
I'm thinking the servers will be behind a firewall
which by definition makes them secure.

Now if there is a malicious app throwing COPY calls
with rogue addresses behind the firewall, is that
something we really need to protect from? The
network has already been compromised.

As Olga pointed out... clustered servers will have a
very fast connection between them, which
is something we should take advantage of... IMHO

steved.

>
>
>> So I guess the requirement to set inter_copy_offload_enable is
>> too much. How does requiring more complicated configuration
>> answer that concern?
>
> It answers the concern by letting local administrators choose
> to enable or disable s2sc based on their own security needs.
>
>
>> 3) There's interest in allowing unprivileged NFS mounts. That's
>> more of a security risk than this. What's the client
>> maintainers' judgement about unprivileged NFS mounts? Do they
>> think that would be safe to allow by default in distros? If so,
>> then we're certainly fine here.
>
> Unprivileged mounting seems like a different question to me.
> Related, possibly, but not the same. I'd rather leave that
> discussion to another thread.
>
>
> --
> Chuck Lever
>
>
>

2021-10-21 13:57:03

by J. Bruce Fields

[permalink] [raw]
Subject: Re: server-to-server copy by default

On Wed, Oct 20, 2021 at 07:04:53PM +0000, Chuck Lever III wrote:
> Not if the s2sc policy setting is on each export.

I don't want anyone investing time in writing code to enable this
configuration for the kernel and mountd and exportfs, documenting it,
explaining it to users repeatedly, etc., all for a problem we're likely to
eventually declare not a problem.

--b.

2021-10-21 14:03:24

by J. Bruce Fields

[permalink] [raw]
Subject: Re: server-to-server copy by default

Thanks for the persistence:

On Wed, Oct 20, 2021 at 10:00:41PM -0700, [email protected] wrote:
> The attack can come from the replies of the source server or requests
> from the source server to the destination server via the back channel.
> One of possible attack in the reply is BAD_STATEID which was handled
> by the client code as mentioned by Olga.
>
> Here is the list of NFS requests made from the destination to the
> source server:
>
> EXCHANGE_ID
> CREATE_SESSION
> RECLAIM_COMLETE
> SEQUENCE
> PUTROOTFH
> PUTHF
> GETFH
> GETATTR
> READ/READ_PLUS
> DESTROY_SESSION
> DESTROY_CLIENTID
>
> Do you think we should review all replies from these requests to make
> sure error replies do not cause problems for the destination server?

That's exactly the sort of analysis I was curious to see, yes.

(I doubt the PUTROOTFH, PUTFH, GETFH, and GETATTR are really necessary,
I wonder if there's any way we could just bypass them in our case. I
don't know, maybe that's more trouble than it's worth.)

> same for the back channel ops:
>
> OP_CB_GETATTR
> OP_CB_RECALL
> OP_CB_LAYOUTRECALL
> OP_CB_NOTIFY
> OP_CB_PUSH_DELEG
> OP_CB_RECALL_ANY
> OP_CB_RECALLABLE_OBJ_AVAIL
> OP_CB_RECALL_SLOT
> OP_CB_SEQUENCE
> OP_CB_WANTS_CANCELLED
> OP_CB_NOTIFY_LOCK
> OP_CB_NOTIFY_DEVICEID
> OP_CB_OFFLOAD

There shouldn't be any need for callbacks at all. We might be able to
get away without even setting up a backchannel. But, yes, if the server
tries to send one anyway, it'd be good to know we do something
reasonable.

--b.

2021-10-21 14:14:37

by J. Bruce Fields

[permalink] [raw]
Subject: Re: server-to-server copy by default

On Wed, Oct 20, 2021 at 07:04:53PM +0000, Chuck Lever III wrote:
> Unprivileged mounting seems like a different question to me.
> Related, possibly, but not the same. I'd rather leave that
> discussion to another thread.

Well, I'd be curious if client maintainers have any thoughts.

The NFS client still disallows unprivileged mounts, right? Is it
something you think could be supported, and if so, do you have an idea
what's left to do?

Trond, I remember asking you about unprivileged mounts at a bakeathon a
few years ago, and at the time you seemed to think it'd be a reasonable
thing to do eventually, and the one obstacle you mentioned was that the
client wasn't capable of maintaining separate state in different
namespaces. That's fixed, isn't it?

--b.

2021-10-21 14:23:18

by Trond Myklebust

[permalink] [raw]
Subject: Re: server-to-server copy by default

On Thu, 2021-10-21 at 10:13 -0400, Bruce Fields wrote:
> On Wed, Oct 20, 2021 at 07:04:53PM +0000, Chuck Lever III wrote:
> > Unprivileged mounting seems like a different question to me.
> > Related, possibly, but not the same. I'd rather leave that
> > discussion to another thread.
>
> Well, I'd be curious if client maintainers have any thoughts.
>
> The NFS client still disallows unprivileged mounts, right?  Is it
> something you think could be supported, and if so, do you have an
> idea
> what's left to do?
>
> Trond, I remember asking you about unprivileged mounts at a bakeathon
> a
> few years ago, and at the time you seemed to think it'd be a
> reasonable
> thing to do eventually, and the one obstacle you mentioned was that
> the
> client wasn't capable of maintaining separate state in different
> namespaces.  That's fixed, isn't it?
>

Yes, that's mostly fixed. As far as I'm concerned, there should be no
major obstacles to allowing unprivileged mounts in their own private
net namespace.
The one thing to note, though, is that AUTH_SYS still requires that the
container be given a CAP_NET_BIND_SERVICE privilege to allow binding to
a privileged port.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2021-10-21 14:39:18

by J. Bruce Fields

[permalink] [raw]
Subject: Re: server-to-server copy by default

On Thu, Oct 21, 2021 at 02:22:13PM +0000, Trond Myklebust wrote:
> Yes, that's mostly fixed. As far as I'm concerned, there should be no
> major obstacles to allowing unprivileged mounts in their own private
> net namespace.

Do you think it'd be a reasonable thing to turn on now by default in
distros, or something the admin should have to opt in to only on trusted
networks?

I'm wondering how much confidence we have in the client's robustness in
the face of possibly compromised servers.

> The one thing to note, though, is that AUTH_SYS still required that the
> container be given a CAP_NET_BIND_SERVICE privilege to allow binding to
> a privileged port.

Got it, thanks.

--b.

2021-10-22 06:35:34

by Dai Ngo

[permalink] [raw]
Subject: Re: server-to-server copy by default

On 10/21/21 7:02 AM, Bruce Fields wrote:
> Thanks for the persistence:
>
> On Wed, Oct 20, 2021 at 10:00:41PM -0700, [email protected] wrote:
>> The attack can come from the replies of the source server or requests
>> from the source server to the destination server via the back channel.
>> One of possible attack in the reply is BAD_STATEID which was handled
>> by the client code as mentioned by Olga.
>>
>> Here is the list of NFS requests made from the destination to the
>> source server:
>>
>> EXCHANGE_ID
>> CREATE_SESSION
>> RECLAIM_COMLETE
>> SEQUENCE
>> PUTROOTFH
>> PUTHF
>> GETFH
>> GETATTR
>> READ/READ_PLUS
>> DESTROY_SESSION
>> DESTROY_CLIENTID
>>
>> Do you think we should review all replies from these requests to make
>> sure error replies do not cause problems for the destination server?
> That's the exactly the sort of analysis I was curious to see, yes.

I will go through these requests to see if there is anything that
we need to do to ensure the destination does not react negatively
to the replies.

>
> (I doubt the PUTROOTFH, PUTFH, GETFH, and GETATTR are really necessary,
> I wonder if there's any way we could just bypass them in our case. I
> don't know, maybe that's more trouble than it's worth.)

I'll take a look but I think we should avoid modifying the client
code if possible.

>
>> same for the back channel ops:
>>
>> OP_CB_GETATTR
>> OP_CB_RECALL
>> OP_CB_LAYOUTRECALL
>> OP_CB_NOTIFY
>> OP_CB_PUSH_DELEG
>> OP_CB_RECALL_ANY
>> OP_CB_RECALLABLE_OBJ_AVAIL
>> OP_CB_RECALL_SLOT
>> OP_CB_SEQUENCE
>> OP_CB_WANTS_CANCELLED
>> OP_CB_NOTIFY_LOCK
>> OP_CB_NOTIFY_DEVICEID
>> OP_CB_OFFLOAD
> There shouldn't be any need for callbacks at all. We might be able to
> get away without even setting up a backchannel. But, yes, if the server
> tries to send one anyway, it'd be good to know we do something
> reasonable.

Or we could somehow avoid specifying the back channel when creating the
session. I will report back.

-Dai

>
> --b.

2021-10-22 12:59:54

by J. Bruce Fields

[permalink] [raw]
Subject: Re: server-to-server copy by default

On Thu, Oct 21, 2021 at 11:34:44PM -0700, [email protected] wrote:
> On 10/21/21 7:02 AM, Bruce Fields wrote:
> >Thanks for the persistence:
> >
> >On Wed, Oct 20, 2021 at 10:00:41PM -0700, [email protected] wrote:
> >>The attack can come from the replies of the source server or requests
> >>from the source server to the destination server via the back channel.
> >>One of possible attack in the reply is BAD_STATEID which was handled
> >>by the client code as mentioned by Olga.
> >>
> >>Here is the list of NFS requests made from the destination to the
> >>source server:
> >>
> >> EXCHANGE_ID
> >> CREATE_SESSION
> >> RECLAIM_COMLETE
> >> SEQUENCE
> >> PUTROOTFH
> >> PUTHF
> >> GETFH
> >> GETATTR
> >> READ/READ_PLUS
> >> DESTROY_SESSION
> >> DESTROY_CLIENTID
> >>
> >>Do you think we should review all replies from these requests to make
> >>sure error replies do not cause problems for the destination server?
> >That's the exactly the sort of analysis I was curious to see, yes.
>
> I will go through these requests to see if is there is anything that
> we need to do to ensure the destination does not react negatively
> on the replies.
>
> >
> >(I doubt the PUTROOTFH, PUTFH, GETFH, and GETATTR are really necessary,
> >I wonder if there's any way we could just bypass them in our case. I
> >don't know, maybe that's more trouble than it's worth.)
>
> I'll take a look but I think we should avoid modifying the client
> code if possible.
>
> >
> >>same for the back channel ops:
> >>
> >> OP_CB_GETATTR
> >> OP_CB_RECALL
> >> OP_CB_LAYOUTRECALL
> >> OP_CB_NOTIFY
> >> OP_CB_PUSH_DELEG
> >> OP_CB_RECALL_ANY
> >> OP_CB_RECALLABLE_OBJ_AVAIL
> >> OP_CB_RECALL_SLOT
> >> OP_CB_SEQUENCE
> >> OP_CB_WANTS_CANCELLED
> >> OP_CB_NOTIFY_LOCK
> >> OP_CB_NOTIFY_DEVICEID
> >> OP_CB_OFFLOAD
> >There shouldn't be any need for callbacks at all. We might be able to
> >get away without even setting up a backchannel. But, yes, if the server
> >tries to send one anyway, it'd be good to know we do something
> >reasonable.
>
> or do not specify the back channel when creating the session somehow.
> I will report back.

Thank you, Dai!

--b.

2021-11-01 17:38:50

by Dai Ngo

[permalink] [raw]
Subject: Re: server-to-server copy by default


On 10/21/21 11:34 PM, [email protected] wrote:
> On 10/21/21 7:02 AM, Bruce Fields wrote:
>> On Wed, Oct 20, 2021 at 10:00:41PM -0700, [email protected] wrote:
>>> The attack can come from the replies of the source server or requests
>>> from the source server to the destination server via the back channel.
>>> One of possible attack in the reply is BAD_STATEID which was handled
>>> by the client code as mentioned by Olga.
>>>
>>> Here is the list of NFS requests made from the destination to the
>>> source server:
>>>
>>>          EXCHANGE_ID
>>>          CREATE_SESSION
>>>          RECLAIM_COMLETE
>>>          SEQUENCE
>>>          PUTROOTFH
>>>          PUTHF
>>>          GETFH
>>>          GETATTR
>>>          READ/READ_PLUS
>>>          DESTROY_SESSION
>>>          DESTROY_CLIENTID
>>>
>>> Do you think we should review all replies from these requests to make
>>> sure error replies do not cause problems for the destination server?
>> That's the exactly the sort of analysis I was curious to see, yes.
>
> I will go through these requests to see if is there is anything that
> we need to do to ensure the destination does not react negatively
> on the replies.

This still needs to be done.

>
>>
>> (I doubt the PUTROOTFH, PUTFH, GETFH, and GETATTR are really necessary,
>> I wonder if there's any way we could just bypass them in our case.  I
>> don't know, maybe that's more trouble than it's worth.)
>
> I'll take a look but I think we should avoid modifying the client
> code if possible.
>
>>
>>> same for the back channel ops:
>>>
>>>          OP_CB_GETATTR
>>>          OP_CB_RECALL
>>>          OP_CB_LAYOUTRECALL
>>>          OP_CB_NOTIFY
>>>          OP_CB_PUSH_DELEG
>>>          OP_CB_RECALL_ANY
>>>          OP_CB_RECALLABLE_OBJ_AVAIL
>>>          OP_CB_RECALL_SLOT
>>>          OP_CB_SEQUENCE
>>>          OP_CB_WANTS_CANCELLED
>>>          OP_CB_NOTIFY_LOCK
>>>          OP_CB_NOTIFY_DEVICEID
>>>          OP_CB_OFFLOAD
>> There shouldn't be any need for callbacks at all.  We might be able to
>> get away without even setting up a backchannel.  But, yes, if the server
>> tries to send one anyway, it'd be good to know we do something
>> reasonable.
>
> or do not specify the back channel when creating the session somehow.
> I will report back.

We cannot disable the back channel of the SSC v4.2 mount since it might
share the same connection with a regular NFSv4.2 mount from the destination
to the source server. We need to be able to identify whether a back channel
request is for the regular mount or the SSC mount, and if it's for the SSC
mount then drop the request.

To differentiate back channel requests for SSC vs. regular mounts I plan to
do the following:

Mark the nfs_server of the SSC mount with a flag (NFS_MOUNT_SSC).

When a back channel request comes in, we check all the nfs_server's
that share the same nfs_client based on the clientid in the request.

If there are one or more nfs_server's sharing the same nfs_client and
none of them is marked as NFS_MOUNT_SSC then we allow the request to
be processed as normal (non-SSC case).

If there are multiple nfs_server's and one of them is marked as NFS_MOUNT_SSC
then we allow the request to be processed. This is because if there
is a regular mount from destination to source server that means the
source server is already trusted by the destination's admin.

If there is only one nfs_server and it's marked as NFS_MOUNT_SSC then
we drop that request.
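
The decision rule in the steps above can be modelled in a few lines. This is
only an illustrative sketch in Python, not kernel code: the NfsServer class
and allow_backchannel_request() are hypothetical names, and is_ssc_mount
stands in for the proposed NFS_MOUNT_SSC flag. The steps do not say what to
do when there are no mounts, or several mounts that are all SSC; the sketch
assumes both cases are dropped, following the stated reasoning that only a
regular mount shows the source server is trusted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NfsServer:
    """Hypothetical stand-in for one nfs_server sharing an nfs_client."""
    name: str
    is_ssc_mount: bool = False  # models the proposed NFS_MOUNT_SSC flag


def allow_backchannel_request(servers: List[NfsServer]) -> bool:
    """Decide whether a back channel request should be processed.

    `servers` is the list of mounts that share the nfs_client matched
    by the clientid in the request.
    """
    if not servers:
        # Assumption: no mount at all is not covered by the plan; drop.
        return False
    # Allow as soon as at least one regular (non-SSC) mount exists,
    # because the admin has then already chosen to trust that server.
    # A lone SSC mount (or, by assumption, only SSC mounts) is dropped.
    return any(not s.is_ssc_mount for s in servers)


if __name__ == "__main__":
    regular = NfsServer("regular")
    ssc = NfsServer("ssc", is_ssc_mount=True)
    print(allow_backchannel_request([regular]))       # non-SSC case
    print(allow_backchannel_request([regular, ssc]))  # mixed case
    print(allow_backchannel_request([ssc]))           # SSC-only case
```

Written this way, the three cases reduce to one test: is there any regular
mount on this nfs_client?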

Do you see any problem with this approach, or do you have any suggestions
on how to handle this?

Thanks,
-Dai

2021-11-01 18:23:08

by Charles Hedrick

[permalink] [raw]
Subject: Re: server-to-server copy by default

I am in general concerned about turning on new features before basic ones work reliably. We’ve had enough different failures that we’ve backed off to NFS 3 for file systems with heavy use.

We first tried turning off delegation. That helped a lot. But we just ran into two different machines hung trying to lock Chrome’s profile. (I sent a bit of information on that one previously.) We had to restart NFS on the server to fix it, and that caused us to lose a bunch of VMs. (That shouldn’t have happened. It looks like ESX misbehaved.) If I could turn off NFS4 locking I would.

> On Oct 20, 2021, at 11:54 AM, J. Bruce Fields <[email protected]> wrote:
>
> knfsd has supported server-to-server copy for a couple years (since
> 5.5). You have set a module parameter to enable it. I'm getting asked
> when we could turn that parameter on by default.
>
> I've got a couple vague criteria: one just general maturity, the other a
> security question:
>
> 1. General maturity: the only reports I recall seeing are from testers.
> Is anyone using this? Does it work for them? Do they find a benefit?
> Maybe we could turn it on by default in one distro (Fedora?) and promote
> it a little and see what that turns up?
>
> 2. Security question: with server-to-server copy enabled, you can send
> the server a COPY call with any random address, and the server will
> mount that address, open a file, and read from it. Is that safe?
>
> Normally we only mount servers that were chosen by root. Here we'll
> mount any random server that some client told us to. What's the worst
> that random server can do? Do we trust our xdr decoding? Can it DOS us
> by throwing the client's state recovery code into some loop with weird
> error returns? Etc.
>
> Maybe it's fine. I'm OK with some level of risk. I just want to make
> sure somebody's thought this through.
>
> There's also interest in allowing unprivileged NFS mounts, but I don't
> think we've turned that on yet, partly for similar reasons. This is a
> subset of that problem.
>
> --b.

2021-11-01 19:26:15

by Steve Dickson

[permalink] [raw]
Subject: Re: server-to-server copy by default

Hello,

On 11/1/21 14:22, Charles Hedrick wrote:
> I am in general concerned about turning on new features before basic ones work reliably. We’ve had enough different failures that we’ve backup up to NFS 3 for file systems with heavy use.
>
> We first tried turning off delegation. That helped a lot. But we just ran into a two different machine hung trying to lock Chome’s profile. (I sent a bit of information on that one previously.) We had to restart NFS on the server to fix it, and that caused us to lose a bunch of VMs. (That shouldn’t have happened. It looks like ESX misbehaved.) If I could turn off NFS4 locking I would.
This is the reason I was hoping not to make this a global switch
but a per-export switch...

Question... Do you do many server-to-server copies in your world?
Meaning a client copying from one server to another?

steved.

>
>> On Oct 20, 2021, at 11:54 AM, J. Bruce Fields <[email protected]> wrote:
>>
>> knfsd has supported server-to-server copy for a couple years (since
>> 5.5). You have set a module parameter to enable it. I'm getting asked
>> when we could turn that parameter on by default.
>>
>> I've got a couple vague criteria: one just general maturity, the other a
>> security question:
>>
>> 1. General maturity: the only reports I recall seeing are from testers.
>> Is anyone using this? Does it work for them? Do they find a benefit?
>> Maybe we could turn it on by default in one distro (Fedora?) and promote
>> it a little and see what that turns up?
>>
>> 2. Security question: with server-to-server copy enabled, you can send
>> the server a COPY call with any random address, and the server will
>> mount that address, open a file, and read from it. Is that safe?
>>
>> Normally we only mount servers that were chosen by root. Here we'll
>> mount any random server that some client told us to. What's the worst
>> that random server can do? Do we trust our xdr decoding? Can it DOS us
>> by throwing the client's state recovery code into some loop with weird
>> error returns? Etc.
>>
>> Maybe it's fine. I'm OK with some level of risk. I just want to make
>> sure somebody's thought this through.
>>
>> There's also interest in allowing unprivileged NFS mounts, but I don't
>> think we've turned that on yet, partly for similar reasons. This is a
>> subset of that problem.
>>
>> --b.
>

2021-11-01 19:34:04

by J. Bruce Fields

[permalink] [raw]
Subject: Re: server-to-server copy by default

On Mon, Nov 01, 2021 at 10:37:11AM -0700, [email protected] wrote:
>
> On 10/21/21 11:34 PM, [email protected] wrote:
> >On 10/21/21 7:02 AM, Bruce Fields wrote:
> >>On Wed, Oct 20, 2021 at 10:00:41PM -0700, [email protected] wrote:
> >>>The attack can come from the replies of the source server or requests
> >>>from the source server to the destination server via the back channel.
> >>>One of possible attack in the reply is BAD_STATEID which was handled
> >>>by the client code as mentioned by Olga.
> >>>
> >>>Here is the list of NFS requests made from the destination to the
> >>>source server:
> >>>
> >>>         EXCHANGE_ID
> >>>         CREATE_SESSION
> >>>         RECLAIM_COMLETE
> >>>         SEQUENCE
> >>>         PUTROOTFH
> >>>         PUTHF
> >>>         GETFH
> >>>         GETATTR
> >>>         READ/READ_PLUS
> >>>         DESTROY_SESSION
> >>>         DESTROY_CLIENTID
> >>>
> >>>Do you think we should review all replies from these requests to make
> >>>sure error replies do not cause problems for the destination server?
> >>That's the exactly the sort of analysis I was curious to see, yes.
> >
> >I will go through these requests to see if is there is anything that
> >we need to do to ensure the destination does not react negatively
> >on the replies.
>
> still need to be done.
>
> >
> >>
> >>(I doubt the PUTROOTFH, PUTFH, GETFH, and GETATTR are really necessary,
> >>I wonder if there's any way we could just bypass them in our case.  I
> >>don't know, maybe that's more trouble than it's worth.)
> >
> >I'll take a look but I think we should avoid modifying the client
> >code if possible.
> >
> >>
> >>>same for the back channel ops:
> >>>
> >>>         OP_CB_GETATTR
> >>>         OP_CB_RECALL
> >>>         OP_CB_LAYOUTRECALL
> >>>         OP_CB_NOTIFY
> >>>         OP_CB_PUSH_DELEG
> >>>         OP_CB_RECALL_ANY
> >>>         OP_CB_RECALLABLE_OBJ_AVAIL
> >>>         OP_CB_RECALL_SLOT
> >>>         OP_CB_SEQUENCE
> >>>         OP_CB_WANTS_CANCELLED
> >>>         OP_CB_NOTIFY_LOCK
> >>>         OP_CB_NOTIFY_DEVICEID
> >>>         OP_CB_OFFLOAD
> >>There shouldn't be any need for callbacks at all.  We might be able to
> >>get away without even setting up a backchannel.  But, yes, if the server
> >>tries to send one anyway, it'd be good to know we do something
> >>reasonable.
> >
> >or do not specify the back channel when creating the session somehow.
> >I will report back.
>
> We can not disable the back channel of the SSC v4.2 mount since it might
> share the same connection with a regular NFSv4.2 mount from the destination
> to the source server.

Hm.

Well, now that I think of it, a backchannel is probably required for the
SSC case anyway. (I think CB_RECALL_SLOT is mandatory to support?)

--b.

2021-11-01 19:44:55

by Charles Hedrick

[permalink] [raw]
Subject: Re: server-to-server copy by default

I don’t think it’s a common operation. But there’s enough going on that it’s hard to be sure.

> On Nov 1, 2021, at 3:25 PM, Steve Dickson <[email protected]> wrote:
>
> Hello,
>
> On 11/1/21 14:22, Charles Hedrick wrote:
>> I am in general concerned about turning on new features before basic ones work reliably. We’ve had enough different failures that we’ve backup up to NFS 3 for file systems with heavy use.
>> We first tried turning off delegation. That helped a lot. But we just ran into a two different machine hung trying to lock Chome’s profile. (I sent a bit of information on that one previously.) We had to restart NFS on the server to fix it, and that caused us to lose a bunch of VMs. (That shouldn’t have happened. It looks like ESX misbehaved.) If I could turn off NFS4 locking I would.
> This is the reason I was hopping not make this a global switch
> but a per export switch...
>
> Question... Do you do many server to server copies in your world?
> Meaning a client coping from one server to another?
>
> steved.
>
>>> On Oct 20, 2021, at 11:54 AM, J. Bruce Fields <[email protected]> wrote:
>>>
>>> knfsd has supported server-to-server copy for a couple years (since
>>> 5.5). You have set a module parameter to enable it. I'm getting asked
>>> when we could turn that parameter on by default.
>>>
>>> I've got a couple vague criteria: one just general maturity, the other a
>>> security question:
>>>
>>> 1. General maturity: the only reports I recall seeing are from testers.
>>> Is anyone using this? Does it work for them? Do they find a benefit?
>>> Maybe we could turn it on by default in one distro (Fedora?) and promote
>>> it a little and see what that turns up?
>>>
>>> 2. Security question: with server-to-server copy enabled, you can send
>>> the server a COPY call with any random address, and the server will
>>> mount that address, open a file, and read from it. Is that safe?
>>>
>>> Normally we only mount servers that were chosen by root. Here we'll
>>> mount any random server that some client told us to. What's the worst
>>> that random server can do? Do we trust our xdr decoding? Can it DOS us
>>> by throwing the client's state recovery code into some loop with weird
>>> error returns? Etc.
>>>
>>> Maybe it's fine. I'm OK with some level of risk. I just want to make
>>> sure somebody's thought this through.
>>>
>>> There's also interest in allowing unprivileged NFS mounts, but I don't
>>> think we've turned that on yet, partly for similar reasons. This is a
>>> subset of that problem.
>>>
>>> --b.
>

2021-11-01 19:56:35

by Dai Ngo

[permalink] [raw]
Subject: Re: server-to-server copy by default


On 11/1/21 12:33 PM, Bruce Fields wrote:
> On Mon, Nov 01, 2021 at 10:37:11AM -0700, [email protected] wrote:
>> On 10/21/21 11:34 PM, [email protected] wrote:
>>> On 10/21/21 7:02 AM, Bruce Fields wrote:
>>>> On Wed, Oct 20, 2021 at 10:00:41PM -0700, [email protected] wrote:
>>>>> The attack can come from the replies of the source server or requests
>>>> >from the source server to the destination server via the back channel.
>>>>> One of possible attack in the reply is BAD_STATEID which was handled
>>>>> by the client code as mentioned by Olga.
>>>>>
>>>>> Here is the list of NFS requests made from the destination to the
>>>>> source server:
>>>>>
>>>>>          EXCHANGE_ID
>>>>>          CREATE_SESSION
>>>>>          RECLAIM_COMLETE
>>>>>          SEQUENCE
>>>>>          PUTROOTFH
>>>>>          PUTHF
>>>>>          GETFH
>>>>>          GETATTR
>>>>>          READ/READ_PLUS
>>>>>          DESTROY_SESSION
>>>>>          DESTROY_CLIENTID
>>>>>
>>>>> Do you think we should review all replies from these requests to make
>>>>> sure error replies do not cause problems for the destination server?
>>>> That's the exactly the sort of analysis I was curious to see, yes.
>>> I will go through these requests to see if is there is anything that
>>> we need to do to ensure the destination does not react negatively
>>> on the replies.
>> still need to be done.
>>
>>>> (I doubt the PUTROOTFH, PUTFH, GETFH, and GETATTR are really necessary,
>>>> I wonder if there's any way we could just bypass them in our case.  I
>>>> don't know, maybe that's more trouble than it's worth.)
>>> I'll take a look but I think we should avoid modifying the client
>>> code if possible.
>>>
>>>>> same for the back channel ops:
>>>>>
>>>>>          OP_CB_GETATTR
>>>>>          OP_CB_RECALL
>>>>>          OP_CB_LAYOUTRECALL
>>>>>          OP_CB_NOTIFY
>>>>>          OP_CB_PUSH_DELEG
>>>>>          OP_CB_RECALL_ANY
>>>>>          OP_CB_RECALLABLE_OBJ_AVAIL
>>>>>          OP_CB_RECALL_SLOT
>>>>>          OP_CB_SEQUENCE
>>>>>          OP_CB_WANTS_CANCELLED
>>>>>          OP_CB_NOTIFY_LOCK
>>>>>          OP_CB_NOTIFY_DEVICEID
>>>>>          OP_CB_OFFLOAD
>>>> There shouldn't be any need for callbacks at all.  We might be able to
>>>> get away without even setting up a backchannel.  But, yes, if the server
>>>> tries to send one anyway, it'd be good to know we do something
>>>> reasonable.
>>> or do not specify the back channel when creating the session somehow.
>>> I will report back.
>> We can not disable the back channel of the SSC v4.2 mount since it might
>> share the same connection with a regular NFSv4.2 mount from the destination
>> to the source server.
> Hm.
>
> Well, now that I think of it, a backchannel is probably required for the
> SSC case anyway. (I think CB_RECALL_SLOT is mandatory to support?)

I think the back channel is not required. From Sec 13 of RFC 7862:

The REQUIRED or OPTIONAL designation for callback operations sent by
the server is for both the client and server. Generally, the client
has the option of creating the backchannel and sending the operations
on the forechannel that will be a catalyst for the server sending
callback operations. A partial exception is CB_RECALL_SLOT; the only
way the client can avoid supporting this operation is by not creating
a backchannel.

I have not found a clean/simple way to avoid creating the back channel
for the SSC mount. The reason is that the SSC mount can be using an
existing connection of a regular mount to the source server, or a
regular mount can happen after the SSC mount.

It seems that we should support the back channel for the SSC mount but do a
due-diligence check to make sure the requests come from a trusted source,
and also limit the number of ops supported on the SSC back channel.
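
One way to picture "limit the number of ops" is an allowlist checked against
every op in an incoming callback compound. The sketch below is illustrative
Python, not kernel code. The set SSC_ALLOWED_CB_OPS and the function
ssc_callback_allowed() are hypothetical, and the choice of allowed ops
(CB_SEQUENCE plus CB_RECALL_SLOT, the op the RFC text singles out) is an
assumption made for the example, not something decided in this thread.

```python
from typing import Iterable

# Hypothetical allowlist for the SSC back channel (an assumption).
SSC_ALLOWED_CB_OPS = frozenset({"OP_CB_SEQUENCE", "OP_CB_RECALL_SLOT"})


def ssc_callback_allowed(ops: Iterable[str]) -> bool:
    """Accept a callback compound only if every op is on the allowlist.

    An empty compound is rejected, since there is nothing to process.
    """
    ops = list(ops)
    return bool(ops) and all(op in SSC_ALLOWED_CB_OPS for op in ops)


if __name__ == "__main__":
    print(ssc_callback_allowed(["OP_CB_SEQUENCE", "OP_CB_RECALL_SLOT"]))
    print(ssc_callback_allowed(["OP_CB_SEQUENCE", "OP_CB_RECALL"]))
```

Rejecting the whole compound when any op falls outside the list keeps the
check simple; a real implementation would also have to pick an error to
return to the source server.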

-Dai

>
> --b.