DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2A3532086B
MIME-Version: 1.0
References: <20181116142627.GA19946@fieldses.org> <CAN-5tyGsDAd2hkaw5nONdS2TzRy1qK7xyFDpNeT7Jsd9ZrH7+g@mail.gmail.com>
 <20181116175645.GA21852@fieldses.org> <20181116180118.GB21852@fieldses.org>
 <CAN-5tyGRy5vi4P-jMTmV7cto8b_6s5Cuop7dTN0Y7witpzr2pQ@mail.gmail.com>
 <CAN-5tyHX3h6TXJhFeZPGZvAWXXxwaxpAkZtRNV9+L8m5xJ3fVw@mail.gmail.com>
 <20181116193016.GA22304@fieldses.org> <CAN-5tyFDGnCAdQcYJRC15nsqLOmOt2c4+bZN7pEFSL-1PAx2XQ@mail.gmail.com>
 <20181116195844.GB22304@fieldses.org>
In-Reply-To: <20181116195844.GB22304@fieldses.org>
From:   Olga Kornievskaia <aglo@umich.edu>
Date:   Fri, 16 Nov 2018 15:11:58 -0500
Message-ID: <CAN-5tyGhkNkhU0ow8DbEm1RAjC2xiAZXwnCm5i13j-2zXHRv3A@mail.gmail.com>
Subject: Re: handle_async_copy calling kzalloc under spinlock
To:     "J. Bruce Fields" <bfields@fieldses.org>
Cc:     Olga Kornievskaia <kolga@netapp.com>,
        linux-nfs <linux-nfs@vger.kernel.org>,
        Anna Schumaker <Anna.Schumaker@netapp.com>
Content-Type: text/plain; charset="UTF-8"
Sender: linux-nfs-owner@vger.kernel.org
Precedence: bulk

On Fri, Nov 16, 2018 at 2:58 PM J. Bruce Fields <bfields@fieldses.org> wrote:
>
> On Fri, Nov 16, 2018 at 02:49:00PM -0500, Olga Kornievskaia wrote:
> > On Fri, Nov 16, 2018 at 2:30 PM J. Bruce Fields <bfields@fieldses.org> wrote:
> > >
> > > On Fri, Nov 16, 2018 at 01:52:29PM -0500, Olga Kornievskaia wrote:
> > > > On Fri, Nov 16, 2018 at 1:30 PM Olga Kornievskaia <aglo@umich.edu> wrote:
> > > > > Then how does the copy knows not to go wait for the callback? Copy
> > > > > checks the pending_callback list to see if received a callback. If
> > > > > not, it puts itself on the copy list and goes to sleep. The callback,
> > > > > checks the copy list and if it finds a copy signals it, if not it puts
> > > > > itself on the pending_callback list. a lock is held over checking one
> > > > > list and putting yourself on the other.
> > >
> > > OK, apologies, I don't really understand those data structures yet, but
> > > something seems wrong to me.
> > >
> > > Under what circumstances could we recieve a CB_OFFLOAD without having
> > > started the corresponding copy already?
> >
> > It can receive a CB_OFFLOAD before it receives a reply to the COPY.
> > It's possible and I can trigger it during testing when doing a really
> > short copy. The copy is done and callback thread sends a reply.
> > CB_OFFLOAD call and COPY reply can be switched on the server or on the
> > processing on the client.
>
> That race is discussed in
> https://tools.ietf.org/html/rfc5661#section-2.10.6.3 and is supposed to
> be dealt with by using referring triples and/or returning DELAY.

I believe those are suggestions and not mandates? A client can't rely
that the server will implement referring sequence information. Sending
"delay" to the server might be an option but it's an option that most
like will interfere with performance as well?

> > > And shouldn't CB_OFFLOAD be returning bad_stateid in the case it doesn't
> > > recognize the given stateid?
> >
> > It could but what should the server do in this case. I would imagine
> > it wouldn't do anything. There is nothing it can do. So now we have a
> > copy that send the call and is going to wait on the reply which will
> > never come as the 1st one came and we rejected it and now copy will
> > wait forever.
> >
> > Please describe what "is wrong" with the current implementation.  I
> > believe it provide a reasonable solution to the race condition.
>
> Looks like a server that sends bad stateids in callbacks could cause you
> to allocate something that will never get freed.

I thought the philosophy was that client shouldn't be coded to a
broken server. If needed, we can later on add a cleanup thread that
goes thru the list and removes really old entries.

>
> --b.
>
> > > It looks like the allocation failure is
> > > the *only* way we'll return an error on CB_OFFLOAD, and that seems
> > > wrong.
> >
> > Yes it is the only error we currently return. Do you see any other
> > useful errors that a client should return (and would be useful to
> > handle on the server). I don't see any need for any more
> > complications.
> >
> > > > > > I also wonder if SERVERFAULT is really the best error for a memory
> > > > > > allocation failure there.
> > > > >
> > > > > I guess EIO or ENOMEM might be better. But I don't think this error
> > > > > gets returned anywhere to the main process.
> > > > >
> > > >
> > > > Wait. It is returning SERVERFAULT because it's the callback server replying
> > > > back to the server's CB_RECALL call and I believe SERVERFAULT is the
> > > > appropriate error here. NFS doesn't have ENOMEM error.
> > >
> > > We could return DELAY if we think it might be worth the server trying
> > > the CB_RECALL again.  (That's what nfsd usually returns on allocation
> > > failures.  I don't know if that's really ideal.)
> >
> > If the client had any smarts to say correct this error that would be
> > useful to return but this is not the case. I don't believe there is a