LinuxLists.cc - once again problems with interrupted slots

2020-06-04 21:53:30

Subject: once again problems with interrupted slots

Hi Trond,

There is a problem with interrupted slots (yet again).

We send an operation to the server and it gets interrupted by the a signal.

We used to send a sole SEQUENCE to remove the problem of having real
operation get an out of the cache reply and failing. Now we are not
doing it again (since 3453d5708 NFSv4.1: Avoid false retries when RPC
calls are interrupted"). So the problem is

We bump the sequence on the next use of the slot, and get SEQ_MISORDERED.

We decrement the number back to the interrupted operation. This gets
us a reply out of the cache. We again fail with REMOTE EIO error.

Going back to the commit's message. I don't see the logic that the
server can't tell if this is a new call or the old one. We used to
send a lone SEQUENCE as a way to protect reuse of slot by a normal
operation. An interrupted slot couldn't have been another SEQUENCE. So
I don't see how the server can't tell a difference between SEQUENCE
and any other operations.

2020-06-05 12:14:18

by Tom Talpey

[permalink] [raw]

Subject: Re: once again problems with interrupted slots

On 6/4/2020 5:21 PM, Olga Kornievskaia wrote:
> Hi Trond,
>
> There is a problem with interrupted slots (yet again).
>
> We send an operation to the server and it gets interrupted by the a signal.
>
> We used to send a sole SEQUENCE to remove the problem of having real
> operation get an out of the cache reply and failing. Now we are not
> doing it again (since 3453d5708 NFSv4.1: Avoid false retries when RPC
> calls are interrupted"). So the problem is
>
> We bump the sequence on the next use of the slot, and get SEQ_MISORDERED.

Misordered? It sounds like the client isn't managing the sequence
number, or perhaps the server never saw the original request, and
is being overly strict.

> We decrement the number back to the interrupted operation. This gets
> us a reply out of the cache. We again fail with REMOTE EIO error.

Ew. The client *decrements* the sequence?

Tom.

> Going back to the commit's message. I don't see the logic that the
> server can't tell if this is a new call or the old one. We used to
> send a lone SEQUENCE as a way to protect reuse of slot by a normal
> operation. An interrupted slot couldn't have been another SEQUENCE. So
> I don't see how the server can't tell a difference between SEQUENCE
> and any other operations.
>
>

2020-06-05 13:26:38

by Olga Kornievskaia

[permalink] [raw]

Subject: Re: once again problems with interrupted slots

On Fri, Jun 5, 2020 at 8:06 AM Tom Talpey <[email protected]> wrote:
>
> On 6/4/2020 5:21 PM, Olga Kornievskaia wrote:
> > Hi Trond,
> >
> > There is a problem with interrupted slots (yet again).
> >
> > We send an operation to the server and it gets interrupted by the a signal.
> >
> > We used to send a sole SEQUENCE to remove the problem of having real
> > operation get an out of the cache reply and failing. Now we are not
> > doing it again (since 3453d5708 NFSv4.1: Avoid false retries when RPC
> > calls are interrupted"). So the problem is
> >
> > We bump the sequence on the next use of the slot, and get SEQ_MISORDERED.
>
> Misordered? It sounds like the client isn't managing the sequence
> number, or perhaps the server never saw the original request, and
> is being overly strict.

Well, both the client and the server are acting appropriately. I'm
not arguing against bumping the sequence. Client sent say REMOVE with
slot=1 seq=5 which got interrupted. So client doesn't know in what
state the slot is left. So it sends the next operation say READ with
slot=1 seq=6. Server acts appropriately too, as it's version of the
slot has seq=4, this request with seq=6 gets SEQ_MISORDERED.

> > We decrement the number back to the interrupted operation. This gets
> > us a reply out of the cache. We again fail with REMOTE EIO error.
>
> Ew. The client *decrements* the sequence?

Yes, as client then decides that server never received seq=5 operation
so it re-sends with seq=5. But in reality seq=5 operation also reached
the server so it has 2 requests REMOVE/READ both with seq=5 for
slot=1. This leads to READ failing with some error.

We used to before send a sole SEQUENCE when we have an interrupted
slot to sync up the seq numbers. But commit 3453d5708 changed that and
I would like to understand why. As I think we need to go back to
sending sole SEQUENCE.

> Tom.
>
> > Going back to the commit's message. I don't see the logic that the
> > server can't tell if this is a new call or the old one. We used to
> > send a lone SEQUENCE as a way to protect reuse of slot by a normal
> > operation. An interrupted slot couldn't have been another SEQUENCE. So
> > I don't see how the server can't tell a difference between SEQUENCE
> > and any other operations.
> >
> >

2020-06-05 13:49:39

by Tom Talpey

[permalink] [raw]

Subject: Re: once again problems with interrupted slots

On 6/5/2020 9:24 AM, Olga Kornievskaia wrote:
> On Fri, Jun 5, 2020 at 8:06 AM Tom Talpey <[email protected]> wrote:
>>
>> On 6/4/2020 5:21 PM, Olga Kornievskaia wrote:
>>> Hi Trond,
>>>
>>> There is a problem with interrupted slots (yet again).
>>>
>>> We send an operation to the server and it gets interrupted by the a signal.
>>>
>>> We used to send a sole SEQUENCE to remove the problem of having real
>>> operation get an out of the cache reply and failing. Now we are not
>>> doing it again (since 3453d5708 NFSv4.1: Avoid false retries when RPC
>>> calls are interrupted"). So the problem is
>>>
>>> We bump the sequence on the next use of the slot, and get SEQ_MISORDERED.
>>
>> Misordered? It sounds like the client isn't managing the sequence
>> number, or perhaps the server never saw the original request, and
>> is being overly strict.
>
> Well, both the client and the server are acting appropriately. I'm
> not arguing against bumping the sequence. Client sent say REMOVE with
> slot=1 seq=5 which got interrupted. So client doesn't know in what
> state the slot is left. So it sends the next operation say READ with
> slot=1 seq=6. Server acts appropriately too, as it's version of the
> slot has seq=4, this request with seq=6 gets SEQ_MISORDERED.

Wait, if the client sent slot=1 seq=5, then unless the connection
breaks, that slot is at seq=5, simple as that. If the operation was
interrupted before sending the request, then the sequence should
not be bumped.

>>> We decrement the number back to the interrupted operation. This gets
>>> us a reply out of the cache. We again fail with REMOTE EIO error.
>>
>> Ew. The client *decrements* the sequence?
>
> Yes, as client then decides that server never received seq=5 operation
> so it re-sends with seq=5. But in reality seq=5 operation also reached
> the server so it has 2 requests REMOVE/READ both with seq=5 for
> slot=1. This leads to READ failing with some error.

But if the connection didn't break, it's reliable therefore the "resend"
must not be performed. This is a new operation, not a retry. It cannot
use the same slot+seq pair. And decrementing the slot is even sillier,
it's reusing *two* seq's at that point.

> We used to before send a sole SEQUENCE when we have an interrupted
> slot to sync up the seq numbers. But commit 3453d5708 changed that and
> I would like to understand why. As I think we need to go back to
> sending sole SEQUENCE.

Sounds like a hack, frankly. What if the server responds the same
way? The client will start burning the wire.

Closing the connection, or never using that slot again, seems to
me the only correct option, given the other behavior described.

Tom.

>>> Going back to the commit's message. I don't see the logic that the
>>> server can't tell if this is a new call or the old one. We used to
>>> send a lone SEQUENCE as a way to protect reuse of slot by a normal
>>> operation. An interrupted slot couldn't have been another SEQUENCE. So
>>> I don't see how the server can't tell a difference between SEQUENCE
>>> and any other operations.
>>>
>>>
>
>

2020-06-05 15:35:18

by Olga Kornievskaia

[permalink] [raw]

Subject: Re: once again problems with interrupted slots

On Fri, Jun 5, 2020 at 9:49 AM Tom Talpey <[email protected]> wrote:
>
> On 6/5/2020 9:24 AM, Olga Kornievskaia wrote:
> > On Fri, Jun 5, 2020 at 8:06 AM Tom Talpey <[email protected]> wrote:
> >>
> >> On 6/4/2020 5:21 PM, Olga Kornievskaia wrote:
> >>> Hi Trond,
> >>>
> >>> There is a problem with interrupted slots (yet again).
> >>>
> >>> We send an operation to the server and it gets interrupted by the a signal.
> >>>
> >>> We used to send a sole SEQUENCE to remove the problem of having real
> >>> operation get an out of the cache reply and failing. Now we are not
> >>> doing it again (since 3453d5708 NFSv4.1: Avoid false retries when RPC
> >>> calls are interrupted"). So the problem is
> >>>
> >>> We bump the sequence on the next use of the slot, and get SEQ_MISORDERED.
> >>
> >> Misordered? It sounds like the client isn't managing the sequence
> >> number, or perhaps the server never saw the original request, and
> >> is being overly strict.
> >
> > Well, both the client and the server are acting appropriately. I'm
> > not arguing against bumping the sequence. Client sent say REMOVE with
> > slot=1 seq=5 which got interrupted. So client doesn't know in what
> > state the slot is left. So it sends the next operation say READ with
> > slot=1 seq=6. Server acts appropriately too, as it's version of the
> > slot has seq=4, this request with seq=6 gets SEQ_MISORDERED.
>
> Wait, if the client sent slot=1 seq=5, then unless the connection
> breaks, that slot is at seq=5, simple as that. If the operation was
> interrupted before sending the request, then the sequence should
> not be bumped.

Connection doesn't drop. We tried not bumping the sequence (i think
that was probably how it was originally done). Then you still get into
the same problem right away. REMOVE and READ will be using seq=5.

> >>> We decrement the number back to the interrupted operation. This gets
> >>> us a reply out of the cache. We again fail with REMOTE EIO error.
> >>
> >> Ew. The client *decrements* the sequence?
> >
> > Yes, as client then decides that server never received seq=5 operation
> > so it re-sends with seq=5. But in reality seq=5 operation also reached
> > the server so it has 2 requests REMOVE/READ both with seq=5 for
> > slot=1. This leads to READ failing with some error.
>
> But if the connection didn't break, it's reliable therefore the "resend"
> must not be performed. This is a new operation, not a retry. It cannot
> use the same slot+seq pair. And decrementing the slot is even sillier,
> it's reusing *two* seq's at that point.

When the slot gets interrupted we don't know when the interruption
happened. If we got SEQ_MISORDERED, it might be because interruption
happened before the request was ever sent to the server, so it's valid
for the seq to stay the same (ie decrementing the seq). I don't see
how decrementing the seq is reusing 2 seq values: only one value is
valid and client is trying to figure out which one.

> > We used to before send a sole SEQUENCE when we have an interrupted
> > slot to sync up the seq numbers. But commit 3453d5708 changed that and
> > I would like to understand why. As I think we need to go back to
> > sending sole SEQUENCE.
>
> Sounds like a hack, frankly. What if the server responds the same
> way? The client will start burning the wire.

Sending the SEQUENCE on the same slot/seqid as an interrupted slot
doesn't lead to any operation failing.

> Closing the connection, or never using that slot again, seems to
> me the only correct option, given the other behavior described.

Not ever using an interrupted slot seems too drastic (we can end up
with a session where all slots are unusable. or losing slots also
means losing ability to send more requests in parallel). I thought
that's given a sequence of events and error codes we should be able to
re-sync the slot.

>
> Tom.
>
>
> >>> Going back to the commit's message. I don't see the logic that the
> >>> server can't tell if this is a new call or the old one. We used to
> >>> send a lone SEQUENCE as a way to protect reuse of slot by a normal
> >>> operation. An interrupted slot couldn't have been another SEQUENCE. So
> >>> I don't see how the server can't tell a difference between SEQUENCE
> >>> and any other operations.
> >>>
> >>>
> >
> >

2020-06-06 01:11:42

by Tom Talpey

[permalink] [raw]

Subject: Re: once again problems with interrupted slots

On 6/5/2020 11:30 AM, Olga Kornievskaia wrote:
> On Fri, Jun 5, 2020 at 9:49 AM Tom Talpey <[email protected]> wrote:
>>
>> On 6/5/2020 9:24 AM, Olga Kornievskaia wrote:
>>> On Fri, Jun 5, 2020 at 8:06 AM Tom Talpey <[email protected]> wrote:
>>>>
>>>> On 6/4/2020 5:21 PM, Olga Kornievskaia wrote:
>>>>> Hi Trond,
>>>>>
>>>>> There is a problem with interrupted slots (yet again).
>>>>>
>>>>> We send an operation to the server and it gets interrupted by the a signal.
>>>>>
>>>>> We used to send a sole SEQUENCE to remove the problem of having real
>>>>> operation get an out of the cache reply and failing. Now we are not
>>>>> doing it again (since 3453d5708 NFSv4.1: Avoid false retries when RPC
>>>>> calls are interrupted"). So the problem is
>>>>>
>>>>> We bump the sequence on the next use of the slot, and get SEQ_MISORDERED.
>>>>
>>>> Misordered? It sounds like the client isn't managing the sequence
>>>> number, or perhaps the server never saw the original request, and
>>>> is being overly strict.
>>>
>>> Well, both the client and the server are acting appropriately. I'm
>>> not arguing against bumping the sequence. Client sent say REMOVE with
>>> slot=1 seq=5 which got interrupted. So client doesn't know in what
>>> state the slot is left. So it sends the next operation say READ with
>>> slot=1 seq=6. Server acts appropriately too, as it's version of the
>>> slot has seq=4, this request with seq=6 gets SEQ_MISORDERED.
>>
>> Wait, if the client sent slot=1 seq=5, then unless the connection
>> breaks, that slot is at seq=5, simple as that. If the operation was
>> interrupted before sending the request, then the sequence should
>> not be bumped.
>
> Connection doesn't drop. We tried not bumping the sequence (i think
> that was probably how it was originally done). Then you still get into
> the same problem right away. REMOVE and READ will be using seq=5.

Well, if the connection has not dropped, the behavior you describe is
violating the protocol in at least two ways:

- it is retransmitting a request on a reliable transport
- it is sending a second message into a busy slot

The last one also has the possibility of overrunning an RDMA credit
limit too, btw.

I don't think you should expect any correct result to come of these
client behaviors.

>>>>> We decrement the number back to the interrupted operation. This gets
>>>>> us a reply out of the cache. We again fail with REMOTE EIO error.
>>>>
>>>> Ew. The client *decrements* the sequence?
>>>
>>> Yes, as client then decides that server never received seq=5 operation
>>> so it re-sends with seq=5. But in reality seq=5 operation also reached
>>> the server so it has 2 requests REMOVE/READ both with seq=5 for
>>> slot=1. This leads to READ failing with some error.
>>
>> But if the connection didn't break, it's reliable therefore the "resend"
>> must not be performed. This is a new operation, not a retry. It cannot
>> use the same slot+seq pair. And decrementing the slot is even sillier,
>> it's reusing *two* seq's at that point.
>
> When the slot gets interrupted we don't know when the interruption
> happened. If we got SEQ_MISORDERED, it might be because interruption
> happened before the request was ever sent to the server, so it's valid
> for the seq to stay the same (ie decrementing the seq). I don't see
> how decrementing the seq is reusing 2 seq values: only one value is
> valid and client is trying to figure out which one.

Well, the client code needs to track whether the request was sent, and
use a new slot if so. If not, then it's reusing sequence values because
it's either retransmitting, or sending a different message, on a slot
which has an outstanding (abandoned) request. Again, protocol violation.

Slots can be thought of as single-request mailboxes, which are then
used serially. (To get N parallel operations, you need N slots.) The
sequence is *not* a window. The only valid new value is previous+1.

>>> We used to before send a sole SEQUENCE when we have an interrupted
>>> slot to sync up the seq numbers. But commit 3453d5708 changed that and
>>> I would like to understand why. As I think we need to go back to
>>> sending sole SEQUENCE.
>>
>> Sounds like a hack, frankly. What if the server responds the same
>> way? The client will start burning the wire.
>
> Sending the SEQUENCE on the same slot/seqid as an interrupted slot
> doesn't lead to any operation failing >
>> Closing the connection, or never using that slot again, seems to
>> me the only correct option, given the other behavior described.
>
> Not ever using an interrupted slot seems too drastic (we can end up
> with a session where all slots are unusable. or losing slots also
> means losing ability to send more requests in parallel). I thought
> that's given a sequence of events and error codes we should be able to
> re-sync the slot.

The only way to "re-sync the slot" is to wait for the server's reply,
or close the connection.

Yes, it's extreme to stop using the slot, and it means the client will
very likely have to request more slot credits. But, it's what the
protocol requires, short of starting over on a new connection.

Tom.

>>>>> Going back to the commit's message. I don't see the logic that the
>>>>> server can't tell if this is a new call or the old one. We used to
>>>>> send a lone SEQUENCE as a way to protect reuse of slot by a normal
>>>>> operation. An interrupted slot couldn't have been another SEQUENCE. So
>>>>> I don't see how the server can't tell a difference between SEQUENCE
>>>>> and any other operations.
>>>>>
>>>>>
>>>
>>>
>
>

2020-06-09 18:30:09

by Chuck Lever III

[permalink] [raw]

Subject: Re: once again problems with interrupted slots

> On Jun 5, 2020, at 9:24 AM, Olga Kornievskaia <[email protected]> wrote:
>
> On Fri, Jun 5, 2020 at 8:06 AM Tom Talpey <[email protected]> wrote:
>>
>> On 6/4/2020 5:21 PM, Olga Kornievskaia wrote:
>>> Hi Trond,
>>>
>>> There is a problem with interrupted slots (yet again).
>>>
>>> We send an operation to the server and it gets interrupted by the a signal.
>>>
>>> We used to send a sole SEQUENCE to remove the problem of having real
>>> operation get an out of the cache reply and failing. Now we are not
>>> doing it again (since 3453d5708 NFSv4.1: Avoid false retries when RPC
>>> calls are interrupted"). So the problem is
>>>
>>> We bump the sequence on the next use of the slot, and get SEQ_MISORDERED.
>>
>> Misordered? It sounds like the client isn't managing the sequence
>> number, or perhaps the server never saw the original request, and
>> is being overly strict.
>
> Well, both the client and the server are acting appropriately. I'm
> not arguing against bumping the sequence. Client sent say REMOVE with
> slot=1 seq=5 which got interrupted. So client doesn't know in what
> state the slot is left. So it sends the next operation say READ with
> slot=1 seq=6. Server acts appropriately too, as it's version of the
> slot has seq=4, this request with seq=6 gets SEQ_MISORDERED.
>
>>> We decrement the number back to the interrupted operation. This gets
>>> us a reply out of the cache. We again fail with REMOTE EIO error.
>>
>> Ew. The client *decrements* the sequence?
>
> Yes, as client then decides that server never received seq=5 operation
> so it re-sends with seq=5. But in reality seq=5 operation also reached
> the server so it has 2 requests REMOVE/READ both with seq=5 for
> slot=1. This leads to READ failing with some error.

> We used to before send a sole SEQUENCE when we have an interrupted
> slot to sync up the seq numbers. But commit 3453d5708 changed that and
> I would like to understand why. As I think we need to go back to
> sending sole SEQUENCE.

I think that's right. The question I have is _when_ a client
should use a SEQUENCE probe to resolve the problem.

>> On Wed Jun 20 17:53:34 2018 -0400, Trond Myklebust wrote:
>> NFSv4.1: Avoid false retries when RPC calls are interrupted
>>
>> A 'false retry' in NFSv4.1 occurs when the client attempts to transmit a
>> new RPC call using a slot+sequence number combination that references an
>> already cached one. Currently, the Linux NFS client will do this if a
>> user process interrupts an RPC call that is in progress.
>> The problem with doing so is that we defeat the main mechanism used by
>> the server to differentiate between a new call and a replayed one. Even
>> if the server is able to perfectly cache the arguments of the old call,
>> it cannot know if the client intended to replay or send a new call.

The first and third sentences together seem to mean that:

The Linux client can transmit a new call using a slot and
sequence that references an already cached one, but it should
never do that.

The fourth sentence suggests that it is entirely up to the
client to ensure this aspect of session operation is correct
because the specification does not require a server to cache
Call arguments.

So 3453d5708 is a client bug fix, and thus probably cannot be
simply reverted without exposing an old bug.

The second sentence implies that the client used to know exactly
when there might be (client-induced) disagreement between the
slot sequence numbers on the client and server.

I'm a little naive about this stuff... but it seems to me that,
in those cases, the client should not retire that slot before
it ascertains whether the client and server sequence numbers
are in agreement.

In other words, instead of having the next slot user deal with
the fallout of an interrupted operation, the client should
ensure the client and server agree on slot state (possibly via
a singleton SEQUENCE probe) _before_ it retires the interrupted
slot. If the disagreement cannot be resolved, then the client
must not use that slot again.

Tom points out that there are only two valid sequence number
values on the server: N (Call received) or N-1 (Call not
received). Outside of those two, slot state is not recoverable.
That should enable slot recovery to be completed (success or
failure) in no more than one or two operations.

Perhaps the fix in 3453d5708 is incorrect or incomplete. Is
there a way to alter it to make it work, or should it be
reverted and replaced?

> On Jun 5, 2020, at 11:30 AM, Olga Kornievskaia <[email protected]> wrote:
> Not ever using an interrupted slot seems too drastic (we can end up
> with a session where all slots are unusable. or losing slots also
> means losing ability to send more requests in parallel). I thought
> that's given a sequence of events and error codes we should be able to
> re-sync the slot.

Maybe not so drastic. The way to prevent a deadlock if all slots
become unusable is to use DESTROY_SESSION and then create a fresh
session. The client could even do that preemptively if there are
more than, say, two or three unusable slots in a session.

Session-draining logic would need to continue to flush in-use
slots, but skip slots that are marked unusable to avoid a hang.

--
Chuck Lever