2017-03-20 16:35:00

by Mkrtchyan, Tigran

Subject: pNFS: invalid IP:port selection when talks to DS



Dear (p)NFS-ors,

We observe a VERY unpleasant situation with pNFS in production.
Our hosts run multiple DSes on different ports, usually 24001-24009.
With CentOS7 (3.10.0-514.6.2.el7.x86_64) we see that the client uses
a wrong port number when it talks to a data server.

If the client uses different DSes on the same host, then at some point it starts
to send data to the wrong port number:

Client <=> MDS:


1 0.000000000 131.169.251.53 → 131.169.51.35 NFS V4 Call OPEN DH: 0x7cbc716b/MIL-68-onebatch-80C-30s-00057.tif.metadata
2 0.001469799 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 1) OPEN StateID: 0xec18
3 0.001578128 131.169.251.53 → 131.169.51.35 NFS V4 Call SETATTR FH: 0x6ccf3dfa
4 0.002657187 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 3) SETATTR
5 0.003243819 131.169.251.53 → 131.169.51.35 NFS V4 Call LAYOUTGET
6 0.014603386 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 5) LAYOUTGET
7 0.014899121 131.169.251.53 → 131.169.51.35 NFS V4 Call GETDEVINFO
8 0.015014216 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 7) GETDEVINFO
Opcode: GETDEVINFO (47)
Status: NFS4_OK (0)
layout type: LAYOUT4_NFSV4_1_FILES (1)
device index: 0
r_netid: tcp
length: 3
contents: tcp
fill bytes: opaque data
r_addr: 131.169.51.50.93.197
length: 20
contents: 131.169.51.50.93.197
r_netid: tcp
length: 3
contents: tcp
fill bytes: opaque data
r_addr: 131.169.51.50.93.197
length: 20
contents: 131.169.51.50.93.197
notification bitmap: 6
notification bitmap: 0
[Main Opcode: GETDEVINFO (47)]

9 0.105442455 131.169.251.53 → 131.169.51.35 NFS V4 Call TEST_STATEID
10 0.105521354 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 9) TEST_STATEID



NOTICE that 131.169.51.50.93.197 corresponds to port 24005 (93 * 256 + 197).
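
For reference, a minimal sketch of that conversion in plain Python. The last two fields of the IPv4 universal address are the high and low bytes of the port. The function name `parse_uaddr` is made up for illustration and is not taken from any NFS code:

```python
from typing import Tuple


def parse_uaddr(uaddr: str) -> Tuple[str, int]:
    """Split an IPv4 universal address 'h1.h2.h3.h4.p1.p2' into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError(f"expected 6 dot-separated fields, got {len(parts)}")
    fields = [int(p) for p in parts]
    if any(not 0 <= f <= 255 for f in fields):
        raise ValueError("every field must be in the range 0..255")
    host = ".".join(parts[:4])
    # The port is a 16-bit value: high byte first, then low byte.
    port = fields[4] * 256 + fields[5]
    return host, port


host, port = parse_uaddr("131.169.51.50.93.197")
print(host, port)  # 131.169.51.50 24005
```

By the same arithmetic, port 24006 would have been advertised as 131.169.51.50.93.198, which is not what GETDEVINFO returned.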

client <=> DS

$ tshark -r ds-write.pcap -n -z conv,tcp
1 0.000000 131.169.251.53 → 131.169.51.50 NFS V4 Call WRITE StateID: 0xff01 Offset: 0 Len: 3968
2 0.000090 131.169.51.50 → 131.169.251.53 NFS V4 Reply (Call In 1) WRITE Status: NFS4ERR_BAD_STATEID
================================================================================
TCP Conversations
Filter:<No Filter>
                                               |       <-      | |       ->      | |     Total     |    Relative    |   Duration   |
                                               | Frames  Bytes | | Frames  Bytes | | Frames  Bytes |      Start     |              |
131.169.51.50:24006  <-> 131.169.251.53:847          1      4240       1       168       2      4408     0.000000000         0.0001
================================================================================

NOTICE that it talks to the DS on port 24006!

Is there a known fix which is missing in CentOS7? I can't reproduce it with
a 4.9 kernel (or it's harder to reproduce).


The packet captures are attached.

Tigran.


Attachments:
ds-write.pcapng (4.41 kB)
mds.pcapng (3.50 kB)

2017-03-20 16:16:44

by Mkrtchyan, Tigran

Subject: Re: pNFS: invalid IP:port selection when talks to DS


Re-sending without attachments.
The capture files can be found at:


client <-> mds: https://desycloud.desy.de/index.php/s/58JFyfMQmNF99pU
client <-> ds: https://desycloud.desy.de/index.php/s/dKf290ikQcifL9K

Tigran.


2017-03-20 20:14:45

by Olga Kornievskaia

Subject: Re: pNFS: invalid IP:port selection when talks to DS

Hi Tigran,

While I don't have an answer to your question, I'd like to point out
that 4.9 is when Andy's session trunking patches went in.

I'm curious: did this client, which is now talking to the DS on port 24006
instead of 24005, earlier talk correctly (legally) to the DS on port 24006?


2017-03-20 20:51:28

by Mkrtchyan, Tigran

Subject: Re: pNFS: invalid IP:port selection when talks to DS

Hi Olga,

----- Original Message -----
> From: "Olga Kornievskaia" <[email protected]>
> To: "Mkrtchyan, Tigran" <[email protected]>
> Cc: "Linux NFS Mailing list" <[email protected]>, "Steve Dickson" <[email protected]>
> Sent: Monday, March 20, 2017 9:14:34 PM
> Subject: Re: pNFS: invalid IP:port selection when talks to DS

> Hi Tigran,
>
> While I don't have an answer to your question, I'd like to point out
> that 4.9 is when Andy's session trunking patches went in.
>
> I'm curious: did this client, which is now talking to the DS on port 24006
> instead of 24005, earlier talk correctly (legally) to the DS on port 24006?

Yes, earlier during testing it had legal access to DS on port 24006.

Tigran.


2017-03-20 21:10:25

by Mkrtchyan, Tigran

Subject: Re: pNFS: invalid IP:port selection when talks to DS


Hi Olga,

You did not have the answer, but you gave me an important hint!
I believe all our DSes on a single host return the same server
owner in EXCHANGE_ID. I guess this could be the reason why the
client decides to talk to another DS.

Tigran.


2017-03-22 16:11:34

by Olga Kornievskaia

Subject: Re: pNFS: invalid IP:port selection when talks to DS

Hi Tigran,

I still don't have the answer to your question, but I'm puzzled
why it "works" with 4.9 (session trunking). The new code checks the
server owner and, if it is the same, adds that address to the
list of addresses to trunk. I'd assume you'd be seeing the same
behavior with the new code. Thus, I'm puzzled. That aside, if you
don't want the new code to trunk between your DSs on the same server,
they should return different server owners.
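
To make the suspected effect concrete, here is a toy model in Python. It is not the kernel code, and whether the 3.10 client really behaves this way is only a guess that fits the trace above. The names `ServerOwner` and `ToyClient` are invented for illustration. The model keeps one connection per server owner, so a second DS that reports the same owner is silently mapped onto the first DS's connection:

```python
from typing import Dict, NamedTuple, Tuple

Address = Tuple[str, int]


class ServerOwner(NamedTuple):
    """Stand-in for the server owner a DS returns in EXCHANGE_ID."""
    major_id: bytes
    minor_id: int


class ToyClient:
    """Toy model: at most one connection is kept per server owner."""

    def __init__(self) -> None:
        self.conn_by_owner: Dict[ServerOwner, Address] = {}

    def connect(self, addr: Address, owner: ServerOwner) -> Address:
        # If this owner is already known, the existing connection is reused
        # and the newly requested address is ignored.
        return self.conn_by_owner.setdefault(owner, addr)


# Both DSes on the host report the same (hypothetical) owner.
shared = ServerOwner(b"host-131.169.51.50", 0)
client = ToyClient()
first = client.connect(("131.169.51.50", 24006), shared)
second = client.connect(("131.169.51.50", 24005), shared)
print(first, second)
```

In this model the request for port 24005 ends up on the connection to port 24006, because the two endpoints cannot be told apart by owner. With a distinct owner per DS, the second call would return the 24005 address instead.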

I'm assuming device ids are different for the DSs on different ports?


2017-03-22 20:27:35

by Mkrtchyan, Tigran

Subject: Re: pNFS: invalid IP:port selection when talks to DS


Hi Olga,

----- Original Message -----
> From: "Olga Kornievskaia" <[email protected]>
> To: "Mkrtchyan, Tigran" <[email protected]>
> Cc: "Linux NFS Mailing list" <[email protected]>, "Steve Dickson" <[email protected]>
> Sent: Wednesday, March 22, 2017 5:04:21 PM
> Subject: Re: pNFS: invalid IP:port selection when talks to DS

> Hi Tigran,
>
> I still don't have the answer to your question, but I'm puzzled
> why it "works" with 4.9 (session trunking). The new code checks the
> server owner and, if it is the same, adds that address to the
> list of addresses to trunk. I'd assume you'd be seeing the same
> behavior with the new code. Thus, I'm puzzled. That aside, if you
> don't want the new code to trunk between your DSs on the same server,
> they should return different server owners.

I have no idea why 4.9 works differently. Maybe there is not enough 'load'
to trigger trunking. Anyway, I have updated our code to generate different
server owners. We still need to test it before all 600 DSes get updated.
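
One way such a change could look, as a hedged sketch only: the actual server-side change is not shown in this thread, and the function name `server_owner_major_id` and the "host:port" format are assumptions made for illustration. The idea is just to include the listening port in the owner so that every DS on a host gets a distinct value:

```python
def server_owner_major_id(host: str, port: int) -> bytes:
    """Build an owner identifier that differs for every DS endpoint."""
    # Including the port keeps DSes on the same host distinguishable.
    return f"{host}:{port}".encode("ascii")


# Nine DSes on one host, ports 24001-24009 as in our setup.
owners = {server_owner_major_id("131.169.51.50", p) for p in range(24001, 24010)}
print(len(owners))  # 9
```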

>
> I'm assuming device ids are different for the DSs on different ports?

Yes, each DS gets a unique id.

Thanks,
Tigran.
