2013-09-25 05:20:18

by Jongman Heo

Subject: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")


Hi all,

My embedded development box fails to NFS-boot from an NFS server that runs a recent kernel.

Using git bisect, I found it is caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server").


1. dmesg (NFS boot failure case)

...
[ 2.040893] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 2.046207] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 2.053570] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 3.055023] IP-Config: Guessing netmask 255.255.0.0
[ 3.059979] IP-Config: Gateway not on directly connected network.
[ 3.066330] Looking up port of RPC 100003/2 on 165.213.88.249
[ 3.074001] Looking up port of RPC 100005/1 on 165.213.88.249
[ 3.122878] VFS: Unable to mount root fs via NFS, trying floppy.
[ 3.129134] VFS: Cannot open root device "nfs" or unknown-block(2,0)
[ 3.135478] Please append a correct "root=" boot option; here are the available partitions:
[ 3.143831] 1f00 3072 mtdblock0 (driver?)
[ 3.148798] 1f01 64 mtdblock1 (driver?)
[ 3.153758] 1f02 64 mtdblock2 (driver?)
[ 3.158719] 1f03 64 mtdblock3 (driver?)
[ 3.163682] 1f04 64 mtdblock4 (driver?)
[ 3.168644] 1f05 64 mtdblock5 (driver?)
[ 3.173607] 1f06 64 mtdblock6 (driver?)
[ 3.178568] 0800 488386584 sda driver: sd
[ 3.183099] 0801 506016 sda1
[ 3.186927] 0802 4008217 sda2
[ 3.190755] 0803 483869767 sda3
[ 3.194584] b300 1880064 mmcblk0 driver: mmcblk
[ 3.199802] b301 4096 mmcblk0p1
[ 3.204063] b302 102400 mmcblk0p2
[ 3.208330] b303 4096 mmcblk0p3
[ 3.212594] b304 1 mmcblk0p4
[ 3.216855] b305 2048 mmcblk0p5
[ 3.221116] b306 2048 mmcblk0p6
[ 3.225382] b307 2048 mmcblk0p7
[ 3.229644] b308 4096 mmcblk0p8
[ 3.233906] b309 12288 mmcblk0p9
[ 3.238176] b30a 16384 mmcblk0p10
[ 3.242524] b30b 142336 mmcblk0p11
[ 3.246869] b30c 1572864 mmcblk0p12
[ 3.251219] b320 12288 mmcblk0gp1 (driver?)
[ 3.256272] b310 12288 mmcblk0gp0 (driver?)
[ 3.261320] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(2,0)
[ 3.269566] Pid: 1, comm: swapper Not tainted 2.6.35 #1
[ 3.274776] Call Trace:
[ 3.277232] [<80d0db5b>] ? printk+0x1e/0x20
[ 3.281492] [<80d0dad1>] panic+0x65/0xd1
[ 3.285495] [<80eb9ce3>] mount_block_root+0x125/0x1be
[ 3.290631] [<809d1f6d>] ? sys_mknod+0x2d/0x30
[ 3.295156] [<80eb9f6d>] mount_root+0xd0/0xf2
[ 3.299591] [<80eba0d9>] prepare_namespace+0x14a/0x184
[ 3.304803] [<809c44f6>] ? sys_access+0x26/0x30
[ 3.309411] [<80eb9a4e>] kernel_init+0x25e/0x26e
[ 3.314105] [<80eb97f0>] ? kernel_init+0x0/0x26e
[ 3.318800] [<80903242>] kernel_thread_helper+0x6/0x10


2. Client (my embedded box) configuration
It is based on kernel 2.6.35 and has the following NFS kernel config options.

# grep NFS .config
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
# CONFIG_NFS_V4_1 is not set
CONFIG_ROOT_NFS=y
# CONFIG_NFSD is not set
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y


3. Server (NFSD) configuration
Fedora 19 + latest Linus git kernel 3.12.0-rc2+ (commit 22356f44, "mm: Place preemption point in do_mlockall() loop")


4. workaround

Reverting commit 4bdc33ed resolves my issue; NFS boot works again with the revert.
I ran git bisect, but lost the resulting bisect log to a sudden power loss :(.

Best regards,
Jongman Heo
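
The two "Looking up port of RPC" lines in the log above show which RPC programs and versions the client asked the server for. A minimal sketch for pulling them out of a saved dmesg follows. It assumes the number after the slash is the requested program version, and `rpc_lookups` is a hypothetical helper name. Program numbers 100003 (NFS) and 100005 (mountd) are the standard ONC RPC assignments.

```python
import re

# Standard ONC RPC program numbers; only the two that appear in the log.
RPC_PROGRAMS = {100003: "nfs", 100005: "mountd"}

# Assumed log format: "Looking up port of RPC <program>/<version> on <server>"
LOOKUP_RE = re.compile(r"Looking up port of RPC (\d+)/(\d+) on (\S+)")


def rpc_lookups(dmesg_text):
    """Return (program_name, version, server) for each RPC port lookup line."""
    found = []
    for match in LOOKUP_RE.finditer(dmesg_text):
        program = int(match.group(1))
        version = int(match.group(2))
        server = match.group(3)
        # Fall back to the raw number for programs not in the table.
        found.append((RPC_PROGRAMS.get(program, str(program)), version, server))
    return found


sample = """\
[ 3.066330] Looking up port of RPC 100003/2 on 165.213.88.249
[ 3.074001] Looking up port of RPC 100005/1 on 165.213.88.249
"""

for name, version, server in rpc_lookups(sample):
    print(f"{name} v{version} on {server}")
```

Read this way, the failing boot asked for NFS version 2 and mountd version 1, which is worth keeping in mind when comparing against the versions the server offers.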


2013-09-25 13:52:44

by Anna Schumaker

Subject: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")

Hi Jongman,

Is the panic on your client or server? I don't see how the patch your
bisect led you to could cause the problem, since all it does is expand
the minor version array on the server. Your client doesn't have NFSD
enabled, so this code shouldn't even be affecting it.

A few questions: what is your /etc/exports on the server? What
version of NFS are you using for nfsroot?

Thanks!
Anna


2013-09-25 14:05:17

by J. Bruce Fields

Subject: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")

On Wed, Sep 25, 2013 at 05:19:50AM +0000, Jongman Heo wrote:
> 4. workaround
>
> Reverting the commit 4bdc33ed resolves my issue, NFS boot is working then.
> I've done git bisect, but lost the resulting bisect log due to sudden power loss :(.

So when you say you revert that commit, you mean you revert it on your
*server*, right? You're not changing the client at all throughout these
tests?

A network trace might be interesting: so, on the server, run

    tcpdump -s0 -w tmp.pcap -i eth0

(replace eth0 by the right network interface), then try booting the
client and after the client fails, kill tcpdump and send us a copy of
tmp.pcap.

(And also you might want to fire up "wireshark tmp.pcap" and take a look
yourself--you'll probably see something like a version mismatch error in
the network traffic.)

--b.
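
Before sending a capture, it can help to check that the file actually contains packets. The sketch below counts the records in a capture file. It assumes the file is in the classic libpcap format with microsecond timestamps, not pcapng or the nanosecond variant, and `count_pcap_packets` is a hypothetical helper name. The demo uses a synthetic capture built in memory rather than a real tmp.pcap.

```python
import struct

PCAP_GLOBAL_HEADER_LEN = 24   # magic, version, thiszone, sigfigs, snaplen, linktype
PCAP_RECORD_HEADER_LEN = 16   # ts_sec, ts_usec, incl_len, orig_len


def count_pcap_packets(data):
    """Count packet records in a classic libpcap capture given as bytes."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"   # file written in little-endian byte order
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"   # file written in big-endian byte order
    else:
        raise ValueError("not a classic pcap file")

    offset = PCAP_GLOBAL_HEADER_LEN
    count = 0
    while offset + PCAP_RECORD_HEADER_LEN <= len(data):
        _sec, _usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        # Skip the record header plus the captured bytes of this packet.
        offset += PCAP_RECORD_HEADER_LEN + incl_len
        count += 1
    return count


# Synthetic two-packet capture, for illustration only.
header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
record = struct.pack("<IIII", 0, 0, 3, 3) + b"abc"
print(count_pcap_packets(header + record + record))
```

For a real capture, the same function could be called on the bytes read from tmp.pcap opened in binary mode.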

2013-09-26 04:22:54

by Jongman Heo

Subject: Re: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")


Hi,

>
>------- Original Message -------
>Sender : J. Bruce Fields<[email protected]>
>Date : 2013-09-25 23:05 (GMT+09:00)
>Title : Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")
>
>On Wed, Sep 25, 2013 at 05:19:50AM +0000, Jongman Heo wrote:
>> 4. workaround
>>
>> Reverting the commit 4bdc33ed resolves my issue, NFS boot is working then.
>> I've done git bisect, but lost the resulting bisect log due to sudden power loss :(.
>
>So when you say you revert that commit, you mean you revert it on your
>*server*, right? You're not changing the client at all throughout these
>tests?

Right. I reverted the commit on my server; the client stays the same throughout the tests.

>
>A network trace might be interesting: so, on the server, run
>
>tcpdump -s0 -wtmp.pcap -ieth0
>
>(replace eth0 by the right network interface), then try booting the
>client and after the client fails, kill tcpdump and send us a copy of
>tmp.pcap.
>
>(And also you might want to fire up "wireshark tmp.pcap" and take a look
>yourself--you'll probably see something like a version mismatch error in
>the network traffic.)
>
>--b.

I've attached two tcpdump files.
In the dumps, 165.213.88.238 is the IP address of the NFS client (the embedded box with the 2.6.35 kernel), and 192.168.64.128 is the NFS server (running the latest git kernel, with and without the commit reverted).

* tmp_good_filtered.pcap
- latest linus git tree + commit 4bdc33ed reverted
- NFS boot is working

* tmp_bad_filtered.pcap
- latest linus git tree
- NFS boot doesn't work

In the error case, I can see the following message in the Wireshark packet window:

Accept State: remote can't support version # (2)
Program Version (Minimum): 3
Program Version (Maximum): 4


I also forgot to attach the kernel config of my NFS server. Here it is:

# grep NFS .config
CONFIG_NFS_FS=y
CONFIG_NFS_V2=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
# CONFIG_NFS_SWAP is not set
CONFIG_NFS_V4_1=y
# CONFIG_NFS_V4_2 is not set
CONFIG_PNFS_FILE_LAYOUT=m
CONFIG_PNFS_BLOCK=m
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFSD=y
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
# CONFIG_NFSD_V4_SECURITY_LABEL is not set
# CONFIG_NFSD_FAULT_INJECTION is not set
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y

# rpm -qa|grep nfs
nfs-utils-1.2.8-4.0.fc19.i686
libnfsidmap-0.25-5.fc19.i686

# cat /etc/exports
/home/NFSROOT_mmc/ 165.213.88.238(rw,no_root_squash,sync)


Thanks,
Jongman Heo.


Attachments:
tmp_good_filtered.pcap (7.27 kB)
tmp_bad_filtered.pcap (1.11 kB)
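
The Wireshark fields quoted in this message correspond to an accepted ONC RPC reply with accept state 2 (PROG_MISMATCH in RFC 5531), which carries the lowest and highest program versions the server supports. A minimal decoding sketch follows. It assumes the input is the bare RPC message, such as a UDP payload or a TCP payload with the 4-byte record marker already stripped. `decode_accepted_reply` is a hypothetical helper name, and the sample bytes are synthetic, not taken from the attached captures.

```python
import struct

# accept_stat values from RFC 5531
ACCEPT_STAT = {
    0: "SUCCESS",
    1: "PROG_UNAVAIL",
    2: "PROG_MISMATCH",
    3: "PROC_UNAVAIL",
    4: "GARBAGE_ARGS",
    5: "SYSTEM_ERR",
}


def decode_accepted_reply(payload):
    """Decode the header of an accepted RPC reply (all fields big-endian)."""
    xid, msg_type, reply_stat = struct.unpack_from(">III", payload, 0)
    if msg_type != 1 or reply_stat != 0:   # 1 = REPLY, 0 = MSG_ACCEPTED
        raise ValueError("not an accepted RPC reply")

    # Verifier: flavor, length, then an opaque body padded to 4 bytes.
    _flavor, verf_len = struct.unpack_from(">II", payload, 12)
    offset = 20 + ((verf_len + 3) // 4) * 4

    (accept_stat,) = struct.unpack_from(">I", payload, offset)
    result = {"xid": xid, "accept_stat": ACCEPT_STAT.get(accept_stat, accept_stat)}
    if accept_stat == 2:
        # PROG_MISMATCH carries the supported version range.
        low, high = struct.unpack_from(">II", payload, offset + 4)
        result["low"] = low
        result["high"] = high
    return result


# Synthetic reply: xid 0x1234, REPLY, MSG_ACCEPTED, empty AUTH_NONE verifier,
# PROG_MISMATCH with versions 3..4 (the range reported in the failing trace).
sample = struct.pack(">IIIIIIII", 0x1234, 1, 0, 0, 0, 2, 3, 4)
print(decode_accepted_reply(sample))
```

A reply like this says the server speaks versions 3 through 4 of the program, so a request for version 2 cannot be served.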

2013-09-26 04:27:42

by Jongman Heo

Subject: Re: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")

>
>------- Original Message -------
>
>Sender : Anna Schumaker<[email protected]>
>
>Date : 2013-09-25 22:52 (GMT+09:00)
>
>Title : Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")
>
>
>
>Hi Jongman,
>
>Is the panic on your client or server? I don't see how the patch your
>bisect led you to could cause the problem, since all it does is expand
>the minor version array on the server. Your client doesn't have NFSD
>enabled, so this code shouldn't even be affecting it.
>
>A few questions: what is your /etc/exports on the server? What
>version of NFS are you using for nfsroot?
>
>Thanks!
>Anna
>
>
>
>

Hi,

Please see my e-mail reply to J. Bruce Fields for the details.

Thanks,
Jongman Heo.

2013-09-26 17:47:45

by J. Bruce Fields

Subject: Re: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")

On Thu, Sep 26, 2013 at 04:22:48AM +0000, Jongman Heo wrote:
> I've attached two tcpdump files.
> In the dump, 165.213.88.238 is IP address for NFS client (embedded box with 2.6.35 kernel), and 192.168.64.128 is for NFS server (running latest git kernel with and without the commit revert)
>
> * tmp_good_filtered.pcap
> - latest linus git tree + commit 4bdc33ed reverted
> - NFS boot is working
>
> * tmp_bad_filtered.pcap
> - latest linus git tree
> - NFS boot doesn't work
>
> In error case, I can see following message from wireshark packet window ;
>
> Accept State: remote can't support version # (2)
> Program Version (Minimum): 3
> Program Version (Maximum): 4

This is pretty weird--it's not at all obvious how that patch would
affect this.

You're absolutely positive that the *only* thing you're changing on the
server between the "good" and "bad" cases is that one kernel patch?
You're not changing anything in userspace?

What does "cat /proc/fs/nfsd/versions" report in the good and bad cases?

(BTW, out of curiosity: what kind of client is this that only supports
NFSv2 and NFSv3? Even for an embedded system that's a bit surprising.)

--b.
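
The output of /proc/fs/nfsd/versions is a single line of tokens, each a version number prefixed with "+" (enabled) or "-" (disabled). A minimal sketch for comparing the good and bad cases follows. `parse_nfsd_versions` is a hypothetical helper name, and the sample line is made up for illustration; it is not output from the server in this thread.

```python
def parse_nfsd_versions(text):
    """Split a /proc/fs/nfsd/versions line into (enabled, disabled) lists."""
    enabled, disabled = [], []
    for token in text.split():
        sign, version = token[0], token[1:]
        if sign == "+":
            enabled.append(version)
        elif sign == "-":
            disabled.append(version)
        else:
            raise ValueError(f"unexpected token: {token!r}")
    return enabled, disabled


# Hypothetical contents, NOT taken from the reporter's server.
sample = "-2 +3 +4 +4.1 -4.2"
enabled, disabled = parse_nfsd_versions(sample)
print("enabled: ", enabled)
print("disabled:", disabled)
```

Running this on the line captured from each kernel and comparing the two results would show directly whether the set of served versions differs between the good and bad cases.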

2013-09-26 23:58:03

by Jongman Heo

Subject: Re: Re: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")


Hi Bruce,

>
>------- Original Message -------
>Sender : J. Bruce Fields<[email protected]>
>Date : 2013-09-27 02:47 (GMT+09:00)
>Title : Re: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")
>
>On Thu, Sep 26, 2013 at 04:22:48AM +0000, Jongman Heo wrote:
>> In error case, I can see following message from wireshark packet window ;
>>
>> Accept State: remote can't support version # (2)
>> Program Version (Minimum): 3
>> Program Version (Maximum): 4
>
>This is pretty weird--it's not at all obvious how that patch would
>affect this.
>
>You're absolutely positive that the *only* thing you're changing on the
>server between the "good" and "bad" cases is that one kernel patch?
>You're not changing anything in userspace?
>

Yes, pretty sure.

>What does "cat /proc/fs/nfsd/versions" report in the good and bad cases?
>
>(BTW, out of curiosity: what kind of client is this that only supports
>NFSv2 and NFSv3? Even for an embedded system that's a bit surprising.)
>
>--b.
>

Here is the /proc/fs/nfsd/versions output for the good and bad cases:

good (commit 4bdc33ed reverted)

# cat /proc/fs/nfsd/versions
+2 +3 +4 +4.1


bad (current linus git)

# cat /proc/fs/nfsd/versions
-2 +3 +4 +4.1 -4.2


I don't know why commit 4bdc33ed makes this difference (from +2 to -2).

My NFS server just runs Fedora 19 with the latest kernel (not a rare setup),
so I think others can verify whether this change in the version information happens with and without the commit reverted.

I don't know the details of the NFS protocol, but our NFS client does not seem to retry with v3 or v4 when v2 fails.
Is this buggy behavior in my old embedded box (the NFS client in kernel 2.6.35), or is it expected under the NFS protocol?
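
For context on the Wireshark output quoted above: in ONC RPC (RFC 5531), accept state 2 is PROG_MISMATCH, and the reply carries the lowest and highest program versions the server supports (3 and 4 here). The boot log shows the client looking up RPC 100003/2, i.e. NFS version 2. Whether a client then retries with another version is a client-side decision. Below is a minimal Python sketch of such a fallback; pick_version and its arguments are hypothetical names used only for illustration and are not taken from the Linux NFS client.

```python
PROG_MISMATCH = 2  # accept_stat value defined in RFC 5531


def pick_version(client_versions, low, high):
    """Return the highest version the client supports within [low, high].

    Returns None when no client version falls inside the range the
    server advertised in its PROG_MISMATCH reply.
    """
    usable = [v for v in client_versions if low <= v <= high]
    return max(usable) if usable else None


if __name__ == "__main__":
    # Server advertised minimum 3, maximum 4 (from the capture above).
    print(pick_version([2, 3], 3, 4))  # a client that also speaks v3
    print(pick_version([2], 3, 4))     # a client limited to v2
```

Under these assumptions, a client that speaks both v2 and v3 could fall back to v3, while a client restricted to v2 (by configuration or mount options) has nothing to fall back to.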

Thanks,
Jongman Heo.

2013-09-27 01:12:48

by J. Bruce Fields

Subject: Re: Re: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")

On Thu, Sep 26, 2013 at 11:57:57PM +0000, Jongman Heo wrote:
> >------- Original Message -------
> >Sender : J. Bruce Fields<[email protected]>
> >This is pretty weird--it's not at all obvious how that patch would
> >affect this.
> >
> >You're absolutely positive that the *only* thing you're changing on the
> >server between the "good" and "bad" cases is that one kernel patch?
> >You're not changing anything in userspace?
> >
>
> Yes, pretty sure.
>
> >What does "cat /proc/fs/nfsd/versions" report in the good and bad cases?
> >
> >(BTW, out of curiosity: what kind of client is this that only supports
> >NFSv2 and NFSv3? Even for an embedded system that's a bit surprising.)
> >
> >--b.
> >
>
> Here are /proc/fs/nfsd/versions information for good and bad cases ;
>
> good (commit 4bdc33ed reverted)
>
> # cat /proc/fs/nfsd/versions
> +2 +3 +4 +4.1
>
>
> bad (current linus git)
>
> # cat /proc/fs/nfsd/versions
> -2 +3 +4 +4.1 -4.2
>
>
> I don't know why the commit 4bdc33ed makes this difference ( from +2 to -2 ).
>
> My NFS server just uses Fedora 19 + latest kernel (which is not a rare setup...),

The thing is, nfs-utils *did* make exactly this change with commit
6b4e4965a6b82e8d49cea1c0316b951ba4e9e83e "rpc.nfsd: No longer advertise
NFS v2 support." in 1.2.9-rc4 which entered f19 recently. And that
kernel commit doesn't look related. So I strongly suspect that you got
the nfs-utils update (or rebooted after the update) at the same time as
bisecting, and that confused the bisect results.

> so I think some people can verify if this version information change happens w/ and w/o the commit revert.
>
> Don't know the detail of NFS protocol, but our NFS client seems not to try with v3 and v4 in case v2 fails...
> Is this an unexpected (buggy) behavior of my old embedded box (NFS client of kernel 2.6.35), or expected one from the NFS protocol?

Digging into a historical git repo just for fun.... It looks like NFSv3
support was added in 2.3.99pre4-3, probably in 2000? (The date on that
commit is 2007, so obviously this repo I have is very confused. Maybe I
should go find if there's a better one someplace.)

So anyway it's either configured out of the kernel or the mount
commandline's asking for v2, or I don't know what....

--b.

2013-09-27 02:21:38

by Jongman Heo

Subject: Re: Re: Re: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")

Hi,

>
>------- Original Message -------
>Sender : J. Bruce Fields<[email protected]>
>Date : 2013-09-27 10:12 (GMT+09:00)
>Title : Re: Re: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")
>
>On Thu, Sep 26, 2013 at 11:57:57PM +0000, Jongman Heo wrote:
>> >------- Original Message -------
>> >Sender : J. Bruce Fields
>> >This is pretty weird--it's not at all obvious how that patch would
>> >affect this.
>> >
>> >You're absolutely positive that the *only* thing you're changing on the
>> >server between the "good" and "bad" cases is that one kernel patch?
>> >You're not changing anything in userspace?
>> >
>>
>> Yes, pretty sure.
>>
>> >What does "cat /proc/fs/nfsd/versions" report in the good and bad cases?
>> >
>> >(BTW, out of curiosity: what kind of client is this that only supports
>> >NFSv2 and NFSv3? Even for an embedded system that's a bit surprising.)
>> >
>> >--b.
>> >
>>
>> Here are /proc/fs/nfsd/versions information for good and bad cases ;
>>
>> good (commit 4bdc33ed reverted)
>>
>> # cat /proc/fs/nfsd/versions
>> +2 +3 +4 +4.1
>>
>>
>> bad (current linus git)
>>
>> # cat /proc/fs/nfsd/versions
>> -2 +3 +4 +4.1 -4.2
>>
>>
>> I don't know why the commit 4bdc33ed makes this difference ( from +2 to -2 ).
>>
>> My NFS server just uses Fedora 19 + latest kernel (which is not a rare setup...),
>
>The thing is, nfs-utils *did* make exactly this change with commit
>6b4e4965a6b82e8d49cea1c0316b951ba4e9e83e "rpc.nfsd: No longer advertise
>NFS v2 support." in 1.2.9-rc4 which entered f19 recently. And that
>kernel commit doesn't look related. So I strongly suspect that you got
>the nfs-utils update (or rebooted after the update) at the same time as
>bisecting, and that confused the bisect results.
>

No, I haven't changed/upgraded nfs-utils package during git bisect.
And I can still reproduce the issue.

# rpm -qa|grep nfs-utils
nfs-utils-1.2.8-4.0.fc19.i686

# rpm -q --changelog nfs-utils|head -6
* Mon Aug 19 2013 Steve Dickson <[email protected]> 1.2.8-4.0
- Updated to latest upstream RC release: nfs-utils-1-2-9-rc4

* Tue Jul 23 2013 Steve Dickson <[email protected]> 1.2.8-3.0
- Updated to latest upstream RC release: nfs-utils-1-2-9-rc3

As you noted, the Fedora 19 nfs-utils package I'm using already includes 1.2.9-rc4.

Even with this nfs-utils version, reverting the commit makes a difference for me.


Another workaround, from user space (instead of the revert), also works for me.

The latest Linus git kernel reports the following NFS version support (with my server's .config).

# cat /proc/fs/nfsd/versions
-2 +3 +4 +4.1 -4.2

I changed /etc/sysconfig/nfs as shown in the following diff:

@@ -10,7 +10,7 @@
#LOCKD_UDPPORT=32769
#
# Optional arguments passed to rpc.nfsd. See rpc.nfsd(8)
-RPCNFSDARGS=""
+RPCNFSDARGS="-V 2"


Then I restarted the NFS server:

# systemctl restart nfs-server.service
# cat /proc/fs/nfsd/versions
+2 +3 +4 +4.1 -4.2

Now NFS boot is working.
Actually, I'm OK with this, since in my case NFS boot is used only for debugging, not in production.


FYI, on another Linux machine (Ubuntu 12.04), the nfs-utils version is "1.2.5-3ubuntu3.1", and the kernel commit doesn't cause the NFS v2 support issue.

# cat /proc/fs/nfsd/versions
+2 +3 +4 +4.1 -4.2

So the change in nfs-utils 1.2.9-rc4 seems to be the root cause, but I don't know why reverting the kernel commit resolves the issue.
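
To compare the version lists above without reading them by eye, here is a minimal Python sketch. It assumes the file holds whitespace-separated tokens of the form +N or -N, as in the outputs quoted in this thread; parse_nfsd_versions and diff_versions are made-up helper names for illustration and are not part of nfs-utils or the kernel.

```python
def parse_nfsd_versions(text):
    """Parse a /proc/fs/nfsd/versions line into {version: enabled}.

    Assumes whitespace-separated tokens such as "+3" or "-4.2",
    matching the outputs shown in this thread.
    """
    versions = {}
    for token in text.split():
        sign, version = token[0], token[1:]
        if sign not in "+-" or not version:
            raise ValueError("unexpected token: %r" % token)
        versions[version] = (sign == "+")
    return versions


def diff_versions(good, bad):
    """Return {version: (state_in_good, state_in_bad)} for entries that differ.

    A state of None means the version is not listed at all.
    """
    good_map = parse_nfsd_versions(good)
    bad_map = parse_nfsd_versions(bad)
    changed = {}
    for version in sorted(set(good_map) | set(bad_map)):
        before, after = good_map.get(version), bad_map.get(version)
        if before != after:
            changed[version] = (before, after)
    return changed


if __name__ == "__main__":
    good = "+2 +3 +4 +4.1"        # commit 4bdc33ed reverted
    bad = "-2 +3 +4 +4.1 -4.2"    # current linus git
    for version, (before, after) in diff_versions(good, bad).items():
        print(version, before, "->", after)
```

On a live server the input string could be read from /proc/fs/nfsd/versions instead of being hard-coded.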

>> so I think some people can verify if this version information change happens w/ and w/o the commit revert.
>>
>> Don't know the detail of NFS protocol, but our NFS client seems not to try with v3 and v4 in case v2 fails...
>> Is this an unexpected (buggy) behavior of my old embedded box (NFS client of kernel 2.6.35), or expected one from the NFS protocol?
>
>Digging into a historical git repo just for fun.... It looks like NFSv3
>support was added in 2.3.99pre4-3, probably in 2000? (The date on that
>commit is 2007, so obviously this repo I have is very confused. Maybe I
>should go find if there's a better one someplace.)
>
>So anyway it's either configured out of the kernel or the mount
>commandline's asking for v2, or I don't know what....
>
>--b.
>

Thanks,
Jongman Heo.

2013-09-27 20:09:44

by J. Bruce Fields

Subject: Re: Re: Re: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")

On Fri, Sep 27, 2013 at 02:21:33AM +0000, Jongman Heo wrote:
> Hi,
>
> >
> >------- Original Message -------
> >Sender : J. Bruce Fields<[email protected]>
> >Date : 2013-09-27 10:12 (GMT+09:00)
> >Title : Re: Re: Re: Regression caused by commit 4bdc33ed ("NFSDv4.2: Add NFS v4.2 support to the NFS server")
> >
> >On Thu, Sep 26, 2013 at 11:57:57PM +0000, Jongman Heo wrote:
> >> >------- Original Message -------
> >> >Sender : J. Bruce Fields
> >> >This is pretty weird--it's not at all obvious how that patch would
> >> >affect this.
> >> >
> >> >You're absolutely positive that the *only* thing you're changing on the
> >> >server between the "good" and "bad" cases is that one kernel patch?
> >> >You're not changing anything in userspace?
> >> >
> >>
> >> Yes, pretty sure.
> >>
> >> >What does "cat /proc/fs/nfsd/versions" report in the good and bad cases?
> >> >
> >> >(BTW, out of curiosity: what kind of client is this that only supports
> >> >NFSv2 and NFSv3? Even for an embedded system that's a bit surprising.)
> >> >
> >> >--b.
> >> >
> >>
> >> Here are /proc/fs/nfsd/versions information for good and bad cases ;
> >>
> >> good (commit 4bdc33ed reverted)
> >>
> >> # cat /proc/fs/nfsd/versions
> >> +2 +3 +4 +4.1
> >>
> >>
> >> bad (current linus git)
> >>
> >> # cat /proc/fs/nfsd/versions
> >> -2 +3 +4 +4.1 -4.2
> >>
> >>
> >> I don't know why the commit 4bdc33ed makes this difference ( from +2 to -2 ).
> >>
> >> My NFS server just uses Fedora 19 + latest kernel (which is not a rare setup...),
> >
> >The thing is, nfs-utils *did* make exactly this change with commit
> >6b4e4965a6b82e8d49cea1c0316b951ba4e9e83e "rpc.nfsd: No longer advertise
> >NFS v2 support." in 1.2.9-rc4 which entered f19 recently. And that
> >kernel commit doesn't look related. So I strongly suspect that you got
> >the nfs-utils update (or rebooted after the update) at the same time as
> >bisecting, and that confused the bisect results.
> >
>
> No, I haven't changed/upgraded nfs-utils package during git bisect.

Well, all it would take would be a long-ago yum update that you'd
forgotten about by the time you rebooted to a new kernel, at which point
the new rpc.nfsd behavior would take effect on restarting the nfs
server.

> And I can still reproduce the issue.

So I'm still really skeptical but if you're positive then I guess I
should go try to reproduce and make sure there's not something very
screwed up with the nfsd/versions interface.

--b.