Below are my logs, obtained on CentOS 5.4 with kernel 2.6.18-164.11.1.el5, from when I ask OpenMPI+BLCR to load a checkpoint snapshot from an NFS share.
The general layout is as follows: the host is diskless with nfsroot over NFSv3, /home/* is auto-mounted via NFSv4,
and the checkpoint directory (where the BLCR snapshot is) is mounted via NFSv3 (because over NFSv4 it kills the system even faster).
CentOS 5.4 / kernel 2.6.18-164.11.1.el5
The NFS server is OpenSolaris.
BLCR-0.8.2+OpenMPI-1.4.1 (if it matters).
Although the checkpoint snapshot is on NFSv3 (on NFSv4 it kills the system in a different way), during process restore BLCR tries to open some files on the /home/user share, which is on NFSv4.
For the last couple of years I have regularly tried to implement a config with diskless hosts where the /home/* folders are automounted over NFSv4 (to have proper ACLs and attrs), and all I see is:
1) You can't have root on NFSv4 (although you can move idmap into the initrd and mount NFSv4 as root, after some time you always end up with a hanging system or one with broken idmapping), so you have to use NFSv3 for root. And, obviously, an NFSv4 root isn't desirable if you take idmapping into account: on the server you need to create corresponding UIDs for all system/service UIDs you have on the clients and keep them synchronized.
2) Root over NFSv3 and mounts over NFSv4 can't coexist, at least on real production systems. There are always different bugs in different places that prevent this config from working. I tried at least 15 different kernel versions in the range 2.6.16-2.6.31, from different distros as well as vanilla kernels, but never managed to get it working stably.
Will it ever work?
Anton.
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/nfs/nfs4xdr.c:872
invalid opcode: 0000 [1] SMP
last sysfs file: /devices/system/cpu/cpu15/topology/physical_package_id
CPU 12
Modules linked in: blcr(U) blcr_imports(U) netconsole autofs4 testmgr_cipher testmgr aead crypto_blkcipher crypto_algapi des ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables rdma_ucm(U) ib_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ib_uverbs(U) ib_umad(U) iw_nes(U) iw_cxgb3(U) cxgb3(U) ib_qib(U) dca mlx4_en(U) mlx4_ib(U) ib_mthca(U) ib_mad(U) ib_core(U) dm_mirror dm_log dm_multipath scsi_dh dm_mod video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev sr_mod cdrom sd_mod sg mptsas mlx4_core(U) mptscsih pcspkr mptbase scsi_transport_sas i2c_nforce2 i2c_core serio_raw usb_storage scsi_mod shpchp bnx2 e1000 tg3 nfs lockd ipv6 fscache nfs_acl rpcsec_gss_krb5 auth_rpcgss xfrm_nalgo crypto_api sunrpc uhci_hcd ohci_hcd ehci_hcd
Pid: 6821, comm: vasp Tainted: G 2.6.18-164.11.1.el5 #1
RIP: 0010:[<ffffffff881554ff>] [<ffffffff881554ff>] :nfs:encode_share_access+0x6d/0x82
RSP: 0018:ffff81041d0677b8 EFLAGS: 00010297
RAX: 00000000ffffffff RBX: ffff81041c0910a8 RCX: ffff81041c0910a8
RDX: 0000000000000008 RSI: 0000000000000008 RDI: ffff81041d067808
RBP: 0000000000000080 R08: ffff81041c09109c R09: 0000000000000009
R10: ffff810415c9ce00 R11: ffffffff88158d4f R12: ffff81041d067808
R13: ffff810417c4ea68 R14: ffff81041d067ab8 R15: ffff810426afa000
FS: 00002b6e05f681c0(0000) GS:ffff81010e957240(0000) knlGS:0000000000000000
CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000003192a03080 CR3: 0000000417712000 CR4: 00000000000006e0
Process vasp (pid: 6821, threadinfo ffff81041d066000, task ffff81042689c100)
Stack: ffffffffffffffff ffff81041c0910a0 ffff810426be2408 ffffffff881589ff
 0000000000000000 ffff810417c4ea68 ffff810426be2408 ffffffff88158d4f
 ffff810417c4ea68 ffffffff88158dbc ffff81041c0910b0 ffff810417c4ea70
Call Trace:
 [<ffffffff881589ff>] :nfs:encode_open+0x66/0x33e
 [<ffffffff88158d4f>] :ac+0x0/0xac
 [<ffffffff88158dbc>] :nfs:nfs4_xdr_enc_open+0x6d/0xac
 [<ffffffff88158d4f>] :nfs:nfs4_xdr_enc_open+0x0/0xac
 [<ffffffff880313f0>] :sunrpc:call_transmit+0x1bc/0x222
 [<ffffffff880369c1>] :sunrpc:__rpc_execute+0x92/0x24e
 [<ffffffff88036bd4>] :sunrpc:rpc_run_task+0x37/0x3f
 [<ffffffff881501b1>] :nfs:_nfs4_proc_open+0x50/0x1aa
 [<ffffffff881510c3>] :nfs:nfs4_do_open+0xc2/0x1dd
 [<ffffffff88152a89>] :nfs:nfs4_proc_create+0x7f/0x1b2
 [<ffffffff8012827c>] avc_has_perm+0x46/0x58
 [<ffffffff8813d18a>] :nfs:nfs_create+0x91/0x103
 [<ffffffff8003a593>] vfs_create+0xe6/0x158
 [<ffffffff887e5d16>] :blcr:cr_mknod+0x19f/0x2b8
 [<ffffffff887e5ee0>] :blcr:cr_filp_mknod+0x30/0x12e
 [<ffffffff887e629a>] :blcr:cr_uread+0x40/0x91
 [<ffffffff887e6e20>] :blcr:cr_mkunlinked+0x47/0x14d
 [<ffffffff887eaea1>] :blcr:cr_restore_open_file+0x195/0x332
 [<ffffffff887ec9d7>] :blcr:cr_rstrt_child+0x1354/0x1de2
 [<ffffffff8008ac96>] __wake_up_common+0x3e/0x68
 [<ffffffff8008c86c>] default_wake_function+0x0/0xe
 [<ffffffff800646f9>] __down_failed+0x35/0x3a
 [<ffffffff800421b6>] do_ioctl+0x55/0x6b
 [<ffffffff80030293>] vfs_ioctl+0x457/0x4b9
 [<ffffffff8004c843>] sys_ioctl+0x59/0x78
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

Code: 0f 0b 68 50 2a 16 88 c2 68 03 c7 03 00 00 00 00 41 5a 5b 5d
RIP [<ffffffff881554ff>] :nfs:encode_share_access+0x6d/0x82
RSP <ffff81041d0677b8>
kernel: last message repeated 2 times
kernel: ----------- [cut here ] --------- [please bite here ]
On Thu, 25 Feb 2010 23:35:27 +0100
Anton Starikov <[email protected]> wrote:
> Below are my logs, obtained on CentOS 5.4 with kernel 2.6.18-164.11.1.el5, from when I ask OpenMPI+BLCR to load a checkpoint snapshot from an NFS share.
>
> The general layout is as follows: the host is diskless with nfsroot over NFSv3, /home/* is auto-mounted via NFSv4,
> and the checkpoint directory (where the BLCR snapshot is) is mounted via NFSv3 (because over NFSv4 it kills the system even faster).
>
> CentOS 5.4 / kernel 2.6.18-164.11.1.el5
> The NFS server is OpenSolaris.
> BLCR-0.8.2+OpenMPI-1.4.1 (if it matters).
>
>
> Although the checkpoint snapshot is on NFSv3 (on NFSv4 it kills the system in a different way), during process restore BLCR tries to open some files on the /home/user share, which is on NFSv4.
>
> For the last couple of years I have regularly tried to implement a config with diskless hosts where the /home/* folders are automounted over NFSv4 (to have proper ACLs and attrs), and all I see is:
>
> 1) You can't have root on NFSv4 (although you can move idmap into the initrd and mount NFSv4 as root, after some time you always end up with a hanging system or one with broken idmapping), so you have to use NFSv3 for root. And, obviously, an NFSv4 root isn't desirable if you take idmapping into account: on the server you need to create corresponding UIDs for all system/service UIDs you have on the clients and keep them synchronized.
>
> 2) Root over NFSv3 and mounts over NFSv4 can't coexist, at least on real production systems. There are always different bugs in different places that prevent this config from working. I tried at least 15 different kernel versions in the range 2.6.16-2.6.31, from different distros as well as vanilla kernels, but never managed to get it working stably.
>
> Will it ever work?
>
> Anton.
>
I can't comment on any of the above since it doesn't contain any
specific info other than "my stuff doesn't work".
>
> ----------- [cut here ] --------- [please bite here ] ---------
> Kernel BUG at fs/nfs/nfs4xdr.c:872
> invalid opcode: 0000 [1] SMP
> last sysfs file: /devices/system/cpu/cpu15/topology/physical_package_id
> CPU 12
> Modules linked in: blcr(U) blcr_imports(U) netconsole autofs4 testmgr_cipher testmgr aead crypto_blkcipher crypto_algapi des ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables rdma_ucm(U) ib_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ib_uverbs(U) ib_umad(U) iw_nes(U) iw_cxgb3(U) cxgb3(U) ib_qib(U) dca mlx4_en(U) mlx4_ib(U) ib_mthca(U) ib_mad(U) ib_core(U) dm_mirror dm_log dm_multipath scsi_dh dm_mod video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev sr_mod cdrom sd_mod sg mptsas mlx4_core(U) mptscsih pcspkr mptbase scsi_transport_sas i2c_nforce2 i2c_core serio_raw usb_storage scsi_mod shpchp bnx2 e1000 tg3 nfs lockd ipv6 fscache nfs_acl rpcsec_gss_krb5 auth_rpcgss xfrm_nalgo crypto_api sunrpc uhci_hcd ohci_hcd ehci_hcd
> Pid: 6821, comm: vasp Tainted: G 2.6.18-164.11.1.el5 #1
> RIP: 0010:[<ffffffff881554ff>] [<ffffffff881554ff>] :nfs:encode_share_access+0x6d/0x82
> RSP: 0018:ffff81041d0677b8 EFLAGS: 00010297
> RAX: 00000000ffffffff RBX: ffff81041c0910a8 RCX: ffff81041c0910a8
> RDX: 0000000000000008 RSI: 0000000000000008 RDI: ffff81041d067808
> RBP: 0000000000000080 R08: ffff81041c09109c R09: 0000000000000009
> R10: ffff810415c9ce00 R11: ffffffff88158d4f R12: ffff81041d067808
> R13: ffff810417c4ea68 R14: ffff81041d067ab8 R15: ffff810426afa000
> FS: 00002b6e05f681c0(0000) GS:ffff81010e957240(0000) knlGS:0000000000000000
> CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
> CR2: 0000003192a03080 CR3: 0000000417712000 CR4: 00000000000006e0
> Process vasp (pid: 6821, threadinfo ffff81041d066000, task ffff81042689c100)
> Stack: ffffffffffffffff ffff81041c0910a0 ffff810426be2408 ffffffff881589ff
>  0000000000000000 ffff810417c4ea68 ffff810426be2408 ffffffff88158d4f
>  ffff810417c4ea68 ffffffff88158dbc ffff81041c0910b0 ffff810417c4ea70
> Call Trace:
>  [<ffffffff881589ff>] :nfs:encode_open+0x66/0x33e
>  [<ffffffff88158d4f>] :ac+0x0/0xac
>  [<ffffffff88158dbc>] :nfs:nfs4_xdr_enc_open+0x6d/0xac
>  [<ffffffff88158d4f>] :nfs:nfs4_xdr_enc_open+0x0/0xac
>  [<ffffffff880313f0>] :sunrpc:call_transmit+0x1bc/0x222
>  [<ffffffff880369c1>] :sunrpc:__rpc_execute+0x92/0x24e
>  [<ffffffff88036bd4>] :sunrpc:rpc_run_task+0x37/0x3f
>  [<ffffffff881501b1>] :nfs:_nfs4_proc_open+0x50/0x1aa
>  [<ffffffff881510c3>] :nfs:nfs4_do_open+0xc2/0x1dd
>  [<ffffffff88152a89>] :nfs:nfs4_proc_create+0x7f/0x1b2
>  [<ffffffff8012827c>] avc_has_perm+0x46/0x58
>  [<ffffffff8813d18a>] :nfs:nfs_create+0x91/0x103
>  [<ffffffff8003a593>] vfs_create+0xe6/0x158
>  [<ffffffff887e5d16>] :blcr:cr_mknod+0x19f/0x2b8
Hmmm...so this "blcr" module is calling down into vfs_create (I guess
to create a device or pipe or something?). If it's crashing in
encode_share_access then I suspect that the problem is that it's not
filling out the open_intent data in the nameidata that it's passing
down to vfs_create.
IOW, this is likely a bug in the "blcr" module and not in RHEL.
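
To make the suspected failure mode concrete, here is a minimal userspace sketch, not the kernel source. It assumes the NFSv4 OPEN encoder accepts only flags that carry the read bit, the write bit, or both, and treats anything else as a bug. The function name model_share_access and the sentinel SHARE_ACCESS_WOULD_BUG are hypothetical names introduced for illustration; the FMODE_* values match the kernel's and the NFS4_SHARE_ACCESS_* values are the ones from RFC 3530.

```c
#include <stdint.h>

/* Same values as the kernel's FMODE_* bits. */
#define FMODE_READ  1
#define FMODE_WRITE 2

/* NFSv4 share_access values from RFC 3530. */
#define NFS4_SHARE_ACCESS_READ  1u
#define NFS4_SHARE_ACCESS_WRITE 2u
#define NFS4_SHARE_ACCESS_BOTH  3u

/* Hypothetical sentinel: marks the case where the kernel would BUG(). */
#define SHARE_ACCESS_WOULD_BUG  UINT32_MAX

/* Userspace model (hypothetical name) of choosing the share_access word. */
static uint32_t model_share_access(int open_flags)
{
    switch (open_flags & (FMODE_READ | FMODE_WRITE)) {
    case FMODE_READ:
        return NFS4_SHARE_ACCESS_READ;
    case FMODE_WRITE:
        return NFS4_SHARE_ACCESS_WRITE;
    case FMODE_READ | FMODE_WRITE:
        return NFS4_SHARE_ACCESS_BOTH;
    default:
        /* Neither bit set, e.g. flags taken from an unfilled open intent. */
        return SHARE_ACCESS_WOULD_BUG;
    }
}
```

Under this assumption, a caller that reaches vfs_create without a populated open intent ends up passing flags with neither bit set, which is the default branch above and would correspond to the BUG at fs/nfs/nfs4xdr.c:872 in the trace.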
>  [<ffffffff887e5ee0>] :blcr:cr_filp_mknod+0x30/0x12e
>  [<ffffffff887e629a>] :blcr:cr_uread+0x40/0x91
>  [<ffffffff887e6e20>] :blcr:cr_mkunlinked+0x47/0x14d
>  [<ffffffff887eaea1>] :blcr:cr_restore_open_file+0x195/0x332
>  [<ffffffff887ec9d7>] :blcr:cr_rstrt_child+0x1354/0x1de2
>  [<ffffffff8008ac96>] __wake_up_common+0x3e/0x68
>  [<ffffffff8008c86c>] default_wake_function+0x0/0xe
>  [<ffffffff800646f9>] __down_failed+0x35/0x3a
>  [<ffffffff800421b6>] do_ioctl+0x55/0x6b
>  [<ffffffff80030293>] vfs_ioctl+0x457/0x4b9
>  [<ffffffff8004c843>] sys_ioctl+0x59/0x78
>  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
>
> Code: 0f 0b 68 50 2a 16 88 c2 68 03 c7 03 00 00 00 00 41 5a 5b 5d
> RIP [<ffffffff881554ff>] :nfs:encode_share_access+0x6d/0x82
> RSP <ffff81041d0677b8>
> kernel: last message repeated 2 times
> kernel: ----------- [cut here ] --------- [please bite here ] -----------
--
Jeff Layton <[email protected]>
On Feb 26, 2010, at 12:06 AM, Jeff Layton wrote:
> I can't comment on any of the above since it doesn't contain any
> specific info other than "my stuff doesn't work".
Sorry if it came across as impolite. I'm just tired and nervous.
> Hmmm...so this "blcr" module is calling down into vfs_create (I guess
> to create a device or pipe or something?). If it's crashing in
> encode_share_access then I suspect that the problem is that it's not
> filling out the open_intent data in the nameidata that it's passing
> down to vfs_create.
>
> IOW, this is likely a bug in the "blcr" module and not in RHEL.
With the kernel from OpenSUSE 11.2 (2.6.31) it works, in exactly the same setup.
I'll check; there is a chance that the issue is with the earlier kernel. But it happens only with NFS; it works when everything is on block storage.
I will also cross-post this to the BLCR mailing list.
Anton