2011-05-21 08:08:09

by Boaz Harrosh

[permalink] [raw]
Subject: pnfs-submit for objects: BUGs Please help


Benny, people hi!

After long debugging of the "pnfs-submit for objects" and after some bugs found.
Please see next-mail patch: [RFC] pnfs: pnfs-core BUGs fixing

I'm at a situation that the lseg reference count is unbalanced short, and the
segment gets freed after layout_commit stage, and never survives to see
a layout_return. (So layout_commit is the last operation I see)

What is nice is that there is a BUG_ON that caches this, (I've converted it to WARN_ON
for debbuging but just the same)

Please see below:

------------[ cut here ]------------
WARNING: at /usr0/export/dev/bharrosh/git/pub/linux-pnfs/fs/nfs/pnfs.c:258 put_lseg_common+0x29/0x7f [nfs]()
Modules linked in: md5 objlayoutdriver exofs nfsd nfs lockd auth_rpcgss nfs_acl sunrpc osd libosd cryptomgr aead crc32c crypto_hash crypto_algapi ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi scsi_mod binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath dm_mod [last unloaded: scsi_wait_scan]
Call Trace:
76a5bb98: [<60030b0a>] warn_slowpath_common+0x5e/0x75
76a5bbd8: [<60030b36>] warn_slowpath_null+0x15/0x17
76a5bbe8: [<7b7c97df>] put_lseg_common+0x29/0x7f [nfs]
76a5bc08: [<7b7ca685>] put_lseg+0x6d/0x9b [nfs]
76a5bc48: [<7b7b8c0a>] nfs4_layoutcommit_release+0x27/0x3f [nfs]
76a5bc68: [<7a8a3e06>] rpc_release_calldata+0x12/0x14 [sunrpc]
76a5bc78: [<7a8a3ec8>] rpc_free_task+0x57/0x5f [sunrpc]
76a5bca8: [<7a8a3f1d>] rpc_final_put_task+0x4d/0x4f [sunrpc]
76a5bcb8: [<7a8a3f49>] rpc_do_put_task+0x2a/0x31 [sunrpc]
76a5bce8: [<7a8a3f6a>] rpc_put_task+0xb/0xd [sunrpc]
76a5bcf8: [<7b7b4d7f>] nfs4_proc_layoutcommit+0x10b/0x119 [nfs]
76a5bd88: [<7b7c8e71>] pnfs_layoutcommit_inode+0x1d5/0x207 [nfs]
76a5bde8: [<7b79f293>] nfs_file_fsync+0xa2/0xab [nfs]
76a5be18: [<600aa80e>] vfs_fsync_range+0x4d/0x75
76a5be48: [<600aa88b>] vfs_fsync+0x17/0x19
76a5be58: [<7b79ee2c>] nfs_file_flush+0x5c/0x61 [nfs]
76a5be78: [<60087638>] filp_close+0x3f/0x76
76a5bea8: [<6008771a>] sys_close+0xab/0xe7
76a5bee8: [<60016fd4>] handle_syscall+0x58/0x70
76a5bf08: [<600264d7>] userspace+0x2dd/0x38a
76a5bfc8: [<600145e8>] fork_handler+0x7d/0x84

---[ end trace 00e42804444a3cac ]---

So you can see that put_lseg() in nfs4_layoutcommit_release decides that this is the
last ref and frees lseg. (See pnfs.c:258)

I'm not sure where is the missing ref? Please help!!

I'll send the fixes needed to get to this stage as reply. Current tree will crash much
earlier.

Benny I've done all the work needed on the pnfs-objects I'll send that next look for
the:
[PATCHSET 00/11] SQUASHME pnfs-obj: Lots of changes addressing comments by Trond and Benny

Thanks
Boaz



2011-05-21 08:43:28

by Boaz Harrosh

[permalink] [raw]
Subject: [RFC] Bugs in new pnfs write path


1. In nfs4_write_done_cb: data->write_done_cb comes with a NULL.
Just as a guess I call nfs4_write_done_cb() just above it
it looked like the right thing todo. With that in, I'm able
to write things to file When converting pnfs.c:258 to a WARN_ON.

Benny we might want to set data->write_done_cb somewhere in the
none-rpc path? where is it best to do that?

2. In pnfs_ld_write_done:
put_lseg(data->lseg);
data->lseg = NULL;
was done before the call to pnfs_set_layoutcommit()
which trys to get_lseg() on that same data->lseg.

3. In pnfs_ld_write_done:
data->mds_ops->rpc_call_done(NULL, data);
crashes with a NULL task. Just pass it with &data->task

Which calls for a cleanup. There is bunch of functions
with [task, write_data] API. And the task is always
write_data->task

Signed-off-by: Boaz Harrosh <[email protected]>
---
fs/nfs/nfs4proc.c | 3 ++-
fs/nfs/pnfs.c | 10 ++++++----
2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 759523a..1a53187 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3250,7 +3250,8 @@ static int nfs4_write_done(struct rpc_task *task, struct nfs_write_data *data)
{
if (!nfs4_sequence_done(task, &data->res.seq_res))
return -EAGAIN;
- return data->write_done_cb(task, data);
+ return data->write_done_cb ? data->write_done_cb(task, data) :
+ nfs4_write_done_cb(task, data);
}

/* Reset the the nfs_write_data to send the write to the MDS. */
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 17d0c4c..b04cdb4 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -255,7 +255,7 @@ put_lseg_common(struct pnfs_layout_segment *lseg)
{
struct inode *inode = lseg->pls_layout->plh_inode;

- BUG_ON(test_bit(NFS_LSEG_VALID, &lseg->pls_flags));
+ WARN_ON(test_bit(NFS_LSEG_VALID, &lseg->pls_flags));
list_del_init(&lseg->pls_list);
if (list_empty(&lseg->pls_layout->plh_segs)) {
set_bit(NFS_LAYOUT_DESTROYED, &lseg->pls_layout->plh_flags);
@@ -1124,15 +1124,17 @@ pnfs_ld_write_done(struct nfs_write_data *data)
{
int status;

- put_lseg(data->lseg);
- data->lseg = NULL;
if (!data->pnfs_error) {
pnfs_set_layoutcommit(data);
- data->mds_ops->rpc_call_done(NULL, data);
+ data->mds_ops->rpc_call_done(&data->task, data);
data->mds_ops->rpc_release(data);
+ put_lseg(data->lseg);
+ data->lseg = NULL;
return 0;
}

+ put_lseg(data->lseg);
+ data->lseg = NULL;
dprintk("%s: pnfs_error=%d, retry via MDS\n", __func__,
data->pnfs_error);
status = nfs_initiate_write(data, NFS_CLIENT(data->inode), data->mds_ops, NFS_FILE_SYNC);
--
1.7.2.3