When an AFS FS.StoreData RPC call is made, amongst other things it is given
the resultant file size to be. On the server, this is processed by
truncating the file to new size and then writing the data.
Now, kafs has a lock (vnode->io_lock) that serves to serialise operations
against a specific vnode (ie. inode), but the parameters for the op are set
before the lock is taken. This allows two writebacks (say sync and kswapd)
to race - and if writes are ongoing the writeback for a later write could
occur before the writeback for an earlier one if the latter gets
interrupted.
Note that afs_writepages() cannot take i_mutex and only takes a shared lock
on vnode->validate_lock.
Also note that the server does the truncation and the write inside a lock,
so there's no problem at that end.
Fix this by moving the calculation for the proposed new i_size inside the
vnode->io_lock. Also reset the iterator (which we might have read from)
and update the mtime setting there.
Fixes: bd80d8a80e12 ("afs: Use ITER_XARRAY for writing")
Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: [email protected]
cc: [email protected]
---
fs/afs/write.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 8750b99c3f56..c1f4391ccd7c 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -413,17 +413,19 @@ static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter, loff_t
afs_op_set_vnode(op, 0, vnode);
op->file[0].dv_delta = 1;
op->file[0].modification = true;
- op->store.write_iter = iter;
op->store.pos = pos;
op->store.size = size;
- op->store.i_size = max(pos + size, vnode->netfs.remote_i_size);
op->store.laundering = laundering;
- op->mtime = vnode->netfs.inode.i_mtime;
op->flags |= AFS_OPERATION_UNINTR;
op->ops = &afs_store_data_operation;
try_next_key:
afs_begin_vnode_operation(op);
+
+ op->store.write_iter = iter;
+ op->store.i_size = max(pos + size, vnode->netfs.remote_i_size);
+ op->mtime = vnode->netfs.inode.i_mtime;
+
afs_wait_for_operation(op);
switch (op->error) {
On 6/28/2023 9:47 AM, David Howells wrote:
>
> When an AFS FS.StoreData RPC call is made, amongst other things it is given
> the resultant file size to be. On the server, this is processed by
> truncating the file to new size and then writing the data.
>
> Now, kafs has a lock (vnode->io_lock) that serves to serialise operations
> against a specific vnode (ie. inode), but the parameters for the op are set
> before the lock is taken. This allows two writebacks (say sync and kswapd)
> to race - and if writes are ongoing the writeback for a later write could
> occur before the writeback for an earlier one if the latter gets
> interrupted.
>
> Note that afs_writepages() cannot take i_mutex and only takes a shared lock
> on vnode->validate_lock.
>
> Also note that the server does the truncation and the write inside a lock,
> so there's no problem at that end.
>
> Fix this by moving the calculation for the proposed new i_size inside the
> vnode->io_lock. Also reset the iterator (which we might have read from)
> and update the mtime setting there.
>
> Fixes: bd80d8a80e12 ("afs: Use ITER_XARRAY for writing")
> Reported-by: Marc Dionne <[email protected]>
> Signed-off-by: David Howells <[email protected]>
> cc: [email protected]
> cc: [email protected]
> ---
> fs/afs/write.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/fs/afs/write.c b/fs/afs/write.c
> index 8750b99c3f56..c1f4391ccd7c 100644
> --- a/fs/afs/write.c
> +++ b/fs/afs/write.c
> @@ -413,17 +413,19 @@ static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter, loff_t
> afs_op_set_vnode(op, 0, vnode);
> op->file[0].dv_delta = 1;
> op->file[0].modification = true;
> - op->store.write_iter = iter;
> op->store.pos = pos;
> op->store.size = size;
> - op->store.i_size = max(pos + size, vnode->netfs.remote_i_size);
> op->store.laundering = laundering;
> - op->mtime = vnode->netfs.inode.i_mtime;
> op->flags |= AFS_OPERATION_UNINTR;
> op->ops = &afs_store_data_operation;
>
> try_next_key:
> afs_begin_vnode_operation(op);
> +
> + op->store.write_iter = iter;
> + op->store.i_size = max(pos + size, vnode->netfs.remote_i_size);
> + op->mtime = vnode->netfs.inode.i_mtime;
> +
> afs_wait_for_operation(op);
>
> switch (op->error) {
Reviewed-by: Jeffrey Altman <[email protected]>
On Wed, Jun 28, 2023 at 10:47 AM David Howells <[email protected]> wrote:
>
>
> When an AFS FS.StoreData RPC call is made, amongst other things it is given
> the resultant file size to be. On the server, this is processed by
> truncating the file to new size and then writing the data.
>
> Now, kafs has a lock (vnode->io_lock) that serves to serialise operations
> against a specific vnode (ie. inode), but the parameters for the op are set
> before the lock is taken. This allows two writebacks (say sync and kswapd)
> to race - and if writes are ongoing the writeback for a later write could
> occur before the writeback for an earlier one if the latter gets
> interrupted.
>
> Note that afs_writepages() cannot take i_mutex and only takes a shared lock
> on vnode->validate_lock.
>
> Also note that the server does the truncation and the write inside a lock,
> so there's no problem at that end.
>
> Fix this by moving the calculation for the proposed new i_size inside the
> vnode->io_lock. Also reset the iterator (which we might have read from)
> and update the mtime setting there.
>
> Fixes: bd80d8a80e12 ("afs: Use ITER_XARRAY for writing")
> Reported-by: Marc Dionne <[email protected]>
> Signed-off-by: David Howells <[email protected]>
> cc: [email protected]
> cc: [email protected]
> ---
> fs/afs/write.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/fs/afs/write.c b/fs/afs/write.c
> index 8750b99c3f56..c1f4391ccd7c 100644
> --- a/fs/afs/write.c
> +++ b/fs/afs/write.c
> @@ -413,17 +413,19 @@ static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter, loff_t
> afs_op_set_vnode(op, 0, vnode);
> op->file[0].dv_delta = 1;
> op->file[0].modification = true;
> - op->store.write_iter = iter;
> op->store.pos = pos;
> op->store.size = size;
> - op->store.i_size = max(pos + size, vnode->netfs.remote_i_size);
> op->store.laundering = laundering;
> - op->mtime = vnode->netfs.inode.i_mtime;
> op->flags |= AFS_OPERATION_UNINTR;
> op->ops = &afs_store_data_operation;
>
> try_next_key:
> afs_begin_vnode_operation(op);
> +
> + op->store.write_iter = iter;
> + op->store.i_size = max(pos + size, vnode->netfs.remote_i_size);
> + op->mtime = vnode->netfs.inode.i_mtime;
> +
> afs_wait_for_operation(op);
>
> switch (op->error) {
Looks good to me; the traces where I got a failure indicate that an
extending store occurred in a different thread while waiting for the
io lock, so this looks like the right fix.
Reviewed-by: Marc Dionne <[email protected]>
Marc