2004-11-15 20:45:29

by Hugh Dickins

[permalink] [raw]
Subject: [PATCH] tmpfs symlink corrupts mempolicy

Andrea discovered that short symlinks on tmpfs, stored within the inode
itself, overwrote the NUMA mempolicy field which shmem_destroy_inode
expected to find there. His fix was good, but Hugh changed it around a
little, to match existing shmem.c practice (now mpol_init in cases which
might allocate a page, mpol_free in shmem_truncate_inode), and allow for
possibility that mpol_init for a long symlink might one day do something
which really needs mpol_free.

Signed-off-by: Hugh Dickins <[email protected]>
---

Thanks a lot for working that out, Andrea: is this version okay with you?
I've not studied the mempolicy.c part of the patch you posted, which
seemed to be an entirely separate (and less urgent) patch,
better reviewed by Andi.

--- 2.6.10-rc2/mm/shmem.c 2004-11-15 16:21:24.000000000 +0000
+++ linux/mm/shmem.c 2004-11-15 19:08:58.366829456 +0000
@@ -672,6 +672,7 @@ static void shmem_delete_inode(struct in
shmem_unacct_size(info->flags, inode->i_size);
inode->i_size = 0;
shmem_truncate(inode);
+ mpol_free_shared_policy(&info->policy);
if (!list_empty(&info->swaplist)) {
spin_lock(&shmem_swaplist_lock);
list_del_init(&info->swaplist);
@@ -1292,7 +1293,6 @@ shmem_get_inode(struct super_block *sb,
info = SHMEM_I(inode);
memset(info, 0, (char *)inode - (char *)info);
spin_lock_init(&info->lock);
- mpol_shared_policy_init(&info->policy);
INIT_LIST_HEAD(&info->swaplist);

switch (mode & S_IFMT) {
@@ -1303,6 +1303,7 @@ shmem_get_inode(struct super_block *sb,
case S_IFREG:
inode->i_op = &shmem_inode_operations;
inode->i_fop = &shmem_file_operations;
+ mpol_shared_policy_init(&info->policy);
break;
case S_IFDIR:
inode->i_nlink++;
@@ -1766,12 +1767,13 @@ static int shmem_symlink(struct inode *d
memcpy(info, symname, len);
inode->i_op = &shmem_symlink_inline_operations;
} else {
+ inode->i_op = &shmem_symlink_inode_operations;
+ mpol_shared_policy_init(&info->policy);
error = shmem_getpage(inode, 0, &page, SGP_WRITE, NULL);
if (error) {
iput(inode);
return error;
}
- inode->i_op = &shmem_symlink_inode_operations;
kaddr = kmap_atomic(page, KM_USER0);
memcpy(kaddr, symname, len);
kunmap_atomic(kaddr, KM_USER0);
@@ -2026,7 +2028,6 @@ static struct inode *shmem_alloc_inode(s

static void shmem_destroy_inode(struct inode *inode)
{
- mpol_free_shared_policy(&SHMEM_I(inode)->policy);
kmem_cache_free(shmem_inode_cachep, SHMEM_I(inode));
}



2004-11-15 22:34:47

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: [PATCH] tmpfs symlink corrupts mempolicy

On Mon, Nov 15, 2004 at 08:41:12PM +0000, Hugh Dickins wrote:
> Andrea discovered that short symlinks on tmpfs, stored within the inode
> itself, overwrote the NUMA mempolicy field which shmem_destroy_inode
> expected to find there. His fix was good, but Hugh changed it around a
> little, to match existing shmem.c practice (now mpol_init in cases which
> might allocate a page, mpol_free in shmem_truncate_inode), and allow for
> possibility that mpol_init for a long symlink might one day do something
> which really needs mpol_free.
>
> Signed-off-by: Hugh Dickins <[email protected]>
> ---
>
> Thanks a lot for working that out, Andrea: is this version okay with you?
> I've not studied the mempolicy.c part of the patch you posted, which
> seemed to be an entirely separate (and less urgent) patch,
> better reviewed by Andi.
>
> --- 2.6.10-rc2/mm/shmem.c 2004-11-15 16:21:24.000000000 +0000
> +++ linux/mm/shmem.c 2004-11-15 19:08:58.366829456 +0000
> @@ -672,6 +672,7 @@ static void shmem_delete_inode(struct in
> shmem_unacct_size(info->flags, inode->i_size);
> inode->i_size = 0;
> shmem_truncate(inode);
> + mpol_free_shared_policy(&info->policy);
> if (!list_empty(&info->swaplist)) {
> spin_lock(&shmem_swaplist_lock);
> list_del_init(&info->swaplist);

this patch is completely broken, delete_inode isn't going to be called
when the inode is being shrunk. delete_inode is only good for truncate,
mpol_free_shared_policy has nothing to do with the nlink value.

this patch will tend to work until the vm shrink the dcache, then it'll
crash, sorry.

2004-11-16 00:27:18

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] tmpfs symlink corrupts mempolicy

On Mon, 15 Nov 2004, Andrea Arcangeli wrote:
>
> this patch is completely broken, delete_inode isn't going to be called
> when the inode is being shrunk. delete_inode is only good for truncate,

Do you mean "when the inode cache is being shrunk"?
By "truncate" there you mean "unlink"?

> mpol_free_shared_policy has nothing to do with the nlink value.

I didn't think that it would.

> this patch will tend to work until the vm shrink the dcache, then it'll
> crash, sorry.

Sorry, you can see I'm having some trouble understanding your reply.

I think you're forgetting that tmpfs inodes live nowhere but in memory:
if shrinking the inode cache were to remove tmpfs inodes, it would be a
considerably more temporary fs than could ever be useful. There's an
extra dget that keeps dentry and inode safe from pruning.

mpol_shared_policy_init is appropriate when we might want to allocate
pages to the inode (regular file or long symlink); mpol_free_shared_policy
is appropriate when we've given up the possibility of allocating pages to
such an inode - that's after shmem_delete_inode's shmem_truncate.
Whereas shmem_destroy_inode is the complement of shmem_alloc_inode,
neither of which should get into this mpol stuff.

If you're unconvinced, please suggest a test case which will do the
wrong thing, and I'll check it out tomorrow - thanks.

Hugh

2004-11-16 02:14:13

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: [PATCH] tmpfs symlink corrupts mempolicy

On Tue, Nov 16, 2004 at 12:26:05AM +0000, Hugh Dickins wrote:
> On Mon, 15 Nov 2004, Andrea Arcangeli wrote:
> >
> > this patch is completely broken, delete_inode isn't going to be called
> > when the inode is being shrunk. delete_inode is only good for truncate,
>
> Do you mean "when the inode cache is being shrunk"?
> By "truncate" there you mean "unlink"?

yes.

> > this patch will tend to work until the vm shrink the dcache, then it'll
> > crash, sorry.
>
> Sorry, you can see I'm having some trouble understanding your reply.
>
> I think you're forgetting that tmpfs inodes live nowhere but in memory:
> if shrinking the inode cache were to remove tmpfs inodes, it would be a
> considerably more temporary fs than could ever be useful. There's an
> extra dget that keeps dentry and inode safe from pruning.

I was wrong about it crashing, but it will not crash only for shmfs,
if you want to export that to MAP_SHARED on other fs, you won't be
allowed anymore to depending on this shmfs speciaility of delete_inode
being recalled only during unlink. I don't see how depending on this
subtle detail to free the mpol could ever be an improvement instead of
doing in destroy_inode where it belongs to.

2004-11-16 05:50:48

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] tmpfs symlink corrupts mempolicy

On Tue, 16 Nov 2004, Andrea Arcangeli wrote:
>
> I was wrong about it crashing, but it will not crash only for shmfs,
> if you want to export that to MAP_SHARED on other fs, you won't be

I've not yet studied the patches which apply NUMA memory policy to
general page cache, but that's something different. (I'd like to
think they could extend and replace the shmem special case mpol,
but API commitment may turn out to prevent that.)

Of course they won't be able to rely on tmpfs peculiarities, but
they will surely integrate the mpol with generic inode/mapping,
not the tmpfs-specfic inode. I don't know whether they would free
up the mpol in destroy_inode, or near the truncate_inode_pages.

> allowed anymore to depending on this shmfs speciaility of delete_inode
> being recalled only during unlink. I don't see how depending on this
> subtle detail to free the mpol could ever be an improvement instead of
> doing in destroy_inode where it belongs to.

But we're talking about tmpfs (shmfs) for now: which has this space-
saving but nasty oddity that it keeps the short symlinks overwriting
its filesystem-specific inode structure, but the long symlinks in a
page allocated according to NUMA memory policy. By the time you get
to shmem_destroy_inode, you don't know which was which. No, I'm
wrong on that, you can test inode->i_op->truncate to decide
(I'd have to check whether inode->i_op might be NULL).

Your code is relying on the fact that actually the mpol for a long
symlink is always the default, and the default doesn't need any memory
to be freed . That may well remain the case forever, but Brent has a
patch to set tmpfs default memory policy by mount option. That patch
may go nowhere, or we might override the default for symlinks anyway,
but potentially it could need mpol freed even in long symlink case.

The place where tmpfs currently deals with regular files and long
symlinks (those which may allocate pages) versus the rest is in
shmem_delete_inode, hence I moved the mpol freeing there. Perhaps
that whole bank of code could be moved into shmem_destroy_inode,
but I rather doubt it - shmem_destroy_inode is there to free what
shmem_alloc_inode allocated, and that doesn't involve mpol at all.

Hugh