2000-11-24 21:47:37

by buhr

[permalink] [raw]
Subject: 2.4.0-test5 bug: invalid "shmid_kernel" passed to "shm_nopage_core"

I've been chasing after a bug in 2.4.0-test5 that I can't quite nail
down. I don't see anything obvious between test5 and test11 that
leads me to believe it's been fixed.

I encountered a lockup on my SMP box. One CPU got stuck in a spinlock
via the following call trace. There were enough args and saved
registers on the stack for me to reconstruct a few of the calls:

valid_swaphandles(entry=c218b268, offset=c68e7e78)
swapin_readahead(entry=c218b268)
shm_nopage_core(shp=c218b240, idx=0, address=40014000)
shm_nopage
do_no_page
handle_mm_fault
do_page_fault
schedule
sys_ipc (at call to sys_shmat)

"valid_swaphandles" locked on the:

swap_device_lock(swapdev)

and it's not surprising it did. The SWP_TYPE(entry) was swapfile
index 52 on my 2-swapfile system, so it was spinning on some random
piece of memory.

In "shm_nopage", the code

if(!(shp = shm_lock(inode->i_ino)))
BUG();

got a "shp" of 0xc218b240. For some reason, this wasn't a valid
"shp", because in "shm_nopage_core", the

pte = SHM_ENTRY(shp,idx); // in our case, shp->shm_dir[0][0]

returned 0xc218b268 (i.e., the value of &shp->shm_dir, so maybe
shp->shm_dir was a pointer to itself---not possible if "shp" pointed
to a valid "struct shmid_kernel").

The SHM locking has thwarted my attempts at understanding. Maybe
someone else can see the bug or reassure me that it's already been
fixed in test11?

Kevin <[email protected]>


2000-11-25 10:37:05

by Christoph Rohland

[permalink] [raw]
Subject: Re: 2.4.0-test5 bug: invalid "shmid_kernel" passed to "shm_nopage_core"

Hi Kevin,

[email protected] (Kevin Buhr) writes:
> The SHM locking has thwarted my attempts at understanding. Maybe
> someone else can see the bug or reassure me that it's already been
> fixed in test11?

This is the first report of such corruption. If it's real it is _not_
fixed between test5 and test11. There is probably no way to reproduce
it since you ask if it's fixed in test11, right?

Christoph

2000-11-26 07:36:26

by buhr

[permalink] [raw]
Subject: Re: 2.4.0-test5 bug: invalid "shmid_kernel" passed to "shm_nopage_core"

Christoph Rohland <[email protected]> writes:
>
> This is the first report of such corruption. If it's real it is _not_
> fixed between test5 and test11. There is probably no way to reproduce
> it since you ask if it's fixed in test11, right?

I know no way to reproduce it. I've been using "test5" reliably since
just after its release, and I've triggered this bug only the one time.

I was running Mozilla, one of the few programs I run that uses shared
memory to communicate with the X server. If I recall correctly, the
machine had been idle for a few minutes when my ISP suddenly hung up
on me. Then, I discovered the machine had locked: CPU1 running "pppd"
got stuck waiting for the kernel lock in "sock_ioctl". I believe it
was the innocent victim. CPU0 (running "XF86_SVGA") had grabbed the
kernel lock and gotten stuck spinning on the invalid swap device
spinlock, as mentioned in my previous message.

I use a SysReq patch to do an oops-style dump instead of the usual
"showPc" function, so I was able to copy a stack dump down.

>From the stack dump, I can be 100% positive that, in shm_nopage_core,
"shp" was 0xc218b240 on entry and "idx" was 0, but the line

pte = SHM_ENTRY(shp,idx);

calculated a value of 0xc218b268, the memory location of
"shp->shm_dir". That is, I had shp->shm_dir == **shp->shm_dir, so I
*suspect* that that shp->shm_dir == *shp->shm_dir.

In any event, the "shp" was corrupt (hadn't been initialized or had
been freed and reused).

I'll fiddle around a bit more and see if I can find a way to reproduce
it reliably.

Thanks.

Kevin <[email protected]>

2000-11-26 17:00:12

by Christoph Rohland

[permalink] [raw]
Subject: Re: 2.4.0-test5 bug: invalid "shmid_kernel" passed to "shm_nopage_core"

Hi Kevin,

[email protected] (Kevin Buhr) writes:

> I know no way to reproduce it. I've been using "test5" reliably since
> just after its release, and I've triggered this bug only the one time.

That's what I feared :-(

> I use a SysReq patch to do an oops-style dump instead of the usual
> "showPc" function, so I was able to copy a stack dump down.

Could you send me the patch? Does it do the dump on all cpus?

> I'll fiddle around a bit more and see if I can find a way to reproduce
> it reliably.

I would be happy to help debug the shm code if you find a way to
reproduce it.

Christoph

2000-11-26 21:06:11

by buhr

[permalink] [raw]
Subject: Re: 2.4.0-test5 bug: invalid "shmid_kernel" passed to "shm_nopage_core"

Christoph Rohland <[email protected]> writes:
>
> > I use a SysReq patch to do an oops-style dump instead of the usual
> > "showPc" function, so I was able to copy a stack dump down.
>
> Could you send me the patch? Does it do the dump on all cpus?

You can grab it at:

ftp://mozart.stat.wisc.edu/pub/misc/patch-2.4.0-test5-sysreq

It doesn't dump all CPUs; it just dumps whichever one handles the
SysReq request, so I just keep doing it until I get them both.

> I would be happy to help debug the shm code if you find a way to
> reproduce it.

Okay. I've actually determined that my window manager (Enlightenment)
creates a shared memory segment for the X server to map and unmap
anywhere from 25 to 100 times a second; the segment appears in the X
server's memory map at the address 0x40014000 that "shm_nopage" was
trying to fault in when my lockup occurred.

I didn't notice it before because the time the page is mapped is
small. To catch it, I had to do

while true; do egrep 40014000 /proc/xxx/maps; done

and wait a bit. A "printk" for shm_mmapping at address 0x40014004
gave me the 25-100 times per second figure.

The fact that this has crashed once in all the time I've been using
this setup would seem to imply a very subtle race condition. Ugh.

Can you offer me a tutorial on the SHM locking? What's supposed to
protect against what?

In the meantime, I'll keep plugging away at it.

Kevin <[email protected]>

2000-12-19 09:30:17

by Christoph Rohland

[permalink] [raw]
Subject: Re: 2.4.0-test5 bug: invalid "shmid_kernel" passed to "shm_nopage_core"

Hi Kevin,

On 26 Nov 2000, Kevin Buhr wrote:
> The fact that this has crashed once in all the time I've been using
> this setup would seem to imply a very subtle race condition. Ugh.

I am just running a stress test on 2.4.0-test13-pre3 + appended patch
without problems. Is the shm segment deleted sometimes or is it always
the same segment?

> Can you offer me a tutorial on the SHM locking? What's supposed to
> protect against what?

The locking should be much easier to understand in 2.4.0-test13-pre3:

SYSV has two locks:
- shmids.sem protects addition/removal of shm ids
- the per id lock protects data changes in the id

shmem uses also to locks:
- the inode semaphore is used to protect nopage to race with itself
or truncate
- shmem_inode_info.u.shmem_i.lock is used to protect swapout against
the others

Greetings
Christoph

diff -uNr 4-13-3/ipc/shm.c c/ipc/shm.c
--- 4-13-3/ipc/shm.c Mon Dec 18 15:08:32 2000
+++ c/ipc/shm.c Mon Dec 18 20:07:21 2000
@@ -15,23 +15,13 @@
*
*/

-#include <linux/config.h>
-#include <linux/module.h>
#include <linux/malloc.h>
#include <linux/shm.h>
-#include <linux/swap.h>
-#include <linux/smp_lock.h>
#include <linux/init.h>
-#include <linux/locks.h>
#include <linux/file.h>
#include <linux/mman.h>
-#include <linux/vmalloc.h>
-#include <linux/pagemap.h>
#include <linux/proc_fs.h>
-#include <linux/highmem.h>
-
#include <asm/uaccess.h>
-#include <asm/pgtable.h>

#include "util.h"

@@ -109,6 +99,7 @@
BUG();
shp->shm_atim = CURRENT_TIME;
shp->shm_lprid = current->pid;
+ shp->shm_nattch++;
shm_unlock(id);
}

@@ -123,21 +114,14 @@
*
* @shp: struct to free
*
- * It has to be called with shp and shm_ids.sem locked and will
- * release them
+ * It has to be called with shp and shm_ids.sem locked
*/
static void shm_destroy (struct shmid_kernel *shp)
{
- struct file * file = shp->shm_file;
-
- shp->shm_file = NULL;
shm_tot -= (shp->shm_segsz + PAGE_SIZE - 1) >> PAGE_SHIFT;
- shm_unlock (shp->id);
shm_rmid (shp->id);
+ fput (shp->shm_file);
kfree (shp);
- up (&shm_ids.sem);
- /* put the file outside the critical path to prevent recursion */
- fput (file);
}

/*
@@ -158,10 +142,10 @@
BUG();
shp->shm_lprid = current->pid;
shp->shm_dtim = CURRENT_TIME;
- if(shp->shm_flags & SHM_DEST &&
- file_count (file) == 2) /* shp and the vma have the last
- references*/
- return shm_destroy (shp);
+ shp->shm_nattch--;
+ if(shp->shm_nattch == 0 &&
+ shp->shm_flags & SHM_DEST)
+ shm_destroy (shp);

shm_unlock(id);
up (&shm_ids.sem);
@@ -176,7 +160,7 @@
}

static struct file_operations shm_file_operations = {
- mmap: shm_mmap
+ mmap: shm_mmap
};

static struct vm_operations_struct shm_vm_ops = {
@@ -218,9 +202,10 @@
shp->shm_atim = shp->shm_dtim = 0;
shp->shm_ctim = CURRENT_TIME;
shp->shm_segsz = size;
+ shp->shm_nattch = 0;
shp->id = shm_buildid(id,shp->shm_perm.seq);
shp->shm_file = file;
- file->f_dentry->d_inode->i_ino = id;
+ file->f_dentry->d_inode->i_ino = shp->id;
file->f_op = &shm_file_operations;
shm_tot += numpages;
shm_unlock (id);
@@ -370,15 +355,13 @@
struct inode * inode;

shp = shm_get(i);
- if(shp == NULL || shp->shm_file == NULL)
+ if(shp == NULL)
continue;
inode = shp->shm_file->f_dentry->d_inode;
- down (&inode->i_sem);
- *rss += inode->i_mapping->nrpages;
spin_lock (&inode->u.shmem_i.lock);
+ *rss += inode->i_mapping->nrpages;
*swp += inode->u.shmem_i.swapped;
spin_unlock (&inode->u.shmem_i.lock);
- up (&inode->i_sem);
}
}

@@ -462,7 +445,7 @@
tbuf.shm_ctime = shp->shm_ctim;
tbuf.shm_cpid = shp->shm_cprid;
tbuf.shm_lpid = shp->shm_lprid;
- tbuf.shm_nattch = file_count (shp->shm_file) - 1;
+ tbuf.shm_nattch = shp->shm_nattch;
shm_unlock(shmid);
if(copy_shmid_to_user (buf, &tbuf, version))
return -EFAULT;
@@ -512,13 +495,12 @@
goto out_up;
err = shm_checkid(shp, shmid);
if (err == 0) {
- if (file_count (shp->shm_file) == 1) {
+ if (shp->shm_nattch){
+ shp->shm_flags |= SHM_DEST;
+ /* Do not find it any more */
+ shp->shm_perm.key = IPC_PRIVATE;
+ } else
shm_destroy (shp);
- return 0;
- }
- shp->shm_flags |= SHM_DEST;
- /* Do not find it any more */
- shp->shm_perm.key = IPC_PRIVATE;
}
/* Unlock */
shm_unlock(shmid);
@@ -619,13 +601,23 @@
return -EACCES;
}
file = shp->shm_file;
- get_file (file);
+ shp->shm_nattch++;
shm_unlock(shmid);

down(&current->mm->mmap_sem);
user_addr = (void *) do_mmap (file, addr, file->f_dentry->d_inode->i_size, prot, flags, 0);
up(&current->mm->mmap_sem);
- fput (file);
+
+ down (&shm_ids.sem);
+ if(!(shp = shm_lock(shmid)))
+ BUG();
+ shp->shm_nattch--;
+ if(shp->shm_nattch == 0 &&
+ shp->shm_flags & SHM_DEST)
+ shm_destroy (shp);
+ shm_unlock(shmid);
+ up (&shm_ids.sem);
+
*raddr = (unsigned long) user_addr;
err = 0;
if (IS_ERR(user_addr))
@@ -684,7 +676,7 @@
shp->shm_segsz,
shp->shm_cprid,
shp->shm_lprid,
- file_count (shp->shm_file) - 1,
+ shp->shm_nattch,
shp->shm_perm.uid,
shp->shm_perm.gid,
shp->shm_perm.cuid,
diff -uNr 4-13-3/mm/shmem.c c/mm/shmem.c
--- 4-13-3/mm/shmem.c Mon Dec 18 15:08:32 2000
+++ c/mm/shmem.c Mon Dec 18 15:13:10 2000
@@ -210,37 +210,39 @@
{
int error;
struct shmem_inode_info *info;
- swp_entry_t *entry;
+ swp_entry_t *entry, swap;

info = &((struct inode *)page->mapping->host)->u.shmem_i;
if (info->locked)
return 1;
- spin_lock(&info->lock);
- entry = shmem_swp_entry (info, page->index);
- if (!entry) /* this had been allocted on page allocation */
- BUG();
- error = -EAGAIN;
- if (entry->val)
- goto out;
-
/*
* 1 means "cannot write out".
* We can't drop dirty pages
* just because we ran out of
* swap.
*/
- error = 1;
- *entry = __get_swap_page(2);
- if (!entry->val)
+ swap = __get_swap_page(2);
+ if (!swap.val)
+ return 1;
+
+ spin_lock(&info->lock);
+ entry = shmem_swp_entry (info, page->index);
+ if (!entry) /* this had been allocted on page allocation */
+ BUG();
+ error = -EAGAIN;
+ if (entry->val) {
+ __swap_free(swap, 2);
goto out;
+ }

+ *entry = swap;
error = 0;
/* Remove the from the page cache */
lru_cache_del(page);
remove_inode_page(page);

/* Add it to the swap cache */
- add_to_swap_cache(page,*entry);
+ add_to_swap_cache(page, swap);
page_cache_release(page);
SetPageDirty(page);
info->swapped++;

2000-12-19 18:42:52

by buhr

[permalink] [raw]
Subject: Re: 2.4.0-test5 bug: invalid "shmid_kernel" passed to "shm_nopage_core"

Christoph Rohland <[email protected]> writes:
>
> I am just running a stress test on 2.4.0-test13-pre3 + appended patch
> without problems. Is the shm segment deleted sometimes or is it always
> the same segment?

IIRC, in my particular crash case, the Enlightenment window manager
was using the X shared memory extension to take snapshots of the
screen for its little desktop pager window around 30 times a second;
Mozilla was also sharing some memory with the X server for something.

The code in Enlightenment did a complete shmget/shmat/shmctl(RMID)/shmdt
cycle, so that segment *was* being constantly deleted. The Mozilla
ones stuck around. The particular address that was being reference in
the shm_nopage_core call corresponded to the segments being created
and deleted by Enlightenment, however.

Thanks for the locking tutorial, too.

Kevin <[email protected]>

2000-12-20 08:01:26

by Christoph Rohland

[permalink] [raw]
Subject: Re: 2.4.0-test5 bug: invalid "shmid_kernel" passed to "shm_nopage_core"

Hi Kevin,

On 19 Dec 2000, Kevin Buhr wrote:
> The code in Enlightenment did a complete
> shmget/shmat/shmctl(RMID)/shmdt cycle, so that segment *was* being
> constantly deleted. The Mozilla ones stuck around. The particular
> address that was being reference in the shm_nopage_core call
> corresponded to the segments being created and deleted by
> Enlightenment, however.

OK. The test with the complete cycle did run now for the whole night
on 13-pre3 + my patch. Seems to be stable for me.

Greetings
Christoph