2001-02-24 21:14:46

by Nate Eldredge

Subject: 2.4.2-ac3: loop threads in D state

Kernel 2.4.2-ac3.

FLAGS UID PID PPID PRI  NI SIZE RSS WCHAN STA TTY TIME COMMAND
   40   0 425    1  -1 -20    0   0 down  DW<   ? 0:00 (loop0)

From a look at the source it seems that this may be normal behavior
(though I'm not sure). However, it's still cosmetically annoying,
because it throws off the load average (a D-state process is counted
as "running" for the loadavg calculation).

My loopback-mounted fs seems to be working fine, nevertheless, which
is a nice change from previous kernels.

--

Nate Eldredge
[email protected]


2001-02-25 22:33:44

by Nate Eldredge

Subject: Re: 2.4.2-ac3: loop threads in D state

Nate Eldredge writes:
> Kernel 2.4.2-ac3.
>
> FLAGS UID PID PPID PRI  NI SIZE RSS WCHAN STA TTY TIME COMMAND
>    40   0 425    1  -1 -20    0   0 down  DW<   ? 0:00 (loop0)

It looks like this has been addressed in the thread "242-ac3 loop
bug". Jens Axboe posted a patch, but the list archive I'm reading
mangled it. Jens, could you make this patch available somewhere, or
at least email me a copy? (If it's going in an upcoming -ac patch,
then don't bother; I can wait until then.)

Thanks.

> From a look at the source it seems that this may be normal behavior
> (though I'm not sure). However, it's still cosmetically annoying,
> because it throws off the load average (a D state process is counted
> as "running" for the loadavg calculation).
>
> My loopback-mounted fs seems to be working fine, nevertheless, which
> is a nice change from previous kernels.

--

Nate Eldredge
[email protected]

2001-02-25 22:40:44

by Jens Axboe

Subject: Re: 2.4.2-ac3: loop threads in D state

--- drivers/block/loop.c~	Sat Feb 24 23:08:38 2001
+++ drivers/block/loop.c	Sat Feb 24 23:11:13 2001
@@ -507,7 +507,7 @@
 	sprintf(current->comm, "loop%d", lo->lo_number);
 
 	spin_lock_irq(&current->sigmask_lock);
-	siginitsetinv(&current->blocked, sigmask(SIGKILL));
+	sigfillset(&current->blocked);
 	flush_signals(current);
 	spin_unlock_irq(&current->sigmask_lock);
 
@@ -525,7 +525,7 @@
 	up(&lo->lo_sem);
 
 	for (;;) {
-		down(&lo->lo_bh_mutex);
+		down_interruptible(&lo->lo_bh_mutex);
 		if (!atomic_read(&lo->lo_pending))
 			break;



2001-02-25 22:48:44

by Alexander Viro

Subject: Re: 2.4.2-ac3: loop threads in D state



On Sun, 25 Feb 2001, Jens Axboe wrote:

> On Sun, Feb 25 2001, Nate Eldredge wrote:
> > Nate Eldredge writes:
> > > Kernel 2.4.2-ac3.
> > >
> > > FLAGS UID PID PPID PRI  NI SIZE RSS WCHAN STA TTY TIME COMMAND
> > >    40   0 425    1  -1 -20    0   0 down  DW<   ? 0:00 (loop0)
> >
> > It looks like this has been addressed in the thread "242-ac3 loop
> > bug". Jens Axboe posted a patch, but the list archive I'm reading
> > mangled it. Jens, could you make this patch available somewhere, or
> > at least email me a copy? (If it's going in an upcoming -ac patch,
> > then don't bother; I can wait until then.)
>
> Patch is here, I haven't checked whether Alan put it in ac4 yet (I
> did cc him, but no one knows for sure :-).

Jens, you have a race in lo_clr_fd() (loop-6). I've put the fixed
variant on ftp.math.psu.edu/pub/viro/loop-S2.gz. Diff and you'll
see - it's in the very beginning of the lo_clr_fd().
Cheers,
Al

2001-02-26 00:25:25

by Jens Axboe

Subject: Re: 2.4.2-ac3: loop threads in D state

diff -ur --exclude-from /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.2-ac4/drivers/block/loop.c linux/drivers/block/loop.c
--- /opt/kernel/linux-2.4.2-ac4/drivers/block/loop.c	Mon Feb 26 01:19:38 2001
+++ linux/drivers/block/loop.c	Mon Feb 26 01:22:28 2001
@@ -79,7 +79,6 @@
 static int *loop_sizes;
 static int *loop_blksizes;
 static devfs_handle_t devfs_handle;	/* For the directory */
-static kmem_cache_t *loop_bhp;
 
 /*
  * Transfer functions
@@ -289,7 +288,7 @@
 	if (bh) {
 		kunmap(bh->b_page);
 		__free_page(bh->b_page);
-		kmem_cache_free(loop_bhp, bh);
+		kmem_cache_free(bh_cachep, bh);
 	}
 }
 
@@ -358,7 +357,7 @@
 	struct buffer_head *bh;
 
 	do {
-		bh = kmem_cache_alloc(loop_bhp, SLAB_BUFFER);
+		bh = kmem_cache_alloc(bh_cachep, SLAB_BUFFER);
 		if (bh)
 			break;
 
@@ -508,7 +507,7 @@
 	sprintf(current->comm, "loop%d", lo->lo_number);
 
 	spin_lock_irq(&current->sigmask_lock);
-	siginitsetinv(&current->blocked, sigmask(SIGKILL));
+	sigfillset(&current->blocked);
 	flush_signals(current);
 	spin_unlock_irq(&current->sigmask_lock);
 
@@ -526,7 +525,7 @@
 	up(&lo->lo_sem);
 
 	for (;;) {
-		down(&lo->lo_bh_mutex);
+		down_interruptible(&lo->lo_bh_mutex);
 		if (!atomic_read(&lo->lo_pending))
 			break;
 
@@ -671,9 +670,12 @@
 	if (lo->lo_refcnt > 1)	/* we needed one fd for the ioctl */
 		return -EBUSY;
 
+	spin_lock_irq(&lo->lo_lock);
 	lo->lo_state = Lo_rundown;
 	atomic_dec(&lo->lo_pending);
 	up(&lo->lo_bh_mutex);
+	spin_unlock_irq(&lo->lo_lock);
+
 	down(&lo->lo_sem);
 
 	lo->lo_backing_file = NULL;
@@ -927,13 +929,6 @@
 		return -EIO;
 	}
 
-	loop_bhp = kmem_cache_create("loop_buffers", sizeof(struct buffer_head),
-				     0, SLAB_HWCACHE_ALIGN, NULL, NULL);
-	if (!loop_bhp) {
-		printk(KERN_WARNING "loop: unable to create slab cache\n");
-		return -ENOMEM;
-	}
-
 	devfs_handle = devfs_mk_dir(NULL, "loop", NULL);
 	devfs_register_series(devfs_handle, "%u", max_loop, DEVFS_FL_DEFAULT,
 			      MAJOR_NR, 0,
@@ -942,7 +937,7 @@
 
 	loop_dev = kmalloc(max_loop * sizeof(struct loop_device), GFP_KERNEL);
 	if (!loop_dev)
-		goto out_dev;
+		return -ENOMEM;
 
 	loop_sizes = kmalloc(max_loop * sizeof(int), GFP_KERNEL);
 	if (!loop_sizes)
@@ -974,8 +969,6 @@
 	printk(KERN_INFO "loop: loaded (max %d devices)\n", max_loop);
 	return 0;
 
-out_dev:
-	kmem_cache_destroy(loop_bhp);
 out_sizes:
 	kfree(loop_dev);
 out_blksizes:
@@ -990,7 +983,6 @@
 	if (devfs_unregister_blkdev(MAJOR_NR, "loop"))
 		printk(KERN_WARNING "loop: cannot unregister blkdev\n");
 
-	kmem_cache_destroy(loop_bhp);
 	kfree(loop_dev);
 	kfree(loop_sizes);
 	kfree(loop_blksizes);
diff -ur --exclude-from /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.2-ac4/fs/Makefile linux/fs/Makefile
--- /opt/kernel/linux-2.4.2-ac4/fs/Makefile	Mon Feb 26 01:19:40 2001
+++ linux/fs/Makefile	Mon Feb 26 01:14:44 2001
@@ -7,7 +7,7 @@
 
 O_TARGET := fs.o
 
-export-objs := filesystems.o open.o
+export-objs := filesystems.o open.o dcache.o
 mod-subdirs := nls
 
 obj-y := open.o read_write.o devices.o file_table.o buffer.o \
diff -ur --exclude-from /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.2-ac4/fs/dcache.c linux/fs/dcache.c
--- /opt/kernel/linux-2.4.2-ac4/fs/dcache.c	Sat Feb 17 01:06:17 2001
+++ linux/fs/dcache.c	Mon Feb 26 01:14:54 2001
@@ -22,6 +22,7 @@
 #include <linux/init.h>
 #include <linux/smp_lock.h>
 #include <linux/cache.h>
+#include <linux/module.h>
 
 #include <asm/uaccess.h>
 
@@ -1250,6 +1251,7 @@
 
 /* SLAB cache for buffer_head structures */
 kmem_cache_t *bh_cachep;
+EXPORT_SYMBOL(bh_cachep);
 
 void __init vfs_caches_init(unsigned long mempages)
 {



2001-02-26 00:29:45

by Alexander Viro

Subject: Re: 2.4.2-ac3: loop threads in D state



On Mon, 26 Feb 2001, Jens Axboe wrote:

> On Sun, Feb 25 2001, Alexander Viro wrote:
> > Jens, you have a race in lo_clr_fd() (loop-6). I've put the fixed
> > variant on ftp.math.psu.edu/pub/viro/loop-S2.gz. Diff and you'll
> > see - it's in the very beginning of the lo_clr_fd().
>
> Oops yeah you are right. Here's a diff of my current loop stuff
> against -ac4, Alan could you apply? Andrea suggested removing
> the loop private slab cache for buffer heads and just using the
> bh_cachep pool, and it seems like a good idea to me.

Erm... Jens, it really should be
	if (atomic_dec_and_test(...))
		up(...);
not just
	atomic_dec(...);
	up(...);

Otherwise you can end up with too early exit of loop_thread. Normally
it would not matter, but in pathological cases...

2001-02-26 00:34:05

by Jens Axboe

Subject: Re: 2.4.2-ac3: loop threads in D state

On Sun, Feb 25 2001, Alexander Viro wrote:
> Erm... Jens, it really should be
> 	if (atomic_dec_and_test(...))
> 		up(...);
> not just
> 	atomic_dec(...);
> 	up(...);
>
> Otherwise you can end up with too early exit of loop_thread. Normally
> it would not matter, but in pathological cases...

How so? We dec it and up the semaphore, loop_thread runs until it's
done and ups lo_sem.

--
Jens Axboe

2001-02-26 00:36:35

by Alexander Viro

Subject: Re: 2.4.2-ac3: loop threads in D state



On Mon, 26 Feb 2001, Jens Axboe wrote:

> On Sun, Feb 25 2001, Alexander Viro wrote:
> > Erm... Jens, it really should be
> > 	if (atomic_dec_and_test(...))
> > 		up(...);
> > not just
> > 	atomic_dec(...);
> > 	up(...);
> >
> > Otherwise you can end up with too early exit of loop_thread. Normally
> > it would not matter, but in pathological cases...
>
> How so? We dec it and up the semaphore, loop_thread runs until it's
> done and ups lo_sem.

You are risking an extra up() here. Think what happens if you already had a
pending request.


2001-02-26 00:40:05

by Jens Axboe

Subject: Re: 2.4.2-ac3: loop threads in D state

diff -ur --exclude-from /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.2-ac4/drivers/block/loop.c linux/drivers/block/loop.c
--- /opt/kernel/linux-2.4.2-ac4/drivers/block/loop.c	Mon Feb 26 01:19:38 2001
+++ linux/drivers/block/loop.c	Mon Feb 26 01:38:36 2001
@@ -79,7 +79,6 @@
 static int *loop_sizes;
 static int *loop_blksizes;
 static devfs_handle_t devfs_handle;	/* For the directory */
-static kmem_cache_t *loop_bhp;
 
 /*
  * Transfer functions
@@ -289,7 +288,7 @@
 	if (bh) {
 		kunmap(bh->b_page);
 		__free_page(bh->b_page);
-		kmem_cache_free(loop_bhp, bh);
+		kmem_cache_free(bh_cachep, bh);
 	}
 }
 
@@ -358,7 +357,7 @@
 	struct buffer_head *bh;
 
 	do {
-		bh = kmem_cache_alloc(loop_bhp, SLAB_BUFFER);
+		bh = kmem_cache_alloc(bh_cachep, SLAB_BUFFER);
 		if (bh)
 			break;
 
@@ -508,7 +507,7 @@
 	sprintf(current->comm, "loop%d", lo->lo_number);
 
 	spin_lock_irq(&current->sigmask_lock);
-	siginitsetinv(&current->blocked, sigmask(SIGKILL));
+	sigfillset(&current->blocked);
 	flush_signals(current);
 	spin_unlock_irq(&current->sigmask_lock);
 
@@ -526,7 +525,7 @@
 	up(&lo->lo_sem);
 
 	for (;;) {
-		down(&lo->lo_bh_mutex);
+		down_interruptible(&lo->lo_bh_mutex);
 		if (!atomic_read(&lo->lo_pending))
 			break;
 
@@ -671,9 +670,12 @@
 	if (lo->lo_refcnt > 1)	/* we needed one fd for the ioctl */
 		return -EBUSY;
 
+	spin_lock_irq(&lo->lo_lock);
 	lo->lo_state = Lo_rundown;
-	atomic_dec(&lo->lo_pending);
-	up(&lo->lo_bh_mutex);
+	if (atomic_dec_and_test(&lo->lo_pending))
+		up(&lo->lo_bh_mutex);
+	spin_unlock_irq(&lo->lo_lock);
+
 	down(&lo->lo_sem);
 
 	lo->lo_backing_file = NULL;
@@ -927,13 +929,6 @@
 		return -EIO;
 	}
 
-	loop_bhp = kmem_cache_create("loop_buffers", sizeof(struct buffer_head),
-				     0, SLAB_HWCACHE_ALIGN, NULL, NULL);
-	if (!loop_bhp) {
-		printk(KERN_WARNING "loop: unable to create slab cache\n");
-		return -ENOMEM;
-	}
-
 	devfs_handle = devfs_mk_dir(NULL, "loop", NULL);
 	devfs_register_series(devfs_handle, "%u", max_loop, DEVFS_FL_DEFAULT,
 			      MAJOR_NR, 0,
@@ -942,7 +937,7 @@
 
 	loop_dev = kmalloc(max_loop * sizeof(struct loop_device), GFP_KERNEL);
 	if (!loop_dev)
-		goto out_dev;
+		return -ENOMEM;
 
 	loop_sizes = kmalloc(max_loop * sizeof(int), GFP_KERNEL);
 	if (!loop_sizes)
@@ -974,8 +969,6 @@
 	printk(KERN_INFO "loop: loaded (max %d devices)\n", max_loop);
 	return 0;
 
-out_dev:
-	kmem_cache_destroy(loop_bhp);
 out_sizes:
 	kfree(loop_dev);
 out_blksizes:
@@ -990,7 +983,6 @@
 	if (devfs_unregister_blkdev(MAJOR_NR, "loop"))
 		printk(KERN_WARNING "loop: cannot unregister blkdev\n");
 
-	kmem_cache_destroy(loop_bhp);
 	kfree(loop_dev);
 	kfree(loop_sizes);
 	kfree(loop_blksizes);
diff -ur --exclude-from /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.2-ac4/fs/Makefile linux/fs/Makefile
--- /opt/kernel/linux-2.4.2-ac4/fs/Makefile	Mon Feb 26 01:19:40 2001
+++ linux/fs/Makefile	Mon Feb 26 01:14:44 2001
@@ -7,7 +7,7 @@
 
 O_TARGET := fs.o
 
-export-objs := filesystems.o open.o
+export-objs := filesystems.o open.o dcache.o
 mod-subdirs := nls
 
 obj-y := open.o read_write.o devices.o file_table.o buffer.o \
diff -ur --exclude-from /home/axboe/cdrom/exclude /opt/kernel/linux-2.4.2-ac4/fs/dcache.c linux/fs/dcache.c
--- /opt/kernel/linux-2.4.2-ac4/fs/dcache.c	Sat Feb 17 01:06:17 2001
+++ linux/fs/dcache.c	Mon Feb 26 01:14:54 2001
@@ -22,6 +22,7 @@
 #include <linux/init.h>
 #include <linux/smp_lock.h>
 #include <linux/cache.h>
+#include <linux/module.h>
 
 #include <asm/uaccess.h>
 
@@ -1250,6 +1251,7 @@
 
 /* SLAB cache for buffer_head structures */
 kmem_cache_t *bh_cachep;
+EXPORT_SYMBOL(bh_cachep);
 
 void __init vfs_caches_init(unsigned long mempages)
 {



2001-02-26 00:41:05

by Alexander Viro

Subject: Re: 2.4.2-ac3: loop threads in D state



On Sun, 25 Feb 2001, Alexander Viro wrote:

>
>
> On Mon, 26 Feb 2001, Jens Axboe wrote:
>
> > On Sun, Feb 25 2001, Alexander Viro wrote:
> > > Erm... Jens, it really should be
> > > 	if (atomic_dec_and_test(...))
> > > 		up(...);
> > > not just
> > > 	atomic_dec(...);
> > > 	up(...);
> > >
> > > Otherwise you can end up with too early exit of loop_thread. Normally
> > > it would not matter, but in pathological cases...
> >
> > How so? We dec it and up the semaphore, loop_thread runs until it's
> > done and ups lo_sem.
>
> You are risking an extra up() here. Think what happens if you already had a
> pending request.

Let me elaborate: the race is very narrow and takes deliberate effort to
hit. It _can_ be triggered, unfortunately. This extra up() will mess up
your life later on.

2001-02-26 00:49:05

by Jens Axboe

Subject: Re: 2.4.2-ac3: loop threads in D state

On Sun, Feb 25 2001, Alexander Viro wrote:
> Let me elaborate: the race is very narrow and takes deliberate effort to
> hit. It _can_ be triggered, unfortunately. This extra up() will mess up
> your life later on.

What's the worst that can happen? We do an extra up, but loop_thread
will still quit once we hit zero lo_pending. And loop_clr_fd
is still protected by lo_ctl_mutex.

--
Jens Axboe

2001-02-26 00:53:35

by Alexander Viro

Subject: Re: 2.4.2-ac3: loop threads in D state



On Mon, 26 Feb 2001, Jens Axboe wrote:

> On Sun, Feb 25 2001, Alexander Viro wrote:
> > Let me elaborate: the race is very narrow and takes deliberate effort to
> > hit. It _can_ be triggered, unfortunately. This extra up() will mess up
> > your life later on.
>
> What's the worst that can happen? We do an extra up, but loop_thread
> will still quit once we hit zero lo_pending. And loop_clr_fd
> is still protected by lo_ctl_mutex.

Well, for one thing you'll get some surprises next time you losetup
the same device ;-) There are more subtle scenarios, but that one
is pretty unpleasant in itself.
Cheers,
Al

2001-02-26 00:55:35

by Jens Axboe

Subject: Re: 2.4.2-ac3: loop threads in D state

On Sun, Feb 25 2001, Alexander Viro wrote:
> > What's the worst that can happen? We do an extra up, but loop_thread
> > will still quit once we hit zero lo_pending. And loop_clr_fd
> > is still protected by lo_ctl_mutex.
>
> Well, for one thing you'll get some surprises next time you losetup
> the same device ;-) There are more subtle scenarios, but that one
> is pretty unpleasant in itself.

Ah ok, but that could be solved by just reiniting the sems on
each losetup (which probably would be a good idea anyway). But
ok, I'll shut up now :-)

--
Jens Axboe

2001-02-26 01:42:49

by Alexander Viro

Subject: Re: 2.4.2-ac3: loop threads in D state



On Mon, 26 Feb 2001, Jens Axboe wrote:

> Ah ok, I see what you mean. Updated patch attached.

Corresponding patch against 2.4.2 is on ftp.math.psu.edu/pub/viro/loop-S2.gz

Cheers,
Al

2001-02-27 19:11:55

by Nate Eldredge

Subject: Re: 2.4.2-ac3: loop threads in D state

Jens Axboe writes:
> On Sun, Feb 25 2001, Nate Eldredge wrote:
> > Nate Eldredge writes:
> > > Kernel 2.4.2-ac3.
> > >
> > > FLAGS UID PID PPID PRI  NI SIZE RSS WCHAN STA TTY TIME COMMAND
> > >    40   0 425    1  -1 -20    0   0 down  DW<   ? 0:00 (loop0)
> >
> > It looks like this has been addressed in the thread "242-ac3 loop
> > bug". Jens Axboe posted a patch, but the list archive I'm reading
> > mangled it. Jens, could you make this patch available somewhere, or
> > at least email me a copy? (If it's going in an upcoming -ac patch,
> > then don't bother; I can wait until then.)
>
> Patch is here, I haven't checked whether Alan put it in ac4 yet (I
> did cc him, but no one knows for sure :-).

It's in 2.4.2-ac5, which does indeed fix the problem. Thanks to all.

--

Nate Eldredge
[email protected]