2012-02-28 14:32:54

by Stefan Richter

[permalink] [raw]
Subject: [PATCH] [SCSI] sr: fix multi-drive performance, remove BKL replacement

Commit 2a48fc0ab242 "block: autoconvert trivial BKL users to private
mutex" and other commits at the time mechanically swapped BKL for
per-driver global mutexes. If the sr driver is any indication, these
replacements have still not been checked by anybody for their
necessessity, removed where possible, or the sections they serialize
reduced to a necessary minimum.

The sr_mutex in particular very noticably degraded performance of
CD-DA ripping with multiple drives in parallel. When several
instances of "grip" are used with two or more drives, their GUIs
became laggier, as did the KDE file manager GUI, and drive utilization
was reduced. (During ripping, drive lights flicker instead of staying
on most of the time.) IOW time to rip a stack of CDs was increased.
I didn't measure this but it is highly noticeable.

On the other hand, I don't see what state sr_mutex would protect.
So I removed it entirely and that works fine for me.

Tested with up to six CD-ROM drives connected at the same time (1x
IDE, 5x FireWire), 12 grip instances running (2 per drive, one for
ripping and the other just polling the TOC as a test), and of course
udisks-daemon sitting in the background and polling the CD-ROM drives
as usual. GUI lags are reduced and ripping throughput increased to
a level how I remember it from BKL era.

Also tested: Erasing and writing a CD-RW with K3B/cdrecord while grip
and udisks-daemon poll the CD-RW drive and other grip instances rip
CD-DA from three other CD-ROM drives.

Signed-off-by: Stefan Richter <[email protected]>
---
drivers/scsi/sr.c | 23 ++++++-----------------
1 file changed, 6 insertions(+), 17 deletions(-)

--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -75,7 +75,6 @@ MODULE_ALIAS_SCSI_DEVICE(TYPE_WORM);
CDC_CD_R|CDC_CD_RW|CDC_DVD|CDC_DVD_R|CDC_DVD_RAM|CDC_GENERIC_PACKET| \
CDC_MRW|CDC_MRW_W|CDC_RAM)

-static DEFINE_MUTEX(sr_mutex);
static int sr_probe(struct device *);
static int sr_remove(struct device *);
static int sr_done(struct scsi_cmnd *);
@@ -508,27 +507,23 @@ static int sr_prep_fn(struct request_que

static int sr_block_open(struct block_device *bdev, fmode_t mode)
{
- struct scsi_cd *cd;
+ struct scsi_cd *cd = scsi_cd_get(bdev->bd_disk);
int ret = -ENXIO;

- mutex_lock(&sr_mutex);
- cd = scsi_cd_get(bdev->bd_disk);
if (cd) {
ret = cdrom_open(&cd->cdi, bdev, mode);
if (ret)
scsi_cd_put(cd);
}
- mutex_unlock(&sr_mutex);
return ret;
}

static int sr_block_release(struct gendisk *disk, fmode_t mode)
{
struct scsi_cd *cd = scsi_cd(disk);
- mutex_lock(&sr_mutex);
+
cdrom_release(&cd->cdi, mode);
scsi_cd_put(cd);
- mutex_unlock(&sr_mutex);
return 0;
}

@@ -540,8 +535,6 @@ static int sr_block_ioctl(struct block_d
void __user *argp = (void __user *)arg;
int ret;

- mutex_lock(&sr_mutex);
-
/*
* Send SCSI addressing ioctls directly to mid level, send other
* ioctls to cdrom/block level.
@@ -549,13 +542,12 @@ static int sr_block_ioctl(struct block_d
switch (cmd) {
case SCSI_IOCTL_GET_IDLUN:
case SCSI_IOCTL_GET_BUS_NUMBER:
- ret = scsi_ioctl(sdev, cmd, argp);
- goto out;
+ return scsi_ioctl(sdev, cmd, argp);
}

ret = cdrom_ioctl(&cd->cdi, bdev, mode, cmd, arg);
if (ret != -ENOSYS)
- goto out;
+ return ret;

/*
* ENODEV means that we didn't recognise the ioctl, or that we
@@ -566,12 +558,9 @@ static int sr_block_ioctl(struct block_d
ret = scsi_nonblockable_ioctl(sdev, cmd, argp,
(mode & FMODE_NDELAY) != 0);
if (ret != -ENODEV)
- goto out;
- ret = scsi_ioctl(sdev, cmd, argp);
+ return ret;

-out:
- mutex_unlock(&sr_mutex);
- return ret;
+ return scsi_ioctl(sdev, cmd, argp);
}

static unsigned int sr_block_check_events(struct gendisk *disk,


--
Stefan Richter
-=====-===-- --=- ===--
http://arcgraph.de/sr/


2012-02-28 15:00:06

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] sr: fix multi-drive performance, remove BKL replacement

On Tue, 2012-02-28 at 15:32 +0100, Stefan Richter wrote:
> Commit 2a48fc0ab242 "block: autoconvert trivial BKL users to private
> mutex" and other commits at the time mechanically swapped BKL for
> per-driver global mutexes. If the sr driver is any indication, these
> replacements have still not been checked by anybody for their
> necessessity, removed where possible, or the sections they serialize
> reduced to a necessary minimum.
>
> The sr_mutex in particular very noticably degraded performance of
> CD-DA ripping with multiple drives in parallel. When several
> instances of "grip" are used with two or more drives, their GUIs
> became laggier, as did the KDE file manager GUI, and drive utilization
> was reduced. (During ripping, drive lights flicker instead of staying
> on most of the time.) IOW time to rip a stack of CDs was increased.
> I didn't measure this but it is highly noticeable.
>
> On the other hand, I don't see what state sr_mutex would protect.
> So I removed it entirely and that works fine for me.
>
I'm afraid you can't do that: The problem is that we have an entangled
set of reference counts that need to be taken and released atomically.
If we don't surround them with a mutex you get undefined results from
racing last release with new acquire. You can see this usage in sd.c.

The sr.c use case looks like bd_mutex would mediate ... but that's
because it doesn't use driver shutdown and has no power management
functions ... I think I have vague memories that someone is working on
pm for cdroms?

I don't think the mutex needs to be on the ioctls, though, which is
what's causing your performance problems, right?

James

2012-02-28 16:09:40

by Stefan Richter

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] sr: fix multi-drive performance, remove BKL replacement

On Feb 28 James Bottomley wrote:
> On Tue, 2012-02-28 at 15:32 +0100, Stefan Richter wrote:
> > Commit 2a48fc0ab242 "block: autoconvert trivial BKL users to private
> > mutex" and other commits at the time mechanically swapped BKL for
> > per-driver global mutexes. If the sr driver is any indication, these
> > replacements have still not been checked by anybody for their
> > necessessity, removed where possible, or the sections they serialize
> > reduced to a necessary minimum.
> >
> > The sr_mutex in particular very noticably degraded performance of
> > CD-DA ripping with multiple drives in parallel. When several
> > instances of "grip" are used with two or more drives, their GUIs
> > became laggier, as did the KDE file manager GUI, and drive utilization
> > was reduced. (During ripping, drive lights flicker instead of staying
> > on most of the time.) IOW time to rip a stack of CDs was increased.
> > I didn't measure this but it is highly noticeable.
> >
> > On the other hand, I don't see what state sr_mutex would protect.
> > So I removed it entirely and that works fine for me.
> >
> I'm afraid you can't do that: The problem is that we have an entangled
> set of reference counts that need to be taken and released atomically.
> If we don't surround them with a mutex you get undefined results from
> racing last release with new acquire. You can see this usage in sd.c.

While I do remove sr_mutex aroud scsi_cd_get/put() calls, these ones
internally use another lock: sr_ref_mutex. Always did, still do, since
neither Arnd's mechanical BKL pushdown and BKL-to-mutex conversions
patches nor my patch changed that. This sr_ref_mutex also protects sr's
reference counting outside of the three block_device_operations methods
which I changed.

I suppose I could have mentioned right away in the changelog that the
sr driver's own reference counting serialization remains in place, via that
other mutex.

> The sr.c use case looks like bd_mutex would mediate ... but that's
> because it doesn't use driver shutdown and has no power management
> functions ... I think I have vague memories that someone is working on
> pm for cdroms?
>
> I don't think the mutex needs to be on the ioctls, though, which is
> what's causing your performance problems, right?

I guess sr_block_open/release are less of an issue; after all they are
still partly serialized across all sr devices (the sections which are
under the mentioned sr_ref_mutex protection).
--
Stefan Richter
-=====-===-- --=- ===--
http://arcgraph.de/sr/

2012-02-28 16:16:39

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] sr: fix multi-drive performance, remove BKL replacement

On Tue, 2012-02-28 at 17:09 +0100, Stefan Richter wrote:
> On Feb 28 James Bottomley wrote:
> > On Tue, 2012-02-28 at 15:32 +0100, Stefan Richter wrote:
> > > Commit 2a48fc0ab242 "block: autoconvert trivial BKL users to private
> > > mutex" and other commits at the time mechanically swapped BKL for
> > > per-driver global mutexes. If the sr driver is any indication, these
> > > replacements have still not been checked by anybody for their
> > > necessessity, removed where possible, or the sections they serialize
> > > reduced to a necessary minimum.
> > >
> > > The sr_mutex in particular very noticably degraded performance of
> > > CD-DA ripping with multiple drives in parallel. When several
> > > instances of "grip" are used with two or more drives, their GUIs
> > > became laggier, as did the KDE file manager GUI, and drive utilization
> > > was reduced. (During ripping, drive lights flicker instead of staying
> > > on most of the time.) IOW time to rip a stack of CDs was increased.
> > > I didn't measure this but it is highly noticeable.
> > >
> > > On the other hand, I don't see what state sr_mutex would protect.
> > > So I removed it entirely and that works fine for me.
> > >
> > I'm afraid you can't do that: The problem is that we have an entangled
> > set of reference counts that need to be taken and released atomically.
> > If we don't surround them with a mutex you get undefined results from
> > racing last release with new acquire. You can see this usage in sd.c.
>
> While I do remove sr_mutex aroud scsi_cd_get/put() calls, these ones
> internally use another lock: sr_ref_mutex. Always did, still do, since
> neither Arnd's mechanical BKL pushdown and BKL-to-mutex conversions
> patches nor my patch changed that. This sr_ref_mutex also protects sr's
> reference counting outside of the three block_device_operations methods
> which I changed.
>
> I suppose I could have mentioned right away in the changelog that the
> sr driver's own reference counting serialization remains in place, via that
> other mutex.

OK, agreed ... the thing that caught my eye was the get/open and the
release/put, but I think that's completely safe.

> > The sr.c use case looks like bd_mutex would mediate ... but that's
> > because it doesn't use driver shutdown and has no power management
> > functions ... I think I have vague memories that someone is working on
> > pm for cdroms?
> >
> > I don't think the mutex needs to be on the ioctls, though, which is
> > what's causing your performance problems, right?
>
> I guess sr_block_open/release are less of an issue; after all they are
> still partly serialized across all sr devices (the sections which are
> under the mentioned sr_ref_mutex protection).

They're also per bdev serialised by bd_mutex, so yes.

James

2012-02-28 16:22:38

by Stefan Richter

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] sr: fix multi-drive performance, remove BKL replacement

On Feb 28 Stefan Richter wrote:
> On Feb 28 James Bottomley wrote:
> > On Tue, 2012-02-28 at 15:32 +0100, Stefan Richter wrote:
> > > Commit 2a48fc0ab242 "block: autoconvert trivial BKL users to private
> > > mutex" and other commits at the time mechanically swapped BKL for
> > > per-driver global mutexes. If the sr driver is any indication, these
> > > replacements have still not been checked by anybody for their
> > > necessessity, removed where possible, or the sections they serialize
> > > reduced to a necessary minimum.
> > >
> > > The sr_mutex in particular very noticably degraded performance of
> > > CD-DA ripping with multiple drives in parallel. When several
> > > instances of "grip" are used with two or more drives, their GUIs
> > > became laggier, as did the KDE file manager GUI, and drive utilization
> > > was reduced. (During ripping, drive lights flicker instead of staying
> > > on most of the time.) IOW time to rip a stack of CDs was increased.
> > > I didn't measure this but it is highly noticeable.
> > >
> > > On the other hand, I don't see what state sr_mutex would protect.
> > > So I removed it entirely and that works fine for me.
> > >
> > I'm afraid you can't do that: The problem is that we have an entangled
> > set of reference counts that need to be taken and released atomically.
> > If we don't surround them with a mutex you get undefined results from
> > racing last release with new acquire. You can see this usage in sd.c.
>
> While I do remove sr_mutex aroud scsi_cd_get/put() calls, these ones
> internally use another lock: sr_ref_mutex. Always did, still do, since
> neither Arnd's mechanical BKL pushdown and BKL-to-mutex conversions
> patches nor my patch changed that. This sr_ref_mutex also protects sr's
> reference counting outside of the three block_device_operations methods
> which I changed.
>
> I suppose I could have mentioned right away in the changelog that the
> sr driver's own reference counting serialization remains in place, via that
> other mutex.

PS:
Entirely independently of this, it is important to remember that sr has
been and still is broken WRT hot removal/ hot unbinding of devices.
http://marc.info/?t=132484687700001
I haven't tried the last proposed patches from Jun'ichi Nomura or from Bart
Van Assche yet. While I tested my BKL-replacement-removal patch, I made
sure to quit all processes that accessed /dev/sr*, including udisks-daemon,
every time before I unplugged a CD-ROM drive. Or actually /almost/ every
time; sometimes I missed to kill udisks-daemon but was lucky enough not to
hit that bug in these sessions.
--
Stefan Richter
-=====-===-- --=- ===--
http://arcgraph.de/sr/

2012-02-28 16:42:24

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] sr: fix multi-drive performance, remove BKL replacement

On Tuesday 28 February 2012, James Bottomley wrote:
> On Tue, 2012-02-28 at 17:09 +0100, Stefan Richter wrote:
> > On Feb 28 James Bottomley wrote:
> >
> > While I do remove sr_mutex aroud scsi_cd_get/put() calls, these ones
> > internally use another lock: sr_ref_mutex. Always did, still do, since
> > neither Arnd's mechanical BKL pushdown and BKL-to-mutex conversions
> > patches nor my patch changed that. This sr_ref_mutex also protects sr's
> > reference counting outside of the three block_device_operations methods
> > which I changed.
> >
> > I suppose I could have mentioned right away in the changelog that the
> > sr driver's own reference counting serialization remains in place, via that
> > other mutex.
>
> OK, agreed ... the thing that caught my eye was the get/open and the
> release/put, but I think that's completely safe.

I took another look and I believe the cdi->use_count in
cdrom_open/cdrom_release still requires some protection that is
currently provided by sr_mutex. Some parts of cdrom_ioctl also
access this variable and things like cdi->options or cdi->keeplocked.

I could imagine that you can get rid of the mutex if you turn those
into atomics and bitops, but there may be other parts of cdrom_device_info
that need locking. A safer option to solve the performance problems
could be to replace sr_mutex with a per-device mutex inside of
cdrom_device_info.

Arnd

2012-02-28 16:57:42

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] sr: fix multi-drive performance, remove BKL replacement

On Tue, 2012-02-28 at 16:42 +0000, Arnd Bergmann wrote:
> On Tuesday 28 February 2012, James Bottomley wrote:
> > On Tue, 2012-02-28 at 17:09 +0100, Stefan Richter wrote:
> > > On Feb 28 James Bottomley wrote:
> > >
> > > While I do remove sr_mutex aroud scsi_cd_get/put() calls, these ones
> > > internally use another lock: sr_ref_mutex. Always did, still do, since
> > > neither Arnd's mechanical BKL pushdown and BKL-to-mutex conversions
> > > patches nor my patch changed that. This sr_ref_mutex also protects sr's
> > > reference counting outside of the three block_device_operations methods
> > > which I changed.
> > >
> > > I suppose I could have mentioned right away in the changelog that the
> > > sr driver's own reference counting serialization remains in place, via that
> > > other mutex.
> >
> > OK, agreed ... the thing that caught my eye was the get/open and the
> > release/put, but I think that's completely safe.
>
> I took another look and I believe the cdi->use_count in
> cdrom_open/cdrom_release still requires some protection that is
> currently provided by sr_mutex.

So I think this is fine ... it's protected by the bdev->bd_mutex.

> Some parts of cdrom_ioctl also
> access this variable and things like cdi->options or cdi->keeplocked.

This would be problematic because we no longer lock the ioctl.

> I could imagine that you can get rid of the mutex if you turn those
> into atomics and bitops, but there may be other parts of cdrom_device_info
> that need locking. A safer option to solve the performance problems
> could be to replace sr_mutex with a per-device mutex inside of
> cdrom_device_info.

I'd say the latter.

James

2012-02-28 17:11:41

by Stefan Richter

[permalink] [raw]
Subject: Re: [PATCH] [SCSI] sr: fix multi-drive performance, remove BKL replacement

On Feb 28 James Bottomley wrote:
> On Tue, 2012-02-28 at 16:42 +0000, Arnd Bergmann wrote:
> > I took another look and I believe the cdi->use_count in
> > cdrom_open/cdrom_release still requires some protection that is
> > currently provided by sr_mutex.
>
> So I think this is fine ... it's protected by the bdev->bd_mutex.
>
> > Some parts of cdrom_ioctl also
> > access this variable and things like cdi->options or cdi->keeplocked.
>
> This would be problematic because we no longer lock the ioctl.
>
> > I could imagine that you can get rid of the mutex if you turn those
> > into atomics and bitops, but there may be other parts of cdrom_device_info
> > that need locking. A safer option to solve the performance problems
> > could be to replace sr_mutex with a per-device mutex inside of
> > cdrom_device_info.
>
> I'd say the latter.

Thanks Arnd and James, I will pursue this when I get the time.
--
Stefan Richter
-=====-===-- --=- ===--
http://arcgraph.de/sr/