2009-06-18 13:56:36

by Rainer Weikusat

[permalink] [raw]
Subject: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

From: Rainer Weikusat <[email protected]>

With 2.6.30, the error handling code in cdrom_newpc_intr was changed
to deal with partial request failures by normally completing the 'good'
parts of a request and only 'error' the last (and presumably,
incompletely transferred) bio associated with a particular
request. This doesn't work for requests which don't have bios
associated with them ('GPCMD_READ_DISC_INFO'), because the first call
to ide_end_rq, done via ide_complete_rq in order to do the
partial completion part, returns with a code of zero for all non-bio
requests, causing the drive->hwif->rq pointer to be set to NULL. Upon
calling ide_complete_rq a second time, it is attempted to de-reference
this null pointer, resulting in a kernel crash.

Signed-Off-By: Rainer Weikusat <[email protected]>

---

This is fixed in the linux-ide tree since at about 2009/06/10 [Bug
13399, also happens w/ TSSTcorpDVD-ROM SH-D162C], but a patch
against 2.6.30 AFAIK doesn't exist (and I didn't find the
corresponding thread before digging through all of this ...).

--- drivers/ide/ide-cd.c.orig 2009-06-18 15:10:24.000000000 +0200
+++ drivers/ide/ide-cd.c 2009-06-18 14:10:16.000000000 +0200
@@ -758,7 +758,7 @@ out_end:
rq->errors = -EIO;
}

- if (uptodate == 0)
+ if (uptodate == 0 && rq->bio)
ide_cd_error_cmd(drive, cmd);

/* make sure it's fully ended */


2009-06-18 14:39:48

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

Hi,

we we're just debugging this and I was about to send the patches
tomorrow but you beat me to it :).

On Thu, Jun 18, 2009 at 3:48 PM, Rainer Weikusat<[email protected]> wrote:
> From: Rainer Weikusat <[email protected]>
>
> With 2.6.30, the error handling code in cdrom_newpc_intr was changed
> to deal with partial request failures by normally completing the 'good'
> parts of a request and only 'error' the last (and presumably,
> incompletely transferred) bio associated with a particular
> request. This doesn't work for requests which don't have bios
> associated with them ('GPCMD_READ_DISC_INFO'), because the first call
> to ide_end_rq, done via ide_complete_rq in order to do the
> partial completion part, returns with a code of zero for all non-bio
> requests, causing the drive->hwif->rq pointer to be set to NULL.

This is a bit misleading, it should be more like: "ide_complete_rq is
called over ide_cd_error_cmd() to partially complete the rq but the rq
is without a bio and the block layer does partial completion only for
requests with bio's so this request is completed as a whole and the rq
freed."

please fix.

> Upon
> calling ide_complete_rq a second time, it is attempted to de-reference
> this null pointer, resulting in a kernel crash.
>
> Signed-Off-By: Rainer Weikusat <[email protected]>
>
> ---
>
> This is fixed in the linux-ide tree since at about 2009/06/10 [Bug
> 13399, also happens w/ TSSTcorpDVD-ROM SH-D162C],

really, because I can't find it in Bart's trees. Do you have a commit
id?

> but a patch
> against 2.6.30 AFAIK doesn't exist (and I didn't find the
> corresponding thread before digging through all of this ...).
> --- drivers/ide/ide-cd.c.orig ? 2009-06-18 15:10:24.000000000 +0200
> +++ drivers/ide/ide-cd.c ? ? ? ?2009-06-18 14:10:16.000000000 +0200
> @@ -758,7 +758,7 @@ out_end:
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?rq->errors = -EIO;
> ? ? ? ? ? ? ? ?}
>
> - ? ? ? ? ? ? ? if (uptodate == 0)
> + ? ? ? ? ? ? ? if (uptodate == 0 && rq->bio)
> ? ? ? ? ? ? ? ? ? ? ? ?ide_cd_error_cmd(drive, cmd);
>
> ? ? ? ? ? ? ? ?/* make sure it's fully ended */
>

--
Regards/Gruss,
Boris

2009-06-18 14:53:18

by Rainer Weikusat

[permalink] [raw]
Subject: Re: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

Borislav Petkov <[email protected]> writes:
> On Thu, Jun 18, 2009 at 3:48 PM, Rainer Weikusat<[email protected]> wrote:
>> From: Rainer Weikusat <[email protected]>
>>
>> With 2.6.30, the error handling code in cdrom_newpc_intr was changed
>> to deal with partial request failures by normally completing the 'good'
>> parts of a request and only 'error' the last (and presumably,
>> incompletely transferred) bio associated with a particular
>> request. This doesn't work for requests which don't have bios
>> associated with them ('GPCMD_READ_DISC_INFO'), because the first call
>> to ide_end_rq, done via ide_complete_rq in order to do the
>> partial completion part, returns with a code of zero for all non-bio
>> requests, causing the drive->hwif->rq pointer to be set to NULL.
>
> This is a bit misleading, it should be more like: "ide_complete_rq is
> called over ide_cd_error_cmd() to partially complete the rq but the rq
> is without a bio and the block layer does partial completion only for
> requests with bio's so this request is completed as a whole and the rq
> freed."

Technically, this is not quite correct (assuming I haven't overlooked
something), because ide_cd_queue_pc still has a reference to the rq.

> please fix.

I will send a modified 'patch e-mail' soon.

Something I would like to add: The DVD-ROM mentioned below has exactly
the same 32/30 issue w/ READ DISC INFO. This used to just be an
unnoticed failure in older kernels.

[...]

>> This is fixed in the linux-ide tree since at about 2009/06/10 [Bug
>> 13399, also happens w/ TSSTcorpDVD-ROM SH-D162C],
>
> really, because I can't find it in Bart's trees. Do you have a commit
> id?

No, I just assumed that, since I found the bio-check among beginnings
of code intended to deal with the 32/30 issue.

2009-06-18 15:04:19

by Rainer Weikusat

[permalink] [raw]
Subject: Re: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

From: Rainer Weikusat <[email protected]>

With 2.6.30, the error handling code in cdrom_newpc_intr was changed
to deal with partial request failures by normally completing the 'good'
parts of a request and only 'error' the last (and presumably,
incompletely transferred) bio associated with a particular
request. In order to do this, ide_complete_rq is called over
ide_cd_error_cmd() to partially complete the rq. The block layer
does partial completion only for requests with bio's and if the
rq doesn't have one (eg 'GPCMD_READ_DISC_INFO') the request is
completed as a whole and the drive->hwif->rq pointer set to NULL
afterwards. When calling ide_complete_rq again to report
the error, this null pointer is derefenced, resulting in a kernel
crash.

Signed-Off-By: Rainer Weikusat <[email protected]>

---

--- drivers/ide/ide-cd.c.orig 2009-06-18 15:10:24.000000000 +0200
+++ drivers/ide/ide-cd.c 2009-06-18 14:10:16.000000000 +0200
@@ -758,7 +758,7 @@ out_end:
rq->errors = -EIO;
}

- if (uptodate == 0)
+ if (uptodate == 0 && rq->bio)
ide_cd_error_cmd(drive, cmd);

/* make sure it's fully ended */

2009-06-18 15:43:40

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

Hi,

On Thu, Jun 18, 2009 at 4:52 PM, Rainer Weikusat<[email protected]> wrote:
> Borislav Petkov <[email protected]> writes:
>> On Thu, Jun 18, 2009 at 3:48 PM, Rainer Weikusat<[email protected]> wrote:
>>> From: Rainer Weikusat <[email protected]>
>>>
>>> With 2.6.30, the error handling code in cdrom_newpc_intr was changed
>>> to deal with partial request failures by normally completing the 'good'
>>> parts of a request and only 'error' the last (and presumably,
>>> incompletely transferred) bio associated with a particular
>>> request. This doesn't work for requests which don't have bios
>>> associated with them ('GPCMD_READ_DISC_INFO'), because the first call
>>> to ide_end_rq, done via ide_complete_rq in order to do the
>>> partial completion part, returns with a code of zero for all non-bio
>>> requests, causing the drive->hwif->rq pointer to be set to NULL.
>>
>> This is a bit misleading, it should be more like: "ide_complete_rq is
>> called over ide_cd_error_cmd() to partially complete the rq but the rq
>> is without a bio and the block layer does partial completion only for
>> requests with bio's so this request is completed as a whole and the rq
>> freed."
>
> Technically, this is not quite correct (assuming I haven't overlooked
> something), because ide_cd_queue_pc still has a reference to the rq.

That doesn't matter because the OOPS happens after the command has been
issued and _before_ ide_cd_queue_pc() gets to access the rq ptr again.

ide_cd_queue_pc() does blk_execute_rq() which does wait for completion
of the request so it doesn't access the rq pointer before the IRQ
handler has completed. Even if it would reach the blk_put_request part,
the OOPS would have already happened.

ide_complete_rq simply does

blk_end_request
|->blk_end_bidi_request
|->blk_finish_request

after checking the rq->bio pointer through blk_update_bidi_request() and
if the prior is NULL it does __blk_put_request in blk_finish_request and
this is where the thing vanishes.

The second time ide_complete_rq() comes to run at the end of the IRQ
handler the rq is already 0xdeadbeef :).

(this is all latest git but the old code (without the bidi stuff) did
the same thing wrt to rq->bio).


--
Regards/Gruss,
Boris

2009-06-18 16:06:42

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

Hi,

On Thu, Jun 18, 2009 at 5:04 PM, Rainer Weikusat<[email protected]> wrote:
> From: Rainer Weikusat <[email protected]>
>
> With 2.6.30, the error handling code in cdrom_newpc_intr was changed
> to deal with partial request failures by normally completing the 'good'
> parts of a request and only 'error' the last (and presumably,
> incompletely transferred) bio associated with a particular
> request. In order to do this, ide_complete_rq is called over
> ide_cd_error_cmd() to partially complete the rq. The block layer
> does partial completion only for requests with bio's and if the
> rq doesn't have one (eg 'GPCMD_READ_DISC_INFO') the request is
> completed as a whole and the drive->hwif->rq pointer set to NULL
> afterwards. When calling ide_complete_rq again to report
> the error, this null pointer is derefenced, resulting in a kernel
> crash.

Sorry, but still not good enough. Instead, I rediffed the change against
current linux-ide branch and rewrote the commit message properly keeping
your S-O-B.

@Bart: please apply.

--
From: Rainer Weikusat <[email protected]>
Date: Thu, 18 Jun 2009 17:48:24 +0200
Subject: [PATCH] ide-cd: don't do partial completions on bio-less rqs

The block layer completes bio-less requests totally instead of partially
and this breaks cdrom drives which fragment packet command data in
several DRQ turns (eg GPCMD_READ_DISC_INFO).

More specifically, ide_complete_rq() is called over ide_cd_error_cmd()
to partially complete the rq on error. This bio-less request is
completed as a whole and when calling ide_complete_rq again to complete
the request wholly, the rq is already vanished resulting in the
following OOPS:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [<ffffffff804deb50>] ide_complete_rq+0x19/0x4d
PGD 0
Thread overran stack, or stack corrupted
Oops: 0000 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.30-rc8 #22 Precision WorkStation 380
RIP: 0010:[<ffffffff804deb50>] [<ffffffff804deb50>] ide_complete_rq+0x19/0x4d
RSP: 0018:ffff880028022e18 EFLAGS: 00010096
RAX: 0000000000000002 RBX: ffff88011a84e000 RCX: 0000000000000200
RDX: 0000000000000200 RSI: 0000000000000000 RDI: ffff88011afc9000
RBP: ffff880028022e28 R08: 00000000fffffffb R09: 0000000000000000
R10: ffff8800280302e8 R11: ffff880028030310 R12: 0000000000000000
R13: 0000000000000000 R14: ffff88011afc9000 R15: ffff88011aff7140
FS: 0000000000000000(0000) GS:ffff88002801f000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000048 CR3: 0000000000201000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff809c0000, task ffffffff808f1360)
Stack:
ffff88011aecf840 ffff88011aff7140 ffff880028022eb8 ffffffff804ed0ef
ffffffff8025a811 00ff880028022e70 ffffffff00000050 ffff88011a84e000
ffff88011a84e108 000000008025ef72 0000000000000000 ffff88011a84e000
Call Trace:
<IRQ> <0> [<ffffffff804ed0ef>] cdrom_newpc_intr+0x20e/0xac5
[<ffffffff8025a811>] ? irq_exit+0x47/0x7d
[<ffffffff804ecee1>] ? cdrom_newpc_intr+0x0/0xac5
[<ffffffff804de920>] ide_intr+0x1d2/0x220
[<ffffffff80284ee1>] handle_IRQ_event+0x3a/0xba
[<ffffffff80286c38>] handle_edge_irq+0xce/0x133
[<ffffffff8022c0cd>] handle_irq+0x1d/0x29
[<ffffffff8022b8c7>] do_IRQ+0x5a/0xd5
[<ffffffff8022a013>] ret_from_intr+0x0/0xa
<EOI> <0> [<ffffffff80230eee>] ? mwait_idle+0x60/0x6e
[<ffffffff802282c2>] ? enter_idle+0x20/0x22
[<ffffffff802286ef>] ? cpu_idle+0x4a/0x8d
[<ffffffff806ebb05>] ? rest_init+0x65/0x70
[<ffffffff809cdcef>] ? start_kernel+0x2da/0x3bb
[<ffffffff809cd271>] ? x86_64_start_reservations+0x81/0xbc
[<ffffffff809cd37b>] ? x86_64_start_kernel+0xcf/0xf1
Code: c3 48 25 ff ff ff fe 48 89 47 50 e8 a7 86 00 00 eb d7 55 48 89
e5 53 48 83 ec 08 41 89 f0 89 d1 48 8b 5f 40 48 8b b3 28 03 00 00 <f6>
46 48 0e 74 05 45 85 c0 7e 1e 44 89 c2 e8 85 ff ff ff 85 c0
RIP [<ffffffff804deb50>] ide_complete_rq+0x19/0x4d
RSP <ffff880028022e18>
CR2: 0000000000000048
---[ end trace 6662ae44d700bf58 ]---

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=13399.

Tested-by: Hans de Bruin <[email protected]>
Signed-Off-By: Rainer Weikusat <[email protected]>
Signed-Off-By: Borislav Petkov <[email protected]>
---
drivers/ide/ide-cd.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index 0b7645b..4a19686 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -667,7 +667,7 @@ out_end:
rq->errors = -EIO;
}

- if (uptodate == 0)
+ if (uptodate == 0 && rq->bio)
ide_cd_error_cmd(drive, cmd);

/* make sure it's fully ended */
--
1.6.3.1



--
Regards/Gruss,
Boris

2009-06-18 16:18:58

by Rainer Weikusat

[permalink] [raw]
Subject: Re: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

Borislav Petkov <[email protected]> writes:
> On Thu, Jun 18, 2009 at 4:52 PM, Rainer Weikusat<[email protected]> wrote:
>> Borislav Petkov <[email protected]> writes:

[...]

>>> so this request is completed as a whole and the rq
>>> freed."
>>
>> Technically, this is not quite correct (assuming I haven't overlooked
>> something), because ide_cd_queue_pc still has a reference to the rq.
>
> That doesn't matter because the OOPS happens after the command has been
> issued and _before_ ide_cd_queue_pc() gets to access the rq ptr
> again.

Yes. Because the pointer I already mentioned has been reset. But the
request itself is still alive.

[...]

> ide_complete_rq simply does
>
> blk_end_request
> |->blk_end_bidi_request
> |->blk_finish_request
>
> after checking the rq->bio pointer through blk_update_bidi_request() and
> if the prior is NULL it does __blk_put_request in blk_finish_request and
> this is where the thing vanishes.
>
> The second time ide_complete_rq() comes to run at the end of the IRQ
> handler the rq is already 0xdeadbeef :).

Not quite. Below is the blk_execute_rq-function:

int blk_execute_rq(struct request_queue *q, struct gendisk *bd_disk,
struct request *rq, int at_head)
{
DECLARE_COMPLETION_ONSTACK(wait);
char sense[SCSI_SENSE_BUFFERSIZE];
int err = 0;

/*
* we need an extra reference to the request, so we can look at
* it after io completion
*/
rq->ref_count++;

if (!rq->sense) {
memset(sense, 0, sizeof(sense));
rq->sense = sense;
rq->sense_len = 0;
}

rq->end_io_data = &wait;
blk_execute_rq_nowait(q, bd_disk, rq, at_head, blk_end_sync_rq);
wait_for_completion(&wait);

if (rq->errors)
err = -EIO;

return err;
}

and the refcount is incremented at the start of that 'so we can look
at it after io-completion', which means 'after the code below has
executed':

static void blk_end_sync_rq(struct request *rq, int error)
{
struct completion *waiting = rq->end_io_data;

rq->end_io_data = NULL;
__blk_put_request(rq->q, rq);

/*
* complete last, if this is a stack request the process (and thus
* the rq pointer) could be invalid right after this complete()
*/
complete(waiting);
}

which puts the rq once, decrementing the refcount by
one. Both blk_execute_rq and ide_cd_queue_pc inspect the rq after it
has been completed and the latter puts it again. While inspecting a
'freed' data structure would probably be harmless, freeing it twice
would be quite of a problem :-). I have spent something like 18 hours
of my life to determine what was going on here, starting from 'I have
never seen any of this again' and came across a bit of it in the
process.

But I am actually busy doing 'something completely different' with 2.4
for my job ...

2009-06-18 17:07:50

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

On Thu, Jun 18, 2009 at 6:18 PM, Rainer Weikusat<[email protected]> wrote:
> Borislav Petkov <[email protected]> writes:
>> On Thu, Jun 18, 2009 at 4:52 PM, Rainer Weikusat<[email protected]> wrote:
>>> Borislav Petkov <[email protected]> writes:
>
> [...]
>
>>>> so this request is completed as a whole and the rq
>>>> freed."
>>>
>>> Technically, this is not quite correct (assuming I haven't overlooked
>>> something), because ide_cd_queue_pc still has a reference to the rq.
>>
>> That doesn't matter because the OOPS happens after the command has been
>> issued and _before_ ide_cd_queue_pc() gets to access the rq ptr
>> again.
>
> Yes. Because the pointer I already mentioned has been reset. But the
> request itself is still alive.
>
> [...]
>
>> ide_complete_rq simply does
>>
>> blk_end_request
>> |->blk_end_bidi_request
>> ? ?|->blk_finish_request
>>
>> after checking the rq->bio pointer through blk_update_bidi_request() and
>> if the prior is NULL it does __blk_put_request in blk_finish_request and
>> this is where the thing vanishes.
>>
>> The second time ide_complete_rq() comes to run at the end of the IRQ
>> handler the rq is already 0xdeadbeef :).
>
> Not quite. Below is the blk_execute_rq-function:
>
> int blk_execute_rq(struct request_queue *q, struct gendisk *bd_disk,
> ? ? ? ? ? ? ? ? ? struct request *rq, int at_head)
> {
> ? ? ? ?DECLARE_COMPLETION_ONSTACK(wait);
> ? ? ? ?char sense[SCSI_SENSE_BUFFERSIZE];
> ? ? ? ?int err = 0;
>
> ? ? ? ?/*
> ? ? ? ? * we need an extra reference to the request, so we can look at
> ? ? ? ? * it after io completion
> ? ? ? ? */
> ? ? ? ?rq->ref_count++;
>
> ? ? ? ?if (!rq->sense) {
> ? ? ? ? ? ? ? ?memset(sense, 0, sizeof(sense));
> ? ? ? ? ? ? ? ?rq->sense = sense;
> ? ? ? ? ? ? ? ?rq->sense_len = 0;
> ? ? ? ?}
>
> ? ? ? ?rq->end_io_data = &wait;
> ? ? ? ?blk_execute_rq_nowait(q, bd_disk, rq, at_head, blk_end_sync_rq);
> ? ? ? ?wait_for_completion(&wait);
>
> ? ? ? ?if (rq->errors)
> ? ? ? ? ? ? ? ?err = -EIO;
>
> ? ? ? ?return err;
> }
>
> and the refcount is incremented at the start of that 'so we can look
> at it after io-completion', which means 'after the code below has
> executed':
>
> static void blk_end_sync_rq(struct request *rq, int error)
> {
> ? ? ? ?struct completion *waiting = rq->end_io_data;
>
> ? ? ? ?rq->end_io_data = NULL;
> ? ? ? ?__blk_put_request(rq->q, rq);
>
> ? ? ? ?/*
> ? ? ? ? * complete last, if this is a stack request the process (and thus
> ? ? ? ? * the rq pointer) could be invalid right after this complete()
> ? ? ? ? */
> ? ? ? ?complete(waiting);
> }
>
> which puts the rq once, decrementing the refcount by
> one.

Please, look at the following trace:

ide-cd: ide_cd_queue_pc: cmd[0]: 0x51, write: 0x0, timeout: 1750,
cmd_flags: 0x8000
ide-cd: ide_cd_do_request: cmd: 0x51, block: 18446744073709551615
ide_cd_do_request: dev hda: type=a, flags=108a640
sector 18446744073709551615, nr/cnr 0/0
bio (null), biotail (null), buffer (null), data ffff88011b849ba0, len 32
ide-cd: cdrom_do_block_pc: rq->cmd[0]: 0x51, rq->cmd_type: 0xa
ide-cd: cdrom_newpc_intr: cmd: 0x51, write: 0x0
ide-cd: cdrom_newpc_intr: DRQ: stat: 0x58, thislen: 30
ide-cd: ide_cd_check_ireason: ireason: 0x2, rw: 0x0
ide-cd: cdrom_newpc_intr: data transfer, rq->cmd_type: 0xa, ireason: 0x2
ide-cd: cdrom_newpc_intr: cmd: 0x51, write: 0x0
ide-cd: cdrom_newpc_intr: DRQ: stat: 0x50, thislen: 2
ide-cd: ide_cd_request_sense_fixup: rq->cmd[0]: 0x51
#3

[ this is a printk I added to show from where we hit the out_end label.
There we call the first ide_complete_rq() over ide_cd_error_cmd() where
we put the request. __blk_put_request returns without freeing the rq if
the refcount is > 1, which, in this case, is.]

and now here we do the second direct ide_complete_rq and here the rq is freed:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000048

However, despite the refcounting, the rq->bio check kills the rq -
otherwise the block layer would've waited, see the beginning of
blk_update_request():

if (!req->bio)
return false;

but let me do more tracing to see exactly what happens and when it
happens, now that we're waist-deep into the block layer :).

--
Regards/Gruss,
Boris

2009-06-18 18:26:01

by Rainer Weikusat

[permalink] [raw]
Subject: Re: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

Borislav Petkov <[email protected]> writes:
> On Thu, Jun 18, 2009 at 6:18 PM, Rainer Weikusat<[email protected]> wrote:
>> Borislav Petkov <[email protected]> writes:
>>> On Thu, Jun 18, 2009 at 4:52 PM, Rainer Weikusat<[email protected]> wrote:
>>>> Borislav Petkov <[email protected]> writes:
>>
>> [...]
>>
>>>>> so this request is completed as a whole and the rq
>>>>> freed."
>>>>
>>>> Technically, this is not quite correct (assuming I haven't overlooked
>>>> something), because ide_cd_queue_pc still has a reference to the rq.
>>>
>>> That doesn't matter because the OOPS happens after the command has been
>>> issued and _before_ ide_cd_queue_pc() gets to access the rq ptr
>>> again.
>>
>> Yes. Because the pointer I already mentioned has been reset.

And this happens here (!!!, code from 2.6.30 vanilla):

int ide_complete_rq(ide_drive_t *drive, int error, unsigned int nr_bytes)
{
ide_hwif_t *hwif = drive->hwif;
struct request *rq = hwif->rq;
int rc;

/*
* if failfast is set on a request, override number of sectors
* and complete the whole request right now
*/
if (blk_noretry_request(rq) && error <= 0)
nr_bytes = rq->hard_nr_sectors << 9;

rc = ide_end_rq(drive, rq, error, nr_bytes);
if (rc == 0)
hwif->rq = NULL; /* !!! */

return rc;
}
[explanation below]

My first attempt at getting this to work again actually looked like
this (as addition to cdrom_newpc_intr)

if (uptodate == 0) {
ide_cd_error_cmd(drive, cmd);
rq = drive->hwif->rq;
}

if (rq) {
/* code up to the 2nd complete call */
}

if (sense && rc == 2)
ide_error(drive, "request sense failure", stat);

[...]

That was before I had any idea why complete was being called twice and
that what this is supposed to do won't be done for bio-less requests,
anyway, and it worked fine.

> and now here we do the second direct ide_complete_rq and here the rq is freed:
>
> BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000048

I have just restored the original file to generate another crash
dump. [the relevant part of] What I get is EIP == c0251311, edx == 0.
This corresponds with the following machine code:

c02512fc <ide_complete_rq>:
c02512fc: 55 push %ebp
c02512fd: 89 e5 mov %esp,%ebp
c02512ff: 56 push %esi
c0251300: 53 push %ebx
c0251301: 83 ec 04 sub $0x4,%esp
c0251304: 89 c3 mov %eax,%ebx
c0251306: 89 d0 mov %edx,%eax

/* ide_hwif_t *hwif = drive->hwif; */
c0251308: 8b 73 28 mov 0x28(%ebx),%esi

/* struct request *rq = hwif->rq; */
c025130b: 8b 96 c8 01 00 00 mov 0x1c8(%esi),%edx

/* if (blk_noretry_request(rq) .... */
c0251311: f6 42 24 0e testb $0xe,0x24(%edx) /* !!! */
c0251315: 74 04 je c025131b <ide_complete_rq+0x1f>

blk_notretry_request is

#define blk_noretry_request(rq) (blk_failfast_dev(rq) || \
blk_failfast_transport(rq) || \
blk_failfast_driver(rq))

and

#define blk_failfast_dev(rq) ((rq)->cmd_flags & REQ_FAILFAST_DEV)
#define blk_failfast_transport(rq) ((rq)->cmd_flags & REQ_FAILFAST_TRANSPORT)
#define blk_failfast_driver(rq) ((rq)->cmd_flags & REQ_FAILFAST_DRIVER)

and

#define REQ_FAILFAST_DEV (1 << __REQ_FAILFAST_DEV)
#define REQ_FAILFAST_TRANSPORT (1 << __REQ_FAILFAST_TRANSPORT)
#define REQ_FAILFAST_DRIVER (1 << __REQ_FAILFAST_DRIVER)

and

enum rq_flag_bits {
__REQ_RW, /* not set, read. set, write */
__REQ_FAILFAST_DEV, /* no driver retries of device errors */
__REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
__REQ_FAILFAST_DRIVER, /* no driver retries of driver errors */
__REQ_DISCARD, /* request to discard sectors */

[...]

This means the values of the relevant __REQ_-constants are 1, 2, and
3 and (1 << 1) | (1 << 2) | (1 << 3) == 2 + 4 + 8 == 14 (== 0xe),
hence testb $0xe, 0x24(%edx) (optimized by compiler). edx is 0.
0x24(%edx) is the field at offset 36 in a struct request, which is
cmd_flags (on my computer).

blk_end_io always returns zero for bio-less requests. More precisely,
it calls end_that_request_data, which is

static int end_that_request_data(struct request *rq, int error,
unsigned int nr_bytes, unsigned int bidi_bytes)
{
if (rq->bio) {
if (__end_that_request_first(rq, error, nr_bytes))
return 1;

/* Bidi request must be completed as a whole */
if (blk_bidi_rq(rq) &&
__end_that_request_first(rq->next_rq, error, bidi_bytes))
return 1;
}

return 0;
}

ie it returns 0 for a request w/o a bio, and looks itself like this:

static int blk_end_io(struct request *rq, int error, unsigned int nr_bytes,
unsigned int bidi_bytes,
int (drv_callback)(struct request *))
{
struct request_queue *q = rq->q;
unsigned long flags = 0UL;

if (end_that_request_data(rq, error, nr_bytes, bidi_bytes))
return 1;

/* Special feature for tricky drivers */
if (drv_callback && drv_callback(rq))
return 1;

add_disk_randomness(rq->rq_disk);

spin_lock_irqsave(q->queue_lock, flags);
end_that_request_last(rq, error);
spin_unlock_irqrestore(q->queue_lock, flags);

return 0;
}

drv_callback is NULL and the 'final return value' is ultimatively
returned from the ide_end_rq-call mentioned at the beginning of this
overly long e-mail.

An easy way to verify that would be a

BUG_ON(!rq)

inserted before the blkdev_noretry_request in ide_complete_rq (which I
also did -- I have been doing this for long enough to never trust my
own assumptions blindly ...).

He who doesn't understand assembly will have a more difficult life
because of that :-). While this is interesting, my boss [righfully]
hates it and I will now have to do at least an hour of additional
overtime.

2009-06-19 08:54:22

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

Hi,

On Thu, Jun 18, 2009 at 08:25:44PM +0200, Rainer Weikusat wrote:
> Borislav Petkov <[email protected]> writes:
> > On Thu, Jun 18, 2009 at 6:18 PM, Rainer Weikusat<[email protected]> wrote:
> >> Borislav Petkov <[email protected]> writes:
> >>> On Thu, Jun 18, 2009 at 4:52 PM, Rainer Weikusat<[email protected]> wrote:
> >>>> Borislav Petkov <[email protected]> writes:
> >>
> >> [...]
> >>
> >>>>> so this request is completed as a whole and the rq
> >>>>> freed."
> >>>>
> >>>> Technically, this is not quite correct (assuming I haven't overlooked
> >>>> something), because ide_cd_queue_pc still has a reference to the rq.
> >>>
> >>> That doesn't matter because the OOPS happens after the command has been
> >>> issued and _before_ ide_cd_queue_pc() gets to access the rq ptr
> >>> again.
> >>
> >> Yes. Because the pointer I already mentioned has been reset.
>
> And this happens here (!!!, code from 2.6.30 vanilla):
>
> int ide_complete_rq(ide_drive_t *drive, int error, unsigned int nr_bytes)
> {
> ide_hwif_t *hwif = drive->hwif;
> struct request *rq = hwif->rq;
> int rc;
>
> /*
> * if failfast is set on a request, override number of sectors
> * and complete the whole request right now
> */
> if (blk_noretry_request(rq) && error <= 0)
> nr_bytes = rq->hard_nr_sectors << 9;
>
> rc = ide_end_rq(drive, rq, error, nr_bytes);
> if (rc == 0)
> hwif->rq = NULL; /* !!! */
>
> return rc;
> }
> [explanation below]
>
> My first attempt at getting this to work again actually looked like
> this (as addition to cdrom_newpc_intr)
>
> if (uptodate == 0) {
> ide_cd_error_cmd(drive, cmd);
> rq = drive->hwif->rq;
> }
>
> if (rq) {
> /* code up to the 2nd complete call */
> }
>
> if (sense && rc == 2)
> ide_error(drive, "request sense failure", stat);
>
> [...]
>
> That was before I had any idea why complete was being called twice and
> that what this is supposed to do won't be done for bio-less requests,
> anyway, and it worked fine.
>
> > and now here we do the second direct ide_complete_rq and here the rq is freed:
> >
> > BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000048
>
> I have just restored the original file to generate another crash
> dump. [the relevant part of] What I get is EIP == c0251311, edx == 0.
> This corresponds with the following machine code:
>
> c02512fc <ide_complete_rq>:
> c02512fc: 55 push %ebp
> c02512fd: 89 e5 mov %esp,%ebp
> c02512ff: 56 push %esi
> c0251300: 53 push %ebx
> c0251301: 83 ec 04 sub $0x4,%esp
> c0251304: 89 c3 mov %eax,%ebx
> c0251306: 89 d0 mov %edx,%eax
>
> /* ide_hwif_t *hwif = drive->hwif; */
> c0251308: 8b 73 28 mov 0x28(%ebx),%esi
>
> /* struct request *rq = hwif->rq; */
> c025130b: 8b 96 c8 01 00 00 mov 0x1c8(%esi),%edx
>
> /* if (blk_noretry_request(rq) .... */
> c0251311: f6 42 24 0e testb $0xe,0x24(%edx) /* !!! */
> c0251315: 74 04 je c025131b <ide_complete_rq+0x1f>
>
> blk_notretry_request is
>
> #define blk_noretry_request(rq) (blk_failfast_dev(rq) || \
> blk_failfast_transport(rq) || \
> blk_failfast_driver(rq))
>
> and
>
> #define blk_failfast_dev(rq) ((rq)->cmd_flags & REQ_FAILFAST_DEV)
> #define blk_failfast_transport(rq) ((rq)->cmd_flags & REQ_FAILFAST_TRANSPORT)
> #define blk_failfast_driver(rq) ((rq)->cmd_flags & REQ_FAILFAST_DRIVER)
>
> and
>
> #define REQ_FAILFAST_DEV (1 << __REQ_FAILFAST_DEV)
> #define REQ_FAILFAST_TRANSPORT (1 << __REQ_FAILFAST_TRANSPORT)
> #define REQ_FAILFAST_DRIVER (1 << __REQ_FAILFAST_DRIVER)
>
> and
>
> enum rq_flag_bits {
> __REQ_RW, /* not set, read. set, write */
> __REQ_FAILFAST_DEV, /* no driver retries of device errors */
> __REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
> __REQ_FAILFAST_DRIVER, /* no driver retries of driver errors */
> __REQ_DISCARD, /* request to discard sectors */
>
> [...]
>
> This means the values of the relevant __REQ_-constants are 1, 2, and
> 3 and (1 << 1) | (1 << 2) | (1 << 3) == 2 + 4 + 8 == 14 (== 0xe),
> hence testb $0xe, 0x24(%edx) (optimized by compiler). edx is 0.
> 0x24(%edx) is the field at offset 36 in a struct request, which is
> cmd_flags (on my computer).
>
> blk_end_io always returns zero for bio-less requests. More precisely,
> it calls end_that_request_data, which is
>
> static int end_that_request_data(struct request *rq, int error,
> unsigned int nr_bytes, unsigned int bidi_bytes)
> {
> if (rq->bio) {
> if (__end_that_request_first(rq, error, nr_bytes))
> return 1;
>
> /* Bidi request must be completed as a whole */
> if (blk_bidi_rq(rq) &&
> __end_that_request_first(rq->next_rq, error, bidi_bytes))
> return 1;
> }
>
> return 0;
> }
>
> ie it returns 0 for a request w/o a bio, and looks itself like this:
>
> static int blk_end_io(struct request *rq, int error, unsigned int nr_bytes,
> unsigned int bidi_bytes,
> int (drv_callback)(struct request *))
> {
> struct request_queue *q = rq->q;
> unsigned long flags = 0UL;
>
> if (end_that_request_data(rq, error, nr_bytes, bidi_bytes))
> return 1;
>
> /* Special feature for tricky drivers */
> if (drv_callback && drv_callback(rq))
> return 1;
>
> add_disk_randomness(rq->rq_disk);
>
> spin_lock_irqsave(q->queue_lock, flags);
> end_that_request_last(rq, error);
> spin_unlock_irqrestore(q->queue_lock, flags);
>
> return 0;
> }
>
> drv_callback is NULL and the 'final return value' is ultimatively
> returned from the ide_end_rq-call mentioned at the beginning of this
> overly long e-mail.
>
> An easy way to verify that would be a
>
> BUG_ON(!rq)
>
> inserted before the blkdev_noretry_request in ide_complete_rq (which I
> also did -- I have been doing this for long enough to never trust my
> own assumptions blindly ...).
>
> He who doesn't understand assembly will have a more difficult life
> because of that :-).

Well, this is a very good code analysis, I'd say, one for the books :)

> While this is interesting, my boss [righfully] hates it and I will now
> have to do at least an hour of additional overtime.

Sorry about that, same here :)

Here's another trace I did:

[ 266.037646] ide-cd: ide_cd_queue_pc: cmd[0]: 0x51, write: 0x0, timeout: 7000, cmd_flags: 0x8000
[ 266.037662] ide-cd: ide_cd_do_request: cmd: 0x51, block: 4294967295, marker: 114
[ 266.037667] ide_cd_do_request: dev hdc: type=a, flags=128a440
[ 266.037670] sector 4294967295, nr/cnr 0/0
[ 266.037675] bio dde1e840, biotail dde1e840, buffer (null), len 2
[ 266.037678] ide-cd: cdrom_do_block_pc: rq->cmd[0]: 0x51, rq->cmd_type: 0xa
[ 266.066559] ide-cd: cdrom_newpc_intr: cmd: 0x51, write: 0x0
[ 266.066567] ide-cd: cdrom_newpc_intr: DRQ: stat: 0x58, thislen: 2
[ 266.066570] ide-cd: ide_cd_check_ireason: ireason: 0x2, rw: 0x0
[ 266.066572] ide-cd: cdrom_newpc_intr: data transfer, rq->cmd_type: 0xa, ireason: 0x2
[ 266.149000] ide-cd: cdrom_newpc_intr: cmd: 0x51, write: 0x0
[ 266.149010] ide-cd: cdrom_newpc_intr: DRQ: stat: 0x50, thislen: 0
[ 266.149014] ide-cd: ide_cd_request_sense_fixup: rq->cmd[0]: 0x51
[ 266.149017] ide_complete_rq: completing rq, marker: 114

this is the end of the IRQ handler

[ 266.149023] __blk_put_request: marker: 114, ref_count: 2
[ 266.149033] ide_cd_queue_pc: putting rq, marker: 114

and here ide_cd_queue_pc comes _after_ that and frees the rq due to refcount ==
1.

[ 266.149039] __blk_put_request: marker: 114, ref_count: 1
[ 266.149045] blk_free_request: marker: 114, ref_count: 0

Now in the fragmented packet command case - that what you call 32/30 -
you'll have the first ide_complete_rq() call from ide_cd_error_cmd() and
the first decrement of the refcount. Had the rq had a bio, it'd never be
come to do a __blk_put_request and decrement the refcount.

But even if it did decrement it, as it does in the real case, that
wouldn't be a problem since the rq is still alive (refcount is now 1)
and it will be only freed in the second ide_complete_rq() at the end of
the IRQ handler.

But, it seems that ide_cd_queue_pc() goes after that first
ide_complete_rq() and decrements the refcount, as you suggest, right?

What bothers me here is that we're in IRQ context and running with
interrupts disabled so _actually_ the blk_put_request() part of
ide_cd_queue_pc() should be getting to run only _after_ the IRQ
handler is done and we should be getting the NULL ptr deref in
ide_cd_queue_pc(), but we don't. I guess I'm missing something here.

--
Regards/Gruss,
Boris.

Subject: Re: [PATCH] ide-cd: prevent null pointer deref via cdrom_newpc_intr

On Thursday 18 June 2009 18:06:34 Borislav Petkov wrote:
> Hi,
>
> On Thu, Jun 18, 2009 at 5:04 PM, Rainer Weikusat<[email protected]> wrote:
> > From: Rainer Weikusat <[email protected]>
> >
> > With 2.6.30, the error handling code in cdrom_newpc_intr was changed
> > to deal with partial request failures by normally completing the 'good'
> > parts of a request and only 'error' the last (and presumably,
> > incompletely transferred) bio associated with a particular
> > request. In order to do this, ide_complete_rq is called over
> > ide_cd_error_cmd() to partially complete the rq. The block layer
> > does partial completion only for requests with bio's and if the
> > rq doesn't have one (eg 'GPCMD_READ_DISC_INFO') the request is
> > completed as a whole and the drive->hwif->rq pointer set to NULL
> > afterwards. When calling ide_complete_rq again to report
> > the error, this null pointer is derefenced, resulting in a kernel
> > crash.

Rainer, thanks for fixing this bug (with a lot of extra points for
the detailed explanation).

> @Bart: please apply.

applied [I kept the above patch description]