KASAN has found a slab-out-of-bounds error in ses_enclosure_data_process().
[ 27.298092] BUG: KASAN: slab-out-of-bounds in ses_enclosure_data_process+0x919/0xe80 [ses]
[ 27.306407] Read of size 1 at addr ffff8807c99048b1 by task systemd-udevd/1563
[ 27.315173] CPU: 18 PID: 1563 Comm: systemd-udevd Not tainted 4.18.0-80.23.el8.x86_64+debug #1
[ 27.323835] Hardware name: HPE ProLiant XL450 Gen10/ProLiant XL450 Gen10, BIOS U40 10/02/2018
[ 27.332410] Call Trace:
:
[ 27.348557] kasan_report.cold.6+0x92/0x1a6
[ 27.352771] ses_enclosure_data_process+0x919/0xe80 [ses]
[ 27.358211] ? kfree+0xd6/0x2e0
[ 27.361376] ses_intf_add+0xa23/0xef1 [ses]
[ 27.365590] ? class_dev_iter_next+0x6c/0xc0
[ 27.370034] class_interface_register+0x298/0x400
[ 27.374769] ? tsc_cs_mark_unstable+0x60/0x60
[ 27.379163] ? class_dev_iter_exit+0x10/0x10
[ 27.384857] ? 0xffffffffc0838000
[ 27.389580] ses_init+0x12/0x1000 [ses]
[ 27.394832] do_one_initcall+0xe9/0x5fd
:
[ 27.562322] Allocated by task 1563:
[ 27.562330] kasan_kmalloc+0xbf/0xe0
[ 27.569433] __kmalloc+0x150/0x360
[ 27.572858] ses_intf_add+0x76a/0xef1 [ses]
[ 27.577068] class_interface_register+0x298/0x400
[ 27.581803] ses_init+0x12/0x1000 [ses]
[ 27.587053] do_one_initcall+0xe9/0x5fd
[ 27.592309] do_init_module+0x1f2/0x710
[ 27.597564] load_module+0x3e19/0x5910
[ 27.601336] __do_sys_init_module+0x1dd/0x260
[ 27.606673] do_syscall_64+0xa5/0x4a0
[ 27.610359] entry_SYSCALL_64_after_hwframe+0x6a/0xdf
:
[ 27.624932] The buggy address belongs to the object at ffff8807c9904400
which belongs to the cache kmalloc-2048 of size 2048
[ 27.637701] The buggy address is located 1201 bytes inside of
2048-byte region [ffff8807c9904400, ffff8807c9904c00)
[ 27.649683] The buggy address belongs to the page:
[ 27.654503] page:ffffea001f264000 count:1 mapcount:0 mapping:ffff880107c15d80 index:0x0 compound_mapcount: 0
[ 27.664393] flags: 0x17ffffc0008100(slab|head)
[ 27.668865] raw: 0017ffffc0008100 dead000000000100 dead000000000200 ffff880107c15d80
[ 27.676656] raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
[ 27.684444] page dumped because: kasan: bad access detected
The out-of-bounds memory access happens to ses_dev->page10 which has a
size of 1200 bytes in this case. The invalid memory access happens in:
600 if (addl_desc_ptr &&
601 /* only find additional descriptions for specific devices */
602 (type_ptr[0] == ENCLOSURE_COMPONENT_DEVICE ||
603 type_ptr[0] == ENCLOSURE_COMPONENT_ARRAY_DEVICE ||
604 type_ptr[0] == ENCLOSURE_COMPONENT_SAS_EXPANDER ||
605 /* these elements are optional */
606 type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
607 type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
608 type_ptr[0] == ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
609 addl_desc_ptr += addl_desc_ptr[1] + 2; <-- here
To fix the out-of-bounds memory access, code is now added to make sure
that addl_desc_ptr will never point to an address outside of its bounds.
With this patch, the KASAN warning is gone.
Signed-off-by: Waiman Long <[email protected]>
---
drivers/scsi/ses.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index 0fc39224ce1e..dbc9acc2df2f 100644
--- a/drivers/scsi/ses.c
+++ b/drivers/scsi/ses.c
@@ -605,9 +605,14 @@ static void ses_enclosure_data_process(struct enclosure_device *edev,
/* these elements are optional */
type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
- type_ptr[0] == ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
+ type_ptr[0] == ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS)) {
addl_desc_ptr += addl_desc_ptr[1] + 2;
+ /* Ensure no out-of-bounds memory access */
+ if (addl_desc_ptr >= ses_dev->page10 +
+ ses_dev->page10_len)
+ addl_desc_ptr = NULL;
+ }
}
}
kfree(buf);
--
2.18.1
On 5/1/19 2:05 PM, Waiman Long wrote:
> KASAN has found a slab-out-of-bounds error in ses_enclosure_data_process().
>
> [ 27.298092] BUG: KASAN: slab-out-of-bounds in ses_enclosure_data_process+0x919/0xe80 [ses]
> [ 27.306407] Read of size 1 at addr ffff8807c99048b1 by task systemd-udevd/1563
> [ 27.315173] CPU: 18 PID: 1563 Comm: systemd-udevd Not tainted 4.18.0-80.23.el8.x86_64+debug #1
> [ 27.323835] Hardware name: HPE ProLiant XL450 Gen10/ProLiant XL450 Gen10, BIOS U40 10/02/2018
> [ 27.332410] Call Trace:
> :
> [ 27.348557] kasan_report.cold.6+0x92/0x1a6
> [ 27.352771] ses_enclosure_data_process+0x919/0xe80 [ses]
> [ 27.358211] ? kfree+0xd6/0x2e0
> [ 27.361376] ses_intf_add+0xa23/0xef1 [ses]
> [ 27.365590] ? class_dev_iter_next+0x6c/0xc0
> [ 27.370034] class_interface_register+0x298/0x400
> [ 27.374769] ? tsc_cs_mark_unstable+0x60/0x60
> [ 27.379163] ? class_dev_iter_exit+0x10/0x10
> [ 27.384857] ? 0xffffffffc0838000
> [ 27.389580] ses_init+0x12/0x1000 [ses]
> [ 27.394832] do_one_initcall+0xe9/0x5fd
> :
> [ 27.562322] Allocated by task 1563:
> [ 27.562330] kasan_kmalloc+0xbf/0xe0
> [ 27.569433] __kmalloc+0x150/0x360
> [ 27.572858] ses_intf_add+0x76a/0xef1 [ses]
> [ 27.577068] class_interface_register+0x298/0x400
> [ 27.581803] ses_init+0x12/0x1000 [ses]
> [ 27.587053] do_one_initcall+0xe9/0x5fd
> [ 27.592309] do_init_module+0x1f2/0x710
> [ 27.597564] load_module+0x3e19/0x5910
> [ 27.601336] __do_sys_init_module+0x1dd/0x260
> [ 27.606673] do_syscall_64+0xa5/0x4a0
> [ 27.610359] entry_SYSCALL_64_after_hwframe+0x6a/0xdf
> :
> [ 27.624932] The buggy address belongs to the object at ffff8807c9904400
> which belongs to the cache kmalloc-2048 of size 2048
> [ 27.637701] The buggy address is located 1201 bytes inside of
> 2048-byte region [ffff8807c9904400, ffff8807c9904c00)
> [ 27.649683] The buggy address belongs to the page:
> [ 27.654503] page:ffffea001f264000 count:1 mapcount:0 mapping:ffff880107c15d80 index:0x0 compound_mapcount: 0
> [ 27.664393] flags: 0x17ffffc0008100(slab|head)
> [ 27.668865] raw: 0017ffffc0008100 dead000000000100 dead000000000200 ffff880107c15d80
> [ 27.676656] raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
> [ 27.684444] page dumped because: kasan: bad access detected
>
> The out-of-bounds memory access happens to ses_dev->page10 which has a
> size of 1200 bytes in this case. The invalid memory access happens in:
>
> 600 if (addl_desc_ptr &&
> 601 /* only find additional descriptions for specific devices */
> 602 (type_ptr[0] == ENCLOSURE_COMPONENT_DEVICE ||
> 603 type_ptr[0] == ENCLOSURE_COMPONENT_ARRAY_DEVICE ||
> 604 type_ptr[0] == ENCLOSURE_COMPONENT_SAS_EXPANDER ||
> 605 /* these elements are optional */
> 606 type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
> 607 type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
> 608 type_ptr[0] == ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
> 609 addl_desc_ptr += addl_desc_ptr[1] + 2; <-- here
>
> To fix the out-of-bounds memory access, code is now added to make sure
> that addl_desc_ptr will never point to an address outside of its bounds.
>
> With this patch, the KASAN warning is gone.
>
> Signed-off-by: Waiman Long <[email protected]>
> ---
> drivers/scsi/ses.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
> index 0fc39224ce1e..dbc9acc2df2f 100644
> --- a/drivers/scsi/ses.c
> +++ b/drivers/scsi/ses.c
> @@ -605,9 +605,14 @@ static void ses_enclosure_data_process(struct enclosure_device *edev,
> /* these elements are optional */
> type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
> type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
> - type_ptr[0] == ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
> + type_ptr[0] == ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS)) {
> addl_desc_ptr += addl_desc_ptr[1] + 2;
>
> + /* Ensure no out-of-bounds memory access */
> + if (addl_desc_ptr >= ses_dev->page10 +
> + ses_dev->page10_len)
> + addl_desc_ptr = NULL;
> + }
> }
> }
> kfree(buf);
Ping! Any comment on this patch.
Cheers,
Longman
On Mon, 2019-05-20 at 10:41 -0400, Waiman Long wrote:
[...]
> > --- a/drivers/scsi/ses.c
> > +++ b/drivers/scsi/ses.c
> > @@ -605,9 +605,14 @@ static void ses_enclosure_data_process(struct
> > enclosure_device *edev,
> > /* these elements are optional */
> > type_ptr[0] ==
> > ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
> > type_ptr[0] ==
> > ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
> > - type_ptr[0] ==
> > ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
> > + type_ptr[0] ==
> > ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS)) {
> > addl_desc_ptr += addl_desc_ptr[1]
> > + 2;
> >
> > + /* Ensure no out-of-bounds memory
> > access */
> > + if (addl_desc_ptr >= ses_dev-
> > >page10 +
> > + ses_dev-
> > >page10_len)
> > + addl_desc_ptr = NULL;
> > + }
> > }
> > }
> > kfree(buf);
>
> Ping! Any comment on this patch.
The update looks fine to me:
Reviewed-by: James E.J. Bottomley <[email protected]>
It might also be interesting to find out how the proliant is
structuring this descriptor array to precipitate the out of bounds: Is
it just an off by one or something more serious?
James
On 5/20/19 10:52 AM, James Bottomley wrote:
> On Mon, 2019-05-20 at 10:41 -0400, Waiman Long wrote:
> [...]
>>> --- a/drivers/scsi/ses.c
>>> +++ b/drivers/scsi/ses.c
>>> @@ -605,9 +605,14 @@ static void ses_enclosure_data_process(struct
>>> enclosure_device *edev,
>>> /* these elements are optional */
>>> type_ptr[0] ==
>>> ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
>>> type_ptr[0] ==
>>> ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
>>> - type_ptr[0] ==
>>> ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
>>> + type_ptr[0] ==
>>> ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS)) {
>>> addl_desc_ptr += addl_desc_ptr[1]
>>> + 2;
>>>
>>> + /* Ensure no out-of-bounds memory
>>> access */
>>> + if (addl_desc_ptr >= ses_dev-
>>>> page10 +
>>> + ses_dev-
>>>> page10_len)
>>> + addl_desc_ptr = NULL;
>>> + }
>>> }
>>> }
>>> kfree(buf);
>> Ping! Any comment on this patch.
> The update looks fine to me:
>
> Reviewed-by: James E.J. Bottomley <[email protected]>
>
> It might also be interesting to find out how the proliant is
> structuring this descriptor array to precipitate the out of bounds: Is
> it just an off by one or something more serious?
I didn't look into the detail the enclosure message returned by the
hardware, but I believe it may have more description entries (page7)
than extended description entries (page10).
I can try to reserve the system and find out what exactly is wrong with
that system if you really want to find that out.
Cheers,
Longman
On Mon, 2019-05-20 at 11:24 -0400, Waiman Long wrote:
> On 5/20/19 10:52 AM, James Bottomley wrote:
> > On Mon, 2019-05-20 at 10:41 -0400, Waiman Long wrote:
> > [...]
> > > > --- a/drivers/scsi/ses.c
> > > > +++ b/drivers/scsi/ses.c
> > > > @@ -605,9 +605,14 @@ static void
> > > > ses_enclosure_data_process(struct
> > > > enclosure_device *edev,
> > > > /* these elements are optional */
> > > > type_ptr[0] ==
> > > > ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
> > > > type_ptr[0] ==
> > > > ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
> > > > - type_ptr[0] ==
> > > > ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
> > > > + type_ptr[0] ==
> > > > ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS)) {
> > > > addl_desc_ptr +=
> > > > addl_desc_ptr[1]
> > > > + 2;
> > > >
> > > > + /* Ensure no out-of-bounds
> > > > memory
> > > > access */
> > > > + if (addl_desc_ptr >= ses_dev-
> > > > > page10 +
> > > >
> > > > + ses_dev-
> > > > > page10_len)
> > > >
> > > > + addl_desc_ptr = NULL;
> > > > + }
> > > > }
> > > > }
> > > > kfree(buf);
> > >
> > > Ping! Any comment on this patch.
> >
> > The update looks fine to me:
> >
> > Reviewed-by: James E.J. Bottomley <[email protected]>
> >
> > It might also be interesting to find out how the proliant is
> > structuring this descriptor array to precipitate the out of bounds:
> > Is it just an off by one or something more serious?
>
> I didn't look into the detail the enclosure message returned by the
> hardware, but I believe it may have more description entries (page7)
> than extended description entries (page10).
>
> I can try to reserve the system and find out what exactly is wrong
> with that system if you really want to find that out.
Please. What I'm interested in is whether this is simply a bug in the
array firmware, in which case the fix is sufficient, or whether there's
some problem with the parser, like mismatched expectations over added
trailing nulls or something.
James
On 5/20/19 11:46 AM, James Bottomley wrote:
> On Mon, 2019-05-20 at 11:24 -0400, Waiman Long wrote:
>> On 5/20/19 10:52 AM, James Bottomley wrote:
>>> On Mon, 2019-05-20 at 10:41 -0400, Waiman Long wrote:
>>> [...]
>>>>> --- a/drivers/scsi/ses.c
>>>>> +++ b/drivers/scsi/ses.c
>>>>> @@ -605,9 +605,14 @@ static void
>>>>> ses_enclosure_data_process(struct
>>>>> enclosure_device *edev,
>>>>> /* these elements are optional */
>>>>> type_ptr[0] ==
>>>>> ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
>>>>> type_ptr[0] ==
>>>>> ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
>>>>> - type_ptr[0] ==
>>>>> ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
>>>>> + type_ptr[0] ==
>>>>> ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS)) {
>>>>> addl_desc_ptr +=
>>>>> addl_desc_ptr[1]
>>>>> + 2;
>>>>>
>>>>> + /* Ensure no out-of-bounds
>>>>> memory
>>>>> access */
>>>>> + if (addl_desc_ptr >= ses_dev-
>>>>>> page10 +
>>>>> + ses_dev-
>>>>>> page10_len)
>>>>> + addl_desc_ptr = NULL;
>>>>> + }
>>>>> }
>>>>> }
>>>>> kfree(buf);
>>>> Ping! Any comment on this patch.
>>> The update looks fine to me:
>>>
>>> Reviewed-by: James E.J. Bottomley <[email protected]>
>>>
>>> It might also be interesting to find out how the proliant is
>>> structuring this descriptor array to precipitate the out of bounds:
>>> Is it just an off by one or something more serious?
>> I didn't look into the detail the enclosure message returned by the
>> hardware, but I believe it may have more description entries (page7)
>> than extended description entries (page10).
>>
>> I can try to reserve the system and find out what exactly is wrong
>> with that system if you really want to find that out.
> Please. What I'm interested in is whether this is simply a bug in the
> array firmware, in which case the fix is sufficient, or whether there's
> some problem with the parser, like mismatched expectations over added
> trailing nulls or something.
>
> James
>
OK, will let you know once I get hold of the system.
Cheers,
Longman
James,
> Please. What I'm interested in is whether this is simply a bug in the
> array firmware, in which case the fix is sufficient, or whether
> there's some problem with the parser, like mismatched expectations
> over added trailing nulls or something.
Our support folks have been looking at this for a while. We have seen
problems with devices from several vendors. To the extent that I gave up
the idea of blacklisting all of them.
I am collecting "bad" SES pages from these devices. I have added support
for RECEIVE DIAGNOSTICS to scsi_debug and added a bunch of deliberately
broken SES pages so we could debug this.
It appears to be very common for devices to return inconsistent or
invalid data. So pretty much all of the ses.c parsing needs to have
sanity checking heuristics added to prevent KASAN hiccups.
--
Martin K. Petersen Oracle Linux Engineering
On 2019-05-20 12:05 p.m., Martin K. Petersen wrote:
>
> James,
>
>> Please. What I'm interested in is whether this is simply a bug in the
>> array firmware, in which case the fix is sufficient, or whether
>> there's some problem with the parser, like mismatched expectations
>> over added trailing nulls or something.
>
> Our support folks have been looking at this for a while. We have seen
> problems with devices from several vendors. To the extent that I gave up
> the idea of blacklisting all of them.
>
> I am collecting "bad" SES pages from these devices. I have added support
> for RECEIVE DIAGNOSTICS to scsi_debug and added a bunch of deliberately
> broken SES pages so we could debug this
Patches ??
> It appears to be very common for devices to return inconsistent or
> invalid data. So pretty much all of the ses.c parsing needs to have
> sanity checking heuristics added to prevent KASAN hiccups.
And it is not just SES device implementations that were broken. The
relationship between Additional Element Status diagnostic page (dpage)
and the Enclosure Status dpage was under-specified in SES-2 and that
led to the EIIOE field being introduced during the SES-3 revisions.
And the meaning of EIIOE was tweaked several times *** before SES-3 was
standardized. Anyone interested in the adventures of EIIOE can see
the code of sg_ses.c in sg3_utils. The sg_ses utility is many times
more complex than anything else in the sg3_utils package.
And that complexity led me to suspect that the Linux SES driver was
broken. It should be 3 or 4 times larger than it is! It simply doesn't
do enough checking.
So yes Martin, you are on the right track.
Doug Gilbert
BTW the NVME Management Interface folks have decided to use SES-3 for
NVME enclosure management rather than invent their own can of worms :-)
*** For example EIIOE started life as a 1 bit field, but two cases
wasn't enough, so it became a 2 bit field and now uses all
four possibilities.
Doug,
>> I am collecting "bad" SES pages from these devices. I have added
>> support for RECEIVE DIAGNOSTICS to scsi_debug and added a bunch of
>> deliberately broken SES pages so we could debug this
>
> Patches ??
I have included the plumbing below. However, I need to synthesize the
contents of the pages with problems. I can't share the ones I have
received from customers so I removed the arrays from the patch.
--
Martin K. Petersen Oracle Linux Engineering
From 968dfc5cd498d2ea6e77801cc9b9183a1a28b35d Mon Sep 17 00:00:00 2001
From: "Martin K. Petersen" <[email protected]>
Date: Thu, 28 Mar 2019 22:29:13 -0400
Subject: [PATCH] scsi: scsi_debug: Implement support for Receive Diagnostics
command
Signed-off-by: Martin K. Petersen <[email protected]>
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 2740a90501a0..db8745a7000e 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -356,7 +356,8 @@ enum sdeb_opcode_index {
SDEB_I_WRITE_SAME = 26, /* 10, 16 */
SDEB_I_SYNC_CACHE = 27, /* 10, 16 */
SDEB_I_COMP_WRITE = 28,
- SDEB_I_LAST_ELEMENT = 29, /* keep this last (previous + 1) */
+ SDEB_I_RECV_DIAG = 29,
+ SDEB_I_LAST_ELEMENT = 30, /* keep this last (previous + 1) */
};
@@ -367,8 +368,8 @@ static const unsigned char opcode_ind_arr[256] = {
SDEB_I_READ, 0, SDEB_I_WRITE, 0, 0, 0, 0, 0,
0, 0, SDEB_I_INQUIRY, 0, 0, SDEB_I_MODE_SELECT, SDEB_I_RESERVE,
SDEB_I_RELEASE,
- 0, 0, SDEB_I_MODE_SENSE, SDEB_I_START_STOP, 0, SDEB_I_SEND_DIAG,
- SDEB_I_ALLOW_REMOVAL, 0,
+ 0, 0, SDEB_I_MODE_SENSE, SDEB_I_START_STOP, SDEB_I_RECV_DIAG,
+ SDEB_I_SEND_DIAG, SDEB_I_ALLOW_REMOVAL, 0,
/* 0x20; 0x20->0x3f: 10 byte cdbs */
0, 0, 0, 0, 0, SDEB_I_READ_CAPACITY, 0, 0,
SDEB_I_READ, 0, SDEB_I_WRITE, 0, 0, 0, 0, SDEB_I_VERIFY,
@@ -433,6 +434,7 @@ static int resp_write_same_16(struct scsi_cmnd *, struct sdebug_dev_info *);
static int resp_comp_write(struct scsi_cmnd *, struct sdebug_dev_info *);
static int resp_write_buffer(struct scsi_cmnd *, struct sdebug_dev_info *);
static int resp_sync_cache(struct scsi_cmnd *, struct sdebug_dev_info *);
+static int resp_recv_diag(struct scsi_cmnd *, struct sdebug_dev_info *);
/*
* The following are overflow arrays for cdbs that "hit" the same index in
@@ -613,8 +615,9 @@ static const struct opcode_info_t opcode_info_arr[SDEB_I_LAST_ELEMENT + 1] = {
{0, 0x89, 0, F_D_OUT | FF_MEDIA_IO, resp_comp_write, NULL,
{16, 0xf8, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0, 0,
0, 0xff, 0x3f, 0xc7} }, /* COMPARE AND WRITE */
-
-/* 29 */
+ {0, 0x1c, 0, FF_RESPOND | F_D_IN, resp_recv_diag, NULL, /* RECV DIAG */
+ {6, 0x1, 0xff, 0xff, 0xff, 0xc7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} },
+/* 30 */
{0xff, 0, 0, 0, NULL, NULL, /* terminating element */
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} },
};
@@ -1516,7 +1519,7 @@ static int resp_inquiry(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
arr[5] = (int)have_dif_prot; /* PROTECT bit */
if (sdebug_vpd_use_hostno == 0)
arr[5] |= 0x10; /* claim: implicit TPGS */
- arr[6] = 0x10; /* claim: MultiP */
+ arr[6] = 0x10 | 0x40; /* claim: MultiP */
/* arr[6] |= 0x40; ... claim: EncServ (enclosure services) */
arr[7] = 0xa; /* claim: LINKED + CMDQUE */
memcpy(&arr[8], sdebug_inq_vendor_id, 8);
@@ -3597,6 +3600,36 @@ static int resp_sync_cache(struct scsi_cmnd *scp,
return res;
}
+static unsigned char diag0[] = {
+ 0x00, 0x00, 0x00, 0x07, 0x00, 0x01, 0x02, 0x06, 0x07, 0x0a, 0xa0, 0x00,
+};
+#define DIAG0_LEN 12
+
+
+static int resp_recv_diag(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
+{
+ unsigned char *cmd = scp->cmnd;
+
+ switch(cmd[2]) {
+ case 0:
+ return fill_from_dev_buffer(scp, diag0, DIAG0_LEN);
+ case 1:
+ return fill_from_dev_buffer(scp, diag1, DIAG1_LEN);
+ case 2:
+ return fill_from_dev_buffer(scp, diag2, DIAG2_LEN);
+ case 6:
+ return fill_from_dev_buffer(scp, diag6, DIAG6_LEN);
+ case 7:
+ return fill_from_dev_buffer(scp, diag7, DIAG7_LEN);
+ case 0xa:
+ return fill_from_dev_buffer(scp, diaga, DIAGA_LEN);
+ case 0xa0:
+ return fill_from_dev_buffer(scp, diaga0, DIAGA0_LEN);
+ }
+
+ return DID_ERROR << 16;
+}
+
#define RL_BUCKET_ELEMS 8
/* Even though each pseudo target has a REPORT LUNS "well known logical unit"
On 5/20/19 11:56 AM, Waiman Long wrote:
> On 5/20/19 11:46 AM, James Bottomley wrote:
>> On Mon, 2019-05-20 at 11:24 -0400, Waiman Long wrote:
>>> On 5/20/19 10:52 AM, James Bottomley wrote:
>>>> On Mon, 2019-05-20 at 10:41 -0400, Waiman Long wrote:
>>>> [...]
>>>>>> --- a/drivers/scsi/ses.c
>>>>>> +++ b/drivers/scsi/ses.c
>>>>>> @@ -605,9 +605,14 @@ static void
>>>>>> ses_enclosure_data_process(struct
>>>>>> enclosure_device *edev,
>>>>>> /* these elements are optional */
>>>>>> type_ptr[0] ==
>>>>>> ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
>>>>>> type_ptr[0] ==
>>>>>> ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
>>>>>> - type_ptr[0] ==
>>>>>> ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
>>>>>> + type_ptr[0] ==
>>>>>> ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS)) {
>>>>>> addl_desc_ptr +=
>>>>>> addl_desc_ptr[1]
>>>>>> + 2;
>>>>>>
>>>>>> + /* Ensure no out-of-bounds
>>>>>> memory
>>>>>> access */
>>>>>> + if (addl_desc_ptr >= ses_dev-
>>>>>>> page10 +
>>>>>> + ses_dev-
>>>>>>> page10_len)
>>>>>> + addl_desc_ptr = NULL;
>>>>>> + }
>>>>>> }
>>>>>> }
>>>>>> kfree(buf);
>>>>> Ping! Any comment on this patch.
>>>> The update looks fine to me:
>>>>
>>>> Reviewed-by: James E.J. Bottomley <[email protected]>
>>>>
>>>> It might also be interesting to find out how the proliant is
>>>> structuring this descriptor array to precipitate the out of bounds:
>>>> Is it just an off by one or something more serious?
>>> I didn't look into the detail the enclosure message returned by the
>>> hardware, but I believe it may have more description entries (page7)
>>> than extended description entries (page10).
>>>
>>> I can try to reserve the system and find out what exactly is wrong
>>> with that system if you really want to find that out.
>> Please. What I'm interested in is whether this is simply a bug in the
>> array firmware, in which case the fix is sufficient, or whether there's
>> some problem with the parser, like mismatched expectations over added
>> trailing nulls or something.
>>
>> James
>>
> OK, will let you know once I get hold of the system.
I now have access to the xl450 system that can reproduce the error. 12
enclosure messages are processed during bootup. Of the one that causes
the KASAN warning, the byte dump of page 1 and 10 are:
[ 28.185749] ses page 10 (len = 1200):
[ 28.185762] a-0000: 0a 00 04 ac 00 00 00 00 16 22 00 00 01 01 00 01
[ 28.185768] a-0010: 00 00 00 01 50 01 43 80 33 d4 45 bd 50 01 43 80
[ 28.185774] a-0020: 33 d4 45 80 00 00 00 00 00 00 00 00 16 22 00 01
[ 28.185779] a-0030: 01 01 00 02 00 00 00 01 50 01 43 80 33 d4 45 bd
[ 28.185785] a-0040: 50 01 43 80 33 d4 45 81 00 00 00 00 00 00 00 00
[ 28.185790] a-0050: 96 22 00 02 01 01 00 03 00 00 00 00 00 00 00 00
[ 28.185794] a-0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185798] a-0070: 00 00 00 00 96 22 00 03 01 01 00 04 00 00 00 00
[ 28.185801] a-0080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185805] a-0090: 00 00 00 00 00 00 00 00 96 22 00 04 01 01 00 05
[ 28.185808] a-00a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185812] a-00b0: 00 00 00 00 00 00 00 00 00 00 00 00 96 22 00 05
[ 28.185815] a-00c0: 01 01 00 06 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185818] a-00d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185822] a-00e0: 96 22 00 06 01 01 00 07 00 00 00 00 00 00 00 00
[ 28.185825] a-00f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185829] a-0100: 00 00 00 00 96 22 00 07 01 01 00 08 00 00 00 00
[ 28.185832] a-0110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185836] a-0120: 00 00 00 00 00 00 00 00 96 22 00 08 01 01 00 09
[ 28.185839] a-0130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185843] a-0140: 00 00 00 00 00 00 00 00 00 00 00 00 96 22 00 09
[ 28.185846] a-0150: 01 01 00 0a 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185850] a-0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185853] a-0170: 96 22 00 0a 01 01 00 0b 00 00 00 00 00 00 00 00
[ 28.185857] a-0180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185860] a-0190: 00 00 00 00 96 22 00 0b 01 01 00 0c 00 00 00 00
[ 28.185864] a-01a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185867] a-01b0: 00 00 00 00 00 00 00 00 96 22 00 0c 01 01 00 0d
[ 28.185871] a-01c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185874] a-01d0: 00 00 00 00 00 00 00 00 00 00 00 00 96 22 00 0d
[ 28.185877] a-01e0: 01 01 00 0e 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185881] a-01f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185884] a-0200: 96 22 00 0e 01 01 00 0f 00 00 00 00 00 00 00 00
[ 28.185888] a-0210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185891] a-0220: 00 00 00 00 96 22 00 0f 01 01 00 10 00 00 00 00
[ 28.185895] a-0230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185898] a-0240: 00 00 00 00 00 00 00 00 96 22 00 10 01 01 00 11
[ 28.185902] a-0250: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185905] a-0260: 00 00 00 00 00 00 00 00 00 00 00 00 96 22 00 11
[ 28.185909] a-0270: 01 01 00 12 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185912] a-0280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185916] a-0290: 96 22 00 12 01 01 00 13 00 00 00 00 00 00 00 00
[ 28.185919] a-02a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185923] a-02b0: 00 00 00 00 96 22 00 13 01 01 00 14 00 00 00 00
[ 28.185926] a-02c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185930] a-02d0: 00 00 00 00 00 00 00 00 96 22 00 14 01 01 00 15
[ 28.185933] a-02e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185937] a-02f0: 00 00 00 00 00 00 00 00 00 00 00 00 96 22 00 15
[ 28.185940] a-0300: 01 01 00 16 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185944] a-0310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185947] a-0320: 96 22 00 16 01 01 00 17 00 00 00 00 00 00 00 00
[ 28.185951] a-0330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185954] a-0340: 00 00 00 00 96 22 00 17 01 01 00 18 00 00 00 00
[ 28.185958] a-0350: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185961] a-0360: 00 00 00 00 00 00 00 00 96 22 00 18 01 01 00 19
[ 28.185965] a-0370: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185968] a-0380: 00 00 00 00 00 00 00 00 00 00 00 00 96 22 00 19
[ 28.185972] a-0390: 01 01 00 1a 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185975] a-03a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185979] a-03b0: 96 22 00 1a 01 01 00 1b 00 00 00 00 00 00 00 00
[ 28.185982] a-03c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185986] a-03d0: 00 00 00 00 96 22 00 1b 01 01 00 1c 00 00 00 00
[ 28.185989] a-03e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185993] a-03f0: 00 00 00 00 00 00 00 00 96 22 00 1c 01 01 00 1d
[ 28.185996] a-0400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.185999] a-0410: 00 00 00 00 00 00 00 00 00 00 00 00 96 22 00 1d
[ 28.186023] a-0420: 01 01 00 1e 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.186026] a-0430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 28.186030] a-0440: 16 6e 00 47 30 40 00 00 50 01 43 80 33 d4 45 bd
[ 28.186034] a-0450: 49 01 4a 02 4b 03 4c 04 4d 05 4e 06 4f 07 50 08
[ 28.186038] a-0460: 51 09 52 0a 53 0b 54 0c 55 0d 56 0e 57 0f 58 10
[ 28.186041] a-0470: 59 11 5a 12 5b 13 5c 14 5d 15 5e 16 5f 17 60 18
[ 28.186045] a-0480: 61 19 62 1a 63 1b 64 1c 65 1d 66 1e ff ff ff ff
[ 28.186048] a-0490: 68 ff 69 ff 6a ff 6b ff 6c ff 6d ff 6e ff 6f ff
[ 28.186052] a-04a0: 71 ff 72 ff 73 ff 74 ff 75 ff 76 ff 77 ff 78 ff
[ 28.188282] ses page 1 (len = 36):
[ 28.188290] 1-0000: 17 1e 00 0c 04 03 00 14 89 1e 00 18 07 01 00 0c
[ 28.188298] 1-0010: 0e 01 00 10 18 01 00 0c 19 1e 00 14 19 08 00 20
[ 28.188302] 1-0020: 19 08 00 20
Page 7 can't be retrieved for this message.
Please let me know what else do you want to have.
Cheers,
Longman
On 5/20/19 10:52 AM, James Bottomley wrote:
> On Mon, 2019-05-20 at 10:41 -0400, Waiman Long wrote:
> [...]
>>> --- a/drivers/scsi/ses.c
>>> +++ b/drivers/scsi/ses.c
>>> @@ -605,9 +605,14 @@ static void ses_enclosure_data_process(struct
>>> enclosure_device *edev,
>>> /* these elements are optional */
>>> type_ptr[0] ==
>>> ENCLOSURE_COMPONENT_SCSI_TARGET_PORT ||
>>> type_ptr[0] ==
>>> ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT ||
>>> - type_ptr[0] ==
>>> ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS))
>>> + type_ptr[0] ==
>>> ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS)) {
>>> addl_desc_ptr += addl_desc_ptr[1]
>>> + 2;
>>>
>>> + /* Ensure no out-of-bounds memory
>>> access */
>>> + if (addl_desc_ptr >= ses_dev-
>>>> page10 +
>>> + ses_dev-
>>>> page10_len)
>>> + addl_desc_ptr = NULL;
>>> + }
>>> }
>>> }
>>> kfree(buf);
>> Ping! Any comment on this patch.
> The update looks fine to me:
>
> Reviewed-by: James E.J. Bottomley <[email protected]>
>
> It might also be interesting to find out how the proliant is
> structuring this descriptor array to precipitate the out of bounds: Is
> it just an off by one or something more serious?
>
> James
>
Is someone going to merge this patch in the current cycle?
Thanks,
Longman
Waiman,
> Is someone going to merge this patch in the current cycle?
I was hoping somebody would step up and patch all the bad accesses and
not just page 10.
--
Martin K. Petersen Oracle Linux Engineering
On 7/18/19 2:26 PM, Martin K. Petersen wrote:
> Waiman,
>
>> Is someone going to merge this patch in the current cycle?
> I was hoping somebody would step up and patch all the bad accesses and
> not just page 10.
>
I see.
Thanks,
Longman