2005-10-17 19:24:06

by Dale Blount

[permalink] [raw]
Subject: scsi disk size reporting in dmesg

Hello,

I just added 2 external 1TB+ scsi devices to my i686 linux server
running 2.6.13.4 connected to external LSI MPT card. fdisk and df both
show the sizes correctly (see below), but I'm worried that dmesg reports
them incorrectly.

SCSI device sda: 2460934144 512-byte hdwr sectors (160487 MB)
SCSI device sdb: 3790438400 512-byte hdwr sectors (841193 MB)

I don't think it's as simple as a variable overflow because both
sdkp->capacity and mb look to be cast as unsigned long longs. I know a
workaround is to present less data per LUN, but I'd like to use it as
it's setup currently if possible. Is this just printing incorrectly or
will I run into trouble when the device gets more full?


Thanks,

Dale




# df -h
/dev/sda1 1.2T 129M 1.1T 1% /mnt/sda1
/dev/sdb1 1.8T 129M 1.7T 1% /mnt/sdb1
# df
/dev/sda1 1211159084 131228 1149504532 1% /mnt/sda1
/dev/sdb1 1865473692 131228 1770581860 1% /mnt/sdb1


# fdisk -l /dev/sda
Disk /dev/sda: 1259.9 GB, 1259998281728 bytes
255 heads, 63 sectors/track, 153186 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 1 153186 1230466513+ 83 Linux

# fdisk -l /dev/sdb

Disk /dev/sdb: 1940.7 GB, 1940704460800 bytes
255 heads, 63 sectors/track, 235943 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 235943 1895212116 83 Linux





2005-10-20 05:09:23

by Randy Dunlap

[permalink] [raw]
Subject: Re: scsi disk size reporting in dmesg

On Mon, 17 Oct 2005 15:24:04 -0400 Dale Blount wrote:

> Hello,
>
> I just added 2 external 1TB+ scsi devices to my i686 linux server
> running 2.6.13.4 connected to external LSI MPT card. fdisk and df both
> show the sizes correctly (see below), but I'm worried that dmesg reports
> them incorrectly.
>
> SCSI device sda: 2460934144 512-byte hdwr sectors (160487 MB)
> SCSI device sdb: 3790438400 512-byte hdwr sectors (841193 MB)
>
> I don't think it's as simple as a variable overflow because both
> sdkp->capacity and mb look to be cast as unsigned long longs. I know a
> workaround is to present less data per LUN, but I'd like to use it as
> it's setup currently if possible. Is this just printing incorrectly or
> will I run into trouble when the device gets more full?

The casts to (unsigned long long) just fix the printk() args to match
the format strings (and eliminate warnings).

Looks to me like sdkp->capacity is correct. The <mb> value looks
way off. Since it's just printed here for user info, I don't see
how it can be a problem later on.

I don't see the error just yet. Are there any other SCSI device-
related messages near these? And just to confirm, but you must
have CONFIG_LBD (Large Block Device) enabled, right?



> Thanks,
>
> Dale
>
>
>
>
> # df -h
> /dev/sda1 1.2T 129M 1.1T 1% /mnt/sda1
> /dev/sdb1 1.8T 129M 1.7T 1% /mnt/sdb1
> # df
> /dev/sda1 1211159084 131228 1149504532 1% /mnt/sda1
> /dev/sdb1 1865473692 131228 1770581860 1% /mnt/sdb1
>
>
> # fdisk -l /dev/sda
> Disk /dev/sda: 1259.9 GB, 1259998281728 bytes
> 255 heads, 63 sectors/track, 153186 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
>
> Device Boot Start End Blocks Id System
> /dev/sda1 1 153186 1230466513+ 83 Linux
>
> # fdisk -l /dev/sdb
>
> Disk /dev/sdb: 1940.7 GB, 1940704460800 bytes
> 255 heads, 63 sectors/track, 235943 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
>
> Device Boot Start End Blocks Id System
> /dev/sdb1 1 235943 1895212116 83 Linux


---
~Randy

2005-10-20 12:17:17

by Dale Blount

[permalink] [raw]
Subject: Re: scsi disk size reporting in dmesg

On Wed, 2005-10-19 at 22:09 -0700, Randy.Dunlap wrote:
> On Mon, 17 Oct 2005 15:24:04 -0400 Dale Blount wrote:
>
> > Hello,
> >
> > I just added 2 external 1TB+ scsi devices to my i686 linux server
> > running 2.6.13.4 connected to external LSI MPT card. fdisk and df both
> > show the sizes correctly (see below), but I'm worried that dmesg reports
> > them incorrectly.
> >
> > SCSI device sda: 2460934144 512-byte hdwr sectors (160487 MB)
> > SCSI device sdb: 3790438400 512-byte hdwr sectors (841193 MB)
> >
> > I don't think it's as simple as a variable overflow because both
> > sdkp->capacity and mb look to be cast as unsigned long longs. I know a
> > workaround is to present less data per LUN, but I'd like to use it as
> > it's setup currently if possible. Is this just printing incorrectly or
> > will I run into trouble when the device gets more full?
>
> The casts to (unsigned long long) just fix the printk() args to match
> the format strings (and eliminate warnings).
>
> Looks to me like sdkp->capacity is correct. The <mb> value looks
> way off. Since it's just printed here for user info, I don't see
> how it can be a problem later on.
>

That's what I was hoping, but I didn't know for sure. I figured I'd
better ask to make sure my data wouldn't get truncated when the disk got
fuller. Between my first post and now, I've tested it by filling the
drive with data and testing the md5sums for each file and it seems to
work. Other than the odd MB size reported, it seems to work just fine.

> I don't see the error just yet. Are there any other SCSI device-
> related messages near these? And just to confirm, but you must
> have CONFIG_LBD (Large Block Device) enabled, right?
>

No, there are no other related messages near these other than the
standard vendor/versions. I did not enable CONFIG_LBD since the help
says "bigger than 2TB", and I partitioned the storage system to present
disks smaller than that.

Thanks for the reply,

Dale

2005-10-20 19:26:29

by Randy Dunlap

[permalink] [raw]
Subject: Re: scsi disk size reporting in dmesg

On Thu, 20 Oct 2005, Dale Blount wrote:

> On Wed, 2005-10-19 at 22:09 -0700, Randy.Dunlap wrote:
> > On Mon, 17 Oct 2005 15:24:04 -0400 Dale Blount wrote:
> >
> > > Hello,
> > >
> > > I just added 2 external 1TB+ scsi devices to my i686 linux server
> > > running 2.6.13.4 connected to external LSI MPT card. fdisk and df both
> > > show the sizes correctly (see below), but I'm worried that dmesg reports
> > > them incorrectly.
> > >
> > > SCSI device sda: 2460934144 512-byte hdwr sectors (160487 MB)
> > > SCSI device sdb: 3790438400 512-byte hdwr sectors (841193 MB)
> > >
> > > I don't think it's as simple as a variable overflow because both
> > > sdkp->capacity and mb look to be cast as unsigned long longs. I know a
> > > workaround is to present less data per LUN, but I'd like to use it as
> > > it's setup currently if possible. Is this just printing incorrectly or
> > > will I run into trouble when the device gets more full?
> >
> > The casts to (unsigned long long) just fix the printk() args to match
> > the format strings (and eliminate warnings).
> >
> > Looks to me like sdkp->capacity is correct. The <mb> value looks
> > way off. Since it's just printed here for user info, I don't see
> > how it can be a problem later on.
> >
>
> That's what I was hoping, but I didn't know for sure. I figured I'd
> better ask to make sure my data wouldn't get truncated when the disk got
> fuller. Between my first post and now, I've tested it by filling the
> drive with data and testing the md5sums for each file and it seems to
> work. Other than the odd MB size reported, it seems to work just fine.
>
> > I don't see the error just yet. Are there any other SCSI device-
> > related messages near these? And just to confirm, but you must
> > have CONFIG_LBD (Large Block Device) enabled, right?
> >
>
> No, there are no other related messages near these other than the
> standard vendor/versions. I did not enable CONFIG_LBD since the help
> says "bigger than 2TB", and I partitioned the storage system to present
> disks smaller than that.

I guessed at CONFIG_LBD=y last night. Today I did the arithmetic
with CONFIG_LBD=n... and found the overflow.

In drivers/scsi/sd.c, approx. line #1260:

sector_t sz = sdkp->capacity * (hard_sector/256);

sz is 32 bits.
capacity is 32 bits with a value of 2,460,934,144 == 0x92ae_e000.
Then we multiply that by 2 (512 / 256)... and it overflows.
Should be hex 1_255d_c000, but we drop the '1'.
The rest of the calculation (with this new value) does result
in the 160487 MB, matching your log message.

I think that you can fix it (patch it) by changing that line
to make 'sz' be type 'u64' for now.

u64 sz = sdkp->capacity * (hard_sector/256);

But those divides by 1250 and 1950 are still voodoo to me.

--
~Randy

2005-10-20 23:04:24

by Bill Davidsen

[permalink] [raw]
Subject: Re: scsi disk size reporting in dmesg

Randy.Dunlap wrote:
> On Thu, 20 Oct 2005, Dale Blount wrote:
>
>
>>On Wed, 2005-10-19 at 22:09 -0700, Randy.Dunlap wrote:
>>
>>>On Mon, 17 Oct 2005 15:24:04 -0400 Dale Blount wrote:
>>>
>>>
>>>>Hello,
>>>>
>>>>I just added 2 external 1TB+ scsi devices to my i686 linux server
>>>>running 2.6.13.4 connected to external LSI MPT card. fdisk and df both
>>>>show the sizes correctly (see below), but I'm worried that dmesg reports
>>>>them incorrectly.
>>>>
>>>>SCSI device sda: 2460934144 512-byte hdwr sectors (160487 MB)
>>>>SCSI device sdb: 3790438400 512-byte hdwr sectors (841193 MB)
>>>>
>>>>I don't think it's as simple as a variable overflow because both
>>>>sdkp->capacity and mb look to be cast as unsigned long longs. I know a
>>>>workaround is to present less data per LUN, but I'd like to use it as
>>>>it's setup currently if possible. Is this just printing incorrectly or
>>>>will I run into trouble when the device gets more full?
>>>
>>>The casts to (unsigned long long) just fix the printk() args to match
>>>the format strings (and eliminate warnings).
>>>
>>>Looks to me like sdkp->capacity is correct. The <mb> value looks
>>>way off. Since it's just printed here for user info, I don't see
>>>how it can be a problem later on.
>>>
>>
>>That's what I was hoping, but I didn't know for sure. I figured I'd
>>better ask to make sure my data wouldn't get truncated when the disk got
>>fuller. Between my first post and now, I've tested it by filling the
>>drive with data and testing the md5sums for each file and it seems to
>>work. Other than the odd MB size reported, it seems to work just fine.
>>
>>
>>>I don't see the error just yet. Are there any other SCSI device-
>>>related messages near these? And just to confirm, but you must
>>>have CONFIG_LBD (Large Block Device) enabled, right?
>>>
>>
>>No, there are no other related messages near these other than the
>>standard vendor/versions. I did not enable CONFIG_LBD since the help
>>says "bigger than 2TB", and I partitioned the storage system to present
>>disks smaller than that.
>
>
> I guessed at CONFIG_LBD=y last night. Today I did the arithmetic
> with CONFIG_LBD=n... and found the overflow.
>
> In drivers/scsi/sd.c, approx. line #1260:
>
> sector_t sz = sdkp->capacity * (hard_sector/256);
>
> sz is 32 bits.
> capacity is 32 bits with a value of 2,460,934,144 == 0x92ae_e000.
> Then we multiply that by 2 (512 / 256)... and it overflows.
> Should be hex 1_255d_c000, but we drop the '1'.
> The rest of the calculation (with this new value) does result
> in the 160487 MB, matching your log message.
>
> I think that you can fix it (patch it) by changing that line
> to make 'sz' be type 'u64' for now.
>
> u64 sz = sdkp->capacity * (hard_sector/256);
>
> But those divides by 1250 and 1950 are still voodoo to me.
>
I think that just hides the real problem, and while that's useful to
stop the annoying numbers, the real problem is either (a) the value
calculated is wrong, (b) the size of sector_t is no longer large enough,
or (c) the type has been wrong all along and shouldn't be sector_t.

If I understand what sector_t really is, (b) is my candidate.

--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me

2005-10-21 02:05:01

by Randy Dunlap

[permalink] [raw]
Subject: Re: scsi disk size reporting in dmesg

On Thu, 20 Oct 2005 19:04:47 -0400 Bill Davidsen wrote:

> Randy.Dunlap wrote:
> > On Thu, 20 Oct 2005, Dale Blount wrote:
> >
> >
> >>On Wed, 2005-10-19 at 22:09 -0700, Randy.Dunlap wrote:
> >>
> >>>On Mon, 17 Oct 2005 15:24:04 -0400 Dale Blount wrote:
> >>>
> >>>
> >>>>Hello,
> >>>>
> >>>>I just added 2 external 1TB+ scsi devices to my i686 linux server
> >>>>running 2.6.13.4 connected to external LSI MPT card. fdisk and df both
> >>>>show the sizes correctly (see below), but I'm worried that dmesg reports
> >>>>them incorrectly.
> >>>>
> >>>>SCSI device sda: 2460934144 512-byte hdwr sectors (160487 MB)
> >>>>SCSI device sdb: 3790438400 512-byte hdwr sectors (841193 MB)
> >>>>
> >>>>I don't think it's as simple as a variable overflow because both
> >>>>sdkp->capacity and mb look to be cast as unsigned long longs. I know a
> >>>>workaround is to present less data per LUN, but I'd like to use it as
> >>>>it's setup currently if possible. Is this just printing incorrectly or
> >>>>will I run into trouble when the device gets more full?
> >>>
> >>>The casts to (unsigned long long) just fix the printk() args to match
> >>>the format strings (and eliminate warnings).
> >>>
> >>>Looks to me like sdkp->capacity is correct. The <mb> value looks
> >>>way off. Since it's just printed here for user info, I don't see
> >>>how it can be a problem later on.
> >>>
> >>
> >>That's what I was hoping, but I didn't know for sure. I figured I'd
> >>better ask to make sure my data wouldn't get truncated when the disk got
> >>fuller. Between my first post and now, I've tested it by filling the
> >>drive with data and testing the md5sums for each file and it seems to
> >>work. Other than the odd MB size reported, it seems to work just fine.
> >>
> >>
> >>>I don't see the error just yet. Are there any other SCSI device-
> >>>related messages near these? And just to confirm, but you must
> >>>have CONFIG_LBD (Large Block Device) enabled, right?
> >>>
> >>
> >>No, there are no other related messages near these other than the
> >>standard vendor/versions. I did not enable CONFIG_LBD since the help
> >>says "bigger than 2TB", and I partitioned the storage system to present
> >>disks smaller than that.
> >
> >
> > I guessed at CONFIG_LBD=y last night. Today I did the arithmetic
> > with CONFIG_LBD=n... and found the overflow.
> >
> > In drivers/scsi/sd.c, approx. line #1260:
> >
> > sector_t sz = sdkp->capacity * (hard_sector/256);
> >
> > sz is 32 bits.
> > capacity is 32 bits with a value of 2,460,934,144 == 0x92ae_e000.
> > Then we multiply that by 2 (512 / 256)... and it overflows.
> > Should be hex 1_255d_c000, but we drop the '1'.
> > The rest of the calculation (with this new value) does result
> > in the 160487 MB, matching your log message.
> >
> > I think that you can fix it (patch it) by changing that line
> > to make 'sz' be type 'u64' for now.
> >
> > u64 sz = sdkp->capacity * (hard_sector/256);
> >
> > But those divides by 1250 and 1950 are still voodoo to me.
> >
> I think that just hides the real problem, and while that's useful to
> stop the annoying numbers, the real problem is either (a) the value
> calculated is wrong, (b) the size of sector_t is no longer large enough,
> or (c) the type has been wrong all along and shouldn't be sector_t.
>
> If I understand what sector_t really is, (b) is my candidate.

(b) is definitely the case with Dale's disk drive size.

---
~Randy