by Elliott, Robert (Server Storage)

Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout

In sd_sync_cache:
rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER;

Regardless of the baseline for the multiplication, a magic
number of 2 is too arbitrary. That might work for an
individual drive, but could be far too short for a RAID
controller that runs into worst case error handling for
the drives to which it is flushing (e.g., if its cache
is volatile and the drives all have recoverable errors
during writes). That time goes up with a bigger cache and
with more fragmented write data in the cache requiring
more individual WRITE commands.

A better value would be the Recommended Command Timeout field
value returned by the REPORT SUPPORTED OPERATION CODES command,
if reported by the device server. That is supposed to account
for the worst case.

For cases where that is not reported, exposing the multiplier
in sysfs would let the timeout for simple devices be set to
smaller values than complex devices.
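
A minimal sketch of that selection logic, written as standalone C rather than as kernel code. The struct, its field names, and flush_timeout_seconds() are hypothetical and appear nowhere in sd.c. It assumes the Recommended Command Timeout is expressed in seconds and that a value of zero means the device did not report one:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy inputs; none of these names exist in sd.c. */
struct flush_timeout_policy {
	unsigned int base_timeout_s;	/* basic I/O timeout, in seconds */
	unsigned int multiplier;	/* fallback multiplier, e.g. a sysfs tunable */
	unsigned int recommended_s;	/* Recommended Command Timeout; 0 = not reported (assumption) */
};

/*
 * Prefer the device-reported value when there is one; otherwise fall
 * back to the basic I/O timeout scaled by the tunable multiplier.
 */
static unsigned int flush_timeout_seconds(const struct flush_timeout_policy *p)
{
	assert(p != NULL);

	if (p->recommended_s != 0)
		return p->recommended_s;

	return p->base_timeout_s * p->multiplier;
}
```

With this shape, a simple drive that reports nothing gets base times multiplier, while a RAID controller that reports a long worst-case value gets that value regardless of the multiplier.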

Also, in both sd_setup_flush_cmnd and sd_sync_cache:
cmd->cmnd[0] = SYNCHRONIZE_CACHE;
cmd->cmd_len = 10;

SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE
CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported.
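
A rough illustration of that preference, reading the sentence as "unless (16) is not supported". This is standalone C, not a patch: build_sync_cache_cdb() and the sc16_supported flag are hypothetical, and how the driver would learn whether (16) is supported is left open. The opcode values are the standard ones (35h and 91h); the LBA and NUMBER OF LOGICAL BLOCKS fields are left at zero, which requests a flush of the whole medium:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SYNCHRONIZE_CACHE	0x35	/* SYNCHRONIZE CACHE (10) */
#define SYNCHRONIZE_CACHE_16	0x91	/* SYNCHRONIZE CACHE (16) */

/*
 * Fill in a whole-medium SYNCHRONIZE CACHE CDB and return its length.
 * The caller supplies a 16-byte buffer; sc16_supported is a
 * hypothetical per-device flag.
 */
static size_t build_sync_cache_cdb(unsigned char cdb[16], bool sc16_supported)
{
	assert(cdb != NULL);

	memset(cdb, 0, 16);	/* zero LBA and block count: flush everything */

	if (sc16_supported) {
		cdb[0] = SYNCHRONIZE_CACHE_16;
		return 16;
	}

	cdb[0] = SYNCHRONIZE_CACHE;
	return 10;
}
```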

---
Rob Elliott HP Server Storage

2014-07-18 15:10:51

On 14-07-18 07:41 AM, James Bottomley wrote:
> On Fri, 2014-07-18 at 17:17 +0000, Elliott, Robert (Server Storage)
> wrote:
>>
>>
>>> From: James Bottomley [mailto:[email protected]]
>>>
>>> On Fri, 2014-07-18 at 00:51 +0000, Elliott, Robert (Server Storage)
>>> wrote:
>> ...
>>>>
>>>> Also, in both sd_setup_flush_cmnd and sd_sync_cache:
>>>> cmd->cmnd[0] = SYNCHRONIZE_CACHE;
>>>> cmd->cmd_len = 10;
>>>>
>>>> SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE
>>>> CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported.
>>
>> (sorry - meant "unless ... 16 is not supported")
>
> Yes, I guessed that?
>
>>> For what reason? We usually go for the safe alternatives, which is 10
>>> byte commands because they have the widest testing and greatest level of
>>> support. We don't do range flushes currently, so there doesn't seem to
>>> be a practical difference. If we did support range flushes, we'd likely
>>> only use SC(16) on >2TB devices.
>>>
>>> James
>>
>> A goal of the simplified SCSI feature set idea is to drop all the
>> short CDBs that have larger, more capable equivalents - don't carry
>> READ 6/10/12/16 and SYNCHRONIZE CACHE 10/16, just keep the 16-byte
>> versions. With modern serial IU-based protocols, short CDBs don't
>> save any transfer time. This will simplify design and testing on
>> both initiator and target sides. Competing command sets like NVMe
>> rightly point out that SCSI has too much legacy baggage - all you
>> need for IO is one READ, one WRITE, and one FLUSH command.
>
> But that's not relevant to us. This is the problem of practical vs
> standards approaches. We have to support older and buggy devices. Most
> small USB storage sticks die if they see 16 byte CDB commands because
> their interpreters. The more "legacy baggage" the standards committee
> dumps, the greater the number of heuristics OSs have to have to cope
> with the plethora of odd devices.
>
>> That's why SBC-3 ended up with these warning notes for all the
>> non-16 byte CDBs:
>>
>> NOTE 15 - Migration from the SYNCHRONIZE CACHE (10) command to
>> the SYNCHRONIZE CACHE (16) command is recommended for all
>> implementations.
>>
>> If the LBA field in SYNCHRONIZE CACHE went obsolete, then maybe
>> SYNCHRONIZE CACHE (10) would be kept instead of (16), but that
>> field is still present. So, (16) is the likely survivor.
>
> OK, but look at it from our point of view: The reply above justifies
> why we prefer 10 byte CDBs over 16. If the standards body ever did
> remove SC(10) completely, then we'd have to have yet another heuristic
> to recognise devices that don't support SC(10), but until that day,
> using SC(10) alone works in all cases, so is the better path for the OS.
>
> If you could, please get the standards body to recognise that the more
> command churn they introduce (in the name of rationalisation or
> whatever), the more problems they introduce for Operating Systems and
> the more likelihood (because of different people reading different
> revisions of standards) that we end up with compliance bugs in devices.

From the term "feature sets", I'm guessing T10 will follow what T13
does and have something like a VPD page with descriptors of feature
sets supported. Each set has mandatory and optional commands,
perhaps a similar categorization of mode and VPD pages as well. Such
a "clean slate" for SCSI would make it simpler in the future, at
least for what to put on the fast path. Perhaps some legacy
support could be pushed to user space.

For many technical areas "legacy" is a derogatory term, but
not necessarily for storage!

Doug Gilbert