On Thu, Feb 28, 2013 at 01:28:48PM +0000, Jan Beulich wrote:
> >>> On 28.02.13 at 13:00, Roger Pau Monn?<[email protected]> wrote:
> > On 28/02/13 12:19, Jan Beulich wrote:
> >>>>> On 28.02.13 at 11:28, Roger Pau Monne <[email protected]> wrote:
> >>> @@ -109,6 +111,16 @@ typedef uint64_t blkif_sector_t;
> >>> */
> >>> #define BLKIF_MAX_SEGMENTS_PER_REQUEST 11
> >>>
> >>> +#define BLKIF_MAX_INDIRECT_GREFS_PER_REQUEST 8
> >>> +
> >>> +struct blkif_request_segment_aligned {
> >>> + grant_ref_t gref; /* reference to I/O buffer frame */
> >>> + /* @first_sect: first sector in frame to transfer (inclusive). */
> >>> + /* @last_sect: last sector in frame to transfer (inclusive). */
> >>> + uint8_t first_sect, last_sect;
> >>> + uint16_t _pad; /* padding to make it 8 bytes, so it's cache-aligned */
> >>> +} __attribute__((__packed__));
> >>
> >> What's the __packed__ for here?
> >
> > Yes, that's not needed.
> >
> >>
> >>> +
> >>> struct blkif_request_rw {
> >>> uint8_t nr_segments; /* number of segments */
> >>> blkif_vdev_t handle; /* only for read/write requests */
> >>> @@ -138,11 +150,24 @@ struct blkif_request_discard {
> >>> uint8_t _pad3;
> >>> } __attribute__((__packed__));
> >>>
> >>> +struct blkif_request_indirect {
> >>> + uint8_t indirect_op;
> >>> + uint16_t nr_segments;
> >>> +#ifdef CONFIG_X86_64
> >>> + uint32_t _pad1; /* offsetof(blkif_...,u.indirect.id) == 8 */
> >>> +#endif
> >>
> >> Either you want the structure be packed tightly (and you don't care
> >> about misaligned fields), in which case you shouldn't need a padding
> >> field. That's even more so as there's no padding between indirect_op
> >> and nr_segments, so everything is misaligned anyway, and the
> >> comment above is wrong too (offsetof() really ought to yield 7 in
> >> that case).
> >
> > This padding is because we want to have the "id" field at the same
> > position as blkif_request_rw, so we need to add the padding for it to
> > match 32 & 64 bit blkif_request_rw structures, this prevents adding some
> > "if (req.op == BLKIF_OP_INDIRECT)..." if we only need to get the id of
> > the request.
>
> Oh, right, that's desirable of course.
>
> > The comment is indeed wrong, I've copied it from blkif_request_discard
> > and forgot to change the offset
>
> But the offset stated there then is right after all - I forgot that
> there is a 1-byte field outside the union (the way this is being done
> in the upstream Linux header is really ugly imo, but I guess Jeremy
> and/or Konrad liked it that way). That's also why the packed
> attribute is needed here.
I am not particularly found as I keep on forgetting about the 1-byte field
as well. If you have a patch to clean it up would love to see it.
>
> But you will probably want to switch sector_number and handle, so
> that sector_number becomes aligned, and add another 16-bit
> padding field between handle and indirect_grefs[].
>
> I also wonder whether "indirect_op" wouldn't better be named
> "actual_op" or just "op".
<nods> 'op' sounds good. With a comment saying it can do all of the BLKIF_OPS_..
except the BLKIF_OP_INDIRECT one. Thought one could in theory chain
it that way for fun.
>
> Jan
>>> On 04.03.13 at 21:44, Konrad Rzeszutek Wilk <[email protected]> wrote:
> <nods> 'op' sounds good. With a comment saying it can do all of the
> BLKIF_OPS_..
> except the BLKIF_OP_INDIRECT one. Thought one could in theory chain
> it that way for fun.
In fact I'd like to exclude chaining as well as BLKIF_OP_DISCARD here.
The former should - if useful for anything - be controlled by a
separate feature flag, and the latter is plain pointless to indirect.
And I reckon the same would apply to BLKIF_OP_FLUSH_DISKCACHE
and BLKIF_OP_RESERVED_1 - i.e. it might be better to state that
indirection is only permitted for normal I/O (read/write) ops.
Jan
On Tue, Mar 05, 2013 at 08:11:19AM +0000, Jan Beulich wrote:
> >>> On 04.03.13 at 21:44, Konrad Rzeszutek Wilk <[email protected]> wrote:
> > <nods> 'op' sounds good. With a comment saying it can do all of the
> > BLKIF_OPS_..
> > except the BLKIF_OP_INDIRECT one. Thought one could in theory chain
> > it that way for fun.
>
> In fact I'd like to exclude chaining as well as BLKIF_OP_DISCARD here.
> The former should - if useful for anything - be controlled by a
> separate feature flag, and the latter is plain pointless to indirect.
> And I reckon the same would apply to BLKIF_OP_FLUSH_DISKCACHE
> and BLKIF_OP_RESERVED_1 - i.e. it might be better to state that
> indirection is only permitted for normal I/O (read/write) ops.
<nods> That makes sense. And also of course the new BLKIF_OP should
be documented in the Xen tree as well.
>
> Jan
>
On 05/03/13 15:16, Konrad Rzeszutek Wilk wrote:
> On Tue, Mar 05, 2013 at 08:11:19AM +0000, Jan Beulich wrote:
>>>>> On 04.03.13 at 21:44, Konrad Rzeszutek Wilk <[email protected]> wrote:
>>> <nods> 'op' sounds good. With a comment saying it can do all of the
>>> BLKIF_OPS_..
>>> except the BLKIF_OP_INDIRECT one. Thought one could in theory chain
>>> it that way for fun.
>>
>> In fact I'd like to exclude chaining as well as BLKIF_OP_DISCARD here.
>> The former should - if useful for anything - be controlled by a
>> separate feature flag, and the latter is plain pointless to indirect.
>> And I reckon the same would apply to BLKIF_OP_FLUSH_DISKCACHE
>> and BLKIF_OP_RESERVED_1 - i.e. it might be better to state that
>> indirection is only permitted for normal I/O (read/write) ops.
>
> <nods> That makes sense. And also of course the new BLKIF_OP should
> be documented in the Xen tree as well.
The only ops that can be done indirectly are _READ, _WRITE and
_BARRIER/_FLUSH. From the implementation in blkfront it seems like
_FLUSH/_BARRIER requests can indeed contain segments, but I haven't been
able to spot any _FLUSH op with segments on it. Can you confirm FLUSH
requests never contain bios?
On Tue, Mar 05, 2013 at 06:00:51PM +0100, Roger Pau Monn? wrote:
> On 05/03/13 15:16, Konrad Rzeszutek Wilk wrote:
> > On Tue, Mar 05, 2013 at 08:11:19AM +0000, Jan Beulich wrote:
> >>>>> On 04.03.13 at 21:44, Konrad Rzeszutek Wilk <[email protected]> wrote:
> >>> <nods> 'op' sounds good. With a comment saying it can do all of the
> >>> BLKIF_OPS_..
> >>> except the BLKIF_OP_INDIRECT one. Thought one could in theory chain
> >>> it that way for fun.
> >>
> >> In fact I'd like to exclude chaining as well as BLKIF_OP_DISCARD here.
> >> The former should - if useful for anything - be controlled by a
> >> separate feature flag, and the latter is plain pointless to indirect.
> >> And I reckon the same would apply to BLKIF_OP_FLUSH_DISKCACHE
> >> and BLKIF_OP_RESERVED_1 - i.e. it might be better to state that
> >> indirection is only permitted for normal I/O (read/write) ops.
> >
> > <nods> That makes sense. And also of course the new BLKIF_OP should
> > be documented in the Xen tree as well.
>
> The only ops that can be done indirectly are _READ, _WRITE and
> _BARRIER/_FLUSH. From the implementation in blkfront it seems like
> _FLUSH/_BARRIER requests can indeed contain segments, but I haven't been
> able to spot any _FLUSH op with segments on it. Can you confirm FLUSH
> requests never contain bios?
Not FLUSH per say. But the FUA should be able to provide data and the
flush semantics with it. Except we don't support FUA so this is irrelevant
- unless in the future we want to intrduce FUA or WRITE with some extra
flags.