2002-08-02 11:33:43

by Steve Lord

[permalink] [raw]
Subject: A new ide warning message


In 2.5.30 I started getting these warning messages out ide during
the mount of an XFS filesystem:

ide-dma: received 1 phys segments, build 2

Can anyone translate that into English please.

Thanks,

Steve



2002-08-02 11:40:16

by Marcin Dalecki

[permalink] [raw]
Subject: Re: A new ide warning message

U?ytkownik Stephen Lord napisa?:
> In 2.5.30 I started getting these warning messages out ide during
> the mount of an XFS filesystem:
>
> ide-dma: received 1 phys segments, build 2
>
> Can anyone translate that into English please.

It can be found in pcidma.c.
It is repoting that we have one physical segment needed by
the request in question but the sctter gather list allocation
needed to break it up for mapping in two.


2002-08-02 11:43:51

by Jens Axboe

[permalink] [raw]
Subject: Re: A new ide warning message

On Fri, Aug 02 2002, Stephen Lord wrote:
>
> In 2.5.30 I started getting these warning messages out ide during
> the mount of an XFS filesystem:
>
> ide-dma: received 1 phys segments, build 2
>
> Can anyone translate that into English please.

Well I added that message when switching to the 2.5 style request
mapping functions, and I think the message is perfectly clear :-). Never
the less, it means that a segment that came into the ide layer with an
advertised size of 1 segment was returned from blk_rq_map_sg() as having
_two_. This can be a problem with dynamically allocated sg table (not
that ide uses those, but still).

It's a bug and usually a critical one when this happens. I'd be inclined
to think that Adam's changes in this path are to blame for this error.

Oh, and I'd be _really_ careful if you have trusted data on that drive
(surely not when running 2.5 ide on it :-)

--
Jens Axboe

2002-08-02 11:49:42

by Marcin Dalecki

[permalink] [raw]
Subject: Re: A new ide warning message

Uz.ytkownik Jens Axboe napisa?:
> On Fri, Aug 02 2002, Stephen Lord wrote:
>
>>In 2.5.30 I started getting these warning messages out ide during
>>the mount of an XFS filesystem:
>>
>>ide-dma: received 1 phys segments, build 2
>>
>>Can anyone translate that into English please.
>
>
> Well I added that message when switching to the 2.5 style request
> mapping functions, and I think the message is perfectly clear :-). Never
> the less, it means that a segment that came into the ide layer with an
> advertised size of 1 segment was returned from blk_rq_map_sg() as having
> _two_. This can be a problem with dynamically allocated sg table (not
> that ide uses those, but still).
>
> It's a bug and usually a critical one when this happens. I'd be inclined
> to think that Adam's changes in this path are to blame for this error.

Carefull carefull. it can be that the generic BIO code doesn't honour
the limits Adam was setting properly. And it can be of course
as well the XFS doesn't cooperate properly with those limits as well,
since ther kernel appears to be patched to support them.

It would be helpfull as well to know on which brand of host controller
chip this was found. In esp. trm290 maybe?
>
> Oh, and I'd be _really_ careful if you have trusted data on that drive
> (surely not when running 2.5 ide on it :-)
>


2002-08-02 11:49:58

by Jens Axboe

[permalink] [raw]
Subject: Re: A new ide warning message

On Fri, Aug 02 2002, Marcin Dalecki wrote:
> U?ytkownik Stephen Lord napisa?:
> >In 2.5.30 I started getting these warning messages out ide during
> >the mount of an XFS filesystem:
> >
> >ide-dma: received 1 phys segments, build 2
> >
> >Can anyone translate that into English please.
>
> It can be found in pcidma.c.
> It is repoting that we have one physical segment needed by
> the request in question but the sctter gather list allocation
> needed to break it up for mapping in two.

You don't seem to realise that this is a BUG (somewhere, could even be
in the generic mapping functions)! blk_rq_map_sg() must never map a
request to more entries that rq->nr_segments, that's just very wrong.

That's why I'm suspecting the recent pcidma changes. Just a feeling, I
have not looked at them.

--
Jens Axboe

2002-08-02 11:52:26

by Jens Axboe

[permalink] [raw]
Subject: Re: A new ide warning message

On Fri, Aug 02 2002, Marcin Dalecki wrote:
> Uz.ytkownik Jens Axboe napisa?:
> >On Fri, Aug 02 2002, Stephen Lord wrote:
> >
> >>In 2.5.30 I started getting these warning messages out ide during
> >>the mount of an XFS filesystem:
> >>
> >>ide-dma: received 1 phys segments, build 2
> >>
> >>Can anyone translate that into English please.
> >
> >
> >Well I added that message when switching to the 2.5 style request
> >mapping functions, and I think the message is perfectly clear :-). Never
> >the less, it means that a segment that came into the ide layer with an
> >advertised size of 1 segment was returned from blk_rq_map_sg() as having
> >_two_. This can be a problem with dynamically allocated sg table (not
> >that ide uses those, but still).
> >
> >It's a bug and usually a critical one when this happens. I'd be inclined
> >to think that Adam's changes in this path are to blame for this error.
>
> Carefull carefull. it can be that the generic BIO code doesn't honour
> the limits Adam was setting properly. And it can be of course
> as well the XFS doesn't cooperate properly with those limits as well,
> since ther kernel appears to be patched to support them.

Sure, I'm not claiming it's an IDE bug (please re-read the paragraph)
and my recent even said out loud that this could be a bug anywhere. It
also stresses that this bug can be quite serious and cause data
corruption, since this sort of thing is not expected to happen.

I'm suggesting that this is _most likely_ a bug introduced in the recent
pcidma changes. Or it could be a bug with a request getting screwed over
somewhere. Nasty corruption likely in that case as well.

--
Jens Axboe

2002-08-02 11:56:17

by Jens Axboe

[permalink] [raw]
Subject: Re: A new ide warning message

On Fri, Aug 02 2002, Jens Axboe wrote:
> On Fri, Aug 02 2002, Marcin Dalecki wrote:
> > U?ytkownik Stephen Lord napisa?:
> > >In 2.5.30 I started getting these warning messages out ide during
> > >the mount of an XFS filesystem:
> > >
> > >ide-dma: received 1 phys segments, build 2
> > >
> > >Can anyone translate that into English please.
> >
> > It can be found in pcidma.c.
> > It is repoting that we have one physical segment needed by
> > the request in question but the sctter gather list allocation
> > needed to break it up for mapping in two.
>
> You don't seem to realise that this is a BUG (somewhere, could even be
> in the generic mapping functions)! blk_rq_map_sg() must never map a
> request to more entries that rq->nr_segments, that's just very wrong.
>
> That's why I'm suspecting the recent pcidma changes. Just a feeling, I
> have not looked at them.

I'll take that back. Having looked at Adam's changes there are perfectly
fine. I'm now putting my money on IDE breakage somewhere instead. It
would be interesting to dump request state when this happens. Stephen,
can you reproduce this at will?

--
Jens Axboe

2002-08-02 12:04:43

by Steve Lord

[permalink] [raw]
Subject: Re: A new ide warning message

On Fri, 2002-08-02 at 06:48, Marcin Dalecki wrote:
> Uz.ytkownik Jens Axboe napisa?:
> > On Fri, Aug 02 2002, Stephen Lord wrote:
> >
> >>In 2.5.30 I started getting these warning messages out ide during
> >>the mount of an XFS filesystem:
> >>
> >>ide-dma: received 1 phys segments, build 2
> >>
> >>Can anyone translate that into English please.
> >
> >
> > Well I added that message when switching to the 2.5 style request
> > mapping functions, and I think the message is perfectly clear :-). Never
> > the less, it means that a segment that came into the ide layer with an
> > advertised size of 1 segment was returned from blk_rq_map_sg() as having
> > _two_. This can be a problem with dynamically allocated sg table (not
> > that ide uses those, but still).
> >
> > It's a bug and usually a critical one when this happens. I'd be inclined
> > to think that Adam's changes in this path are to blame for this error.
>
> Carefull carefull. it can be that the generic BIO code doesn't honour
> the limits Adam was setting properly. And it can be of course
> as well the XFS doesn't cooperate properly with those limits as well,
> since ther kernel appears to be patched to support them.
>

Well, this is happening when reading the log up from disk during
mount, we will be asking for somewhere around 32K of data at a
time, but it might not be well aligned. I will instrument it and
report back - will be a few hours, the box is at work and I just
tripped it up in some other code, I cannot reset it from here.

> It would be helpfull as well to know on which brand of host controller
> chip this was found. In esp. trm290 maybe?

Since it is down I cannot give you the ide boot messages right now,
but it is a Tyan Tiger BX motherboard using the built in IDE chipset,
so pretty generic stuff.

> >
> > Oh, and I'd be _really_ careful if you have trusted data on that drive
> > (surely not when running 2.5 ide on it :-)
> >
>

It's a scratch box, all it does is run development kernels and tests,
I am keeping 2.5 in a cage right now.

To answer your other question Jens, yes I can reproduce at will,
once I go in to the office.

Steve


2002-08-02 12:14:23

by Marcin Dalecki

[permalink] [raw]
Subject: Re: A new ide warning message

U?ytkownik Stephen Lord napisa?:
> On Fri, 2002-08-02 at 06:48, Marcin Dalecki wrote:
>
>>Uz.ytkownik Jens Axboe napisa?:
>>
>>>On Fri, Aug 02 2002, Stephen Lord wrote:
>>>
>>>
>>>>In 2.5.30 I started getting these warning messages out ide during
>>>>the mount of an XFS filesystem:
>>>>
>>>>ide-dma: received 1 phys segments, build 2
>>>>
>>>>Can anyone translate that into English please.
>>>
>>>
>>>Well I added that message when switching to the 2.5 style request
>>>mapping functions, and I think the message is perfectly clear :-). Never
>>>the less, it means that a segment that came into the ide layer with an
>>>advertised size of 1 segment was returned from blk_rq_map_sg() as having
>>>_two_. This can be a problem with dynamically allocated sg table (not
>>>that ide uses those, but still).
>>>
>>>It's a bug and usually a critical one when this happens. I'd be inclined
>>>to think that Adam's changes in this path are to blame for this error.
>>
>>Carefull carefull. it can be that the generic BIO code doesn't honour
>>the limits Adam was setting properly. And it can be of course
>>as well the XFS doesn't cooperate properly with those limits as well,
>>since ther kernel appears to be patched to support them.
>>
>
>
> Well, this is happening when reading the log up from disk during
> mount, we will be asking for somewhere around 32K of data at a
> time, but it might not be well aligned. I will instrument it and
> report back - will be a few hours, the box is at work and I just
> tripped it up in some other code, I cannot reset it from here.
>
>
>>It would be helpfull as well to know on which brand of host controller
>>chip this was found. In esp. trm290 maybe?
>
>
> Since it is down I cannot give you the ide boot messages right now,
> but it is a Tyan Tiger BX motherboard using the built in IDE chipset,
> so pretty generic stuff.
OK. Could you then deliberately change the following in ide/main.c

+ /* Most controllers cannot do transfers across 64kB boundaries.
+ trm290 can do transfers within a 4GB boundary, so it changes
+ this mask accordingly. */
+ ch->seg_boundary_mask = 0xffff;
+
+ /* Some chipsets (cs5530, any others?) think a 64kB transfer
+ is 0 byte transfer, so set the limit one sector smaller.
+ In the future, we may default to 64kB transfers and let
+ invidual chipsets with this problem change ch->max_segment_size. */
+ ch->max_segment_size = (1<<16) - 512;


I would in esp. like to see the result of setting ch->max_segment_size
= (1 << 15).



2002-08-02 12:27:33

by Jens Axboe

[permalink] [raw]
Subject: Re: A new ide warning message

On Fri, Aug 02 2002, Marcin Dalecki wrote:
> U?ytkownik Stephen Lord napisa?:
> >On Fri, 2002-08-02 at 06:48, Marcin Dalecki wrote:
> >
> >>Uz.ytkownik Jens Axboe napisa?:
> >>
> >>>On Fri, Aug 02 2002, Stephen Lord wrote:
> >>>
> >>>
> >>>>In 2.5.30 I started getting these warning messages out ide during
> >>>>the mount of an XFS filesystem:
> >>>>
> >>>>ide-dma: received 1 phys segments, build 2
> >>>>
> >>>>Can anyone translate that into English please.
> >>>
> >>>
> >>>Well I added that message when switching to the 2.5 style request
> >>>mapping functions, and I think the message is perfectly clear :-). Never
> >>>the less, it means that a segment that came into the ide layer with an
> >>>advertised size of 1 segment was returned from blk_rq_map_sg() as having
> >>>_two_. This can be a problem with dynamically allocated sg table (not
> >>>that ide uses those, but still).
> >>>
> >>>It's a bug and usually a critical one when this happens. I'd be inclined
> >>>to think that Adam's changes in this path are to blame for this error.
> >>
> >>Carefull carefull. it can be that the generic BIO code doesn't honour
> >>the limits Adam was setting properly. And it can be of course
> >>as well the XFS doesn't cooperate properly with those limits as well,
> >>since ther kernel appears to be patched to support them.
> >>
> >
> >
> >Well, this is happening when reading the log up from disk during
> >mount, we will be asking for somewhere around 32K of data at a
> >time, but it might not be well aligned. I will instrument it and
> >report back - will be a few hours, the box is at work and I just
> >tripped it up in some other code, I cannot reset it from here.
> >
> >
> >>It would be helpfull as well to know on which brand of host controller
> >>chip this was found. In esp. trm290 maybe?
> >
> >
> >Since it is down I cannot give you the ide boot messages right now,
> >but it is a Tyan Tiger BX motherboard using the built in IDE chipset,
> >so pretty generic stuff.
> OK. Could you then deliberately change the following in ide/main.c
>
> + /* Most controllers cannot do transfers across 64kB boundaries.
> + trm290 can do transfers within a 4GB boundary, so it changes
> + this mask accordingly. */
> + ch->seg_boundary_mask = 0xffff;
> +
> + /* Some chipsets (cs5530, any others?) think a 64kB transfer
> + is 0 byte transfer, so set the limit one sector smaller.
> + In the future, we may default to 64kB transfers and let
> + invidual chipsets with this problem change ch->max_segment_size.
> */
> + ch->max_segment_size = (1<<16) - 512;
>
>
> I would in esp. like to see the result of setting ch->max_segment_size
> = (1 << 15).

This might not be such a good idea, since the limit-bio-size etc stuff
isn't in yet, depending on _exactly_ how big the bio's xfs are building
are. If they are max 8 pages (I seem to recall so), then yeah the above
test would be nice to see. If they are bigger than 8 pages, then the
above would be a meaningless test.

I'll hack up a rq_dump() function to slap in pcidma.c as well.

--
Jens Axboe

2002-08-02 12:33:32

by Marcin Dalecki

[permalink] [raw]
Subject: Re: A new ide warning message

Uz.ytkownik Jens Axboe napisa?:

>>*/
>>+ ch->max_segment_size = (1<<16) - 512;
>>
>>
>>I would in esp. like to see the result of setting ch->max_segment_size
>>= (1 << 15).
>
>
> This might not be such a good idea, since the limit-bio-size etc stuff
> isn't in yet, depending on _exactly_ how big the bio's xfs are building
> are. If they are max 8 pages (I seem to recall so), then yeah the above
> test would be nice to see. If they are bigger than 8 pages, then the
> above would be a meaningless test.

Sure. I'm aware of this. And I haven't looked at the XFS code
yet, so I can only guess about it.

What I can do myself is just pushing this limit even lower just to
see at which point my own system (ext3) starts to turn tits up ...

> I'll hack up a rq_dump() function to slap in pcidma.c as well.

Yes that would be a "nice to have too".
But it is a request for actual data as far as I can see.

2002-08-02 12:38:17

by Jens Axboe

[permalink] [raw]
Subject: Re: A new ide warning message

On Fri, Aug 02 2002, Marcin Dalecki wrote:
> Uz.ytkownik Jens Axboe napisa?:
>
> >>*/
> >>+ ch->max_segment_size = (1<<16) - 512;
> >>
> >>
> >>I would in esp. like to see the result of setting ch->max_segment_size
> >>= (1 << 15).
> >
> >
> >This might not be such a good idea, since the limit-bio-size etc stuff
> >isn't in yet, depending on _exactly_ how big the bio's xfs are building
> >are. If they are max 8 pages (I seem to recall so), then yeah the above
> >test would be nice to see. If they are bigger than 8 pages, then the
> >above would be a meaningless test.
>
> Sure. I'm aware of this. And I haven't looked at the XFS code
> yet, so I can only guess about it.
>
> What I can do myself is just pushing this limit even lower just to
> see at which point my own system (ext3) starts to turn tits up ...
>
> >I'll hack up a rq_dump() function to slap in pcidma.c as well.
>
> Yes that would be a "nice to have too".
> But it is a request for actual data as far as I can see.

Yeah it's a request for data, what else could it be? That's the only
place where we call blk_rq_map_sg().

Stephen, please provoke with this patch applied. I hope it works, it's
untested :-)

--- drivers/ide/pcidma.c~ 2002-08-02 14:32:14.000000000 +0200
+++ drivers/ide/pcidma.c 2002-08-02 14:39:21.000000000 +0200
@@ -58,6 +58,24 @@
return ata_error(drive, rq, __FUNCTION__);
}

+static void rq_dump(struct request *rq, int build_segments)
+{
+ struct bio_vec *bvec;
+ struct bio *bio;
+ int i = 0, ibio = 0;
+
+ printk("pcidma: build %d segments, supplied %d/%d, sectors %ld/%d\n", build_segments, rq->nr_phys_segments, rq->nr_hw_segments, rq->nr_sectors, rq->current_nr_sectors);
+
+ rq_for_each_bio(bio, rq) {
+ bio->bi_flags &= ~(1 << BIO_SEG_VALID);
+ printk("bio %d: phys %d, hw %d\n", ibio, bio_phys_segments(rq->q, bio), bio_hw_segments(rq->q, bio));
+ bio_for_each_segment(bvec, bio, i) {
+ printk("segment %d: phys %lu, size %u\n", i, bvec_to_phys(bvec), bvec->bv_len);
+ }
+ ibio++;
+ }
+}
+
/*
* FIXME: taskfiles should be a map of pages, not a long virt address... /jens
* FIXME: I agree with Jens --mdcki!
@@ -107,7 +125,7 @@
nents = blk_rq_map_sg(&drive->queue, rq, ch->sg_table);

if (rq->q && nents > rq->nr_phys_segments)
- printk("ide-dma: received %d phys segments, build %d\n", rq->nr_phys_segments, nents);
+ rq_dump(rq, nents);

if (rq_data_dir(rq) == READ)
ch->sg_dma_direction = PCI_DMA_FROMDEVICE;

--
Jens Axboe

Subject: Re: A new ide warning message


On Fri, 2 Aug 2002, Jens Axboe wrote:

> On Fri, Aug 02 2002, Jens Axboe wrote:
> > On Fri, Aug 02 2002, Marcin Dalecki wrote:
> > > U?ytkownik Stephen Lord napisa?:
> > > >In 2.5.30 I started getting these warning messages out ide during
> > > >the mount of an XFS filesystem:
> > > >
> > > >ide-dma: received 1 phys segments, build 2
> > > >
> > > >Can anyone translate that into English please.
> > >
> > > It can be found in pcidma.c.
> > > It is repoting that we have one physical segment needed by
> > > the request in question but the sctter gather list allocation
> > > needed to break it up for mapping in two.
> >
> > You don't seem to realise that this is a BUG (somewhere, could even be
> > in the generic mapping functions)! blk_rq_map_sg() must never map a
> > request to more entries that rq->nr_segments, that's just very wrong.
> >
> > That's why I'm suspecting the recent pcidma changes. Just a feeling, I
> > have not looked at them.
>
> I'll take that back. Having looked at Adam's changes there are perfectly
> fine. I'm now putting my money on IDE breakage somewhere instead. It

Look again Jens. Adam's changes made IDE queue handling inconsistent.
hint: 2 * 127 != 255

But noticed warning deals with design of ll_rw_blk.c. ;-)
(right now max_segment_size have to be max bv->bv_len aligned)

Jens, please look at segment checking/counting code, it does it on
bv->bv_len (4kb most likely) not sector granuality...

So for not 4kb aligned max_segment_size we will get new segment...

Best fix will be to make block layer count sectors not bv->bv_len...


btw. I like Adam's patch but it was draft not to include in mainline (?).
--
Bartlomiej

> would be interesting to dump request state when this happens. Stephen,
> can you reproduce this at will?
>
> --
> Jens Axboe


2002-08-02 13:55:00

by Steve Lord

[permalink] [raw]
Subject: Re: A new ide warning message

On Fri, 2002-08-02 at 07:41, Jens Axboe wrote:
>
> Yeah it's a request for data, what else could it be? That's the only
> place where we call blk_rq_map_sg().
>
> Stephen, please provoke with this patch applied. I hope it works, it's
> untested :-)
>

Consider it provoked, I added a trace to the submit_bio call in XFS
as well, dumping sector number, bio length in bytes and number of
vector elements submitted:

submit_bio(READ, sector 0x414870, len 65536 vec_len 16
submit_bio(READ, sector 0x4148f0, len 65536 vec_len 16
pcidma: build 2 segments, supplied 1/16, sectors 128/8
bio 0: phys 1, hw 16
segment 0: phys 69599232, size 4096
segment 1: phys 69603328, size 4096
segment 2: phys 69607424, size 4096
segment 3: phys 69611520, size 4096
segment 4: phys 69615616, size 4096
segment 5: phys 69619712, size 4096
segment 6: phys 69623808, size 4096
segment 7: phys 69627904, size 4096
segment 8: phys 69632000, size 4096
segment 9: phys 69636096, size 4096
segment 10: phys 69640192, size 4096
segment 11: phys 69644288, size 4096
segment 12: phys 69648384, size 4096
segment 13: phys 69652480, size 4096
segment 14: phys 69656576, size 4096
segment 15: phys 69660672, size 4096
pcidma: build 2 segments, supplied 1/16, sectors 128/8
bio 0: phys 1, hw 16
segment 0: phys 69664768, size 4096
segment 1: phys 69668864, size 4096
segment 2: phys 69672960, size 4096
segment 3: phys 69677056, size 4096
segment 4: phys 69681152, size 4096
segment 5: phys 69685248, size 4096
segment 6: phys 69689344, size 4096
segment 7: phys 69693440, size 4096
segment 8: phys 69697536, size 4096
segment 9: phys 69701632, size 4096
segment 10: phys 69705728, size 4096
segment 11: phys 69709824, size 4096
segment 12: phys 69713920, size 4096
segment 13: phys 69718016, size 4096
segment 14: phys 69722112, size 4096
segment 15: phys 69726208, size 4096

And so on.....

The bio size being used is based purely on the BIO_MAX_SECTORS
constant, same code as ll_rw_kio. Looks like the direct I/O
path uses similar math.

Steve

--

Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: [email protected]

2002-08-02 14:17:22

by Jens Axboe

[permalink] [raw]
Subject: Re: A new ide warning message

On Fri, Aug 02 2002, Bartlomiej Zolnierkiewicz wrote:
>
> On Fri, 2 Aug 2002, Jens Axboe wrote:
>
> > On Fri, Aug 02 2002, Jens Axboe wrote:
> > > On Fri, Aug 02 2002, Marcin Dalecki wrote:
> > > > U?ytkownik Stephen Lord napisa?:
> > > > >In 2.5.30 I started getting these warning messages out ide during
> > > > >the mount of an XFS filesystem:
> > > > >
> > > > >ide-dma: received 1 phys segments, build 2
> > > > >
> > > > >Can anyone translate that into English please.
> > > >
> > > > It can be found in pcidma.c.
> > > > It is repoting that we have one physical segment needed by
> > > > the request in question but the sctter gather list allocation
> > > > needed to break it up for mapping in two.
> > >
> > > You don't seem to realise that this is a BUG (somewhere, could even be
> > > in the generic mapping functions)! blk_rq_map_sg() must never map a
> > > request to more entries that rq->nr_segments, that's just very wrong.
> > >
> > > That's why I'm suspecting the recent pcidma changes. Just a feeling, I
> > > have not looked at them.
> >
> > I'll take that back. Having looked at Adam's changes there are perfectly
> > fine. I'm now putting my money on IDE breakage somewhere instead. It
>
> Look again Jens. Adam's changes made IDE queue handling inconsistent.
> hint: 2 * 127 != 255
>
> But noticed warning deals with design of ll_rw_blk.c. ;-)
> (right now max_segment_size have to be max bv->bv_len aligned)

Yeah that's true, actually was just saying that on linux-scsi
yesterday/today.

> Jens, please look at segment checking/counting code, it does it on
> bv->bv_len (4kb most likely) not sector granuality...
>
> So for not 4kb aligned max_segment_size we will get new segment...
>
> Best fix will be to make block layer count sectors not bv->bv_len...

Well I'm inclined to just make that page size granularity. It's like
that in 2.4 as well (no guarentees that we will honor anything less than
that granularity).

> btw. I like Adam's patch but it was draft not to include in mainline (?).

The concept is sound, so it has a bug... I can say the same for other
stuff in the kernel as well :-)

I probably just wanted more review (my 1 minute review surely wasn't
enough).

--
Jens Axboe

Subject: Re: A new ide warning message


On Fri, 2 Aug 2002, Jens Axboe wrote:

> On Fri, Aug 02 2002, Bartlomiej Zolnierkiewicz wrote:
> >
> > On Fri, 2 Aug 2002, Jens Axboe wrote:
> >
> > > On Fri, Aug 02 2002, Jens Axboe wrote:
> > > > On Fri, Aug 02 2002, Marcin Dalecki wrote:
> > > > > U?ytkownik Stephen Lord napisa?:
> > > > > >In 2.5.30 I started getting these warning messages out ide during
> > > > > >the mount of an XFS filesystem:
> > > > > >
> > > > > >ide-dma: received 1 phys segments, build 2
> > > > > >
> > > > > >Can anyone translate that into English please.
> > > > >
> > > > > It can be found in pcidma.c.
> > > > > It is repoting that we have one physical segment needed by
> > > > > the request in question but the sctter gather list allocation
> > > > > needed to break it up for mapping in two.
> > > >
> > > > You don't seem to realise that this is a BUG (somewhere, could even be
> > > > in the generic mapping functions)! blk_rq_map_sg() must never map a
> > > > request to more entries that rq->nr_segments, that's just very wrong.
> > > >
> > > > That's why I'm suspecting the recent pcidma changes. Just a feeling, I
> > > > have not looked at them.
> > >
> > > I'll take that back. Having looked at Adam's changes there are perfectly
> > > fine. I'm now putting my money on IDE breakage somewhere instead. It
> >
> > Look again Jens. Adam's changes made IDE queue handling inconsistent.
> > hint: 2 * 127 != 255
> >
> > But noticed warning deals with design of ll_rw_blk.c. ;-)
> > (right now max_segment_size have to be max bv->bv_len aligned)
>
> Yeah that's true, actually was just saying that on linux-scsi
> yesterday/today.

:-)

> > Jens, please look at segment checking/counting code, it does it on
> > bv->bv_len (4kb most likely) not sector granuality...
> >
> > So for not 4kb aligned max_segment_size we will get new segment...
> >
> > Best fix will be to make block layer count sectors not bv->bv_len...
>
> Well I'm inclined to just make that page size granularity. It's like
> that in 2.4 as well (no guarentees that we will honor anything less than
> that granularity).

Anyway it must be also something diffirent - __make_request() should have
noticed that rq has 2 segments not 1... this puzzles me a bit.

This case also shows limits of BIO_MAX_SECTORS again (Adam worked on
generic solution, but I don't know current state). There some devices
which set q->max_sectors to 64, i.e. broken ide-floppy driver ;-)

> > btw. I like Adam's patch but it was draft not to include in mainline (?).
>
> The concept is sound, so it has a bug... I can say the same for other
> stuff in the kernel as well :-)

Yes. :-)

> I probably just wanted more review (my 1 minute review surely wasn't
> enough).
>
> --
> Jens Axboe

Greets
--
Bartlomiej

2002-08-02 16:08:25

by Jens Axboe

[permalink] [raw]
Subject: Re: A new ide warning message

On Fri, Aug 02 2002, Bartlomiej Zolnierkiewicz wrote:
>
> On Fri, 2 Aug 2002, Jens Axboe wrote:
>
> > On Fri, Aug 02 2002, Bartlomiej Zolnierkiewicz wrote:
> > >
> > > On Fri, 2 Aug 2002, Jens Axboe wrote:
> > >
> > > > On Fri, Aug 02 2002, Jens Axboe wrote:
> > > > > On Fri, Aug 02 2002, Marcin Dalecki wrote:
> > > > > > U?ytkownik Stephen Lord napisa?:
> > > > > > >In 2.5.30 I started getting these warning messages out ide during
> > > > > > >the mount of an XFS filesystem:
> > > > > > >
> > > > > > >ide-dma: received 1 phys segments, build 2
> > > > > > >
> > > > > > >Can anyone translate that into English please.
> > > > > >
> > > > > > It can be found in pcidma.c.
> > > > > > It is repoting that we have one physical segment needed by
> > > > > > the request in question but the sctter gather list allocation
> > > > > > needed to break it up for mapping in two.
> > > > >
> > > > > You don't seem to realise that this is a BUG (somewhere, could even be
> > > > > in the generic mapping functions)! blk_rq_map_sg() must never map a
> > > > > request to more entries that rq->nr_segments, that's just very wrong.
> > > > >
> > > > > That's why I'm suspecting the recent pcidma changes. Just a feeling, I
> > > > > have not looked at them.
> > > >
> > > > I'll take that back. Having looked at Adam's changes there are perfectly
> > > > fine. I'm now putting my money on IDE breakage somewhere instead. It
> > >
> > > Look again Jens. Adam's changes made IDE queue handling inconsistent.
> > > hint: 2 * 127 != 255
> > >
> > > But noticed warning deals with design of ll_rw_blk.c. ;-)
> > > (right now max_segment_size have to be max bv->bv_len aligned)
> >
> > Yeah that's true, actually was just saying that on linux-scsi
> > yesterday/today.
>
> :-)
>
> > > Jens, please look at segment checking/counting code, it does it on
> > > bv->bv_len (4kb most likely) not sector granuality...
> > >
> > > So for not 4kb aligned max_segment_size we will get new segment...
> > >
> > > Best fix will be to make block layer count sectors not bv->bv_len...
> >
> > Well I'm inclined to just make that page size granularity. It's like
> > that in 2.4 as well (no guarentees that we will honor anything less than
> > that granularity).
>
> Anyway it must be also something diffirent - __make_request() should have
> noticed that rq has 2 segments not 1... this puzzles me a bit.

Well yes, that's the whole bug this thread is about, and the puzzling
bit as you say :-)

I haven't even had time to look at Stephen's dump yet, one thing that
did stand out immediately was that phys_segments was 1 even for the big
16 segment requests, which should not be with < 64k max segment size. It
should indeed have been one. So it looks like a one-off or something in
the generic dma mappers.

Steve, make BIO_MAX_SECTORS 4k smaller for now and it should work. I
think.

> This case also shows limits of BIO_MAX_SECTORS again (Adam worked on
> generic solution, but I don't know current state). There some devices
> which set q->max_sectors to 64, i.e. broken ide-floppy driver ;-)

I have code ready for this, I'll share starting next week. Need to
finish lots of other stuff too :/

--
Jens Axboe

Subject: Re: A new ide warning message


Jens, I think bio segment boundary code is dead wrong :-).

include/linux/bio.h:
#define BIOVEC_SEG_BOUNDARY(q, b1, b2) \
(((addr1) | (mask)) == (((addr2) - 1) | (mask)))
#define BIOVEC_SEG_BOUNDARY(q, b1, b2) \
__BIO_SEG_BOUNDARY(bvec_to_phys((b1)), bvec_to_phys((b2)) +
(b2)->bv_len, (q)->seg_boundary_mask)

With this code we are getting new hw segment for each bio vector... :\

On 2 Aug 2002, Steve Lord wrote:

> On Fri, 2002-08-02 at 07:41, Jens Axboe wrote:
> >
> > Yeah it's a request for data, what else could it be? That's the only
> > place where we call blk_rq_map_sg().
> >
> > Stephen, please provoke with this patch applied. I hope it works, it's
> > untested :-)
> >
>
> Consider it provoked, I added a trace to the submit_bio call in XFS
> as well, dumping sector number, bio length in bytes and number of
> vector elements submitted:
>
> submit_bio(READ, sector 0x414870, len 65536 vec_len 16
> submit_bio(READ, sector 0x4148f0, len 65536 vec_len 16
> pcidma: build 2 segments, supplied 1/16, sectors 128/8
> bio 0: phys 1, hw 16
> segment 0: phys 69599232, size 4096
> segment 1: phys 69603328, size 4096
> segment 2: phys 69607424, size 4096
> segment 3: phys 69611520, size 4096
> segment 4: phys 69615616, size 4096
> segment 5: phys 69619712, size 4096
> segment 6: phys 69623808, size 4096
> segment 7: phys 69627904, size 4096
> segment 8: phys 69632000, size 4096
> segment 9: phys 69636096, size 4096
> segment 10: phys 69640192, size 4096
> segment 11: phys 69644288, size 4096
> segment 12: phys 69648384, size 4096
> segment 13: phys 69652480, size 4096
> segment 14: phys 69656576, size 4096
> segment 15: phys 69660672, size 4096
> pcidma: build 2 segments, supplied 1/16, sectors 128/8
> bio 0: phys 1, hw 16
> segment 0: phys 69664768, size 4096
> segment 1: phys 69668864, size 4096
> segment 2: phys 69672960, size 4096
> segment 3: phys 69677056, size 4096
> segment 4: phys 69681152, size 4096
> segment 5: phys 69685248, size 4096
> segment 6: phys 69689344, size 4096
> segment 7: phys 69693440, size 4096
> segment 8: phys 69697536, size 4096
> segment 9: phys 69701632, size 4096
> segment 10: phys 69705728, size 4096
> segment 11: phys 69709824, size 4096
> segment 12: phys 69713920, size 4096
> segment 13: phys 69718016, size 4096
> segment 14: phys 69722112, size 4096
> segment 15: phys 69726208, size 4096
>
> And so on.....
>
> The bio size being used is based purely on the BIO_MAX_SECTORS
> constant, same code as ll_rw_kio. Looks like the direct I/O
> path uses similar math.
>
> Steve
>
> --
>
> Steve Lord voice: +1-651-683-3511
> Principal Engineer, Filesystem Software email: [email protected]

2002-08-03 03:25:06

by Adam J. Richter

[permalink] [raw]
Subject: Re: A new ide warning message

>On Fri, Aug 02 2002, Bartlomiej Zolnierkiewicz wrote:
>This case also shows limits of BIO_MAX_SECTORS again (Adam worked on
>generic solution, but I don't know current state).

In case this information is helpful to anyone, here is
the current status of three different variants of it.

1. bio_max_iovecs() version - This just replaced BIO_MAX_SECTORS.
It was simple. It worked, but everyone got more ambitious about making
more sweeping changes. Jens wanted a boolean one_more_bvec(bio, bvec)
interface. I made a void bio_append(bio, bvec) interface that would submit
bio that could not hold another bvec and always succeed. I beleve that
the bio_max_iovecs should have gone in while we worked on other changes
and I still believe that they should go in in the meantime. People's
systems are crashing while we argue about more ambitious solutions.
I do not have a patch against 2.5.30 handy, but I would be happy to
generate a patch for this again if there is interest in getting it
into Linus's tree while we await future changes that might replace
this code.

2. bio_append + mpage.c changes - This changed mpage.c to be
able to handle all IO requests that are thrown at it (i.e., it never
punts to fs/buffer.c) . However, since 2.5.27(?) it corrupts the
disk on an ext2 file system with a 1024-byte block size (I have only
tried it on ext2 file systems and with block sizes of 1024 or 4096 bytes).

3. bio_append + mpage.c changes + fs/buffer.c shrink - Same as
above, but had fs/mpage.c routines replace the fs/buffer.c routines
entirely, deleting a bunch of buffer_head oriented code from fs/buffer.c.
These changes corrupt the disk when the device is opened with a block
size of 1024 (as opposed ot 4096). This is probably related to the
bug described in the previous paragraph.

#2 and #3 are closely related. Perhaps I will stare at
the code for #2 some more and try to find that @#$Q@#$ bug.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."
`e

2002-08-03 03:29:40

by Adam J. Richter

[permalink] [raw]
Subject: Re: A new ide warning message

Bartlomiej Zolnierkiewicz wrote:
>Look again Jens. Adam's changes made IDE queue handling inconsistent.
>hint: 2 * 127 != 255

I've discussed this with Bartlomiej by email now and we've
determined that we he was referring to was not a bug after all.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

2002-08-06 12:49:40

by Marcin Dalecki

[permalink] [raw]
Subject: Re: A new ide warning message

Uz.ytkownik Bartlomiej Zolnierkiewicz napisa?:

> Look again Jens. Adam's changes made IDE queue handling inconsistent.
> hint: 2 * 127 != 255
>
> But noticed warning deals with design of ll_rw_blk.c. ;-)
> (right now max_segment_size have to be max bv->bv_len aligned)
>
> Jens, please look at segment checking/counting code, it does it on
> bv->bv_len (4kb most likely) not sector granuality...
>
> So for not 4kb aligned max_segment_size we will get new segment...
>
> Best fix will be to make block layer count sectors not bv->bv_len...
>
>
> btw. I like Adam's patch but it was draft not to include in mainline (?).

One never ever get's anything then drafts from Adam ;-) And since
I can't reproducde the breakage myself on any system I test
and since the patch looked really smooth...
Ej ej...