I am not on the kernel mailing list, so please CC me.
I upgraded my computer's RAM from 512MB to 2GB and compiled in high
memory support (4GB). My system uses XFS as its root filesystem; my
distro is Debian sid. The machine is an Athlon 700 MHz Slot A on an Abit
KA7 (KX133-based chipset), and it is all SCSI. My SCSI card is an
Adaptec 2940UW (I also have a 2940U2W I can swap in). I have two IBM
hard drives, an 18GB (DNES-318350W) and a 4GB (DDRS-34560W), plus a SCSI
Zip 100 and a Plextor 12/10/32S.
When I upgraded to 2GB of RAM I was running 2.4.10, then 2.4.14 and
2.4.16; each of them gave a kernel panic, but none of them do it with
highmem off. Just like the other reports, it only happens when copying
from the device mounted as /. In my case the second hard drive and the
zip drive both have partitions that can be mounted as vfat, and the
first hard drive also has a partition I mount as vfat. It doesn't matter
whether I copy to the first hard drive, the second hard drive, or the
zip drive; I get the same error.
I've also seen the message about locking the tag count at 64 (somebody
in the archives said that doing that is normal; I've had it for years
and it doesn't hurt anything, so I ignore it, but why does it do that?).
I read in the archives that somebody said the problem goes away if they
change the #define NSEG from 128 to 512. It went away for me too when I
did that and recompiled the kernel.
I found this in aic7xxx_osm.h, suggesting the 128 setting was only meant
for the highmem-off case:
/*
* Number of SG segments we require. So long as the S/G segments for
* a particular transaction are allocated in a physically contiguous
* manner and are allocated below 4GB, the number of S/G segments is
* unrestricted.
*/
I applied the debug patch with NSEG set to 128 (the default) on kernel
2.4.16; here are my results:
too few segs for dma mapping. Increase AHC_NSEG
invalid operand: 0000
CPU: 0
EIP: 0010 : [<c0235db5>] not tainted
EFLAGS: 00010046
eax: ffffffff ebx: f7bc82a0 ecx: c0344000 edx: 00000000
esi: 00001000 edi: f6ca1000 ebp: f7bc4fc0 esp: c0345f18
ds: 0018 es: 0018 ss: 0018
Process swapper (pid:0, stackpage=c0345000)
Stack: c301ac00 c0345f5c 0000000f c0345fa8 c0345fa8 f7b4a000 c0239198 c301ac00
       41bc82a0 c3010002 f7b44200 c0236349 c301ac00 c301f780 f7bd0340 04000001
       c0117eba 00000292 c010818f 0000000f c301ac00 c0345fa8 000003c0 c0378cc0
Call trace: [<c0239198>] [<c0236349>] [<c0117eba>] [<c010818f>] [<c010830e>]
       [<c0105390>] [<c01053b3>] [<c0105422>] [<c0105000>] [<c0105027>]
code: 0f 0b 80 4f 07 80 8b 03 8b 53 2c 83 ca 02 89 50 14 8b 13 8b
<0> Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
If anyone would like me to do any extra troubleshooting, I'm game; just
make sure you CC me on the email.
thanks
LBJM
>When I upgraded to 2GB of RAM I was running 2.4.10, then 2.4.14 and
>2.4.16; each of them gave a kernel panic, but none of them do it with
>highmem off.
I am still investigating the cause of this particular problem. In
fact we are building up a new system today in the hope of being able
to reproduce and solve this problem.
>I've also seen the message about locking the tag count at 64 (somebody
>in the archives said that doing that is normal; I've had it for years
>and it doesn't hurt anything, so I ignore it, but why does it do that?).
It's an informational message, not an error. The system is telling
you that it has dynamically determined the maximum queue depth of
the device.
>I read in the archives that somebody said the problem goes away if they
>change the #define NSEG from 128 to 512. It went away for me too when I
>did that and recompiled the kernel.
>I found this in aic7xxx_osm.h, suggesting the 128 setting was only meant
>for the highmem-off case:
>/*
> * Number of SG segments we require. So long as the S/G segments for
> * a particular transaction are allocated in a physically contiguous
> * manner and are allocated below 4GB, the number of S/G segments is
> * unrestricted.
> */
This is merely saying that the controller requires that the S/G list is
allocated below 4GB in the PCI bus address space (note that on some platforms
this may not mean the same thing as allocated within the first 4GB of
physical memory).
--
Justin
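A minimal sketch of the allocation rule described above: consistent PCI memory honours the device's DMA mask, so with the default 32-bit mask an S/G list carved out of such a block stays below 4GB in bus address space and is physically contiguous. Only the pci_* calls are the real 2.4-era API; the structure and function names are hypothetical.

#include <linux/pci.h>
#include <linux/errno.h>

struct sketch_sg_block {
        void            *vaddr;         /* kernel virtual address of the S/G list */
        dma_addr_t      bus_addr;       /* bus address handed to the controller   */
};

static int sketch_alloc_sg_list(struct pci_dev *pdev, size_t bytes,
                                struct sketch_sg_block *blk)
{
        /* pci_alloc_consistent() returns memory the device can reach with
         * its current DMA mask, i.e. below 4GB for the default 32-bit mask. */
        blk->vaddr = pci_alloc_consistent(pdev, bytes, &blk->bus_addr);
        if (blk->vaddr == NULL)
                return -ENOMEM;
        return 0;
}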
On Mon, Dec 10 2001, Justin T. Gibbs wrote:
> >When I upgraded to 2GB of RAM I was running 2.4.10, then 2.4.14 and
> >2.4.16; each of them gave a kernel panic, but none of them do it with
> >highmem off.
>
> I am still investigating the cause of this particular problem. In
> fact we are building up a new system today in the hope of being able
> to reproduce and solve this problem.
Ok I decided to try and trace this. You set sg_tablesize to AHC_NSEG, so
the SCSI mid layer merging functions will at best create an AHC_NSEG
sized request for you. So far, so good. Now the aic7xxx queuer is entered,
and eventually we end up in ahc_linux_run_device_queue to process the
queued scsi_cmnd. You call pci_map_sg on the supplied scatter gather
list, which returns nseg segments. On x86, nseg will equal the number of
scatter gather members initially created (no iommu to do funky tricks),
so for our test case here nseg still equals AHC_NSEG. Now you build and
map the segments; the gist of that loop is something like
/*
* The sg_count may be larger than nseg if
* a transfer crosses a 32bit page.
*/
hmm, here it already starts to smell fishy...
scb->sg_count = 0;
while(cur_seg < end_seg) {
bus_addr_t addr;
bus_size_t len;
int consumed;
addr = sg_dma_address(cur_seg);
len = sg_dma_len(cur_seg);
consumed = ahc_linux_map_seg(ahc, scb, sg, addr, len);
ahc_linux_map_seg checks if scb->sg_count gets bigger than AHC_NSEG, in
fact the test is
if (scb->sg_count + 1 > AHC_NSEG)
panic()
What am I missing here?? I see nothing preventing hitting this panic in
some circumstances.
if (scb->sg_count + 2 > AHC_NSEG)
panic()
weee, we crossed a 4gb boundary and suddenly we have bigger problems
yet. Ok, so what I think the deal is here is that AHC_NSEG are two
different things to your driver and the mid layer.
Am I missing something? It can't be this obvious.
--
Jens Axboe
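A simplified sketch of the splitting behaviour walked through above, not the driver's actual code: each mapped DMA segment is copied into the SCB-style S/G array and broken in two when it would straddle a 4GB boundary, which is why sg_count can exceed the nseg returned by pci_map_sg(). AHC_NSEG is the driver's constant; everything else here is hypothetical.

#include <linux/types.h>

#define AHC_NSEG 128

struct sketch_seg {
        u64 addr;
        u32 len;
};

/* Append one mapped segment to the list, splitting it if it crosses a
 * 4GB boundary.  Returns the new entry count, or -1 on overflow (the
 * situation the driver's panic() check guards against). */
static int sketch_map_seg(struct sketch_seg *list, int count, u64 addr, u32 len)
{
        while (len != 0) {
                u64 next_4gb = (addr | 0xFFFFFFFFULL) + 1;  /* next 4GB line  */
                u64 space    = next_4gb - addr;             /* room before it */
                u32 chunk    = (len < space) ? len : (u32)space;

                if (count + 1 > AHC_NSEG)
                        return -1;
                list[count].addr = addr;
                list[count].len  = chunk;
                count++;
                addr += chunk;
                len  -= chunk;
        }
        return count;
}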
>Ok I decided to try and trace this.
...
> /*
> * The sg_count may be larger than nseg if
> * a transfer crosses a 32bit page.
> */
>
>hmm, here it already starts to smell fishy...
>
> scb->sg_count = 0;
> while(cur_seg < end_seg) {
> bus_addr_t addr;
> bus_size_t len;
> int consumed;
>
> addr = sg_dma_address(cur_seg);
> len = sg_dma_len(cur_seg);
> consumed = ahc_linux_map_seg(ahc, scb, sg, addr, len);
>
>ahc_linux_map_seg checks if scb->sg_count gets bigger than AHC_NSEG, in
>fact the test is
>
> if (scb->sg_count + 1 > AHC_NSEG)
> panic()
>
>What am I missing here?? I see nothing preventing hitting this panic in
>some circumstances.
If you don't cross a 4GB boundary, this is the same as a static test
that you never have more than AHC_NSEG segments.
> if (scb->sg_count + 2 > AHC_NSEG)
> panic()
>
>weee, we crossed a 4gb boundary and suddenly we have bigger problems
>yet. Ok, so what I think the deal is here is that AHC_NSEG are two
>different things to your driver and the mid layer.
>
>Am I missing something? It can't be this obvious.
You will never cross a 4GB boundary on a machine with only 2GB of
physical memory. This report and another I have received are for
configurations with 2GB or less memory. This is not the cause of the
problem. Further, after this code was written, David Miller made the
comment that an I/O that crosses a 4GB boundary will never be generated
for the exact same reason that this check is included in the aic7xxx
driver - you can't cross a 4GB page in a single PCI DAC transaction.
I should go verify that this is really the case in recent 2.4.X kernels.
Saying that AHC_NSEG and the segment count exported to the mid-layer are
two different things is true to some extent, but if the 4GB rule is not
honored by the mid-layer implicitly, I would have to tell the mid-layer
I can only handle half the number of segments I really can. This isn't
good for the memory footprint of the driver. The test was added to
protect against a situation that I don't believe can now happen in Linux.
In truth, the solution to these kinds of problems is to export alignment,
boundary, and range restrictions on memory mappings from the device
driver to the layer creating the mappings. This is the only way to
generically allow a device driver to export a true segment limit.
--
Justin
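For reference, later kernels did grow the kind of interface asked for here: the block layer can be told both a segment-count limit and a boundary mask that no merged segment may cross. A sketch, assuming the post-2.6 helpers blk_queue_max_segments() and blk_queue_segment_boundary(); the values are illustrative, not taken from the aic7xxx driver.

#include <linux/blkdev.h>

/* Advertise the device's true limits instead of defensively halving the
 * exported segment count. */
static void sketch_export_dma_limits(struct request_queue *q)
{
        blk_queue_max_segments(q, 128);                 /* hardware S/G depth */
        blk_queue_segment_boundary(q, 0xFFFFFFFFUL);    /* never cross 4GB    */
}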
On Mon, Dec 10 2001, Justin T. Gibbs wrote:
> >ahc_linux_map_seg checks if scb->sg_count gets bigger than AHC_NSEG, in
> >fact the test is
> >
> > if (scb->sg_count + 1 > AHC_NSEG)
> > panic()
> >
> >What am I missing here?? I see nothing preventing hitting this panic in
> >some circumstances.
>
> If you don't cross a 4GB boundary, this is the same as a static test
> that you never have more than AHC_NSEG segments.
Yes sorry, my off-by-one.
> > if (scb->sg_count + 2 > AHC_NSEG)
> > panic()
> >
> >weee, we crossed a 4gb boundary and suddenly we have bigger problems
> >yet. Ok, so what I think the deal is here is that AHC_NSEG are two
> >different things to your driver and the mid layer.
> >
> >Am I missing something? It can't be this obvious.
>
> You will never cross a 4GB boundary on a machine with only 2GB of
> physical memory. This report and another I have received are for
Of course not.
> configurations with 2GB or less memory. This is not the cause of the
> problem. Further, after this code was written, David Miller made the
> comment that an I/O that crosses a 4GB boundary will never be generated
> for the exact same reason that this check is included in the aic7xxx
> driver - you can't cross a 4GB page in a single PCI DAC transaction.
> I should go verify that this is really the case in recent 2.4.X kernels.
Right, we decided against ever doing that. In fact I added the very code
to do this in the block-highmem series -- however, this assumption
breaks down in the current 2.4 afair on 64-bit archs.
> Saying that AHC_NSEG and the segment count exported to the mid-layer are
> two different things is true to some extent, but if the 4GB rule is not
> honored by the mid-layer implicitly, I would have to tell the mid-layer
> I can only handle half the number of segments I really can. This isn't
> good for the memory footprint of the driver. The test was added to
> protect against a situation that I don't believe can now happen in Linux.
> In truth, the solution to these kinds of problems is to export alignment,
> boundary, and range restrictions on memory mappings from the device
> driver to the layer creating the mappings. This is the only way to
> generically allow a device driver to export a true segment limit.
I agree, and that is why I've already included code to do just that in
2.5.
--
Jens Axboe
On Mon, 10 Dec 2001, Jens Axboe wrote:
> On Mon, Dec 10 2001, Justin T. Gibbs wrote:
> > You will never cross a 4GB boundary on a machine with only 2GB of
> > physical memory. This report and another I have received are for
>
> Of course not.
>
> > configurations with 2GB or less memory. This is not the cause of the
> > problem. Further, after this code was written, David Miller made the
> > comment that an I/O that crosses a 4GB boundary will never be generated
> > for the exact same reason that this check is included in the aic7xxx
> > driver - you can't cross a 4GB page in a single PCI DAC transaction.
> > I should go verify that this is really the case in recent 2.4.X kernels.
>
> Right, we decided against ever doing that. In fact I added the very code
> to do this in the block-highmem series -- however, this assumption
> breaks down in the current 2.4 afair on 64-bit archs.
By the way, the DAC capable Symbios chips do not support a scatter entry
that crosses a 4GB boundary, by *design*. In fact they seem to be
internally 32 bit addressing based, with some additional logic to make DAC
possible, but they don't look like a true 64 bit engine.
And since these devices have had some ISA ancestors, they also have some
16 MB limit in some places. This applies to the max length that can be
DMAed per scatter entry. And since everything has bugs, some errata
applying to some early chip versions have some relationship with scatter
entries crossing a 16 MB or 32 MB boundary...
As a result, for Symbios devices, I want to ask upper layers not to
provide low-level drivers with scatter entries that cross a 16 MB boundary
(and obviously not larger than 16 MB, but that is a consequence). If Linux
does not allow this, then I can do nothing clean to fix the issue.
Btw, a 16 MB boundary limitation would have no significant impact on
performance and would have the goodness of avoiding some hardware bugs not
only on a few Symbios devices in my opinion. As we know, numerous modern
cores still have rests of the ISA epoch in their guts. So, in my opinion,
the 16 MB boundary limitation should be the default on systems where
reliability is the primary goal.
> > Saying that AHC_NSEG and the segment count exported to the mid-layer are
> > two different things is true to some extent, but if the 4GB rule is not
> > honored by the mid-layer implicitly, I would have to tell the mid-layer
> > I can only handle half the number of segments I really can. This isn't
> > good for the memory footprint of the driver. The test was added to
> > protect against a situation that I don't believe can now happen in Linux.
>
> > In truth, the solution to these kinds of problems is to export alignment,
> > boundary, and range restrictions on memory mappings from the device
> > driver to the layer creating the mappings. This is the only way to
> > generically allow a device driver to export a true segment limit.
>
> I agree, and that is why I've already included code to do just that in
> 2.5.
This shouldn't have been missed, in my opinion. :)
Gérard.
From: Gérard Roudier <[email protected]>
Date: Mon, 10 Dec 2001 20:21:21 +0100 (CET)
Btw, a 16 MB boundary limitation would have no significant impact on
performance and would have the goodness of avoiding some hardware bugs not
only on a few Symbios devices in my opinion. As we know, numerous modern
cores still have rests of the ISA epoch in their guts. So, in my opinion,
the 16 MB boundary limitation should be the default on systems where
reliability is the primary goal.
Complications arrive when IOMMU starts to remap things into a virtual
32-bit bus space as happens on several platforms now.
Jen's block layer knows nothing about what we will do here, since
he only really has access to physical addresses.
Only after the pci_map_sg() call can you inspect DMA addresses and
apply such workarounds.
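A sketch of the approach described above: only after pci_map_sg() are the bus addresses known, so that is the point where a driver could detect a segment violating a hardware restriction such as the 16 MB boundary Gérard mentions. What to do with an offending entry (split it, bounce it, or reject the request) is left open; the helper name and boundary value are illustrative.

#include <linux/pci.h>

#define SKETCH_BOUNDARY_MASK 0x00FFFFFFUL      /* 16 MB */

/* Map the list, then flag any segment whose bus range crosses a 16 MB line. */
static int sketch_map_and_check(struct pci_dev *pdev,
                                struct scatterlist *sgl, int nents)
{
        int mapped = pci_map_sg(pdev, sgl, nents, PCI_DMA_FROMDEVICE);
        int i;

        for (i = 0; i < mapped; i++) {
                dma_addr_t start = sg_dma_address(&sgl[i]);
                dma_addr_t end   = start + sg_dma_len(&sgl[i]) - 1;

                if ((start & ~(dma_addr_t)SKETCH_BOUNDARY_MASK) !=
                    (end   & ~(dma_addr_t)SKETCH_BOUNDARY_MASK))
                        return -1;      /* workaround (split or bounce) needed */
        }
        return mapped;
}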
On Mon, 10 Dec 2001, David S. Miller wrote:
> From: Gérard Roudier <[email protected]>
> Date: Mon, 10 Dec 2001 20:21:21 +0100 (CET)
>
> Btw, a 16 MB boundary limitation would have no significant impact on
> performance and would have the goodness of avoiding some hardware bugs not
> only on a few Symbios devices in my opinion. As we know, numerous modern
> cores still have rests of the ISA epoch in their guts. So, in my opinion,
> the 16 MB boundary limitation should be the default on systems where
> reliability is the primary goal.
>
> Complications arrive when IOMMU starts to remap things into a virtual
> 32-bit bus space as happens on several platforms now.
>
> Jen's block layer knows nothing about what we will do here, since
> he only really has access to physical addresses.
>
> Only after the pci_map_sg() call can you inspect DMA addresses and
> apply such workarounds.
Such workaround will add bloat on the already bloated for no relevant
reason 'generic scatter to physical' thing.
As you know, low-level drivers on Linux announce some maximum length for
the sg array. As you guess, in the worst case, each sg entry may have to
be cut into several real entry (hoped 2 maximum) due to boundary
limitations. At a first glance, low-level drivers should announce no more
than half their real sg length capability and also would have to rewalk
the entire sg list.
I used and was happy to do so when the scatter process was not generic.
If we want it to be generic, then we want it to do the needed work. If
generic means 'just bloated and clueless' then generic is a extreme bad
thing.
'virt_to_bus' + 'flat addressing model' was the 'just as complex as
needed' for DMA model and most (may-be > 99%) of existing physical
machines are just happy with such model. The DMA/BUS complexity all O/Ses
have invented nowadays is a useless misfeature when based on the reality,
in my opinion. So, I may just be dreaming, at the moment. :-)
If one really wants for some marketing reason to support these ugly and
stinky '32 bit machines that want to provide more than 4GB of memory by
shoe-horning complexity all over the place', one should use his brain,
when so-featured, prior to writing clueless code.
Speaking for the sym drivers, the sg list will NEVER be rewalked under any
O/S that want to provide a generic method to scatter the IO buffers. When
the normal semantic is supplied, as in the just complex as needed DMA
models, the drivers did and do things *right* regarding scatter/gather
without bloat.
Gérard.
PS: If I want the sym driver to register as a PCI driver under Linux, then
some generic probing scheme seems to be unconditionally applied by the PCI
code. This just disallows the USER DEFINED boot order in the controller
NVRAMs from being applied by the driver. Did I miss something, or is it
still some clueless new generic method that bites me once again, here?
On Tue, Dec 11 2001, Gérard Roudier wrote:
> On Mon, 10 Dec 2001, David S. Miller wrote:
>
> > From: Gérard Roudier <[email protected]>
> > Date: Mon, 10 Dec 2001 20:21:21 +0100 (CET)
> >
> > Btw, a 16 MB boundary limitation would have no significant impact on
> > performance and would have the goodness of avoiding some hardware bugs not
> > only on a few Symbios devices in my opinion. As we know, numerous modern
> > cores still have rests of the ISA epoch in their guts. So, in my opinion,
> > the 16 MB boundary limitation should be the default on systems where
> > reliability is the primary goal.
> >
> > Complications arrive when IOMMU starts to remap things into a virtual
> > 32-bit bus space as happens on several platforms now.
> >
> > Jen's block layer knows nothing about what we will do here, since
> > he only really has access to physical addresses.
> >
> > Only after the pci_map_sg() call can you inspect DMA addresses and
> > apply such workarounds.
>
> Such workaround will add bloat on the already bloated for no relevant
> reason 'generic scatter to physical' thing.
>
> As you know, low-level drivers on Linux announce some maximum length for
> the sg array. As you guess, in the worst case, each sg entry may have to
> be cut into several real entry (hoped 2 maximum) due to boundary
> limitations. At a first glance, low-level drivers should announce no more
> than half their real sg length capability and also would have to rewalk
> the entire sg list.
That's why these boundary limitations need to be known by the layer
building the requests for you.
> I used and was happy to do so when the scatter process was not generic.
> If we want it to be generic, then we want it to do the needed work. If
> generic means 'just bloated and clueless' then generic is a extreme bad
> thing.
>
> 'virt_to_bus' + 'flat addressing model' was the 'just as complex as
> needed' for DMA model and most (may-be > 99%) of existing physical
> machines are just happy with such model. The DMA/BUS complexity all O/Ses
> have invented nowadays is a useless misfeature when based on the reality,
> in my opinion. So, I may just be dreaming, at the moment. :-)
>
> If one really wants for some marketing reason to support these ugly and
> stinky '32 bit machines that want to provide more than 4GB of memory by
> shoe-horning complexity all over the place', one should use his brain,
> when so-featured, prior to writing clueless code.
First of all, virt_to_bus just cannot work on some architectures that
are just slightly more advanced than x86. I'm quite sure Davem is ready
to lecture you on this.
Second, you are misunderstanding the need for a page/offset instead of a
virtual-address model. It's _not_ for > 4GB machines, it's for machines
with highmem. You'll need this on the standard kernel to do I/O above
860MB, and that is definitely a much bigger part of the market. Heck,
lots of home users have 1GB or more with the RAM prices these days.
--
Jens Axboe
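A small sketch of the distinction being drawn here. A highmem page has no permanent kernel virtual address, so virt_to_bus() has nothing to work with; a (page, offset, length) description mapped with pci_map_page() works for any page, high or low. The wrapper name is hypothetical; pci_map_page() is the real interface added for highmem I/O.

#include <linux/pci.h>
#include <linux/mm.h>

static dma_addr_t sketch_map_highmem_buffer(struct pci_dev *pdev,
                                            struct page *page,
                                            unsigned long offset, size_t len)
{
        /* No page_address()/virt_to_bus() here: a highmem page may have no
         * kernel mapping at all, but it always has a struct page. */
        return pci_map_page(pdev, page, offset, len, PCI_DMA_TODEVICE);
}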
On Wed, Dec 12, 2001 at 10:36:54AM +0100, Jens Axboe wrote:
> On Tue, Dec 11 2001, Gérard Roudier wrote:
> > On Mon, 10 Dec 2001, David S. Miller wrote:
> >
> > > From: Gérard Roudier <[email protected]>
> > > Date: Mon, 10 Dec 2001 20:21:21 +0100 (CET)
> > >
> > > Btw, a 16 MB boundary limitation would have no significant impact on
> > > performance and would have the goodness of avoiding some hardware bugs not
> > > only on a few Symbios devices in my opinion. As we know, numerous modern
> > > cores still have rests of the ISA epoch in their guts. So, in my opinion,
> > > the 16 MB boundary limitation should be the default on systems where
> > > reliability is the primary goal.
> > >
> > > Complications arrive when IOMMU starts to remap things into a virtual
> > > 32-bit bus space as happens on several platforms now.
> > >
> > > Jen's block layer knows nothing about what we will do here, since
> > > he only really has access to physical addresses.
> > >
> > > Only after the pci_map_sg() call can you inspect DMA addresses and
> > > apply such workarounds.
> >
> > Such workaround will add bloat on the already bloated for no relevant
> > reason 'generic scatter to physical' thing.
> >
> > As you know, low-level drivers on Linux announce some maximum length for
> > the sg array. As you guess, in the worst case, each sg entry may have to
> > be cut into several real entry (hoped 2 maximum) due to boundary
> > limitations. At a first glance, low-level drivers should announce no more
> > than half their real sg length capability and also would have to rewalk
> > the entire sg list.
>
> That's why these boundary limitations need to be known by the layer
> building the requests for you.
>
> > I used and was happy to do so when the scatter process was not generic.
> > If we want it to be generic, then we want it to do the needed work. If
> > generic means 'just bloated and clueless' then generic is a extreme bad
> > thing.
> >
> > 'virt_to_bus' + 'flat addressing model' was the 'just as complex as
> > needed' for DMA model and most (may-be > 99%) of existing physical
> > machines are just happy with such model. The DMA/BUS complexity all O/Ses
> > have invented nowadays is a useless misfeature when based on the reality,
> > in my opinion. So, I may just be dreaming, at the moment. :-)
> >
> > If one really wants for some marketing reason to support these ugly and
> > stinky '32 bit machines that want to provide more than 4GB of memory by
> > shoe-horning complexity all over the place', one should use his brain,
> > when so-featured, prior to writing clueless code.
>
> First of all, virt_to_bus just cannot work on some architectures that
> are just slightly more advanced than x86. I'm quite sure Davem is ready
> to lecture you on this.
yes, the whole point of the iommu work (replacement for virt_to_bus) is
for the 64bit machines, not for the 32bit machines. It's to allow the
64bit machines to do zerocopy dma (no bounce buffers) on memory above 4G
with pci32 devices that don't support DAC.
>
> Second, you are misunderstanding the need for a page/offset instead of a
> virtual-address model. It's _not_ for > 4GB machines, it's for machines
> with highmem. You'll need this on the standard kernel to do I/O above
> 860MB, and that is definitely a much bigger part of the market. Heck,
> lots of home users have 1GB or more with the RAM prices these days.
>
> --
> Jens Axboe
>
Andrea
On Wed, 12 Dec 2001, Jens Axboe wrote:
> On Tue, Dec 11 2001, Gérard Roudier wrote:
[...]
> > As you know, low-level drivers on Linux announce some maximum length for
> > the sg array. As you guess, in the worst case, each sg entry may have to
> > be cut into several real entry (hoped 2 maximum) due to boundary
> > limitations. At a first glance, low-level drivers should announce no more
> > than half their real sg length capability and also would have to rewalk
> > the entire sg list.
>
> That's why these boundary limitations need to be known by the layer
> building the requests for you.
How can I tell the layer about boundaries ?
Will check if I missed some important change.
> > I used and was happy to do so when the scatter process was not generic.
> > If we want it to be generic, then we want it to do the needed work. If
> > generic means 'just bloated and clueless' then generic is a extreme bad
> > thing.
> >
> > 'virt_to_bus' + 'flat addressing model' was the 'just as complex as
> > needed' for DMA model and most (may-be > 99%) of existing physical
> > machines are just happy with such model. The DMA/BUS complexity all O/Ses
> > have invented nowadays is a useless misfeature when based on the reality,
> > in my opinion. So, I may just be dreaming, at the moment. :-)
> >
> > If one really wants for some marketing reason to support these ugly and
> > stinky '32 bit machines that want to provide more than 4GB of memory by
> > shoe-horning complexity all over the place', one should use his brain,
> > when so-featured, prior to writing clueless code.
>
> First of all, virt_to_bus just cannot work on some architectures that
> are just slightly more advanced than x86. I'm quite sure Davem is ready
> to lecture you on this.
>
> Second, you are misunderstanding the need for a page/offset instead of a
> virtual-address model. It's _not_ for > 4GB machines, it's for machines
> with highmem. You'll need this on the standard kernel to do I/O above
> 860MB, and that is definitely a much bigger part of the market. Heck,
> lots of home users have 1GB or more with the RAM prices these days.
I didn't misunderstand anything here, but I have probably been unclear.
The 3GB user + 1GB kernel - some room for vremap/vmalloc looks like a
Linuxish issue to me, and I wanted to be more general here. By the way,
speaking for myself, I do not use bloaty applications and hence, at least
in theory, it will be possible for me to use at least 2GB of physical
memory without any kind of highmem crap. My guess is that 2GB of physical
memory still encompasses 99% of the machines in use.
About what I call a '32 bit machine', the sparc64 with its s****d IOMMU
falls in this category. The CPU can do 64 bit operations and addressing,
but as seen from I/O, the silicon is some 32 bit out-of-age thing hacked
for 64 bit memory addressing capability and some proprietary BUS streaming
protocol.
FYI, my personal machine uses a ServerWorks LE chipset. The thing is 32
bit, but it is a clean design regarding buses. It is possible, for
example, for a device on one PCI bus to master another device on the
other PCI bus. And the PCI buses' bandwidth seems quite good.
For your memory refresh, virtual memory was invented to allow programs to
be larger than the physical memory. OTOH, all archs based on memory
segmentation have been replaced by a flat model, since segmentation led to
unbearable complexity. The current 32 bit to 64 bit transition resembles
the 8/16 bit and 16/32 bit transitions, adding the same kind of useless
complexity in software. This stinks a lot. Let me not encourage it for a
single second.
Gérard.
On Wed, 12 Dec 2001, Andrea Arcangeli wrote:
> On Wed, Dec 12, 2001 at 10:36:54AM +0100, Jens Axboe wrote:
> > > If one really wants for some marketing reason to support these ugly and
> > > stinky '32 bit machines that want to provide more than 4GB of memory by
> > > shoe-horning complexity all over the place', one should use his brain,
> > > when so-featured, prior to writing clueless code.
> >
> > First of all, virt_to_bus just cannot work on some architectures that
> > are just slightly more advanced than x86. I'm quite sure Davem is ready
> > to lecture you on this.
>
> yes, the whole point of the iommu work (replacement for virt_to_bus) is
> for the 64bit machines, not for the 32bit machines. It's to allow the
> 64bit machines to do zerocopy dma (no bounce buffers) on memory above 4G
> with pci32 devices that don't support DAC.
So, the PCI group should just have specified a 16 bit BUS and have told
that systems should implement some IOMMU in order to address the whole
memory. :-)
PCI was intended to be implemented as a LOCAL BUS with all agents on the
LOCAL BUS being able to talk with any other agent using a flat addressing
scheme. Your PCI thing does not look like true PCI to me, but rather like
some bad mutant that has every chance not to survive a long time.
Gérard.
On Wed, Dec 12, 2001 at 06:22:30PM +0100, Gérard Roudier wrote:
>
>
> On Wed, 12 Dec 2001, Andrea Arcangeli wrote:
>
> > On Wed, Dec 12, 2001 at 10:36:54AM +0100, Jens Axboe wrote:
>
> > > > If one really wants for some marketing reason to support these ugly and
> > > > stinky '32 bit machines that want to provide more than 4GB of memory by
> > > > shoe-horning complexity all over the place', one should use his brain,
> > > > when so-featured, prior to writing clueless code.
> > >
> > > First of all, virt_to_bus just cannot work on some architectures that
> > > are just slightly more advanced than x86. I'm quite sure Davem is ready
> > > to lecture you on this.
> >
> > yes, the whole point of the iommu work (replacement for virt_to_bus) is
> > for the 64bit machines, not for the 32bit machines. It's to allow the
> > 64bit machines to do zerocopy dma (no bounce buffers) on memory above 4G
> > with pci32 devices that don't support DAC.
>
> So, the PCI group should just have specified a 16 bit BUS and have told
> that systems should implement some IOMMU in order to address the whole
> memory. :-)
8)8)
> PCI was intended to be implemented as a LOCAL BUS with all agents on the
> LOCAL BUS being able to talk with any other agent using a flat addressing
> scheme. Your PCI thing does not look like true PCI to me, but rather like
> some bad mutant that has every chance not to survive a long time.
It's not only a matter of the MSB of the "max bus address" supported,
it's also a matter of "how much ram" can be under DMA at the same time.
With pci32 we have a window of 4G of physical ram that can be
simultaneously under I/O at the same time. The linux API (or better all
the drivers) are unfortunately broken so that if the 4G is overflowed
the kernel crashes (but this is a minor detail :). 16 bit would limit
way too much the amount of simultaneous I/O, 4G is reasonable instead
(incidentally this is why linux usually doesn't crash on the 64bit
boxes). For the few apps (like quadrics or myrinet) that need SG
windows larger than 4G we just require DAC support and we don't use the
iommu 32bit (also to avoid triggering the device driver bugs :).
But this is true PCI, you know the device has no clue it is writing over
4G, that's the iommu work to map a certain bus address into a certain
physical address.
You are completely right: what you mean is that if the PCI group
specified 64bit/DAC (instead of 32... or 16 :) as only possible way to
do DMA with PCI, we wouldn't need the iommu on the 64bit boxes. But
those pci32 cards exist and the iommu is better than the bounce
buffers... and after all the iommu/pci_map API is clean enough, except
when all the drivers do pci_map_single and think it cannot fail, because
it _can_ fail! (but of course the conversion was painful because a fail
path wasn't designed into the drivers and so now we can crash instead :)
I guess passing through pci32 has been cheaper on the x86. Not sure
how much cheaper on the long run though, given we're paying for the
iommu chips now.
Andrea
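A sketch of the missing failure path being pointed at: the drivers under discussion assume the mapping call cannot fail. Later kernels added an explicit check, dma_mapping_error(), which is what this sketch assumes; the wrapper name is hypothetical.

#include <linux/dma-mapping.h>
#include <linux/errno.h>

static int sketch_map_buffer(struct device *dev, void *buf, size_t len,
                             dma_addr_t *handle)
{
        *handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
        if (dma_mapping_error(dev, *handle)) {
                /* IOMMU space exhausted or address unreachable: report it so
                 * the caller can back off and retry instead of crashing. */
                return -ENOMEM;
        }
        return 0;
}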
On Wed, 12 Dec 2001, Andrea Arcangeli wrote:
> On Wed, Dec 12, 2001 at 06:22:30PM +0100, Gérard Roudier wrote:
> >
> >
> > On Wed, 12 Dec 2001, Andrea Arcangeli wrote:
> >
> > > On Wed, Dec 12, 2001 at 10:36:54AM +0100, Jens Axboe wrote:
> >
> > > > > If one really wants for some marketing reason to support these ugly and
> > > > > stinky '32 bit machines that want to provide more than 4GB of memory by
> > > > > shoe-horning complexity all over the place', one should use his brain,
> > > > > when so-featured, prior to writing clueless code.
> > > >
> > > > First of all, virt_to_bus just cannot work on some architectures that
> > > > are just slightly more advanced than x86. I'm quite sure Davem is ready
> > > > to lecture you on this.
> > >
> > > yes, the whole point of the iommu work (replacement for virt_to_bus) is
> > > for the 64bit machines, not for the 32bit machines. It's to allow the
> > > 64bit machines to do zerocopy dma (no bounce buffers) on memory above 4G
> > > with pci32 devices that don't support DAC.
> >
> > So, the PCI group should just have specified a 16 bit BUS and have told
> > that systems should implement some IOMMU in order to address the whole
> > memory. :-)
>
> 8)8)
>
> > PCI was intended to be implemented as a LOCAL BUS with all agents on the
> > LOCAL BUS being able to talk with any other agent using a flat addressing
> > scheme. Your PCI thing does not look like true PCI to me, but rather like
> > some bad mutant that has every chance not to survive a long time.
>
> It's not only a matter of the MSB of the "max bus address" supported,
> it's also a matter of "how much ram" can be under DMA at the same time.
> With pci32 we have a window of 4G of physical ram that can be
> simultaneously under I/O at the same time. The linux API (or better all
> the drivers) are unfortunately broken so that if the 4G is overflowed
> the kernel crashes (but this is a minor detail :). 16 bit would limit
> way too much the amount of simultaneous I/O, 4G is reasonable instead
> (incidentally this is why linux usually doesn't crash on the 64bit
> boxes). For the few apps (like quadrics or myrinet) that need SG
> windows larger than 4G we just require DAC support and we don't use the
> iommu 32bit (also to avoid triggering the device driver bugs :).
>
> But this is true PCI, you know the device has no clue it is writing over
> 4G, that's the iommu work to map a certain bus address into a certain
> physical address.
>
> You are completely right: what you mean is that if the PCI group
> specified 64bit/DAC (instead of 32... or 16 :) as only possible way to
> do DMA with PCI, we wouldn't need the iommu on the 64bit boxes. But
> those pci32 cards exist and the iommu is better than the bounce
> buffers... and after all the iommu/pci_map API is clean enough, except
> when all the drivers do pci_map_single and think it cannot fail, because
> it _can_ fail! (but of course the conversion was painful because a fail
> path wasn't designed into the drivers and so now we can crash instead :)
>
> I guess passing through pci32 has been cheaper on the x86. Not sure
> how much cheaper on the long run though, given we're paying for the
> iommu chips now.
Thanks for your explanations, Andrea.
My point is not that, given current hardware, supporting IOMMUs and other
weirdnesses is not the right thing to do. It is, very probably, ..., in
fact, I don't care. But, we should not be happy to have to support this.
After all, there are numerous situations in life where we have painful
things to do. Being happy with pain is called masochism. :)
A N% loss for the 99% case in order to support the 1% is close to N%
loss. So, each time we bloat or complexify the code with no relevance for
the average case, the overall difference cannot be a win.
Gérard.
From: Gérard Roudier <[email protected]>
Date: Wed, 12 Dec 2001 18:22:30 +0100 (CET)
PCI was intended to be implemented as a LOCAL BUS with all agents on the
LOCAL BUS being able to talk with any other agent using a flat addressing
scheme. Your PCI thing does not look like true PCI to me, but rather like
some bad mutant that has every chance not to survive a long time.
Intentions are neither here nor there. PCI is MORE USEFUL, because
you CAN do things like IOMMU's and treat PCI like a completely separate
I/O bus world.
From: Gérard Roudier <[email protected]>
Date: Wed, 12 Dec 2001 21:24:59 +0100 (CET)
A N% loss for the 99% case in order to support the 1% is close to N%
loss. So, each time we bloat or complexify the code with no relevance for
the average case, the overall difference cannot be a win.
Do you know, you can use this N% loss to implement handling of the
very problem you have wrt. sym53c8xx hw bugs? :-)
To be honest all the machinery to handle the problems you have
described is there today, even with IOMMU's present. The generic
block layer today knows when IOMMU is being used, it knows what kind
of coalescing can and will be done by the IOMMU support code (via
DMA_CHUNK_SIZE), and therefore it is capable of adhering to any
restrictions you care to describe to the block layer.
It's only a matter of coding on Jens's part :-)
On Wed, 12 Dec 2001, David S. Miller wrote:
> From: Gérard Roudier <[email protected]>
> Date: Wed, 12 Dec 2001 21:24:59 +0100 (CET)
>
> A N% loss for the 99% case in order to support the 1% is close to N%
> loss. So, each time we bloat or complexify the code with no relevance for
> the average case, the overall difference cannot be a win.
>
> Do you know, you can use this N% loss to implement handling of the
> very problem you have wrt. sym53c8xx hw bugs? :-)
What hw bugs ? :-)
The driver is very fortunate. Based on DELs, only a few known hw bugs have
had to be worked around by the sym drivers. In fact, you can avoid most
bugs by taking into account the usual bugs in PCI hw. When I made sym53c8xx
from ncr53c8xx, I had some clues in mind.
The 16MB/32MB boundary I pointed out can only affect sym53c896 rev. 1, and
only in fairly unlikely situations. I plan to resurrect the work-around
in Linux sym_glue.c if pci_map_sg() will not handle the issue.
So, on paper, you were talking about a single unlikely to happen hw bug.
:-)
> To be honest all the machinery to handle the problems you have
> described is there today, even with IOMMU's present. The generic
> block layer today knows when IOMMU is being used, it knows what kind
> of coalescing can and will be done by the IOMMU support code (via
> DMA_CHUNK_SIZE), and therefore it is capable of adhering to any
> restrictions you care to describe to the block layer.
Will look into all these great things over the week-end and let you all
know my remarks if any.
> It's only a matter of coding on Jens's part :-)
Jens seems to be a great and very courageous coder, so let me trust him
not to miss anything important. :)
Thanks for your reply,
Gérard.
PS: Don't take the wrong way my statements against Sun stuff. In fact, I
dislike almost everything that comes and came from them. :)
On Wed, 12 Dec 2001, David S. Miller wrote:
> From: Gérard Roudier <[email protected]>
> Date: Wed, 12 Dec 2001 18:22:30 +0100 (CET)
>
> PCI was intended to be implemented as a LOCAL BUS with all agents on the
> LOCAL BUS being able to talk with any other agent using a flat addressing
> scheme. Your PCI thing does not look like true PCI to me, but rather like
> some bad mutant that has every chance not to survive a long time.
>
> Intentions are neither here nor there. PCI is MORE USEFUL, because
> you CAN do things like IOMMU's and treat PCI like a completely separate
> I/O bus world.
But there are a couple of things you cannot do with PCI. For example, you
cannot mow the lawn with PCI. But since there is always room for
improvement... :-) :-)
G?rard.
On Mon, 2001-12-10 at 13:50, Justin T. Gibbs wrote:
> >Ok I decided to try and trace this.
>
> ...
>
> > /*
> > * The sg_count may be larger than nseg if
> > * a transfer crosses a 32bit page.
> > */
> >
> >hmm, here it already starts to smell fishy...
> >
> > scb->sg_count = 0;
> > while(cur_seg < end_seg) {
> > bus_addr_t addr;
> > bus_size_t len;
> > int consumed;
> >
> > addr = sg_dma_address(cur_seg);
> > len = sg_dma_len(cur_seg);
> > consumed = ahc_linux_map_seg(ahc, scb, sg, addr, len);
> >
> >ahc_linux_map_seg checks if scb->sg_count gets bigger than AHC_NSEG, in
> >fact the test is
> >
> > if (scb->sg_count + 1 > AHC_NSEG)
> > panic()
> >
> >What am I missing here?? I see nothing preventing hitting this panic in
> >some circumstances.
>
> If you don't cross a 4GB boundary, this is the same as a static test
> that you never have more than AHC_NSEG segments.
>
> > if (scb->sg_count + 2 > AHC_NSEG)
> > panic()
> >
> >weee, we crossed a 4gb boundary and suddenly we have bigger problems
> >yet. Ok, so what I think the deal is here is that AHC_NSEG are two
> >different things to your driver and the mid layer.
> >
> >Am I missing something? It can't be this obvious.
>
> You will never cross a 4GB boundary on a machine with only 2GB of
> physical memory. This report and another I have received are for
> configurations with 2GB or less memory. This is not the cause of the
> problem. Further, after this code was written, David Miller made the
> comment that an I/O that crosses a 4GB boundary will never be generated
> for the exact same reason that this check is included in the aic7xxx
> driver - you can't cross a 4GB page in a single PCI DAC transaction.
> I should go verify that this is really the case in recent 2.4.X kernels.
>
> Saying that AHC_NSEG and the segment count exported to the mid-layer are
> two different things is true to some extent, but if the 4GB rule is not
> honored by the mid-layer implicitly, I would have to tell the mid-layer
> I can only handle half the number of segments I really can. This isn't
> good for the memory footprint of the driver. The test was added to
> protect against a situation that I don't believe can now happen in Linux.
>
> In truth, the solution to these kinds of problems is to export alignment,
> boundary, and range restrictions on memory mappings from the device
> driver to the layer creating the mappings. This is the only way to
> generically allow a device driver to export a true segment limit.
Another data point on this problem: in 2.5.1-pre11 running XFS I can now
hit this panic 100% of the time running bonnie++. And I do not have
HIGHMEM; I have 128M of memory.
It looks to me like request merging has got too efficient for its own
good!
This is the scsi info printed at startup:
SCSI subsystem driver Revision: 1.00
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.4
<Adaptec aic7896/97 Ultra2 SCSI adapter>
aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.4
<Adaptec aic7896/97 Ultra2 SCSI adapter>
aic7896/97: Ultra2 Wide Channel B, SCSI Id=7, 32/253 SCBs
Vendor: SEAGATE Model: ST39175LW Rev: 0001
Type: Direct-Access ANSI SCSI revision: 02
scsi0:A:1:0: Tagged Queuing enabled. Depth 253
Attached scsi disk sda at scsi0, channel 0, id 1, lun 0
(scsi0:A:1): 80.000MB/s transfers (40.000MHz, offset 15, 16bit)
SCSI device sda: 17783240 512-byte hdwr sectors (9105 MB)
sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
And this is the scb:
0xc7f945b0 c7f90040 c7f943dc 00000000 00000000 @.yG\CyG........
0xc7f945c0 c7f945f0 c7f5e000 c7fb0800 00000000 pEyG.`uG..{G....
0xc7f945d0 c7bd38c0 c7bd3900 c530c000 0530c008 @8=G.9=G.@0E.@0.
0xc7f945e0 00000080 c7f90140 c7f94478 00000000 [email protected]....
0xc7f945f0 00000000 c7f94554 c7f58000 c7fb0800 ....TEyG..uG..{G
0xc7f94600 00004000 c7bd38a0 c7bd3900 c530c400 .@.. 8=G.9=G.D0E
0xc7f94610 0530c408 00000080 c7f90080 c7f94478 .D0.......yGxDyG
0xc7f94620 00000000 c7f9464c c7f94a00 c7f59c00 ....LFyG.JyG..uG
I have the system in a debugger and can look at memory for you
if you want.
Steve
--
Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: [email protected]
>And this is the scb:
>0xc7f945b0 c7f90040 c7f943dc 00000000 00000000 @.yG\CyG........
>0xc7f945c0 c7f945f0 c7f5e000 c7fb0800 00000000 pEyG.`uG..{G....
>0xc7f945d0 c7bd38c0 c7bd3900 c530c000 0530c008 @8=G.9=G.@0E.@0.
>0xc7f945e0 00000080 c7f90140 c7f94478 00000000 [email protected]....
>0xc7f945f0 00000000 c7f94554 c7f58000 c7fb0800 ....TEyG..uG..{G
>0xc7f94600 00004000 c7bd38a0 c7bd3900 c530c400 .@.. 8=G.9=G.D0E
>0xc7f94610 0530c408 00000080 c7f90080 c7f94478 .D0.......yGxDyG
>0xc7f94620 00000000 c7f9464c c7f94a00 c7f59c00 ....LFyG.JyG..uG
>
>I have the system in a debugger and can look at memory for you
>if you want.
I'd like to know the value of scb->io_ctx->use_sg.
--
Justin
On Thu, 2001-12-13 at 14:15, Justin T. Gibbs wrote:
> >And this is the scb:
> >0xc7f945b0 c7f90040 c7f943dc 00000000 00000000 @.yG\CyG........
> >0xc7f945c0 c7f945f0 c7f5e000 c7fb0800 00000000 pEyG.`uG..{G....
> >0xc7f945d0 c7bd38c0 c7bd3900 c530c000 0530c008 @8=G.9=G.@0E.@0.
> >0xc7f945e0 00000080 c7f90140 c7f94478 00000000 [email protected]....
> >0xc7f945f0 00000000 c7f94554 c7f58000 c7fb0800 ....TEyG..uG..{G
> >0xc7f94600 00004000 c7bd38a0 c7bd3900 c530c400 .@.. 8=G.9=G.D0E
> >0xc7f94610 0530c408 00000080 c7f90080 c7f94478 .D0.......yGxDyG
> >0xc7f94620 00000000 c7f9464c c7f94a00 c7f59c00 ....LFyG.JyG..uG
> >
> >I have the system in a debugger and can look at memory for you
> >if you want.
>
> I'd like to know the value of scb->io_ctx->use_sg.
Here is the scsi_cmd in hex:
0xc7f5e000 e25c23a5 c13ad980 01021003 c7f84e00 %#\b.Y:A.....NxG
0xc7f5e010 00000000 c7f5e200 00000000 00000000 .....buG........
0xc7f5e020 c0270a70 00002aa4 00000000 00000000 p.'@$*..........
0xc7f5e030 00000005 00000bb8 00000000 00000000 ....8...........
0xc7f5e040 00000000 00000000 00000001 00000000 ................
0xc7f5e050 00000000 01010a0a 9a00002a 00002f17 ........*..../..
0xc7f5e060 00000008 00000000 00001000 c0439104 ..............C@
0xc7f5e070 c7f5e46c 000140ad c7f5e000 c024ff10 lduG-@...`uG.$@
0xc7f5e080 c1b0f000 00000000 9a00002a 00002f17 .p0A....*..../..
0xc7f5e090 00000008 00000000 00000000 00000a00 ................
0xc7f5e0a0 00001000 c1b0f000 00001000 00001000 .....p0A........
0xc7f5e0b0 00000200 00000000 c7f7dd60 c7f84e48 ........`]wGHNxG
0xc7f5e0c0 00003f7f 9a00002a 04002f17 00000000 .?..*..../......
0xc7f5e0d0 00000000 00000039 00000001 00000800 ....9...........
0xc7f5e0e0 00000000 009a172f 00000400 009a172f ..../......./...
0xc7f5e0f0 00000400 00010080 00000008 00000008 ................
0xc7f5e100 00000000 c1b0f000 00000000 c14ca320 .....p0A.... #LA
0xc7f5e110 c14c6fa0 c7f84e18 c7f84e30 00000000 oLA.NxG0NxG....
0xc7f5e120 00000000 00000000 00000000 00000000 ................
0xc7f5e130 00000000 00000000 00000000 00000000 ................
0xc7f5e140 00000000 00000000 00000000 00000000 ................
0xc7f5e150 00000000 00000000 00000000 00000000 ................
0xc7f5e160 00000000 c024cd30 00000000 00000000 ....0M$@........
0xc7f5e170 00000000 00000000 00000000 c7f48a00 ..............tG
And here is the kdb interpreted version (this command is pretty old,
but built against current structure definitions)
scsi_cmnd at 0xc7f5e000
host = 0xc13ad980 state = 4099 owner = 258 device = 0xc7f84e00
bnext = 0xc7f5e200 reset_chain = 0x00000000 eh_state = 0 done =
0xc0270a70
serial_number = 10916 serial_num_at_to = 0 retries = 0 timeout = 0
id/lun/cmnd = [1/0/0] cmd_len = 10 old_cmd_len = 10
cmnd = [2a/00/00/9a/17/2f/00/00/08/00/00/00]
data_cmnd = [2a/00/00/9a/17/2f/00/00/08/00/00/00]
request_buffer = 0xc1b0f000 bh_next = 0x00000000 request_bufflen =
4096
use_sg = 0 old_use_sg = 0 sglist_len = 2560 abore_reason = 0
bufflen = 4096 buffer = 0xc1b0f000 underflow = 4096 transfersize = 512
tag = 0 pid = 10915
request struct
rq_status = RQ_ACTIVE rq_dev = [8/0] errors = 0 nsector = 10098479
nr_sectors = 1024 current_nr_sectors = 8
So according to this, zero.
Steve
>
> --
> Justin
--
Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: [email protected]
From: Gérard Roudier <[email protected]>
Date: Thu, 13 Dec 2001 17:17:22 +0100 (CET)
PS: Don't take the wrong way my statements against Sun stuff. In fact, I
dislike almost everything that comes and came from them. :)
Unfortunately the things you complain about are anything but Sun or
Sparc specific. PPC64, MIPS64, Alpha, HPPA, and probably others I
have forgotten (oh yes and IA64 in the future if Intel gets their
heads out of their asses) all have IOMMU mechanisms in their PCI
controllers.
This disease may even some day infect x86 systems. In fact
technically it already has, most AMD chipsets use a slightly modified
Alpha PCI controller which does have an IOMMU hidden deep down inside
of it :-)
Like I said before, the fact that PCI allows this to work is a feature
that is actually better for PCI's relevance and longevity, not worse.
Or do you suggest that it is wiser to use bounce buffering to handle
32-bit cards on systems with more than 4GB of ram? :-) Using all
64-bit capable cards is not an answer, especially when the big
advantage of PCI is how commoditized and flooded the market is with
32-bit cards.
>So according to this, zero.
Thanks for the info - it's the first useful report I've gotten to date. 8-)
I believe I've found and fixed the bug. I've changed a few other things
in the driver for the 6.2.5 release, so once I've tested them I'll release
new patches. In the mean time, you should be able to avoid the problem by
moving the initialization of scb->sg_count to 0 in the function:
aic7xxx_linux.c:ahc_linux_run_device_queue()
to before the statement:
if (cmd->use_sg != 0) {
I'd give you diffs, but these other changes in my tree need more testing
before I'll feel comfortable releasing them. I also don't have a 2.5 tree
downloaded yet to verify that the driver functions there.
In order to reproduce the bug, you need to issue a command that uses
all of the segments of a given transaction and then have a command with
use_sg == 0 be the next command to use that same SCB. This explains why
I was not able to reproduce the problem here.
--
Justin
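A paraphrase of the bug and fix described above, using hypothetical structures rather than the driver's real ones: if sg_count is only cleared inside the use_sg branch, a use_sg == 0 command that reuses an SCB left over from a full transfer inherits the stale count and trips the "too few segs for dma mapping" panic.

struct sketch_scb {
        unsigned int sg_count;          /* entries used in the SCB's S/G array */
};

struct sketch_cmnd {
        unsigned short use_sg;          /* number of scatter-gather entries */
        unsigned int request_bufflen;   /* length of a single flat buffer   */
};

static void sketch_run_device_queue(struct sketch_scb *scb,
                                    struct sketch_cmnd *cmd)
{
        scb->sg_count = 0;      /* the fix: reset unconditionally, up front */

        if (cmd->use_sg != 0) {
                /* map the scatter list, bumping scb->sg_count per entry */
        } else if (cmd->request_bufflen != 0) {
                /* single-buffer case: before the fix this path could run
                 * with the previous command's sg_count still in the SCB  */
        }
}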
On Thu, 2001-12-13 at 14:48, Justin T. Gibbs wrote:
> >So according to this, zero.
>
> Thanks for the info - it's the first useful report I've gotten to date. 8-)
> I believe I've found and fixed the bug. I've changed a few other things
> in the driver for the 6.2.5 release, so once I've tested them I'll release
> new patches. In the mean time, you should be able to avoid the problem by
> moving the initialization of scb->sg_count to 0 in the function:
>
> aic7xxx_linux.c:ahc_linux_run_device_queue()
>
> to before the statement:
>
> if (cmd->use_sg != 0) {
>
> I'd give you diffs, but these other changes in my tree need more testing
> before I'll feel comfortable releasing them. I also don't have a 2.5 tree
> downloaded yet to verify that the driver functions there.
>
> In order to reproduce the bug, you need to issue a command that uses
> all of the segments of a given transaction and then have a command with
> use_sg == 0 be the next command to use that same SCB. This explains why
> I was not able to reproduce the problem here.
Thanks, I will test it out here and let you know, looks like David
Miller is proposing the same assignment in a different place.
Steve
>
> --
> Justin
--
Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: [email protected]
On Thu, 13 Dec 2001, David S. Miller wrote:
> From: Gérard Roudier <[email protected]>
> Date: Thu, 13 Dec 2001 17:17:22 +0100 (CET)
>
> PS: Don't take the wrong way my statements against Sun stuff. In fact, I
> dislike almost everything that comes and came from them. :)
>
> Unfortunately the things you complain about are anything but Sun or
> Sparc specific. PPC64, MIPS64, Alpha, HPPA, and probably others I
> have forgotten (oh yes and IA64 in the future if Intel gets their
> heads out of their asses) all have IOMMU mechanisms in their PCI
> controllers.
Might just be contagious brain disease. :)
> This disease may even some day infect x86 systems. In fact
> technically it already has, most AMD chipsets use a slightly modified
> Alpha PCI controller which does have an IOMMU hidden deep down inside
> of it :-)
>
> Like I said before, the fact that PCI allows this to work is a feature
> that is actually better for PCI's relevance and longevity, not worse.
>
> Or do you suggest that it is wiser to use bounce buffering to handle
> 32-bit cards on systems with more than 4GB of ram? :-) Using all
> 64-bit capable cards is not an answer, especially when the big
> advantage of PCI is how commoditized and flooded the market is with
> 32-bit cards.
If things had happened this way for the last 20 years, then the typical
CPU nowadays would probably look like a 20 GHz Z80. :)
When I purchased my PIII 233 MHz 4 years ago, I thought that my next
system would be a full 64 bit system. Indeed, I was dreaming. Btw, I do not
consider a hybrid 32bit path / 64bit path system to be a 64 bit system.
I understand the deal between making money and doing things right. In the
job I earn from, the former obviously applies, but I haven't earned a
single euro-cent from free software. So, allow me, at least as long as
this does not change, not to agree with you here and to stop the
discussion as a result.
Gérard.
On Thu, 2001-12-13 at 14:58, Steve Lord wrote:
> On Thu, 2001-12-13 at 14:48, Justin T. Gibbs wrote:
> > >So according to this, zero.
> >
> > Thanks for the info - it's the first useful report I've gotten to date. 8-)
> > I believe I've found and fixed the bug. I've changed a few other things
> > in the driver for the 6.2.5 release, so once I've tested them I'll release
> > new patches. In the mean time, you should be able to avoid the problem by
> > moving the initialization of scb->sg_count to 0 in the function:
> >
> > aic7xxx_linux.c:ahc_linux_run_device_queue()
> >
> > to before the statement:
> >
> > if (cmd->use_sg != 0) {
> >
> > I'd give you diffs, but these other changes in my tree need more testing
> > before I'll feel comfortable releasing them. I also don't have a 2.5 tree
> > downloaded yet to verify that the driver functions there.
> >
> > In order to reproduce the bug, you need to issue a command that uses
> > all of the segments of a given transaction and then have a command with
> > use_sg == 0 be the next command to use that same SCB. This explains why
> > I was not able to reproduce the problem here.
>
> Thanks, I will test it out here and let you know, looks like David
> Miller is proposing the same assignment in a different place.
>
OK, I can confirm this fixes it for me. A side note for Jens: this still
pushes the scsi layer into those DMA shortage messages:
Warning - running *really* short on DMA buffers
SCSI: depth is 175, # segs 128, # hw segs 1
Warning - running *really* short on DMA buffers
Warning - running *really* short on DMA buffers
SCSI: depth is 181, # segs 128, # hw segs 1
Warning - running *really* short on DMA buffers
SCSI: depth is 182, # segs 128, # hw segs 1
Warning - running *really* short on DMA buffers
SCSI: depth is 183, # segs 128, # hw segs 1
SCSI: depth is 173, # segs 128, # hw segs 1
Steve
--
Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: [email protected]
From: Steve Lord <[email protected]>
Date: 13 Dec 2001 15:17:24 -0600
OK, I can confirm this fixes it for me. A side note for Jens: this still
pushes the scsi layer into those DMA shortage messages:
Yes we know, once Jens finishes up his work on using a mempool for the
scatterlist allocations, this problem will dissipate.
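A sketch of the mempool idea: keep a guaranteed reserve of scatterlist arrays so writeout can always make progress instead of hitting the "running *really* short on DMA buffers" warning. The real change became the scsi_sg_pools visible in Jens' patch below, which uses per-size slab-backed pools; this sketch uses the simpler kmalloc-backed helper and illustrative sizes.

#include <linux/mempool.h>
#include <linux/scatterlist.h>
#include <linux/errno.h>

#define SKETCH_SG_ENTRIES       64      /* entries per scatterlist table     */
#define SKETCH_SG_RESERVED      32      /* allocations kept in reserve       */

static mempool_t *sketch_sg_pool;

static int sketch_init_sg_pool(void)
{
        /* The reserved objects are preallocated, so a blocking
         * mempool_alloc(pool, GFP_NOIO) can always return a table. */
        sketch_sg_pool = mempool_create_kmalloc_pool(SKETCH_SG_RESERVED,
                        SKETCH_SG_ENTRIES * sizeof(struct scatterlist));
        return sketch_sg_pool ? 0 : -ENOMEM;
}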
On Thu, Dec 13 2001, David S. Miller wrote:
> From: Steve Lord <[email protected]>
> Date: 13 Dec 2001 15:17:24 -0600
>
> OK, I can confirm this fixes it for me. A side note for Jens: this still
> pushes the scsi layer into those DMA shortage messages:
>
> Yes we know, once Jens finishes up his work on using a mempool for the
> scatterlist allocations, this problem will dissipate.
Indeed, the below patch should fix it right up and also has all of
David's fixes and merging cleanups.
*.kernel.org/pub/linux/kernel/people/axboe/patches/v2.5/2.5.1-pre11/bio-pre11-5.bz2
--
Jens Axboe
On Fri, Dec 14 2001, Jens Axboe wrote:
> *.kernel.org/pub/linux/kernel/people/axboe/patches/v2.5/2.5.1-pre11/bio-pre11-5.bz2
Steve Lord caught two typos in the patch, here's an incremental diff
attached. There will also be a bio-pre11-6 at the above location in a
few minutes.
--- linux/drivers/scsi/scsi.c~ Fri Dec 14 11:06:25 2001
+++ linux/drivers/scsi/scsi.c Fri Dec 14 11:06:46 2001
@@ -2590,7 +2590,6 @@
/*
* setup sg memory pools
*/
- ts = 0;
for (i = 0; i < SG_MEMPOOL_NR; i++) {
struct scsi_host_sg_pool *sgp = scsi_sg_pools + i;
int size = scsi_host_sg_pool_sizes[i] * sizeof(struct scatterlist);
--- linux/drivers/scsi/sym53c8xx.c~ Fri Dec 14 11:10:38 2001
+++ linux/drivers/scsi/sym53c8xx.c Fri Dec 14 11:10:51 2001
@@ -12174,7 +12174,7 @@
use_sg = map_scsi_sg_data(np, cmd);
if (use_sg > MAX_SCATTER) {
- unmap_scsi_sg_data(np, cmd);
+ unmap_scsi_data(np, cmd);
return -1;
}
data = &cp->phys.data[MAX_SCATTER - use_sg];
--
Jens Axboe
Jens Axboe wrote:
>On Fri, Dec 14 2001, Jens Axboe wrote:
>
>>*.kernel.org/pub/linux/kernel/people/axboe/patches/v2.5/2.5.1-pre11/bio-pre11-5.bz2
>>
>
>--- linux/drivers/scsi/sym53c8xx.c~ Fri Dec 14 11:10:38 2001
>+++ linux/drivers/scsi/sym53c8xx.c Fri Dec 14 11:10:51 2001
>@@ -12174,7 +12174,7 @@
>
> use_sg = map_scsi_sg_data(np, cmd);
> if (use_sg > MAX_SCATTER) {
>- unmap_scsi_sg_data(np, cmd);
>+ unmap_scsi_data(np, cmd);
> return -1;
> }
> data = &cp->phys.data[MAX_SCATTER - use_sg];
>
There is one of these in ncr53c8xx.c as well, line 8135
Steve
Is this going into the 2.4.17 as well?
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Jens Axboe
> Sent: Friday, December 14, 2001 11:15 AM
> To: David S. Miller
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: highmem, aic7xxx, and vfat: too few segs for dma mapping
>
>
> On Fri, Dec 14 2001, Jens Axboe wrote:
> >
> *.kernel.org/pub/linux/kernel/people/axboe/patches/v2.5/2.5.1-pre11/bi
> > o-pre11-5.bz2
>
> Steve Lord caught two typos in the patch, here's an
> incremental diff attached. There will also be a bio-pre11-6
> at the above location in a few minutes.
>
> --- linux/drivers/scsi/scsi.c~ Fri Dec 14 11:06:25 2001
> +++ linux/drivers/scsi/scsi.c Fri Dec 14 11:06:46 2001
> @@ -2590,7 +2590,6 @@
> /*
> * setup sg memory pools
> */
> - ts = 0;
> for (i = 0; i < SG_MEMPOOL_NR; i++) {
> struct scsi_host_sg_pool *sgp = scsi_sg_pools + i;
> int size = scsi_host_sg_pool_sizes[i] *
> sizeof(struct scatterlist);
> --- linux/drivers/scsi/sym53c8xx.c~ Fri Dec 14 11:10:38 2001
> +++ linux/drivers/scsi/sym53c8xx.c Fri Dec 14 11:10:51 2001
> @@ -12174,7 +12174,7 @@
>
> use_sg = map_scsi_sg_data(np, cmd);
> if (use_sg > MAX_SCATTER) {
> - unmap_scsi_sg_data(np, cmd);
> + unmap_scsi_data(np, cmd);
> return -1;
> }
> data = &cp->phys.data[MAX_SCATTER - use_sg];
>
> --
> Jens Axboe
>
On Fri, Dec 14 2001, Stephen Lord wrote:
> >--- linux/drivers/scsi/sym53c8xx.c~ Fri Dec 14 11:10:38 2001
> >+++ linux/drivers/scsi/sym53c8xx.c Fri Dec 14 11:10:51 2001
> >@@ -12174,7 +12174,7 @@
> >
> > use_sg = map_scsi_sg_data(np, cmd);
> > if (use_sg > MAX_SCATTER) {
> >- unmap_scsi_sg_data(np, cmd);
> >+ unmap_scsi_data(np, cmd);
> > return -1;
> > }
> > data = &cp->phys.data[MAX_SCATTER - use_sg];
> >
> There is one of these in ncr53c8xx.c as well, line 8135
Agrh crap, thanks Steve!
--
Jens Axboe
On Fri, Dec 14 2001, Alok K. Dhir wrote:
> Is this going into the 2.4.17 as well?
It's a 2.5.1-pre11 + bio-pre-X bug fix, so no.
--
Jens Axboe