2003-02-04 07:22:24

by Andrew Morton

Subject: 2.5.59-mm8


http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-mm8/

. Various tweaks and fixes, and some hugetlbpage work.

. There is an updated anticipatory scheduler patch from Nick over in
experimental/ which addresses the large-read-starves-everything problem.

. The reworked ia32 balancing patch from Nitin Kamble is stable, and is
consistently showing benefit for heavy networking loads on large SMP
machines. Even though everyone seems to agree that a userspace solution to
this is smarter, that's no reason to hold back on improving the
kernel-based solution so I shall be submitting that patch.

. Ingo's latest scheduler changes are here. I held off on that because it
appeared that there was some interaction with the I/O scheduler. Whatever
that was has gone away without any CPU scheduler changes, so...

. frlocks have been renamed to seqlocks, and that code is now converging
onto something stable.
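
For readers who have not met them: a seqlock (the renamed frlock, listed further down as "fast reader locks for gettimeofday() and friends") lets readers proceed without taking a lock and retry if a writer got in the way. What follows is only a userspace sketch of that idea using C11 atomics. It is not the code in seqlock.patch; struct seq_time and its functions are hypothetical names, and a single writer is assumed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace model of a sequence lock protecting a timestamp. */
struct seq_time {
    atomic_uint seq;        /* even: data stable, odd: write in progress */
    _Atomic uint64_t sec;
    _Atomic uint64_t nsec;
};

void seq_time_init(struct seq_time *t)
{
    atomic_init(&t->seq, 0u);
    atomic_init(&t->sec, 0);
    atomic_init(&t->nsec, 0);
}

/* Writer side. Assumes only one writer exists (no writer-writer exclusion). */
void seq_time_write(struct seq_time *t, uint64_t sec, uint64_t nsec)
{
    unsigned s = atomic_load(&t->seq);

    atomic_store(&t->seq, s + 1);   /* odd: tell readers an update started */
    atomic_store(&t->sec, sec);
    atomic_store(&t->nsec, nsec);
    atomic_store(&t->seq, s + 2);   /* even again: update complete */
}

/* Reader side: never blocks the writer, retries if it raced with one. */
void seq_time_read(struct seq_time *t, uint64_t *sec, uint64_t *nsec)
{
    unsigned before, after;

    do {
        before = atomic_load(&t->seq);
        *sec = atomic_load(&t->sec);
        *nsec = atomic_load(&t->nsec);
        after = atomic_load(&t->seq);
    } while ((before & 1u) || before != after);
}
```

The sketch uses sequentially consistent atomics throughout to keep the reasoning simple; a real implementation would use weaker barriers, which is where the care goes.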



Changes since 2.5.59-mm7:


+linus.patch

Latest drop from Linus

-sync-fix.patch
-direct-io-ENOSPC-fix.patch
-inode-accounting-race-fix.patch
-vmlinux-fix.patch
-maestro-fix.patch
-setuid-exec-no-lock_kernel.patch
-ext3-scheduling-storm.patch
-quota-lockfix.patch
-quota-offsem.patch
-slab-poisoning-fix.patch
-preempt-locking.patch
-stack-overflow-fix.patch
-ext2-allocation-failure-fix.patch
-ext2_new_block-fixes.patch
-slab-irq-fix.patch
-Richard_Henderson_for_President.patch
-parenthesise-pgd_index.patch
-kernel-commandline-fix.patch
-macro-double-eval-fix.patch
-blkdev-fixes.patch
-modversions.patch
-pcmcia_timer_init.patch
-buffer-io-accounting.patch
-aic79xx-linux-2.5.59-20030122.patch
-discarded-section-fix.patch
-atyfb-compile-fix.patch
-floppy-locking-fix.patch
-sound-firmware-load-fix.patch
-generic_file_readonly_mmap-fix.patch
-exit_mmap-fix-47.patch
-show_task-fix.patch

Merged

+mark_inode_dirty-race.patch

SMP barriers in __mark_inode_dirty()

+pin_page-pmd.patch

Optimisation for follow_page() for some architectures. For futexes in huge
pages.

+seqlock.patch

Rename frlocks, fixes.

+default_idle-speedup.patch

Speed up the idle task!

+hugetlbfs-get_unmapped_area.patch
+hugetlbfs-truncate-fix.patch
+hugetlbfs-i_size-fix.patch
+hugetlbfs-cleanup.patch
+hugetlbfs-nopage-cleanup.patch
+hugetlbfs-fault-fix.patch
+hugetlbpage-cleanup.patch
+hugetlb_vmtruncate-fixes.patch
+hugetlb-mremap-fix.patch

hugetlb fixes/cleanups

+mremap-cleanup.patch

Random edits

+up-spinlock-debugging.patch

spinlock debugging for uniprocessor builds

+scheduler-update.patch

Ingo's latest.

+rml-scheduler-update.patch

scheduler tweaks from Robert




All 80 patches:

linus.patch
cset-1.879.1.145-to-1.950.txt.gz

kgdb.patch

devfs-fix.patch

deadline-np-42.patch
(undescribed patch)

deadline-np-43.patch
(undescribed patch)

batch-tuning.patch
I/O scheduler tuning

starvation-by-read-fix.patch
fix starvation-by-readers in the IO scheduler

buffer-debug.patch
buffer.c debugging

warn-null-wakeup.patch

reiserfs-readpages.patch
reiserfs v3 readpages support

fadvise.patch
implement posix_fadvise64()

auto-unplug.patch
self-unplugging request queues

less-unplugging.patch
Remove most of the blk_run_queues() calls

scheduler-tunables.patch
scheduler tunables

htlb-2.patch
hugetlb: fix MAP_FIXED handling

kirq.patch
ia32 IRQ distribution rework

kirq-up-fix.patch
Subject: Re: 2.5.59-mm1

agp-warning-fix.patch
fix agp compile warning

ext3-truncate-ordered-pages.patch
ext3: explicitly free truncated pages

prune-icache-stats.patch
add stats for page reclaim via inode freeing

vma-file-merge.patch
file-backed vma merging

mmap-whitespace.patch

read_cache_pages-cleanup.patch
cleanup in read_cache_pages()

remove-GFP_HIGHIO.patch
remove __GFP_HIGHIO

oprofile-p4.patch

oprofile_cpu-as-string.patch
oprofile cpu-as-string

wli-11_pgd_ctor.patch
Use a slab cache for pgd and pmd pages

wli-11_pgd_ctor-update.patch
pgd_ctor update

smaller-slab-batches.patch
Avoid losing timer ticks when slab debug is enabled.

printk-locking.patch
remove unneeded locking in do_syslog()

hangcheck-timer.patch
hangcheck-timer

jbd-documentation.patch
JBD Documentation

sendfile-security-hooks.patch
Subject: [RFC][PATCH] Restore LSM hook calls to sendfile

mmzone-parens.patch
asm-i386/mmzone.h macro paren/eval fixes

no_space_in_slabnames.patch
remove spaces from slab names

remove-will_become_orphaned_pgrp.patch
remove will_become_orphaned_pgrp()

MAX_IO_APICS-ifdef.patch
MAX_IO_APICS #ifdef'd wrongly

dac960-error-retry.patch
Subject: [PATCH] linux2.5.56 patch to DAC960 driver for error retry

epoll-update.patch
epoll timeout and syscall return types ...

topology-remove-underbars.patch
Remove __ from topology macros

mandlock-oops-fix.patch
ftruncate/truncate oopses with mandatory locking

put_user-warning-fix.patch
Subject: Re: Linux 2.5.59

hash-warnings.patch
fix #warning's

mark_inode_dirty-race.patch
Fix SMP race between __sync_single_inode and __mark_inode_dirty

reiserfs_file_write.patch
Subject: reiserfs file_write patch

lost-tick.patch
Lost tick compensation

seq_file-page-defn.patch
Include <asm/page.h> in fs/seq_file.c, as it uses PAGE_SIZE

user-process-count-leak.patch
fix current->user->processes leak

scsi-iothread.patch
scsi_eh_* needs to run even during suspend

numaq-ioapic-fix2.patch
NUMAQ io_apic programming fix

misc.patch
misc fixes

writeback-sync-cleanup.patch
Remove unneeded code in fs/fs-writeback.c

dont-wait-on-inode.patch
Fix latencies during writeback

unlink-latency-fix.patch
fix i_sem contention in sys_unlink()

pin_page-fix.patch
Fix futexes in huge pages

pin_page-pmd.patch
Optimise follow_page() for page-table-based hugepages

frlock-xtime.patch
fast reader locks for gettimeofday() and friends

frlock-xtime-i386.patch

frlock-xtime-ia64.patch

frlock-xtime-other.patch

seqlock.patch
Change frlock to seqlock

do_gettimeofday-speedup.patch
do_gettimeofday() optimisations

default_idle-speedup.patch
default_idle micro-optimisation

pte_chain_alloc-fixes.patch

hugetlbfs-set_page_dirty.patch
give hugetlbfs a set_page_dirty a_op

compound-pages.patch
Infrastructure for correct hugepage refcounting

compound-pages-hugetlb.patch
convert hugetlb code to use compound pages

hugetlbfs-get_unmapped_area.patch
get_unmapped_area for hugetlbfs

hugetlbfs-truncate-fix.patch
hugetlbfs: fix truncate

hugetlbfs-i_size-fix.patch
hugetlbfs i_size fixes

hugetlbfs-cleanup.patch
hugetlbfs cleanups

hugetlbfs-nopage-cleanup.patch
Give all architectures a hugetlb_nopage().

hugetlbfs-fault-fix.patch
Fix hugetlbfs faults

hugetlbpage-cleanup.patch
ia32 hugetlb cleanup

hugetlb_vmtruncate-fixes.patch
Fix hugetlb_vmtruncate_list()

hugetlb-mremap-fix.patch
hugetlb mremap fix

mremap-cleanup.patch
mm/mremap.c whitespace cleanup

up-spinlock-debugging.patch
spinlock debugging on uniprocessors

scheduler-update.patch
ingo's scheduler changes for 2.5.59-mm7

rml-scheduler-update.patch
rml scheduler bits, 2.5.59-mm7




2003-02-04 07:53:41

by Joshua Kwan

Subject: Re: 2.5.59-mm8

I noticed you have at least some of Jaroslav Kysela's ALSA BK push in
-mm8. Is his whole patch integrated into yours?

Regards
Josh




2003-02-04 07:55:33

by Andrew Morton

Subject: Re: 2.5.59-mm8

Joshua Kwan <[email protected]> wrote:
>
> I noticed you have at least some of Jaroslav Kysela's ALSA BK push in
> -mm8. Is his whole patch integrated into yours?
>

I always include the latest diff from Linus, so whatever is in BK right now
is in -mm8.

2003-02-04 07:59:04

by Joshua Kwan

Subject: Re: 2.5.59-mm8

Yes, my bad. It suddenly occurred to me after I sent it off that that
could have been the case. And it was. Sorry for the trouble. :)

Regards
Josh

On Tue, Feb 04, 2003 at 12:05:11AM -0800, Andrew Morton wrote:
> Joshua Kwan <[email protected]> wrote:
> >
> > I noticed you have at least some of Jaroslav Kysela's ALSA BK push in
> > -mm8. Is his whole patch integrated into yours?
> >
>
> I always include the latest diff from Linus, so whatever is in BK right now
> is in -mm8.



2003-02-04 08:00:17

by Martin J. Bligh

Subject: Re: 2.5.59-mm8

> http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-mm8/

Booted to login prompt, then immediately oopsed
(16-way NUMA-Q, mm6 worked fine). At a wild guess, I'd suspect
irq_balance stuff.

Unable to handle kernel NULL pointer dereference at virtual address 0000013c
printing eip:
c01ed768
*pde = 2ecb7001
*pte = 00000000
Oops: 0002
CPU: 2
EIP: 0060:[<c01ed768>] Not tainted
EFLAGS: 00010046
EIP is at isp1020_intr_handler+0x1f8/0x330
eax: 00000000 ebx: ef67f080 ecx: 00000000 edx: 00000000
esi: 00000000 edi: 00000003 ebp: ef6c589c esp: f0199efc
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=f0198000 task=f019cc40)
Stack: ef67f080 c02c7fe0 0360db40 f0199f40 00000003 ef6c5800 00000086 f0199f7c
00000013 c01ed556 00000013 ef6c5800 f0199f7c f01ef7e0 24000001 c010b7c5
00000013 ef6c5800 f0199f7c c02c3a60 00000260 00000013 f01ef7e0 c010b9bd
Call Trace:
[<c01ed556>] do_isp1020_intr_handler+0x36/0x50
[<c010b7c5>] handle_IRQ_event+0x45/0x70
[<c010b9bd>] do_IRQ+0x8d/0x100
[<c0107080>] default_idle+0x0/0x50
[<c0107080>] default_idle+0x0/0x50
[<c010a070>] common_interrupt+0x18/0x20
[<c0107080>] default_idle+0x0/0x50
[<c0107080>] default_idle+0x0/0x50
[<c01070aa>] default_idle+0x2a/0x50
[<c010714a>] cpu_idle+0x3a/0x50
[<c0120294>] printk+0x164/0x1a0

Code: 89 86 3c 01 00 00 e9 5b ff ff ff c7 44 24 08 40 00 00 00 8d
<0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

2003-02-04 08:07:31

by Andrew Morton

Subject: Re: 2.5.59-mm8

"Martin J. Bligh" <[email protected]> wrote:
>
> > http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-mm8/
>
> Booted to login prompt, then immediately oopsed
> (16-way NUMA-Q, mm6 worked fine). At a wild guess, I'd suspect
> irq_balance stuff.
>

There are a lot of scsi updates in Linus's tree. Can you please
test just

http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-mm8/broken-out/linus.patch

2003-02-04 08:59:24

by Dave Hansen

Subject: Re: 2.5.59-mm8

Martin J. Bligh wrote:
>>http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-mm8/
>

> Booted to login prompt, then immediately oopsed
> (16-way NUMA-Q, mm6 worked fine). At a wild guess, I'd suspect
> irq_balance stuff.
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000013c
> printing eip:
> c01ed768
> *pde = 2ecb7001
> *pte = 00000000
> Oops: 0002
> CPU: 2
> EIP: 0060:[<c01ed768>] Not tainted
> EFLAGS: 00010046
> EIP is at isp1020_intr_handler+0x1f8/0x330
> eax: 00000000 ebx: ef67f080 ecx: 00000000 edx: 00000000
> esi: 00000000 edi: 00000003 ebp: ef6c589c esp: f0199efc
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 0, threadinfo=f0198000 task=f019cc40)
> Stack: ef67f080 c02c7fe0 0360db40 f0199f40 00000003 ef6c5800 00000086 f0199f7c
> 00000013 c01ed556 00000013 ef6c5800 f0199f7c f01ef7e0 24000001 c010b7c5
> 00000013 ef6c5800 f0199f7c c02c3a60 00000260 00000013 f01ef7e0 c010b9bd
> Call Trace:
> [<c01ed556>] do_isp1020_intr_handler+0x36/0x50
> [<c010b7c5>] handle_IRQ_event+0x45/0x70
> [<c010b9bd>] do_IRQ+0x8d/0x100
> [<c0107080>] default_idle+0x0/0x50
> [<c0107080>] default_idle+0x0/0x50
> [<c010a070>] common_interrupt+0x18/0x20
> [<c0107080>] default_idle+0x0/0x50
> [<c0107080>] default_idle+0x0/0x50
> [<c01070aa>] default_idle+0x2a/0x50
> [<c010714a>] cpu_idle+0x3a/0x50
> [<c0120294>] printk+0x164/0x1a0

This didn't include 4k/irqstack stuff did it? That is in the path that
those patches touch.

--
Dave Hansen
[email protected]

2003-02-04 09:23:43

by Arjan van de Ven

Subject: Re: 2.5.59-mm8

On Tue, 2003-02-04 at 08:31, Andrew Morton wrote:

> . The reworked ia32 balancing patch from Nitin Kamble is stable, and is
> consistently showing benefit for heavy networking loads on large SMP
> machines. Even though everyone seems to agree that a userspace solution to
> this is smarter, that's no reason to hold back on improving the
> kernel-based solution so I shall be submitting that patch.

<shameless plug>
A version of a proposed userspace solution can be found at
http://people.redhat.com/arjanv/irqbalance/irqbalance-0.05.tar.gz
</shameless plug>

It's still relatively simple, but it has the building blocks for becoming
more advanced.

Greetings,
Arjan van de Ven



2003-02-04 22:14:12

by Martin J. Bligh

Subject: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

> "Martin J. Bligh" <[email protected]> wrote:
>>
>> > http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-mm8/
>>
>> Booted to login prompt, then immediately oopsed
>> (16-way NUMA-Q, mm6 worked fine). At a wild guess, I'd suspect
>> irq_balance stuff.
>>
>
> There are a lot of scsi updates in Linus's tree. Can you please
> test just
>
> http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-mm8/broken-out/linus.patch

Yup, the SCSI code in Linus' tree has broken since 2.5.59.
I reproduced this on my 4-way SMP machine (panic from that below),
so it's not just NUMA-Q weirdness ;-)

M.

Unable to handle kernel NULL pointer dereference at virtual address 0000013c
printing eip:
c01c1986
*pde = 00000000
Oops: 0002
CPU: 3
EIP: 0060:[<c01c1986>] Not tainted
EFLAGS: 00010046
EIP is at isp1020_intr_handler+0x1e6/0x290
eax: 00000000 ebx: f7c42080 ecx: 00000000 edx: 00000054
esi: 00000002 edi: 00000013 ebp: 00000000 esp: f7f97efc
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=f7f96000 task=f7f9d240)
Stack: f7c42080 f7c52800 00000002 00000013 f7f97f80 00000003 00000003 f7c5289c
f7c52800 c01c1791 00000013 f7c52800 f7f97f80 f7ffe1e0 24000001 c010a815
00000013 f7c52800 f7f97f80 c028fa60 00000260 00000013 f7f97f78 c010a9e6
Call Trace:
[<c01c1791>] do_isp1020_intr_handler+0x25/0x34
[<c010a815>] handle_IRQ_event+0x29/0x4c
[<c010a9e6>] do_IRQ+0x96/0x100
[<c0106ca0>] default_idle+0x0/0x34
[<c01094a8>] common_interrupt+0x18/0x20
[<c0106ca0>] default_idle+0x0/0x34
[<c0106cc9>] default_idle+0x29/0x34
[<c0106d53>] cpu_idle+0x37/0x48
[<c0119d21>] printk+0x149/0x160

Code: 89 85 3c 01 00 00 83 c4 04 eb 0a c7 85 3c 01 00 00 00 00 07
<0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
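
The first six bytes of the Code: line can be decoded by hand, and doing so ties the faulting address to the register dump. The sketch below handles only this one instruction form (opcode 0x89 with a 32-bit displacement and no SIB byte) and assumes the dump starts at the faulting instruction, which is consistent with the registers shown. decode_mov_store, reg32_name and struct mov_store are hypothetical names for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Result of decoding one "mov %reg, disp32(%base)" instruction. */
struct mov_store {
    unsigned src;    /* index of the register being stored */
    unsigned base;   /* index of the base register of the memory operand */
    uint32_t disp;   /* 32-bit displacement added to the base */
};

/* i386 register numbering as used in the ModRM byte. */
const char *reg32_name(unsigned idx)
{
    static const char *const names[8] = {
        "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
    };
    return names[idx & 7];
}

/* Returns 0 on success, -1 if the bytes are not the one form handled here. */
int decode_mov_store(const uint8_t *code, size_t len, struct mov_store *out)
{
    if (len < 6 || code[0] != 0x89)      /* 0x89: MOV r/m32, r32 */
        return -1;

    unsigned mod = code[1] >> 6;         /* addressing mode */
    unsigned reg = (code[1] >> 3) & 7;   /* source register */
    unsigned rm  = code[1] & 7;          /* base register */

    if (mod != 2 || rm == 4)             /* want disp32, and no SIB byte */
        return -1;

    out->src = reg;
    out->base = rm;
    /* The displacement is stored little-endian after the ModRM byte. */
    out->disp = (uint32_t)code[2] | ((uint32_t)code[3] << 8) |
                ((uint32_t)code[4] << 16) | ((uint32_t)code[5] << 24);
    return 0;
}
```

For 89 85 3c 01 00 00 this yields a store of %eax to 0x13c(%ebp). With ebp: 00000000 in the register dump the target is 0x13c, the faulting address reported. The NUMA-Q oops earlier in the thread begins with 89 86, the same store with %esi as the base, and esi is 00000000 there.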

2003-02-05 08:02:28

by Mike Galbraith

Subject: Re: 2.5.59-mm8

At 11:31 PM 2/3/2003 -0800, Andrew Morton wrote:

>. Ingo's latest scheduler changes are here. I held off on that because it
> appeared that there was some interaction with the I/O scheduler. Whatever
> that was has gone away without any CPU scheduler changes, so...

Greetings,

The scheduler changes cause some odd behavior here when running make -j30
bzImage on my 128MB PIII/500 box.

It seems odd that I'm not getting the level of memory pressure I expect to
get. As you can see by the log, at times I _do_ briefly see 'proper'
memory pressure. I'm still trying to figure out if this is good or bad
behavior... all I know for _sure_ is that it's odd=reportable behavior ;-)

-Mike


Attachments:
log.txt (9.75 kB)

2003-02-06 05:04:03

by Martin J. Bligh

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

>> There are a lot of scsi updates in Linus's tree. Can you please
>> test just
>>
>> http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.59/2.5.59-mm8/broken-out/linus.patch
>
> Yup, the SCSI code in Linus' tree has broken since 2.5.59.
> I reproduced this on my 4-way SMP machine (panic from that below),
> so it's not just NUMA-Q weirdness ;-)
>
> M.

elm3b13:~/linux/2.5.59-linus# addr2line -e vmlinux c01c1986
/root/linux/2.5.59-linus/drivers/scsi/qlogicisp.c:632

which is the readw of:

static inline u_short isp_inw(struct Scsi_Host *host, long offset)
{
        struct isp1020_hostdata *h = (struct isp1020_hostdata *)host->hostdata;
        if (h->memaddr)
                return readw(h->memaddr + offset);
        else
                return inw(host->io_port + offset);
}

> Unable to handle kernel NULL pointer dereference at virtual address 0000013c
> printing eip:
> c01c1986
> *pde = 00000000
> Oops: 0002
> CPU: 3
> EIP: 0060:[<c01c1986>] Not tainted
> EFLAGS: 00010046
> EIP is at isp1020_intr_handler+0x1e6/0x290
> eax: 00000000 ebx: f7c42080 ecx: 00000000 edx: 00000054
> esi: 00000002 edi: 00000013 ebp: 00000000 esp: f7f97efc
> ds: 007b es: 007b ss: 0068
> Process swapper (pid: 0, threadinfo=f7f96000 task=f7f9d240)
> Stack: f7c42080 f7c52800 00000002 00000013 f7f97f80 00000003 00000003 f7c5289c
> f7c52800 c01c1791 00000013 f7c52800 f7f97f80 f7ffe1e0 24000001 c010a815
> 00000013 f7c52800 f7f97f80 c028fa60 00000260 00000013 f7f97f78 c010a9e6
> Call Trace:
> [<c01c1791>] do_isp1020_intr_handler+0x25/0x34
> [<c010a815>] handle_IRQ_event+0x29/0x4c
> [<c010a9e6>] do_IRQ+0x96/0x100
> [<c0106ca0>] default_idle+0x0/0x34
> [<c01094a8>] common_interrupt+0x18/0x20
> [<c0106ca0>] default_idle+0x0/0x34
> [<c0106cc9>] default_idle+0x29/0x34
> [<c0106d53>] cpu_idle+0x37/0x48
> [<c0119d21>] printk+0x149/0x160
>
> Code: 89 85 3c 01 00 00 83 c4 04 eb 0a c7 85 3c 01 00 00 00 00 07
> <0>Kernel panic: Aiee, killing interrupt handler!
> In interrupt handler - not syncing


2003-02-06 20:41:38

by Martin J. Bligh

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

OK, I threw a little bit of debug in there:
I'd show you the code, except it just ate my root filesystem.
Likelihood of me doing further research is thus small.

At the start of isp1020_intr_handler it basically checked if host, hostdata,
memaddr, or memaddr+MBOX5 was < 0xC0000000UL, and printk'ed if so.
It didn't printk

at first interrupt:

isp1020_intr_handler: host=f7c52800, hostdata=f7c5289c, memaddr=f8802000, MBOX5=0000007a, readaddr = f880207a

then later it panic'ed without hitting the debug (or at least no printk)

Unable to handle kernel NULL pointer dereference at virtual address 0000013c
printing eip:
c01c19f6
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0060:[<c01c19f6>] Not tainted
EFLAGS: 00010046
EIP is at isp1020_intr_handler+0x256/0x300
eax: 00000000 ebx: f7c42100 ecx: 00000000 edx: 00000080
esi: 00000002 edi: 00000013 ebp: 00000000 esp: c02aff20
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c02ae000 task=c0260140)
Stack: f7c42100 f7c52800 00000002 00000013 c02affa4 00000005 00000005 f7c5289c
f7c52800 c01c1791 00000013 f7c52800 c02affa4 f7ffe1e0 24000001 c010a815
00000013 f7c52800 c02affa4 c028fa60 00000260 00000013 c02aff9c c010a9e6
Call Trace:
[<c01c1791>] do_isp1020_intr_handler+0x25/0x34
[<c010a815>] handle_IRQ_event+0x29/0x4c
[<c010a9e6>] do_IRQ+0x96/0x100
[<c0106ca0>] default_idle+0x0/0x34
[<c0105000>] _stext+0x0/0x48
[<c01094a8>] common_interrupt+0x18/0x20
[<c0106ca0>] default_idle+0x0/0x34
[<c0105000>] _stext+0x0/0x48
[<c0106cc9>] default_idle+0x29/0x34
[<c0106d53>] cpu_idle+0x37/0x48
[<c0105045>] _stext+0x45/0x48

Code: 89 85 3c 01 00 00 83 c4 04 eb 0a c7 85 3c 01 00 00 00 00 07
<0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
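
The debug code described at the top of this message was lost, so the following is only a guess at its shape, written as a userspace C sketch. It flags any value below 0xC0000000UL, the bound quoted above. count_suspect_ptrs, struct named_ptr and the KERNEL_BASE macro are hypothetical names, and fprintf stands in for printk.

```c
#include <stdio.h>

#define KERNEL_BASE 0xC0000000UL  /* bound quoted in the message above */

/* A pointer value together with the name to print for it. */
struct named_ptr {
    const char *name;
    unsigned long value;
};

/* Report every value below KERNEL_BASE and return how many were found. */
int count_suspect_ptrs(const char *where, const struct named_ptr *p, int n)
{
    int bad = 0;

    for (int i = 0; i < n; i++) {
        if (p[i].value < KERNEL_BASE) {
            fprintf(stderr, "%s: suspicious %s=%08lx\n",
                    where, p[i].name, p[i].value);
            bad++;
        }
    }
    return bad;
}
```

Fed the four values from the debug line above (host, hostdata, memaddr, readaddr) the function reports nothing, which matches the observation that the check never fired. A value such as 0x13c would be flagged, but only if the pointer it came from is one of those being checked.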



2003-02-06 22:20:56

by Martin J. Bligh

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

> OK, I threw a little bit of debug in there:
> I'd show you the code, except it just ate my root filesystem.
> Likelihood of me doing further research is thus small.


Hmmmm .... I did a disassembly of this on a similar machine (see end of email).
The data seems to contradict what I was looking at previously ....
Not sure what happened, but this set makes much more sense,
as it leads to 13c in the offset ;-)

0xc01c1ac6 <isp1020_intr_handler+486>: mov %eax,0x13c(%ebp)
which is drivers/scsi/qlogicisp.c:1051

Cmnd->result = isp1020_return_status(sts);

seemingly Cmnd is null ... this is in
while (out_ptr != in_ptr) {
        u_int cmd_slot;

        sts = (struct Status_Entry *) &hostdata->res_cpu[out_ptr];
        out_ptr = (out_ptr + 1) & RES_QUEUE_LEN;

        cmd_slot = sts->handle;
        Cmnd = hostdata->cmd_slots[cmd_slot];
        hostdata->cmd_slots[cmd_slot] = NULL;

        TRACE("done", out_ptr, Cmnd);

        if (le16_to_cpu(sts->completion_status) == CS_RESET_OCCURRED
            || le16_to_cpu(sts->completion_status) == CS_ABORTED
            || (le16_to_cpu(sts->status_flags) & STF_BUS_RESET))
                hostdata->send_marker = 1;

        if (le16_to_cpu(sts->state_flags) & SF_GOT_SENSE)
                memcpy(Cmnd->sense_buffer, sts->req_sense_data,
                       sizeof(Cmnd->sense_buffer));

        DEBUG_INTR(isp1020_print_status_entry(sts));

        if (sts->hdr.entry_type == ENTRY_STATUS)
                Cmnd->result = isp1020_return_status(sts);
        else
                Cmnd->result = DID_ERROR << 16;

        if (Cmnd->use_sg)
                pci_unmap_sg(hostdata->pci_dev,
                             (struct scatterlist *)Cmnd->buffer,
                             Cmnd->use_sg,
                             scsi_to_pci_dma_dir(Cmnd->sc_data_direction));
        else if (Cmnd->request_bufflen)
                pci_unmap_single(hostdata->pci_dev,
#ifdef CONFIG_QL_ISP_A64
                                 (dma_addr_t)((long)Cmnd->SCp.ptr),
#else
                                 (u32)((long)Cmnd->SCp.ptr),
#endif
                                 Cmnd->request_bufflen,
                                 scsi_to_pci_dma_dir(Cmnd->sc_data_direction));

        isp_outw(out_ptr, host, MBOX5);
        (*Cmnd->scsi_done)(Cmnd);
}
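
The arithmetic behind "seemingly Cmnd is null" can be sketched in ordinary C: a store to a member through a base pointer lands at the base plus the member's offset, so a NULL base faults at the offset itself. The structure below is a hypothetical stand-in, padded so that result sits at 0x13c; the real layout is inferred from the disassembly here, not checked against the SCSI headers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the SCSI command structure. Only the offset
 * of 'result' matters; everything before it is collapsed into padding.
 */
struct fake_cmnd {
    unsigned char earlier_fields[0x13c];
    int result;
};

/* Address touched by a store to a member, given the base pointer value. */
uintptr_t member_address(uintptr_t base, size_t member_offset)
{
    return base + member_offset;
}
```

With a base of 0, member_address(0, offsetof(struct fake_cmnd, result)) is 0x13c, the address in the "Unable to handle kernel NULL pointer dereference" line. A valid base such as 0xf7c42080 would instead give an address well inside kernel space.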

Changes in this patch to qlogicisp.c were as below. Looks suspiciously
close to the problem area to me, but I don't understand it enough to
say for sure (if this wasn't related to some SCSI subsystem change,
can I just revert out this section?)

M.

# drivers/scsi/qlogicisp.c 1.15 -> 1.17
diff -Nru a/drivers/scsi/qlogicisp.c b/drivers/scsi/qlogicisp.c
--- a/drivers/scsi/qlogicisp.c Mon Feb 3 21:31:47 2003
+++ b/drivers/scsi/qlogicisp.c Mon Feb 3 21:31:47 2003
@@ -802,7 +802,7 @@

ENTER("isp1020_queuecommand");

- host = Cmnd->host;
+ host = Cmnd->device->host;
hostdata = (struct isp1020_hostdata *) host->hostdata;
Cmnd->scsi_done = done;

@@ -853,8 +853,8 @@
cmd->hdr.entry_type = ENTRY_COMMAND;
cmd->hdr.entry_cnt = 1;

- cmd->target_lun = Cmnd->lun;
- cmd->target_id = Cmnd->target;
+ cmd->target_lun = Cmnd->device->lun;
+ cmd->target_id = Cmnd->device->id;
cmd->cdb_length = cpu_to_le16(Cmnd->cmd_len);
cmd->control_flags = cpu_to_le16(CFLAG_READ | CFLAG_WRITE);
cmd->time_out = cpu_to_le16(30);
@@ -1175,7 +1175,7 @@

ENTER("isp1020_abort");

- host = Cmnd->host;
+ host = Cmnd->device->host;
hostdata = (struct isp1020_hostdata *) host->hostdata;

for (i = 0; i < QLOGICISP_REQ_QUEUE_LEN + 1; i++)
@@ -1186,7 +1186,7 @@
isp1020_disable_irqs(host);

param[0] = MBOX_ABORT;
- param[1] = (((u_short) Cmnd->target) << 8) | Cmnd->lun;
+ param[1] = (((u_short) Cmnd->device->id) << 8) | Cmnd->device->lun;
param[2] = cmd_cookie >> 16;
param[3] = cmd_cookie & 0xffff;

@@ -1214,7 +1214,7 @@

ENTER("isp1020_reset");

- host = Cmnd->host;
+ host = Cmnd->device->host;
hostdata = (struct isp1020_hostdata *) host->hostdata;

param[0] = MBOX_BUS_RESET;


>> Unable to handle kernel NULL pointer dereference at virtual address 0000013c
>> printing eip:
>> c01c1986
>> *pde = 00000000
>> Oops: 0002
>> CPU: 3
>> EIP: 0060:[<c01c1986>] Not tainted
>> EFLAGS: 00010046
>> EIP is at isp1020_intr_handler+0x1e6/0x290
>> eax: 00000000 ebx: f7c42080 ecx: 00000000 edx: 00000054
>> esi: 00000002 edi: 00000013 ebp: 00000000 esp: f7f97efc
>> ds: 007b es: 007b ss: 0068
>> Process swapper (pid: 0, threadinfo=f7f96000 task=f7f9d240)
>> Stack: f7c42080 f7c52800 00000002 00000013 f7f97f80 00000003 00000003
>> f7c5289c f7c52800 c01c1791 00000013 f7c52800 f7f97f80 f7ffe1e0
>> 24000001 c010a815 00000013 f7c52800 f7f97f80 c028fa60 00000260
>> 00000013 f7f97f78 c010a9e6
>> Call Trace:
>> [<c01c1791>] do_isp1020_intr_handler+0x25/0x34
>> [<c010a815>] handle_IRQ_event+0x29/0x4c
>> [<c010a9e6>] do_IRQ+0x96/0x100
>> [<c0106ca0>] default_idle+0x0/0x34
>> [<c01094a8>] common_interrupt+0x18/0x20
>> [<c0106ca0>] default_idle+0x0/0x34
>> [<c0106cc9>] default_idle+0x29/0x34
>> [<c0106d53>] cpu_idle+0x37/0x48
>> [<c0119d21>] printk+0x149/0x160
>>
>> Code: 89 85 3c 01 00 00 83 c4 04 eb 0a c7 85 3c 01 00 00 00 00 07
>> <0>Kernel panic: Aiee, killing interrupt handler!
>> In interrupt handler - not syncing

Dump of assembler code for function isp1020_intr_handler:
0xc01c18e0 <isp1020_intr_handler>: sub $0x10,%esp
0xc01c18e3 <isp1020_intr_handler+3>: push %ebp
0xc01c18e4 <isp1020_intr_handler+4>: push %edi
0xc01c18e5 <isp1020_intr_handler+5>: push %esi
0xc01c18e6 <isp1020_intr_handler+6>: push %ebx
0xc01c18e7 <isp1020_intr_handler+7>: mov 0x28(%esp,1),%eax
0xc01c18eb <isp1020_intr_handler+11>: mov %eax,0x1c(%esp,1)
0xc01c18ef <isp1020_intr_handler+15>: mov 0x1c(%esp,1),%edx
0xc01c18f3 <isp1020_intr_handler+19>: add $0x9c,%eax
0xc01c18f8 <isp1020_intr_handler+24>: mov %eax,0x18(%esp,1)
0xc01c18fc <isp1020_intr_handler+28>: mov 0x9c(%edx),%eax
0xc01c1902 <isp1020_intr_handler+34>: test %eax,%eax
0xc01c1904 <isp1020_intr_handler+36>:
je 0xc01c1910 <isp1020_intr_handler+48>
0xc01c1906 <isp1020_intr_handler+38>: movzwl 0xa(%eax),%eax
0xc01c190a <isp1020_intr_handler+42>:
jmp 0xc01c191c <isp1020_intr_handler+60>
0xc01c190c <isp1020_intr_handler+44>: lea 0x0(%esi,1),%esi
0xc01c1910 <isp1020_intr_handler+48>: mov 0x1c(%esp,1),%eax
0xc01c1914 <isp1020_intr_handler+52>: mov 0x6c(%eax),%edx
0xc01c1917 <isp1020_intr_handler+55>: add $0xa,%edx
0xc01c191a <isp1020_intr_handler+58>: in (%dx),%ax
0xc01c191c <isp1020_intr_handler+60>: test $0x4,%al
0xc01c191e <isp1020_intr_handler+62>:
je 0xc01c1b66 <isp1020_intr_handler+646>
0xc01c1924 <isp1020_intr_handler+68>: mov 0x1c(%esp,1),%edx
0xc01c1928 <isp1020_intr_handler+72>: mov 0x9c(%edx),%eax
0xc01c192e <isp1020_intr_handler+78>: test %eax,%eax
0xc01c1930 <isp1020_intr_handler+80>:
je 0xc01c1938 <isp1020_intr_handler+88>
0xc01c1932 <isp1020_intr_handler+82>: movzwl 0x7a(%eax),%eax
0xc01c1936 <isp1020_intr_handler+86>:
jmp 0xc01c1944 <isp1020_intr_handler+100>
0xc01c1938 <isp1020_intr_handler+88>: mov 0x1c(%esp,1),%eax
0xc01c193c <isp1020_intr_handler+92>: mov 0x6c(%eax),%edx
0xc01c193f <isp1020_intr_handler+95>: add $0x7a,%edx
0xc01c1942 <isp1020_intr_handler+98>: in (%dx),%ax
0xc01c1944 <isp1020_intr_handler+100>: mov 0x1c(%esp,1),%edx
0xc01c1948 <isp1020_intr_handler+104>: movzwl %ax,%eax
0xc01c194b <isp1020_intr_handler+107>: mov %eax,0x14(%esp,1)
0xc01c194f <isp1020_intr_handler+111>: mov $0x7000,%ecx
0xc01c1954 <isp1020_intr_handler+116>: mov 0x9c(%edx),%eax
0xc01c195a <isp1020_intr_handler+122>: test %eax,%eax
0xc01c195c <isp1020_intr_handler+124>:
je 0xc01c1967 <isp1020_intr_handler+135>
0xc01c195e <isp1020_intr_handler+126>: mov %cx,0xc0(%eax)
0xc01c1965 <isp1020_intr_handler+133>:
jmp 0xc01c1978 <isp1020_intr_handler+152>
0xc01c1967 <isp1020_intr_handler+135>: mov 0x1c(%esp,1),%eax
0xc01c196b <isp1020_intr_handler+139>: mov 0x6c(%eax),%edx
0xc01c196e <isp1020_intr_handler+142>: add $0xc0,%edx
0xc01c1974 <isp1020_intr_handler+148>: mov %ecx,%eax
0xc01c1976 <isp1020_intr_handler+150>: out %ax,(%dx)
0xc01c1978 <isp1020_intr_handler+152>: mov 0x1c(%esp,1),%edx
0xc01c197c <isp1020_intr_handler+156>: mov 0x9c(%edx),%eax
0xc01c1982 <isp1020_intr_handler+162>: test %eax,%eax
0xc01c1984 <isp1020_intr_handler+164>:
je 0xc01c1990 <isp1020_intr_handler+176>
0xc01c1986 <isp1020_intr_handler+166>: movzwl 0xc(%eax),%eax
0xc01c198a <isp1020_intr_handler+170>:
jmp 0xc01c199c <isp1020_intr_handler+188>
0xc01c198c <isp1020_intr_handler+172>: lea 0x0(%esi,1),%esi
0xc01c1990 <isp1020_intr_handler+176>: mov 0x1c(%esp,1),%eax
0xc01c1994 <isp1020_intr_handler+180>: mov 0x6c(%eax),%edx
0xc01c1997 <isp1020_intr_handler+183>: add $0xc,%edx
0xc01c199a <isp1020_intr_handler+186>: in (%dx),%ax
0xc01c199c <isp1020_intr_handler+188>: test $0x1,%al
0xc01c199e <isp1020_intr_handler+190>:
je 0xc01c1a34 <isp1020_intr_handler+340>
0xc01c19a4 <isp1020_intr_handler+196>: mov 0x1c(%esp,1),%edx
0xc01c19a8 <isp1020_intr_handler+200>: mov 0x9c(%edx),%eax
0xc01c19ae <isp1020_intr_handler+206>: test %eax,%eax
0xc01c19b0 <isp1020_intr_handler+208>:
je 0xc01c19b8 <isp1020_intr_handler+216>
0xc01c19b2 <isp1020_intr_handler+210>: movzwl 0x70(%eax),%eax
0xc01c19b6 <isp1020_intr_handler+214>:
jmp 0xc01c19c4 <isp1020_intr_handler+228>
0xc01c19b8 <isp1020_intr_handler+216>: mov 0x1c(%esp,1),%eax
0xc01c19bc <isp1020_intr_handler+220>: mov 0x6c(%eax),%edx
0xc01c19bf <isp1020_intr_handler+223>: add $0x70,%edx
0xc01c19c2 <isp1020_intr_handler+226>: in (%dx),%ax
0xc01c19c4 <isp1020_intr_handler+228>: movzwl %ax,%eax
0xc01c19c7 <isp1020_intr_handler+231>: cmp $0x4006,%eax
0xc01c19cc <isp1020_intr_handler+236>:
jg 0xc01c19e5 <isp1020_intr_handler+261>
0xc01c19ce <isp1020_intr_handler+238>: cmp $0x4005,%eax
0xc01c19d3 <isp1020_intr_handler+243>:
jge 0xc01c1a03 <isp1020_intr_handler+291>
0xc01c19d5 <isp1020_intr_handler+245>: cmp $0x4002,%eax
0xc01c19da <isp1020_intr_handler+250>:
jg 0xc01c1a10 <isp1020_intr_handler+304>
0xc01c19dc <isp1020_intr_handler+252>: cmp $0x4001,%eax
0xc01c19e1 <isp1020_intr_handler+257>:
jl 0xc01c1a10 <isp1020_intr_handler+304>
0xc01c19e3 <isp1020_intr_handler+259>:
jmp 0xc01c1a03 <isp1020_intr_handler+291>
0xc01c19e5 <isp1020_intr_handler+261>: cmp $0x8001,%eax
0xc01c19ea <isp1020_intr_handler+266>:
je 0xc01c19f3 <isp1020_intr_handler+275>
0xc01c19ec <isp1020_intr_handler+268>: cmp $0x8006,%eax
0xc01c19f1 <isp1020_intr_handler+273>:
jne 0xc01c1a10 <isp1020_intr_handler+304>
0xc01c19f3 <isp1020_intr_handler+275>: mov 0x18(%esp,1),%edx
0xc01c19f7 <isp1020_intr_handler+279>: movl $0x1,0xf8(%edx)
0xc01c1a01 <isp1020_intr_handler+289>:
jmp 0xc01c1a10 <isp1020_intr_handler+304>
0xc01c1a03 <isp1020_intr_handler+291>: push $0xc0246f20
0xc01c1a08 <isp1020_intr_handler+296>: call 0xc0119bd8 <printk>
0xc01c1a0d <isp1020_intr_handler+301>: add $0x4,%esp
0xc01c1a10 <isp1020_intr_handler+304>: mov 0x1c(%esp,1),%edx
0xc01c1a14 <isp1020_intr_handler+308>: mov 0x9c(%edx),%eax
0xc01c1a1a <isp1020_intr_handler+314>: test %eax,%eax
0xc01c1a1c <isp1020_intr_handler+316>:
je 0xc01c1a26 <isp1020_intr_handler+326>
0xc01c1a1e <isp1020_intr_handler+318>: movw $0x0,0xc(%eax)
0xc01c1a24 <isp1020_intr_handler+324>:
jmp 0xc01c1a34 <isp1020_intr_handler+340>
0xc01c1a26 <isp1020_intr_handler+326>: mov 0x1c(%esp,1),%eax
0xc01c1a2a <isp1020_intr_handler+330>: mov 0x6c(%eax),%edx
0xc01c1a2d <isp1020_intr_handler+333>: add $0xc,%edx
0xc01c1a30 <isp1020_intr_handler+336>: xor %eax,%eax
0xc01c1a32 <isp1020_intr_handler+338>: out %ax,(%dx)
0xc01c1a34 <isp1020_intr_handler+340>: mov 0x18(%esp,1),%edx
0xc01c1a38 <isp1020_intr_handler+344>: mov 0x14(%esp,1),%eax
0xc01c1a3c <isp1020_intr_handler+348>: mov 0xf4(%edx),%edx
0xc01c1a42 <isp1020_intr_handler+354>: mov %edx,0x10(%esp,1)
0xc01c1a46 <isp1020_intr_handler+358>: cmp %eax,%edx
0xc01c1a48 <isp1020_intr_handler+360>:
je 0xc01c1b58 <isp1020_intr_handler+632>
0xc01c1a4e <isp1020_intr_handler+366>: mov %esi,%esi
0xc01c1a50 <isp1020_intr_handler+368>: mov 0x10(%esp,1),%ebx
0xc01c1a54 <isp1020_intr_handler+372>: mov 0x18(%esp,1),%edx
0xc01c1a58 <isp1020_intr_handler+376>: mov 0x18(%esp,1),%eax
0xc01c1a5c <isp1020_intr_handler+380>: shl $0x6,%ebx
0xc01c1a5f <isp1020_intr_handler+383>: add $0xfc,%eax
0xc01c1a64 <isp1020_intr_handler+388>: add 0xe8(%edx),%ebx
0xc01c1a6a <isp1020_intr_handler+394>: incl 0x10(%esp,1)
0xc01c1a6e <isp1020_intr_handler+398>: andl $0x7,0x10(%esp,1)
0xc01c1a73 <isp1020_intr_handler+403>: mov 0x4(%ebx),%edx
0xc01c1a76 <isp1020_intr_handler+406>: shl $0x2,%edx
0xc01c1a79 <isp1020_intr_handler+409>: mov (%edx,%eax,1),%ebp
0xc01c1a7c <isp1020_intr_handler+412>: movl $0x0,(%edx,%eax,1)
0xc01c1a83 <isp1020_intr_handler+419>: movzwl 0xa(%ebx),%eax
0xc01c1a87 <isp1020_intr_handler+423>: add $0xfffffffc,%ax
0xc01c1a8b <isp1020_intr_handler+427>: cmp $0x1,%ax
0xc01c1a8f <isp1020_intr_handler+431>:
jbe 0xc01c1a97 <isp1020_intr_handler+439>
0xc01c1a91 <isp1020_intr_handler+433>: testb $0x8,0xe(%ebx)
0xc01c1a95 <isp1020_intr_handler+437>:
je 0xc01c1aa5 <isp1020_intr_handler+453>
0xc01c1a97 <isp1020_intr_handler+439>: mov 0x18(%esp,1),%eax
0xc01c1a9b <isp1020_intr_handler+443>: movl $0x1,0xf8(%eax)
0xc01c1aa5 <isp1020_intr_handler+453>: testb $0x20,0xd(%ebx)
0xc01c1aa9 <isp1020_intr_handler+457>:
je 0xc01c1abb <isp1020_intr_handler+475>
0xc01c1aab <isp1020_intr_handler+459>: lea 0xc0(%ebp),%edi
0xc01c1ab1 <isp1020_intr_handler+465>: lea 0x20(%ebx),%esi
0xc01c1ab4 <isp1020_intr_handler+468>: mov $0x10,%ecx
0xc01c1ab9 <isp1020_intr_handler+473>: repz movsl %ds:(%esi),%es:(%edi)
0xc01c1abb <isp1020_intr_handler+475>: cmpb $0x3,(%ebx)
0xc01c1abe <isp1020_intr_handler+478>:
jne 0xc01c1ad1 <isp1020_intr_handler+497>
0xc01c1ac0 <isp1020_intr_handler+480>: push %ebx
0xc01c1ac1 <isp1020_intr_handler+481>:
call 0xc01c1b70 <isp1020_return_status>
0xc01c1ac6 <isp1020_intr_handler+486>: mov %eax,0x13c(%ebp)
0xc01c1acc <isp1020_intr_handler+492>: add $0x4,%esp
0xc01c1acf <isp1020_intr_handler+495>:
jmp 0xc01c1adb <isp1020_intr_handler+507>
0xc01c1ad1 <isp1020_intr_handler+497>: movl $0x70000,0x13c(%ebp)
0xc01c1adb <isp1020_intr_handler+507>: cmpw $0x0,0x9e(%ebp)
0xc01c1ae3 <isp1020_intr_handler+515>:
je 0xc01c1af5 <isp1020_intr_handler+533>
0xc01c1ae5 <isp1020_intr_handler+517>: cmpb $0x3,0x52(%ebp)
0xc01c1ae9 <isp1020_intr_handler+521>:
jne 0xc01c1b10 <isp1020_intr_handler+560>
0xc01c1aeb <isp1020_intr_handler+523>: ud2a
0xc01c1aed <isp1020_intr_handler+525>: inc %ebp
0xc01c1aee <isp1020_intr_handler+526>: add %ch,%bl
0xc01c1af0 <isp1020_intr_handler+528>: out %al,(%dx)
0xc01c1af1 <isp1020_intr_handler+529>: and %eax,%eax
0xc01c1af3 <isp1020_intr_handler+531>:
jmp 0xc01c1b10 <isp1020_intr_handler+560>
0xc01c1af5 <isp1020_intr_handler+533>: cmpl $0x0,0x64(%ebp)
0xc01c1af9 <isp1020_intr_handler+537>:
je 0xc01c1b10 <isp1020_intr_handler+560>
0xc01c1afb <isp1020_intr_handler+539>: cmpb $0x3,0x52(%ebp)
0xc01c1aff <isp1020_intr_handler+543>:
jne 0xc01c1b10 <isp1020_intr_handler+560>
0xc01c1b01 <isp1020_intr_handler+545>: ud2a
0xc01c1b03 <isp1020_intr_handler+547>: sbb $0x0,%al
0xc01c1b05 <isp1020_intr_handler+549>:
jmp 0xc01c1af5 <isp1020_intr_handler+533>
0xc01c1b07 <isp1020_intr_handler+551>: and %eax,%eax
0xc01c1b09 <isp1020_intr_handler+553>: lea 0x0(%esi,1),%esi
0xc01c1b10 <isp1020_intr_handler+560>: mov 0x1c(%esp,1),%edx
0xc01c1b14 <isp1020_intr_handler+564>: mov 0x10(%esp,1),%ecx
0xc01c1b18 <isp1020_intr_handler+568>: mov 0x9c(%edx),%eax
0xc01c1b1e <isp1020_intr_handler+574>: test %eax,%eax
0xc01c1b20 <isp1020_intr_handler+576>:
je 0xc01c1b30 <isp1020_intr_handler+592>
0xc01c1b22 <isp1020_intr_handler+578>: mov %ecx,%edx
0xc01c1b24 <isp1020_intr_handler+580>: mov %dx,0x7a(%eax)
0xc01c1b28 <isp1020_intr_handler+584>:
jmp 0xc01c1b3e <isp1020_intr_handler+606>
0xc01c1b2a <isp1020_intr_handler+586>: lea 0x0(%esi),%esi
0xc01c1b30 <isp1020_intr_handler+592>: mov 0x1c(%esp,1),%eax
0xc01c1b34 <isp1020_intr_handler+596>: mov 0x6c(%eax),%edx
0xc01c1b37 <isp1020_intr_handler+599>: add $0x7a,%edx
0xc01c1b3a <isp1020_intr_handler+602>: mov %ecx,%eax
0xc01c1b3c <isp1020_intr_handler+604>: out %ax,(%dx)
0xc01c1b3e <isp1020_intr_handler+606>: push %ebp
0xc01c1b3f <isp1020_intr_handler+607>: mov 0x108(%ebp),%eax
0xc01c1b45 <isp1020_intr_handler+613>: call *%eax
0xc01c1b47 <isp1020_intr_handler+615>: add $0x4,%esp
0xc01c1b4a <isp1020_intr_handler+618>: mov 0x14(%esp,1),%edx
0xc01c1b4e <isp1020_intr_handler+622>: cmp %edx,0x10(%esp,1)
0xc01c1b52 <isp1020_intr_handler+626>:
jne 0xc01c1a50 <isp1020_intr_handler+368>
0xc01c1b58 <isp1020_intr_handler+632>: mov 0x10(%esp,1),%edx
0xc01c1b5c <isp1020_intr_handler+636>: mov 0x18(%esp,1),%eax
0xc01c1b60 <isp1020_intr_handler+640>: mov %edx,0xf4(%eax)
0xc01c1b66 <isp1020_intr_handler+646>: pop %ebx
0xc01c1b67 <isp1020_intr_handler+647>: pop %esi
0xc01c1b68 <isp1020_intr_handler+648>: pop %edi
0xc01c1b69 <isp1020_intr_handler+649>: pop %ebp
0xc01c1b6a <isp1020_intr_handler+650>: add $0x10,%esp
0xc01c1b6d <isp1020_intr_handler+653>: ret
End of assembler dump.

2003-02-06 23:15:55

by James Bottomley

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

On Thu, 2003-02-06 at 16:30, Martin J. Bligh wrote:
> > OK, I threw a little bit of debug in there:
> > I'd show you the code, except it just ate my root filesystem.
> > Likelihood of me doing further research is thus small.
>
>
> Hmmmm .... did a disassemble of this on a similar machine (see end of email)
> data seems to contradict what I was looking at previously ....
> not sure what happened, but this set makes much more sense,
> as it leads to 13c in the offset ;-)
>
> 0xc01c1ac6 <isp1020_intr_handler+486>: mov %eax,0x13c(%ebp)
> which is drivers/scsi/qlogicisp.c:1051
>
> Cmnd->result = isp1020_return_status(sts);
>
> seemingly Cmnd is null ... this is in
[...]

That looks more like it.
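
The link between the faulting address and a NULL Cmnd can be checked with a little arithmetic: a store to a field through a NULL struct pointer faults at that field's offset, so an oops at 0000013c is what a write to a member 0x13c bytes into the structure would produce. A minimal user-space sketch follows. `struct fake_cmnd` is a hypothetical stand-in padded so that `result` lands at 0x13c; it is not the real Scsi_Cmnd layout, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Scsi_Cmnd: only the offset of `result`
 * matters here, so everything before it is opaque padding. */
struct fake_cmnd {
    unsigned char pad[0x13c];
    int result;
};

/* Address the CPU would touch for `base->result = ...`. */
static uintptr_t fault_address(uintptr_t base)
{
    return base + offsetof(struct fake_cmnd, result);
}

/* A fault exactly at the member offset implies the base pointer was 0. */
static int base_was_null(uintptr_t fault_addr)
{
    return fault_addr == offsetof(struct fake_cmnd, result);
}

/* Sanity check mirroring the oops: NULL base, fault at 0x13c. */
static void check_oops_arithmetic(void)
{
    assert(fault_address(0) == 0x13c);
    assert(base_was_null(0x13c));
}
```

The same reasoning matches the disassembly line `mov %eax,0x13c(%ebp)` with ebp equal to 00000000 in the register dump.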

My guess is that the command slot was emptied previously, but I don't
understand enough about the mailbox specifics of the isp1020 to be sure.

Can you try adding

        if (!Cmnd) {
                printk(KERN_ERR "isp1020 Cmnd is NULL for slot %d, out_ptr %d\n",
                       cmd_slot, out_ptr);
                continue;
        }

Just below the Cmnd = hostdata->cmd_slots[cmd_slot];
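
As a rough user-space sketch of where that check sits (this is not the driver code): the loop below drains a response ring, looks up the command by the handle stored in each entry, clears the slot, and skips any entry whose slot is already empty. All names here (`fake_host`, `fake_status`, `drain_responses`, the 8-entry ring) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define RING_MASK 7   /* assumed: 8-entry ring, index wraps with a mask */
#define NUM_SLOTS 8   /* assumed size of the command slot table */

struct fake_cmnd { int result; };                  /* hypothetical command */
struct fake_status { unsigned handle; int status; };

struct fake_host {
    struct fake_status ring[RING_MASK + 1];
    struct fake_cmnd *slots[NUM_SLOTS];
    unsigned out_ptr;
};

/* Drain entries from out_ptr up to in_ptr. Returns the number of
 * entries skipped because their command slot was already empty. */
static int drain_responses(struct fake_host *h, unsigned in_ptr)
{
    int skipped = 0;

    while (h->out_ptr != in_ptr) {
        struct fake_status *sts = &h->ring[h->out_ptr];
        unsigned slot = sts->handle;
        struct fake_cmnd *cmd;

        /* Advance first, so `continue` cannot spin on the same entry. */
        h->out_ptr = (h->out_ptr + 1) & RING_MASK;

        cmd = h->slots[slot];
        h->slots[slot] = NULL;

        if (!cmd) {
            fprintf(stderr, "cmd is NULL for slot %u, out_ptr %u\n",
                    slot, h->out_ptr);
            skipped++;
            continue;
        }
        cmd->result = sts->status;
    }
    return skipped;
}
```

With two ring entries that both name the same slot, the first completes the command and the second is logged and skipped instead of dereferencing NULL. The guard only hides the symptom; it does not explain why a slot was emptied twice.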

> say for sure (if this wasn't related to some SCSI subsystem change,
> can I just revert out this section?)

No, I'm afraid not. That was just the elimination of those fields from
Scsi_Cmnd so now it has to be indirect through cmnd->device. It won't
compile without this.
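
To illustrate the shape of that change, here is a small sketch with heavily trimmed, hypothetical structures (`fake_host`, `fake_device`, `fake_cmnd` are not the real kernel definitions). The host, target id and LUN are reached through the command's device pointer instead of being stored in the command itself, which is why the diff only rewrites field accesses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed models; the real structures carry many more fields. */
struct fake_host { int host_no; };

struct fake_device {
    struct fake_host *host;
    unsigned id;
    unsigned lun;
};

struct fake_cmnd {
    struct fake_device *device;   /* addressing now lives behind this pointer */
};

/* Equivalent of `host = Cmnd->device->host` in the diff. */
static struct fake_host *cmd_host(const struct fake_cmnd *cmd)
{
    assert(cmd != NULL && cmd->device != NULL);
    return cmd->device->host;
}

/* Mirrors param[1] in the abort path: target id in the high byte,
 * LUN in the low byte. */
static unsigned short abort_param(const struct fake_cmnd *cmd)
{
    assert(cmd != NULL && cmd->device != NULL);
    return (unsigned short)((cmd->device->id << 8) | cmd->device->lun);
}
```

Nothing about command lifetime or slot handling changes in this model, which is consistent with the diff being a mechanical conversion.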

James


2003-02-07 01:21:48

by Patrick Mansfield

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

On Thu, Feb 06, 2003 at 05:25:25PM -0600, James Bottomley wrote:

> > say for sure (if this wasn't related to some SCSI subsystem change,
> > can I just revert out this section?)
>
> No, I'm afraid not. That was just the elimination of those fields from
> Scsi_Cmnd so now it has to be indirect through cmnd->device. It won't
> compile without this.
>
> James

wli has hit this several times prior to 2.5.59 (months ago), with pretty
much any IO load spread across disks. The driver sets queue depth to 1 for
all LUNs.

I modified my fsck to run in parallel (it wasn't running any fscks on
non-root disks before that), and am now hitting it on a NUMAQ box.

-- Patrick Mansfield

2003-02-07 01:51:36

by Martin J. Bligh

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

>> > say for sure (if this wasn't related to some SCSI subsystem change,
>> > can I just revert out this section?)
>>
>> No, I'm afraid not. That was just the elimination of those fields from
>> Scsi_Cmnd so now it has to be indirect through cmnd->device. It won't
>> compile without this.
>>
>> James
>
> wli has hit this several times prior to 2.5.59 (months ago), with pretty
> much any IO load spread across disks. The driver sets queue depth to 1 for
> all LUNs.
>
> I modified my fsck to run in parallel (it wasn't running any fscks on
> non-root disks before that), and am now hitting it on a NUMAQ box.

Curious. I've no idea why the changes brought this out then ... I've done
hundreds and hundreds of reboots on 2.5 on all sorts of different kernels,
and never, ever seen this. Yet in 2.5.59-bk I see it every single time.
Very odd.

M.

2003-02-07 02:22:16

by Patrick Mansfield

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

On Thu, Feb 06, 2003 at 06:01:06PM -0800, Martin J. Bligh wrote:
>
> Curious. I've no idea why the changes brought this out then ... I've done
> hundreds and hundreds of reboots on 2.5 on all sorts of different kernels,
> and never, ever seen this. Yet in 2.5.59-bk I see it every single time.
> Very odd.
>
> M.

Okay:

There were some bk scsi changes that ignored the queue depth (qlogicisp
sets them all to one).

Current bk (I just pulled and checked) has a fix, the cleaner shinier
better scsi_lib.c scsi_request_fn now has this code:

        if (sdev->device_busy >= sdev->queue_depth)
                break;

So the oops has to do with the isp handling multiple requests in a row or
in quick succession.
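
The effect of that check can be sketched with a toy dispatch loop. This is a user-space model under assumed names (`fake_sdev`, `dispatch`, `complete_one`), not the scsi_lib.c code: requests are issued only while the count of outstanding commands is below the depth the driver asked for, so with a depth of 1 the device never sees a second command until the first completes.

```c
#include <assert.h>

struct fake_sdev {
    unsigned device_busy;   /* commands currently outstanding */
    unsigned queue_depth;   /* limit requested by the driver */
};

/* Try to issue up to `pending` requests; returns how many were issued. */
static unsigned dispatch(struct fake_sdev *sdev, unsigned pending)
{
    unsigned issued = 0;

    while (pending > 0) {
        if (sdev->device_busy >= sdev->queue_depth)
            break;              /* device is full: leave the rest queued */
        sdev->device_busy++;
        pending--;
        issued++;
    }
    return issued;
}

/* Called when a command completes, freeing one unit of depth. */
static void complete_one(struct fake_sdev *sdev)
{
    assert(sdev->device_busy > 0);
    sdev->device_busy--;
}
```

Without the `break`, all pending requests would be handed to the device back to back, which is the situation the driver apparently does not cope with.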

Hopefully going to the latest bk will fix your oops.

-- Patrick Mansfield

2003-02-07 03:56:13

by Doug Ledford

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

On Thu, Feb 06, 2003 at 06:25:02PM -0800, Patrick Mansfield wrote:
> On Thu, Feb 06, 2003 at 06:01:06PM -0800, Martin J. Bligh wrote:
> >
> > Curious. I've no idea why the changes brought this out then ... I've done
> > hundreds and hundreds of reboots on 2.5 on all sorts of different kernels,
> > and never, ever seen this. Yet in 2.5.59-bk I see it every single time.
> > Very odd.
> >
> > M.
>
> Okay:
>
> There were some bk scsi changes that ignored the queue depth (qlogicisp
> sets them all to one).
>
> Current bk (I just pulled and checked) has a fix, the cleaner shinier
> better scsi_lib.c scsi_request_fn now has this code:
>
> if (sdev->device_busy >= sdev->queue_depth)
> break;
>
> So the oops has to do with the isp handling multiple requests in a row or
> in quick succession.
>
> Hopefully going to the latest bk will fix your oops.

It might, but please understand this. The qlogicisp driver does things to
the scsi mid layer that the scsi mid layer does not protect itself against
and as a result is the biggest pile of steaming, unsupportable, crap code
in the universe! The scsi mid layer was designed from day one to think
that the host->can_queue, sdev->queue_depth, and host->sg_tablesize items
were *static* on a given host/device unless specifically changed by
calling into the adjustment routines (scsi_adjust_queue_depth). The
qlogicisp driver violates those principles and I make no warranty of any
kind that said driver will continue to operate properly unless someone
takes the time to actually audit the qlogicisp_queuecommand() and
qlogicisp_irq() routine to make sure it is actually doing the right thing
when making those changes!

If I understand correctly, Matthew Jacob's latest isp driver set drives
*all* qlogic hardware (or at least all the older stuff like the qlogicisp
driver drives). I would much prefer that people simply test out Matthew's
driver and use it instead. In fact, if it's ready for 2.5 kernel use, I
would strongly recommend that it be considered as a possible replacement
in the linux kernel for the default driver on all qlogic cards not handled
by the new qla2x00 driver version 6 (DaveM may have objections to that
related to sparc if Matthew's driver isn't sparc friendly, but I don't
know of any other reason not to switch over).

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2003-02-07 04:10:56

by Anton Blanchard

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)


Hi,

> If I understand correctly, Matthew Jacob's latest isp driver set drives
> *all* qlogic hardware (or at least all the older stuff like the qlogicisp
> driver drives). I would much prefer that people simply test out Matthew's
> driver and use it instead. In fact, if it's ready for 2.5 kernel use, I
> would strongly recommend that it be considered as a possible replacement
> in the linux kernel for the default driver on all qlogic cards not handled
> by the new qla2x00 driver version 6 (DaveM may have objections to that
> related to sparc if Matthew's driver isn't sparc friendly, but I don't
> know of any other reason not to switch over).

I had a bunch of problems with the in-kernel and vendor qlogic drivers
on my ppc64 box. Matt Jacob's driver worked out of the box. DaveM
sounded positive last time I asked him about it.

I did a quick forward port to 2.5 a month or two ago, sounds like we
should work to get it in the kernel. There are some rough edges but
Mike kindly offered to lend a hand here.

Anton

2003-02-07 04:09:26

by Andrew Morton

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

Doug Ledford <[email protected]> wrote:
>
> I would much prefer that people simply test out Matthew's
> driver and use it instead.

Where is it?

http://www.feral.com/isp.html seems to be 2.4.x-only.

2003-02-07 04:15:07

by Doug Ledford

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

On Thu, Feb 06, 2003 at 08:19:39PM -0800, Andrew Morton wrote:
> Doug Ledford <[email protected]> wrote:
> >
> > I would much prefer that people simply test out Matthew's
> > driver and use it instead.
>
> Where is it?
>
> http://www.feral.com/isp.html seems to be 2.4.x-only.

As answered elsewhere, the 2.5 port isn't done yet. That's why I said in
my email "if it's ready for 2.5" because I was afraid Matthew hadn't
gotten around to doing the 2.5 update yet. However, if no one else can do
it, I can make a 2.5 update of this driver happen (I don't suspect it
would be that hard actually, not *that* much has to change).

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2003-02-07 04:19:29

by Martin J. Bligh

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

>> There were some bk scsi changes that ignored the queue depth (qlogicisp
>> sets them all to one).
>>
>> Current bk (I just pulled and checked) has a fix, the cleaner shinier
>> better scsi_lib.c scsi_request_fn now has this code:
>>
>> if (sdev->device_busy >= sdev->queue_depth)
>> break;
>>
>> So the oops has to do with the isp handling multiple requests in a row or
>> in quick succession.
>>
>> Hopefully going to the latest bk will fix your oops.
>
> It might, but please understand this. The qlogicisp driver does things to
> the scsi mid layer that the scsi mid layer does not protect itself against
> and as a result is the biggest pile of steaming, unsupportable, crap code
> in the universe! The scsi mid layer was designed from day one to think
> that the host->can_queue, sdev->queue_depth, and host->sg_tablesize items
> were *static* on a given host/device unless specifically changed by
> calling into the adjustment routines (scsi_adjust_queue_depth). The
> qlogicisp driver violates those principles and I make no warranty of any
> kind that said driver will continue to operate properly unless someone
> takes the time to actually audit the qlogicisp_queuecommand() and
> qlogicisp_irq() routine to make sure it is actually doing the right thing
> when making those changes!
>
> If I understand correctly, Matthew Jacob's latest isp driver set drives
> *all* qlogic hardware (or at least all the older stuff like the qlogicisp
> driver drives). I would much prefer that people simply test out Matthew's
> driver and use it instead. In fact, if it's ready for 2.5 kernel use, I
> would strongly recommend that it be considered as a possible replacement
> in the linux kernel for the default driver on all qlogic cards not handled
> by the new qla2x00 driver version 6 (DaveM may have objections to that
> related to sparc if Matthew's driver isn't sparc friendly, but I don't
> know of any other reason not to switch over).

If you can send me a patch, I'll willingly test it .... I have plenty of
these cards on very racy machines ;-)

M.

2003-02-07 04:26:56

by William Lee Irwin III

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

On Thu, Feb 06, 2003 at 08:19:39PM -0800, Andrew Morton wrote:
>> http://www.feral.com/isp.html seems to be 2.4.x-only.

On Thu, Feb 06, 2003 at 11:24:40PM -0500, Doug Ledford wrote:
> As answered elsewhere, the 2.5 port isn't done yet. That's why I said in
> my email "if it's ready for 2.5" because I was afraid Matthew hadn't
> gotten around to doing the 2.5 update yet. However, if no one else can do
> it, I can make a 2.5 update of this driver happen (I don't suspect it
> would be that hard actually, not *that* much has to change).

This driver's bugginess is a _major_ nuisance to me and I don't have
the SCSI know-how to fix it myself. I'd _love_ to test a driver with a
prayer of working anytime this century.

Thanks.

-- wli

2003-02-07 04:41:34

by Matthew Jacob

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)


I haven't integrated Anton's patches yet. Bad me.


On Thu, 6 Feb 2003, Andrew Morton wrote:

> Doug Ledford <[email protected]> wrote:
> >
> > I would much prefer that people simply test out Matthew's
> > driver and use it instead.
>
> Where is it?
>
> http://www.feral.com/isp.html seems to be 2.4.x-only.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>


2003-02-07 04:44:00

by Matthew Jacob

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)


The other thing to note is that I'm not really very happy with my driver
at present. It may be working well for some people, but *I* think it
needs some rework before it's really ready for primetime again. I need
to split out the SCSI && FC dependencies. I need to move the name server
code out of the main body and make it more policy driven.

The trouble is also that it's just a hobby for me right now (no clients
with direct Linux support requirements), and as a recent parent I've
had a lot less hobby time.


On Thu, 6 Feb 2003, William Lee Irwin III wrote:

> On Thu, Feb 06, 2003 at 08:19:39PM -0800, Andrew Morton wrote:
> >> http://www.feral.com/isp.html seems to be 2.4.x-only.
>
> On Thu, Feb 06, 2003 at 11:24:40PM -0500, Doug Ledford wrote:
> > As answered elsewhere, the 2.5 port isn't done yet. That's why I said in
> > my email "if it's ready for 2.5" because I was afraid Matthew hadn't
> > gotten around to doing the 2.5 update yet. However, if no one else can do
> > it, I can make a 2.5 update of this driver happen (I don't suspect it
> > would be that hard actually, not *that* much has to change).
>
> This driver's bugginess is a _major_ nuisance to me and I don't have
> the SCSI know-how to fix it myself. I'd _love_ to test a driver with a
> prayer of working anytime this century.
>
> Thanks.
>
> -- wli

2003-02-07 08:39:25

by Mike Anderson

Subject: Re: Broken SCSI code in the BK tree (was: 2.5.59-mm8)

I removed the other Mike Anderson ([email protected]) from the cc
list; we seem to have many Mike Andersons around here :-).

Anton Blanchard [[email protected]] wrote:
>
> Hi,
>
> > If I understand correctly, Matthew Jacob's latest isp driver set drives
> > *all* qlogic hardware (or at least all the older stuff like the qlogicisp
> > driver drives). I would much prefer that people simply test out Matthew's
> > driver and use it instead. In fact, if it's ready for 2.5 kernel use, I
> > would strongly recommend that it be considered as a possible replacement
> > in the linux kernel for the default driver on all qlogic cards not handled
> > by the new qla2x00 driver version 6 (DaveM may have objections to that
> > related to sparc if Matthew's driver isn't sparc friendly, but I don't
> > know of any other reason not to switch over).
>
> I had a bunch of problems with the in kernel and vendor qlogic drivers
> on my ppc64 box. Matt Jacob's driver worked out of the box. Davem
> sounded positive last time I asked him about it.
>
> I did a quick forward port to 2.5 a month or two ago, sounds like we
> should work to get it in the kernel. There are some rough edges but
> Mike kindly offered to lend a hand here.

I have been buried lately so I have only taken the patch you sent me
and updated it so it will compile with the most recent SCSI
changes. I also made a few tweaks to the make files.

Currently it is running on my 2202 card system.

When I ran it on my other system, which has a Qlogic ISP1020 and two Qlogic
2300's, it would hang after initializing the driver. The driver seemed to be
responsive to external events like port downs, but appeared not to
complete its init. When I used the driver disable defines to turn off
detection of the ISP1020, the driver loaded OK (I only ran it up to 60MB
on a few spindles, so not a very good test).

I will look at this more and see if I can understand the cause.

-andmike
--
Michael Anderson
[email protected]