ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.68/2.5.68-mm4/
. Much reworking of the disk IO scheduler patches due to the updated
dynamic-disk-request-allocation patch. No real functional changes here.
. Included the `kexec' patch - load Linux from Linux. Various people want
this for various reasons. I like the idea of going from a login prompt to
"Calibrating delay loop" in 0.5 seconds.
I tried it on four machines and it worked with small glitches on three of
them, and wedged up the fourth. So if it is to proceed this code needs
help with testing and careful bug reporting please.
There's a femto-HOWTO in the patch itself, reproduced here:
- enable kexec in config, build, install.
- grab kexec-tools from
http://www.osdl.org/archive/andyp/kexec/2.5.68/
- edit ./kexec/kexec-syscall.c and make sure __NR_kexec_load is set
to 269 (-mm kernels have an additional syscall)
- run `make distclean' and `make'
- I use this script:
#!/bin/sh

usage()
{
	echo "Usage: do-kexec.sh /boot/bzImage [commandline options]"
	exit 1
}

if [ $# -lt 1 ]
then
	usage
fi

sync
IMAGE=$1
shift
./objdir/build/sbin/kexec -l "$IMAGE" --command-line="$(cat /proc/cmdline) $*"
./objdir/build/sbin/kexec -e
invoked as
cd /usr/src/kexec-tools
./do-kexec.sh /boot/bzImage-2.5.68
This is fairly crude - it's an instant reboot, no shutdown or anything.
Only do this if you're using journalled filesystems!
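Before rebuilding kexec-tools it can help to confirm that the syscall number edit actually took. A minimal sketch, assuming the number is set by a plain one-line `#define __NR_kexec_load <n>` in kexec/kexec-syscall.c; the helper names kexec_nr and check_kexec_nr are made up for illustration:

```sh
# Print the value of the first "#define __NR_kexec_load <n>" line in a file.
# Assumes a plain one-line #define; prints nothing if none is found.
kexec_nr()
{
    awk '$1 == "#define" && $2 == "__NR_kexec_load" { print $3; exit }' "$1"
}

# Succeed only if the file defines the expected syscall number.
check_kexec_nr()
{
    [ "$(kexec_nr "$1")" = "$2" ]
}
```

Possible use from the kexec-tools directory: `check_kexec_nr ./kexec/kexec-syscall.c 269 || echo "fix __NR_kexec_load first"`.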
Changes since 2.5.68-mm3:
linus.patch
Latest BK drop
-irqreturn-i2c.patch
-irqreturn-sound-2.patch
-irqreturn-smcc.patch
-SLAB_NO_GROW-fix.patch
-irqreturn-bttv.patch
-apm-locking-fix.patch
-xd-warning-fix.patch
-DAC960-interface-fixes.patch
-alt_instr-__KERNEL__.patch
-modular-jbd.patch
-hdlc-module-update.patch
-proc_file_read-fix.patch
-disk_name-size-check.patch
-cleanups.patch
-mwave-cleanup.patch
-ext3-ro-mount-fix.patch
-nr_threads-docco-fix.patch
-lost-tick-HZ-fix.patch
-nr_inactive-race-fix.patch
-blockdev-aio-support.patch
-percpu-counters-fix.patch
-config-menu-aesthetics.patch
-oom-kill-locking.patch
-restore-modinfo-section.patch
-implement-__module_get.patch
Merged
+compat-ioctl-fix.patch
Fix 32-bit ioctl fallback
+generic-subarch-missing-bit.patch
Some of the generic subarch patch got lost
+config-PAGE_OFFSET-025G.patch
Allow really small amounts of lowmem.
+dont-set-kernel-pgd-on-PAE.patch
little ia32 optimisation/cleanup
+shrink_slab-accounting.patch
Teach page reclaim to notice success due to slab shrinkage
-dynamic-request-allocation.patch
-dynamic-request-allocation-fix.patch
+rq-dyn-works.patch
latest dynamic disk request allocation patch
+as-iosched-dyn.patch
Update as-iosched for dynamic request allocation
+cfq-iosched-dyn.patch
Update cfq-iosched for dynamic request allocation
+security_d_instantiate-movement.patch
+ext3-security-xattr.patch
+ext2-security-xattr.patch
Security stuff
+pcmcia-fix.patch
Compile fix for the pcmcia fix.
+kexec.patch
kexec.
All 99 patches
linus.patch
mm.patch
add -mmN to EXTRAVERSION
compat-ioctl-fix.patch
Fix NULL handler for compat_ioctl
generic-subarch.patch
generic subarchitecture for ia32
generic-subarch-fix.patch
generic subarch: SMP only
generic-subarch-missing-bit.patch
generic subarch: missing chunk
ipmi-warning-fixes.patch
irqreturn-uml.patch
UML updates for the new IRQ API
irqreturn-aic79xx.patch
Fix aic79xx for new IRQ API
kgdb-ga.patch
kgdb stub for ia32 (George Anzinger's one)
kgdb-ga-ppc64-fix.patch
irqreturn-kgdb-ga.patch
irqreturn-drivers-net.patch
kgdb-ga-smp_num_cpus.patch
kgdb-ga-discontigmem-fixup.patch
kgdb: discontigmem fixup
slab-magazine-layer.patch
magazine layer for slab
config_spinline.patch
uninline spinlocks for profiling accuracy.
ppc64-reloc_hide.patch
ppc64-pci-patch.patch
Subject: pci patch
ppc64-aio-32bit-emulation.patch
32/64bit emulation for aio
ppc64-scruffiness.patch
Fix some PPC64 compile warnings
ppc64-update.patch
ppc64 update
ppc64-update-fixes.patch
ppc64-irqfixes.patch
ppc64-pci-bogons.patch
sym-do-160.patch
make the SYM driver do 160 MB/sec
misc.patch
misc fixes
config-PAGE_OFFSET.patch
Configurable kernel/user memory split
config-PAGE_OFFSET-025G.patch
3.75G config option
fat-speedup.patch
fat cluster search speedup
buffer-debug.patch
buffer.c debugging
ext3-truncate-ordered-pages.patch
ext3: explicitly free truncated pages
VM_RESERVED-check.patch
VM_RESERVED check
semop-race-fix.patch
semtimedop(): Fix racy BUG check
reiserfs_file_write-5.patch
rcu-stats.patch
RCU statistics reporting
ext3-journalled-data-assertion-fix.patch
Remove incorrect assertion from ext3
nfs-speedup.patch
nfs-oom-fix.patch
nfs oom fix
sk-allocation.patch
Subject: Re: nfs oom
nfs-more-oom-fix.patch
rpciod-atomic-allocations.patch
Make rpciod use atomic allocations
linux-isp.patch
isp-update-1.patch
dcache_lock-vs-tasklist_lock-take-2.patch
Fix dcache_lock/tasklist_lock ranking bug
clone-retval-fix.patch
copy_process return value fix
de_thread-fix.patch
de_thread memory corruption fix
list_del-debug.patch
list_del debug check
airo-schedule-fix.patch
airo.c: don't sleep in atomic regions
386-access_ok-race-fix.patch
access_ok() race fix for 80386.
synaptics-mouse-support.patch
Add Synaptics touchpad tweaking to psmouse driver
swapfile-hold-i_sem.patch
hold i_sem on swapfiles
dont-set-kernel-pgd-on-PAE.patch
remove unnecessary PAE pgd set
shrink_slab-accounting.patch
account for slab reclaim in try_to_free_pages()
rq-dyn-works.patch
rq-dyn, dynamic request allocation
kblockd.patch
Create `kblockd' workqueue
cfq-infrastructure.patch
elevator-completion-api.patch
elevator completion API
as-iosched.patch
anticipatory I/O scheduler
as-use-completion.patch
AS use completion notifier
as-remove-debug-checks.patch
AS: remove debug checks
as-iosched-dyn.patch
AS: update to dynamic request allocation API
unplug-use-kblockd.patch
Use kblockd for running request queues
cfq-2.patch
CFQ scheduler, #2
cfq-iosched-dyn.patch
CFQ: update to rq-dyn API
unmap-page-debugging.patch
unmap unused pages for debugging
fremap-all-mappings.patch
Make all executable mappings be nonlinear
sched-2.5.68-B2.patch
HT scheduler, sched-2.5.68-B2
sched_idle-typo-fix.patch
fix sched_idle typo
kgdb-ga-idle-fix.patch
sched-2.5.64-D3.patch
sched-2.5.64-D3, more interactivity changes
show_task-free-stack-fix.patch
show_task() fix and cleanup
htree-nfs-fix.patch
Fix ext3 htree / NFS compatibility problems
i8042-share-irqs.patch
allow i8042 interrupt sharing
select-speedup.patch
Subject: Re: IA64 changes to fs/select.c
select-speedup-fix.patch
select() speedup fix
slab_store_user-large-objects.patch
slab debug: perform redzoning against larger objects
htree-nfs-fix-2.patch
htree nfs fix
htree-leak-fix.patch
ext3: htree memory leak fix
put_task_struct-debug.patch
ia32-mknod64.patch
mknod64 for ia32
ext2-64-bit-special-inodes.patch
ext2: support for 64-bit device nodes
ext3-64-bit-special-inodes.patch
ext3: support for 64-bit device nodes
64-bit-dev_t-kdev_t.patch
64-bit dev_t and kdev_t
oops-dump-preceding-code.patch
i386 oops output: dump preceding code
lockmeter.patch
security_d_instantiate-movement.patch
Move security_d_instantiate hook calls
ext3-security-xattr.patch
ext3 xattr handler for security modules
ext2-security-xattr.patch
ext2 xattr handler for security modules
ext3-no-bkl.patch
journal_dirty_metadata-speedup.patch
journal_get_write_access-speedup.patch
ext3-concurrent-block-inode-allocation.patch
Subject: [PATCH] concurrent block/inode allocation for EXT3
ext3-orlov-approx-counter-fix.patch
Fix orlov allocator boundary case
ext3-concurrent-block-allocation-fix-1.patch
ext3-concurrent-block-allocation-hashed.patch
Subject: Re: [PATCH] concurrent block/inode allocation for EXT3
pcmcia-deadlock-fix-2.patch
Fix PCMCIA deadlock (rev. 2)
pcmcia-fix.patch
kexec.patch
kexec
On Fri, May 02, 2003 at 02:01:49AM -0700, Andrew Morton wrote:
> +dont-set-kernel-pgd-on-PAE.patch
> little ia32 optimisation/cleanup
It looks like no one listened to my commentary on the set_pgd() patch.
Remove pointless #ifdef, pointless set_pgd(), and a mysterious line
full of nothing but whitespace after the #endif, and update commentary.
-- wli
$ diffstat ../patches/mm4-2.5.68-2
fault.c | 12 ++++--------
1 files changed, 4 insertions(+), 8 deletions(-)
diff -urpN mm4-2.5.68-1/arch/i386/mm/fault.c mm4-2.5.68-2/arch/i386/mm/fault.c
--- mm4-2.5.68-1/arch/i386/mm/fault.c 2003-05-02 05:32:27.000000000 -0700
+++ mm4-2.5.68-2/arch/i386/mm/fault.c 2003-05-02 05:54:14.000000000 -0700
@@ -333,16 +333,12 @@ vmalloc_fault:
if (!pgd_present(*pgd_k))
goto no_context;
+
/*
- * kernel pmd pages are shared among all processes
- * with PAE on. Since vmalloc pages are always
- * in the kernel area, this will always be a
- * waste with PAE on.
+ * set_pgd(pgd, *pgd_k); here would be useless on PAE
+ * and redundant with the set_pmd() on non-PAE.
*/
-#ifndef CONFIG_X86_PAE
- set_pgd(pgd, *pgd_k);
-#endif
-
+
pmd = pmd_offset(pgd, address);
pmd_k = pmd_offset(pgd_k, address);
if (!pmd_present(*pmd_k))
Hi,
> . Included the `kexec' patch - load Linux from Linux. Various people want
> this for various reasons. I like the idea of going from a login prompt to
> "Calibrating delay loop" in 0.5 seconds.
One thing that bothers me about kexec is how we grab low pages in
kimage_alloc_page(). On a partitioned ppc64 box I will need to grab
memory in the low 256MB and the machine might have 500GB of memory
free. That's going to take some time :)
I'd hate to introduce a separate zone just for this sort of stuff (we
currently throw all memory in the DMA zone). Could we add a hint to
the page allocator where it makes a best effort to grab memory below
a threshold?
Anton
On Fri, 2003-05-02 at 14:49, Steven Cole wrote:
> On Fri, 2003-05-02 at 14:34, Andrew Morton wrote:
> > Steven Cole <[email protected]> wrote:
> > >
> > > For what it's worth, kexec has worked for me on the following
> > > two systems.
> > > ...
> > > 00:03.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
> >
> > Are you using eepro100 or e100? I found that e100 failed to bring up the
> > interface on restart ("failed selftest"), but eepro100 was OK.
>
> CONFIG_EEPRO100=y
> # CONFIG_EEPRO100_PIO is not set
> # CONFIG_E100 is not set
>
> I can test E100 again to verify if that would help.
>
# CONFIG_EEPRO100 is not set
CONFIG_E100=y
Well, e100 works for me with kexec. Sure is quick to just do:
cp arch/i386/boot/bzImage /boot/vmlinuz-2.5.68-mm4x
do-kexec.sh /boot/vmlinuz-2.5.68-mm4x
and no running /sbin/lilo. Nice.
( I put do-kexec.sh and kexec in /usr/local/bin )
Steven
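Since the e100/eepro100 question keeps coming up in this thread, a quick way to report which of the two drivers a given kernel config enables. This is only a sketch: the function name pro100_drivers is made up, and it assumes the usual .config syntax shown above.

```sh
# List which of the two Intel PRO/100 drivers a kernel .config enables.
# Prints lines such as "CONFIG_E100=y"; prints nothing if neither is set.
pro100_drivers()
{
    grep -E '^CONFIG_(EEPRO100|E100)=[ym]$' "$1"
}
```

Run it against the .config of the kernel being kexec'ed, for example `pro100_drivers .config` from the top of the kernel tree.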
On Fri, 2003-05-02 at 08:45, Steven Cole wrote:
> On Fri, 2003-05-02 at 03:01, Andrew Morton wrote:
> > - grab kexec-tools from
> >
> > http://www.osdl.org/archive/andyp/kexec/2.5.68/
> >
> The andyp directory seems to be missing. I found kexec-tools-1.8 here:
> http://www.xmission.com/~ebiederm/files/kexec/
>
> Is that the latest version?
Now kexec-tools-1.8-2.5.68.tgz is there at the original URL. Thanks.
Steven
Fix up NUMA-Q build with new generic apic mode stuff
> > > > I found that e100 failed to bring up the
> > > > interface on restart ("failed selftest"), but eepro100 was OK.
> Here is a snippet from dmesg output for a successful kexec e100 boot:
Any chance we could get lspci output from both of these systems?
William Lee Irwin III wrote:
>> On Fri, May 02, 2003 at 02:01:49AM -0700, Andrew Morton wrote:
>>+dont-set-kernel-pgd-on-PAE.patch
>> little ia32 optimisation/cleanup
>
> It looks like no one listened to my commentary on the set_pgd() patch.
>
> Remove pointless #ifdef, pointless set_pgd(), and a mysterious line
> full of nothing but whitespace after the #endif, and update commentary.
> -#ifndef CONFIG_X86_PAE
> - set_pgd(pgd, *pgd_k);
> -#endif
I was thinking that the PMD set in 4G mode was a noop. But, it isn't,
so it makes up for the completely removed pgd set.
This comment needs to get updated in include/asm-i386/pgtable-2level.h:
/*
* (pmds are folded into pgds so this doesn't get actually called,
* but the define is needed for a generic inline function.)
*/
#define set_pmd(pmdptr, pmdval) (*(pmdptr) = pmdval)
#define set_pgd(pgdptr, pgdval) (*(pgdptr) = pgdval)
--
Dave Hansen
[email protected]
On Fri, 2003-05-02 at 03:01, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.68/2.5.68-mm4/
>
> . Much reworking of the disk IO scheduler patches due to the updated
> dynamic-disk-request-allocation patch. No real functional changes here.
>
> . Included the `kexec' patch - load Linux from Linux. Various people want
> this for various reasons. I like the idea of going from a login prompt to
> "Calibrating delay loop" in 0.5 seconds.
>
> I tried it on four machines and it worked with small glitches on three of
> them, and wedged up the fourth. So if it is to proceed this code needs
> help with testing and careful bug reporting please.
>
For what it's worth, kexec has worked for me on the following
two systems.
single P-III 933MHz, 256MB, IDE (system 1)
00:00.0 Host bridge: Intel Corp. 82810E DC-133 GMCH [Graphics Memory Controller Hub] (rev 03)
00:01.0 VGA compatible controller: Intel Corp. 82810E DC-133 CGC [Chipset Graphics Controller] (rev 03)
00:1e.0 PCI bridge: Intel Corp. 82801AA PCI Bridge (rev 02)
00:1f.0 ISA bridge: Intel Corp. 82801AA ISA Bridge (LPC) (rev 02)
00:1f.1 IDE interface: Intel Corp. 82801AA IDE (rev 02)
00:1f.2 USB Controller: Intel Corp. 82801AA USB (rev 02)
00:1f.3 SMBus: Intel Corp. 82801AA SMBus (rev 02)
00:1f.5 Multimedia audio controller: Intel Corp. 82801AA AC'97 Audio (rev 02)
01:0c.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
dual P-III 1000MHz, 1024MB, SCSI (system 2)
00:00.0 Host bridge: ServerWorks CNB20LE Host Bridge (rev 06)
00:00.1 Host bridge: ServerWorks CNB20LE Host Bridge (rev 06)
00:02.0 VGA compatible controller: ATI Technologies Inc 3D Rage IIC 215IIC [Mach64 GT IIC] (rev 7a)
00:03.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
00:0f.0 ISA bridge: ServerWorks OSB4 South Bridge (rev 50)
00:0f.1 IDE interface: ServerWorks OSB4 IDE Controller
01:04.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
01:04.1 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
The times for reboot back to run level 3 are:
            normal         kexec
system 1    69 seconds     35 seconds
system 2    150 seconds    75 seconds
Steven
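For comparing reports from other machines, the ratio behind the table above can be computed as below. A small sketch; the function name speedup is made up, and the inputs are the four times quoted in the table.

```sh
# Ratio of normal reboot time to kexec reboot time, to two decimal places.
speedup()
{
    awk -v normal="$1" -v kexec="$2" 'BEGIN { printf "%.2f\n", normal / kexec }'
}

speedup 69 35     # system 1
speedup 150 75    # system 2
```

Both systems come out at roughly a factor of two.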
On Fri, 2003-05-02 at 14:34, Andrew Morton wrote:
> Steven Cole <[email protected]> wrote:
> >
> > For what it's worth, kexec has worked for me on the following
> > two systems.
> > ...
> > 00:03.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
>
> Are you using eepro100 or e100? I found that e100 failed to bring up the
> interface on restart ("failed selftest"), but eepro100 was OK.
CONFIG_EEPRO100=y
# CONFIG_EEPRO100_PIO is not set
# CONFIG_E100 is not set
I can test E100 again to verify if that would help.
Also, I found that if I mistyped the argument to do-kexec.sh, the
system would stay up, but the interface would get hosed, fixable with
/etc/rc.d/init.d/network restart.
Otherwise, kexec works fine here so far over about a dozen reboots on
both machines.
Steven
On Fri, 2003-05-02 at 15:05, Andrew Morton wrote:
> Steven Cole <[email protected]> wrote:
> >
> > On Fri, 2003-05-02 at 14:34, Andrew Morton wrote:
> > > Steven Cole <[email protected]> wrote:
> > > >
> > > > For what it's worth, kexec has worked for me on the following
> > > > two systems.
> > > > ...
> > > > 00:03.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
> > >
> > > Are you using eepro100 or e100? I found that e100 failed to bring up the
> > > interface on restart ("failed selftest"), but eepro100 was OK.
> >
> > CONFIG_EEPRO100=y
> > # CONFIG_EEPRO100_PIO is not set
> > # CONFIG_E100 is not set
> >
> > I can test E100 again to verify if that would help.
>
> May as well.
>
> There's something in the driver shutdown which is failing to bring the
> device into a state in which the driver startup can start it up. Probably
> just a missing device reset. I'll bug Scott about it if we get that far.
>
Here is a snippet from dmesg output for a successful kexec e100 boot:
Intel(R) PRO/100 Network Driver - version 2.2.21-k1
Copyright (c) 2003 Intel Corporation
e100: selftest OK.
Freeing alive device c1b23000, eth%d
e100: eth0: Intel(R) PRO/100 Network Connection
Hardware receive checksums enabled
cpu cycle saver enabled
I booted the e100 2.5.68-mm4 kernel twice with kexec, initially from the
eepro100 version, and once from the e100 version. Both worked OK.
Steven
Hi,
2.5.68-mm4 went dead when I tried to mount a floppy. It
gave no time to record any logs, or sysrqs.
Thanks,
Johan Hidding
---
Running latest Gentoo on AMD Athlon on Via MoBo.
Steven Cole <[email protected]> wrote:
>
> For what it's worth, kexec has worked for me on the following
> two systems.
> ...
> 00:03.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
Are you using eepro100 or e100? I found that e100 failed to bring up the
interface on restart ("failed selftest"), but eepro100 was OK.
Steven Cole <[email protected]> wrote:
>
> On Fri, 2003-05-02 at 14:34, Andrew Morton wrote:
> > Steven Cole <[email protected]> wrote:
> > >
> > > For what it's worth, kexec has worked for me on the following
> > > two systems.
> > > ...
> > > 00:03.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
> >
> > Are you using eepro100 or e100? I found that e100 failed to bring up the
> > interface on restart ("failed selftest"), but eepro100 was OK.
>
> CONFIG_EEPRO100=y
> # CONFIG_EEPRO100_PIO is not set
> # CONFIG_E100 is not set
>
> I can test E100 again to verify if that would help.
May as well.
There's something in the driver shutdown which is failing to bring the
device into a state in which the driver startup can start it up. Probably
just a missing device reset. I'll bug Scott about it if we get that far.
> Also, I found that if I mistyped the argument to do-kexec.sh, the
> system would stay up, but the interface would get hosed, fixable with
> /etc/rc.d/init.d/network restart.
Yes, kexec userspace shuts down the network interfaces then tries to exec
the new kernel. But none was loaded and the syscall returns -EINVAL.
You're left with downed interfaces. The script should be checking the
success of the initial image loading.
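A sketch of how the script could check the load step before jumping, so that a mistyped image name does not leave the interfaces down. This is not the script shipped in the patch. The KEXEC variable, the do_kexec function name and the readability check are additions for illustration; the kexec options are the same two used in the original script.

```sh
#!/bin/sh
# Variant of do-kexec.sh that only runs `kexec -e` after a successful load.
# KEXEC is an assumption: override it if your kexec binary lives elsewhere.
KEXEC=${KEXEC:-./objdir/build/sbin/kexec}

do_kexec()
{
    if [ $# -lt 1 ]; then
        echo "Usage: do-kexec.sh /boot/bzImage [commandline options]" >&2
        return 1
    fi
    image=$1
    shift

    # Catch a mistyped image name before anything is shut down.
    if [ ! -r "$image" ]; then
        echo "do-kexec.sh: cannot read $image" >&2
        return 1
    fi

    sync
    # Only jump to the new kernel if the image was loaded successfully.
    if ! "$KEXEC" -l "$image" --command-line="$(cat /proc/cmdline) $*"; then
        echo "do-kexec.sh: loading $image failed, not executing" >&2
        return 1
    fi
    "$KEXEC" -e
}
```

To use it as a drop-in replacement, end the script with `do_kexec "$@"`.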
On Fri, 2003-05-02 at 03:01, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.68/2.5.68-mm4/
>
> . Much reworking of the disk IO scheduler patches due to the updated
> dynamic-disk-request-allocation patch. No real functional changes here.
>
> . Included the `kexec' patch - load Linux from Linux. Various people want
> this for various reasons. I like the idea of going from a login prompt to
> "Calibrating delay loop" in 0.5 seconds.
>
> I tried it on four machines and it worked with small glitches on three of
> them, and wedged up the fourth. So if it is to proceed this code needs
> help with testing and careful bug reporting please.
>
> There's a femto-HOWTO in the patch itself, reproduced here:
>
>
>
> - enable kexec in config, build, install.
>
> - grab kexec-tools from
>
> http://www.osdl.org/archive/andyp/kexec/2.5.68/
>
The andyp directory seems to be missing. I found kexec-tools-1.8 here:
http://www.xmission.com/~ebiederm/files/kexec/
Is that the latest version?
Steven
On Fri, 2003-05-02 at 15:49, Andy Pfiffer wrote:
> > > > > I found that e100 failed to bring up the
> > > > > interface on restart ("failed selftest"), but eepro100 was OK.
>
> > Here is a snippet from dmesg output for a successful kexec e100 boot:
>
> Any chance we could get lspci output from both of these systems?
Sure. I posted that initially. See this:
http://marc.theaimsgroup.com/?l=linux-kernel&m=105190618322919&w=2
Steven
On May 2 Steven Cole wrote:
>Here is a snippet from dmesg output for a successful kexec e100 boot:
Bizarrely I have a nasty crash on modprobing e100 *without* kexec (having
previously modprobed unix, af_packet and mii) and then trying to modprobe
serio (which then deadlocks the machine).
http://www.dcs.qmul.ac.uk/~mb/oops/
More information available on request
Matt
Matt Bernstein <[email protected]> wrote:
>
> On May 2 Steven Cole wrote:
>
> >Here is a snippet from dmesg output for a successful kexec e100 boot:
>
> Bizarrely I have a nasty crash on modprobing e100 *without* kexec (having
> previously modprobed unix, af_packet and mii) and then trying to modprobe
> serio (which then deadlocks the machine).
>
> http://www.dcs.qmul.ac.uk/~mb/oops/
>
Andi, it died in the middle of modprobe->apply_alternatives()
Andrew Morton <[email protected]> wrote:
>
> Are you using eepro100 or e100? I found that e100 failed to bring up the
> interface on restart ("failed selftest"), but eepro100 was OK.
That's because the e100 driver puts the card into state D3 when shutting
down but can't get it back to D0 afterwards.
Please send info about your chipset to Intel so they can work this out.
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
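When reporting this to Intel it may be useful to include the power state the card is left in. A rough sketch, assuming `lspci -vv` (run as root) prints a "Status: Dn ..." line under the device's Power Management capability; the helper name pci_power_state is made up.

```sh
# Read `lspci -vv` output for one device on stdin and print its
# power state (D0..D3) from the Power Management "Status:" line.
# The device-level "Status: Cap+ ..." line does not match and is skipped.
pci_power_state()
{
    sed -n 's/^[[:space:]]*Status: \(D[0-3]\).*/\1/p' | head -n 1
}
```

For the Ethernet controller at 00:03.0 quoted earlier in the thread this would be `lspci -vv -s 00:03.0 | pci_power_state`.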
On Sat, May 03, 2003 at 01:41:59AM +0200, Andrew Morton wrote:
> Matt Bernstein <[email protected]> wrote:
> >
> > On May 2 Steven Cole wrote:
> >
> > >Here is a snippet from dmesg output for a successful kexec e100 boot:
> >
> > Bizarrely I have a nasty crash on modprobing e100 *without* kexec (having
> > previously modprobed unix, af_packet and mii) and then trying to modprobe
> > serio (which then deadlocks the machine).
> >
> > http://www.dcs.qmul.ac.uk/~mb/oops/
> >
>
> Andi, it died in the middle of modprobe->apply_alternatives()
The important part of the oops - the first lines are missing in the .png.
What is the failing address? And can you send me your e100.o ?
-Andi
On Sat, May 03, 2003 at 01:41:59AM +0200, Andrew Morton wrote:
> Andi, it died in the middle of modprobe->apply_alternatives()
BTW I just loaded an e100 module with BK-CVS current and there were no
problems.
-Andi
At 04:53 +0200 Andi Kleen wrote:
>> >
>> > Bizarrely I have a nasty crash on modprobing e100 *without* kexec (having
>> > previously modprobed unix, af_packet and mii) and then trying to modprobe
>> > serio (which then deadlocks the machine).
>> >
>> > http://www.dcs.qmul.ac.uk/~mb/oops/
>>
>> Andi, it died in the middle of modprobe->apply_alternatives()
>
>The important part of the oops - the first lines are missing in the .png.
>
>What is the failing address? And can you send me your e100.o ?
I'm sorry I can't get to the machine now till Tuesday. I'll try to get it
into a smaller font, or failing that a serial console if you like.
I've posted e100.{,k}o, vmlinux and System.map to the above URL. FWIW,
they both give "c010e840 T apply_alternatives". I've also posted ".config"
which Apache elects not to list :)
Does any of the above help?
Hi
2.5.68-mm4 fixes APM suspend on my Vaio (problem reported with 2.5.68-mm2) but
my PCMCIA ethernet is still broken after suspend and requires ifconfig eth0
down; cardctl eject; cardctl insert before it will come to life (it took two
goes at that the first time, only one the second)
As before, I get thousands of "eth0: command 0x5800 did not complete!" after
resume, and I got the following backtrace after resume (possibly triggered by
the cardctl commands).
As before, Sony Vaio, pre-empt, APM, combined ethernet/modem PCMCIA using
3c574_cs.
Any more info required?
Charlie
irq 11: nobody cared!
Call Trace:
[<c010b640>] handle_IRQ_event+0x90/0x100
[<c010b897>] do_IRQ+0x97/0x120
[<c0109c68>] common_interrupt+0x18/0x20
[<c01ab1e3>] pci_bus_write_config_word+0x73/0x90
[<c882c180>] +0x0/0x840 [yenta_socket]
[<c88a7890>] dead_socket+0x0/0xc [pcmcia_core]
[<c882984a>] yenta_set_socket+0xba/0x1b0 [yenta_socket]
[<c882c180>] +0x0/0x840 [yenta_socket]
[<c882a027>] yenta_clear_maps+0x57/0x90 [yenta_socket]
[<c882c180>] +0x0/0x840 [yenta_socket]
[<c88a7890>] dead_socket+0x0/0xc [pcmcia_core]
[<c882c180>] +0x0/0x840 [yenta_socket]
[<c882a20b>] yenta_init+0x1b/0x30 [yenta_socket]
[<c882c180>] +0x0/0x840 [yenta_socket]
[<c882c180>] +0x0/0x840 [yenta_socket]
[<c882aa70>] ricoh_init+0x10/0xe0 [yenta_socket]
[<c882c180>] +0x0/0x840 [yenta_socket]
[<c8829036>] +0x36/0x40 [yenta_socket]
[<c882c180>] +0x0/0x840 [yenta_socket]
[<c889dada>] init_socket+0x2a/0x30 [pcmcia_core]
[<c889df25>] shutdown_socket+0x15/0x100 [pcmcia_core]
[<c889e16a>] socket_shutdown+0x4a/0x60 [pcmcia_core]
[<c889e47a>] socket_insert+0x7a/0x80 [pcmcia_core]
[<c889da1a>] get_socket_status+0x1a/0x20 [pcmcia_core]
[<c889e6ad>] pccardd+0x13d/0x1f0 [pcmcia_core]
[<c0119df0>] default_wake_function+0x0/0x20
[<c01091d2>] ret_from_fork+0x6/0x14
[<c0119df0>] default_wake_function+0x0/0x20
[<c889e570>] pccardd+0x0/0x1f0 [pcmcia_core]
[<c010722d>] kernel_thread_helper+0x5/0x18
handlers:
[<c8862410>] (el3_interrupt+0x0/0x260 [3c574_cs])
On Fri, 2 May 2003 16:41:59 -0700
Andrew Morton <[email protected]> wrote:
> > http://www.dcs.qmul.ac.uk/~mb/oops/
> >
>
> Andi, it died in the middle of modprobe->apply_alternatives()
I just got this oops under -mm4 while connecting with a ppp link:
Probably it was modprobe'ing one of those:
ppp_deflate 5312 0 [unsafe]
zlib_deflate 21912 1 ppp_deflate
zlib_inflate 21408 1 ppp_deflate
bsd_comp 5600 0 [unsafe]
ppp_async 10496 1 [unsafe]
ppp_generic 27080 5 ppp_deflate,bsd_comp,ppp_async
slhc 5952 1 ppp_generic
CSLIP: code copyright 1989 Regents of the University of California
PPP generic driver version 2.4.2
Unable to handle kernel paging request at virtual address c03390df
printing eip:
c0110a39
*pde = 00102027
*pte = 00339000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c0110a39>] Not tainted VLI
EFLAGS: 00013202
EIP is at apply_alternatives+0xd9/0x120
eax: 00000001 ebx: d088fb8c ecx: 00000000 edx: 00000001
esi: c03390df edi: d08898cf ebp: ccfe5eec esp: ccfe5ed8
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 375, threadinfo=ccfe4000 task=ce236690)
Stack: c02f68e0 00000003 d0883880 0000008f d08835b7 ccfe5f0c c011a436 d088fb8c
d088fc1b d0883504 d087c000 d0891b20 00000460 ccfe5f94 c013c8ae d087c000
d0883600 d0891b20 00000016 d0891b20 00000000 00000000 00000000 00000488
Call Trace:
[<c011a436>] module_finalize+0x96/0xa0
[<c013c8ae>] load_module+0x64e/0x870
[<c013cb6c>] sys_init_module+0x9c/0x2b0
[<c0109aef>] syscall_call+0x7/0xb
Code: 00 00 8b 0b 83 fa 09 b8 08 00 00 00 0f 4c c2 8b 7d f0 01 cf 8b 4d ec 8b 34
81 89 c1 c1 e9 02 f3 a5 a8 02 74 02 66 a5 a8 01 74 01 <a4> 01 45 f0 29 c2 85 d2
7f cd 83 c3 0c 3b 5d 0c 0f 82 71 ff ff
Anton Blanchard <[email protected]> writes:
> Hi,
>
> > . Included the `kexec' patch - load Linux from Linux. Various people want
> > this for various reasons. I like the idea of going from a login prompt to
> > "Calibrating delay loop" in 0.5 seconds.
>
> One thing that bothers me about kexec is how we grab low pages in
> kimage_alloc_page(). On a partitioned ppc64 box I will need to grab
> memory in the low 256MB and the machine might have 500GB of memory
> free. That's going to take some time :)
Could you explain to me the need to allocate memory in the low 256MB?
Generally the design is that you can allocate the memory anywhere
and then relocate_kernel.S will move it to where it needs to be.
I have had people wanting to use 300MB initial ramdisks and the like.
If you have 500GB of memory what is the point of keeping anything on a disk?
When you have 4TB on a cluster or a NUMA machine I can understand
wanting to keep things local to a node. But in those cases you want
to have local node zones so the problem does not come up.
In general I hate restricting the memory you can use, because kexec is
not just about booting Linux; it is about booting anything that
we reasonably can. The only case I have seen so far that makes sense
is when your physical memory is larger than your virtual memory.
> I'd hate to introduce a separate zone just for this sort of stuff (we
> currently throw all memory in the DMA zone). Could we add a hint to
> the page allocator where it makes a best effort to grab memory below
> a threshold?
I suspect so. And I can't imagine it would be that hard to implement.
But I think I would like to see why you need that.
Eric
Andrew Morton <[email protected]> writes:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.68/2.5.68-mm4/
>
>
> . Much reworking of the disk IO scheduler patches due to the updated
> dynamic-disk-request-allocation patch. No real functional changes here.
>
> . Included the `kexec' patch - load Linux from Linux. Various people want
> this for various reasons. I like the idea of going from a login prompt to
> "Calibrating delay loop" in 0.5 seconds.
>
> I tried it on four machines and it worked with small glitches on three of
> them, and wedged up the fourth. So if it is to proceed this code needs
> help with testing and careful bug reporting please.
The current state of the code is that APM is not expected to work. The
user space tool needs a fix to pass the address of the APM entry points
to the new kernel.
But beyond that everything should work, barring drivers which have
problems shutting themselves down and restarting.
Eric
On Tue, May 06, 2003 at 04:15:55PM +0200, Matt Bernstein wrote:
> Is this helpful?
What I really need is an oops properly decoded with ksymoops, not jpegs.
Also you seem to be the only one with the problem so just to avoid
any weird build problems do a make distclean and rebuild from scratch
and reinstall the modules.
-Andi
At 16:35 +0200 Andi Kleen wrote:
>On Tue, May 06, 2003 at 04:15:55PM +0200, Matt Bernstein wrote:
>> Is this helpful?
>
>What I really need is an oops properly decoded with ksymoops, not jpegs.
OK, I'll do this tomorrow morning (I think I can do it without a serial
console now).
>Also you seem to be the only one with the problem so just to avoid
>any weird build problems do a make distclean and rebuild from scratch
>and reinstall the modules.
The only odd thing I think I'm doing is hacking this into rc.sysinit:
awk '/version 2\.5\./ {exit 1}' /proc/version || egrep -v '^#' /etc/sysconfig/modules | while read i
do
action $"Loading $i module: " /sbin/modprobe $i
done
This might be naughty, but it shouldn't be able to hang the box!
I'd prefer to have a proper set of aliases for 2.5 in /etc/modules.conf,
but I'm too lazy to google for one. Also, I'd prefer yet more to shunt
this stuff into an initramfs but I'll wait for documentation to appear for
that :)
Cheers,
Matt
On May 6 Andi Kleen wrote:
>On Tue, May 06, 2003 at 04:15:55PM +0200, Matt Bernstein wrote:
>> Is this helpful?
>
>What I really need is an oops properly decoded with ksymoops, not jpegs.
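The "Options used" header in the report that follows corresponds to a ksymoops invocation along these lines. This is a reconstruction from that header only, not the command actually typed; the function name ksymoops_cmd and the input file name oops.txt are placeholders.

```sh
# Print (not run) a ksymoops command line for a given kernel version.
# The paths follow the layout in the "Options used" header of the report;
# the last argument is the file holding the raw oops text.
ksymoops_cmd()
{
    ver=$1
    oops=$2
    echo "ksymoops -v /opt/linux-$ver/vmlinux -K -L" \
         "-o /lib/modules/$ver -m /boot/System.map-$ver $oops"
}

ksymoops_cmd 2.5.69-mm1 oops.txt
```

-K and -L tell ksymoops not to read /proc/ksyms and /proc/modules, which is appropriate when decoding on a different kernel from the one that oopsed.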
ksymoops 2.4.9 on i686 2.4.20-8. Options used
-v /opt/linux-2.5.69-mm1/vmlinux (specified)
-K (specified)
-L (specified)
-o /lib/modules/2.5.69-mm1 (specified)
-m /boot/System.map-2.5.69-mm1 (specified)
No modules in ksyms, skipping objects
ACPI: LAPIC_NMI (acpi_id[0xff] polarity[0x0] trigger[0x0] lint[0x1])
Machine check exception polling timer started.
Unable to handle kernel paging request at virtual address c03b6e83
c010e93f
*pde = 00102027
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c010e93f>] Not tainted VLI
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: e094c580 ebx: 00000001 ecx: 00000000 edx: c0345740
esi: c03b6e83 edi: e0944f1d ebp: 00000001 esp: dcfb5ed8
ds: 007b es: 007b ss: 0068
Stack: 00000000 dcfb5ee8 00000001 c0345740 00000003 e094c580 e093c448 c030321c
e093c19f c030320b c01149fe e094c568 e094c5f7 e093c0f2 e092f000 e093c1f0
00000460 00000460 c012f7e1 e092f000 e093c1f0 e09505c0 00000016 e09505c0
Call Trace:
[<c01149fe>] module_finalize+0x8e/0xa0
[<c012f7e1>] load_module+0x6d1/0x920
[<c012faa8>] sys_init_module+0x78/0x1d0
[<c01091e5>] sysenter_past_esp+0x52/0x71
Code: 8b 54 24 0c 0f 4c dd 8b 7c 24 10 03 38 81 fb ff 01 00 00 8b 34 9a 77 39 89 d9 c1 e9 02 f3 a5 f6 c3 02 74 02 66 a5 f6 c3 01 74 01 <a4> 29 dd 01 5c 24 10 85 ed 7f be 83 44 24 14 0c 8b 5c 24 30 39
>>EIP; c010e93f <apply_alternatives+ff/180> <=====
>>eax; e094c580 <_end+20555520/3fc06fa0>
>>edx; c0345740 <k7_nops+0/24>
>>esi; c03b6e83 <k7nops+0/2d>
>>edi; e0944f1d <_end+2054debd/3fc06fa0>
>>esp; dcfb5ed8 <_end+1cbbee78/3fc06fa0>
Trace; c01149fe <module_finalize+8e/a0>
Trace; c012f7e1 <load_module+6d1/920>
Trace; c012faa8 <sys_init_module+78/1d0>
Trace; c01091e5 <sysenter_past_esp+52/71>
This architecture has variable length instructions, decoding before eip
is unreliable, take these instructions with a pinch of salt.
Code; c010e914 <apply_alternatives+d4/180>
00000000 <_EIP>:
Code; c010e914 <apply_alternatives+d4/180>
0: 8b 54 24 0c mov 0xc(%esp,1),%edx
Code; c010e918 <apply_alternatives+d8/180>
4: 0f 4c dd cmovl %ebp,%ebx
Code; c010e91b <apply_alternatives+db/180>
7: 8b 7c 24 10 mov 0x10(%esp,1),%edi
Code; c010e91f <apply_alternatives+df/180>
b: 03 38 add (%eax),%edi
Code; c010e921 <apply_alternatives+e1/180>
d: 81 fb ff 01 00 00 cmp $0x1ff,%ebx
Code; c010e927 <apply_alternatives+e7/180>
13: 8b 34 9a mov (%edx,%ebx,4),%esi
Code; c010e92a <apply_alternatives+ea/180>
16: 77 39 ja 51 <_EIP+0x51>
Code; c010e92c <apply_alternatives+ec/180>
18: 89 d9 mov %ebx,%ecx
Code; c010e92e <apply_alternatives+ee/180>
1a: c1 e9 02 shr $0x2,%ecx
Code; c010e931 <apply_alternatives+f1/180>
1d: f3 a5 repz movsl %ds:(%esi),%es:(%edi)
Code; c010e933 <apply_alternatives+f3/180>
1f: f6 c3 02 test $0x2,%bl
Code; c010e936 <apply_alternatives+f6/180>
22: 74 02 je 26 <_EIP+0x26>
Code; c010e938 <apply_alternatives+f8/180>
24: 66 a5 movsw %ds:(%esi),%es:(%edi)
Code; c010e93a <apply_alternatives+fa/180>
26: f6 c3 01 test $0x1,%bl
Code; c010e93d <apply_alternatives+fd/180>
29: 74 01 je 2c <_EIP+0x2c>
This decode from eip onwards should be reliable
Code; c010e93f <apply_alternatives+ff/180>
00000000 <_EIP>:
Code; c010e93f <apply_alternatives+ff/180> <=====
0: a4 movsb %ds:(%esi),%es:(%edi) <=====
Code; c010e940 <apply_alternatives+100/180>
1: 29 dd sub %ebx,%ebp
Code; c010e942 <apply_alternatives+102/180>
3: 01 5c 24 10 add %ebx,0x10(%esp,1)
Code; c010e946 <apply_alternatives+106/180>
7: 85 ed test %ebp,%ebp
Code; c010e948 <apply_alternatives+108/180>
9: 7f be jg ffffffc9 <_EIP+0xffffffc9>
Code; c010e94a <apply_alternatives+10a/180>
b: 83 44 24 14 0c addl $0xc,0x14(%esp,1)
Code; c010e94f <apply_alternatives+10f/180>
10: 8b 5c 24 30 mov 0x30(%esp,1),%ebx
Code; c010e953 <apply_alternatives+113/180>
14: 39 .byte 0x39
>Also you seem to be the only one with the problem so just to avoid
>any weird build problems do a make distclean and rebuild from scratch
>and reinstall the modules.
Will do later today if the above isn't helpful. One other thing I did do
was a make -j19 KBUILD_VERBOSE=0 but I've been told this is completely
safe these days.
Cheers,
Matt
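The clean rebuild Andi asks for could look roughly like the following. This is only a sketch: the tree path is taken from the ksymoops options above, while the saved-config location, the `run` and `rebuild_clean` helper names, and the exact make targets are assumptions for a 2.5 i386 tree. Since `make distclean` removes `.config`, the config is saved first and restored afterwards.

```shell
#!/bin/sh
# Hypothetical helpers for a from-scratch rebuild; adjust paths and targets.

# run CMD...: execute CMD, or just print it when DRY_RUN=1.
run()
{
	if [ "${DRY_RUN:-0}" = 1 ]
	then
		echo "$*"
	else
		"$@"
	fi
}

# rebuild_clean TREE SAVED_CONFIG
# Saves .config, wipes the tree, restores the config, rebuilds and
# reinstalls the modules. Stops at the first failing step.
rebuild_clean()
{
	tree=$1
	saved=$2
	run cd "$tree" || return 1
	run cp .config "$saved" || return 1	# distclean would delete it
	run make distclean || return 1
	run cp "$saved" .config || return 1
	run make oldconfig || return 1
	run make bzImage modules || return 1	# assumed targets for i386
	run make modules_install || return 1
}

# Example (dry run, prints the commands without executing them):
#   DRY_RUN=1 rebuild_clean /opt/linux-2.5.69-mm1 /tmp/config-2.5.69-mm1
```

Running it once with DRY_RUN=1 shows the sequence before anything in the tree is touched.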
On Wed, May 07, 2003 at 12:27:02PM +0200, Matt Bernstein wrote:
>
> Will do later today if the above isn't helpful. One other thing I did do
> was a make -j19 KBUILD_VERBOSE=0 but I've been told this is completely
> safe these days.
It tries to patch an instruction past the kernel text.
It could be in the discarded .exit.text/.text.exit. With new binutils you should
get a link error when this happens, but perhaps yours are too old for that.
If you comment these entries out of the DISCARD statement in
arch/i386/vmlinux.lds.S, does it go away? Alternatively use Andrew's
latest 2.5.69-mm*, which has the patch too.
-Andi
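For anyone wanting to check by hand whether an address from the oops lies past the kernel text, one option is to compare it with the text boundaries in System.map. A minimal sketch, assuming a System.map with the usual "address type symbol" lines and assuming `_stext` and `_etext` mark the start and end of kernel text; the function name `addr_in_text` is made up for illustration.

```shell
#!/bin/sh
# addr_in_text ADDR MAPFILE
# Prints "inside" if the hex address ADDR lies in [_stext, _etext),
# "outside" if it does not, and "unknown" if either symbol is missing.
# Hypothetical helper; the symbol names are an assumption about MAPFILE.
addr_in_text()
{
	ait_addr=${1#0x}	# accept the address with or without a 0x prefix
	ait_map=$2
	ait_start=$(awk '$3 == "_stext" {print $1; exit}' "$ait_map")
	ait_end=$(awk '$3 == "_etext" {print $1; exit}' "$ait_map")
	if [ -z "$ait_start" ] || [ -z "$ait_end" ]
	then
		echo "unknown"
		return 0
	fi
	if [ $((0x$ait_addr)) -ge $((0x$ait_start)) ] &&
	   [ $((0x$ait_addr)) -lt $((0x$ait_end)) ]
	then
		echo "inside"
	else
		echo "outside"
	fi
}

# Example, using the faulting address and map file named in the oops report:
#   addr_in_text c03b6e83 /boot/System.map-2.5.69-mm1
```

This only says whether the address falls in the text range recorded at link time; it does not show which section the address was meant to belong to.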
At 14:35 +0200 Andi Kleen wrote:
>It tries to patch an instruction past the kernel text.
>
>It could be in the discarded .exit.text/.text.exit. With new binutils you should
>get a link error when this happens, but perhaps yours are too old for that.
I'm using the standard RH 9 binutils, 2.13.90.0.18-9. My environment is exactly RH9
+ modutils 2.4.22-10 from rawhide, on a single Athlon XP.
>If you comment these entries out of the DISCARD statement in
>arch/i386/vmlinux.lds.S, does it go away? Alternatively use Andrew's
>latest 2.5.69-mm*, which has the patch too.
Tried 2.5.69-mm2, it crashed the same way :-/