2003-05-09 18:52:08

by Andy Pfiffer

Subject: [KEXEC][2.5.69] kexec for 2.5.69 available

Eric,

I have a patch set available for kexec for 2.5.69. I had an unrelated
delay in posting this due to some strange behavior of late with LILO and
my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
but a subsequent reboot would not include my new kernel).

The patches are available for download from OSDL's patch lifecycle
manager ( http://www.osdl.org/cgi-bin/plm/ ).

Patch stack for kexec for 2.5.69:

kexec base for 2.5.69:
http://www.osdl.org/cgi-bin/plm?module=patch_info&patch_id=1828

kexec hwfixes for 2.5.69:
http://www.osdl.org/cgi-bin/plm?module=patch_info&patch_id=1829

kexec usemm change (allowed 2-way to work for me):
http://www.osdl.org/cgi-bin/plm?module=patch_info&patch_id=1830

optional change to defconfig (sets CONFIG_KEXEC=y):
http://www.osdl.org/cgi-bin/plm?module=patch_info&patch_id=1831

The patches are also available (with matching kexec-tools-1.8) from this
link pending a crontab update:
http://www.osdl.org/archive/andyp/kexec/2.5.69/

Andrew Morton's tree now also contains kexec, and you can pick up his
patch here:
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.69/

I'll put together another release area for a matching kexec-tools for
-mm trees (different kexec syscall number between 2.5.* and 2.5.*-mm*)
as soon as I get -mm trees built and booted on my kexec test machines.


Regards,
Andy

To All: if you try kexec, a quick reply of success or failure to
[email protected] would be appreciated. If it doesn't work for you,
please include the output of lspci in your email.

Kexec has worked for me on these systems:

single P3-800MHz, 640MB:
00:00.0 Host bridge: ServerWorks CNB20LE Host Bridge (rev 06)
00:00.1 Host bridge: ServerWorks CNB20LE Host Bridge (rev 06)
00:01.0 VGA compatible controller: S3 Inc. Savage 4 (rev 04)
00:09.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro
100] (rev 08)
00:0f.0 ISA bridge: ServerWorks OSB4 South Bridge (rev 50)
00:0f.1 IDE interface: ServerWorks OSB4 IDE Controller
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 OHCI USB
Controller (rev 04)
01:03.0 SCSI storage controller: Adaptec AIC-7892P U160/m (rev
02)

dual P3-866MHz, 256MB:
00:00.0 Host bridge: ServerWorks CNB20LE Host Bridge (rev 05)
00:00.1 Host bridge: ServerWorks CNB20LE Host Bridge (rev 05)
00:02.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro
100] (rev 08)
00:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro
100] (rev 08)
00:07.0 VGA compatible controller: ATI Technologies Inc Rage XL
(rev 65)
00:0f.0 ISA bridge: ServerWorks OSB4 South Bridge (rev 4f)
00:0f.1 IDE interface: ServerWorks OSB4 IDE Controller
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 USB Controller
(rev 04)
01:05.0 SCSI storage controller: LSI Logic / Symbios Logic
53c896 (rev 07)
01:05.1 SCSI storage controller: LSI Logic / Symbios Logic
53c896 (rev 07)

dual P4-1.7GHz Xeon, 512MB:
00:00.0 Host bridge: Intel Corp. 82850 860 (Wombat) Chipset Host
Bridge (MCH) (rev 04)
00:01.0 PCI bridge: Intel Corp. 82850/82860 850/860
(Tehama/Wombat) Chipset AGP Bridge (rev 04)
00:02.0 PCI bridge: Intel Corp. 82860 860 (Wombat) Chipset PCI
Bridge (rev 04)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA PCI Bridge (rev 04)
00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev
04)
00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 04)
00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub (rev
04)
00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 04)
00:1f.5 Multimedia audio controller: Intel Corp. 82801BA/BAM
AC'97 Audio (rev 04)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA
G400 AGP (rev 85)
02:1f.0 PCI bridge: Intel Corp. 82806AA PCI64 Hub PCI Bridge
(rev 03)
03:00.0 PIC: Intel Corp. 82806AA PCI64 Hub Advanced Programmable
Interrupt Controller (rev 01)
04:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro
100] (rev 0c)






2003-05-09 19:51:44

by Christophe Saout

Subject: ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69 available)

On Fri, 2003-05-09 at 21:04, Andy Pfiffer wrote:

> [...]
> I had an unrelated
> delay in posting this due to some strange behavior of late with LILO and
> my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
> but a subsequent reboot would not include my new kernel)

So I'm not the only one having this problem... I think I first saw this
with 2.5.68 but I'm not sure.

My boot partition is a small ext3 partition on an lvm2 volume accessed
over device-mapper (I've written a lilo patch for that, and the patch is
working), but I don't think that has anything to do with the problem.

When syncing, unmounting and waiting some time after running lilo, the
changes sometimes seem to be written to disk correctly, but I don't know
exactly when.

Could it be that the location of /boot/map is not written to the
partition sector of /dev/hda? Or not flushed correctly or something?

After reboot the old kernel came up again (though it was moved to
vmlinuz.old).

--
Christophe Saout <[email protected]>

2003-05-09 20:43:30

by Andy Pfiffer

Subject: Re: ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69 available)

On Fri, 2003-05-09 at 13:04, Christophe Saout wrote:
> Am Fre, 2003-05-09 um 21.04 schrieb Andy Pfiffer:
>
> > [...]
> > I had an unrelated
> > delay in posting this due to some strange behavior of late with LILO and
> > my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
> > but a subsequent reboot would not include my new kernel)
>
> So I'm not the only one having this problem... I think I first saw this
> with 2.5.68 but I'm not sure.

Well, that makes two of us for sure.

>
> My boot partition is a small ext3 partition on a lvm2 volume accessed
> over device-mapper (I've written a lilo patch for that, but the patch is
> working and) but I don't think that has something to do with the
> problem.
>
> When syncing, unmounting and waiting some time after running lilo, the
> changes sometimes seem correctly written to disk, I don't know when
> exactly.

My /boot is an ext3 partition on an IDE disk. My symptoms and your
symptoms match -- wait awhile, and it works okay. If you don't wait
"long enough", the changes made in /etc/lilo.conf are not reflected
after running /sbin/lilo and rebooting normally.

I have been unable to reproduce this on a uniproc system with SCSI
disks.

2.5.67 seems to work in this regard as expected.

> Could it be that the location of /boot/map is not written to the
> partition sector of /dev/hda? Or not flushed correctly or something?
>
> After reboot the old kernel came up again (though it was moved to
> vmlinuz.old).

I don't know -- I haven't isolated it yet.

Anyone else?



2003-05-09 21:34:18

by Riley Williams

Subject: RE: ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69 available)

Hi Andy, Christophe.

>>> I had an unrelated delay in posting this due to some strange
>>> behavior of late with LILO and my ext3-mounted /boot partition
>>> (/sbin/lilo would say that it updated, but a subsequent reboot
>>> would not include my new kernel)

>> So I'm not the only one having this problem... I think I first
>> saw this with 2.5.68 but I'm not sure.

> Well, that makes two of us for sure.

>> My boot partition is a small ext3 partition on a lvm2 volume
>> accessed over device-mapper (I've written a lilo patch for
>> that, but the patch is working and) but I don't think that has
>> something to do with the problem.
>>
>> When syncing, unmounting and waiting some time after running
>> lilo, the changes sometimes seem correctly written to disk, I
>> don't know when exactly.
>
> My /boot is an ext3 partition on an IDE disk. My symptoms and
> your symptoms match -- wait awhile, and it works okay. If you
> don't wait "long enough" the changes made in /etc/lilo.conf are
> not reflected in the after running /sbin/lilo and rebooting
> normally.

One suggestion: ext3 is a journalled version of ext2, so if you can
boot with whatever is needed to specify that the boot partition is
to be mounted as ext2 rather than ext3, you can isolate the journal
system: If the problem's still there in ext2 then the journal is
not involved, but if the problem vanishes there, it's something to
do with the journal.

I have to admit that the above sounds very much like the details
are being recorded in the journal, but the journal isn't being
played back to update the actual files.

Best wishes from Riley.
---
* Nothing as pretty as a smile, nothing as ugly as a frown.


2003-05-09 22:27:21

by Joe Korty

Subject: Re: ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69 available)

> One suggestion: ext3 is a journalled version of ext2, so if you can
> boot with whatever is needed to specify that the boot partition is
> to be mounted as ext2 rather than ext3, you can isolate the journal
> system: If the problem's still there in ext2 then the journal is
> not involved, but if the problem vanishes there, it's something to
> do with the journal.
>
> I have to admit that the above sounds very much like the details
> are being recorded in the journal, but the journal isn't being
> played back to update the actual files.

I recall reading on lkml once that an ext3 sync(2) merely pushes volatile
data/metadata out to the journal rather than to the files themselves.

Joe

2003-05-09 23:27:24

by Andy Pfiffer

Subject: RE: ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69 available)

On Fri, 2003-05-09 at 13:46, Riley Williams wrote:
> Hi Andy, Christophe.
>
> >>> I had an unrelated delay in posting this due to some strange
> >>> behavior of late with LILO and my ext3-mounted /boot partition
> >>> (/sbin/lilo would say that it updated, but a subsequent reboot
> >>> would not include my new kernel)
>
> >> So I'm not the only one having this problem... I think I first
> >> saw this with 2.5.68 but I'm not sure.
>
> > Well, that makes two of us for sure.
>
> >> My boot partition is a small ext3 partition on a lvm2 volume
> >> accessed over device-mapper (I've written a lilo patch for
> >> that, but the patch is working and) but I don't think that has
> >> something to do with the problem.
> >>
> >> When syncing, unmounting and waiting some time after running
> >> lilo, the changes sometimes seem correctly written to disk, I
> >> don't know when exactly.
> >
> > My /boot is an ext3 partition on an IDE disk. My symptoms and
> > your symptoms match -- wait awhile, and it works okay. If you
> > don't wait "long enough" the changes made in /etc/lilo.conf are
> > not reflected in the after running /sbin/lilo and rebooting
> > normally.
>
> One suggestion: ext3 is a journalled version of ext2, so if you can
> boot with whatever is needed to specify that the boot partition is
> to be mounted as ext2 rather than ext3, you can isolate the journal
> system: If the problem's still there in ext2 then the journal is
> not involved, but if the problem vanishes there, it's something to
> do with the journal.

Changing the "ext3" to "ext2" in /etc/fstab and rebooting did not change
the behavior (ie, edit /etc/lilo.conf, run /sbin/lilo, reboot cleanly,
changes not there). I did see the warning about mounting an ext3
filesystem as ext2, however.

Strange.



2003-06-11 21:56:16

by Andy Pfiffer

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- IDE Problem?

On Fri, 2003-05-09 at 13:55, Andy Pfiffer wrote:
> On Fri, 2003-05-09 at 13:04, Christophe Saout wrote:
> > Am Fre, 2003-05-09 um 21.04 schrieb Andy Pfiffer:
> >
> > > [...]
> > > I had an unrelated
> > > delay in posting this due to some strange behavior of late with LILO and
> > > my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
> > > but a subsequent reboot would not include my new kernel)
> >
> > So I'm not the only one having this problem... I think I first saw this
> > with 2.5.68 but I'm not sure.
>
> Well, that makes two of us for sure.
>
> >
> > My boot partition is a small ext3 partition on a lvm2 volume accessed
> > over device-mapper (I've written a lilo patch for that, but the patch is
> > working and) but I don't think that has something to do with the
> > problem.
> >
> > When syncing, unmounting and waiting some time after running lilo, the
> > changes sometimes seem correctly written to disk, I don't know when
> > exactly.
>
> My /boot is an ext3 partition on an IDE disk. My symptoms and your
> symptoms match -- wait awhile, and it works okay. If you don't wait
> "long enough" the changes made in /etc/lilo.conf are not reflected in
> the after running /sbin/lilo and rebooting normally.
>
> I have been unable to reproduce this on a uniproc system with SCSI
> disks.
>
> 2.5.67 seems to work in this regard as expected.
>
> > Could it be that the location of /boot/map is not written to the
> > partition sector of /dev/hda? Or not flushed correctly or something?
> >
> > After reboot the old kernel came up again (though it was moved to
> > vmlinuz.old).
>
> I don't know -- I haven't isolated it yet.
>
> Anyone else?

I have taken another look at this, and can confirm the following:

1. 2.5.67 works as expected.
2. 2.5.68, 2.5.69, and 2.5.70 do not.
3. ext2 vs. ext3 for /boot: no effect (ie, .68, .69, .70 demonstrate the
problem independent of the filesystem used for /boot).

Relative to a 2.5.68 pure BK tree, the deltas from 2.5.67 to 2.5.68 are:
1.971.76.10 /* 2.5.67 */
1.1124 /* 2.5.68 */

The patch exported by BK between these 2 revs is 297K lines ( a sizeable
haystack ). Any ideas about where I should dig for my needle first
would be welcomed...


Gory details about my hardware & software follow...

% lilo -v
LILO version 22.1, Copyright (C) 1992-1998 Werner Almesberger
Development beyond version 21 Copyright (C) 1999-2001 John Coffman
Released 31-Oct-2001 and compiled at 20:50:13 on Mar 25 2002.
MAX_IMAGES = 27


CPUs:
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 1
model name : Intel(R) Xeon(TM) CPU 1.70GHz
stepping : 2
cpu MHz : 1685.926
cache size : 256 KB
physical id : 0
siblings : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 3317.76

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 0
model name : Intel(R) Xeon(TM) CPU 1700MHz
stepping : 10
cpu MHz : 1685.926
cache size : 256 KB
physical id : 0
siblings : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 3366.91


Two IDE hard drives (I haven't cracked the case to identify the
manufacturer):

/dev/hda:
HDIO_GETGEO_BIG failed: Inappropriate ioctl for device

Model=CI530L04VARE700- , FwRev=REO44AA5,
SerialNo= S PXXTYH2351
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=40
BuffType=DualPortCache, BuffSize=1916kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=-66060037, LBA=yes, LBAsects=78156288
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
AdvancedPM=yes: disabled (255)
Drive Supports : ATA/ATAPI-5 T13 1321D revision 1 : ATA-2 ATA-3 ATA-4
ATA-5

/dev/hdb:
HDIO_GETGEO_BIG failed: Inappropriate ioctl for device

Model=CI530L02VARE700- , FwRev=REO24AA5,
SerialNo= S PVVTFT0B17
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=40
BuffType=DualPortCache, BuffSize=1916kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=-66060037, LBA=yes, LBAsects=39876480
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
AdvancedPM=yes: disabled (255)
Drive Supports : ATA/ATAPI-5 T13 1321D revision 1 : ATA-2 ATA-3 ATA-4
ATA-5

The PCI hardware on this system:
00:00.0 Host bridge: Intel Corp. 82850 860 (Wombat) Chipset Host Bridge
(MCH) (rev 04)
Subsystem: IBM: Unknown device 2531
Flags: bus master, fast devsel, latency 0
Memory at f0000000 (32-bit, prefetchable) [size=64M]
Capabilities: [a0] AGP version 2.0

00:01.0 PCI bridge: Intel Corp. 82850/82860 850/860 (Tehama/Wombat)
Chipset AGP Bridge (rev 04) (prog-if 00 [Normal decode])
Flags: bus master, 66Mhz, fast devsel, latency 64
Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
Memory behind bridge: f6000000-f8ffffff
Prefetchable memory behind bridge: f4000000-f5ffffff

00:02.0 PCI bridge: Intel Corp. 82860 860 (Wombat) Chipset PCI Bridge
(rev 04) (prog-if 00 [Normal decode])
Flags: bus master, 66Mhz, fast devsel, latency 32
Bus: primary=00, secondary=02, subordinate=03, sec-latency=0
Memory behind bridge: fb000000-fb0fffff

00:1e.0 PCI bridge: Intel Corp. 82801BA/CA PCI Bridge (rev 04) (prog-if
00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=04, subordinate=04, sec-latency=32
I/O behind bridge: 0000c000-0000cfff
Memory behind bridge: f9000000-faffffff

00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev 04)
Flags: bus master, medium devsel, latency 0

00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 04) (prog-if 80
[Master])
Subsystem: IBM: Unknown device 2442
Flags: bus master, medium devsel, latency 0
I/O ports at f000 [size=16]

00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub (rev 04)
(prog-if 00 [UHCI])
Subsystem: IBM: Unknown device 2442
Flags: bus master, medium devsel, latency 0, IRQ 19
I/O ports at d000 [size=32]

00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 04)
Subsystem: IBM: Unknown device 2442
Flags: medium devsel, IRQ 17
I/O ports at 5000 [size=16]

00:1f.5 Multimedia audio controller: Intel Corp. 82801BA/BAM AC'97 Audio
(rev 04)
Subsystem: IBM: Unknown device 0224
Flags: bus master, medium devsel, latency 0, IRQ 17
I/O ports at d800 [size=256]
I/O ports at dc00 [size=64]

01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP
(rev 85) (prog-if 00 [VGA])
Subsystem: Matrox Graphics, Inc. Millennium G450 Dual Head
Flags: bus master, medium devsel, latency 64, IRQ 22
Memory at f4000000 (32-bit, prefetchable) [size=32M]
Memory at f6000000 (32-bit, non-prefetchable) [size=16K]
Memory at f7000000 (32-bit, non-prefetchable) [size=8M]
Expansion ROM at <unassigned> [disabled] [size=128K]
Capabilities: [dc] Power Management version 2
Capabilities: [f0] AGP version 2.0

02:1f.0 PCI bridge: Intel Corp. 82806AA PCI64 Hub PCI Bridge (rev 03)
(prog-if 00 [Normal decode])
Flags: bus master, 66Mhz, fast devsel, latency 0
Bus: primary=02, secondary=03, subordinate=03, sec-latency=32
Memory behind bridge: fb000000-fb0fffff

03:00.0 PIC: Intel Corp. 82806AA PCI64 Hub Advanced Programmable
Interrupt Controller (rev 01) (prog-if 20 [IO(X)-APIC])
Subsystem: Intel Corp. 82806AA PCI64 Hub Advanced Programmable
Interrupt Controller
Flags: bus master, fast devsel, latency 0
Memory at fb000000 (32-bit, non-prefetchable) [size=4K]

04:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100]
(rev 0c)
Subsystem: IBM: Unknown device 0207
Flags: bus master, medium devsel, latency 32, IRQ 16
Memory at fa020000 (32-bit, non-prefetchable) [size=4K]
I/O ports at c000 [size=64]
Memory at fa000000 (32-bit, non-prefetchable) [size=128K]
Expansion ROM at <unassigned> [disabled] [size=64K]
Capabilities: [dc] Power Management version 2





2003-06-11 23:07:58

by Christophe Saout

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- IDE Problem?

On Thu, 2003-06-12 at 00:08, Andy Pfiffer wrote:
> On Fri, 2003-05-09 at 13:55, Andy Pfiffer wrote:
> > On Fri, 2003-05-09 at 13:04, Christophe Saout wrote:
> > > Am Fre, 2003-05-09 um 21.04 schrieb Andy Pfiffer:
> > >
> > > > [...]
> > > > I had an unrelated
> > > > delay in posting this due to some strange behavior of late with LILO and
> > > > my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
> > > > but a subsequent reboot would not include my new kernel)
> > >
> > > So I'm not the only one having this problem... I think I first saw this
> > > with 2.5.68 but I'm not sure.
> >
> > Well, that makes two of us for sure.
> >
> > >
> > > My boot partition is a small ext3 partition on a lvm2 volume accessed
> > > over device-mapper (I've written a lilo patch for that, but the patch is
> > > working and) but I don't think that has something to do with the
> > > problem.
> > >
> > > When syncing, unmounting and waiting some time after running lilo, the
> > > changes sometimes seem correctly written to disk, I don't know when
> > > exactly.
> >
> > My /boot is an ext3 partition on an IDE disk. My symptoms and your
> > symptoms match -- wait awhile, and it works okay. If you don't wait
> > "long enough" the changes made in /etc/lilo.conf are not reflected in
> > the after running /sbin/lilo and rebooting normally.
> >
> > I have been unable to reproduce this on a uniproc system with SCSI
> > disks.
> >
> > 2.5.67 seems to work in this regard as expected.
> >
> > > Could it be that the location of /boot/map is not written to the
> > > partition sector of /dev/hda? Or not flushed correctly or something?
> > >
> > > After reboot the old kernel came up again (though it was moved to
> > > vmlinuz.old).
> >
> > I don't know -- I haven't isolated it yet.
> >
> > Anyone else?
>
> I have taken another look at this, and can confirm the following:
>
> 1. 2.5.67 works as expected.
> 2. 2.5.68, 2.5.69, and 2.5.70 do not.
> 3. ext2 vs. ext3 for /boot: no effect (ie, .68, .69, .70 demonstrate the
> problem independent of the filesystem used for /boot).

I found out that flushb /dev/<boot_device> helps, syncing doesn't. Not
100% sure if that's right, because right now I'm always doing both, but
I remember having only synced before and that didn't help.

> Relative to a 2.5.68 pure BK tree, the deltas from 2.5.67 to 2.5.68 are:
> 1.971.76.10 /* 2.5.67 */
> 1.1124 /* 2.5.68 */
>
> The patch exported by BK between these 2 revs is 297K lines ( a sizeable
> haystack ). Any ideas about where I should dig for my needle first
> would be welcomed...

There don't seem to be too many changes in /drivers/block or /fs, mostly
cleanups. I personally have no idea where to start, except trying out
each -bk version in between. Hmmm. And I'm not going to do that now...
:-/
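
[Editorial sketch] Trying every -bk snapshot between 2.5.67 and 2.5.68 is not strictly necessary: if the snapshots are ordered and the bug persists once introduced, a bisection needs only about log2(N) test boots. The C sketch below illustrates that idea only. The names `first_bad_snapshot` and `bad_from_threshold` are hypothetical, and the predicate stands in for the real manual test (build, boot, run lilo, reboot, check).

```c
#include <stddef.h>

/* Caller-supplied test: returns nonzero if snapshot number idx shows the bug. */
typedef int (*is_bad_fn)(size_t idx, void *ctx);

/*
 * Hypothetical bisection helper. Precondition: snapshot `good` does not show
 * the bug, snapshot `bad` does, and good < bad. Returns the index of the
 * first bad snapshot, assuming every snapshot after it is bad as well.
 */
size_t first_bad_snapshot(size_t good, size_t bad, is_bad_fn is_bad, void *ctx)
{
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* bug already present at mid */
		else
			good = mid;	/* bug introduced after mid */
	}
	return bad;
}

/*
 * Illustrative stand-in for the manual test: treats every snapshot at or
 * after the threshold stored in ctx as bad.
 */
int bad_from_threshold(size_t idx, void *ctx)
{
	return idx >= *(size_t *)ctx;
}
```

With 100 snapshots this takes about seven test boots instead of up to 99.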

--
Christophe Saout <[email protected]>

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- IDE Problem?


mm/msync.c:
<...>
* MS_ASYNC does not start I/O (it used to, up to 2.5.67).
<...>

You can revert changes to mm/msync.c from 2.5.68 patch
and see whether it helps.
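
[Editorial sketch] For a program that updates a file through a shared mapping, the difference noted in the mm/msync.c comment matters: with MS_ASYNC the call may return before any I/O has started, while MS_SYNC waits for the write. The C sketch below shows the MS_SYNC variant. It is an illustration only; the helper name `mmap_store_sync` is hypothetical, and nothing in this thread establishes that lilo itself uses msync().

```c
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Hypothetical helper: copy len bytes into a shared mapping of fd at
 * offset 0 and wait for writeback with MS_SYNC. The file must already be
 * at least len bytes long and fd must be open for reading and writing.
 * Returns 0 on success or a negative errno value on failure.
 */
int mmap_store_sync(int fd, const void *src, size_t len)
{
	void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	int ret = 0;

	if (map == MAP_FAILED)
		return -errno;

	memcpy(map, src, len);

	/* MS_SYNC blocks until the modified pages have been written out. */
	if (msync(map, len, MS_SYNC) < 0)
		ret = -errno;

	if (munmap(map, len) < 0 && ret == 0)
		ret = -errno;
	return ret;
}
```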

Regards,
--
Bartlomiej

On 12 Jun 2003, Christophe Saout wrote:

> Am Don, 2003-06-12 um 00.08 schrieb Andy Pfiffer:
> > On Fri, 2003-05-09 at 13:55, Andy Pfiffer wrote:
> > > On Fri, 2003-05-09 at 13:04, Christophe Saout wrote:
> > > > Am Fre, 2003-05-09 um 21.04 schrieb Andy Pfiffer:
> > > >
> > > > > [...]
> > > > > I had an unrelated
> > > > > delay in posting this due to some strange behavior of late with LILO and
> > > > > my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
> > > > > but a subsequent reboot would not include my new kernel)
> > > >
> > > > So I'm not the only one having this problem... I think I first saw this
> > > > with 2.5.68 but I'm not sure.
> > >
> > > Well, that makes two of us for sure.
> > >
> > > >
> > > > My boot partition is a small ext3 partition on a lvm2 volume accessed
> > > > over device-mapper (I've written a lilo patch for that, but the patch is
> > > > working and) but I don't think that has something to do with the
> > > > problem.
> > > >
> > > > When syncing, unmounting and waiting some time after running lilo, the
> > > > changes sometimes seem correctly written to disk, I don't know when
> > > > exactly.
> > >
> > > My /boot is an ext3 partition on an IDE disk. My symptoms and your
> > > symptoms match -- wait awhile, and it works okay. If you don't wait
> > > "long enough" the changes made in /etc/lilo.conf are not reflected in
> > > the after running /sbin/lilo and rebooting normally.
> > >
> > > I have been unable to reproduce this on a uniproc system with SCSI
> > > disks.
> > >
> > > 2.5.67 seems to work in this regard as expected.
> > >
> > > > Could it be that the location of /boot/map is not written to the
> > > > partition sector of /dev/hda? Or not flushed correctly or something?
> > > >
> > > > After reboot the old kernel came up again (though it was moved to
> > > > vmlinuz.old).
> > >
> > > I don't know -- I haven't isolated it yet.
> > >
> > > Anyone else?
> >
> > I have taken another look at this, and can confirm the following:
> >
> > 1. 2.5.67 works as expected.
> > 2. 2.5.68, 2.5.69, and 2.5.70 do not.
> > 3. ext2 vs. ext3 for /boot: no effect (ie, .68, .69, .70 demonstrate the
> > problem independent of the filesystem used for /boot).
>
> I found out that flushb /dev/<boot_device> helps, syncing doesn't. Not
> 100% sure if that's right, because right now I'm always doing both, but
> I remember having only synced before and that didn't help.
>
> > Relative to a 2.5.68 pure BK tree, the deltas from 2.5.67 to 2.5.68 are:
> > 1.971.76.10 /* 2.5.67 */
> > 1.1124 /* 2.5.68 */
> >
> > The patch exported by BK between these 2 revs is 297K lines ( a sizeable
> > haystack ). Any ideas about where I should dig for my needle first
> > would be welcomed...
>
> There don't seem to be too much changes in /drivers/block or /fs, mostly
> cleanups. I personally have no idea where to start, except trying out
> each -bk version inbetween. Hmmm. And I'm not going to do that now...
> :-/
>
> --
> Christophe Saout <[email protected]>



2003-06-12 00:08:16

by Andy Pfiffer

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

On Wed, 2003-06-11 at 16:21, Christophe Saout wrote:
> Am Don, 2003-06-12 um 00.08 schrieb Andy Pfiffer:
> > On Fri, 2003-05-09 at 13:55, Andy Pfiffer wrote:
> > > On Fri, 2003-05-09 at 13:04, Christophe Saout wrote:
> > > > Am Fre, 2003-05-09 um 21.04 schrieb Andy Pfiffer:
> > > >
> > > > > [...]
> > > > > I had an unrelated
> > > > > delay in posting this due to some strange behavior of late with LILO and
> > > > > my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
> > > > > but a subsequent reboot would not include my new kernel)
> > > >
> > > > So I'm not the only one having this problem... I think I first saw this
> > > > with 2.5.68 but I'm not sure.
<snip>
> > I have taken another look at this, and can confirm the following:
> >
> > 1. 2.5.67 works as expected.
> > 2. 2.5.68, 2.5.69, and 2.5.70 do not.
> > 3. ext2 vs. ext3 for /boot: no effect (ie, .68, .69, .70 demonstrate the
> > problem independent of the filesystem used for /boot).
>
> I found out that flushb /dev/<boot_device> helps, syncing doesn't. Not
> 100% sure if that's right, because right now I'm always doing both, but
> I remember having only synced before and that didn't help.

<snip>

A little more digging reveals this thread from May 14, 2003:
http://marc.theaimsgroup.com/?l=linux-kernel&m=105296774516509&w=2

Applying the kludge in Adam's message:

--- linux-2.5.69/fs/block_dev.c.orig 2003-05-14 17:43:40.000000000 -0700
+++ linux-2.5.69/fs/block_dev.c 2003-05-14 17:44:29.000000000 -0700
@@ -635,14 +635,24 @@
 int blkdev_put(struct block_device *bdev, int kind)
 {
 	int ret = 0;
 	struct inode *bd_inode = bdev->bd_inode;
 	struct gendisk *disk = bdev->bd_disk;
 
 	down(&bdev->bd_sem);
+
+	/* AJR start */
+	switch (kind) {
+	case BDEV_FILE:
+	case BDEV_FS:
+		sync_blockdev(bd_inode->i_bdev);
+		break;
+	}
+	/* AJR end */
+
 	lock_kernel();
 	if (!--bdev->bd_openers) {
 		switch (kind) {
 		case BDEV_FILE:
 		case BDEV_FS:
 			sync_blockdev(bd_inode->i_bdev);
 			break;

made things work for me in 2.5.68.
I suspect it will make things work for .70 as well.

So now the important question: is it wrong to not sync_blockdev() until
the count drops to 0?
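
[Editorial sketch] The question turns on reference counting: the unpatched blkdev_put() flushes only when the last opener goes away, so a device that is also held open by a mounted filesystem is not flushed when lilo closes it. The user-space toy model below illustrates that difference. It is not kernel code: the struct and function names are hypothetical, and it ignores the `kind` argument that the real patch switches on.

```c
#include <stdbool.h>

/* Toy stand-in for a block device: an open count plus a "dirty" flag. */
struct toy_bdev {
	int openers;	/* how many holders have the device open */
	bool dirty;	/* true while cached writes have not reached disk */
};

/* Pretend to write everything out. */
static void toy_sync(struct toy_bdev *b)
{
	b->dirty = false;
}

/* Models the unpatched logic: flush only when the last opener goes away. */
void toy_put_last_close_only(struct toy_bdev *b)
{
	if (--b->openers == 0)
		toy_sync(b);
}

/* Models the kludge: flush on every put, whoever else still holds it. */
void toy_put_always_sync(struct toy_bdev *b)
{
	toy_sync(b);
	--b->openers;
}
```

Starting from two openers and a dirty device, the first variant leaves the device dirty after one put, while the second leaves it clean.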

Andy


2003-06-12 00:20:15

by Andrew Morton

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

Andy Pfiffer <[email protected]> wrote:
>
> So now the important question: is it wrong to not sync_blockdev() until
> the count drops to 0?

Should be OK. The close will not sync anything if someone else has the
blockdev open (ie: there's a filesystem mounted there).

But sync() should certainly write everything out, and lilo does perform a
sync.

I'd be interested in seeing the contents of /proc/meminfo immediately after
the lilo run, see if there's any dirty memory left around.
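
[Editorial sketch] Since close() is not expected to flush while another holder has the device open, a program that needs its own writes on disk can ask for that explicitly before closing. The C sketch below shows the general write-then-fsync() pattern. The helper name `write_and_sync` is hypothetical, and whether this pattern would have avoided the problem reported in this thread is not established here.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the whole buffer, then fsync() the descriptor
 * before the caller closes it. Returns 0 on success or a negative errno
 * value on failure.
 */
int write_and_sync(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -errno;
		}
		p += n;			/* handle short writes */
		len -= (size_t)n;
	}
	if (fsync(fd) < 0)
		return -errno;
	return 0;
}
```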

2003-06-12 10:29:06

by Christophe Saout

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

On Thu, 2003-06-12 at 02:29, Andrew Morton wrote:

> But sync() should certainly write everything out, and lilo does perform a
> sync.

Yep.

> I'd be interested in seeing the contents of /proc/meminfo immediately after
> the lilo run, see if there's any dirty memory left around.

Yes, one page. After running lilo, there are 4k dirty, and running sync
doesn't get it below 4k. Only flushb /dev/hda does (or waiting several
minutes).

If you're interested, I've put an annotated version of

( cat /proc/meminfo; lilo; cat /proc/meminfo; sync; cat /proc/meminfo;
flushb /dev/hda; cat /proc/meminfo ) | buffer > meminfo.out.txt

on my web space: http://www.saout.de/files/meminfo.out.txt

(the kernel used was 2.5.70-mm7 with some unrelated patches backed out)
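
[Editorial sketch] The check above compares the amount of dirty memory at each step. A small C helper can pull that single number out of /proc/meminfo instead of eyeballing the full dump. This is a sketch that assumes the file contains a line of the form "Dirty: <n> kB"; the function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan /proc/meminfo-style text for the "Dirty:" line
 * and return its value in kB, or -1 if no such line is found.
 */
long parse_dirty_kb(const char *meminfo)
{
	const char *line = meminfo;

	while (line && *line) {
		long kb;

		if (sscanf(line, "Dirty: %ld kB", &kb) == 1)
			return kb;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}

/* Read the live file and return the current Dirty value in kB, or -1. */
long read_dirty_kb(void)
{
	char buf[8192];
	FILE *f = fopen("/proc/meminfo", "r");
	size_t n;

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_dirty_kb(buf);
}
```

Calling read_dirty_kb() before and after each of lilo, sync and the flush gives the same comparison as the pipeline above.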

BTW: I found out that now strace lilo freezes the machine...

--
Christophe Saout <[email protected]>

2003-06-12 10:39:59

by Andrew Morton

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

Christophe Saout <[email protected]> wrote:
>
> Am Don, 2003-06-12 um 02.29 schrieb Andrew Morton:
>
> > But sync() should certainly write everything out, and lilo does perform a
> > sync.
>
> Yep.
>
> > I'd be interested in seeing the contents of /proc/meminfo immediately after
> > the lilo run, see if there's any dirty memory left around.
>
> Yes, one page. After running lilo, there are 4k dirty, running sync
> doesn't get it below 4k.

That would tend to imply that a page got onto the wrong list. But if that
were so, nothing would be able to write it.

> Only flushb /dev/hda does (or waiting several minutes).

What is flushb?

I use `lilo ; reboot -f' about 1000 times a day, no probs. There's
something different.

Adam was doing strange things with an initrd and pivot_root. Are you doing
anything unconventional?

>
> BTW: I found out that now strace lilo freezes the machine...

Works OK here. Try `strace strace lilo' ;)


2003-06-12 10:58:47

by Christophe Saout

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

On Thu, 2003-06-12 at 12:54, Andrew Morton wrote:

> Christophe Saout <[email protected]> wrote:
> >
> > Am Don, 2003-06-12 um 02.29 schrieb Andrew Morton:
> >
> > > I'd be interested in seeing the contents of /proc/meminfo immediately after
> > > the lilo run, see if there's any dirty memory left around.
> >
> > Yes, one page. After running lilo, there are 4k dirty, running sync
> > doesn't get it below 4k.
>
> That would tend to imply that a page got onto the wrong list. But if that
> were so, nothing would be able to write it.
>
> > Only flushb /dev/hda does (or waiting several minutes).
>
> What is flushb?

A program that does a flush ioctl on a block device:

open("/dev/hda", O_RDONLY) = 3
ioctl(3, BLKFLSBUF, 0) = 0
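
[Editorial sketch] The two system calls in the strace output are enough to reconstruct what such a tool does. The C sketch below wraps them in one function. It is not the actual flushb source: the helper name `flush_block_device` is hypothetical, and running it against a real device normally requires sufficient privileges.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>	/* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper mirroring the two syscalls in the strace output:
 * open the device read-only, then issue BLKFLSBUF on it.
 * Returns 0 on success or a negative errno value on failure.
 */
int flush_block_device(const char *path)
{
	int fd = open(path, O_RDONLY);
	int ret = 0;

	if (fd < 0)
		return -errno;

	/* Ask the kernel to flush the buffer cache for this device. */
	if (ioctl(fd, BLKFLSBUF, 0) < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

A call such as flush_block_device("/dev/hda") corresponds to the "flushb /dev/hda" invocation mentioned earlier.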

> I use `lilo ; reboot -f' about 1000 times a day, no probs. There's
> something different.
>
> Adam was doing strange things with an initrd and pivot_root. Are you doing
> anything unconventional?

I'm using an initrd (but no pivot_root) that initializes my LVM2 volumes
(using device-mapper).

/boot and / are on device-mapper devices.

> > BTW: I found out that now strace lilo freezes the machine...
>
> Works OK here. Try `strace strace lilo' ;)

I'll try to find out what happens. Not interested in crashing my system
while answering emails now. ;)

--
Christophe Saout <[email protected]>

2003-06-12 11:11:54

by Christophe Saout

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- strace lilo - freeze

On Thu, 2003-06-12 at 12:54, Andrew Morton wrote:

> > BTW: I found out that now strace lilo freezes the machine...
> Works OK here. Try `strace strace lilo' ;)

Since we are already talking about syncing...

The last thing "strace lilo" shows is:

fsync(5

--
Christophe Saout <[email protected]>

2003-06-12 12:31:18

by Herbert Xu

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

Andrew Morton <[email protected]> wrote:
>
> I use `lilo ; reboot -f' about 1000 times a day, no probs. There's
> something different.
>
> Adam was doing strange things with an initrd and pivot_root. Are you doing
> anything unconventional?

I see exactly the same problem with lilo and I too use initrd + pivot_root.
I think Adam's post referred to elsewhere in this thread already identified
the problem as initrd-only.
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2003-06-12 17:15:36

by Andy Pfiffer

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

On Wed, 2003-06-11 at 17:29, Andrew Morton wrote:
> Andy Pfiffer <[email protected]> wrote:
> >
> > So now the important question: is it wrong to not sync_blockdev() until
> > the count drops to 0?
>
> Should be OK. The close will not sync anything if someone else has the
> blockdev open (ie: there's a filesystem mounted there).
>
> But sync() should certainly write everything out, and lilo does perform a
> sync.
>
> I'd be interested in seeing the contents of /proc/meminfo immediately after
> the lilo run, see if there's any dirty memory left around.

How I measured:

# cat measure
#!/bin/sh
sync
cat /proc/meminfo
/sbin/lilo
cat /proc/meminfo

# ./measure | dd bs=1024k > out

2.5.68 pure:

Before:                      After:
MemTotal:       514304 kB    MemTotal:       514304 kB
MemFree:        388720 kB    MemFree:        385648 kB
Buffers:         10056 kB    Buffers:         12092 kB
Cached:          54000 kB    Cached:          54956 kB
SwapCached:          0 kB    SwapCached:          0 kB
Active:          62928 kB    Active:          63240 kB
Inactive:        34416 kB    Inactive:        37120 kB
HighTotal:           0 kB    HighTotal:           0 kB
HighFree:            0 kB    HighFree:            0 kB
LowTotal:       514304 kB    LowTotal:       514304 kB
LowFree:        388720 kB    LowFree:        385648 kB
SwapTotal:      787144 kB    SwapTotal:      787144 kB
SwapFree:       787144 kB    SwapFree:       787144 kB
Dirty:               0 kB    Dirty:               8 kB <---
Writeback:           0 kB    Writeback:           0 kB
Mapped:          45484 kB    Mapped:          45488 kB
Slab:            11880 kB    Slab:            12100 kB
Committed_AS:   146184 kB    Committed_AS:   146184 kB
PageTables:        656 kB    PageTables:        656 kB
VmallocTotal:   516040 kB    VmallocTotal:   516040 kB
VmallocUsed:     42608 kB    VmallocUsed:     42608 kB
VmallocChunk:   473432 kB    VmallocChunk:   473432 kB

2.5.68+kludge:

Before:                      After:
MemTotal:       514304 kB    MemTotal:       514304 kB
MemFree:        390416 kB    MemFree:        387216 kB
Buffers:          9844 kB    Buffers:         11892 kB
Cached:          52920 kB    Cached:          53864 kB
SwapCached:          0 kB    SwapCached:          0 kB
Active:          62136 kB    Active:          62452 kB
Inactive:        33908 kB    Inactive:        36600 kB
HighTotal:           0 kB    HighTotal:           0 kB
HighFree:            0 kB    HighFree:            0 kB
LowTotal:       514304 kB    LowTotal:       514304 kB
LowFree:        390416 kB    LowFree:        387216 kB
SwapTotal:      787144 kB    SwapTotal:      787144 kB
SwapFree:       787144 kB    SwapFree:       787144 kB
Dirty:               0 kB    Dirty:               4 kB <---
Writeback:           0 kB    Writeback:           0 kB
Mapped:          45448 kB    Mapped:          45452 kB
Slab:            11564 kB    Slab:            11756 kB
Committed_AS:   146192 kB    Committed_AS:   146132 kB
PageTables:        656 kB    PageTables:        656 kB
VmallocTotal:   516040 kB    VmallocTotal:   516040 kB
VmallocUsed:     42608 kB    VmallocUsed:     42608 kB
VmallocChunk:   473432 kB    VmallocChunk:   473432 kB
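The only field that matters in these dumps is Dirty, so the comparison can be automated rather than read by eye. Below is a minimal sketch, assuming the "Dirty: <n> kB" line format shown above; the helper name meminfo_dirty_kb is hypothetical and is not part of the measure script.

```c
#include <stdio.h>

/*
 * Hypothetical helper: scan a /proc/meminfo-style stream for the "Dirty:"
 * line and return its value in kB, or -1 if no such line is found.
 */
long meminfo_dirty_kb(FILE *f)
{
    char line[256];
    long kb;

    while (fgets(line, sizeof line, f) != NULL) {
        /* Lines for other fields fail to match the literal "Dirty:". */
        if (sscanf(line, "Dirty: %ld kB", &kb) == 1)
            return kb;
    }
    return -1;
}
```

Calling it on fopen("/proc/meminfo", "r") right after sync and again right after /sbin/lilo would give the two numbers marked with arrows above.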




2003-06-12 17:39:21

by Andrew Morton

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

Andy Pfiffer <[email protected]> wrote:
>
> Dirty: 0 kB Dirty: 4 kB <---

OK. And are you using initrd as well?

2003-06-12 17:50:57

by Andy Pfiffer

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

On Thu, 2003-06-12 at 10:53, Andrew Morton wrote:
> Andy Pfiffer <[email protected]> wrote:
> >
> > Dirty: 0 kB Dirty: 4 kB <---
>
> OK. And are you using initrd as well?

It is specified in lilo.conf (SuSE 8.0 distro) but I don't see any
reason to keep it. I'll yank it and see if it makes a difference.

Andy


2003-06-12 17:55:25

by Andrew Morton

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

Andy Pfiffer <[email protected]> wrote:
>
> On Thu, 2003-06-12 at 10:53, Andrew Morton wrote:
> > Andy Pfiffer <[email protected]> wrote:
> > >
> > > Dirty: 0 kB Dirty: 4 kB <---
> >
> > OK. And are you using initrd as well?
>
> It is specified in lilo.conf (SuSE 8.0 distro) but I don't see any
> reason to keep it. I'll yank it and see if it makes a difference.
>

That would be interesting.

Also, what about this shot in the dark?


--- 25/fs/fs-writeback.c~a 2003-06-12 11:08:34.000000000 -0700
+++ 25-akpm/fs/fs-writeback.c 2003-06-12 11:08:39.000000000 -0700
@@ -368,7 +368,7 @@ void sync_inodes_sb(struct super_block *
 	};
 
 	get_page_state(&ps);
-	wbc.nr_to_write = ps.nr_dirty + ps.nr_unstable +
+	wbc.nr_to_write = ps.nr_dirty + ps.nr_unstable + 1024 +
 			(ps.nr_dirty + ps.nr_unstable) / 4;
 	spin_lock(&inode_lock);
 	sync_sb_inodes(sb, &wbc);

_
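To see what the extra 1024 changes, the budget expression from the hunk can be evaluated on its own. This is a user-space sketch of the arithmetic only; the function name toy_nr_to_write and the slack parameter are made up for illustration and do not exist in the kernel.

```c
/*
 * Writeback budget as computed in the hunk above: every dirty or unstable
 * page, plus a quarter of that count again, plus a fixed slack
 * (0 before the patch, 1024 after). Integer division, as in the original.
 */
unsigned long toy_nr_to_write(unsigned long nr_dirty,
                              unsigned long nr_unstable,
                              unsigned long slack)
{
    unsigned long pending = nr_dirty + nr_unstable;

    return pending + slack + pending / 4;
}
```

With the single dirty 4k page reported in this thread, the unpatched budget is 1 page and the patched budget is 1025 pages.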

2003-06-12 18:12:33

by Andy Pfiffer

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

On Thu, 2003-06-12 at 11:03, Andy Pfiffer wrote:
> On Thu, 2003-06-12 at 10:53, Andrew Morton wrote:
> > Andy Pfiffer <[email protected]> wrote:
> > >
> > > Dirty: 0 kB Dirty: 4 kB <---
> >
> > OK. And are you using initrd as well?
>
> It is specified in lilo.conf (SuSE 8.0 distro) but I don't see any
> reason to keep it. I'll yank it and see if it makes a difference.

pure == 2.5.68
kludge == 2.5.68+kludge in blkdev_put()

% grep Dirt =noinitrd-*
=noinitrd-kludge=:Dirty: 0 kB # before
=noinitrd-kludge=:Dirty: 4 kB # after
=noinitrd-pure=:Dirty: 0 kB # before
=noinitrd-pure=:Dirty: 4 kB # after

So it would appear to me that initrd is the common denominator among
those of us reporting similar symptoms.

Andy




2003-06-12 18:39:44

by Christophe Saout

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

On Thu, 2003-06-12 at 20:10, Andrew Morton wrote:

> Also, what about this shot in the dark?

> - wbc.nr_to_write = ps.nr_dirty + ps.nr_unstable +
> + wbc.nr_to_write = ps.nr_dirty + ps.nr_unstable + 1024 +

Nope, still 4k dirty left after lilo.

--
Christophe Saout <[email protected]>

2003-06-13 07:46:56

by Andrew Morton

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?


This should fix it.



Once the blockdev inode for /dev/ram0 is dirtied we have a memory-backed
inode on the blockdev superblock's s_dirty list.

sync_sb_inodes() sees the memory-backed inode on the superblock and assumes
that all the other inodes on the superblock are also memory-backed. This is
not true for the blockdev superblock! We forget to write out dirty pages
against the following blockdevs.

Fix this by just leaving the inode dirty and moving on to inspect the other
blockdev inodes on sb->s_io.

(This is a little inefficient: an alternative is to leave dirtied
memory-backed inodes on inode_in_use, so nobody ever even considers them for
writeout. But that introduces an inconsistency and is a bit kludgey).
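The effect of that early exit can be shown with a toy model in user space. This is only an illustration under simplifying assumptions: struct toy_inode, toy_sync_sb_inodes and the skip_only_this flag are invented names, and nothing here is kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dirty inode queued for writeback. */
struct toy_inode {
    const char *name;
    bool memory_backed;   /* e.g. a ramdisk such as /dev/ram0 */
    bool written;         /* set once the walk "writes" this inode */
};

/*
 * Walk the queued inodes in order and return how many were written.
 * skip_only_this == false models the old behaviour: the first
 * memory-backed inode ends the walk. skip_only_this == true models the
 * fix for the blockdev superblock: that inode is skipped and the walk
 * continues with the next one.
 */
size_t toy_sync_sb_inodes(struct toy_inode *inodes, size_t n,
                          bool skip_only_this)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        if (inodes[i].memory_backed) {
            if (skip_only_this)
                continue;   /* leave it dirty, look at the next inode */
            break;          /* assume the rest are memory-backed too */
        }
        inodes[i].written = true;
        written++;
    }
    return written;
}
```

With a queue of { ram0 (memory-backed), hda }, the old walk writes nothing and hda stays dirty, while the fixed walk skips ram0 and writes hda.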



fs/fs-writeback.c | 15 ++++++++++++++-
1 files changed, 14 insertions(+), 1 deletion(-)

diff -puN fs/fs-writeback.c~writeback-memory-backed-fix fs/fs-writeback.c
--- 25/fs/fs-writeback.c~writeback-memory-backed-fix 2003-06-12 23:12:28.000000000 -0700
+++ 25-akpm/fs/fs-writeback.c 2003-06-12 23:14:07.000000000 -0700
@@ -260,8 +260,21 @@ sync_sb_inodes(struct super_block *sb, s
 		struct address_space *mapping = inode->i_mapping;
 		struct backing_dev_info *bdi = mapping->backing_dev_info;
 
-		if (bdi->memory_backed)
+		if (bdi->memory_backed) {
+			if (sb == blockdev_superblock) {
+				/*
+				 * Dirty memory-backed blockdev: the ramdisk
+				 * driver does this.
+				 */
+				list_move(&inode->i_list, &sb->s_dirty);
+				continue;
+			}
+			/*
+			 * Assume that all inodes on this superblock are memory
+			 * backed. Skip the superblock.
+			 */
 			break;
+		}
 
 		if (wbc->nonblocking && bdi_write_congested(bdi)) {
 			wbc->encountered_congestion = 1;

_

2003-06-13 09:44:30

by Herbert Xu

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

On Fri, Jun 13, 2003 at 01:01:49AM -0700, Andrew Morton wrote:
>
> Fix this by just leaving the inode dirty and moving on to inspect the other
> blockdev inodes on sb->s_io.

This fixes it for me. Thanks Andrew.
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2003-06-13 14:29:30

by Eduardo Pereira Habkost

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

On Fri, Jun 13, 2003 at 01:01:49AM -0700, Andrew Morton wrote:
>
> This should fix it.
>

It worked here. Thanks!

--
Eduardo



2003-06-13 17:04:15

by Andy Pfiffer

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

On Fri, 2003-06-13 at 01:01, Andrew Morton wrote:
> Fix this by just leaving the inode dirty and moving on to inspect the other
> blockdev inodes on sb->s_io.

Yup, this fixed it for me, too. Thanks for your help. --Andy



2003-06-13 22:01:00

by Unai Garro Arrazola

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?


I just got the time to check. It works great, thanks! Where can I send this
box of chocolates as gratitude? ;-)

On Friday 13 June 2003 09:01, Andrew Morton wrote:
> This should fix it.
>
>
>
> Once the blockdev inode for /dev/ram0 is dirtied we have a memory-backed
> inode on the blockdev superblock's s_dirty list.
>
> sync_sb_inodes() sees the memory-backed inode on the superblock and assumes
> that all the other inodes on the superblock are also memory-backed. This
> is not true for the blockdev superblock! We forget to write out dirty
> pages against the following blockdevs.
>
> Fix this by just leaving the inode dirty and moving on to inspect the other
> blockdev inodes on sb->s_io.
>
> (This is a little inefficient: an alternative is to leave dirtied
> memory-backed inodes on inode_in_use, so nobody ever even considers them
> for writeout. But that introduces an inconsistency and is a bit kludgey).
>
>
>
> fs/fs-writeback.c | 15 ++++++++++++++-
> 1 files changed, 14 insertions(+), 1 deletion(-)
>
> diff -puN fs/fs-writeback.c~writeback-memory-backed-fix fs/fs-writeback.c
> --- 25/fs/fs-writeback.c~writeback-memory-backed-fix 2003-06-12 23:12:28.000000000 -0700
> +++ 25-akpm/fs/fs-writeback.c 2003-06-12 23:14:07.000000000 -0700
> @@ -260,8 +260,21 @@ sync_sb_inodes(struct super_block *sb, s
> struct address_space *mapping = inode->i_mapping;
> struct backing_dev_info *bdi = mapping->backing_dev_info;
>
> - if (bdi->memory_backed)
> + if (bdi->memory_backed) {
> + if (sb == blockdev_superblock) {
> + /*
> + * Dirty memory-backed blockdev: the ramdisk
> + * driver does this.
> + */
> + list_move(&inode->i_list, &sb->s_dirty);
> + continue;
> + }
> + /*
> + * Assume that all inodes on this superblock are memory
> + * backed. Skip the superblock.
> + */
> break;
> + }
>
> if (wbc->nonblocking && bdi_write_congested(bdi)) {
> wbc->encountered_congestion = 1;
>
> _

--
Coincidences are spiritual puns.
-- G.K. Chesterton

2003-06-13 22:18:30

by Andrew Morton

Subject: Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?

Unai Garro Arrazola <[email protected]> wrote:
>
> I just got the time to check. It works great, thanks!

Thanks for following up.

> Where can I send this box of chocolates as gratitude? ;-)

Not to the guy who broke it in the first place ;)