2007-10-24 18:05:43

by James Ausmus

Subject: Possible 2.6.23 regression - Disappearing disk space

Since updating my laptop to 2.6.23, occasionally all of my free disk
space on my root partition will just go away, with no files accounting
for the space, with no odd messages in dmesg or my syslog. If I
reboot, I immediately have the proper amount of free space again. Here
is the output of a du -sx * on /, and the output of the df command
when the problem is occurring, followed by the same info after a fresh
reboot (literally just did the command in the failed state, then
immediately rebooted and ran the same commands again) - any thoughts
as to what might be happening?

As a note - when I first see the issue, I have exactly 0 free space
available on root, as per df - I then delete some random things in
order to have enough free space to operate, which is why in my first
df you see 55M available

Prior to reboot (with issue occurring):

charles / # du -sx *
8814 bin
0 boot
132 dev
392 emul
105869 etc
1718345 home
0 lib
4858 lib32
72358 lib64
4 media
72374 mnt
2027054 opt
0 proc
126411 root
15596 sbin
0 sys
28237 tmp
20887995 usr
1098215 var

charles / # df

Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda4 37020624 36964772 55852 100% /
udev 10240 132 10108 2% /dev
shm 1028712 24 1028688 1% /dev/shm
/dev/sdc2 73246080 20255824 52990256 28% /media/linux

charles / #

du -sx totals: 26,166,654

Post reboot:

charles / # df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda4 37020624 24449440 12571184 67% /
udev 10240 132 10108 2% /dev
shm 1028712 0 1028712 0% /dev/shm
/dev/sdg2 73246080 20255824 52990256 28% /media/linux
charles / # du -sx /

charles / # du -sx *
8814 bin
0 boot
132 dev
392 emul
105869 etc
1719241 home
0 lib
4858 lib32
72358 lib64
4 media
72374 mnt
2027054 opt
0 proc
126406 root
15596 sbin
0 sys
28249 tmp
20887995 usr
1040963 var

charles / # df

Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda4 37020624 24452848 12567776 67% /
udev 10240 132 10108 2% /dev
shm 1028712 24 1028688 1% /dev/shm
/dev/sdg2 73246080 20255824 52990256 28% /media/linux

charles / #

charles / # uname -a
Linux charles 2.6.23-gentoo #1 SMP PREEMPT Tue Oct 16 11:34:08 PDT 2007 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ AuthenticAMD GNU/Linux

charles / # mount
/dev/sda4 on / type reiserfs (rw,noatime)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec)
udev on /dev type tmpfs (rw,nosuid)
devpts on /dev/pts type devpts (rw,nosuid,noexec)
shm on /dev/shm type tmpfs (rw,noexec,nosuid,nodev)
usbfs on /proc/bus/usb type usbfs (rw,noexec,nosuid,devmode=0664,devgid=85)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
nfsd on /proc/fs/nfs type nfsd (rw,noexec,nosuid,nodev)
/dev/sdg2 on /media/linux type reiserfs (rw,nosuid,nodev)
/media/linux/usr/portage/distfiles on /usr/portage/distfiles type none (rw,noexec,nosuid,nodev,bind)



As you can see, the du -sx * gives virtually identical used space
(which adds up to about 26G, in both instances), but df thinks that
there is no free space available, and I can't write anything to disk
(I get a disk full error).
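
The gap can be put in numbers from the figures above: in the failed state df reports 36964772 1K-blocks used on /dev/sda4, while the du -sx entries sum to 26166654. A small sketch of that subtraction follows (the function name unaccounted_kb is made up for illustration, and the result is only approximate because some files were deleted before df was run):

```shell
#!/bin/sh
# unaccounted_kb DF_USED_KB DU_TOTAL_KB
# Prints how many 1K blocks df counts as used that du cannot attribute
# to any file still reachable through a directory entry.
unaccounted_kb() {
    echo $(( $1 - $2 ))
}

# Failed-state figures: df "Used" for /dev/sda4, then the du -sx total.
unaccounted_kb 36964772 26166654
```

This prints 10798118, i.e. roughly 10.3 GiB that df counts as used but du cannot see.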

Attached is output from lspci -v, and zcat /proc/config.gz

Thanks!

-James


Attachments:
(No filename) (3.53 kB)
config (41.35 kB)
lspci.txt (9.09 kB)

2007-10-24 18:25:37

by Tomasz Chmielewski

Subject: Re: Possible 2.6.23 regression - Disappearing disk space

James Ausmus wrote:

> As a note - when I first see the issue, I have exactly 0 free space
> available on root, as per df - I then delete some random things in
> order to have enough free space to operate, which is why in my first
> df you see 55M available

Perhaps some program still uses some (deleted) files.

Here's how you can achieve a similar effect manually:

# dd if=/dev/zero of=/file

And on another terminal:

# rm -f /file


Now watch your space decreasing with "df -h", although "the file was
deleted". Was it really?

# lsof -n|grep /file
dd 6406 root 1w REG 8,1 807791616 45 /file (deleted)



That said, you might want to use lsof and search for "deleted" before
concluding any further.
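
A bounded version of this experiment, as a rough sketch: unlike the open-ended dd above, it writes only 1 MiB to a scratch file from mktemp, and it assumes a Linux /proc. Descriptor number 3 and the variable names are arbitrary choices for the illustration.

```shell
#!/bin/sh
# Write a small file, keep it open on descriptor 3, unlink it, and
# confirm that the kernel still tracks it.
demo=$(mktemp)                 # scratch file; location depends on TMPDIR
exec 3<"$demo"                 # hold the file open in this shell
dd if=/dev/zero of="$demo" bs=1024 count=1024 2>/dev/null   # 1 MiB of data
rm -f "$demo"                  # remove the only directory entry

# readlink inherits descriptor 3, so /proc/self/fd/3 inside it names the
# same open file; for an unlinked file the target ends in " (deleted)".
target=$(readlink /proc/self/fd/3)
echo "fd 3 -> $target"

# The data is still readable and still occupies blocks until fd 3 is closed.
held_bytes=$(wc -c <&3)
echo "bytes still held: $held_bytes"

exec 3<&-                      # closing the descriptor finally frees the space
```

The space is only returned to the filesystem at the last line, when the descriptor is closed; the rm alone does not free it.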


--
Tomasz Chmielewski
http://wpkg.org

2007-10-24 18:30:58

by James Ausmus

Subject: Re: Possible 2.6.23 regression - Disappearing disk space

On 10/24/07, Tomasz Chmielewski <[email protected]> wrote:
> James Ausmus wrote:
>
> > As a note - when I first see the issue, I have exactly 0 free space
> > available on root, as per df - I then delete some random things in
> > order to have enough free space to operate, which is why in my first
> > df you see 55M available
>
> Perhaps some program still uses some (deleted) files.
>
> Here's how you can achieve a similar effect manually:
>
> # dd if=/dev/zero of=/file
>
> And on another terminal:
>
> # rm -f /file
>
>
> Now watch your space decreasing with "df -h", although "the file was
> deleted". Was it really?
>
> # lsof -n|grep /file
> dd 6406 root 1w REG 8,1 807791616 45 /file (deleted)
>
>
>
> That said, you might want to use lsof and search for "deleted" before
> concluding any further.


Thanks for the tip - I'll do that the next time the issue occurs and
post my results.

-James

>
>
> --
> Tomasz Chmielewski
> http://wpkg.org
>

2007-10-25 11:56:37

by Rolf Eike Beer

Subject: Re: Possible 2.6.23 regression - Disappearing disk space

James Ausmus wrote:
> Since updating my laptop to 2.6.23, occasionally all of my free disk
> space on my root partition will just go away, with no files accounting
> for the space, with no odd messages in dmesg or my syslog. If I
> reboot, I immediately have the proper amount of free space again. Here
> is the output of a du -sx * on /, and the output of the df command
> when the problem is occurring, followed by the same info after a fresh
> reboot (literally just did the command in the failed state, then
> immediately rebooted and ran the same commands again) - any thoughts
> as to what might be happening?

The file that eats up the disk space has been deleted, but a process still holds it open.

HTH

Eike



2007-10-26 00:36:34

by Bill Davidsen

Subject: Re: Possible 2.6.23 regression - Disappearing disk space

James Ausmus wrote:
> Since updating my laptop to 2.6.23, occasionally all of my free disk
> space on my root partition will just go away, with no files accounting
> for the space, with no odd messages in dmesg or my syslog. If I
> reboot, I immediately have the proper amount of free space again. Here
> is the output of a du -sx * on /, and the output of the df command
> when the problem is occurring, followed by the same info after a fresh
> reboot (literally just did the command in the failed state, then
> immediately rebooted and ran the same commands again) - any thoughts
> as to what might be happening?
>
Clearly some process is still using a deleted file. However, if it
doesn't happen with 2.6.22.x kernels, it would seem to fall under the
category of regression, in the "used to work" sense. Before going
further you may want to be really sure that an older kernel doesn't do
this, so no one wastes time on a non-problem.

Assuming the older kernel works fine, it's possible that some new
behavior of the kernel is causing a process to misbehave, and step one
is to use lsof and try to find the process. It's possible that "top"
might be useful, although whatever is using the disk space may just be
lurking.
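
One way to find the process without lsof, as a sketch only: walk /proc and print every open descriptor whose target is marked as unlinked, largest first. This assumes a Linux /proc and a stat that accepts -L and -c (as GNU coreutils does); the function name list_deleted_open is made up for illustration.

```shell
#!/bin/sh
# list_deleted_open: print "SIZE_BYTES PID TARGET" for every open descriptor
# whose target the kernel marks as unlinked, largest first.
# Run as root to see the descriptors of all processes.
list_deleted_open() {
    for fd in /proc/[0-9]*/fd/*; do
        # Processes may exit while we scan, so every lookup may fail.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            /*" (deleted)") ;;      # target marked as unlinked
            *) continue ;;
        esac
        pid=${fd#/proc/}
        pid=${pid%%/*}
        # stat -L follows the /proc symlink to the still-open inode.
        size=$(stat -L -c %s "$fd" 2>/dev/null) || continue
        echo "$size $pid $target"
    done | sort -rn
}

list_deleted_open
```

The first lines of output are the best candidates for whatever is holding the missing space. A file whose real name happens to end in " (deleted)" would show up as a false positive.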

--
Bill Davidsen <[email protected]>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

2007-10-26 19:14:21

by James Ausmus

Subject: Re: Possible 2.6.23 regression - Disappearing disk space

On 10/25/07, Bill Davidsen <[email protected]> wrote:
> James Ausmus wrote:
> > Since updating my laptop to 2.6.23, occasionally all of my free disk
> > space on my root partition will just go away, with no files accounting
> > for the space, with no odd messages in dmesg or my syslog. If I
> > reboot, I immediately have the proper amount of free space again. Here
> > is the output of a du -sx * on /, and the output of the df command
> > when the problem is occurring, followed by the same info after a fresh
> > reboot (literally just did the command in the failed state, then
> > immediately rebooted and ran the same commands again) - any thoughts
> > as to what might be happening?
> >
> Clearly some process is still using a deleted file. However, if it
> doesn't happen with 2.6.22.x kernels, it would seem to fall under the
> category of regression, in the "used to work" sense. Before going
> further you may want to be really sure that an older kernel doesn't do
> this, so no one wastes time on a non-problem.
>
> Assuming the older kernel works fine, it's possible that some new
> behavior of the kernel is causing a process to misbehave, and step one
> is to use lsof and try to find the process. It's possible that "top"
> might be useful, although whatever is using the disk space may just be
> lurking.
>

OK, false alarm, this is definitely a userspace (or a user... :)
problem - had a 12GB .xsession-errors file that I had deleted but was
still being held open - now I just have to determine why I have a 12GB
.xsession-errors file... :(
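
For the record, space held by a deleted-but-open file like this can usually be released without a reboot by truncating it through /proc. The following is a sketch under assumptions: a Linux /proc, and a PID and descriptor number taken from lsof output; the function name release_deleted is made up for illustration.

```shell
#!/bin/sh
# release_deleted PID FD: truncate the unlinked file that process PID holds
# open on descriptor FD, giving its blocks back to the filesystem.
# PID and FD are hypothetical inputs, e.g. read off lsof output.
release_deleted() {
    link="/proc/$1/fd/$2"
    case $(readlink "$link") in
        *" (deleted)")
            : > "$link"             # reopening with O_TRUNC empties the inode
            ;;
        *)
            echo "not a deleted file: $link" >&2
            return 1
            ;;
    esac
}
```

It would be called as, for example, release_deleted 6406 1. If the writer did not open the file in append mode, its next write lands at the old offset and the file becomes sparse, so restarting the writer is still the cleaner fix.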

Thanks for the help all!

-James

> --
> Bill Davidsen <[email protected]>
> "We have more to fear from the bungling of the incompetent than from
> the machinations of the wicked." - from Slashdot
>

2007-10-26 19:19:50

by Tomasz Chmielewski

Subject: Re: Possible 2.6.23 regression - Disappearing disk space

James Ausmus wrote:

> OK, false alarm, this is definitely a userspace (or a user... :)
> problem - had a 12GB .xsession-errors file that I had deleted but was
> still being held open - now I just have to determine why I have a 12GB
> .xsession-errors file... :(

Surely, .xsession-errors was 12GB large because of a 2.6.23 regression ;)


--
Tomasz Chmielewski
http://blog.wpkg.org