(Note: bug originally posted on BugTraq, this response is cross-posted to
the LKML because of the way the server hosting the X font server ground to
a halt.)
On Mon, 10 Jun 2002 at 10:20, Tom wrote:
> Affected
> ========
> Mozilla 1.0 and earlier
> verified on Linux and Solaris, other Unixes most likely affected as well.
Re-verified on Debian GNU/Linux running kernel 2.4.18-xfs with Mozilla
1.0.0 and XFree86 4.1.0 with the font server running on a separate server
also running on Debian GNU/Linux with kernel 2.4.18-xfs and the X font
server from XFree86 4.1.0.
The server running the X font server has 512MB RAM, 1GB swap, has one ext2
partition (/boot), one ReiserFS partition (Squid cache) and the rest are
XFS partitions. It also runs the kernel-mode NFS server v1.0 and Samba
2.2.4, among other things, and is typically headless.
> Effect
> ======
> System becomes unusable or X windows crashes
> (varies depending on system configuration)
Here I had a workstation (which I figured I could simply reboot after I'd
seen how the problem behaves) that used a remote, company-wide font
server. Yes, I'm the administrator, so I killed myself. I've also made a
personal note: "Don't try stuff posted on BugTraq on anything connected
to the network, dummy." ;)
Anyway, FWIW:
First I noticed that the Mozilla window stopped responding immediately. So
I closed it and breathed deeply. Then I noticed the server hosting the X
font server started thrashing really badly. /home is exported from the
same server via kernel-mode NFS, so file access to /home started freezing
on workstations. Samba access slowed down, but continued to work.
I was able to log on to the server with enough time to SIGKILL the
xfs-daemon process. Unfortunately this wasn't good enough. The server
started accumulating various processes stuck in "D" state, the OOM killer
went into panic mode and started killing things left and right, mostly,
from what I noticed, apache and apache-ssl processes, with messages like
"Out of Memory: Killed process xxxx (apache)". I was also able to run
`free` after killing xfs-daemon and noticed that there was a lot of free
memory, both physical and swap.
Within less than ten minutes on this relatively lightly-loaded server, I
could not log in to the machine locally, even as root (whose home
directory is not NFS-exported) and load levels shot up to around 25, which
is definitely abnormal. Processes in existing login sessions also got stuck
in whatever they were doing (`ps ax` in particular, as I recall).
Attempted reboots locally via Ctrl-Alt-Delete and Magic SysRq failed
because (1) I had disabled ctrl-alt-delete mapping in inittab "for
security", and (2) I had not compiled Magic SysRq support on this
particular server. More notes to self.
> Description
> ===========
> When loading pages with a specially prepared (or erroneous) stylesheet,
> mozilla and X windows (not restricted to XFree) exhibit either of two
> undesirable behaviours. This seems to depend on the local system
> configuration, especially to the presence of xfs, but bug reports so far
> are inconclusive.
> In one scenario, X simply crashes, taking everything with it. This will result
> in the loss of unsaved work.
> In scenario two, memory usage of the X server explodes until the machine
> reaches the thrashing point, at which point only a hard kill (-9) of the
> X server can save it, provided there are enough system resources left to
> issue the kill.
In my case the workstation was easy to stabilize. I was actually able to
close all windows and exit xfce properly. It's the server running the X
font server that ground to a painful halt. I do not know how things would
have gone if I had had Magic SysRq enabled.
> The bug is triggered by a huge font setting done through CSS. Depending
> on the end user's system configuration, this will either trigger an
> abort in the XFree86 code ("Beziers this large not supported") or cause
> an explosive use of memory. It is unknown how much memory could get
> consumed, but follow-ups to the mozilla bug verify that machines with 1
> GB of memory still reach the thrashing point.
While I agree with BugTraq posts in response to this that applications
like Mozilla which accept font-sizes from unknown sources should have some
check to prevent such large sizes from crashing X and/or the X Font
Server, I'm alarmed by (1) the way the X font server allows itself to be
crashed like this, and (2) the way the entire Linux kernel seems to have
been unable to handle the situation. While having a central company- or
department-wide font server may not have been the best choice on my part,
this seems like an alarmingly simple way to drive a stake through a
system's heart.
Suggestions on how to work around this on multiple levels would definitely
be appreciated. I'll start by moving the X font server off our file and
authentication server onto some high-powered workstation, but I'm sure
this won't be enough, and knowing that a user process like xfs-daemon can
drag the Linux kernel to its knees is not very comforting. :(
--> Jijo
--
Federico Sevilla III : <http://jijo.free.net.ph/>
Network Administrator : The Leather Collection, Inc.
GnuPG Key ID : 0x93B746BE
> With current Linux kernels, this careful coding brings no benefit
> whatsoever, as malloc never (?) returns NULL. What is worse, as far as
> I know the kernel doesn't send advance warning of an OOM situation; it
> would not be too difficult to stop allocating memory when that happens.
With properly configured systems and mode 2/3 overcommit set, you can get
NULL back from malloc. It's up to the user.
> check to prevent such large sizes from crashing X and/or the X Font
> Server, I'm alarmed by (1) the way the X font server allows itself to be
> crashed like this, and (2) the way the entire Linux kernel seems to have
> been unable to handle the situation. While having a central company or
So turn on the features to control it. Set rlimits on the xfs server and
enable non-overcommit (-ac kernel).
Alan
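A minimal sketch of what those two knobs mean in practice, assuming a daemon
that caps its own address space at startup and then treats a NULL return
from malloc() as a recoverable error; the 64 MB figure and the program
itself are only an illustration, not xfs code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    size_t request = 200UL * 1024 * 1024;   /* deliberately too big */
    void *pool;

    /* Cap our own address space so a runaway allocation hits this
     * limit instead of dragging the whole machine into swap.
     * 64 MB is an arbitrary illustrative value. */
    rl.rlim_cur = rl.rlim_max = 64UL * 1024 * 1024;
    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }

    /* With the limit (or strict overcommit) in effect, malloc() really
     * can return NULL, and it is up to us to handle that. */
    pool = malloc(request);
    if (pool == NULL) {
        fprintf(stderr, "refusing %lu byte request, carrying on\n",
                (unsigned long)request);
        return 0;
    }
    memset(pool, 0, request);
    free(pool);
    return 0;
}

On a default, overcommitting kernel and without the setrlimit() call, the
same malloc() would most likely appear to succeed, and the trouble would
only start once the pages were actually touched.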
Hi Alan,
(cc BugTraq and LKML)
On Thu, 13 Jun 2002 at 06:39, Alan Cox wrote:
> > check to prevent such large sizes from crashing X and/or the X Font
> > Server, I'm alarmed by (1) the way the X font server allows itself to
> > be crashed like this, and (2) the way the entire Linux kernel seems to
> > have been unable to handle the situation. While having a central
> > company or
>
> So turn on the features to control it. Set rlimits on the xfs server and
> enable non-overcommit (-ac kernel).
I am using SGI's XFS, and I think they follow Marcelo's kernels for the
2.4 series, at the moment. Are there any plans to get non-overcommit
from your tree into Marcelo's?
TIA. :)
--> Jijo
--
Federico Sevilla III : <http://jijo.free.net.ph/>
Network Administrator : The Leather Collection, Inc.
GnuPG Key ID : 0x93B746BE
On Thu, 13 Jun 2002 13:57:33 +0800 (PHT),
Federico Sevilla III <[email protected]> wrote:
>On Thu, 13 Jun 2002 at 06:39, Alan Cox wrote:
>> > check to prevent such large sizes from crashing X and/or the X Font
>> > Server, I'm alarmed by (1) the way the X font server allows itself to
>> > be crashed like this, and (2) the way the entire Linux kernel seems to
>> > have been unable to handle the situation. While having a central
>> > company or
>>
>> So turn on the features to control it. Set rlimits on the xfs server and
>> enable non-overcommit (-ac kernel).
>
>I am using SGI's XFS, and I think they follow Marcelo's kernels for the
SGI's XFS != xfs server. SGI XFS == journalling filesystem.
xfs server == font server for the X windowing system.
On Thu, Jun 13, 2002 at 09:44:33AM +0800, Federico Sevilla III wrote:
> (Note: bug originally posted on BugTraq, this response is cross-posted to
> the LKML because of the way the server hosting the X font server ground to
> a halt.)
It really is nothing new. (Apart from XFree86's Type1 engine calling
abort() when it doesn't like what it is asked to do ..)
...
> Suggestions on how to work around this on multiple levels would definitely
> be appreciated. I'll be starting by removing the X font server from our
> file and authentication server onto some high-powered workstation, but I'm
> sure this won't be enough, and knowing that a user process like xfs-daemon
> can drag the Linux kernel down to knees is not very comforting. :(
ANY very big program with an active working set larger than memory has
problems on Linux / Linux has problems handling it. Indeed the problem is
_not_ new, nor trivial to solve efficiently. The ultimate situation is
called "thrashing": for a program to proceed, a page of code needs to be
moved into memory, and to make room for that, some other program's page
must be moved out. What makes the issue more difficult is that memory is
also used for lots and lots of different kinds of buffers and caches, and
playing a balancing act on them all is quite difficult.
This is a recurring topic on the linux-kernel list, and it has its own
list called linux-mm (not at vger, though).
Others may have different views on the issue, but I think that:
2.0: fairly tolerable OOM/thrashing behaviour
2.2: got worse
2.4.early: rather terrible
2.4.late: improved somewhat, roughly on par with 2.2
/Matti Aarnio
On Thu, 13 Jun 2002 at 16:11, Keith Owens wrote:
> On Thu, 13 Jun 2002 13:57:33 +0800 (PHT),
> Federico Sevilla III <[email protected]> wrote:
> >On Thu, 13 Jun 2002 at 06:39, Alan Cox wrote:
> >> > check to prevent such large sizes from crashing X and/or the X Font
> >> > Server, I'm alarmed by (1) the way the X font server allows itself to
> >> > be crashed like this, and (2) the way the entire Linux kernel seems to
> >> > have been unable to handle the situation. While having a central
> >> > company or
> >>
> >> So turn on the features to control it. Set rlimits on the xfs server and
> >> enable non-overcommit (-ac kernel).
> >
> >I am using SGI's XFS, and I think they follow Marcelo's kernels for the
>
> SGI's XFS != xfs server. SGI XFS == journalling filesystem.
> xfs server == font server for the X windowing system.
Argh. And we get hit by similarly-named projects. :(
To clarify: I wanted to know whether there were plans to get
non-overcommit from the AC tree into Marcelo's mainline 2.4 tree, because
I use SGI's XFS (journalling filesystem), and thus use the -xfs kernel
tree, which follows -marcelo and not -ac.
Apologies.
--> Jijo
--
Federico Sevilla III : <http://jijo.free.net.ph/>
Network Administrator : The Leather Collection, Inc.
GnuPG Key ID : 0x93B746BE
On 13 Jun, Federico Sevilla III wrote:
> Suggestions on how to work around this on multiple levels would definitely
> be appreciated. I'll be starting by removing the X font server from our
> file and authentication server onto some high-powered workstation, but I'm
> sure this won't be enough, and knowing that a user process like xfs-daemon
> can drag the Linux kernel down to knees is not very comforting. :(
>
The protection that you need is provided by "ulimit" on most Unixes.
There are facilities to limit maximum real memory used, maximum virtual
memory, maximum number of processes, etc. This specific bug in XFree is
one instance of a general class of inescapable user-process bugs. It
resulted in an almost infinite-size malloc() request. You can achieve the
same effect in any userspace program by just putting malloc() inside an
infinite loop.
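For illustration, a minimal, self-contained version of that
malloc()-in-a-loop case (the 1 MB chunk size is arbitrary); under an
address-space ulimit it ends cleanly when malloc() returns NULL, while
without one it keeps growing until the machine thrashes or the OOM killer
steps in:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const size_t chunk = 1024 * 1024;
    unsigned long megabytes = 0;
    void *p;

    /* Grow without bound, one megabyte at a time, touching each block
     * so that overcommit cannot hide the growth.  Nothing is ever
     * freed; that is the point of the demonstration. */
    while ((p = malloc(chunk)) != NULL) {
        memset(p, 0xff, chunk);
        megabytes++;
    }
    printf("stopped after %lu MB\n", megabytes);
    return 0;
}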
If you allow users to run with unlimited memory permission, you are
vulnerable. The XFree bug will hit more people than usual because it is
common to put the ulimit on regular user logins and forget to place a
limit on the automatically started processes. The default configuration
from RedHat, SuSE, and others is to start XFree outside the login
system. You can also place limits on these processes but you need to
examine the startup scripts to install the limits in the right places.
This would then result in a different DoS. Whenever XFree hits the
memory limit, the mallocs will fail, and XFree will decide what to do
about it. Depending on the circumstances, XFree may shut down, thus
killing all the X window dependent processes.
R Horn
* Melchior FRANZ -- Thursday 13 June 2002 18:49:
> * [email protected] -- Thursday 13 June 2002 18:33:
> > It resulted in an almost infinite size malloc() request.
>
> No. AFAIK it is caused in the file xc/lib/font/Type1/t1font.c
^^^^^^^^
This should have been t1func.c, sorry.
m.
* [email protected] -- Thursday 13 June 2002 18:33:
> It resulted in an almost infinite size malloc() request.
No. AFAIK it is caused in the file xc/lib/font/Type1/t1font.c
by wrong RAM requirement estimation heuristics. Not too much
memory is requested but too little! And if XFree actually needs
more memory than it had originally estimated and requested, it
simply aborts.
m.
t1func.c ?
This bit here, in Type1OpenScalable()?
    /* heuristic for "maximum" size of pool we'll need: */
    size = 200000 + 120 *
        (int)hypot(vals->pixel_matrix[2],
                   vals->pixel_matrix[3])
        * sizeof(short);
    if (size < 0 || NULL == (pool = (long *)xalloc(size))) {
        xfree(cid);
        DestroyFontRec(pFont);
        return AllocError;
    }
* Billy O'Connor -- Thursday 13 June 2002 19:34:
> This bit here, in Type1OpenScalable()?
>
> /* heuristic for "maximum" size of pool we'll need: */
> size = 200000 + 120 *
Yes. I simply replaced 120 by 600, after which it processed even the
biggest of my fonts scaled to 1000 points without aborting. But there
might be bigger fonts.
Note that there are 2 further places where these (wrong) heuristics
are used, and I changed 120 -> 600 there, too (4 changes in total).
This is, however, no elegant solution. The XFree people are meanwhile
discussing replacing the whole Type1 code with the FreeType2-based one.
m.
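A self-contained sketch of the direction such a fix could take, using
Melchior's larger factor but also refusing outsized requests up front; the
names, the cap and alloc_pool() itself are hypothetical illustrations, not
the actual XFree86 code:

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define POOL_BASE   200000
#define POOL_FACTOR 600                    /* was 120 in XFree86 4.x   */
#define POOL_CAP    (16UL * 1024 * 1024)   /* arbitrary upper bound    */

/* Same shape as the Type1OpenScalable() heuristic quoted above, but
 * computed in size_t and capped, so an absurd pixel size is refused
 * instead of relying on the "size < 0" overflow check. */
static void *alloc_pool(double mx2, double mx3, size_t *out_size)
{
    size_t size = POOL_BASE +
                  POOL_FACTOR * (size_t)hypot(mx2, mx3) * sizeof(short);

    if (size > POOL_CAP)
        return NULL;        /* caller reports AllocError / font error */
    *out_size = size;
    return malloc(size);
}

int main(void)
{
    size_t size = 0;
    void *pool = alloc_pool(700.0, 700.0, &size);   /* a large pixel size */

    if (pool == NULL) {
        fprintf(stderr, "font too large, refusing\n");
        return 1;
    }
    printf("allocated a %lu byte pool\n", (unsigned long)size);
    free(pool);
    return 0;
}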
* Billy O'Connor -- Thursday 13 June 2002 19:34:
> This bit here, in Type1OpenScalable()?
This may be a different bug, though. In my case (requesting a particular
font at >=200 pt size) the X server only ever aborted. While this is
clearly an unpleasant experience and has to be fixed, it never really
=crashed= or locked the machine.
m.
On Thu, 13 Jun 2002, Federico Sevilla III wrote:
> I was able to log on to the server with enough time to SIGKILL the
> xfs-daemon process. Unfortunately this wasn't good enough. The server
> started running up various processes stuck in "D" state, the OOM killer
> went on panic mode and started killing things left and right, mostly from
> what I notice apache and apache-ssl processes with messages like "Out of
> Memory: Killed process xxxx (apache)". I was also able to do a `free`
> after killing xfs-daemon and noticed that there was a lot of free memory
> both physically and on swap.
>
> Within less than ten minutes on this relatively lightly-loaded server, I
> could not log in to the machine locally, even as root (whose home
> directory is not NFS-exported) and load levels shot up to around 25, which
> is definitely abnormal. Existing logged-on processes also got stuck in
> whatever they were doing (`ps ax`, in particular is what I remember).
It has always puzzled me that a process using lots of memory can bring
down an entire (otherwise relatively idle) server to the extent that one
cannot even get mingetty on a local terminal to respond to keypresses. I
can confirm that the described situation is not just a one-off.
It is my experience that a single process using large amounts of memory
causes the system to start swapping. Once it starts swapping, every
process that does anything (apart from indefinite wait) goes into "I'm
ready to do some processing, but my memory is swapped out" state, and the
whole system collapses.
I don't know if Linux prioritises swap-ins as well as CPU time, but
perhaps this would be a good way of stabilising this circumstance. One
could arrange the dynamic process priorities so that the bad neighbour
(read: exploited xfs-daemon) gets less than its fair share of physical
RAM, therefore letting the rest of the system live. Perhaps one could
limit the number of processes that are having their swap-in serviced at
any one time, and make the ones that are being serviced the ones with the
highest dynamic priority.
I don't know exactly what is causing the problem, but it is possible that
pages are paged in, the process runs on them for a quantum, and they are
then paged out again. In this case, it might help to dictate a minimum
time a page can stay "in", measured in the amount of work the owning
process manages to do.
My other quibble is that the out-of-memory killer seems to choose
processes fairly randomly. Ideally, it should kill the process that is
causing the problem - which in my experience is always the biggest process
in the system, or the process which has grown the fastest recently.
However, it should probably be positively discouraged from killing things
like apache (where most of the process space is shared), or gettys (which
are firstly small, and secondly likely to be respawned by someone at the
first opportunity).
> Attempted reboots locally via Ctrl-Alt-Delete and Magic SysRq failed
> because (1) I had disabled ctrl-alt-delete mapping in inittab "for
> security", and (2) I had not compiled Magic SysRq support on this
> particular server. More notes to self.
I think that if you had not disabled ctrl-alt-delete, it wouldn't have
done much good anyway, since usually that key combination just tells init
to spawn a shutdown process, which has just as much chance of getting
resources as any other process in the system.
> While I agree with BugTraq posts in response to this that applications
> like Mozilla which accept font-sizes from unknown sources should have some
> check to prevent such large sizes from crashing X and/or the X Font
> Server, I'm alarmed by (1) the way the X font server allows itself to be
> crashed like this, and (2) the way the entire Linux kernel seems to have
> been unable to handle the situation.
I am in complete agreement with both points, but particularly that the
kernel should be able to cope with the situation gracefully. I know one
can set limits on processes, but the kernel should still be able to cope
if we don't.
> Suggestions on how to work around this on multiple levels would definitely
> be appreciated.
My suggestion would be to set a maximum core size for the xfs-daemon
process. Given your setup, something like 400MB seems appropriate - maybe
a little lower. Details of how to do this seem to differ from one Linux
distribution to another.
Having done that, I would make sure xfs respawns when it dies - that way
someone can't just DOS your whole network by asking for a large font.
Finally, wait for the xfs developers to put in a font size limit, or patch
the source yourself.
Apart from that, I wouldn't move xfs to a bigger server unless you have
already had people complaining about its performance. Moving it to a
bigger system just changes the constant in the equation - the attacker
would only need to specify a 100000 point font instead of a 50000 point
font to bring the system down.
I doubt any of my kernel suggestions have not already been thought about,
but it was worth a try. Please can this problem be fixed soon?
Matthew
--
"If I wanted to kill a battleship, I'd use a s?!tload of Harpoons."
"NT is a lot cheaper." -- Paul Tomblin & Petro
Matthew Wakeling <[email protected]>:
>
> On Thu, 13 Jun 2002, Federico Sevilla III wrote:
>
> > I was able to log on to the server with enough time to SIGKILL the
> > xfs-daemon process. Unfortunately this wasn't good enough. The server
> > started running up various processes stuck in "D" state, the OOM killer
> > went on panic mode and started killing things left and right, mostly from
> > what I notice apache and apache-ssl processes with messages like "Out of
> > Memory: Killed process xxxx (apache)". I was also able to do a `free`
> > after killing xfs-daemon and noticed that there was a lot of free memory
> > both physically and on swap.
> >
> > Within less than ten minutes on this relatively lightly-loaded server, I
> > could not log in to the machine locally, even as root (whose home
> > directory is not NFS-exported) and load levels shot up to around 25, which
> > is definitely abnormal. Existing logged-on processes also got stuck in
> > whatever they were doing (`ps ax`, in particular is what I remember).
>
> It has always puzzled me that a process using lots of memory can bring
> down an entire (otherwise relatively idle) server to the extent that one
> cannot even get mingetty on a local terminal to respond to keypresses. I
> can confirm that the described situation is not just a one-off.
>
> It is my experience that a single process using large amounts of memory
> causes the system to start swapping. Once it starts swapping, every
> process that does anything (apart from indefinite wait) goes into "I'm
> ready to do some processing, but my memory is swapped out" state, and the
> whole system collapses.
Not necessarily. The condition can also be caused by having a large, well
behaved process working its little heart out properly, and a small process
that grows suddenly (or even slowly - it doesn't take much to push it over
the limit). As the small process grows, the larger one is paged out. Once
the swap space is filled just adding one more page could do it. And it doesn't
matter what process allocates that page. Key: disable oversubscription of
memory.
[snip]
> > While I agree with BugTraq posts in response to this that applications
> > like Mozilla which accept font-sizes from unknown sources should have some
> > check to prevent such large sizes from crashing X and/or the X Font
> > Server, I'm alarmed by (1) the way the X font server allows itself to be
> > crashed like this, and (2) the way the entire Linux kernel seems to have
> > been unable to handle the situation.
>
> I am in complete agreement with both points, but particularly that the
> kernel should be able to cope with the situation gracefully. I know one
> can set limits on processes, but the kernel should still be able to cope
> if we don't.
It can't decide what causes the problem. There are too many possibilities.
This is NOT a bug. I consider it a problem of a misconfigured server. As long
as memory oversubscription is permitted, there are a LOT of things that can
cause a system failure. Examples are:
    DNS cache table fills memory
    X font server fills memory (the discussed failure)
    Sendmail receives a large email (2-8MB)
    Web server receives a lot of requests (though this one is harder and
    needs a lot of static pages to be loaded into memory)
    Database receives lots of queries for lots of data
    All cron jobs kick off at once, with several requiring lots of memory
(I've even seen that last one kill Solaris)
> > Suggestions on how to work around this on multiple levels would definitely
> > be appreciated.
>
> My suggestion would be to set a maximum core size for the xfs-daemon
> process. Given your setup, something like 400MB seems appropriate - maybe
> a little lower. Details for doing this seem to differ from linux to linux.
> Having done that, I would make sure xfs respawns when it dies - that way
> someone can't just DOS your whole network by asking for a large font.
> Finally, wait for the xfs developers to put in a font size limit, or patch
> the source yourself.
Also put a maximum limit on the X server.
> Apart from that, I wouldn't move xfs to a bigger server unless you have
> already had people complaining about its performance. Moving it to a
> bigger system just changes the constant in the equation - the attacker
> would only need to specify a 100000 point font instead of a 50000 point
> font to bring the system down.
Even if the font server survives - the X server wouldn't. In both cases,
only heuristics can be applied (a sketch of both follows the list):
1. Do not have the font server generate a font more than 100 pixels high
   (around 1 inch on a display). This would have to be a configuration
   item, since there will be cases where that size is unreasonable. It
   might also be a good idea not to cache fonts above that size even if
   they are generated.
2. Do not have more than N active fonts at a time (to cover the case of
   multiple fonts at the maximum), where N comes from an external
   configuration (an XF86Config parameter, say). Or specify the maximum
   amount of memory to allocate to fonts; when that space is filled,
   return a font error (font not available) to the application. Again,
   deallocate fonts above the size limit when they are not in use.
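A rough sketch of how those two checks might look inside a font server's
request path; MAX_FONT_PIXELS, FONT_MEM_BUDGET and admit_font() are made-up
names and numbers, purely to illustrate the policy:

#include <stdio.h>

#define MAX_FONT_PIXELS 100                   /* heuristic 1: pixel cap    */
#define FONT_MEM_BUDGET (8UL * 1024 * 1024)   /* heuristic 2: total budget */

static unsigned long font_mem_used = 0;

/* Returns 0 if the font may be opened, -1 if the client should get a
 * "font not available" error instead. */
static int admit_font(int pixel_height, unsigned long est_bytes)
{
    if (pixel_height > MAX_FONT_PIXELS)
        return -1;
    if (font_mem_used + est_bytes > FONT_MEM_BUDGET)
        return -1;
    font_mem_used += est_bytes;
    return 0;
}

int main(void)
{
    printf("%d\n", admit_font(12, 64 * 1024));            /* accepted    */
    printf("%d\n", admit_font(50000, 1024));              /* too tall    */
    printf("%d\n", admit_font(90, 16UL * 1024 * 1024));   /* over budget */
    return 0;
}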
The easiest fix is to disable oversubscription of memory, though that may
cause some daemons to abort if they don't check for allocation failures
(which I do consider a bug).
> I doubt any of my kernel suggestions have not already been thought about,
> but it was worth a try. Please can this problem be fixed soon?
>
> Matthew
>
> --
> "If I wanted to kill a battleship, I'd use a s?!tload of Harpoons."
> "NT is a lot cheaper." -- Paul Tomblin & Petro
Good signature... :)
-------------------------------------------------------------------------
Jesse I Pollard, II
Email: [email protected]
Any opinions expressed are solely my own.
On Thu, 13 Jun 2002, Jesse Pollard wrote:
> > It is my experience that a single process using large amounts of memory
> > causes the system to start swapping. Once it starts swapping, every
> > process that does anything (apart from indefinite wait) goes into "I'm
> > ready to do some processing, but my memory is swapped out" state, and the
> > whole system collapses.
>
> Not necessarily. The condition can also be caused by having a large, well
> behaved process working its' little heart out properly, and a small process
> that grows suddenly (or even slowly - it doesn't take much to push it over
> the limit). As the small process grows, the larger one is paged out. Once
> the swap space is filled just adding one more page could do it. And it doesn't
> matter what process allocates that page. Key: disable oversubscription of
> memory.
Can we at least agree that the current kernel behaviour is a positive
feedback loop - something is bad, therefore it's about to get worse? Some
of the suggestions I had would move this more towards a negative feedback
loop.
> It can't decide what causes the problem. There are too many possibilities.
I think the majority of times a system will be set up with enough swap
space to handle its normal operation. Otherwise, just give it some more
swap. However, one circumstance that throwing lots of swap around doesn't
fix is when a process has an insatiable need for memory. In this case,
either the process grows very quickly, or is just plain big. I think the
out-of-memory killer should target big or growing processes. If it doesn't
hit the correct process the first time, it will free up a lot more RAM
than it would otherwise, and it would be likely to get it right the second
time.
> > My suggestion would be to set a maximum core size for the xfs-daemon
> > process...
>
> Also put a maximum limit on the X server.
Although this wasn't the problem in this case (and therefore wouldn't have
a massive effect), it's a sensible precaution.
The xfs server is the important server here, because DOSing it DOSes a
whole network of workstations.
> The easiest fix is to disable oversubscription of memory, though that may
> cause some daemons to abort if they don't check for allocation failures
> (which I do consider a bug).
That does indeed sound like a good idea. I guess one would then give the
system one big dollop of swap, to allow it to actually cater for processes
that allocate large amounts of memory without actually using it.
Matthew
--
Bashir: And who told you that?
O'Brien: You did. In the future.
Bashir: Oh. Well, who am I to argue with me?
In article <[email protected]> you wrote:
> It has always puzzled me that a process using lots of memory can bring
> down an entire (otherwise relatively idle) server to the extent that one
> cannot even get mingetty on a local terminal to respond to keypresses. I
> can confirm that the described situation is not just a one-off.
Well, with current versions you have to have 2 or 3 of them to really make
a server thrash. Something like:
tail /dev/zero
recovers with the OOM killer;
tail /dev/zero & tail /dev/zero & tail /dev/zero
will most likely thrash for minutes.
Some kind of penalty could be introduced, because this is better to set up
than ulimit. On a system with large numbers of users you simply can't set
the ulimit of a single user to mainram/concurrent-users or
mainram/concurrent-evil-users.
Greetings
Bernd
Matthew Wakeling wrote:
> In this case,
> either the process grows very quickly, or is just plain big. I think the
> out-of-memory killer should target big or growing processes. If it doesn't
> hit the correct process the first time, it will free up a lot more RAM
> than it would otherwise, and it would be likely to get it right the second
> time.
Um, so you want to kill the database server? Think carefully
about making automatic selections like that. Wouldn't it be
much better to just tell a process that makes a memory request
that won't fit that it can't have it? The process can then
decide on its own whether it can continue or must abort.
The real solution lies in getting rid of over subscription and
properly returning NULL for memory allocations when RAM+SWAP
has run out. To me this is a kernel memory subsystem issue.
When the X font server requested that huge block of memory, it should
have been told it can't have it, as there is no way it would fit within
RAM+swap minus what other processes are using. No need to kill a
process.
--
| Bryan Andersen | [email protected] | http://www.nerdvest.com |
| Buzzwords are like annoying little flies that deserve to be swatted. |
| "Linux, the OS Microsoft doesn't want you to know about.". |
| -Bryan Andersen |
Matthew Wakeling wrote:
[...]
> However, one circumstance that throwing lots of swap around doesn't
> fix is when a process has an insatiable need for memory. In this case,
> either the process grows very quickly, or is just plain big. I think the
> out-of-memory killer should target big or growing processes. If it doesn't
> hit the correct process the first time, it will free up a lot more RAM
> than it would otherwise, and it would be likely to get it right the second
> time.
>
A fork bomb would kill everything else on your machine then. The bomb
program doesn't grow and is smaller than anything else - but there are
so many of them. So all the other useful programs,
including shells, are killed. :-(
Helge Hafting
To answer a couple of the questions in this thread.
The Type 1 backend currently in XFree86 (originally by IBM and
Lexmark) uses static memory allocation (it was designed for embedded
systems) and does indeed have very poor error handling. The current
plan is to replace the Type 1 backend altogether with a unified Type
1 and TrueType backend based on the FreeType library. This will
hopefully happen in 4.3.0.
As far as I know, nobody is currently working on fixing the (legacy)
Type 1 backend. While patches to fix its behaviour are likely to get
accepted, I think that getting the Type 1 backend to work reasonably
is more work than it's worth now that FreeType does a decent job with
Type 1 fonts.
As to out of memory allocations: with a few exceptions, the core X
server code deals smartly with malloc returning NULL: the current
client receives a BadAlloc error (``insufficient resources''), and
other clients are not bothered. (On the other hand, it is a rare
client that will deal gracefully with a BadAlloc; and if it's your
window manager that gets the error, all hell breaks loose.)
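Schematically, that pattern is the same one visible in the xalloc() check
quoted earlier in the thread: the allocation for one request fails, that
client gets a BadAlloc-style error, and everyone else carries on. A toy
version (the names here are illustrative, not the server's actual
internals):

#include <stdio.h>
#include <stdlib.h>

enum reply { REPLY_SUCCESS, REPLY_BAD_ALLOC };

/* One request handler: if its allocation fails, only this request is
 * answered with an error; the server itself keeps running. */
static enum reply handle_request(size_t bytes_needed)
{
    void *buf = malloc(bytes_needed);

    if (buf == NULL)
        return REPLY_BAD_ALLOC;

    /* ... service the request using buf ... */
    free(buf);
    return REPLY_SUCCESS;
}

int main(void)
{
    printf("small request: %d\n", handle_request(4096));
    printf("huge request:  %d\n", handle_request((size_t)-1));
    return 0;
}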
With current Linux kernels, this careful coding brings no benefit
whatsoever, as malloc never (?) returns NULL. What is worse, as far as
I know the kernel doesn't send advance warning of an OOM situation; it
would not be too difficult to stop allocating memory when that happens.
I suggest xpert at xfree86.org as the right list to continue this
discussion. If you follow up on linux-kernel, please be so kind as to
CC me.
Regards,
Juliusz Chroboczek
P.S. As always, none of the above represents an official position of
the XFree86 project, but merely my personal perception of the
situation. This message may or may not have any sort of relationship
with reality.
On Thursday 13 June 2002 12:26, [email protected] wrote:
Exactly, any user without limits can arbitrarily "fork bomb" a system too. A
shell script and newbie-level programming talent is all you need... That
whole class of DoS is hard to stop. You can do 100 things, like starve a
system of file handles, open 50,000 listen ports, whatever. You can set
limits, but there are not even standard APIs for limiting every conceivable
exhaustible resource systems allocate.
> On 13 Jun, Federico Sevilla III wrote:
> > Suggestions on how to work around this on multiple levels would
> > definitely be appreciated. I'll be starting by removing the X font server
> > from our file and authentication server onto some high-powered
> > workstation, but I'm sure this won't be enough, and knowing that a user
> > process like xfs-daemon can drag the Linux kernel down to knees is not
> > very comforting. :(
>
> The protection that you need is provided by "ulimit" on most Unixes.
> There are facilities to limit maximum real memory used, maximum virtual
> memory, maximum number of processes, etc. This specific bug in XFree is
> one instance of a general class of inescapable user-process bugs. It resulted
> in an almost infinite-size malloc() request. You can achieve the same
> effect in any userspace program by just putting malloc() inside an
> infinite loop.
>
> If you allow users to run with unlimited memory permission, you are
> vulnerable. The XFree bug will hit more people than usual because it is
> common to put the ulimit on regular user logins and forget to place a
> limit on the automatically started processes. The default configuration
> from RedHat, SuSE, and others is to start XFree outside the login
> system. You can also place limits on these processes but you need to
> examine the startup scripts to install the limits in the right places.
>
> This would then result in a different DoS. Whenever XFree hits the
> memory limit, the mallocs will fail, and XFree will decide what to do
> about it. Depending on the circumstances, XFree may shut down, thus
> killing all the X window dependent processes.
>
> R Horn