2001-03-31 20:07:29

by Boris Pisarcik

Subject: Question about SysRq

Hi.

I ran into the following situation: a user with no ulimits runs a script like
this:

#! /usr/bin/perl

while (1)
{
fork();
};

on, say, tty2. The processes get created pretty fast. After a short while
I supposed the only solution was to kill the whole session with alt+sysrq+k,
but nothing happened. Under normal, average load this
immediately kills all processes on the current vt and brings back the getty
prompt. Shouldn't it work similarly in the former case? As far as I can see,
all processes on the vt get SIGKILL, so what happened? Maybe I had to wait
a bit longer for the kernel to accomplish that? Killing all processes except
init (alt+sysrq+i) seems to be immediate.
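
For reference, the limit that is missing in this scenario is the per-user process cap. A minimal sketch, assuming Linux and Python's standard `resource` module; the helper names `capped_limits` and `apply_nproc_cap` and the cap of 256 are made up for illustration and are not from the original post:

```python
import resource


def capped_limits(current, cap):
    """Return (soft, hard) with the soft limit lowered to at most `cap`.

    The hard limit is left untouched and the soft limit is never raised,
    so an unprivileged process can always apply the result.
    """
    soft, hard = current
    new_soft = cap
    if soft != resource.RLIM_INFINITY:
        new_soft = min(new_soft, soft)
    if hard != resource.RLIM_INFINITY:
        new_soft = min(new_soft, hard)
    return (new_soft, hard)


def apply_nproc_cap(cap=256):
    """Cap the number of processes this user may own (Linux RLIMIT_NPROC).

    Hypothetical helper: children started afterwards inherit the cap, and
    once it is reached fork() fails instead of creating more processes.
    """
    current = resource.getrlimit(resource.RLIMIT_NPROC)
    resource.setrlimit(resource.RLIMIT_NPROC, capped_limits(current, cap))
```

Calling `apply_nproc_cap()` early in a login session would have roughly the effect of a process ulimit set in the shell. It does nothing about a fork loop that is already running.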


Still, I really love all the sysrq features of linux; I need far less often
to do hardware resets and then wait in fear of what fsck will print.
One more feature I'd like to have is a request key that forces the
most basic text mode (say 80x25) on the console. When e.g. X freezes and
I kill its session, the last gfx mode stays on the screen and I see no way
to restore the text mode - /usr/bin/reset or something similar will not
do it. But maybe it is not a good idea at all, is it?

Cheers B.


--


2001-04-02 17:08:39

by mirabilos

Subject: Basic Text Mode (was: Re: Question about SysRq)

> Still, I really love all the sysrq features of linux; I need far less often
> to do hardware resets and then wait in fear of what fsck will print.

101% ACK

> One more feature I'd like to have is a request key that forces the
> most basic text mode (say 80x25) on the console. When e.g. X freezes and
> I kill its session, the last gfx mode stays on the screen and I see no way
> to restore the text mode - /usr/bin/reset or something similar will not
> do it. But maybe it is not a good idea at all, is it?

It is a very good idea, and quite easy to implement. You just have to
distinguish between three types of video cards (MDA, MGA and HGC vs. CGA and
AGA vs. EGA+). Then you do direct register writes. For the HGC I did it
recently in a DOS program which switched from text to gfx and back. I had a
TSR which simulated a gfx BIOS. The only problem is, I lost the source. But I
could rewrite and test it on request. I would even put it under the GPL for
the kernel (normally this is a no-no for me), just ask me. I would write it in
NASM then, because I can't do AT&T syntax.
For non-i386 platforms I do not know about this topic. (IIRC the Apples didn't
even have a text mode.)
Maybe I could look up the EGA register values somewhere.

-mirabilos


2001-04-03 00:01:11

by Dr. Kelsey Hudson

Subject: Re: Question about SysRq

On Sat, 31 Mar 2001, Boris Pisarcik wrote:
> on, say, tty2. The processes get created pretty fast. After a short while
> I supposed the only solution was to kill the whole session with alt+sysrq+k,
> but nothing happened. Under normal, average load this
> immediately kills all processes on the current vt and brings back the getty
> prompt. Shouldn't it work similarly in the former case? As far as I can see,
> all processes on the vt get SIGKILL, so what happened? Maybe I had to wait
> a bit longer for the kernel to accomplish that? Killing all processes except
> init (alt+sysrq+i) seems to be immediate.
>
>
> Still, I really love all the sysrq features of linux; I need far less often
> to do hardware resets and then wait in fear of what fsck will print.
> One more feature I'd like to have is a request key that forces the
> most basic text mode (say 80x25) on the console. When e.g. X freezes and
> I kill its session, the last gfx mode stays on the screen and I see no way
> to restore the text mode - /usr/bin/reset or something similar will not
> do it. But maybe it is not a good idea at all, is it?

I've noticed a similar situation:

I recently upgraded to XFree86 4.0.3 and have been having nothing but
problems with it. Oftentimes the machine will crash, with crap shot to
the screen. However, the IP stack still functions and routes packets, but none
of the user processes respond. I'll try to hit sysrq-s to sync the disks,
sysrq-u to unmount them, and sysrq-b to reboot the machine. Unfortunately,
the only one that responds is sysrq-b, which reboots the box without
syncing or unmounting the disks. Not only does that piss me off but it's
led to some fs corruption as well (which pisses me off even more). sysrq-b
is the *only* combination I can get working when this happens. However,
when the machine is *not* locked, sysrq-[su] work fine. Kernel is
2.4.3-pre6 on SMP i686.

Kelsey Hudson [email protected]
Software Engineer
Compendium Technologies, Inc (619) 725-0771
---------------------------------------------------------------------------

2001-04-03 03:59:29

by James Simmons

Subject: Re: Basic Text Mode (was: Re: Question about SysRq)


> One more feature I'd like to have is a request key that forces
> the most basic text mode (say 80x25) on the console. When e.g. X freezes
> and I kill its session, the last gfx mode stays on the screen and I
> see no way to restore the text mode - /usr/bin/reset or something
> similar will not do it. But maybe it is not a good idea at all, is it?

I'm working on this. In theory it should be possible with SAK. Alt-SysRq-k
resets the console: it sets the vc_mode back to KD_TEXT and the
keyboard back to VC_XLATE. It even resets the palette. What it doesn't do
is reset the hardware state. I hope to change this for 2.5.X. It is
possible to do this very easily using fbdev. I'm working on it so that you
can go back and forth between fbdev and vgacon, for those who want to use
vgacon and whose hardware supports it. This can be applied to this problem
very easily.
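
The software half of that reset (text mode plus translated keyboard) can also be requested from user space. The following is only a sketch under assumptions, not the SAK code path: it uses the `KDSETMODE` and `KDSKBMODE` ioctls with values from `<linux/kd.h>` (the user-space counterpart of `VC_XLATE` is `K_XLATE`), and the function name `reset_console_state` is made up for illustration.

```python
import fcntl

# ioctl request numbers and arguments as defined in <linux/kd.h>.
KDSETMODE = 0x4B3A  # set console mode (text or graphics)
KD_TEXT = 0x00      # text mode
KDSKBMODE = 0x4B45  # set keyboard mode
K_XLATE = 0x01      # translated (normal) keyboard mode


def reset_console_state(fd, ioctl=fcntl.ioctl):
    """Put the console open on `fd` back into text mode with K_XLATE.

    Only the kernel's software state changes; as described above, the
    video hardware registers are not reprogrammed by this.
    The `ioctl` parameter exists so the call sequence can be tested
    without a real console.
    """
    ioctl(fd, KDSETMODE, KD_TEXT)
    ioctl(fd, KDSKBMODE, K_XLATE)
```

To try it, open a virtual console device such as `/dev/tty0` with `os.open(..., os.O_RDWR)` as a privileged user and pass the descriptor in. If the card was left in a graphics mode, the screen contents will most likely still be wrong afterwards, which is exactly the gap discussed in this thread.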

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons [[email protected]] ____/|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org =(_)=
http://linuxgfx.sourceforge.net U
http://linuxconsole.sourceforge.net

2001-04-03 22:36:39

by Boris Pisarcik

Subject: Re: Question about SysRq

> Unfortunately,
> the only one that responds is sysrq-b, which reboots the box without
> syncing or unmounting the disks. Not only does that piss me off but it's
> led to some fs corruption as well (which pisses me off even more). sysrq-b
> is the *only* combination I can get working when this happens.

I looked a bit at the source of the sysrq handling. I found there is
a rather big difference between how sysrq+b and the other killers are handled.
Sysrq+b is handled in a pretty straightforward fashion - it stops the other
processors on SMP and reboots directly (hardware pulse or triple fault)
or through the BIOS - so it is just asking for corruption.

On the other hand, sysrq+s only sets a single variable, which will be tested
on the next cycle of the bdflush kernel thread's main loop, and adds bdflush
to the scheduler's runqueue. So it gets a chance to check for an emergency
sync only when it is next scheduled. Is that right?
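
The difference between the two paths can be shown with a toy model. This is not kernel code: the class `SysrqModel`, its method names and the "reboot" log line are invented for illustration, and only the "Sysrq: Emergency Sync" message is taken from this thread.

```python
class SysrqModel:
    """Toy model of an immediate vs. a deferred sysrq action."""

    def __init__(self):
        self.log = []
        self.emergency_sync_requested = False
        self.synced = False
        self.rebooted = False

    def sysrq_b(self):
        # Immediate path: all the work happens inside the key handler.
        self.log.append("Sysrq: reboot (illustrative message)")
        self.rebooted = True

    def sysrq_s(self):
        # Deferred path: the handler only logs and sets a flag.
        self.log.append("Sysrq: Emergency Sync")
        self.emergency_sync_requested = True

    def bdflush_iteration(self):
        # The real work waits until the flusher thread gets to run again.
        if self.emergency_sync_requested:
            self.emergency_sync_requested = False
            self.synced = True
```

In this model, calling `sysrq_s()` without ever calling `bdflush_iteration()` leaves `synced` at `False` even though the log line is present. That is the symptom one would expect if the scheduler never gets around to running the flusher thread.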

Can you find anything in your logs/console/serial console messages
like 13.13.2000 kernel : Sysrq: Emergency Sync (this should be present - it is
written within the keyboard handler, not after a schedule), and what gets
logged next? Then we could determine whether the bdflush thread really got
scheduled and called the emergency sync routine.

As you wrote, none of your processes respond - so telnet will probably
not help. You may try to write an experimental program that only logs,
say, the current time every n seconds, and see whether it stopped
logging after the time of the lockup. If it stopped, it doesn't get scheduled.
If it continues, there's a problem with syncing. Again - as far
as I understand - try to log kernel messages to a serial console or the like,
because the messages may not get written to the logfiles - syslogd can't be
woken up, for example.
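
Such a heartbeat program could look roughly like the sketch below. The function names `heartbeat` and `find_stalls`, the file path and the one-second slack are assumptions for illustration. Note that on a machine where syncing itself is stuck, the last lines may never reach the disk, so a serial console remains the more reliable channel.

```python
import os
import time


def heartbeat(path, interval, count=None, sleep=time.sleep, now=time.time):
    """Append one timestamp per line to `path` every `interval` seconds.

    Runs forever when `count` is None. Every line is flushed and fsync'ed
    to give it the best chance of surviving a hard reset.
    """
    written = 0
    with open(path, "a") as f:
        while count is None or written < count:
            f.write(f"{now():.3f}\n")
            f.flush()
            os.fsync(f.fileno())
            written += 1
            if count is None or written < count:
                sleep(interval)


def find_stalls(timestamps, interval, slack=1.0):
    """Return (previous, current) pairs whose gap exceeds interval + slack."""
    return [
        (a, b)
        for a, b in zip(timestamps, timestamps[1:])
        if b - a > interval + slack
    ]
```

Start it with something like `heartbeat("/var/tmp/heartbeat.log", 5)` before reproducing the lockup. After the reboot, compare the last timestamp in the file with the time of the lockup, or feed the parsed timestamps to `find_stalls` to spot periods in which the process was not scheduled.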

A quick help against those corruptions, which comes to my mind, is to use
reiserfs. I have no real experience with it or its reliability;
also, as I followed some of the messages on this list about reiserfs, it has
some problems too - but by definition it shouldn't get corrupted by a reboot
without syncing. But I don't see this as very helpful, because if you really
depended on high reliability, you wouldn't install a 2.3.x-pre kernel :)

There are also occasional discussions about watchdogs - they may be
helpful - but neither of the two really solves the problem.

LW: today I got an ugly lockup with dosemu and an experimental run of
virtual pool ;). Not even Sysrq+b worked. But that's probably another
story. Root or privileged suid processes (the X server among them) really need
just a 1-bit error to corrupt nearly anything they like.

The fewest fsck sessions and a nice day, B.

2001-04-03 23:00:33

by Andreas Dilger

Subject: Re: Question about SysRq

You write:
> Can you find anything in your logs/console/serial console messages
> like 13.13.2000 kernel : Sysrq: Emergency Sync (this should be present - it is
> written within the keyboard handler, not after a schedule), and what gets
> logged next? Then we could determine whether the bdflush thread really got
> scheduled and called the emergency sync routine.

It sounds like the kernel is stuck somewhere in a tight loop, so nothing
is being rescheduled. If you have an SMP system (or an APIC) you may be
able to see where it is stuck with the NMI watchdog timer.

> A quick help against those corruptions, which comes to my mind, is to use
> reiserfs. I have no real experience with it or its reliability;
> also, as I followed some of the messages on this list about reiserfs, it has
> some problems too - but by definition it shouldn't get corrupted by a reboot
> without syncing.

Actually, this is not true. Reiserfs will only prevent corruption of the
filesystem metadata. It does not guarantee that the file contents are
valid if they are being changed when the system crashes.

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-04-04 07:00:46

by Boris Pisarcik

Subject: Re: Basic Text Mode (was: Re: Question about SysRq)


Some stupid questions about video memory:

1) How do 2 or more X servers, or svgalib apps, share the same physical video
memory? Does it get saved to RAM when switching between them?

2) Does console switching (gfx or text) save and restore all registers of
the video card in the kernel? Or does the kernel only restore the text things,
and gfx apps must do it on their own?



2001-04-04 07:00:46

by Boris Pisarcik

Subject: Re: Question about SysRq

>You could even set sysvinit
> to run it when you press a certain key combo.

You mean inittab & kbrequest? I didn't know about this. I must have a look
at some documentation; the manpage didn't help me a lot.

Besides, you have a very interesting name/domain, chief of bandits !!

B.

2001-04-04 07:00:46

by Boris Pisarcik

Subject: Re: Basic Text Mode (was: Re: Question about SysRq)

> It is a very good idea, and quite easy to implement. You just have to
> distinguish between three types of video cards (MDA, MGA and HGC vs. CGA and
> AGA vs. EGA+). Then you do direct register writes. For the HGC I did it
> recently in a DOS program which switched from text to gfx and back. I had a
> TSR which simulated a gfx BIOS. The only problem is, I lost the source. But I
> could rewrite and test it on request. I would even put it under the GPL for
> the kernel (normally this is a no-no for me), just ask me. I would write it in
> NASM then, because I can't do AT&T syntax.
> For non-i386 platforms I do not know about this topic. (IIRC the Apples didn't
> even have a text mode.)
> Maybe I could look up the EGA register values somewhere.

The hardware part of mode restoring could really be done the register way and
may be interesting - I coded a lot in DOS and this is what I would have done
too, if I had had experience with register programming and VGA (..). Mostly I
skipped this arena, because I had just enough information about VGA registers
to damage a monitor, so I rather used VESA for video stuff :)
But I think doing it only this way would break some internal state of the
console or whatever driver.

Thanks for the willingness, though. I read in the thread that James Simmons,
console developer, is working on it, so you may e.g. ask to cooperate with him;
he surely has a lot of ideas about text/vga switching in linux.

Cheers B.

2001-04-12 18:51:42

by Dr. Kelsey Hudson

Subject: Re: Question about SysRq

On Wed, 4 Apr 2001, Boris Pisarcik wrote:
> I looked a bit at the source of the sysrq handling. I found there is
> a rather big difference between how sysrq+b and the other killers are handled.
> Sysrq+b is handled in a pretty straightforward fashion - it stops the other
> processors on SMP and reboots directly (hardware pulse or triple fault)
> or through the BIOS - so it is just asking for corruption.

ah, that would explain it...

> On the other hand, sysrq+s only sets a single variable, which will be tested
> on the next cycle of the bdflush kernel thread's main loop, and adds bdflush
> to the scheduler's runqueue. So it gets a chance to check for an emergency
> sync only when it is next scheduled. Is that right?
>
> Can you find anything in your logs/console/serial console messages
> like 13.13.2000 kernel : Sysrq: Emergency Sync (this should be present - it is
> written within the keyboard handler, not after a schedule), and what gets
> logged next? Then we could determine whether the bdflush thread really got
> scheduled and called the emergency sync routine.

Nope, there was nothing in the logs.

> As you wrote, none of your processes respond - so telnet will probably
> not help. You may try to write an experimental program that only logs,
> say, the current time every n seconds, and see whether it stopped
> logging after the time of the lockup. If it stopped, it doesn't get scheduled.
> If it continues, there's a problem with syncing. Again - as far
> as I understand - try to log kernel messages to a serial console or the like,
> because the messages may not get written to the logfiles - syslogd can't be
> woken up, for example.

Telnet's disabled anyways :) Cleartext passwords SUCK. :)
I've got a nifty LCD thingy I can hook up to the serial port and use as a
console if need be.

> A quick help against those corruptions, which comes to my mind, is to use
> reiserfs. I have no real experience with it or its reliability;
> also, as I followed some of the messages on this list about reiserfs, it has
> some problems too - but by definition it shouldn't get corrupted by a reboot
> without syncing. But I don't see this as very helpful, because if you really
> depended on high reliability, you wouldn't install a 2.3.x-pre kernel :)

I'm not about to convert my filesystems over... It's too much of a hassle for
little gain. ext2 is faster anyways, IIRC.

The problem disappeared when I installed the 2.4.3 release; I think it was a
DRM issue in the kernel that was causing the lockups.

Thanks for the help though

Kelsey Hudson [email protected]
Software Engineer
Compendium Technologies, Inc (619) 725-0771
---------------------------------------------------------------------------