by Linus Torvalds

Subject: Re: context switch vs. signal delivery [was: Re: Accelerating usermode linux]

On Mon, 5 Aug 2002, Oliver Neukum wrote:
>
> > Also, people who play games with FP actually change the FP data on the
> > stack frame, and depend on signal return to reload it. Admittedly I've
> > only ever seen this on SIGFPE, but anyway - this is all done with integer
> > instructions that just touch bitpatterns on the stack.. The kernel can't
> > catch it sanely.
>
> Could the fp state be put on its own page and the dirty bit
> evaluated in the decision whether to restore fpu state ?

I'm sure anything is _possible_, but there are a few problems with that
approach. In particular, playing VM games tends to be quite expensive on
SMP, since you need to make sure that the TLB entry for that page is
invalidated on all the other CPUs before you insert the FPU page.

Also, you'd need to play games with dirty bit handling, since the page
_is_ dirty (it contains FP data), so the VM must know to write it out if
it pages things. That's ok - we have separate per-page and per-TLB-entry
dirty bits anyway, but right now the VM layer knows it can move the TLB
entry dirty bit into the per-page dirty bit and drop it - which wouldn't
be the case if we also had an FPU dirty bit.

That's fixable - we could just make a "software TLB dirty bit" that is
updated whenever the hardware TLB dirty bit is cleared and moved into the
per-page dirty bit.
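
As a rough illustration of the bookkeeping described above, here is a toy
model in C. It is only a sketch: the struct and function names (toy_pte,
toy_page, toy_fold_dirty, toy_fpu_area_test_and_clear) are made up for this
example and do not correspond to real kernel structures or helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these are NOT the kernel's real data structures. */
struct toy_pte {
    bool hw_dirty;  /* set by the MMU when the page is written via this entry */
    bool sw_dirty;  /* hypothetical "software TLB dirty bit" */
};

struct toy_page {
    bool dirty;     /* per-page dirty bit: the VM must write the page out */
};

/* Move the hardware dirty bit into the per-page bit, as the VM does today,
 * but also record it in sw_dirty so the information is not lost. */
void toy_fold_dirty(struct toy_pte *pte, struct toy_page *page)
{
    assert(pte && page);
    if (pte->hw_dirty) {
        page->dirty = true;
        pte->sw_dirty = true;
        pte->hw_dirty = false;
    }
}

/* Ask "was the FP save area written since the last check?" and reset the
 * software bit. The per-page dirty bit is left alone for the pager. */
bool toy_fpu_area_test_and_clear(struct toy_pte *pte, struct toy_page *page)
{
    bool written;

    toy_fold_dirty(pte, page);
    written = pte->sw_dirty;
    pte->sw_dirty = false;
    return written;
}
```

The point of the second bit is that the pager and the FPU-restore decision
can each consume "dirty" independently. The model ignores the TLB
invalidation and page table walking costs, which are the expensive part.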

But the end result sounds rather complicated, especially since all the
page table walking necessary for setting this all up is likely to be about
as expensive as the thing we're trying to avoid..

Rule of thumb: it almost never pays to be "clever".

Linus

2002-08-05 22:03:04

[email protected] said:
> I'm glad we agree on that one :)

Yup, sorry. That test is wrong, and is slated to be fixed at some point.

> When the task is registered as socket owner and is just about to enter
> the kernel due to a syscall, it will stop with a SIGTRAP and the
> tracing kernel process will run sometime and see a SIGCHLD. But after
> the task stopped and before the kernel process can change SIGIO
> ownership back, a new interrupt could come in and the SIGIO would
> remain pending in the task's process until the task was scheduled to
> run next time.
>
> How do you solve this?

A couple of ways. The system call path can call sigio_handler to clear
out any pending IO. The SIGIO that was trapped in the process will cause
another call to sigio_handler which won't turn up any IO, but I don't
consider that to be a problem.

The kernel process can examine the signal pending mask of the process after
it has transferred SIGIO to itself. This can be done either through
/proc/<pid>/status or a ptrace extension, since we're happily postulating
new things for it to do anyway. If there is a SIGIO pending, it calls
sigio_handler.
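
For the /proc route, a minimal sketch might look like the following. It
assumes a Linux /proc whose status file carries hexadecimal "SigPnd:" (and,
on newer kernels, "ShdPnd:") lines. The function name signal_pending_in is
invented for this example, and passing pid 0 to mean "this process" is just
a convenience of the sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if signo is pending for pid, 0 if not, -1 on error.
 * pid == 0 means the calling process (a convention of this sketch). */
int signal_pending_in(pid_t pid, int signo)
{
    char path[64], line[256];
    unsigned long long mask, pending = 0;
    int found = 0;
    FILE *f;

    assert(signo >= 1 && signo <= 64);
    if (pid == 0)
        pid = getpid();

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;

    /* Collect both the per-thread and the shared pending masks. */
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "SigPnd: %llx", &mask) == 1 ||
            sscanf(line, "ShdPnd: %llx", &mask) == 1) {
            pending |= mask;
            found = 1;
        }
    }
    fclose(f);

    if (!found)
        return -1;
    /* Signal N is bit N-1 of the mask. */
    return ((pending >> (signo - 1)) & 1) ? 1 : 0;
}
```

The kernel process would call something like signal_pending_in(task_pid,
SIGIO) after taking SIGIO ownership back, and run sigio_handler if it
returns 1.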

Any other possibilities that you see?

Jeff

2002-08-06 17:58:52

[email protected] said:
> The task is uncooperative and doesn't dequeue signals itself. When it
> gets a signal it stops. The kernel then sees the signal and accepts it
> using sigwaitinfo, at which point it is no longer pending in the task
> either. The siginfo structure then provides the necessary info, i.e.
> which fd caused the i/o.

I think this is more or less what I had in mind. The thing that is missing
is for sigwaitinfo to be able to dequeue another process' signals, which is
where the shared signal queue would come in.
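
For reference, the within-one-process version of this already works on
Linux. The sketch below accepts a blocked signal synchronously and reads the
fd out of siginfo. It assumes Linux's F_SETSIG (which is what makes the
kernel fill in si_fd), and the helper names arm_async_fd and wait_for_io_fd
are made up for the example. It uses sigtimedwait rather than sigwaitinfo
only so that it cannot hang.

```c
#define _GNU_SOURCE          /* for F_SETSIG */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Ask the kernel to queue `signo` to this process when fd becomes ready.
 * Returns -1 on error. */
int arm_async_fd(int fd, int signo)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETOWN, getpid()) < 0)
        return -1;
    if (fcntl(fd, F_SETSIG, signo) < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK);
}

/* Dequeue one pending `signo` (the caller must have blocked it) and return
 * the fd recorded in its siginfo, or -1 on timeout or error. */
int wait_for_io_fd(int signo, int timeout_sec)
{
    sigset_t set;
    siginfo_t info;
    struct timespec ts;

    assert(timeout_sec >= 0);
    ts.tv_sec = timeout_sec;
    ts.tv_nsec = 0;

    sigemptyset(&set);
    sigaddset(&set, signo);
    if (sigtimedwait(&set, &info, &ts) != signo)
        return -1;
    return info.si_fd;
}
```

What this cannot do is dequeue a signal that is pending in a different
process, which is exactly the missing piece described above.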

> If you have a magic aio descriptor, how does the task process read
> signals from it and stop?

I was looking at this as a way of dequeueing signals from the other process.
The task process would have the signal queued and wake up the kernel process
as happens now. The kernel process would have /proc/<task-pid>/sigqueue
or something opened and would read siginfos from it. Those would then be
dequeued from the task process.

This almost suffices for getting page fault information, except that, for
some reason, siginfo doesn't say whether the faulting access was a read or
a write.
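
A small experiment shows the gap. This is a sketch that assumes Linux, where
touching a PROT_NONE page raises SIGSEGV; probe_access and the fault_*
variables are names invented for the example. A read fault and a write fault
on the same page report the same si_addr and si_code, so the portable
siginfo fields alone cannot tell them apart.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

static sigjmp_buf fault_jmp;
static void *volatile fault_addr;          /* si_addr of the last fault */
static volatile sig_atomic_t fault_code;   /* si_code of the last fault */

static void on_segv(int sig, siginfo_t *info, void *uctx)
{
    (void)sig;
    (void)uctx;
    fault_addr = info->si_addr;
    fault_code = info->si_code;
    siglongjmp(fault_jmp, 1);   /* skip the faulting instruction */
}

/* Touch addr (write if do_write is nonzero, otherwise read).
 * Returns 1 if the access faulted, 0 if it succeeded. */
int probe_access(volatile char *addr, int do_write)
{
    struct sigaction sa, old;
    int faulted;

    assert(addr != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old);

    if (sigsetjmp(fault_jmp, 1) == 0) {
        if (do_write)
            *addr = 1;
        else
            (void)*addr;
        faulted = 0;
    } else {
        faulted = 1;
    }

    sigaction(SIGSEGV, &old, NULL);   /* restore the previous handler */
    return faulted;
}
```

Calling probe_access(page, 0) and then probe_access(page, 1) on a PROT_NONE
mapping leaves fault_addr and fault_code identical in both cases.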

And now that I'm thinking about it, aio doesn't really come into it. This
would be strictly synchronous.

Jeff