LinuxLists.cc - intercepting syscalls

2005-04-15 18:04:42

by Igor Shmukler


Subject: intercepting syscalls

Hello,
We are working on a LKM for the 2.6 kernel.
We HAVE to intercept system calls. I understand this could be
something developers are not encouraged to do these days, but we need
this.
Patching the kernel to export sys_call_table is not an option. The quick
and dirty way to do this would be by using System.map, but I would
rather we find a cleaner approach.
I did some research on Google and I know this issue has been raised
before, but unfortunately I could not find a coherent answer.
Does anyone know of any tutorial or open source code where I could
look at how this is done? I think that the IDT should give me the entry
point, but where do I get the system call table address?
Thank you in advance,
Igor

2005-04-15 18:13:47

by Chris Wright


Subject: Re: intercepting syscalls

* Igor Shmukler ([email protected]) wrote:
> We are working on a LKM for the 2.6 kernel.
> We HAVE to intercept system calls. I understand this could be
> something developers are no encouraged to do these days, but we need
> this.

I don't think you'll find much empathy or support here. This is seriously
discouraged. It's usually the beginning of many ugly and suspect things
being done in a module.

thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net

2005-04-15 18:14:52

by Arjan van de Ven


Subject: Re: intercepting syscalls

On Fri, 2005-04-15 at 14:04 -0400, Igor Shmukler wrote:
> Hello,
> We are working on a LKM for the 2.6 kernel.
> We HAVE to intercept system calls. I understand this could be
> something developers are no encouraged to do these days, but we need
> this.

your module is GPL licensed right ? (You're depending on deep internals
after all)

Why do you *have* to intercept system calls... can't you instead use the
audit infrastructure to get that information ?

What is the URL of your current code so that we can provide reasonable
recommendations ?

2005-04-15 18:19:26


Subject: Re: intercepting syscalls

Igor Shmukler wrote:
> Hello,
> We are working on a LKM for the 2.6 kernel.
> We HAVE to intercept system calls. I understand this could be
> something developers are no encouraged to do these days, but we need
> this.

Too bad.

> Patching kernel to export sys_call_table is not an option. The fast
> and dirty way to do this would be by using System.map, but I would
> rather we find a cleaner approach.

There is none. And even System.map can be unreliable. Some distros/kernels only include
exported symbols in System.map, and sys_call_table is not exported in 2.6.

> I did some research on google and I know this issue has been raised
> before, but unfortunately I could not find a coherent answer.
> Does anyone know of any tutorial or open source code where I could
> look at how this is done? I think that IDT should give me the entry
> point, but where do I get system call table address?

You don't.

You're just going to have to accept the fact that what you want to do, the way you want
to do it, is just not going to happen. Sorry.

Your best bet is to design and implement a clean and safe mechanism for intercepting
system calls, and submit that to the kernel. It will probably get rejected, but it still
might be worth a shot.

--
Timur Tabi
Staff Software Engineer
[email protected]

2005-04-15 19:27:38


Subject: Re: intercepting syscalls

On Fri, 2005-04-15 at 14:04 -0400, Igor Shmukler wrote:
> Hello,
> We are working on a LKM for the 2.6 kernel.
> We HAVE to intercept system calls. I understand this could be
> something developers are no encouraged to do these days, but we need
> this.
> Patching kernel to export sys_call_table is not an option. The fast
> and dirty way to do this would be by using System.map, but I would
> rather we find a cleaner approach.

These ideas are hardly a clean approach but might work although I
haven't tried:

Hook into an existing kernel function that is exported to modules that
can be called by a system call, like a /proc or /sys file. On a
sys_read or sys_write to your /proc file, perform a stack trace back to
the system call, then search and adjust to find the system call table
pointer.

You might also be able to look at the int $80 vector and grub through
the machine code to find it.

Of course, anything like this will probably only work on x86 and need to
be rewritten for each architecture. Very messy.
--
Zan Lynx <[email protected]>

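To make the "int $80 vector" route above concrete, here is a minimal user-space sketch, assuming the 32-bit x86 interrupt gate layout and a little-endian host. The names idt_gate32, idt_gate_offset and idt_gate_handler are invented for this illustration, and the code decodes a plain byte array rather than reading a live IDT.

```c
#include <stdint.h>
#include <string.h>

/* Layout of one 8-byte interrupt gate in the 32-bit x86 IDT. */
struct idt_gate32 {
    uint16_t offset_low;   /* handler address, bits 0..15  */
    uint16_t selector;     /* code segment selector        */
    uint8_t  reserved;     /* unused                       */
    uint8_t  flags;        /* gate type, DPL, present bit  */
    uint16_t offset_high;  /* handler address, bits 16..31 */
} __attribute__((packed));

/* Byte offset of a vector's gate, relative to the IDT base. */
static uint32_t idt_gate_offset(unsigned vector)
{
    return vector * (uint32_t)sizeof(struct idt_gate32);
}

/* Reassemble the 32-bit handler address from a raw 8-byte gate. */
static uint32_t idt_gate_handler(const uint8_t raw[8])
{
    struct idt_gate32 gate;

    memcpy(&gate, raw, sizeof gate);
    return (uint32_t)gate.offset_low | ((uint32_t)gate.offset_high << 16);
}
```

With vector 0x80 the gate sits 0x400 bytes past the IDT base, and the two offset halves together give the address of the system call entry stub. As noted above, none of this carries over to other architectures.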

2005-04-15 19:42:05

by Igor Shmukler


Subject: Re: intercepting syscalls

Hello,

Thanks to everyone for replying.
It is surprising to me that linux-kernel people decided to disallow
interception of system calls.
I don't really see any upside to this.
I guess if there is no clean way to do this, we will have to resort to
quick and dirty.

Can anyone point to a discussion that yielded this decision? Perhaps
I need to educate myself. I stumbled upon comments that this can lead
to a mess, but pretty much anything in an LKM can cause problems. I don't
think that hiding commonly used, convenient interfaces just because
they can be abused is a valid reason, hence I would love to know what
the real reason is.

Thank you,

Igor

On 4/15/05, Arjan van de Ven <[email protected]> wrote:
> On Fri, 2005-04-15 at 14:04 -0400, Igor Shmukler wrote:
> > Hello,
> > We are working on a LKM for the 2.6 kernel.
> > We HAVE to intercept system calls. I understand this could be
> > something developers are no encouraged to do these days, but we need
> > this.
>
> your module is GPL licensed right ? (You're depending on deep internals
> after all)
>
> Why do you *have* to intercept system calls... can't you instead use the
> audit infrastructure to get that information ?
>
> What is the URL of your current code so that we can provide reasonable
> recommendations ?

2005-04-15 19:51:49

by Daniel Bonekeeper


Subject: Re: intercepting syscalls

BTW, you're an adult, and may know what you are trying to do. But listen
to the LKML guys: it's not a good idea.

/* idt (used in sys_call_table detection) */
/* from SuckIT */
struct idtr {
    ushort limit;
    ulong base;
} __attribute__ ((packed));

struct idt {
    ushort off1;            /* handler offset, bits 0..15 */
    ushort sel;
    u_char none, flags;
    ushort off2;            /* handler offset, bits 16..31 */
} __attribute__ ((packed));

/* from SuckIT */
/* find the first occurrence of s2 (l2 bytes) in s1 (l1 bytes) */
void *memmem(char *s1, int l1, char *s2, int l2)
{
    if (!l2)
        return s1;
    while (l1 >= l2)
    {
        l1--;
        if (!memcmp(s1, s2, l2))
            return s1;
        s1++;
    }
    return(NULL);
}

/* from SuckIT */
/* ep is the int 0x80 entry point; returns the table address or 0 */
ulong get_sct(ulong ep, ulong *pos)
{
#define SCLEN 512
    char code[SCLEN];
    char *p;
    ulong r;

    memcpy(code, (void *)ep, SCLEN);
    /* ff 14 85 <addr> is "call *addr(,%eax,4)" */
    p = (char *) memmem(code, SCLEN, "\xff\x14\x85", 3);
    if (!p)
        return 0;
    pos[0] = ep + ((p + 3) - code);
    r = *(ulong *) (p + 3);
    /* a second call through the table is expected further on */
    p = (char *) memmem(p + 3, SCLEN - (p - code) - 3, "\xff\x14\x85", 3);
    if (!p)
        return 0;
    pos[1] = ep + ((p + 3) - code);
    return r;
}

/* from SuckIT */
static u_long locate_sys_call_table(void)
{
    struct idtr idtr;
    struct idt idt80;
    ulong sctp[2];
    ulong old80, sct, offp;

    asm ("sidt %0" : "=m" (idtr));
    /* gate for vector 0x80, then its handler address */
    offp = idtr.base + (0x80 * sizeof(idt80));
    memcpy(&idt80, (void *)offp, sizeof(idt80));
    old80 = idt80.off1 | (idt80.off2 << 16);
    sct = get_sct(old80, sctp);
    return(sct);
}

to use...

u_long sct_addr;

sct_addr = locate_sys_call_table();
if ( !sct_addr )
{
    OSARO_DOLOG("cannot find sys_call_table. aborting.");
    return(EACCES);
}
sys_call_table = (void *)sct_addr;

--
# (perl -e "while (1) { print "\x90"; }") | dd of=/dev/evil
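The byte search at the heart of get_sct() can be tried outside the kernel. The following is a hedged sketch only: find_table_address and CALL_INDIRECT are hypothetical names, the input is an ordinary byte array standing in for the entry stub, and a little-endian host is assumed. The three bytes ff 14 85 encode "call *disp32(,%eax,4)", so the four bytes after them are the table address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Opcode, ModRM and SIB bytes of "call *disp32(,%eax,4)" on 32-bit x86. */
static const uint8_t CALL_INDIRECT[3] = { 0xff, 0x14, 0x85 };

/* Return the 32-bit displacement following the first match in
 * code[0..len), or 0 when no complete match is present. */
static uint32_t find_table_address(const uint8_t *code, size_t len)
{
    size_t i;

    /* Only accept matches that leave room for the 4 address bytes. */
    for (i = 0; i + sizeof CALL_INDIRECT + 4 <= len; i++) {
        if (memcmp(code + i, CALL_INDIRECT, sizeof CALL_INDIRECT) == 0) {
            uint32_t addr;

            memcpy(&addr, code + i + sizeof CALL_INDIRECT, sizeof addr);
            return addr;
        }
    }
    return 0;
}
```

Unlike the SuckIT routine, this version refuses a match that would run past the end of the buffer. It still depends entirely on how one particular kernel build happened to encode its dispatch instruction, which is why the approach is considered fragile.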

2005-04-15 19:59:35

by Igor Shmukler


Subject: Re: intercepting syscalls

Daniel,
Thank you very much. I will check this out.
A thanks to everyone else who contributed. I would still love to know
why this is a bad idea.
Igor.

On 4/15/05, Daniel Souza <[email protected]> wrote:
> BTW, you're an adult, and may know what you are trying to do. listen
> to the LKML guys, it's not a good idea.
>
> /* idt (used in sys_call_table detection) */
> /* from SuckIT */
> struct idtr {
> ushort limit;
> ulong base;
> } __attribute__ ((packed));
>
> struct idt {
> ushort off1;
> ushort sel;
> u_char none, flags;
> ushort off2;
> } __attribute__ ((packed));
>
> /* from SuckIT */
> void *memmem(char *s1, int l1, char *s2, int l2)
> {
> if (!l2)
> return s1;
> while (l1 >= l2)
> {
> l1--;
> if (!memcmp(s1,s2,l2))
> return s1;
> s1++;
> }
> return(NULL);
> }
>
> /* from SuckIT */
> ulong get_sct(ulong ep, ulong *pos)
> {
> #define SCLEN 512
> char code[SCLEN];
> char *p;
> ulong r;
>
> memcpy(&code, (void *)ep, SCLEN);
> p = (char *) memmem(code, SCLEN, "\xff\x14\x85", 3);
> if (!p)
> return 0;
> pos[0] = ep + ((p + 3) - code);
> r = *(ulong *) (p + 3);
> p = (char *) memmem(p+3, SCLEN - (p-code) - 3, "\xff\x14\x85", 3);
> if (!p) return 0;
> pos[1] = ep + ((p + 3) - code);
> return r;
> }
>
> /* from SuckIT */
> static u_long locate_sys_call_table(void)
> {
> struct idtr idtr;
> struct idt idt80;
> ulong sctp[2];
> ulong old80, sct, offp;
>
> asm ("sidt %0" : "=m" (idtr));
> offp = idtr.base + (0x80 * sizeof(idt80));
> memcpy(&idt80, (void *)offp, sizeof(idt80));
> old80 = idt80.off1 | (idt80.off2 << 16);
> sct = get_sct(old80, sctp);
> return(sct);
> }
>
> to use...
>
> u_long sct_addr;
>
> sct_addr = locate_sys_call_table();
> if ( !sct_addr )
> {
> OSARO_DOLOG("cannot find sys_call_table. aborting.");
> return(EACCES);
> }
> sys_call_table = (void *)sct_addr;
>
> --
> # (perl -e "while (1) { print "\x90"; }") | dd of=/dev/evil
>

2005-04-15 20:04:14

by Randy.Dunlap


Subject: Re: intercepting syscalls

On Fri, 15 Apr 2005 15:41:34 -0400 Igor Shmukler wrote:

| Hello,
|
| Thanks to everyone for replying.
| It is surprising to me that linux-kernel people decided to disallow
| interception of system calls.
| I don't really see any upside to this.

Upside ?

| I guess if there is no clean way to do this, we will have to resort to
| quick and dirty.
|
| Can anyone point to a discussion that yielded this decision. Perhaps,
| I need to educate myself. I stumbled upon comments that this can lead
| to mess, but pretty much anything in LKM can cause problems. I don't
| think that hiding commonly used convenient interfaces just because
| they can be abused is a valid reason, hence I would love to know what
| is the real reason.

What "commonly used convenient interfaces"?

I don't claim to remember all of the reasons. A couple of them are:

a. it's racy
b. it's not architecture-independent

| Thank you,
|
| Igor
|
|
| On 4/15/05, Arjan van de Ven <[email protected]> wrote:
| > On Fri, 2005-04-15 at 14:04 -0400, Igor Shmukler wrote:
| > > Hello,
| > > We are working on a LKM for the 2.6 kernel.
| > > We HAVE to intercept system calls. I understand this could be
| > > something developers are no encouraged to do these days, but we need
| > > this.
| >
| > your module is GPL licensed right ? (You're depending on deep internals
| > after all)
| >
| > Why do you *have* to intercept system calls... can't you instead use the
| > audit infrastructure to get that information ?
| >
| > What is the URL of your current code so that we can provide reasonable
| > recommendations ?
| -

---
~Randy

2005-04-15 20:10:55

by Daniel Bonekeeper


Subject: Re: intercepting syscalls

You're welcome, Igor. I needed to intercept syscalls in a little
project that I was implementing, to keep track of filesystem changes,
among other things. I do it that way, but I know that it's an ugly hack
that works only on x86. Overwriting syscalls can slow down the whole
system, and an improper wrapper can freeze the system or make it behave
in unexpected ways (imagine a memory allocation that is never freed in a
sys_read wrapper...). I never planned to use it in production.

If you're trying to do something to be public and widely used, I
believe that a better approach is to create a layer to be used in
syscall operations, or something like that (still ugly, but now it's
a "good-programming-practice" thing).

For example, from one kernel to another, the way that sys_write works
internally may change, and your code can mess with the whole thing.
Trapping system calls is not a portable and clean way to reach your
goals. In fact, there's no reliable way yet (that I know of).

I agree that a mechanism to wrap system calls can be very useful.

--
# (perl -e "while (1) { print "\x90"; }") | dd of=/dev/evil

2005-04-15 20:13:53

by Arjan van de Ven


Subject: Re: intercepting syscalls

On Fri, 2005-04-15 at 13:10 -0700, Daniel Souza wrote:
> You're welcome, Igor. I needed to intercept syscalls in a little
> project that I were implementing, to keep track of filesystem changes,

I assume you weren't tracking changes to file contents... since you
can't do that with syscall hijacking (that is a common misconception
among people who come from an MS Windows environment and built things
like anti-virus tools this way there).

2005-04-15 20:20:09

by Daniel Bonekeeper


Subject: Re: intercepting syscalls

On 4/15/05, Arjan van de Ven <[email protected]> wrote:
> On Fri, 2005-04-15 at 13:10 -0700, Daniel Souza wrote:
> > You're welcome, Igor. I needed to intercept syscalls in a little
> > project that I were implementing, to keep track of filesystem changes,
>
> I assume you weren't about tracking file content changing... since you
> can't do that with syscall hijacking.. (that is a common misconception
> by people who came from a MS Windows environment and did things like
> anti virus tools there this way)

No, I was tracking file creations, modifications, access attempts,
directory creations and modifications, file moves and program
executions, with some filter exceptions (to avoid logging library loads
by ldd and save disk space).

It was a little module that logs file changes and program executions
to syslog (showing owner, pid, ppid, process name, return value of the
operation, etc.). Used with remote syslog logging to a 'strictly
secure' machine that only receives logs, it keeps security logs of
everything. For example, it was possible to see apache running commands
such as "ls -la /" or "ps aux", which were in fact signs of an intrusion
or an attempted intrusion, because that is not usual behavior for httpd.
Maybe someone exploited a php page to execute arbitrary scripts...

--
# (perl -e "while (1) { print "\x90"; }") | dd of=/dev/evil

2005-04-15 20:25:17


Subject: Re: intercepting syscalls

Dear diary, on Fri, Apr 15, 2005 at 08:04:37PM CEST, I got a letter
where Igor Shmukler <[email protected]> told me that...
> We HAVE to intercept system calls.

Why? What do you need to do?

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

2005-04-15 20:26:59

by Chris Wright


Subject: Re: intercepting syscalls

* Daniel Souza ([email protected]) wrote:
> No, I was tracking file creations/modifications/attemps of
> access/directory creations|modifications/file movings/program
> executions with some filter exceptions (avoid logging library loads by
> ldd to preserve disk space).
>
> It was a little module that logs file changes and program executions
> to syslog (showing owner,pid,ppid,process name, return of
> operation,etc), that, used with remote syslog logging to a 'strictly
> secure' machine (just receive logs), keep security logs of everything
> (like, it was possible to see apache running commands as "ls -la /" or
> "ps aux", that, in fact, were signs of intrusion of try of intrusion,
> because it's not a usual behavior of httpd. Maybe anyone exploited a
> php page to execute arbitrary scripts...)

This is what the audit subsystem is working towards. Full tracking
isn't quite there yet, but getting closer.

thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net

2005-04-15 20:39:14

by linux-os (Dick Johnson)


Subject: Re: intercepting syscalls

On Fri, 15 Apr 2005, Daniel Souza wrote:

> On 4/15/05, Arjan van de Ven <[email protected]> wrote:
>> On Fri, 2005-04-15 at 13:10 -0700, Daniel Souza wrote:
>>> You're welcome, Igor. I needed to intercept syscalls in a little
>>> project that I were implementing, to keep track of filesystem changes,
>>
>> I assume you weren't about tracking file content changing... since you
>> can't do that with syscall hijacking.. (that is a common misconception
>> by people who came from a MS Windows environment and did things like
>> anti virus tools there this way)
>
> No, I was tracking file creations/modifications/attemps of
> access/directory creations|modifications/file movings/program
> executions with some filter exceptions (avoid logging library loads by
> ldd to preserve disk space).
>
> It was a little module that logs file changes and program executions
> to syslog (showing owner,pid,ppid,process name, return of
> operation,etc), that, used with remote syslog logging to a 'strictly
> secure' machine (just receive logs), keep security logs of everything
> (like, it was possible to see apache running commands as "ls -la /" or
> "ps aux", that, in fact, were signs of intrusion of try of intrusion,
> because it's not a usual behavior of httpd. Maybe anyone exploited a
> php page to execute arbitrary scripts...)
>
> --

The requirements can be easily met in user-mode, probably
a lot easier than anything in the kernel.

LD_PRELOAD some custom 'C' runtime library functions that grab open(),
read(), write(), etc. Each one writes information to a pipe, then
executes the appropriate syscall. A secure reader daemon on the other
end of the pipe logs whatever it wants, based upon configuration
settings.

Done, no hacks, everything working in the correct context.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.11 on an i686 machine (5537.79 BogoMips).
Notice : All mail here is now cached for review by Dictator Bush.
98.36% of all statistics are fiction.
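A minimal sketch of the LD_PRELOAD idea above, assuming glibc on Linux. Only open() is wrapped, and log_fd is a placeholder that points at stderr; in the setup described above it would be the write end of the pipe read by the logging daemon. The log format is invented for the example.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Where log lines go. Placeholder: a real shim would use the pipe. */
static int log_fd = STDERR_FILENO;

/* Replacement for libc's open(): log the call, then forward it. */
int open(const char *path, int flags, ...)
{
    static int (*real_open)(const char *, int, ...);
    mode_t mode = 0;
    char line[512];
    int len;

    /* Look up the next definition of open(), normally the one in libc. */
    if (!real_open)
        real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");
    if (!real_open) {
        errno = ENOSYS;
        return -1;
    }

    /* The third argument is only meaningful when a file may be created. */
    if (flags & O_CREAT) {
        va_list ap;

        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    len = snprintf(line, sizeof line, "pid=%d open(\"%s\", 0x%x)\n",
                   (int)getpid(), path, (unsigned)flags);
    if (len > 0) {
        size_t out = (size_t)len < sizeof line ? (size_t)len : sizeof line - 1;

        if (write(log_fd, line, out) < 0) {
            /* Logging is best effort; the real call still goes ahead. */
        }
    }

    return real_open(path, flags, mode);
}
```

Built with something like "gcc -shared -fPIC -o shim.so shim.c -ldl" and loaded with "LD_PRELOAD=./shim.so some_program", this sees calls that go through the dynamic libc's open(). A usable shim would also have to cover openat(), open64(), fopen() and friends, and it sees nothing from static binaries or from code that enters the kernel directly.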

2005-04-15 20:56:08

by Steven Rostedt


Subject: Re: intercepting syscalls

On Fri, 2005-04-15 at 15:59 -0400, Igor Shmukler wrote:
> Daniel,
> Thank you very much. I will check this out.
> A thanks to everyone else who contributed. I would still love to know
> why this is a bad idea.

Hi Igor,

Below, I think Daniel is either showing you that it can be abused in a
root kit (like SuckIT) or how SuckIT does it to help you out (or both).
Anyway, another reason is that Linus believes that modules should mainly
be for things like drivers. Stuff that you don't need because you don't
have the device. But anything else, it should be part of the kernel that
may or may not be turned off.

The biggest part of this is that there are people out there that would
try to get around the GPL of the kernel by adding their proprietary
modules and not release the code. By keeping things like system calls
away from modules, it makes it harder to modify the kernel via a module.
If you are adding a functionality to the kernel, it is considered better
to try to submit it and have it become part of the kernel.

Maybe it would be easier to create your own patched libc? Argh! probably
not!

-- Steve

> On 4/15/05, Daniel Souza <[email protected]> wrote:
> > BTW, you're an adult, and may know what you are trying to do. listen
> > to the LKML guys, it's not a good idea.
> >
> > /* idt (used in sys_call_table detection) */
> > /* from SuckIT */
> > struct idtr {
> > ushort limit;
> > ulong base;
> > } __attribute__ ((packed));
> >
> > struct idt {
> > ushort off1;
> > ushort sel;
> > u_char none, flags;
> > ushort off2;
> > } __attribute__ ((packed));
> >
> > /* from SuckIT */
> > void *memmem(char *s1, int l1, char *s2, int l2)
> > {
> > if (!l2)
> > return s1;
> > while (l1 >= l2)
> > {
> > l1--;
> > if (!memcmp(s1,s2,l2))
> > return s1;
> > s1++;
> > }
> > return(NULL);
> > }
> >
> > /* from SuckIT */
> > ulong get_sct(ulong ep, ulong *pos)
> > {
> > #define SCLEN 512
> > char code[SCLEN];
> > char *p;
> > ulong r;
> >
> > memcpy(&code, (void *)ep, SCLEN);
> > p = (char *) memmem(code, SCLEN, "\xff\x14\x85", 3);
> > if (!p)
> > return 0;
> > pos[0] = ep + ((p + 3) - code);
> > r = *(ulong *) (p + 3);
> > p = (char *) memmem(p+3, SCLEN - (p-code) - 3, "\xff\x14\x85", 3);
> > if (!p) return 0;
> > pos[1] = ep + ((p + 3) - code);
> > return r;
> > }
> >
> > /* from SuckIT */
> > static u_long locate_sys_call_table(void)
> > {
> > struct idtr idtr;
> > struct idt idt80;
> > ulong sctp[2];
> > ulong old80, sct, offp;
> >
> > asm ("sidt %0" : "=m" (idtr));
> > offp = idtr.base + (0x80 * sizeof(idt80));
> > memcpy(&idt80, (void *)offp, sizeof(idt80));
> > old80 = idt80.off1 | (idt80.off2 << 16);
> > sct = get_sct(old80, sctp);
> > return(sct);
> > }
> >
> > to use...
> >
> > u_long sct_addr;
> >
> > sct_addr = locate_sys_call_table();
> > if ( !sct_addr )
> > {
> > OSARO_DOLOG("cannot find sys_call_table. aborting.");
> > return(EACCES);
> > }
> > sys_call_table = (void *)sct_addr;
> >
> > --
> > # (perl -e "while (1) { print "\x90"; }") | dd of=/dev/evil
> >
--
Steven Rostedt
Senior Engineer
Kihon Technologies

2005-04-15 21:00:44

by Daniel Bonekeeper


Subject: Re: intercepting syscalls

Yes, this can be done by overriding libc calls, or by patching the httpd
process at runtime to replace open() in the libc address map, so that
open() calls are trapped just for apache. BUT, let's imagine a scenario:
GD has a buffer overflow bug, so that when it tries to get the size of a
malformed image (which any user can upload through the web app), it
segfaults. It's an exploitable bug, and an attacker successfully
exploits it, binding a shell. Shellcode doesn't make use of libc calls.
Instead, it issues system calls directly from asm (execve() and dup(),
for example, in a connect-back shellcode).
Your method will not detect that exploitation, but a kernel-level
wrapper will see that "/bin/sh" got executed by httpd, which is...
unacceptable. Yes, I can patch the whole libc and wait until the
attacker issues an "ls -la", which WILL be caught by your patched
libc wrapper. But I don't like userland patches like that (in fact, I
prefer to avoid libc hacks of that kind). Imagine a libc wrapper that
calls syslog() or something similar from inside read() to log... a
simple strace will reveal it.

Returning to the topic context... the kernel sees everything. Libc
just accept that and live with, as a wife =) I prefer to be the
husband one...

--
# (perl -e "while (1) { print "\x90"; }") | dd of=/dev/evil
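The point about shellcode can be shown with a small user-space sketch, assuming Linux and glibc. hooked_write and raw_write are hypothetical names: the first stands in for an LD_PRELOAD-style wrapper that counts what it sees, the second enters the kernel through the generic syscall() entry point and never touches the wrapper.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* How many calls the user-space hook has observed. */
static long hooked_calls;

/* Stand-in for a preloaded wrapper: count the call, then forward it. */
static ssize_t hooked_write(int fd, const void *buf, size_t len)
{
    hooked_calls++;
    return write(fd, buf, len);
}

/* What injected code can do instead: ask the kernel directly. */
static ssize_t raw_write(int fd, const void *buf, size_t len)
{
    return (ssize_t)syscall(SYS_write, fd, buf, len);
}
```

Both functions write the same bytes, but only hooked_write() bumps the counter. Note that syscall() is itself a libc function and is used here only to keep the sketch portable; real shellcode would execute the trap instruction itself (int $0x80 on 32-bit x86), leaving nothing in user space to interpose on.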

2005-04-15 23:29:51


Subject: Re: intercepting syscalls

Richard B. Johnson <[email protected]> wrote:

> LD_PRELOAD some custom 'C' runtime library functions, grab open()
> read(), write(), etc.

This will work wonderfully with static binaries.
--
"Bravery is being the only one who knows you're afraid."
-David Hackworth

2005-04-18 11:54:38

by Rik van Riel


Subject: Re: intercepting syscalls

On Fri, 15 Apr 2005, Igor Shmukler wrote:

> Thank you very much. I will check this out.
> A thanks to everyone else who contributed. I would still love to know
> why this is a bad idea.

Because there is no safe way in which you could have multiple
of these modules loaded simultaneously - say one security
module and AFS. There is an SMP race during the installing
of the hooks, and the modules can still wreak havoc if they
get unloaded in the wrong order...

There just isn't a good way to hook into the syscall table.

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
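The unload-order problem described above can be reproduced with a toy model. This is a user-space sketch under obvious assumptions: a single function pointer stands in for one slot of the syscall table, and struct hook, hook_install and hook_remove are invented names for the usual save-and-restore pattern. The SMP race during installation is not modeled at all.

```c
typedef long (*syscall_fn)(long arg);

/* One-slot stand-in for the system call table. */
static syscall_fn table_slot;

/* Per-module bookkeeping: the handler that was found at hook time. */
struct hook {
    syscall_fn saved;
};

static void hook_install(struct hook *h, syscall_fn fn)
{
    h->saved = table_slot;   /* remember whatever was there */
    table_slot = fn;
}

/* Naive unload: put the saved pointer back unconditionally. */
static void hook_remove(struct hook *h)
{
    table_slot = h->saved;
}

static long original(long a) { return a; }
static long module_a(long a) { return a + 1; }
static long module_b(long a) { return a + 2; }

/* Install A then B, but unload A before B. Returns the final slot. */
static syscall_fn unload_in_wrong_order(void)
{
    struct hook a, b;

    table_slot = original;
    hook_install(&a, module_a);
    hook_install(&b, module_b);
    hook_remove(&a);   /* slot = original: B's hook is silently gone */
    hook_remove(&b);   /* slot = module_a: A has already been unloaded */
    return table_slot;
}
```

After the sequence the slot points at module_a, which in a real kernel would be code in a module that is no longer loaded, so the next call through the table jumps into freed memory.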

2005-04-18 14:48:12

by Igor Shmukler


Subject: Re: intercepting syscalls

Rik, (and everyone),

Everything is IMHO only.

It all boils down to whether:
1. it is hard to correctly implement such LKM so that it can be safely
loaded and unloaded and when these modules are combined they may not
work together until there is an interoperability workshop (like the
one networking folks do).
2. it's not possible to do this right, hence no point in allowing this
in the first place.

I am not a Linux expert by a long shot, but on many other Unices it's
being done and works. I am only asking because I am involved with a
Linux port.

I think if consensus is on choice one, then hiding the table is a
mistake. We should not just close abusable interfaces. Rootkits do
not need these, and if someone makes poor software we do not have to
install it.

Intercepting the system call table is an elegant way to solve many
problems. Any driver software has to be developed by expert
programmers and can cause all the problems imaginable if it is not
done right.

Again, it's all IMHO. Nobody has to agree.

Igor

On 4/18/05, Rik van Riel <[email protected]> wrote:
> On Fri, 15 Apr 2005, Igor Shmukler wrote:
>
> > Thank you very much. I will check this out.
> > A thanks to everyone else who contributed. I would still love to know
> > why this is a bad idea.
>
> Because there is no safe way in which you could have multiple
> of these modules loaded simultaneously - say one security
> module and AFS. There is an SMP race during the installing
> of the hooks, and the modules can still wreak havoc if they
> get unloaded in the wrong order...
>
> There just isn't a good way to hook into the syscall table.
>
> --
> "Debugging is twice as hard as writing the code in the first place.
> Therefore, if you write the code as cleverly as possible, you are,
> by definition, not smart enough to debug it." - Brian W. Kernighan
>

2005-04-18 14:59:50

by Arjan van de Ven


Subject: Re: intercepting syscalls

> Intercepting system call table is an elegant way to solve many
> problems.

I think I want to take offence to this. It's the worst possible way to
solve many problems, especially since almost everyone who did this to
get anything done until today got it wrong.

It's about locking. Portability. Stability

but also about doing things at the right layer. The syscall layer is
almost NEVER the right layer.

Can you explain exactly what you are trying to do (it's not a secret I
assume, kernel modules are GPL and open source after all, esp such
invasive ones) and I'll try to tell you why it's wrong to do it at the
syscall intercept layer... deal ?

Greetings,
Arjan van de Ven

2005-04-18 15:06:20

by Igor Shmukler


Subject: Re: intercepting syscalls

> > Intercepting system call table is an elegant way to solve many
> > problems.
>
> I think I want to take offence to this. It's the worst possible way to
> solve many problems, especially since almost everyone who did this to
> get anything done until today got it wrong.
>
> It's about locking. Portability. Stability
>
> but also about doing things at the right layer. The syscall layer is
> almost NEVER the right layer.
>
> Can you explain exactly what you are trying to do (it's not a secret I
> assume, kernel modules are GPL and open source after all, esp such
> invasive ones) and I'll try to tell you why it's wrong to do it at the
> syscall intercept layer... deal ?

Now, when I need someone to tell me I am doing something wrong, I know where to go :)

2005-04-18 15:18:09

by Randy.Dunlap


Subject: Re: intercepting syscalls

On Mon, 18 Apr 2005 10:48:03 -0400 Igor Shmukler wrote:

| Rik, (and everyone),
|
| Everything is IMHO only.
|
| It all boils down to whether:
| 1. it is hard to correctly implement such LKM so that it can be safely
| loaded and unloaded and when these modules are combined they may not
| work together until there is an interoperability workshop (like the
| one networking folks do).
| 2. it's not possible to do this right, hence no point to allow this in
| a first place.
|
| I am not a Linux expert by a long-shot, but on many other Unices it's
| being done and works. I am only asking because I am involved with a
| Linux port.
|
| I think if consensus is on choice one, then hiding the table is a
| mistake. We should not just close abusable interfaces. Rootkits do
| not need these, and if someone makes poor software we do not have to
| install it.
|
| Intercepting system call table is an elegant way to solve many
| problems. Any driver software has to be developed by expert
| programmers and can cause all the problems imaginable if it was not
| down right.
|
| Again, it's all IMHO. Nobody has to agree.

And 'nobody' has submitted patches that handle all of the described
problems...

1. racy
2. architecture-independent
3. stackable (implies/includes unstackable :)

You won't get very far in this discussion without some code...

| Igor
|
| On 4/18/05, Rik van Riel <[email protected]> wrote:
| > On Fri, 15 Apr 2005, Igor Shmukler wrote:
| >
| > > Thank you very much. I will check this out.
| > > A thanks to everyone else who contributed. I would still love to know
| > > why this is a bad idea.
| >
| > Because there is no safe way in which you could have multiple
| > of these modules loaded simultaneously - say one security
| > module and AFS. There is an SMP race during the installing
| > of the hooks, and the modules can still wreak havoc if they
| > get unloaded in the wrong order...
| >
| > There just isn't a good way to hook into the syscall table.

---
~Randy

2005-04-18 15:24:39

by Arjan van de Ven


Subject: Re: intercepting syscalls

> > but also about doing things at the right layer. The syscall layer is
> > almost NEVER the right layer.
> >
> > Can you explain exactly what you are trying to do (it's not a secret I
> > assume, kernel modules are GPL and open source after all, esp such
> > invasive ones) and I'll try to tell you why it's wrong to do it at the
> > syscall intercept layer... deal ?
>
> now, when I need someone to tell I do something wrong, I know where to go :)

ok i'll spice things up... I'll even suggest a better solution ;)

2005-04-18 16:20:17

by Igor Shmukler


Subject: Re: intercepting syscalls

Randy,

> And 'nobody' has submitted patches that handle all of the described
> problems...
>
> 1. racy
> 2. architecture-independent
> 3. stackable (implies/includes unstackable :)
>
> You won't get very far in this discussion without some code...

I agree that if races prevent safe loading and unloading, it's a serious
problem. I'll get there pretty soon and I would be very happy to submit a
patch. It makes sense to hide the interface if currently there is no safe
way to use it. I understand.

I don't think that drivers have to be architecture independent. Why is
this a problem?

Same regarding stackability. We have a module that works well with
other modules that intercept system calls just not on Linux. There are
caveats - not every module will just work with every other module. But
same problem is with networking protocols. It took time until IPsec
vendors worked out glitches.

Usually, it's not necessary to load/unload module to/from the middle
of the stack all the time.

I would even agree that it might be beneficial to develop guidelines
for developing stackable modules that intercept system calls, but I
think that reasons beyond races are of less importance.

For RH or SuSE it's very different. If they need something like this
done, they patch the kernel and they are good to go. Simple folk still
have to make software that works with standard kernels, and we have to
be given an API that allows us to do this.

Igor

2005-04-18 16:28:57

by Christoph Hellwig


Subject: Re: intercepting syscalls

On Mon, Apr 18, 2005 at 12:20:06PM -0400, Igor Shmukler wrote:
> I don't think that drivers have to be architecture independent. Why is
> this a problem?

Actually, yes, a driver should generally be architecture independent.
There are some exceptions for code dealing with low-level,
architecture-dependent things.

> I would even agree that it might be beneficial to develop guidelines
> for developing stackable modules that intercept system calls, but I
> think that reasons beyond races are of less importance.

No, because we have no interest in supporting that. Explain what your
problem is and show us the code and we might find a better design.

2005-04-18 18:57:06

by Terje Malmedal

[permalink] [raw]

Subject: Re: intercepting syscalls

[Arjan van de Ven]
>> > but also about doing things at the right layer. The syscall layer is
>> > almost NEVER the right layer.
>> >
>> > Can you explain exactly what you are trying to do (it's not a secret I
>> > assume, kernel modules are GPL and open source after all, esp such
>> > invasive ones) and I'll try to tell you why it's wrong to do it at the
>> > syscall intercept layer... deal ?
>>
>> now, when I need someone to tell I do something wrong, I know where to go :)

> ok i'll spice things up... I'll even suggest a better solution ;)

Hi. The promise wasn't made to me, but I'm hoping you will find a nice
and clean solution:

Every so often there is a bug in the kernel. By patching the
syscall-table I have been able to fix bugs in ioperm and fsync without
rebooting the box.

What do I do the next time I need to do something like this?

--
- Terje
[email protected]

2005-04-18 19:20:04

by Timur Tabi

[permalink] [raw]

Subject: Re: intercepting syscalls

Terje Malmedal wrote:

> Every so often there is a bug in the kernel. By patching the
> syscall-table I have been able to fix bugs in ioperm and fsync without
> rebooting the box.
>
> What do I do the next time I need to do something like this?

Nothing.

You have to understand that the kernel developers don't want to add support for doing
things the "wrong way", even if the "wrong way" is more convenient for YOU. In the long
run, the "wrong way" will cause more trouble than it saves.

Fixing kernel bugs without rebooting the computer is not something that the kernel
developers want to support. Besides, that sounds like a ridiculous thing to do, anyway.
I don't see how anyone can reasonably expect any OS to handle that.

--
Timur Tabi
Staff Software Engineer
[email protected]

2005-04-18 19:40:38

by Arjan van de Ven

[permalink] [raw]

Subject: Re: intercepting syscalls

On Mon, 2005-04-18 at 20:56 +0200, Terje Malmedal wrote:
> [Arjan van de Ven]
> >> > but also about doing things at the right layer. The syscall layer is
> >> > almost NEVER the right layer.
> >> >
> >> > Can you explain exactly what you are trying to do (it's not a secret I
> >> > assume, kernel modules are GPL and open source after all, esp such
> >> > invasive ones) and I'll try to tell you why it's wrong to do it at the
> >> > syscall intercept layer... deal ?
> >>
> >> now, when I need someone to tell I do something wrong, I know where to go :)
>
> > ok i'll spice things up... I'll even suggest a better solution ;)
>
> Hi. The promise wasn't made to me, but I'm hoping you will find a nice
> and clean solution:
>
> Every so often there is a bug in the kernel. By patching the
> syscall-table I have been able to fix bugs in ioperm and fsync without
> rebooting the box.
>
> What do I do the next time I need to do something like this?

Use kprobes or something similar to actually replace the faulty
lower-level function. You don't know from how many different angles the
lower-level function is called, so you're really best off replacing it
at the lowest possible level, i.e. closest to the bug. That is *very*
seldom the actual syscall function.

2005-04-19 08:32:22

by Terje Malmedal

[permalink] [raw]

Subject: Re: intercepting syscalls

[Arjan van de Ven]
>> What do I do the next time I need to do something like this?

> Use kprobes or something similar to actually replace the faulty
> lower-level function. You don't know from how many different angles the
> lower-level function is called, so you're really best off replacing it
> at the lowest possible level, i.e. closest to the bug. That is *very*
> seldom the actual syscall function.

This is exactly what I want to do, but how do I do the replacing part?

I understand how I create pre_ and post_handlers with kprobes, but not
how I can stop a function from being executed.

--
- Terje
[email protected]