2002-07-18 02:24:08

by Shaya Potter

Subject: more thoughts on a new jail() system call

ok, I've gone through each syscall (for x86/2.4.18). I've divided the
syscalls into categories, saying whether I think they should be fine (with
the reason I think so).

P - [Process] - Allow the system call to operate as normal, as it only
affects the process's individual state, and doesn't change how the OS
relates to the process.

F - [File descriptor] - The system call depends on a file descriptor.
Since every process in a jail would have been started in the jail, all
the file descriptors should be safe.

C - [Chroot] - w/ proper chroot should be fine (as it operates on a
filename, and therefore will be confined to the jail)

J - [Jail] - system calls that need to be "caught" when run in a jail.

^J - Works normally b/c jails are 100% sandboxed off from everything

Different things have to be thought about, such as how processes running
on the regular system interact with jailed processes. A jailed process
can't send signals (ex. kill() ) to processes outside its jail, but what
about processes outside the jail, can they send signals to
processes inside a jail? I can imagine that bad things could possibly
occur if jail'd processes could communicate w/ normal processes (or vice
versa)
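The signalling question can be pictured as a policy check on jail IDs. This is only a sketch: `JAIL_NONE` and `jail_may_signal()` are hypothetical names, not an existing kernel interface, and it assumes one possible answer to the open question above (host processes may signal into a jail, jailed processes may not signal out).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical convention: jail id 0 means "not jailed" (a host process). */
#define JAIL_NONE 0

/*
 * Sketch of a signal-delivery policy.
 * Assumption: a host process may signal anyone; a jailed process may
 * only signal processes that live in the same jail.
 */
bool jail_may_signal(int sender_jail, int target_jail)
{
    assert(sender_jail >= 0 && target_jail >= 0);

    if (sender_jail == JAIL_NONE)
        return true;                    /* host process: unrestricted */
    return sender_jail == target_jail;  /* jailed: same jail only */
}
```

If it turns out that host processes should not signal into a jail either, only the first branch would need to change.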

anyways, this is just to get the ball rolling on discussion. below is my
list of all the syscalls and how I sort them out (including some I don't know
much about), so please comment.

thanks,

shaya potter

--------

sys_exit) P

sys_fork) P (would create a new process in the same jail)

sys_read) F

sys_write) F

sys_open) C

sys_close) F

sys_waitpid) P

sys_creat) C

sys_link) C

sys_unlink) C

sys_execve) P

sys_chdir) C

sys_time) J

sys_mknod) J - Need FIFO ability, everything else not.

sys_chmod) C

sys_lchown16) C

sys_stat) C

sys_lseek) F

sys_getpid) P

sys_mount) J - Not able to mount inside a jail.

sys_oldumount) J - Not able to unmount inside a jail.

sys_setuid16) ^J - since jail is secure, can setuid all you want.

sys_getuid16) P

sys_stime) J - can't set system time from jail.

sys_ptrace) J - Either disable totally, or filter to only processes in
same jail.

sys_alarm) P

sys_fstat) F

sys_pause) P (works w/ alarm)

sys_utime) C

sys_access) C

sys_nice) J - a process as root inside a jail shouldn't be able to -19
itself consuming all cpu.

sys_sync) Affects the system, but in a non destructive way. regular
users can run sync, so shouldn't be a problem?

sys_kill) J - Can only kill processes inside your own jail.

sys_rename) C

sys_mkdir) C

sys_rmdir) C

sys_dup) F/P

sys_pipe) P

sys_times) P

sys_brk) allow as normal (this is for malloc, right?)

sys_setgid16) ^J - assuming jail is secure.

sys_getgid16) P

sys_signal) P

sys_geteuid16) P

sys_getegid16) P

sys_acct) J - no separate process accounting for jail, so doesn't make
sense

sys_umount) J - not allowed

sys_ioctl) J - disallowed, but perhaps if devices recognize jails and
filter commands based on that...

sys_fcntl) F

sys_setpgid) ^J

sys_olduname) - P

sys_umask) P

sys_chroot) J - Jail depends on this being strong. So either revoke,
or figure out a way to decide that a chroot is safe (i.e. below jail's
root) and force a chdir into the new chroot.

sys_ustat) J - Do we want a jailed process getting this info?

sys_dup2) F/P

sys_getppid) P

sys_getpgrp) P

sys_setsid) NOT SURE - no clue what this really does

sys_sigaction) P

sys_sgetmask) P

sys_ssetmask) P

sys_setreuid16) ^J - assuming jail is secure....

sys_setregid16) ^J - ......

sys_sigsuspend) P

sys_sigpending) P

sys_sethostname) J - Can't run from jail.

we can either virtualize it or disallow it.

sys_setrlimit) J - can't run from jail.

sys_old_getrlimit) P

sys_getrusage) P (though inaccurate if we move, unless we reset)

sys_gettimeofday) P (though some aspect of machine dependence)

sys_settimeofday) J - Can't run.

sys_getgroups16) P

sys_setgroups16) ^J - assuming jail is secure, doesnt matter.

old_select) P

sys_symlink) C

sys_lstat) C

sys_readlink) NOT SURE - seems fine.

sys_uselib) NOT SURE - seems fine.

sys_swapon) J - cant affect system from jail.

sys_reboot) J - cant affect system from jail (perhaps can be used to
kill jail)

old_readdir) NOT SURE (dont think it should be a problem)

old_mmap) P

sys_munmap) P

sys_truncate) C

sys_ftruncate) F

sys_fchmod) F

sys_fchown16) F

sys_getpriority) P

sys_setpriority) J - jail processes cant play with their priority

sys_statfs) NOT SURE - should a jail process be able to get info on
system?

sys_fstatfs) same as statfs

sys_ioperm) J - jail processes cant play with local io ports

sys_socketcall) J - Bind seems to be the only problem. jail() includes
an ip address, and a jailed process can only bind to that address. so
do we force the addr to be this address, or does one allow INADDR_ANY
and translate that to the jail'd ip address?

sys_syslog) NOT SURE (probably jailed away)

sys_setitimer) P
sys_getitimer) P

sys_newstat) C
sys_newlstat) C
sys_newfstat) F

sys_uname) NOT SURE - giving jailed process info on local system?

sys_iopl) J - lock out, local machine specific.

sys_vhangup) NOT SURE - Should be fine, right?

sys_vm86old) J - hardware access, not for jails.

sys_wait4) V

sys_swapoff) J - jail cant play with swap.

sys_sysinfo) J - local machine info again?

sys_ipc) - needs to be jailed. Only processes in the same jail can IPC
amongst themselves.

sys_fsync) NOT SURE - same as sync

sys_sigreturn) used by kernel, should never be used elsewhere (acc. to
man page) not sure what to do with it in actuality.

man page quote
"When the Linux kernel creates the stack frame for a signal
handler, a call to sigreturn is inserted into the stack frame so that
the the signal handler will call sigreturn upon return. This
inserted call to sigreturn cleans up the stack so that the process
can restart from where it was interrupted by the signal."

"The sigreturn call is used by the kernel to implement signal
handlers. It should never be called directly. Better yet, the
specific use of the __unused argument varies depending on the
architecture."

sys_clone) P

sys_setdomainname) J - can't change local system info.

sys_newuname) J - getting info on local system again.

sys_modify_ldt) NOT SURE - have no clue.

sys_adjtimex) J - can't change local system.

sys_mprotect) P - process specific

sys_sigprocmask) P

sys_create_module) J - lock out, as local machine affecting

sys_init_module) J - lock out

sys_delete_module) J - lock out

sys_get_kernel_syms) NOT SURE, if no module functions, any need?

sys_quotactl) J - lock out, shouldnt be able to manipulate disk quotas.

sys_getpgid) P

sys_fchdir) F

sys_bdflush) J - only root can call it, so jails cant.

sys_sysfs) J - info on local system?

sys_personality) not really applicable as used for running binaries
for other UNIX's (IBCS for instance), would one run them in a jail?

sys_setfsuid16) ^J

sys_setfsgid16) ^J

sys_llseek) F

sys_getdents) F

sys_select) P

sys_flock) F

sys_msync) P (called b4 munmap)

sys_readv) F

sys_writev) F

sys_getsid) NOT SURE - whats it for?

sys_fdatasync) NOT SURE - probably same as other syncs.

sys_sysctl) J - perhaps jails could have their own sysctls.

sys_mlock) J - jailed process cant lock memory

sys_munlock) J

sys_mlockall) J

sys_munlockall) J

sys_sched_setparam) J - jailed processes cant play with scheduler
sys_sched_getparam) J
sys_sched_setscheduler) J
sys_sched_getscheduler) J

sys_sched_yield) P

sys_sched_get_priority_max) NOT SURE (any point?)
sys_sched_get_priority_min) "
sys_sched_rr_get_interval) "

sys_nanosleep) P

sys_mremap) P (works on virtual addresses)

sys_setresuid16) ^J

sys_getresuid16) P

sys_vm86) J - lock out, hardware access.

sys_query_module) J - lock out, os access.

sys_poll) P/F

sys_nfsservctl) J - lock out.

sys_setresgid16) ^J
sys_getresgid16) P

sys_prctl) P (seems ok - operates on process and parent)

sys_rt_sigreturn) no manpage, but similar to their non rt functions?
sys_rt_sigaction) "
sys_rt_sigprocmask) "
sys_rt_sigpending) "
sys_rt_sigtimedwait) "
sys_rt_sigqueueinfo) "
sys_rt_sigsuspend) "

sys_pread) F (similar to read)

sys_pwrite) F (similar to write)

sys_chown16) C

sys_getcwd) C

sys_capget) NOT SURE, probably disallow

sys_capset) J - as can perhaps get around security

sys_sigaltstack) NOT SURE - no man page have no idea what it is

sys_sendfile) F

sys_vfork) P

sys_getrlimit) P - not sure if any point?

sys_mmap2) NOT SURE (maybe P)

no man page, but if mmap related shouldnt be a problem

sys_truncate64) C - 64 bit version of equivalent function
sys_ftruncate64) F - "
sys_stat64) C - "
sys_lstat64) C - "
sys_fstat64) F - "

sys_lchown) C

sys_getuid) P
sys_getgid) P
sys_geteuid) P
sys_getegid) P
sys_setreuid) ^J
sys_setregid) ^J
sys_getgroups) P
sys_setgroups) ^J

sys_fchown) F

sys_setresuid) ^J
sys_getresuid) P
sys_setresgid) ^J
sys_getresgid) P

sys_chown) C

sys_setuid) ^J
sys_setgid) ^J
sys_setfsuid) ^J
sys_setfsgid) ^J

sys_pivot_root) J - local machine specific.

sys_mincore) NOT SURE

sys_madvise) NOT SURE

sys_getdents64) F 64 bit call of same name
sys_fcntl64) F "

sys_gettid) P (not exactly sure what it is) no man page

sys_readahead) NOT SURE - no man page




2002-07-18 02:28:52

by Shaya Potter

Subject: Re: more thoughts on a new jail() system call

woops, change this (from an earlier draft, where i was using different
names)

sys_wait4) J - Can only wait on a process in its jail

possibly some other mistakes, feel free to rip it apart of course :)

2002-07-19 00:34:30

by daw

Subject: Re: more thoughts on a new jail() system call

Shaya Potter wrote:
>sys_mknod) J - Need FIFO ability, everything else not.

Beware the ability to pass file descriptors across Unix
domain sockets. This should probably be restricted somehow.
Along similar lines, you didn't mention sendmsg() and
recvmsg(), but the fd-passing parts should probably be
restricted.

>sys_setuid16) ^J - since jail is secure, can setuid all you want.

I'd look very carefully at whether root can bypass any
of the access controls you're relying on. For instance,
with root, one can bind to ports below 1024.

>sys_ioctl) J - disallowed, but perhaps if devices recognize jails and
>filter commands based on that...

In my experience building jails (see Janus), this will
be a problem. There are a small number of ioctl()s that
are widely used by applications. To give some examples,
I find that we needed to allow TIOCGPGRP, FIONBIO, and
FIONREAD (they seem safe). Also, I found that lots of
real apps use TCGETS, TCSETS, and TIOCSPGRP; unfortunately,
I'm not too sure whether these are safe.

However, I agree that most ioctl()s are probably dangerous.
Maybe a reasonable stance is to deny all ioctl()s by default,
and have a few exceptions for known-safe ioctl()s to be allowed.
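A default-deny ioctl() filter along these lines could look roughly like the following. It is a sketch under assumptions: `jail_ioctl_permitted()` is a hypothetical helper, and the allowlist contains only the three requests named above as apparently safe. The request constants come from `<sys/ioctl.h>` on Linux.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* Requests described above as widely used and apparently safe. */
static const unsigned long jail_ioctl_allowed[] = {
    TIOCGPGRP,
    FIONBIO,
    FIONREAD,
};

/* Hypothetical check: everything not explicitly listed is denied. */
bool jail_ioctl_permitted(unsigned long request)
{
    size_t n = sizeof(jail_ioctl_allowed) / sizeof(jail_ioctl_allowed[0]);

    for (size_t i = 0; i < n; i++) {
        if (jail_ioctl_allowed[i] == request)
            return true;
    }
    return false;
}
```

TCGETS, TCSETS and TIOCSPGRP are deliberately left out here, since their safety is the open question.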

>sys_fcntl) F

Some fcntl() calls are unsafe. For instance, F_SETOWN may
give a backdoor way to send signals to processes outside
the jail.

>sys_olduname) - P

I'd argue that this should be restricted, on general
principles. (General principle: A jailed process shouldn't
be able to learn anything about the host it's running on.)

>sys_getcwd) C
>sys_ustat) J - Do we want a jailed process getting this info?
>sys_statfs) NOT SURE - should a jail process be able to get info on system?
>sys_fstatfs) same as statfs
>sys_sysfs) J - info on local system?

It's probably not critical, but I'd argue that these should
be denied, on general principles, unless there is some
reason to think it will be very useful. getcwd() is probably
the most critical to deny, as it can give away detailed
information in some cases.

(General principle: If you're in a jail, you shouldn't be
able to learn any information about where that jail resides
on the filesystem.)

>sys_stat) C

Similarly, I'd argue that st_dev maybe should be restricted.

>sys_getppid) P
>sys_getpgid) P

What if the parent process is outside the jail? Does it
cause any harm to disclose the parent pid? I'm not sure...

>sys_setsid) NOT SURE - no clue what this really does

I think it's probably ok, but I'm not 100% sure, either.

>sys_socketcall) J - Bind seems to be the only problem. jail() includes
>an ip address, and a jailed process can only bind to that address. so
>do we force the addr to be this address, or does one allow INADDR_ANY
>and translate that to the jail'd ip address?

The most interesting part is whether connect()
and sendto() should also be restricted. I think
restrictions on access to the network are going
to be critical to security: it is the #1 easiest
way to escape from a jail, if there are no restrictions
on connect() and the like. In principle, we could
use IP Chains for this, though in practice, I suspect
most callers to jail() will forget to set up appropriate
IP filtering. I wonder if there is any way to
reduce the likelihood of this failure mode and keep
programmers honest?

Also, socket() should probably be restricted to
prevent creation of raw IP and PF_PACKET sockets
and the like (sending forged traffic, sniffing
on local traffic).

The SO_BINDTODEVICE and IP_HDRINCL socket option
should probably be restricted.
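A socket() creation check in this spirit might be sketched as follows. `jail_socket_permitted()` is a hypothetical name, and limiting the allowed families to AF_INET and AF_UNIX is an assumption made for the example, not something settled in this thread.

```c
#include <stdbool.h>
#include <sys/socket.h>

/*
 * Sketch: refuse packet sockets and raw sockets, which allow forging
 * or sniffing traffic, and allow only a small set of families.
 */
bool jail_socket_permitted(int domain, int type)
{
    if (domain == AF_PACKET)
        return false;   /* link-level access: sniffing, forged frames */
    if (type == SOCK_RAW)
        return false;   /* raw IP: forged headers */

    /* Assumption: only these two families are useful inside a jail. */
    return domain == AF_INET || domain == AF_UNIX;
}
```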

Also, are there any implications of SO_PASSCRED,
SO_PEERCRED, SCM_RIGHTS, SCM_CREDENTIALS, SO_DEBUG,
SO_REUSEADDR, IP_OPTIONS, IP_PKTINFO?

See also sendmsg() and recvmsg() fd-passing.

>sys_syslog) NOT SURE (probably jailed away)

sys_syslog touches a global shared resource, hence
should probably be denied to jailed processes.

>sys_vhangup) NOT SURE - Should be fine, right?

Seems ok to me.

>sys_fsync) NOT SURE - same as sync
>sys_fdatasync) NOT SURE - probably same as other syncs.

The *sync*() calls seem ok to me.

>sys_getsid) NOT SURE - whats it for?

You shouldn't be able to call getsid() on some other
process outside the jail. Also, calling getsid() on
yourself might reveal information about your parent,
like getppid() or getpgid() (minor).

2002-07-19 02:05:35

by Thunder from the hill

Subject: Re: more thoughts on a new jail() system call

Hi,

On 19 Jul 2002, David Wagner wrote:
> >sys_ioctl) J - disallowed, but perhaps if devices recognize jails and
> >filter commands based on that...

I think it's quite hard for any type of network application to work well
without TIOCINQ.

Regards,
Thunder
--
(Use http://www.ebb.org/ungeek if you can't decode)
------BEGIN GEEK CODE BLOCK------
Version: 3.12
GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
N--- o? K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
e++++ h* r--- y-
------END GEEK CODE BLOCK------

2002-07-19 03:03:06

by Albert D. Cahalan

Subject: Re: more thoughts on a new jail() system call

>> sys_olduname) - P
>
> I'd argue that this should be restricted, on general
> principles. (General principle: A jailed process shouldn't
> be able to learn anything about the host it's running on.)

Learning this info is easy enough without a syscall.
You only cause trouble for legit usage.

>> sys_getcwd) C
>> sys_ustat) J - Do we want a jailed process getting this info?
>> sys_statfs) NOT SURE - should a jail process be able to get info on system?
>> sys_fstatfs) same as statfs
>> sys_sysfs) J - info on local system?
>
> It's probably not critical, but I'd argue that these should
> be denied, on general principles, unless there is some
> reason to think it will be very useful. getcwd() is probably
> the most critical to deny, as it can give away detailed
> information in some cases.
>
> (General principle: If you're in a jail, you shouldn't be
> able to learn any information about where that jail resides
> on the filesystem.)

No, sys_getcwd will return info based on your current root.
After chroot and all, your "/" is the top of your jail.

>> sys_setsid) NOT SURE - no clue what this really does
>
> I think it's probably ok, but I'm not 100% sure, either.

Yes it's OK. It's needed for job control.

>> sys_syslog) NOT SURE (probably jailed away)
>
> sys_syslog touches a global shared resource, hence
> should probably be denied to jailed processes.

It's got to be redirected.

>> sys_vhangup) NOT SURE - Should be fine, right?
>
> Seems ok to me.

Have fun with devpts.

>> sys_getsid) NOT SURE - whats it for?
>
> You shouldn't be able to call getsid() on some other
> process outside the jail. Also, calling getsid() on
> yourself might reveal information about your parent,
> like getppid() or getpgid() (minor).

Your parent ought to be 1.

2002-07-19 03:35:46

by daw

Subject: Re: more thoughts on a new jail() system call

Albert D. Cahalan wrote:
>>> sys_olduname) - P
>>
>> I'd argue that this should be restricted, on general
>> principles. (General principle: A jailed process shouldn't
>> be able to learn anything about the host it's running on.)
>
>Learning this info is easy enough without a syscall.
>You only cause trouble for legit usage.

Ok. To be clear, I consider this minor and probably
unimportant for security, hence just allowing this is
probably reasonable.

That said, is it really true that you can learn the
hostname and the like without a syscall? How?

>No, sys_getcwd will return info based on your current root.
>After chroot and all, your "/" is the top of your jail.

Ahh, I feel stupid for overlooking that. You're
absolutely right. Thanks for the correction.

2002-07-19 04:15:44

by James Antill

Subject: Re: more thoughts on a new jail() system call

Thunder from the hill writes:

> Hi,
>
> On 19 Jul 2002, David Wagner wrote:
> > >sys_ioctl) J - disallowed, but perhaps if devices recognize jails and
> > >filter commands based on that...
>
> I think it's quite hard for any type of network application to work well
> without TIOCINQ.

The more general spelling is FIONREAD, and I generally find that only
crap network applications need to use it. Good ones just try and read
a largish amount of data into a buffer.

I'd agree that more than a couple of apps would break without it, but
that isn't what you said.
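A minimal sketch of the "just read a largish amount" approach, without any ioctl(): `relay_chunk()` is a hypothetical helper, and it assumes the caller uses select() or poll() to learn which side is readable before calling it.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Forward whatever is currently readable on `from` to `to`.
 * Returns the number of bytes forwarded, 0 on EOF, -1 on error.
 * The buffer size is arbitrary; read() simply returns what is there.
 */
ssize_t relay_chunk(int from, int to)
{
    char buf[16384];
    ssize_t n;

    do {
        n = read(from, buf, sizeof(buf));
    } while (n < 0 && errno == EINTR);
    if (n <= 0)
        return n;

    /* write() may be partial, so loop until the chunk is out. */
    for (ssize_t off = 0; off < n; ) {
        ssize_t w = write(to, buf + off, (size_t)(n - off));
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        off += w;
    }
    return n;
}
```

The amount of pending data never has to be known in advance, because read() reports how much it actually delivered.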

--
James Antill
Firewall n.
1. A bad security program used to make other bad security programs less
baddly in need of security.

2002-07-19 04:53:47

by Thunder from the hill

Subject: Re: more thoughts on a new jail() system call

Hi,

On 19 Jul 2002, James Antill wrote:
> The more general spelling is FIONREAD, and I generally find that only
> crap network applications need to use it. Good ones just try and read
> a largish amount of data into a buffer.

That doesn't matter as long as you have no idea how much data
will be read. Especially when relaying between two completely different hosts,
with possibly unknown protocols, you have no way of knowing who will send
next. Without TIOCINQ you're pretty much stuck if you have received lots and
lots of stuff from the client and expect a response from the server. You
just won't get it.

Give me another version of the appended piece of code that won't use
ioctl, and I'll consider an acknowledgement.

Regards,
Thunder


Attachments:
portforwarder.c (4.11 kB)

2002-07-19 07:43:34

by Ville Herva

Subject: Re: more thoughts on a new jail() system call

On Fri, Jul 19, 2002 at 12:21:47AM +0000, you [David Wagner] wrote:
> Shaya Potter wrote:
> >sys_mknod) J - Need FIFO ability, everything else not.
>
> Beware the ability to pass file descriptors across Unix
> domain sockets. This should probably be restricted somehow.
> Along similar lines, you didn't mention sendmsg() and
> recvmsg(), but the fd-passing parts should probably be
> restricted.

I gather FreeBSD allows passing fds, but not into or out of the jail. Just inside
it.

> >sys_setuid16) ^J - since jail is secure, can setuid all you want.
>
> I'd look very carefully at whether root can bypass any
> of the access controls you're relying on. For instance,
> with root, one can bind to ports below 1024.

In FreeBSD jail, jailed root is supposed to be safe. So if something is
jailed - and has the necessary privileges - it can bind to the jail ip (each
jail has its own ip). But it can't bind to any other ip's of the box.

http://docs.freebsd.org/44doc/papers/jail/jail-6.html#section10

> >sys_socketcall) J - Bind seems to be the only problem. jail() includes
> >an ip address, and a jailed process can only bind to that address. so
> >do we force the addr to be this address, or does one allow INADDR_ANY
> >and translate that to the jail'd ip address?

In FreeBSD, INADDR_ANY is explicitly translated to jail's IP. Many daemons
use INADDR_ANY routinely, so I think it makes sense.
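The INADDR_ANY translation described here can be sketched like this. `jail_fixup_bind_addr()` is a hypothetical helper, not FreeBSD's actual code, and it assumes an IPv4-only jail whose address is kept in network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/*
 * Sketch of a bind() address check for a jailed process.
 * INADDR_ANY is rewritten to the jail's address; the jail's own
 * address is accepted; anything else is refused.
 * jail_ip is in network byte order.
 */
bool jail_fixup_bind_addr(struct sockaddr_in *sin, in_addr_t jail_ip)
{
    if (sin->sin_addr.s_addr == htonl(INADDR_ANY)) {
        sin->sin_addr.s_addr = jail_ip;
        return true;
    }
    return sin->sin_addr.s_addr == jail_ip;
}
```

Daemons that bind to INADDR_ANY then keep working unmodified, but end up listening only on the jail's address.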

> >sys_syslog) NOT SURE (probably jailed away)
>
> sys_syslog touches a global shared resource, hence
> should probably be denied to jailed processes.

Ummh, the most logical way would be to create a separate syslog for each jail.
That's also the most laborious alternative, though...



-- v --


2002-07-19 07:45:43

by James Antill

Subject: Re: more thoughts on a new jail() system call


--
# James Antill -- [email protected]
:0:
* ^From: .*james@and\.org
/dev/null


Attachments:
portforwarder.c (10.39 kB)

2002-07-19 16:21:23

by Shaya Potter

Subject: Re: more thoughts on a new jail() system call

On Thu, 2002-07-18 at 20:21, David Wagner wrote:
> Shaya Potter wrote:
> >sys_mknod) J - Need FIFO ability, everything else not.
>
> Beware the ability to pass file descriptors across Unix
> domain sockets. This should probably be restricted somehow.
> Along similar lines, you didn't mention sendmsg() and
> recvmsg(), but the fd-passing parts should probably be
> restricted.

as others said, fifos should only work inside a jail.

could sendmsg and recvmsg possibly be a problem in this scenario? - hack
jail, hack local machine, have process on local machine communicate with
process in jail passing fds?

>
> >sys_setuid16) ^J - since jail is secure, can setuid all you want.
>
> I'd look very carefully at whether root can bypass any
> of the access controls you're relying on. For instance,
> with root, one can bind to ports below 1024.

the whole point is to allow things like that safely.

>
> >sys_ioctl) J - disallowed, but perhaps if devices recognize jails and
> >filter commands based on that...
>
> In my experience building jails (see Janus), this will
> be a problem. There are a small number of ioctl()s that
> are widely used by applications. To give some examples,
> I find that we needed to allow TIOCGPGRP, FIONBIO, and
> FIONREAD (they seem safe). Also, I found that lots of
> real apps use TCGETS, TCSETS, and TIOCSPGRP; unfortunately,
> I'm not too sure whether these are safe.

Right, some of this might need to be virtualized. we might need to
expand on FreeBSD's prison (internal) struct to store information like
this. But haven't had a chance to go through ioctl's yet.

Is it a problem to allow ioctls that regular users can do? We might
need to filter the options passed, but eliminating all ioctls that
only root can use would probably be a start.

>
> However, I agree that most ioctl()s are probably dangerous.
> Maybe a reasonable stance is to deny all ioctl()s by default,
> and have a few exceptions for known-safe ioctl()s to be allowed.

agree.

>
> >sys_fcntl) F
>
> Some fcntl() calls are unsafe. For instance, F_SETOWN may
> give a backdoor way to send signals to processes outside
> the jail.

so these have to be filtered.

>
> >sys_olduname) - P
>
> I'd argue that this should be restricted, on general
> principles. (General principle: A jailed process shouldn't
> be able to learn anything about the host it's running on.)

and as others said (and as I was sort of planning, and as I mentioned for the other
uname-like functions), virtualizing is beneficial here.

>
> >sys_getcwd) C
> >sys_ustat) J - Do we want a jailed process getting this info?
> >sys_statfs) NOT SURE - should a jail process be able to get info on system?
> >sys_fstatfs) same as statfs
> >sys_sysfs) J - info on local system?
>
> It's probably not critical, but I'd argue that these should
> be denied, on general principles, unless there is some
> reason to think it will be very useful. getcwd() is probably
> the most critical to deny, as it can give away detailed
> information in some cases.

as others mentioned cwd isn't a problem as it's chroot'd. for the others, I
don't know, some seem to think they're useful.

>
> (General principle: If you're in a jail, you shouldn't be
> able to learn any information about where that jail resides
> on the filesystem.)

I agree. hence virtualize certain things.

>
> >sys_stat) C
>
> Similarly, I'd argue that st_dev maybe should be restricted.

what do you mean?

>
> >sys_getppid) P
> >sys_getpgid) P
>
> What if the parent process is outside the jail? Does it
> cause any harm to disclose the parent pid? I'm not sure...

everything will be parented at minimum to 1. The trick is, things
shouldn't be able to be jailed after the fact, but should have to be
created inside the jail. So it may be needed to break bsd's jail
semantics, and make jail a sort of exec like call.

>
> >sys_setsid) NOT SURE - no clue what this really does
>
> I think it's probably ok, but I'm not 100% sure, either.
>
> >sys_socketcall) J - Bind seems to be the only problem. jail() includes
> >an ip address, and a jailed process can only bind to that address. so
> >do we force the addr to be this address, or does one allow INADDR_ANY
> >and translate that to the jail'd ip address?
>
> The most interesting part is whether connect()
> and sendto() should also be restricted. I think
> restrictions on access to the network are going
> to be critical to security: it is the #1 easiest
> way to escape from a jail, if there are no restrictions
> on connect() and the like. In principle, we could
> use IP Chains for this, though in practice, I suspect
> most callers to jail() will forget to set up appropriate
> IP filtering. I wonder if there is any way to
> reduce the likelihood of this failure mode and keep
> programmers honest?

how could one use connect() to escape from the jail? I couldn't think of
any, but i'm probably missing something obvious.

>
> Also, socket() should probably be restricted to
> prevent creation of raw IP and PF_PACKET sockets
> and the like (sending forged traffic, sniffing
> on local traffic).

hmm, good point, but not sure that's in the scope of a jail. jail is
there to limit damage to local system if its cracked, cracking a jail is
like cracking a vmware vm, you can do all you want inside, but cant get
out.

>
> The SO_BINDTODEVICE and IP_HDRINCL socket option
> should probably be restricted.

BINDTODEVICE obviously; HDRINCL is related to raw sockets, and I'm not sure that
is in the scope of jail.

>
> Also, are there any implications of SO_PASSCRED,
> SO_PEERCRED, SCM_RIGHTS, SCM_CREDENTIALS, SO_DEBUG,
> SO_REUSEADDR, IP_OPTIONS, IP_PKTINFO?

ok, I'll look into those.

>
> See also sendmsg() and recvmsg() fd-passing.

I've decided I don't know enough about fd-passing; where do I read up on
this? Most of what I know about fds is that they are ints, and the kernel
keeps track of which int goes to which file, so I'm not sure how it's a
problem.

>
> >sys_syslog) NOT SURE (probably jailed away)
>
> sys_syslog touches a global shared resource, hence
> should probably be denied to jailed processes.

or virtualized.

>
> >sys_vhangup) NOT SURE - Should be fine, right?
>
> Seems ok to me.
>
> >sys_fsync) NOT SURE - same as sync
> >sys_fdatasync) NOT SURE - probably same as other syncs.
>
> The *sync*() calls seem ok to me.
>
> >sys_getsid) NOT SURE - whats it for?
>
> You shouldn't be able to call getsid() on some other
> process outside the jail. Also, calling getsid() on
> yourself might reveal information about your parent,
> like getppid() or getpgid() (minor).

yes, thats what we were planning for things like getppid, and see my
comment about jail() perhaps being like exec.

2002-07-19 16:31:11

by Shaya Potter

Subject: Re: more thoughts on a new jail() system call

On Fri, 2002-07-19 at 12:24, Shaya Potter wrote:
> On Thu, 2002-07-18 at 20:21, David Wagner wrote:
> > >sys_syslog) NOT SURE (probably jailed away)
> >
> > sys_syslog touches a global shared resource, hence
> > should probably be denied to jailed processes.
>
> or virtualized.

forget that, stupid, sys_syslog only deals with printk buffer, not
normal syslogd. so lock it away from jails, system deals with it
normally.

2002-07-19 16:32:19

by Shaya Potter

Subject: Re: more thoughts on a new jail() system call

On Thu, 2002-07-18 at 23:06, Albert D. Cahalan wrote:
>
> >> sys_vhangup) NOT SURE - Should be fine, right?
> >
> > Seems ok to me.
>
> Have fun with devpts.

what would happen if a jail had virtualized ttys? i.e. they each had
tty1,2,3,4..... and this was translated to an open real tty when needed,
with the translation mapping being kept in the prison struct?

>
> >> sys_getsid) NOT SURE - whats it for?
> >
> > You shouldn't be able to call getsid() on some other
> > process outside the jail. Also, calling getsid() on
> > yourself might reveal information about your parent,
> > like getppid() or getpgid() (minor).
>
> Your parent ought to be 1.

yes.

2002-07-19 21:07:47

by Shaya Potter

Subject: Re: more thoughts on a new jail() system call

On Thu, 2002-07-18 at 23:06, Albert D. Cahalan wrote:
> >> sys_vhangup) NOT SURE - Should be fine, right?
> >
> > Seems ok to me.
>
> Have fun with devpts.

can you expand on why this might be a problem? as far as I can tell the
syscall is in fs/open.c

it seems very simple to me

asmlinkage long sys_vhangup(void)
{
	if (capable(CAP_SYS_TTY_CONFIG)) {
		tty_vhangup(current->tty);
		return 0;
	}
	return -EPERM;
}

basically, we call tty_vhangup on the process's tty.

if tty_vhangup was the syscall, I could see this being a problem, but as
sys_vhangup can only operate on what the task_struct has, how is it
a problem?

thanks,

shaya potter

2002-07-19 22:47:00

by Shaya Potter

Subject: Re: more thoughts on a new jail() system call

On Thu, 2002-07-18 at 20:21, David Wagner wrote:
> Shaya Potter wrote:
> >sys_mknod) J - Need FIFO ability, everything else not.
>
> Beware the ability to pass file descriptors across Unix
> domain sockets. This should probably be restricted somehow.
> Along similar lines, you didn't mention sendmsg() and
> recvmsg(), but the fd-passing parts should probably be
> restricted.

not sure anything has to be restricted beyond the
filesystem restrictions already in place. From what I can tell from Stevens,
there are 2 ways to pass an fd over an AF_UNIX socket: either socketpair
(parent/child relationship, i.e. both in the jail) or a named socket, which
is then constrained to the jailed FS, so only processes in
that particular jail have access to it.

or am I wrong?
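For reference, fd passing over an AF_UNIX socket is done with an SCM_RIGHTS control message on sendmsg()/recvmsg(). The following is a minimal sketch; `send_fd()` and `recv_fd()` are illustrative names, and error handling is reduced to a -1 return.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer big enough for one descriptor, suitably aligned. */
union fd_ctl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send descriptor `fd` over the AF_UNIX socket `sock`. 0 on success. */
int send_fd(int sock, int fd)
{
    char byte = 0;  /* at least one byte of real data must be sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `sock`. Returns the new fd, or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same open file as the sender's, whatever path the sender originally used to open it. That is why the two endpoints being in the same jail matters.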

thanks,

shaya potter