Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751359AbWCXVBw (ORCPT ); Fri, 24 Mar 2006 16:01:52 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751417AbWCXVBw (ORCPT ); Fri, 24 Mar 2006 16:01:52 -0500
Received: from MAIL.13thfloor.at ([212.16.62.50]:3548 "EHLO mail.13thfloor.at")
	by vger.kernel.org with ESMTP id S1751359AbWCXVBv (ORCPT );
	Fri, 24 Mar 2006 16:01:51 -0500
Date: Fri, 24 Mar 2006 22:01:50 +0100
From: Herbert Poetzl
To: "Eric W. Biederman"
Cc: Kirill Korotaev, Dave Hansen, Sam Vilain,
	linux-kernel@vger.kernel.org, OpenVZ developers list,
	"Serge E.Hallyn", Andrew Morton
Subject: Re: [RFC] [PATCH 0/7] Some basic vserver infrastructure
Message-ID: <20060324210150.GA22308@MAIL.13thfloor.at>
Mail-Followup-To: "Eric W. Biederman", Kirill Korotaev, Dave Hansen,
	Sam Vilain, linux-kernel@vger.kernel.org,
	OpenVZ developers list, "Serge E.Hallyn", Andrew Morton
References: <20060321061333.27638.63963.stgit@localhost.localdomain>
	<1142967011.10906.185.camel@localhost.localdomain>
	<44241224.9000200@sw.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.6i
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3889
Lines: 95

On Fri, Mar 24, 2006 at 01:28:49PM -0700, Eric W. Biederman wrote:
> Kirill Korotaev writes:
>
> >
> >> I certainly have not. I do feel that developing this just from the
> >> top down is the wrong way to do this. In some of the preliminary
> >> patches we have found several pieces of code that we will have to
> >> touch that is currently in need of a cleanup. That is why I have
> >> been cleaning up /proc. sysctl is in need of similar treatment
> >> but is in less bad shape.
> > Eric, though I suggest to postpone proc and sysctl a bit, can you share
> > me your vision of /proc and /sysctl virtualization a bit?
> > A good way to handle them IMHO is to make fully virtual, i.e. each
> > namespace should have an own set of sysctl or proc tree.
>
> Roughly I agree. Some cases are easier than others. So let me take
> just the sysvipc case as an example.
>
> My thinking is move the calls for printing the sysvipc namespace
> from fs/proc/generic.c (with all of its cool helpers) to
> fs/proc/base.c.
>
> So we wind up with:
> /proc//sysvipc/msg
> /proc//sysvipc/sem
> /proc//sysvipc/shm
> /proc/sysvipc -> /proc/self/sysvipc
>
> For sysctl we add a method to fetch the address of
> the variable and perhaps a few other attributes,
> that method is passed a task structure.
>
> Then we can have per process instances of:
> /proc//sys/sem
> /proc//sys/shmall
> /proc//sys/shmmax
> /proc//sys/msgmax
> /proc//sys/msgmni
> /proc//sys/shmmni
> And a symlink at:
> /proc/sys that points to /proc//sys
>
> Getting sysvipc to show up in a per process fashion is pretty
> easy. Getting the entire sys hierarchy to show up per process
> is a little harder simply because I think to do it cleanly requires
> helper functions that I don't have yet. I have removed all of
> the internal dependence on magic inode numbers completely removing
> the hard coded inode numbers and putting sys looks doable.
>
> Does that sound like a reasonable model?

hmm, isn't per process a little extreme ...

I know what you want to accomplish but won't this
lead to a per process procfs? and, if you want to
do per process procfs, what would be the gain?

just my opinion ...

best,
Herbert

> >> Part of it is that I have stopped to look more closely at what
> >> other people are doing and to look at alternative implementations.
> > If you need any help with it in OpenVZ, feel free to ask. We have
> > broken-out patches for recent 2.6.16 kernel.
> >
> >> One interesting thing I have managed to do is by using ptrace I have
> >> implemented enter for the existing filesystem namespaces without
> >> having to modify the kernel.
> >> This at least says that enter and
> >> debugging are two faces of the same coin.
> > Hmmm, strange claim/conclusion... /dev/kmem allows to change namespaces
> > also :) and even to obtain root privileges if needed... :)
>
> True. However this is much less ugly than using /dev/kmem, and it is
> much closer to what applications like user mode linux do. The primary
> question in my mind was what should the permissions checks be when
> performing this kind of action. Using ptrace satisfied that.
>
> So I now have a bounding box for what enter should be able to do
> and what permissions it should take.
>
> > Eric, let's not compare approaches with inches :)
> > As you remember your PID namespaces doesn't suit us well... :(
>
> More discussion when the time is right. But I believe I have solved
> the fundamental incompatibility that we had. I asked you a question to
> confirm that a while ago, but I have not heard anything back.
>
> Eric
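
A minimal user-space sketch of the sysctl idea quoted above, where each
entry carries a method that is passed a task structure and returns the
address of the variable. Everything named here (struct task, struct
ipc_namespace, the lookup_* helpers, sysctl_resolve) is a hypothetical
stand-in, not a kernel API; real code would also need locking and
permission checks, which are left out:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-namespace copy of a few sysvipc tunables. */
struct ipc_namespace {
    unsigned long shmmax;
    unsigned long shmall;
    unsigned long msgmni;
};

/* Hypothetical task: only the field this sketch needs. */
struct task {
    int pid;
    struct ipc_namespace *ipc_ns;
};

/*
 * One sysctl entry. Instead of a fixed data pointer it has a method
 * that is passed the task and returns the address of the variable
 * as seen by that task.
 */
struct sysctl_entry {
    const char *name;
    unsigned long *(*lookup)(struct task *t);
};

static unsigned long *lookup_shmmax(struct task *t) { return &t->ipc_ns->shmmax; }
static unsigned long *lookup_shmall(struct task *t) { return &t->ipc_ns->shmall; }
static unsigned long *lookup_msgmni(struct task *t) { return &t->ipc_ns->msgmni; }

static const struct sysctl_entry sysctl_table[] = {
    { "shmmax", lookup_shmmax },
    { "shmall", lookup_shmall },
    { "msgmni", lookup_msgmni },
};

/*
 * Resolve a sysctl name on behalf of a task. Returns the address of
 * the task's own instance of the variable, or NULL for unknown names.
 */
unsigned long *sysctl_resolve(struct task *t, const char *name)
{
    size_t i;

    assert(t != NULL && t->ipc_ns != NULL);
    for (i = 0; i < sizeof(sysctl_table) / sizeof(sysctl_table[0]); i++) {
        if (strcmp(sysctl_table[i].name, name) == 0)
            return sysctl_table[i].lookup(t);
    }
    return NULL;
}
```

With two tasks pointing at different ipc_namespace instances, the same
name resolves to different addresses, so a write through one task's
view leaves the other untouched. Tasks that share a namespace share the
variable, which is one way to read the "isn't per process a little
extreme" concern: the lookup is per task, the storage need not be.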