TH> If it ever becomes a general enough problem (which I extremely
TH> strongly doubt),
Migration of a container? Yeah, it's one of the primary reasons for
doing what we're doing :)
TH> we can think about allowing processes in a netns to change
TH> sequence number but that would be a single setsockopt option
Yeah, well there's more than that, of course, if you want to be able
to checkpoint a socket in any state. Buffers, time-wait, etc.
TH> instead of the horror show of dumping in-kernel data structures in
TH> binary blob.
Well, as should be evident from a review of the code, we don't dump
binary kernel data structures as a general rule. We canonicalize them
into checkpoint headers on the way out and build the new data
structures (or use existing kernel interfaces to do so) on the way in.
You know, just like netlink does.
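
[A minimal sketch of the "canonicalize on the way out, rebuild on the
way in" idea described above. It is not taken from the checkpoint/restart
patches: struct demo_sock, the record layout, CKPT_SOCK_MAGIC, and both
function names are hypothetical and exist only for illustration. The
point is that only fixed-width, pointer-free fields reach the image, and
restart builds a fresh object from them.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an in-kernel object. It holds pointers and
 * layout-dependent fields that must never reach the checkpoint image. */
struct demo_sock {
    struct demo_sock *next;   /* internal linkage, meaningless after restart */
    void *private_data;       /* internal pointer, likewise */
    uint32_t snd_nxt;         /* next sequence number to send */
    uint32_t rcv_nxt;         /* next sequence number expected */
    uint16_t state;           /* connection state */
};

/* Hypothetical fixed-layout record: explicit widths, little-endian,
 * no pointers, no compiler-dependent padding. */
#define CKPT_SOCK_MAGIC 0x434b5054u  /* ASCII "CKPT" as a 32-bit value */
#define CKPT_SOCK_LEN   16u

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)(v >> 24);
}

static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Checkpoint: copy only the fields that define the state into the record.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t ckpt_save_sock(const struct demo_sock *sk, uint8_t *buf, size_t len)
{
    assert(sk != NULL && buf != NULL);
    if (len < CKPT_SOCK_LEN)
        return 0;
    put_le32(buf + 0, CKPT_SOCK_MAGIC);
    put_le32(buf + 4, sk->snd_nxt);
    put_le32(buf + 8, sk->rcv_nxt);
    put_le16(buf + 12, sk->state);
    put_le16(buf + 14, 0);            /* reserved */
    return CKPT_SOCK_LEN;
}

/* Restart: validate the record, then build a fresh object from it.
 * Returns 0 on success, -1 if the record is short or has a bad magic. */
int ckpt_restore_sock(const uint8_t *buf, size_t len, struct demo_sock *sk)
{
    assert(sk != NULL && buf != NULL);
    if (len < CKPT_SOCK_LEN || get_le32(buf) != CKPT_SOCK_MAGIC)
        return -1;
    memset(sk, 0, sizeof(*sk));       /* pointers are never restored */
    sk->snd_nxt = get_le32(buf + 4);
    sk->rcv_nxt = get_le32(buf + 8);
    sk->state = get_le16(buf + 12);
    return 0;
}
```

[In this sketch the image format is decoupled from the in-memory layout:
struct demo_sock can gain, lose, or reorder fields without changing the
16-byte record, which is the property being compared to netlink here.]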
It has even been suggested that we do this with netlink instead, to
mirror the other "horror show" tools that we all use on a daily basis.
We're not opposed to this, but we do have some concerns about
performance.
--
Dan Smith
IBM Linux Technology Center
email: [email protected]
Hello,
On 11/17/2010 04:33 PM, Dan Smith wrote:
> TH> If it ever becomes a general enough problem (which I extremely
> TH> strongly doubt),
>
> Migration of a container? Yeah, it's one of the primary reasons for
> doing what we're doing :)
Well, then push for the feature. If the rationale is strong enough,
it'll get in.
> TH> we can think about allowing processes in a netns to change
> TH> sequence number but that would be a single setsockopt option
>
> Yeah, well there's more than that, of course, if you want to be able
> to checkpoint a socket in any state. Buffers, time-wait, etc.
I haven't really thought about it too deeply, but for all the other misc
states, you should be able to emulate them by talking to a netfilter
module. The reason I suggested a sequence-number-changing setsockopt
option is that it is the only performance-sensitive part, and with
that you should be able to resume live sockets without conntracking.
For cold paths, using a netfilter module during resume should do, right?
> TH> instead of the horror show of dumping in-kernel data structures in
> TH> binary blob.
>
> Well, as should be evident from a review of the code, we don't dump
> binary kernel data structures as a general rule. We canonicalize them
> into checkpoint headers on the way out and build the new data
> structures (or use existing kernel interfaces to do so) on the way in.
> You know, just like netlink does.
Netlink interaction is defined by an ABI.
> It has even been suggested that we do this with netlink instead, to
> mirror the other "horror show" tools that we all use on a daily basis.
> We're not opposed to this, but we do have some concerns about
> performance.
The horror show part is dumping internal data structures without due
scrutiny in a way which can only ever be useful for CR, when most
of the same state is already exported via ABI-defined ways.
Thanks.
--
tejun
On Wed, Nov 17, 2010 at 5:40 PM, Tejun Heo <[email protected]> wrote:
> The horror show part is dumping internal data structures without due
> scrutiny in a way which can only ever be useful for CR, when most
> of the same state is already exported via ABI-defined ways.
That's what the review process is for, isn't it?
Please look at what is being dumped and what isn't.