DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=sender:date:from:to:cc:subject:message-id:references:mime-version
         :content-type:content-disposition:in-reply-to:user-agent;
        b=NYJuULL7+UlcKcCAvXW8r8uVKCZZ+1MhQ906qJ6YMReBBMNXooPpVKPeOeWX7jA8KN
         6dbJ3inJSBSHCiuyfeqtWQnNWA1CuZw8o9Lser7zePu5kUdysLaysVfUskGhlhX00pOH
         VMiMR/9HoV9E2rcO6Yxvbg7O+/nbhjslMalso=
Date: Wed, 9 Feb 2011 15:18:03 +0100
From: Tejun Heo <tj@kernel.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>, jan.kratochvil@redhat.com,
        linux-kernel@vger.kernel.org, torvalds@linux-foundation.org,
        akpm@linux-foundation.org
Subject: Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after
 PTRACE_ATTACH
Message-ID: <20110209141803.GH3770@htj.dyndns.org>
References: <20110204105343.GA12133@htj.dyndns.org>
 <20110204130455.GA3671@redhat.com>
 <20110204144858.GI12133@htj.dyndns.org>
 <20110204170602.GB15301@redhat.com>
 <20110205133926.GM12133@htj.dyndns.org>
 <20110207134235.GB22538@redhat.com>
 <20110207141135.GA16992@htj.dyndns.org>
 <20110207153723.GA27997@redhat.com>
 <20110207163121.GB16992@htj.dyndns.org>
 <20110207174821.GA1237@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110207174821.GA1237@redhat.com>
User-Agent: Mutt/1.5.20 (2009-06-14)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 8502
Lines: 187

Hello, Oleg.

On Mon, Feb 07, 2011 at 06:48:21PM +0100, Oleg Nesterov wrote:
> > I don't know.  Maybe it's more consistent that way and I'm not
> > fundamentally against that but it is a big behavior change
> 
> Hmm. I tried to describe the current behaviour...

We can make it behave like the following.  { | } denotes two
alternative behaviors regarding SIGCONT.

  If a group stop is initiated while, or was in progress when a task
  is ptraced, the task will stop for group stop and notify the ptracer
  accordingly.  Note that the task could be trapped elsewhere delaying
  this from happening.  When the task stops for group stop, it
  participates in group stop as if it is not ptraced and the real
  parent is notified of group stop completion.

  Note that { the task is put into TASK_TRACED state and group stop
  resume by SIGCONT is ignored. | the task is put into TASK_STOPPED
  state and the following PTRACE request will transition it into
  TASK_TRACED.  If SIGCONT is received before transition to
  TASK_TRACED is made, the task will resume execution.  If PTRACE
  request faces with SIGCONT, PTRACE request may fail. }

  The ptracer may resume execution of the task using PTRACE_CONT
  without affecting other tasks in the group.  The task will not stop
  for the same group stop again while ptraced.

  On ptrace detach, if group stop is in effect, the task will be put
  into TASK_STOPPED state and if it is the first time the task is
  stopping for the current group stop, it will participate in group
  stop completion.

This can be phrased better but it seems well defined enough for me.  I
take it that one of your concerns is direct transition into
TASK_TRACED on group stop while ptraced which prevents the tracee from
reacting to the following SIGCONT.  I'm not sure how much of an actual
problem it is given that our notification to real parent hasn't worked
at all till now but we can definitely implement proper TASK_STOPPED ->
TRACED transition on the next PTRACE request.  There exists a
fundamental race condition between SIGCONT and the next PTRACE call
but I don't think it's such a big deal as long as the transition
itself is done properly.

If we don't go that route, another solution would be to add a ptrace
call which can listen to SIGCONT.  ie. PTRACE_WAIT_CONT or whatever
which the ptracer can call once it knows the tracee entered group
stop.

In either case, the fundamentals of ptrace operation don't really
change.  All ptrace operations are still per-task and ptracer almost
always has control over execution of the tracee.  Sure, it allows
ptraced task to escape group stop but it seems defined clear enough
and IMHO actually is a helpful debugging feature.  After all, it's not
like stop signals can be used for absoultely reliable job control.
There's an inherent race against SIGCONT.

> > What do you do about PTRACE requests while a task is group stopped?
> > Reject them?  Block them?
> 
> Yes, another known oddity. Of course we shouldn't reject or block.
> Perhaps we can ignore this initially. If SIGCONT comes after another
> request does STOPPED/TRACED it clears SIGNAL_STOP_STOPPED, but the
> tracee won't run until the next PTRACE_CONT, this makes sense.

That conceptually might make sense but other than the conceptual
integrity it widely changes the assumptions and is less useful than
the current behavior.  I don't really see why we would want to do
that.

> The problem is, gdb can't leave the tracee in STOPPED state if it
> wants. We need to improve this somehow (like in your previous example
> with gdb).
>
> Only if it attaches to every thread in the thread group. Otherwise,
> if the non-thread has already initiated the group-stop, the tracee
> will notice TIF_SIGPENDING eventually and call do_signal_stop(),
> debugger can't control this.

The debugger is still notified and can override it.  gdb already can
and does.

> > I don't think it's an extreme corner case
> > we can break.  For example, if a user gdb's a program which raises one
> > of the stop signals, currently the user expects to be able to continue
> > the program from whithin the gdb.  If we make group stop override
> > ptrace, there's no other recourse than sending signal from outside.
> 
> Yes. Of course, gdb can be "fixed", it can send SIGCONT.
> 
> But yes, this is the noticeable change, that is why I suggested
> ptrace_resume-acts-as-SIGCONT logic. Ugly, yes, but more or less
> compatible. (although let me repeat, _pesonally_ I'd prefer to
> simply tell user-space to learn the new rules ;)

I can't really agree there.  First, to me, it seems like too radical a
change and secondly the resulting behavior might look conceptually
pleasing but is not as useful as the current one.  Why make a change
which results in reduced usefulness while noticeably breaking existing
users?

> > The only thing it achieves is the integrity of group
> > stop
> 
> Given that SIGCHLD doesn't queue and with or without your changes
> we send it per-thread, it is not trivial for gdb to detect the
> group-stop anyway. Again, the kernel should help somehow.

Hmmm?  Isn't this discoverable from the exit code from wait?

> > and I'm not really sure whether that's something worth achieving
> > at the cost of debugging capabilities especially when we don't _have_
> > to lose them.
> 
> But we do not? I mean, at least this is not worse than the current
> behaviour.

I think it's worse.  With your changes, debuggers can't diddle the
tasks behind group stop's back which the current users already expect.

> > > As for SIGCONT. Roland suggests (roughly) to change ptrace_resume()
> > > so that it doesn't wakeup the stopped tracee until the real SIGCONT
> > > comes. And this makes sense to me.
> >
> > I agree it adds more integrity to group stop but at the cost of
> > debugging capabilities.  I'm not yet convinced integrity of group stop
> > is that important.  Why is it such a big deal?
> 
> Of course I can't "prove" it is that important. But I think so.

Heh, I'm asking for proof that it is more useful. :-) But I'm still
curious why you think it's important because the benefits aren't
apparent to me.  Roland and you seem to share this opinion without
much dicussion so maybe I'm missing something?

> > CLD_STOPPED is too but while ptrace is attached the notifications are
> > made per-task and delivered to the tracer.
> 
> No, there is a difference. Sure, CLD_STOPPED is per-process without
> ptrace. But CLD_CONTINUED continues to be per-process even if all
> threads are traced.

Hmm... I need to think more about it.  I'm not fully following your
point.

> > I think this is the key question.  Whether to de-throne
> > PTRACE_CONT such that it cannot override group stop.  As I've already
> > said several times already, I think it is a pretty fundamental
> > property of ptrace
> 
> Again, I am a bit confused. Note that PTRACE_CONT overrides
> group stop if we do the above. It should wake up the tracee, in
> SIGCONT-compatible way (yes, the latter is not exactly clear).

What do you mean?  Waking up in SIGCONT-compatible way?  Sending
SIGCONT ending the whole group stop?

> But at least this should be visible to real parent. We shouldn't
> silently make the stopped tracee running while its real_parent
> thinks everything is stopped.

I think maybe this is where our different POVs come from.  To me, it
isn't too objectionable to allow debuggers to diddle with tracees
behind the real parent's back.  In fact, it would be quite useful when
debugging job control related behaviors.  I wouldn't have much problem
accepting the other way around - ie. strict job control even while
being debugged, but given that it is already allowed and visible, I
fail to see why we should change the behavior.  It doesn't seem to
have enough benefits to warrant such visible change.

If I change the patchset such that group stop while ptraced first
enters TASK_STOPPED and then transitions into TASK_TRACED on the next
PTRACE call, would that be something you guys can agree on?  We would
still need to ask the tracee to move into TASK_TRACED and so there
will be race window which would be visible under very convoluted
situations but IMHO they truly are the extreme corner cases.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/