Message-ID: <5548C59D.60608@hurleysoftware.com>
Date: Tue, 05 May 2015 09:29:01 -0400
From: Peter Hurley
To: Nic Percival, Michael Matz, Kevin Fletcher, Paul Matthews, Chris Purvis
CC: NeilBrown, Greg Kroah-Hartman, Jiri Slaby, "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
References: <20150501162040.05c0cb42@notabene.brown> <5543964C.9030606@hurleysoftware.com> <5548A718.7070305@hurleysoftware.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Stop top-posting.

On 05/05/2015 08:03 AM, Nic Percival wrote:
>
> There is only ever one debuggee process.
> My original demo (and indeed the original test failure) is not threaded. The debugger is multi-threaded.
>
> I've brought in Chris, Fletch and Paul, my immediate colleagues, into the discussion.
>
> The email thread is getting a little tangled, however, from my standpoint I have..
>
> 1) poll tells us we have nothing to read on a pty, when we know something was written into the other end.

You're using a synchronization mechanism (ptrace) to validate an asynchronous
process (tty i/o). That's not going to work.

> 2) Given that 'poll' is not telling us that data has been written into the pty, what can we use? Surely that is what poll is for.

poll() doesn't tell you that nothing has been written.
You're inferring that using a broken understanding of terminal i/o: ttys are
not synchronous pipes.

> 3) If a debuggee program has displayed 'how old are you?' and then hit a breakpoint on the 'ACCEPT' response, then the question might very well not be displayed, despite the debugger sitting on the statement some way subsequent to the display.

Let's extend your logic process here to a general-purpose debugger that can
control all output devices.

1. The debugger and debuggee are running on X-Windows.
2. The debuggee outputs 'how old are you?'
3. The debugger immediately halts the debuggee and all output devices.

The output will not appear on the monitor because X-Windows output is
asynchronous. So is terminal i/o.

> 4) If I understand correctly, the modification is a performance enhancement. Obviously in the case of 'ptrace' debugging, performance is not a requirement.

Nothing obvious about it. Not all uses of ptrace are interactive, and certainly
don't want alternate behavior based on whether the process is ptraced.

> 5) Given 'xterm' uses ptys, could a scenario happen where a user is prompted 'How old are you?' in the xterm, but an input (getchar, whatever) is hit before that output is displayed?

With or without ptrace? Of course. It's called typeahead.

Since tty i/o is buffered, the following is possible:

1. The user types '15\r'
2. The process writes 'How old are you?'
3. The process reads '15\n'

Processes that don't want typeahead call tcflush() before reading input.

Regards,
Peter Hurley

> Thanks,
> Nic
>
>
>
> -----Original Message-----
> From: Peter Hurley [mailto:peter@hurleysoftware.com]
> Sent: 05 May 2015 12:19
> To: Nic Percival; Michael Matz
> Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
>
>
> A: No.
> Q: Should I include quotations after my reply?
>
> http://daringfireball.net/2007/07/on_top
>
>
> On 05/05/2015 04:20 AM, Nic Percival wrote:
>> Michael is correct.
>> Our COBOL debugger has a test feature whereby we can drive it to step through debugging code, hitting breakpoints and so on.
>> The debugger maintains a 'user screen' which is what the 'debuggee' process has displayed.
>> This is communicated to the debugger with pseudo-ttys.
>> The state of this user screen is checked as part of this (and other) tests.
>
> So the debugger doesn't display output from other non-TRACEME threads or child processes of the debuggee, right?
>
> When that's fixed, you'll see that the "test failure" has gone away.
>
>> The actual test failure is a failure of some text to be displayed on the debuggee user screen when we know, given it has hit a certain breakpoint, that the text has been written.
>>
>> What is worse is it's non-deterministic.
>
> That your test is non-deterministic stems from the fact that the i/o is asynchronous.
>
> You would experience the same problem if your test setup was a tty in loopback.
>
>> Sometimes the text makes it and is displayed, so it wouldn't even be practical to modify the test to make it pass.
>> We wouldn't really want to do that anyway - the test is just fine on other earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris.
>
> There is a reason Linux is the platform of choice for scalability.
>
> Regards,
> Peter Hurley
>
>> -----Original Message-----
>> From: Michael Matz [mailto:matz@suse.de]
>> Sent: 04 May 2015 13:24
>> To: Peter Hurley
>> Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby;
>> linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
>>
>> Hi,
>>
>> On Fri, 1 May 2015, Peter Hurley wrote:
>>
>>> I don't think this is a real bug, in the sense that pty i/o is not
>>> synchronous, in the same way that tty i/o is not synchronous.
>>
>> Here's what I wrote internally about my speculations about this being a bug or not:
>>
>>>> I also never hit it with pipes (remove the USEPTY define), also not
>>>> on sle12, so it must be some change specific to the pty implementation.
>>>>
>>>> Now, all of this is of course unspecified. There are two
>>>> asynchronous processes involved, and a buffered tube between them.
>>>> Just because one process filled one end of the tube (the breakpoint
>>>> was hit) doesn't mean the contents have to appear at that instant at
>>>> the other end. So the change in behaviour in sle12 is not a genuine
>>>> bug. It _might_ be an unintended change, though, that's why kernel
>>>> people should comment on this. If there are no terribly good
>>>> reasons for this change I'd consider it a quality-of-implementation
>>>> regression in sle12.
>>
>> So, I'd accept this being declared a non-bug, but it is certainly a change in behaviour that's visible for our debugger team.
>>
>>> However, that said, if this is a regression (regression as in "it
>>> broke something that used to work", not regression as in "this new
>>> thing I'm writing doesn't behave the way I want it to" :) )
>>>
>>> Help me understand the use-case here: are you using pty i/o to debug
>>> the debugger?
>>
>> Nic is working on the Cobol debugger, but I think this pty i/o is rather a part of the normal interaction between a debugged Cobol process and the debugger; that's just a theory, Nic is authoritative here. But this change in behaviour _did_ result in real testsuite regressions, so it's not something that he wanted to write from scratch.
>>
>> (FWIW: I do think it's a better QoI factor if something returns data
>> from a tube if we can know via side channels (break points) that
>> something must have been written locally to the other end of the tube,
>> if that can be ensured without too much other work)
>>
>>
>> Ciao,
>> Michael.
>>
>>
>> This message has been scanned for malware by Websense.
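
Two points from the reply at the top of this message lend themselves to a short illustration: pty i/o is asynchronous, so a reader has to wait for output rather than conclude from one empty poll() that nothing was written, and a process that does not want typeahead can call tcflush() before reading. The Python sketch below is only an illustration under stated assumptions. The helper names read_until and discard_typeahead, the timeout values, and the idea of applying this in a test harness are hypothetical and do not come from the thread; the underlying calls (select.poll, os.read, termios.tcflush) are standard library wrappers around the POSIX interfaces.

```python
import os
import select
import termios
import time


def read_until(fd, expected, timeout=1.0):
    """Collect bytes from fd until `expected` shows up or `timeout` seconds pass.

    Hypothetical helper: it treats "nothing readable yet" as "keep waiting",
    because data written to the other end of a pty may not be visible at once.
    """
    deadline = time.monotonic() + timeout
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    data = b""
    while expected not in data:
        remaining_ms = int((deadline - time.monotonic()) * 1000)
        if remaining_ms <= 0:
            break  # gave up waiting; the caller decides what that means
        if not poller.poll(remaining_ms):
            continue  # nothing visible yet; this does not prove nothing was written
        try:
            chunk = os.read(fd, 4096)
        except OSError:
            break  # for example EIO on a pty master once the slave side is closed
        if not chunk:
            break  # end of file
        data += chunk
    return data


def discard_typeahead(fd):
    """Drop input that the terminal has received but the process has not read."""
    termios.tcflush(fd, termios.TCIFLUSH)
```

As a usage sketch, after master, slave = os.openpty() and os.write(slave, b"how old are you?"), calling read_until(master, b"how old are you?") waits up to the timeout for the text to arrive on the master side. Calling discard_typeahead(slave) before prompting drops any input that was queued earlier. Whether a bounded wait of this kind suits the debugger tests discussed in the thread is an open question that the thread does not settle.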