Date: Tue, 21 Mar 2023 16:36:49 +0100
From: Petr Mladek
To: John Ogness
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	linux-kernel@vger.kernel.org, Greg Kroah-Hartman
Subject: Re: simplify: was: Re: [PATCH printk v1 06/18] printk: nobkl: Add acquire/release logic
References: <20230302195618.156940-1-john.ogness@linutronix.de>
	<20230302195618.156940-7-john.ogness@linutronix.de>

On Fri 2023-03-17 18:34:30, Petr Mladek wrote:
> Hi,
>
> I am sending this before reading today's answers about the basic rules.
>
> I have spent a few days on this answer and I do not want to delay it
> indefinitely. It documents my initial feelings about the code.
> Also it describes some ideas that might or might not be useful anyway.
>
> Also there is a POC that slightly modifies the logic. But the basic
> approach remains the same.

I looked at this with a "fresh" mind. I thought about whether there is
any real advantage in the proposed change of the cons_release() logic,
namely to just clear .cpu and .cur_prio and let cons_try_acquire() take
over the lock. I tried to describe my view below.
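For reference, the release side I have in mind would be something like
the sketch below. It is only an illustration built on the helpers and
fields visible in the quoted patch (cons_state_read(),
cons_state_try_cmpxchg(), .cpu, .cur_prio, .req_prio); the name
__cons_release_simple() is made up here, it assumes that
cons_state_try_cmpxchg() refreshes "old" on failure like try_cmpxchg(),
and it is untested:

/*
 * Sketch only: release CUR by clearing the ownership fields while
 * keeping .req_prio, so that the waiter in cons_try_acquire() can
 * finish the handover itself.
 */
static bool __cons_release_simple(struct cons_context *ctxt)
{
	struct console *con = ctxt->console;
	struct cons_state old;
	struct cons_state new;

	cons_state_read(con, CON_STATE_CUR, &old);
	do {
		/* Bail out when the lock was already taken away from us. */
		if (!cons_state_bits_match(old, ctxt->state))
			return false;

		/* Keep seq and .req_prio, drop only the ownership fields. */
		new.atom = old.atom;
		new.cpu = 0;
		new.cur_prio = 0;
	} while (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &old, &new));

	return true;
}

The waiter would then find CUR free with .req_prio still set to its own
priority and would complete the handover on its side, see the second
sketch further below.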
> > +/**
> > + * __cons_release - Release the console after output is done
> > + * @ctxt:	The acquire context that contains the state
> > + *		at cons_try_acquire()
> > + *
> > + * Returns:	True if the release was regular
> > + *
> > + *		False if the console is in unusable state or was handed over
> > + *		with handshake or taken over hostile without handshake.
> > + *
> > + * The return value tells the caller whether it needs to evaluate further
> > + * printing.
> > + */
> > +static bool __cons_release(struct cons_context *ctxt)
> > +{
> > +	struct console *con = ctxt->console;
> > +	short flags = console_srcu_read_flags(con);
> > +	struct cons_state hstate;
> > +	struct cons_state old;
> > +	struct cons_state new;
> > +
> > +	if (WARN_ON_ONCE(!(flags & CON_NO_BKL)))
> > +		return false;
> > +
> > +	cons_state_read(con, CON_STATE_CUR, &old);
> > +again:
> > +	if (!cons_state_bits_match(old, ctxt->state))
> > +		return false;
> > +
> > +	/* Release it directly when no handover request is pending. */
> > +	if (!old.req_prio)
> > +		goto unlock;
> > +
> > +	/* Read the handover target state */
> > +	cons_state_read(con, CON_STATE_REQ, &hstate);
> > +
> > +	/* If the waiter gave up hstate is 0 */
> > +	if (!hstate.atom)
> > +		goto unlock;
> > +
> > +	/*
> > +	 * If a higher priority waiter raced against a lower priority
> > +	 * waiter then unlock instead of handing over to either. The
> > +	 * higher priority waiter will notice the updated state and
> > +	 * retry.
> > +	 */
> > +	if (hstate.cur_prio != old.req_prio)
> > +		goto unlock;

The above check might cause CUR to be completely unlocked even when
there is a request. It is a corner case. It would happen when a higher
priority context is in the middle of overriding an older request (it
has already taken REQ but has not updated CUR.req_prio yet). As a
result, any context might take CUR while the higher priority context
is re-starting the request and trying to get the lock with the
updated CUR.

It is a bit of a pity but it is not the end of the world. The higher
priority context would just need to wait for another context.

That said, my proposal would solve this in a slightly cleaner way.
CUR would stay blocked for the .req_prio context. As a result, the
REQ owner that is being overridden would become the CUR owner. And
the higher priority context would then need to set up a new REQ
against the previous REQ owner.

> > +
> > +	/* Switch the state and preserve the sequence on 64bit */
> > +	copy_bit_state(new, hstate);
> > +	copy_seq_state64(new, old);
> > +	if (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &old, &new))
> > +		goto again;

The other difference is that the above code does just half of the
request-related manipulation. It assigns CUR to the REQ owner. The
REQ owner then needs to realize that it got the lock and clean up
the REQ part.

Or in other words, there are 3 pieces of information:

  + The CUR owner is defined by CUR.cpu and CUR.cur_prio.
  + The REQ owner is defined by REQ.cpu and REQ.cur_prio.
  + CUR knows about the request via CUR.req_prio.

The current code modifies the pieces in this order:

CPU0					CPU1

// take a free lock
set CUR.cpu
set CUR.cur_prio

					// set request
					set REQ.cpu
					set REQ.cur_prio

					// notify CUR
					set CUR.req_prio

// re-assign the lock to CPU1
set CUR.cpu = REQ.cpu
set CUR.cur_prio = REQ.cur_prio
set CUR.req_prio = 0

					// clean REQ
					REQ.cpu = 0;
					REQ.cur_prio = 0;

In this case, CPU0 has to read REQ and do a job for CPU1.

Instead, my proposal does:

CPU0					CPU1

// take a free lock
set CUR.cpu
set CUR.cur_prio

					// set request
					set REQ.cpu
					set REQ.cur_prio

					// notify CUR
					set CUR.req_prio

// unlock CPU0
set CUR.cpu = 0
set CUR.cur_prio = 0
keep CUR.req_prio == REQ.cur_prio

					// take the lock and clean notification
					set CUR.cpu = REQ.cpu
					set CUR.cur_prio = REQ.cur_prio
					set CUR.req_prio = 0

					// clean REQ
					REQ.cpu = 0;
					REQ.cur_prio = 0;

In this case:

  + CPU0: It manipulates only CUR. And it keeps the CUR.req_prio value.
    It does not check REQ at all.

  + CPU1: It manipulates all the REQ-related variables and fields.
    It modifies CUR.cpu and CUR.cur_prio only when they are free,
    see the sketch below.
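Just to illustrate CPU1's side, something like the following could sit
in cons_try_acquire(). Again, this is only a sketch: the helper name
cons_take_reserved_cur() is invented here, "req" stands for whatever
state the waiter stored into REQ earlier, CUR.cpu == 0 is used as
"free" exactly as in the diagram above, and the code is untested:

/*
 * Sketch only: the REQ owner notices that CUR was released but still
 * carries its .req_prio, takes the lock and then cleans up REQ itself.
 */
static bool cons_take_reserved_cur(struct console *con, struct cons_state req)
{
	struct cons_state cur;
	struct cons_state new;
	struct cons_state none = { };

	cons_state_read(con, CON_STATE_CUR, &cur);
	do {
		/* CUR must be free and still reserved for our priority. */
		if (cur.cpu || cur.cur_prio || cur.req_prio != req.cur_prio)
			return false;

		new.atom = cur.atom;		/* keep seq and other bits */
		new.cpu = req.cpu;		/* CUR.cpu = REQ.cpu */
		new.cur_prio = req.cur_prio;	/* CUR.cur_prio = REQ.cur_prio */
		new.req_prio = 0;		/* the notification is consumed */
	} while (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &cur, &new));

	/* The handover is complete, clean up REQ. */
	cons_state_try_cmpxchg(con, CON_STATE_REQ, &req, &none);
	return true;
}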
It looks a bit cleaner. Also it might help when thinking about the
barriers because each side touches only its own variables and fields.
We might need fewer of the explicit barriers that would otherwise be
needed when one CPU does a change on behalf of the other.

My view:

I would prefer to do the logic change. It might help with the review
and also with the long-term maintenance. But I am not 100% sure that
it is worth it.

The original approach might be good enough. The important thing is
that it modifies the CUR and REQ variables and fields in the right
order. And I do not see any chicken-and-egg problems. Also the
barriers should be doable.

Best Regards,
Petr