Subject: Re: [1/1] CBUS: new very fast (for insert operations) message bus based on kernel connector.
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-kernel@vger.kernel.org, guillaume.thouvenin@bull.net, jlan@engr.sgi.com, efocht@hpce.nec.com, linuxram@us.ibm.com, gh@us.ibm.com, elsa-devel@lists.sourceforge.net, greg@kroah.com
Date: Fri, 01 Apr 2005 14:16:55 +0400

On Fri, 2005-04-01 at 01:25 -0800, Andrew Morton wrote:
> Evgeniy Polyakov wrote:
> >
> > On Thu, 2005-03-31 at 23:59 -0800, Andrew Morton wrote:
> > > Evgeniy Polyakov wrote:
> > > >
> > > > On Thu, 2005-03-31 at 23:26 -0800, Andrew Morton wrote:
> > > > > Evgeniy Polyakov wrote:
> > > > > >
> > > > > > > > +static int cbus_event_thread(void *data)
> > > > > > > > +{
> > > > > > > > +	int i, non_empty = 0, empty = 0;
> > > > > > > > +	struct cbus_event_container *c;
> > > > > > > > +
> > > > > > > > +	daemonize(cbus_name);
> > > > > > > > +	allow_signal(SIGTERM);
> > > > > > > > +	set_user_nice(current, 19);
> > > > > > >
> > > > > > > Please use the kthread api for managing this thread.
> > > > > > >
> > > > > > > Is a new kernel thread needed?
> > > > > >
> > > > > > The logic behind CBUS is the following:
> > > > > > 1. make the insert operation return as soon as possible,
> > > > > > 2. defer the actual message delivery to a safe time.
> > > > > >
> > > > > > That thread does the second point.
> > > > >
> > > > > But does it need a new thread rather than using the existing keventd?
> > > >
> > > > Yes, it is much cleaner [especially from a performance tuning point of
> > > > view] to use its own kernel thread than to postpone all the work to the
> > > > shared work queue.
> > >
> > > Why?  Unless keventd is off doing something else (rare), it should be
> > > exactly equivalent.  And if keventd _is_ off doing something else then that
> > > will slow down this kernel thread too, of course.
> >
> > keventd does very heavy jobs on some of my test machines, which,
> > for example, route large amounts of traffic.
>
> As I said - that's going to cause _your_ kernel thread to be slowed down as
> well.
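[The keventd variant under discussion would be roughly the following - only
a sketch, where cbus_process_events() stands in for the real delivery
routine, which is not shown in this thread; the work_struct API is the
three-argument 2.6-era form:]

#include <linux/workqueue.h>

/* Stand-in for the real CBUS delivery routine. */
static void cbus_process_events(void)
{
	/* drain and deliver whatever was queued since the last run */
}

static void cbus_work_handler(void *data)
{
	cbus_process_events();
}

static DECLARE_WORK(cbus_work, cbus_work_handler, NULL);

/* Called from the insert path: cheap, it only kicks keventd. */
static void cbus_kick(void)
{
	schedule_work(&cbus_work);
}

The insert path then pays only for the schedule_work() call, and delivery
is multiplexed through keventd's run_workqueue() loop.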
Yes, but it does not solve the peak performance issue - all scheduled jobs
can run one after another, which will decrease insert performance.

> I mean, it's just a choice between two ways of multiplexing the CPU.  One
> is via a context switch in schedule() and the other is via list traversal
> in run_workqueue().  The latter will be faster.

But with a separate thread one can control the execution process; if
delivery is called from the work queue, insert requests can appear one
after another within a very short interval, so their processing will hurt
insert performance.

> > > Plus keventd is thread-per-cpu and quite possibly would be faster.
> >
> > I experimented with several usage cases for CBUS, and the fastest one
> > proved to be when only one sending thread exists, which manages only a
> > very limited number of messages at a time [like 10 in CBUS currently]
>
> Maybe that's because the cbus data structures are insufficiently
> percpuified.  On really big machines that single kernel thread will be a
> big bottleneck.

It is not because of the messages themselves, but because of their peaks:
if there is a peak, the above mechanism will smooth it into several pieces
[for example 10 in each bundle; that value should be changeable at
run-time - I will place it into the TODO]. With keventd there is no
guarantee that the next peak will be processed after some timeout rather
than right after the current one.

> > without direct awakening [that is why wake_up() is commented in
> > cbus_insert()].
>
> You mean the
>
>	interruptible_sleep_on_timeout(&cbus_wait_queue, 10);
>
> ? (That should be HZ/100, btw).
>
> That seems a bit kludgy - someone could queue 10000 messages in that time,
> although they'll probably run out of memory first, if it's doing
> GFP_ATOMIC.

The GFP_ATOMIC issues will be resolved first.

> Introducing an up-to-ten millisecond latency seems a lot worse than some
> reduction in peak bandwidth - it's not as if pumping 100000 events/sec is a
> likely use case.  Using prepare_to_wait/finish_wait will provide some
> speedup in SMP environments due to reduced cacheline transfers.

That is the question, actually... If we allow peak processing, then we
will _definitely_ have insert performance degradation; it was observed in
my tests. The main goal of CBUS was exactly insert speed, so it must
somehow smooth out performance peaks, and thus the above budget was
introduced. It is similar to NAPI in some abstract way, but with different
aims - NAPI is for speed improvement, while here we want peak smoothness.

> > If too many deferred insert works are called simultaneously
> > [which may happen with keventd], it will slow down insert operations
> > noticeably.
>
> What is a "deferred insert work"?  Insertion is always synchronous?

Insert is synchronous on one CPU, but the actual message delivery is
deferred.

> > I did not try that case with the keventd but with one kernel thread
> > it was tested and showed worse performance.
>
> But your current implementation has only one kernel thread?

It has a budget and a timeout between processing each bundle. keventd does
not allow creating such a timeout between bundles.
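[To make the budget/timeout point concrete, the delivery loop converted to
the kthread API and to prepare_to_wait()/finish_wait(), as suggested
above, would look roughly like this - only a sketch; CBUS_BUDGET and
cbus_process_one_bundle() are illustrative names, not taken from the
actual patch:]

#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/wait.h>

#define CBUS_BUDGET	10	/* messages delivered per wakeup */

static DECLARE_WAIT_QUEUE_HEAD(cbus_wait_queue);

/* Stand-in: deliver at most "budget" queued messages. */
static void cbus_process_one_bundle(int budget)
{
	/* walk the event containers, delivering up to "budget" messages */
}

static int cbus_event_thread(void *data)
{
	DEFINE_WAIT(wait);

	while (!kthread_should_stop()) {
		cbus_process_one_bundle(CBUS_BUDGET);

		/*
		 * Sleep between bundles so that a burst of inserts is
		 * smoothed out instead of being delivered back to back.
		 * An insert may still wake the thread early via
		 * wake_up(&cbus_wait_queue).
		 */
		prepare_to_wait(&cbus_wait_queue, &wait, TASK_INTERRUPTIBLE);
		if (!kthread_should_stop())
			schedule_timeout(HZ / 100);
		finish_wait(&cbus_wait_queue, &wait);
	}
	return 0;
}

/* Started once at init time: kthread_run(cbus_event_thread, NULL, "cbus"); */

Even if thousands of messages are queued during the HZ/100 sleep, only
CBUS_BUDGET of them are delivered per wakeup - which is exactly the peak
smoothing described above.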
-- 
Evgeniy Polyakov

Crash is better than data corruption -- Arthur Grabowski