2005-05-31 10:28:52

by Lars Marowsky-Bree

Subject: Re: [Clusters_sig] Re: [ANNOUNCE] Linux Cluster Summit 2005

On 2005-05-21T09:29:01, Robert Wipfel <[email protected]> wrote:

> outside looking in, is web services and grid. Returning to the
> reality of many vendor's enterprise* business, the sweet spot for h/a
> clusters still seems to be somewhere around ~8 dual-CPU nodes with
> many customers deploying multiple similar clusters. Nodes are never in
> multiple clusters at once, rather, individual nodes are members of a
> cluster and that cluster might be a member of a cluster of clusters.

A single node must be big enough to support sane load balancing; ie, big
enough to run at least one (or more) "whole" resource entities / jobs.

That is the breaking point after which it is more sensible to deploy
more nodes - with looser coupling - than making a single node / SSI
component larger, because decoupled operation means less complexity for
fault isolation.


Sincerely,
Lars Marowsky-Brée <[email protected]>

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge"
 -- Charles Darwin


2005-05-31 19:30:19

by David Nicol

Subject: Common Cluster Infrastructure discussion

On 5/31/05, Lars Marowsky-Bree <[email protected]> wrote:
> On 2005-05-21T09:29:01, Robert Wipfel <[email protected]> wrote:
>
> > outside looking in, is web services and grid. Returning to the
> > reality of many vendor's enterprise* business, the sweet spot for h/a
> > clusters still seems to be somewhere around ~8 dual-CPU nodes with
> > many customers deploying multiple similar clusters. Nodes are never in
> > multiple clusters at once, rather, individual nodes are members of a
> > cluster and that cluster might be a member of a cluster of clusters.

To restate the proposal in response to this distinction, that "Nodes are
never in multiple clusters at once, rather, individual nodes are members
of a cluster and that cluster might be a member of a cluster of
clusters": in the language of the CCI proposal, one or more nodes in a
subcluster would join the larger cluster, but not all of them, and these
liaison nodes would handle the communications between this subcluster
and other subclusters. Using CCI, different subclusters in this grid
could run different clustering frameworks, or might be lone boxes in the
supercluster that aren't actually representing clusters.

The idea is to implement a cluster being a member of a cluster of
clusters with the same interface that is used to manage a node being a
member of a cluster.
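
To illustrate the "same interface" point, here is a minimal sketch in
Python. It is not taken from the CCI proposal; every class and method
name below (Member, Node, Cluster, join, leaf_nodes) is hypothetical.
It only shows that a supercluster can treat a subcluster and a lone box
the same way.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Member(ABC):
    """Anything that can join a cluster: a lone box or a whole subcluster."""

    def __init__(self, member_id: str) -> None:
        self.member_id = member_id

    @abstractmethod
    def leaf_nodes(self) -> list[str]:
        """Return the IDs of the physical nodes behind this member."""


class Node(Member):
    """A single machine."""

    def leaf_nodes(self) -> list[str]:
        return [self.member_id]


class Cluster(Member):
    """A cluster is itself a Member, so it can join a larger cluster."""

    def __init__(self, member_id: str) -> None:
        super().__init__(member_id)
        self.members: dict[str, Member] = {}

    def join(self, member: Member) -> None:
        # The same call admits a node or an entire subcluster.
        self.members[member.member_id] = member

    def leave(self, member_id: str) -> None:
        self.members.pop(member_id, None)

    def leaf_nodes(self) -> list[str]:
        nodes: list[str] = []
        for member in self.members.values():
            nodes.extend(member.leaf_nodes())
        return nodes


def build_example() -> Cluster:
    """Build a supercluster holding one subcluster and one lone box."""
    sub = Cluster("subcluster-a")
    sub.join(Node("a1"))
    sub.join(Node("a2"))
    grid = Cluster("supercluster")
    grid.join(sub)
    grid.join(Node("lone-box"))
    return grid
```

The supercluster sees two members, "subcluster-a" and "lone-box", and
does not need to know which of them is really a cluster.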

I take away from LMB's remark a requirement that the CCI must support
in-cluster selection/election for a presented service, so that the
liaison nodes, which could be all nodes or a subset of the nodes in the
subcluster, could present themselves as a coherent authority to the
other members of the supercluster, over the channels defined by the
supercluster, representing a single node identifier in the supercluster.
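
A sketch of that requirement, in Python. The election rule here (lowest
ID among the live liaison nodes) is purely an assumption chosen for
brevity; the proposal does not name an algorithm, and a real one would
also need the nodes to agree on who is alive. The names Subcluster,
external_id and elect_speaker are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Subcluster:
    # The single node identifier the supercluster sees for this subcluster.
    external_id: str
    # Nodes eligible to speak for the subcluster (all nodes or a subset).
    liaisons: set[str] = field(default_factory=set)
    # Nodes currently believed to be up (assumed to be agreed on already).
    alive: set[str] = field(default_factory=set)

    def elect_speaker(self) -> Optional[str]:
        """Pick the live liaison with the lowest ID, or None if none is up."""
        candidates = self.liaisons & self.alive
        return min(candidates) if candidates else None
```

Whichever node is elected, the other members of the supercluster keep
addressing external_id; only the node behind it changes:

```python
sc = Subcluster("sub-a", liaisons={"n1", "n2"}, alive={"n1", "n2", "n3"})
speaker = sc.elect_speaker()      # "n1"
sc.alive.discard("n1")            # n1 fails
speaker = sc.elect_speaker()      # "n2", still presented as "sub-a"
```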

We want in-cluster communications to be through the CCI rather than
through an implementation detail (such as TCP/IP) because we do not want
to confuse communications by associating node identification with any
artifact, such as IP address, which could be broken by an architecture
change, or even by a failover.
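
One way to picture this, as a sketch only: callers address a logical
node ID, and a directory maps that ID to whatever transport address is
current. NodeDirectory, bind, resolve and send are hypothetical names,
not part of any CCI draft, and the addresses are made-up examples.

```python
from __future__ import annotations


class NodeDirectory:
    """Maps stable logical node IDs to replaceable transport addresses."""

    def __init__(self) -> None:
        self._addresses: dict[str, tuple[str, int]] = {}

    def bind(self, node_id: str, address: tuple[str, int]) -> None:
        # Rebinding an existing ID models a failover or re-addressing.
        self._addresses[node_id] = address

    def resolve(self, node_id: str) -> tuple[str, int]:
        return self._addresses[node_id]


def send(directory: NodeDirectory, node_id: str, payload: bytes):
    """Return (address, payload) as a stand-in for an actual transmission."""
    return directory.resolve(node_id), payload
```

After a failover the directory entry is rebound, and code that sends to
the logical ID does not change:

```python
directory = NodeDirectory()
directory.bind("sub-a", ("10.0.0.5", 7000))
send(directory, "sub-a", b"hello")          # goes to 10.0.0.5
directory.bind("sub-a", ("10.0.0.6", 7000))  # failover
send(directory, "sub-a", b"hello")          # same ID, now 10.0.0.6
```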

> A single node must be big enough to support sane load balancing; ie, big
> enough to run at least one (or more) "whole" resource entities / jobs.

OTOH, the grain size of a "whole job" can be tuned to fit the reality
of your hardware.

Hopefully the CCI will provide useful metrics about node capability and
current load to allow apples-to-apples comparisons for making better
load balancing decisions in heterogeneous clusters.
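
A sketch of what "apples-to-apples" could mean in practice, assuming
each node reports a capacity and a current load in the same
(hypothetical) work units. The metric names and the placement rule are
my assumptions for illustration, not anything the CCI defines.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeMetrics:
    node_id: str
    capacity: float  # work units the node can sustain (hypothetical unit)
    load: float      # work units currently assigned, same unit


def pick_target(nodes: list[NodeMetrics], job_cost: float) -> Optional[NodeMetrics]:
    """Pick the node with the lowest utilization after taking the job.

    Nodes that cannot hold a whole job are skipped, which matches the
    point that a node must be big enough to run at least one whole job.
    """
    fits = [n for n in nodes if n.load + job_cost <= n.capacity]
    if not fits:
        return None
    return min(fits, key=lambda n: (n.load + job_cost) / n.capacity)
```

Comparing raw load alone would always favor the small box below, since
2 < 60. Normalizing by capacity sends a job of cost 5 to the big node
instead (65% utilization versus 70%):

```python
big = NodeMetrics("big", capacity=100.0, load=60.0)
small = NodeMetrics("small", capacity=10.0, load=2.0)
target = pick_target([big, small], job_cost=5.0)   # big
```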



> "Ignorance more frequently begets confidence than does knowledge"

very good - I know it does for me!

David L Nicol
Proudly ignorant of many and much