From: Thomas Schoebel-Theuer
To: linux-kernel@vger.kernel.org, tst@schoebel-theuer.de
Subject: [RFC 00/31] Current state of MARS
Date: Thu, 31 Dec 2015 12:35:55 +0100

MARS Light is an asynchronous replication system for block storage over
long distances [1,2,3]. It is the base for HA over long distances (more
than 50km) and over network bottlenecks, e.g. high / varying packet loss
rates.

The out-of-tree (OOT) version of MARS (branch light0.1.y) is in
production at 1&1 Internet SE, currently on more than 1600 servers of
various types with more than 6 petabytes of total storage. It has
collected more than 6 million operating hours with fewer than 10
customer-visible incidents caused by MARS (most of them in the early
beta phase).

From my last LKML posting in 2014, I got the following major TODOs from
you:

1) no kernel prepatch anymore, no additional EXPORT_SYMBOL()s anywhere
   in the rest of the kernel.

   => accomplished in the enclosed patchset (however not yet fully
      bug-free). See also branches WIP-PORTABLE, WIP-BASE and
      WIP-PROPOSED-UPSTREAM at [1] for some out-of-tree (OOT) versions
      of MARS.

   => this means there will be zero impact from MARS on the rest of the
      kernel when MARS is not used. Additionally I placed it into
      drivers/staging/ (configurable via migration script
      ./rework-mars-for-upstream.pl ).
2) replace the symlink tree by something else.

   => not yet done, due to lack of time. Sorry. I am planning to work on
      it in the forthcoming year, and hope to get enough time for it.

I think the attached patchset is not yet ready for submission, but
please help me by giving early feedback. I would be very glad if some
experienced upstream hacker would mentor me and help me at various
points.

Here are some of my thoughts about further development:

In future, I would like to store the old symlink information (list of
key => value pairs) in status files instead, one instance for each
resource, and one for global configuration data (notice that the
information _needs_ to be persistent for coping with node failures, and
that dynamic storage is needed anyway for the large number of
transaction logfiles). This should enhance scalability in the
distributed system.

Here is a use case: almost our full 1&1 webhosting machine park has been
migrated from DRBD to MARS, retaining the conventional pair structure.
Thus more than 800 DRBD clusters have been turned into MARS clusters. In
future, the whole 1&1 machine park could collapse into a _single_ big
cluster consisting of thousands of machines. This has some advantages,
not limited to greater flexibility with dynamic join-resource /
leave-resource operations throughout that big cluster.

For example, userspace can more easily detect failures cluster-wide and
datacenter-wide, because the Lamport Clock algorithm used by MARS [2,3]
is already a kind of "heartbeat", even remembering the timestamp of the
last successful information exchange with each peer. Thus it should be
relatively easy to implement quorum algorithms etc. on top of it in
userspace.

Note that I am currently paid by 1&1 to ensure a seamless upgrade path
for servers to any new data format, without downtime (other than
temporarily switching the primary roles).
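
To illustrate the heartbeat / quorum idea mentioned above, here is a
minimal userspace sketch. It uses the textbook Lamport clock, not the
actual MARS implementation, and all names (LamportClock, PeerTable,
has_quorum, the timeout parameter) are hypothetical and chosen only for
this example. It assumes that each received message carries the sender's
clock stamp and that "alive" simply means "heard from within a timeout".

```python
import time
from dataclasses import dataclass, field


@dataclass
class LamportClock:
    """Textbook logical clock (illustration only, not the MARS code)."""
    now: int = 0

    def tick(self) -> int:
        # Local event: advance by one.
        self.now += 1
        return self.now

    def receive(self, remote_stamp: int) -> int:
        # Message received: jump past the larger of both clocks.
        self.now = max(self.now, remote_stamp) + 1
        return self.now


@dataclass
class PeerTable:
    """Remembers when each peer was last heard from (hypothetical helper)."""
    clock: LamportClock = field(default_factory=LamportClock)
    last_seen: dict = field(default_factory=dict)  # peer name -> seconds

    def on_message(self, peer, remote_stamp, wall_time=None):
        # Every successful exchange doubles as a heartbeat.
        self.clock.receive(remote_stamp)
        self.last_seen[peer] = time.time() if wall_time is None else wall_time

    def alive(self, wall_time, timeout):
        # Peers heard from within the last `timeout` seconds.
        return {p for p, t in self.last_seen.items()
                if wall_time - t <= timeout}

    def has_quorum(self, cluster_size, wall_time, timeout):
        # Count this node plus reachable peers; require a strict majority.
        return len(self.alive(wall_time, timeout)) + 1 > cluster_size // 2
```

A node in a 3-node cluster that has recently heard from one peer would
report a quorum here (2 of 3), while the same node in a 5-node cluster
would not (2 of 5). A real implementation would also have to deal with
clock skew and with persisting the table across reboots.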
If you don't want the old symlink format in the upstream kernel, I would
first implement the new format OOT (and strip out the old code for
submission). Otherwise, I would agree to do this in-tree, hopefully with
help and advice from some interested kernel hackers.

Cheers and a happy new year,

Thomas

[1] https://github.com/schoebel/mars
[2] https://github.com/schoebel/mars/blob/master/docu/MARS_Froscon2015.pdf
[3] https://github.com/schoebel/mars/blob/master/docu/mars-manual.pdf

Thomas Schoebel-Theuer (31):
  mars: add new module lamport
  mars: add new module brick_say
  mars: add new module brick_mem
  mars: add new module brick_checking
  mars: add new module meta
  mars: add new module brick
  mars: add new module lib_pairing_heap
  mars: add new module lib_queue
  mars: add new module lib_rank
  mars: add new module lib_limiter
  mars: add new module lib_timing
  mars: add new module vfs_compat
  mars: add new module xio
  mars: add new module xio_net
  mars: add new module lib_mapfree
  mars: add new module lib_log
  mars: add new module xio_bio
  mars: add new module xio_sio
  mars: add new module xio_client
  mars: add new module xio_if
  mars: add new module xio_copy
  mars: add new module xio_trans_logger
  mars: add new module xio_server
  mars: add new module light_strategy
  mars: add new module light_net
  mars: add new module light_server_strategy
  mars: add new module mars_proc
  mars: add new module mars_light
  mars: add new module Makefile
  mars: add new module Kconfig
  mars: activate build

 drivers/staging/Kconfig | 2 +
 drivers/staging/Makefile | 1 +
 drivers/staging/mars/Kconfig | 266 +
 drivers/staging/mars/Makefile | 61 +
 drivers/staging/mars/brick.c | 728 +++
 drivers/staging/mars/brick_mem.c | 1081 ++++
 drivers/staging/mars/brick_say.c | 916 +++
 drivers/staging/mars/lamport.c | 61 +
 drivers/staging/mars/lib/lib_limiter.c | 129 +
 drivers/staging/mars/lib/lib_rank.c | 87 +
 drivers/staging/mars/lib/lib_timing.c | 71 +
 drivers/staging/mars/mars_light/light_net.c | 109 +
 .../mars/mars_light/light_server_strategy.c | 403 ++
 drivers/staging/mars/mars_light/light_strategy.c | 2132 +++++++
 drivers/staging/mars/mars_light/mars_light.c | 5880 ++++++++++++++++++++
 drivers/staging/mars/mars_light/mars_proc.c | 369 ++
 drivers/staging/mars/xio_bricks/lib_log.c | 505 ++
 drivers/staging/mars/xio_bricks/lib_mapfree.c | 380 ++
 drivers/staging/mars/xio_bricks/xio.c | 161 +
 drivers/staging/mars/xio_bricks/xio_bio.c | 845 +++
 drivers/staging/mars/xio_bricks/xio_client.c | 1055 ++++
 drivers/staging/mars/xio_bricks/xio_copy.c | 1005 ++++
 drivers/staging/mars/xio_bricks/xio_if.c | 961 ++++
 drivers/staging/mars/xio_bricks/xio_net.c | 1830 ++++++
 drivers/staging/mars/xio_bricks/xio_server.c | 486 ++
 drivers/staging/mars/xio_bricks/xio_sio.c | 571 ++
 drivers/staging/mars/xio_bricks/xio_trans_logger.c | 3309 +++++++++++
 include/linux/brick/brick.h | 642 +++
 include/linux/brick/brick_checking.h | 104 +
 include/linux/brick/brick_mem.h | 218 +
 include/linux/brick/brick_say.h | 96 +
 include/linux/brick/lamport.h | 26 +
 include/linux/brick/lib_limiter.h | 49 +
 include/linux/brick/lib_pairing_heap.h | 110 +
 include/linux/brick/lib_queue.h | 166 +
 include/linux/brick/lib_rank.h | 135 +
 include/linux/brick/lib_timing.h | 181 +
 include/linux/brick/meta.h | 106 +
 include/linux/brick/vfs_compat.h | 45 +
 include/linux/mars_light/light_strategy.h | 236 +
 include/linux/mars_light/mars_proc.h | 34 +
 include/linux/xio/lib_log.h | 329 ++
 include/linux/xio/lib_mapfree.h | 84 +
 include/linux/xio/xio.h | 313 ++
 include/linux/xio/xio_bio.h | 85 +
 include/linux/xio/xio_client.h | 105 +
 include/linux/xio/xio_copy.h | 115 +
 include/linux/xio/xio_if.h | 108 +
 include/linux/xio/xio_net.h | 171 +
 include/linux/xio/xio_server.h | 91 +
 include/linux/xio/xio_sio.h | 68 +
 include/linux/xio/xio_trans_logger.h | 263 +
 52 files changed, 27284 insertions(+)
 create mode 100644 drivers/staging/mars/Kconfig
 create mode 100644 drivers/staging/mars/Makefile
 create mode 100644 drivers/staging/mars/brick.c
 create mode 100644 drivers/staging/mars/brick_mem.c
 create mode 100644 drivers/staging/mars/brick_say.c
 create mode 100644 drivers/staging/mars/lamport.c
 create mode 100644 drivers/staging/mars/lib/lib_limiter.c
 create mode 100644 drivers/staging/mars/lib/lib_rank.c
 create mode 100644 drivers/staging/mars/lib/lib_timing.c
 create mode 100644 drivers/staging/mars/mars_light/light_net.c
 create mode 100644 drivers/staging/mars/mars_light/light_server_strategy.c
 create mode 100644 drivers/staging/mars/mars_light/light_strategy.c
 create mode 100644 drivers/staging/mars/mars_light/mars_light.c
 create mode 100644 drivers/staging/mars/mars_light/mars_proc.c
 create mode 100644 drivers/staging/mars/xio_bricks/lib_log.c
 create mode 100644 drivers/staging/mars/xio_bricks/lib_mapfree.c
 create mode 100644 drivers/staging/mars/xio_bricks/xio.c
 create mode 100644 drivers/staging/mars/xio_bricks/xio_bio.c
 create mode 100644 drivers/staging/mars/xio_bricks/xio_client.c
 create mode 100644 drivers/staging/mars/xio_bricks/xio_copy.c
 create mode 100644 drivers/staging/mars/xio_bricks/xio_if.c
 create mode 100644 drivers/staging/mars/xio_bricks/xio_net.c
 create mode 100644 drivers/staging/mars/xio_bricks/xio_server.c
 create mode 100644 drivers/staging/mars/xio_bricks/xio_sio.c
 create mode 100644 drivers/staging/mars/xio_bricks/xio_trans_logger.c
 create mode 100644 include/linux/brick/brick.h
 create mode 100644 include/linux/brick/brick_checking.h
 create mode 100644 include/linux/brick/brick_mem.h
 create mode 100644 include/linux/brick/brick_say.h
 create mode 100644 include/linux/brick/lamport.h
 create mode 100644 include/linux/brick/lib_limiter.h
 create mode 100644 include/linux/brick/lib_pairing_heap.h
 create mode 100644 include/linux/brick/lib_queue.h
 create mode 100644 include/linux/brick/lib_rank.h
 create mode 100644 include/linux/brick/lib_timing.h
 create mode 100644 include/linux/brick/meta.h
 create mode 100644 include/linux/brick/vfs_compat.h
 create mode 100644 include/linux/mars_light/light_strategy.h
 create mode 100644 include/linux/mars_light/mars_proc.h
 create mode 100644 include/linux/xio/lib_log.h
 create mode 100644 include/linux/xio/lib_mapfree.h
 create mode 100644 include/linux/xio/xio.h
 create mode 100644 include/linux/xio/xio_bio.h
 create mode 100644 include/linux/xio/xio_client.h
 create mode 100644 include/linux/xio/xio_copy.h
 create mode 100644 include/linux/xio/xio_if.h
 create mode 100644 include/linux/xio/xio_net.h
 create mode 100644 include/linux/xio/xio_server.h
 create mode 100644 include/linux/xio/xio_sio.h
 create mode 100644 include/linux/xio/xio_trans_logger.h

--
2.6.4