2023-10-23 16:02:20

by Mathew, Cherry G.*

[permalink] [raw]
Subject: [RFC+Patch] Formal models as source of truth for Software Architects.

Hi Hackers,

I wanted to share with you some work I've been doing for a couple of
months with the spin verifier and OS C code. [1]

Although not directly and immediately relevant to current kernel
development, this may be relevant to folks thinking about architecture
and future directions in managing design complexity.

As I was working on plumbing the ARC code [1] to an actual block driver,
it dawned on me that there may be additional advantages to writing code
based on formal models, other than just consistency verification of
existing production code. So while I'm getting ahead of myself, let me
re-start in a reasonable middle of this narrative, and describe how I
got here.

Intro:
Software must be the only "Engineering" discipline, where the equivalent
of drawing up an architectural masterplan before you build a skyscraper
in software, is to start by getting some shovels and start digging the
basement. While there are a few efforts driven by industry ("inspired"
by regulation, possibly) [2] [3], I really couldn't find any open source
attempts at ongoing, continuously integrated work that is in lockstep
with production sourcecode, with the exception of academic work in
kernel code verification, eg: [4], or much smaller codebases such as
seL4 [5]. (Please feel free to point me at everything I've surely
missed).

I think we're thinking about the problem from the wrong end. Inspired by
"Use the Source, Luke", I realised that perhaps we should approach the
problem of systems software development using the old unix/engineering
principles of modularity and clean interfaces.

"D3" - Design Driven Development:
Let's say we want to develop a new block driver interface that "front"s
an existing block device, say for eg: as a first step to implementing a
cache of some sort. In addition, let's assume that we've been given the
requirement that this driver needs to be available on several OSs. As a
software architect, how do you put something concrete down on paper,
which serves as the "Source of Truth", before hammering on the actual
drivers ?

I propose, that a formal model, specified using a modelling language can
serve this purpose. However, a formal model alone, seems to be
insufficient. Most formal models seem to exist in a sort of echo chamber
of their own, where the model sits aside from actual production code,
or the other extreme, where implementation code itself serves as the
source of truth, instead of a more abstract formal spec.

While looking for a verifier tool that addresses this problem of the
"gap" between a model and the implementation code written to express
that model, I came across the model extractor from the spin
verifier. [6]. Modex really blew my mind. For the first time, I could
write a complete loop starting from:

a) A Formal model to represent concurrent processes and the joint
evolution of their state.
b) Invariants expressed in linear temporal logic, to reason about bounds
within which this model should operate.
c) An implementation in a programming language based on eyeballed model
code from "a)" above.
d) A model extractor that could parse the above implementation code, and
pull it back into a formal model representation such as in "a)"
above.
e) And finally the ability to apply invariants in "b)" , originally
meant for the abstract model in "a)" over the extracted model from
"d)".

If there were inconsistencies in step "e)", while none in "b)", this
could mean that either the extractor hadn't pulled out the right model,
or the implementation had strayed from that specified in "a)"
above. This would then require further eyeballing, patching, and
repeating steps "c)" -> "e)" until the extracted model "d)" and the hand
crafted model "a)" satisfied the invariants in "b)".

The key assumption here (which I have no way of proving is true) is that
the satisfaction of invariants in "b)" above, could stand in place for
equivalence of model and code. Thus we have a development process that
I'm calling "Design Driven Development" (or "D3" in short form) which
allows for iterative tightening up of implemented code to a formal
specification.

Discussion:
If one were a software architect in a larger project, one could just
focus on the correctness of one's model, while software engineers could
focus on implementing the model faithfully, using the D3 loop above to
ensure fidelity.

I believe this is really powerful - as I'm new to this area, I would
love to hear from folks that have used this or analogous techniques in
industry or open-source (ideally both), to get a sense of where my
current thinking is, with respect to the industry.

Please find attached, a long-ish patch that is heavily WIP. It attempts
to model a "generic" block driver in spin's promela language. The model
is in cbd.pm[hl] , the invariants are in cbd.inv. The implementation of
the model for linux is in linux/cbd.c ( you can use the Makefile to
compile it as an actual linux module). The implementation of the model
for NetBSD is in NetBSD/cbd.c (ditto - instructions in respective
Makefile). Note that I haven't actually tested the driver - it will most
likely crash, as I haven't really looked at the engineering side of
things yet - but thought I should publish this right now, to get early
feedback, and also give a sense of why actual CI testing is still needed
to ensure that the final implementation plumbing is right.

The "extraction" is specified in a "harness" defined in files with
extensions "*.prx", suplimented by LUTs to help translate C code back to
the extracted model. This is where the age of spin/modex becomes evident
- clearly the extraction language needs to be a lot more expressive (I
see it analagous to the expressivity in a regex machine, vs. a
parser). This is an area I'd love to see more work on, to the point
where we have a tool (if not already available) that can enable
something like D3 as the gold standard for software quality and
reliability.

For architects, it takes off the burden of having to rely
on their memory and random tests and bespoke models to keep state. For
engineers, it provides a clear "source of truth" and a process based
mechanism to verify their implementation based on this source. And for
project managers, there's much better visibility over the entire QA,
analogous to but much more powerful than a simple TDD style development
process.

Thank you for your attention - I hope to hear back from you - do please
Cc: me if you're replying to list. Especially if you actively work on
formally verified infrastructure code where QA is a regulatory need
(avionics ? MISRA-C related constraints ? healthcare ?)

Patch inlined below. To try it out, you need to have 'spin' installed
using your favourite package manager, and modex [6] compiled , with both
binaries in $PATH.

Then you can try things out:

1) Deploy sources to local directory: $ patch -tp1 < cbd.diff

2) Check out the model. $ make spin-gen; less spinmodel.pml

3) Try to verify the model (see various commented out #defines in
spinmodel.pml if you want to spread your wings.):
$ make spin-run

4) Try to extract the model from linux/cbd.c:
$ make clean modex-linux; less spinmodel.pml
You can try to verify this extracted spinmodel.pml via 3) above.

5) Try to extract the model from NetBSD/cbd.c:
$ make clean modex-NetBSD; less spinmodel.pml
You can try to verify this extracted spinmodel.pml via 3) above.

Looking forward!

Many Thanks,

MattC (/~cherry)

[1] https://mail-index.netbsd.org/tech-kern/2023/09/28/msg029203.html
[2] https://en.wikipedia.org/wiki/Model-based_systems_engineering
[3] https://en.wikipedia.org/wiki/Model-driven_engineering
[4] https://unsat.cs.washington.edu/papers/nelson-hyperkernel.pdf
[5] https://sel4.systems/
[6] https://github.com/nimble-code/Modex


diff -urN cbd.null/cbd.drv cbd/cbd.drv
--- cbd.null/cbd.drv 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbd.drv 2023-10-19 17:56:08.023048865 +0000
@@ -0,0 +1,299 @@
+/* Spin model driver for NetBSD cbd(4) cbd.c */
+
+/*
+ * Copyright 2023, Mathew, Cherry G. <[email protected]>
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "cbd.pmh"
+
+/* We drive the routines concurrently */
+#define NPROCS 2
+#define index_pid (_pid - 1)
+
+/* Per-device instance state. Currently we only have one instance */
+bit sc_lock; /* Driver instance lock */
+unsigned cbdevsz : DEVSZBITS; /* Same as size of the backing block dev */
+
+unsigned cbdev : DEVIDBITS; /* Our Device "handle" */
+unsigned backingdev : DEVIDBITS; /* Backing device "handle" */
+unsigned cachingdev : DEVIDBITS; /* Caching device "handle" */
+
+/* Aggregated traffic stats (per-device) */
+unsigned cbdbytesread : DEVSZBITS;
+unsigned cbdbyteswritten : DEVSZBITS;;
+
+/*
+ * Helper functions - better as inline because:
+ * - spin parser doesn't care about location vs. preprocessor does.
+ * - trace output looks better
+ */
+hidden int error[NPROCS]; /* Error indicator for last op */
+
+inline SETERROR(_errno)
+{
+ error[index_pid] = _errno;
+}
+
+
+inline
+mutex_enter(_mutex)
+{
+ atomic {
+ (_mutex == 0) -> _mutex = 1;
+ }
+}
+
+inline
+mutex_exit(_mutex)
+{
+ atomic {
+ (_mutex == 1) -> _mutex = 0;
+ }
+}
+
+/* Set state to "Configured", so that ioctl() can work */
+inline
+cbd_init()
+{
+ if
+ :: cbd_unconfigured() ->
+ cbdev = CBD_DEVID_CBD1;
+ :: else -> SETERROR(EBUSY);
+ skip;
+ fi
+}
+
+/* Undo init() */
+inline
+cbd_fini()
+{
+ do
+ :: cbd_backed(cbdev) ->
+ mutex_enter(sc_lock);
+ cbd_detach_backend(CBD_DEVID_BACKING);
+ mutex_exit(sc_lock);
+ :: cbd_configured(cbdev) ->
+ mutex_enter(sc_lock);
+ if
+ :: cbd_backed(cbdev) -> /* Lost race. Retry */
+ mutex_exit(sc_lock);
+ :: else ->
+ cbdev = CBD_DEVID_NONE;
+ mutex_exit(sc_lock);
+ break;
+ fi
+ :: else -> skip; /* Retry until things quiesce */
+ od
+}
+
+inline
+io_backing(backingdev, bp)
+{
+ /* Aggregate i/o data */
+ if
+ :: is_set(bp.direction, B_READ) ->
+ /* Increment bytes read */
+ cbdbytesread = cbdbytesread + bp.spooled;
+ :: else -> /* Assume B_WRITE */
+ /* Increment bytes written */
+ cbdbyteswritten = cbdbyteswritten + bp.spooled;
+ fi
+
+}
+
+inline
+ioctl_backing(_dev, _cmd, _data, _flag)
+{
+ /* Wouldn't have got here without specific intention.
+ * Therefore we just make sure we're in the right state.
+ */
+ SETERROR((cbd_backed(_dev) -> 0 : ENXIO));
+}
+
+inline
+reset_cbdstate()
+{
+ sc_lock = 0;
+ cbdbytesread = 0;
+ cbdbyteswritten = 0;
+ cbdev = CBD_DEVID_NONE;
+ cbdevsz = 0;
+ backingdev = CBD_DEVID_NONE;
+ cachingdev = CBD_DEVID_NONE;
+}
+
+bit strategy = 0;
+bit ioctl = 0;
+bit configured = 0;
+bit rdwr = 0;
+
+inline
+drive_cbd(_dev, _flags, _buffer,
+ _cmd, _data)
+{
+ /* Start with state that covers max code */
+ _dev = CBD_DEVID_CBD1;
+ set_flag(_flags, FREAD | FWRITE);
+
+ _buffer.error = 0;
+ _buffer.address = 0;
+ _buffer.offset = 0;
+ _buffer.length = CBD_BACKING_DISKSIZE; /* Try moving the whole buffer */
+ _buffer.spooled = 0;
+ _buffer.dev = CBD_DEVID_NONE;
+ set_flag(_buffer.direction, B_READ);
+
+ _cmd = CBDIO_ATTACH_BACKEND;
+ _data = CBD_DEVID_BACKING;
+
+ do
+ ::
+ if
+ :: (!configured) ->
+ cbd_init();
+ configured = 1;
+ :: (ioctl) ->
+ cbdioctl(_dev, _cmd, _data, _flags);
+ if
+//#define EXPLORE_IOCTL
+#ifdef EXPLORE_IOCTL
+ :: (_dev > CBD_DEVID_NONE) -> _dev--;
+ :: (_dev < CBD_DEVID_CACHING) -> _dev++;
+#endif
+ :: (_cmd > CBDIO_INVAL) -> _cmd--;
+ :: (_cmd < CBDIO_GET_STATUS) -> _cmd++;
+ :: (_data > CBD_DEVID_NONE) -> _data--;
+ :: (_data < CBD_DEVID_CACHING) -> _data++;
+ :: is_set(_flags, FREAD) -> clear_flag(_flags, FREAD); set_flag(_flags, FWRITE);
+ :: is_set(_flags, FWRITE) -> clear_flag(_flags, FWRITE); set_flag(_flags, FREAD);
+ :: else -> set_flag(_flags, FREAD);
+ fi
+ ioctl = 0;
+
+ :: (!ioctl) -> ioctl = 1;
+ :: (strategy) ->
+ mutex_enter(sc_lock);
+ cbdstrategy(_buffer);
+ mutex_exit(sc_lock);
+//#define EXPLORE_STRATEGY
+#ifdef EXPLORE_STRATEGY
+ if
+ :: is_set(_buffer.direction, B_WRITE) ->
+ clear_flag(_buffer.direction, B_WRITE);
+ set_flag(_buffer.direction, B_READ);
+ :: is_set(_buffer.direction, B_READ) ->
+ clear_flag(_buffer.direction, B_READ);
+ set_flag(_buffer.direction, B_WRITE);
+ :: else ->
+ set_flag(_buffer.direction, B_READ);
+ fi
+ strategy = 0;
+# endif
+ :: (!strategy) -> strategy = 1;
+//#define EXPLORE_BUFFERSTATE
+#ifdef EXPLORE_BUFFERSTATE
+ :: (_buffer.offset < (CBD_BACKING_DISKSIZE + 1)) ->
+ _buffer.offset++;
+ :: (_buffer.length < (CBD_BACKING_DISKSIZE + 2)) ->
+ _buffer.length++;
+ :: (_buffer.address < VAMAX) -> _buffer.address++;
+ :: (_buffer.spooled < (CBD_BACKING_DISKSIZE + 2)) ->
+ _buffer.spooled++;
+ :: (_buffer.offset > 0) -> _buffer.length--;
+ :: (_buffer.length > 0) -> _buffer.length--;
+ :: (_buffer.spooled > 0) -> _buffer.spooled--;
+ :: (_buffer.address > 0) -> _buffer.address--;
+#endif
+//#define EXPLORE_READWRITESTATE
+#ifdef EXPLORE_READWRITESTATE
+ :: (rdwr) ->
+ cbdread(_dev, _buffer);
+ cbdwrite(_dev, _buffer);
+ rdwr = 0;
+ :: (!rdwr) -> rdwr = 1;
+
+ :: (configured) ->
+ cbd_fini();
+ configured = 0;
+#endif
+ :: else ->
+ break;
+ fi
+
+ od
+}
+
+/* Per proc data to drive the state machine */
+typedef user_vars_t {
+ unsigned dev : DEVIDBITS;
+ unsigned flags : FLAGBITS;
+ buf_t buffer;
+ unsigned cmd : IOCMDBITS;
+
+ /* IRL, this would be eg: the backing device node path.
+ * For now we just pass a fake DEVIDXXX "handle" that
+ * represents the backing device.
+ */
+
+ unsigned data : DEVIDBITS;
+};
+
+/* Drive functions through various state variations */
+proctype
+p_drive_cbd()
+{
+ /* Setup initial per-proc caller values. These will get
+ * iterated in the inline callees below below
+ */
+ user_vars_t user_vars[NPROCS];
+
+ drive_cbd(user_vars[index_pid].dev, user_vars[index_pid].flags,
+ user_vars[index_pid].buffer, user_vars[index_pid].cmd,
+ user_vars[index_pid].data);
+
+}
+
+/* Drive the procs */
+init {
+
+ /* Initialise Device params */
+ reset_cbdstate();
+
+ pid proc;
+
+ /* Spin up NPROCS concurrent processes */
+ atomic {
+ proc = 0;
+ do
+ :: proc < NPROCS ->
+ run p_drive_cbd();
+ proc++
+ :: else ->
+ break
+ od
+ }
+
+}
diff -urN cbd.null/cbd.inv cbd/cbd.inv
--- cbd.null/cbd.inv 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbd.inv 2023-10-18 17:08:10.842776509 +0000
@@ -0,0 +1,49 @@
+/* Spin process invariants for NetBSD cbd(4) cbd.c */
+
+/*
+ * Copyright 2023, MattC <[email protected]>
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+ltl
+{
+/* Checks while outside critical section - ie; kernel state is stable. */
+ always ((sc_lock == 0) implies (
+ /* Device state basic consistency */
+ (cbd_configured(cbdev) implies
+ !cbd_unconfigured()) &&
+ /* Active ioctl passthrough implies configured device */
+ (cbd_backed(cbdev) implies
+ cbd_configured(cbdev)) &&
+ /* There will be I/O eventually */
+ (eventually always (cbdbytesread + cbdbyteswritten) > 0)
+ ) /* (sc_lock == 0) */
+ ) /* always */ //&&
+// always ((sc_lock == 1) implies (
+ /* Lock must always be released */
+// (eventually (sc_lock == 0))
+// ) /* (sc_lock == 1) */
+// ) /* always */ // &&
+}
diff -urN cbd.null/cbdioctl.linux.lut cbd/cbdioctl.linux.lut
--- cbd.null/cbdioctl.linux.lut 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbdioctl.linux.lut 2023-10-23 08:14:11.902411553 +0000
@@ -0,0 +1,15 @@
+// cbd_ioctl(...) -> cbdioctl(...)
+
+cbd_backed(cbp) cbd_backed(_dev)
+
+Substitute c_expr { 222 == cmd } (_cmd == CBDIO_ATTACH_BACKEND)
+Substitute c_expr { 333 == cmd } (_cmd == CBDIO_DETACH_BACKEND)
+
+Substitute c_expr { (cbp==0) } false
+Substitute c_expr { (!cbd_configured(cbp)) } !cbd_configured(_dev)
+Substitute c_expr { (!cbd_backed(cbp)) } !cbd_backed(_dev)
+Substitute c_code { err=(-ENXIO); } SETERROR(ENXIO)
+Substitute c_code { err=(-EBUSY); } SETERROR(EBUSY)
+Substitute c_code { err=(-ENODEV); } SETERROR(ENODEV)
+Substitute c_code { err=(-EPERM); } SETERROR(EPERM)
+Substitute c_code { err=(-ENOTTY); } skip
diff -urN cbd.null/cbdioctl.NetBSD.lut cbd/cbdioctl.NetBSD.lut
--- cbd.null/cbdioctl.NetBSD.lut 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbdioctl.NetBSD.lut 2023-10-23 08:23:23.430620615 +0000
@@ -0,0 +1,16 @@
+// cbd_ioctl(...) -> cbdioctl(...)
+
+cbd_backed(sc) cbd_backed(_dev)
+
+Substitute c_expr { (sc==0) } false
+Substitute c_expr { 222 == cmd } (_cmd == CBDIO_ATTACH_BACKEND)
+Substitute c_expr { 333 == cmd } (_cmd == CBDIO_DETACH_BACKEND)
+Substitute c_code { error=EBUSY; } SETERROR(EBUSY)
+Substitute c_code { error=ENODEV; } SETERROR(ENODEV)
+Substitute c_code { error=EPERM; } SETERROR(EPERM)
+
+Substitute c_code { error=cbd_attach_backend(sc,(&(cbdioctlargs))); } cbd_attach_backend(_dev, _data)
+Substitute c_code { error=cbd_detach_backend(sc); } cbd_detach_backend(_data)
+Substitute c_expr { (!cbd_backed(sc)) } !cbd_backed(_dev)
+
+
diff -urN cbd.null/cbdattachbackend.linux.lut cbd/cbdattachbackend.linux.lut
--- cbd.null/cbdattachbackend.linux.lut 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbdattachbackend.linux.lut 2023-10-23 08:12:51.340517105 +0000
@@ -0,0 +1 @@
+Substitute c_expr { IS_ERR(backingdev) } false
diff -urN cbd.null/cbdattachbackend.NetBSD.lut cbd/cbdattachbackend.NetBSD.lut
--- cbd.null/cbdattachbackend.NetBSD.lut 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbdattachbackend.NetBSD.lut 2023-10-23 06:56:21.559519110 +0000
@@ -0,0 +1,2 @@
+// Hard hammer - XXX: Find a better way to handle error ?
+Substitute c_expr { (error!=0) } false
diff -urN cbd.null/cbd.linux.prx cbd/cbd.linux.prx
--- cbd.null/cbd.linux.prx 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbd.linux.prx 2023-10-23 08:16:16.155489259 +0000
@@ -0,0 +1,217 @@
+// Spin model extractor harness for linux/cbd.c written by cherry
+//
+%F linux/cbd.c
+%X -n modexbug // Somehow this is needed to activate per-func LUT
+%X -L cbdstrategy.linux -n cbd_submit_bio
+%X -L cbdioctl.linux -n cbd_ioctl
+%X -L cbdattachbackend.linux -n cbd_attach_backend
+%X -n cbd_detach_backend
+%H
+// Disable effects of all included files and try to implement a subset of the APIs they provide.
+#define _LINUX_BLKDEV_H
+
+// linux/blk_types.h
+#define SECTOR_SHIFT 9
+typedef void struct block_device;
+typedef void blk_mode_t;
+
+#define __LINUX_MUTEX_H
+struct mutex {
+ void *dummy;
+};
+
+// sys/null.h
+#define NULL 0
+
+#define true 1
+#define false 0
+
+typedef int bool;
+
+/* Linuxisms */
+#define __user
+#define __init
+#define __exit
+
+#define module_init(x)
+#define module_exit(x)
+#define MODULE_LICENSE(L)
+
+%%
+//%C // c_code {}
+//%%
+//%D // c_cdecl {}
+//%%
+%L // XXX: Looks like per function tables are broken for now.
+// Common transformations
+
+NonState hidden cbd_major global
+NonState hidden exclusive global
+NonState hidden cbd_fops global
+NonState hidden exclusive global
+
+mutex_lock(... mutex_enter(sc_lock)
+mutex_unlock(... mutex_exit(sc_lock)
+
+// cbd_submit_bio(...) -> cbdstrategy(...)
+// See cbdstrategy.linux.lut broad substs which need scope limiting.
+NonState hidden bio cbd_submit_bio
+NonState hidden cbp cbd_submit_bio
+NonState hidden bdev cbd_submit_bio
+NonState hidden bio_offset cbd_submit_bio
+NonState hidden bdevsz cbd_submit_bio
+
+// Gen-ed Code substitution. This is due to a lack of a sensible parser harness language with the right semantics.
+Substitute c_expr { ((bio_offset+bio->bi_iter.bi_size)>bdevsz) } ((bp.offset + bp.length) > cbdevsz)
+Substitute c_expr { (bio_op(bio)&REQ_OP_READ) } is_set(bp.direction, B_READ)
+Substitute c_code { bio_offset=((bio_offset>bdevsz) ? bdevsz : bio_offset); } bp.offset = ((bp.offset > cbdevsz) -> cbdevsz : bp.offset)
+Substitute c_code [bio] { bio->bi_iter.bi_size=(bdevsz-bio_offset); } bp.length = (cbdevsz - bp.offset)
+Substitute c_code [bio] { bio->bi_iter.bi_sector=(bio_offset>>9); } skip
+Substitute c_code [bio] { bio->bi_status=BLK_STS_NOTSUPP; } bp.error = EINVAL;
+Substitute c_code [bio] { bio->bi_status=BLK_STS_OFFLINE; } bp.error = ENXIO
+
+// cbd_ioctl(...) -> cbdioctl(...)
+// See cbdioctl.linux.lut broad substs which need scope limiting.
+NonState hidden userdata cbd_ioctl
+NonState hidden arg cbd_ioctl
+NonState hidden cmd cbd_ioctl
+NonState hidden mode cbd_ioctl
+NonState hidden bdev cbd_ioctl
+NonState hidden cbp cbd_ioctl
+NonState hidden err cbd_ioctl
+
+
+Substitute c_expr { (!(mode&BLK_OPEN_WRITE)) } !is_set(_flag, FWRITE)
+Substitute c_code [bdev->bd_disk && bdev] { cbp=bdev->bd_disk->private_data; } skip
+Substitute c_code { err=cbd_attach_backend(cbp,userdata); } cbd_attach_backend(_dev,_data)
+Substitute c_code { err=cbd_detach_backend(cbp); } cbd_detach_backend(_data)
+Substitute c_code [cbp && cbp->backingdev->bd_disk->fops && cbp->backingdev->bd_disk && cbp->backingdev && cbp] { err=cbp->backingdev->bd_disk->fops->ioctl(cbp->backingdev,mode,cmd,arg); } ioctl_backing(backingdev, _cmd, _data, _flag)
+
+// cbd_attach_backend(...) -> cbd_attach_backend()
+// See cbdattachbackend.linux.lut broad substs which need scope limiting.
+NonState hidden lockedcbp cbd_attach_backend
+NonState hidden pathname cbd_attach_backend
+NonState hidden userdata cbd_attach_backend
+NonState hidden backingdev cbd_attach_backend
+
+Substitute c_expr { (lockedcbp->cbdev==0) } (_dev != CBD_DEVID_CBD1)-> SETERROR(ENODEV)
+Substitute c_code { pathname=kzalloc(PATH_MAX,GFP_KERNEL); } skip;
+Substitute c_expr { (pathname==0) } false
+Substitute c_expr { (strncpy_from_user(pathname,(void *)userdata,PATH_MAX)==(-EFAULT)) } (_data != CBD_DEVID_BACKING); SETERROR(EINVAL)
+Substitute c_code { kfree(pathname); } skip
+Substitute c_code { backingdev=blkdev_get_by_path(strim(pathname),((BLK_OPEN_READ|BLK_OPEN_WRITE)|BLK_OPEN_EXCL),exclusive,0); } skip
+Substitute c_code [lockedcbp] { lockedcbp->backingdev=backingdev; } backingdev = CBD_DEVID_BACKING
+Substitute c_code [lockedcbp->cbdev && lockedcbp] { set_capacity(lockedcbp->cbdev->bd_disk,bdev_nr_sectors(backingdev)); } cbdevsz = CBD_BACKING_DISKSIZE
+
+// cbd_detach_backend(...) -> cbd_detach_backend()
+NonState hidden lockedcbp cbd_detach_backend
+
+Substitute c_expr { ((lockedcbp->cbdev==0)||(lockedcbp->backingdev==0)) } (_dev != CBD_DEVID_CBD1 || backingdev != CBD_DEVID_BACKING) -> SETERROR(ENODEV)
+Substitute c_code [lockedcbp] { blkdev_put(lockedcbp->backingdev,exclusive); } skip
+Substitute c_code [lockedcbp] { lockedcbp->backingdev=0; } backingdev = CBD_DEVID_NONE
+Substitute c_code [lockedcbp->cbdev && lockedcbp] { set_capacity(lockedcbp->cbdev->bd_disk,0); } cbdevsz = 0
+%%
+
+%P
+hidden int tmpdev; /* Saved during passthrough */
+
+inline
+cbdstrategy(bp)
+{
+ #include "_modex_cbd_submit_bio.pml"
+}
+
+inline
+cbdioctl(_dev, _cmd, _data, _flag)
+{
+ SETERROR(0); /* default */
+ #include "_modex_cbd_ioctl.pml"
+}
+
+inline
+cbd_attach_backend(_dev, _data)
+{
+ SETERROR(0);
+#include "_modex_cbd_attach_backend.pml"
+}
+
+inline
+cbd_detach_backend(_dev)
+{
+ SETERROR(0);
+#include "_modex_cbd_detach_backend.pml"
+}
+
+// linux/cbd.c doesn't have read/write fops, so we
+// fake it to allow cbd.drv to be parsed
+// However See: cbd.drv: -D EXPLORE_READWRITESTATE
+/* For now, we just check re-entrancy consistency.
+ * XXX: uio validation
+ */
+inline
+cbdread(_dev, _uio)
+{
+ SETERROR(0);
+
+ /* Check for backing disk */
+ mutex_enter(sc_lock);
+ if
+ :: !cbd_configured(_dev) ->
+ SETERROR(EINVAL);
+ mutex_exit(sc_lock);
+ goto cbdread_out;
+ :: !cbd_backed(_dev) ->
+ SETERROR(ENOENT);
+ mutex_exit(sc_lock);
+ goto cbdread_out;
+ :: else ->
+ skip;
+ fi
+ mutex_exit(sc_lock);
+
+ /* simulate uio -> buf translation */
+ buf_t rd_buf;
+ rd_buf.direction = B_READ;
+ rd_buf.error = 0;
+ rd_buf.length = _uio.length;
+ rd_buf.address = _uio.address;
+ rd_buf.spooled = 0;
+ cbdstrategy(rd_buf);
+
+cbdread_out:
+}
+
+inline
+cbdwrite(_dev, _uio)
+{
+ SETERROR(0);
+
+ /* Check for backing disk */
+ mutex_enter(sc_lock);
+ if
+ :: !cbd_configured(_dev) ->
+ SETERROR(EINVAL);
+ mutex_exit(sc_lock);
+ goto cbdwrite_out;
+ :: !cbd_backed(_dev) ->
+ SETERROR(ENOENT);
+ mutex_exit(sc_lock);
+ goto cbdwrite_out;
+ :: else ->
+ skip;
+ fi
+ mutex_exit(sc_lock);
+
+ /* simulate uio -> buf translation */
+ buf_t rw_buf;
+ rw_buf.direction = B_WRITE;
+ rw_buf.error = 0;
+ rw_buf.length = _uio.length;
+ rw_buf.address = _uio.address;
+ rw_buf.spooled = 0;
+
+ cbdstrategy(rw_buf);
+cbdwrite_out:
+}
+
+%%
\ No newline at end of file
diff -urN cbd.null/cbd.NetBSD.prx cbd/cbd.NetBSD.prx
--- cbd.null/cbd.NetBSD.prx 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbd.NetBSD.prx 2023-10-23 08:20:37.150517408 +0000
@@ -0,0 +1,232 @@
+// Spin model extractor harness for NetBSD/cbd.c written by cherry
+//
+%F NetBSD/cbd.c
+%X -n modexbug // Somehow this is needed to activate per-func LUT
+%X -L cbdstrategy.NetBSD -n cbd_strategy
+%X -L cbdioctl.NetBSD -n cbd_ioctl
+%X -L cbdattachbackend.NetBSD -n cbd_attach_backend
+%X -n cbd_detach_backend
+%H
+// Disable effects of all included files and try to implement a subset of the APIs they provide.
+
+#define _SYS_TYPES_H_
+#define _SYS_STAT_H_
+#define _SYS_BUF_H
+#define _SYS_BUFQ_H
+#define _SYS_CONF_H
+#define _SYS_DISK_H
+#define _SYS_DISKLABEL_H
+#define _SYS_ERRNO_H
+#define _UVM_PROT_
+#define _SYS_ACL_H
+#define _SYS_VNODE_IF_H_
+#define _SYS_VNODE_H_
+
+typedef void device_t;
+//typedef void *kmutex_t; //Kludgy hack, see below
+#define kmutex_t (void *)
+typedef void dev_t;
+typedef void u_long;
+typedef void buf_t;
+typedef void uint64_t;
+
+/* Override switch arrays */
+void cbd_bdevsw;
+void cbd_cdevsw;
+void cbddkdriver;
+
+#define CFATTACH_DECL3_NEW(_ign1, _ign2, _ign3, _ign4, _ign5, _ign6, _ign7, _ign8, _ign9)
+
+// sys/null.h
+#define NULL 0
+
+#define true 1
+#define false 0
+
+typedef int bool;
+
+#define disk_busy(_dk)
+#define disk_unbusy(_dk, _count, _flags)
+
+#define biodone(_bp)
+
+%%
+//%C // c_code {}
+//%%
+//%D // c_cdecl {}
+//%%
+%L // XXX: Looks like per function tables are broken for now.
+// Common transformations
+
+NonState hidden cbd_major global
+NonState hidden cbd_cdevsw global
+NonState hidden sc_lock global
+
+mutex_enter(... mutex_enter(sc_lock)
+mutex_exit(... mutex_exit(sc_lock)
+
+Substitute c_code { unit=DISKUNIT(dev); } skip
+Substitute c_code { sc=device_lookup_private((&(cbd_cd)),unit); } skip
+
+// cbd_strategy(...) -> cbdstrategy(...)
+NonState hidden sc cbd_strategy
+NonState hidden buf cbd_strategy
+NonState hidden dev cbd_strategy
+NonState hidden unit cbd_strategy
+
+// Gen-ed Code substitution. This is due to a lack of a sensible parser harness language with the right semantics.
+Substitute c_expr { ((dbtob(buf->b_blkno)+buf->b_bcount)>sc->sc_tvn_bsize) } ((bp.offset + bp.length) > cbdevsz)
+Substitute c_expr { BUF_ISREAD(buf) } is_set(bp.direction, B_READ)
+Substitute c_code [sc && sc && buf && buf] { buf->b_blkno=((dbtob(buf->b_blkno)>sc->sc_tvn_bsize) ? F: btodb(sc->sc_tvn_bsize) : bufb_blkno); } bp.offset = ((bp.offset > cbdevsz) -> cbdevsz : bp.offset)
+Substitute c_code [buf && sc && buf] { buf->b_bcount=(sc->sc_tvn_bsize-dbtob(buf->b_blkno)); } bp.length = (cbdevsz - bp.offset)
+Substitute c_code [buf] { buf->b_error=EINVAL; } bp.error = EINVAL
+Substitute c_code [buf] { buf->b_error=ENXIO; } bp.error = ENXIO
+
+// cbd_ioctl(...) -> cbdioctl(...)
+NonState hidden sc cbd_ioctl
+NonState hidden dev cbd_ioctl
+NonState hidden cmd cbd_ioctl
+NonState hidden data cbd_ioctl
+NonState hidden flag cbd_ioctl
+NonState hidden l cbd_ioctl
+NonState hidden unit cbd_ioctl
+NonState hidden error cbd_ioctl
+NonState hidden cbdioctlargs cbd_ioctl
+
+Substitute c_code { cbdioctlargs.diskpath=data; } skip
+Substitute c_code { cbdioctlargs.l=l; } skip
+Substitute c_expr { (!cbd_configured(sc)) } !cbd_configured(_dev) -> SETERROR(ENXIO)
+Substitute c_expr { (!(flag&FWRITE)) } !is_set(_flag, FWRITE)
+Substitute c_code [l && sc] { error=VOP_IOCTL(sc->sc_tvn,cmd,data,flag,l->l_cred); } ioctl_backing(backingdev, _cmd, _data, _flag)
+
+// cbd_attach_backend(...) -> cbd_attach_backend()
+NonState hidden lockedsc cbd_attach_backend
+NonState hidden cbdioctl cbd_attach_backend
+NonState hidden data cbd_attach_backend
+NonState hidden pbuf cbd_attach_backend
+NonState hidden error cbd_attach_backend
+NonState hidden vp cbd_attach_backend
+NonState hidden numsec cbd_attach_backend
+NonState hidden secsz cbd_attach_backend
+
+Substitute c_expr { (lockedsc->sc_dev==0) } (_dev != CBD_DEVID_CBD1)-> SETERROR(ENODEV)
+Substitute c_code [cbdioctl] { error=pathbuf_copyin(cbdioctl->diskpath,(&(pbuf))); } skip
+Substitute c_code [cbdioctl] { error=vn_bdev_openpath(pbuf,(&(vp)),cbdioctl->l); } if :: (_data != CBD_DEVID_BACKING) -> SETERROR(ENODEV) :: else; fi
+Substitute c_code { pathbuf_destroy(pbuf); } skip
+Substitute c_code { error=getdisksize(vp,(&(numsec)),(&(secsz))); } skip
+Substitute c_code [lockedsc] { lockedsc->sc_tvn=vp; } backingdev = CBD_DEVID_BACKING
+Substitute c_code [lockedsc] { lockedsc->sc_tvn_bsize=(numsec*secsz); } cbdevsz = CBD_BACKING_DISKSIZE
+Substitute c_code [cbdioctl->l && cbdioctl && lockedsc] { lockedsc->sc_tvn_l_cred=cbdioctl->l->l_cred; } skip
+
+
+// cbd_detach_backend(...) -> cbd_detach_backend()
+NonState hidden lockedsc cbd_detach_backend
+
+Substitute c_expr { ((((lockedsc->sc_dev==0)||(lockedsc->sc_tvn==0))||(lockedsc->sc_tvn_bsize==0))||(lockedsc->sc_tvn_l_cred==0)) } (_dev != CBD_DEVID_CBD1 || backingdev != CBD_DEVID_BACKING) -> SETERROR(ENODEV);
+Substitute c_code [lockedsc && lockedsc] { vn_close(lockedsc->sc_tvn,(FREAD|FWRITE),lockedsc->sc_tvn_l_cred); } skip
+Substitute c_code [lockedsc] { lockedsc->sc_tvn_bsize=0; } cbdevsz = 0
+Substitute c_code [lockedsc] { lockedsc->sc_tvn_l_cred=0; } skip
+Substitute c_code [lockedsc] { lockedsc->sc_tvn=0; } backingdev = CBD_DEVID_NONE
+%%
+
+%P
+hidden int tmpdev; /* Saved during passthrough */
+
+inline
+cbdstrategy(bp)
+{
+ #include "_modex_cbd_strategy.pml"
+}
+
+inline
+cbdioctl(_dev, _cmd, _data, _flag)
+{
+ SETERROR(0); /* default */
+ #include "_modex_cbd_ioctl.pml"
+}
+
+inline
+cbd_attach_backend(_dev, _data)
+{
+ SETERROR(0);
+#include "_modex_cbd_attach_backend.pml"
+}
+
+inline
+cbd_detach_backend(_dev)
+{
+ SETERROR(0);
+#include "_modex_cbd_detach_backend.pml"
+}
+
+// XXX: please extract below functions.
+/* For now, we just check re-entrancy consistency.
+ * XXX: uio validation
+ */
+inline
+cbdread(_dev, _uio)
+{
+ SETERROR(0);
+
+ /* Check for backing disk */
+ mutex_enter(sc_lock);
+ if
+ :: !cbd_configured(_dev) ->
+ SETERROR(EINVAL);
+ mutex_exit(sc_lock);
+ goto cbdread_out;
+ :: !cbd_backed(_dev) ->
+ SETERROR(ENOENT);
+ mutex_exit(sc_lock);
+ goto cbdread_out;
+ :: else ->
+ skip;
+ fi
+ mutex_exit(sc_lock);
+
+ /* simulate uio -> buf translation */
+ buf_t rd_buf;
+ rd_buf.direction = B_READ;
+ rd_buf.error = 0;
+ rd_buf.length = _uio.length;
+ rd_buf.address = _uio.address;
+ rd_buf.spooled = 0;
+ cbdstrategy(rd_buf);
+
+cbdread_out:
+}
+
+inline
+cbdwrite(_dev, _uio)
+{
+ SETERROR(0);
+
+ /* Check for backing disk */
+ mutex_enter(sc_lock);
+ if
+ :: !cbd_configured(_dev) ->
+ SETERROR(EINVAL);
+ mutex_exit(sc_lock);
+ goto cbdwrite_out;
+ :: !cbd_backed(_dev) ->
+ SETERROR(ENOENT);
+ mutex_exit(sc_lock);
+ goto cbdwrite_out;
+ :: else ->
+ skip;
+ fi
+ mutex_exit(sc_lock);
+
+ /* simulate uio -> buf translation */
+ buf_t rw_buf;
+ rw_buf.direction = B_WRITE;
+ rw_buf.error = 0;
+ rw_buf.length = _uio.length;
+ rw_buf.address = _uio.address;
+ rw_buf.spooled = 0;
+
+ cbdstrategy(rw_buf);
+cbdwrite_out:
+}
+
+%%
\ No newline at end of file
diff -urN cbd.null/cbd.pmh cbd/cbd.pmh
--- cbd.null/cbd.pmh 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbd.pmh 2023-10-22 19:48:38.564639937 +0000
@@ -0,0 +1,120 @@
+/*
+ * Copyright 2023, MattC <[email protected]>
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _CBD_PMH_
+#define _CBD_PMH_
+
+#define ERRBITS 6 /* Add as list below grows */
+
+/* Modelled on NetBSD sys/errno.h */
+#define EPERM 1 /* Operation not permitted */
+#define ENOENT 2 /* No such file or directory */
+#define EIO 5 /* Input/output error */
+#define ENXIO 6 /* Device not configured */
+#define EBADF 9 /* Bad file descriptor */
+#define EBUSY 16 /* Device busy */
+#define ENODEV 19 /* Operation not supported by device */
+#define EINVAL 22 /* Invalid argument */
+
+#define DEVSZBITS 10 /* Max block device log2(sz) */
+#define DEVIDBITS 4 /* Add more as list below grows */
+
+/* handle IDs for "device"s */
+#define CBD_DEVID_NONE 0
+#define CBD_DEVID_CBD1 1
+#define CBD_DEVID_BACKING 2
+#define CBD_DEVID_CACHING 3
+
+/* "State" values, to report current states, on query */
+#define CBD_UNCONFIGURED 0
+#define CBD_CONFIGURED 1
+#define CBD_BACKED 2
+#define CBD_CACHED 3
+
+#define FLAGBITS 4
+/* Buffer direction flag (mutually exclusive) */
+#define B_READ 1 /* Read flag. */
+#define B_WRITE 2 /* Write flag. */
+
+/* dev read()/write(). Write flag imples Read flag */
+#define FREAD 1
+#define FWRITE 3
+
+inline set_flag(_flags, _FLAG) { _flags = (_flags | _FLAG) }
+inline clear_flag(_flags, _FLAG) { _flags = _flags & ~_FLAG }
+#define is_set(_flags, _FLAG) ((_flags & _FLAG) && 1)
+
+#define IOCMDBITS 4
+
+/* ioctl related command codes */
+#define CBDIO_INVAL 0
+#define CBDIO_ATTACH_BACKEND 1
+#define CBDIO_DETACH_BACKEND 2
+#define CBDIO_GET_STATUS 3
+
+#define VABITS DEVSZBITS
+#define VAMAX (1 << (VABITS - 1)) /* A boundary for a "valid" address */
+#define is_valid_address(_addr) (_addr <= VAMAX)
+
+#define CBD_BACKING_DISKSIZE (VAMAX)
+
+/* Device state queries */
+#define cbd_unconfigured() (cbdev == CBD_DEVID_NONE)
+
+/* In a multi-device scenario (out of scope of this
+ * model), this would be a search for the device matching
+ * the arg _dev.
+ */
+
+#define cbd_configured(_dev) ((cbdev == CBD_DEVID_CBD1) && (_dev == cbdev))
+
+#define cbd_backed(_dev) (cbd_configured(_dev) && (backingdev == CBD_DEVID_BACKING))
+
+#define cbd_cached(_dev) (cbd_backed() && cachingdev == CBD_DEVID_CACHING)
+
+#define cbd_status(_dev) \
+ ((_dev == CBD_DEVID_NONE) -> CBD_UNCONFIGURED : \
+ ((cbdev == CBD_DEVID_CBD1) -> \
+ ((backingdev == CBD_DEVID_BACKING) -> \
+ ((cachingdev == CBD_DEVID_CACHING) -> \
+ CBD_CACHED : CBD_BACKED) : \
+ CBD_CONFIGURED) : \
+ CBD_UNCONFIGURED))
+
+
+typedef buf_t {
+ unsigned error : ERRBITS; /* error flagged during processing */
+ unsigned address : VABITS; /* address representation */
+ unsigned offset : DEVSZBITS; /* Offset into device */
+ unsigned length : DEVSZBITS; /* total length of buffer */
+ unsigned spooled : DEVSZBITS; /* length processed so far */
+ unsigned direction : FLAGBITS; /* whether callee is to read from or write to */
+ unsigned dev : DEVIDBITS; /* Device involved */
+};
+
+
+#endif /* _CBD_PMH_ */
\ No newline at end of file
diff -urN cbd.null/cbd.pml cbd/cbd.pml
--- cbd.null/cbd.pml 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbd.pml 2023-10-22 21:00:12.814145286 +0000
@@ -0,0 +1,261 @@
+/* Spin process models for NetBSD cbd(4) cbd.c */
+
+/*
+ * Copyright 2023, MattC <[email protected]>
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "cbd.pmh"
+
+hidden int tmpdev; /* Saved during passthrough */
+inline
+cbdstrategy(/* buf_t */ bp)
+{
+
+
+ /* sanity check buffer */
+
+ if
+ :: ((bp.offset + bp.length) > cbdevsz) ->
+ if
+ :: is_set(bp.direction, B_READ) ->
+ /* Clip the requested length.
+ * Total length effectively read will be in
+ * bp.spooled
+ */
+ bp.offset = ((bp.offset > cbdevsz) -> cbdevsz :
+ bp.offset);
+ bp.length = (cbdevsz - bp.offset);
+
+ skip;
+
+ :: is_set(bp.direction, B_WRITE) ->
+ bp.error = EINVAL;
+ fi
+ goto strategy_out;
+ :: else ->
+ skip;
+ fi
+
+ if /* Always check if device is still active before acting */
+ :: cbd_backed(bp.dev) ->
+ tmpdev = bp.dev;
+ mutex_enter(sc_lock);
+ bp.dev = backingdev; /* Set for callee */
+ io_backing(backingdev, bp);
+ bp.dev = tmpdev; /* Reset for caller */
+ mutex_exit(sc_lock);
+ :: else -> /* Fallback - don't block */
+ bp.error = ENXIO;
+ goto strategy_out;
+ fi
+strategy_out:
+
+}
+
+/* Attach backing disk device specified in data */
+
+inline
+cbd_attach_backend(_dev, _data)
+{
+ SETERROR(0);
+
+ /* Simulate data validation */
+ if
+ :: (_dev != CBD_DEVID_CBD1 || /* Device validity */
+ _data != CBD_DEVID_BACKING) ->
+ SETERROR(ENODEV);
+ goto attach_out;
+ :: else ->
+ /* IRL, this would come from the backing dev */
+ cbdevsz = CBD_BACKING_DISKSIZE;
+ backingdev = _data;
+ fi
+
+attach_out:
+}
+
+/* Detach backing disk device specified in data */
+inline
+cbd_detach_backend(_dev)
+{
+ SETERROR(0);
+
+ /* Simulate dev state validation */
+
+ if
+ :: (_dev != CBD_DEVID_CBD1 ||
+ backingdev != CBD_DEVID_BACKING) ->
+ SETERROR(ENODEV);
+ goto detach_out;
+ :: else ->
+ backingdev = CBD_DEVID_NONE;
+ cbdevsz = 0;
+ fi
+
+detach_out:
+}
+
+inline
+cbdioctl(_dev, _cmd, _data, _flag)
+{
+ SETERROR(0); /* default */
+
+ mutex_enter(sc_lock);
+ if
+ :: !cbd_configured(_dev) ->
+ SETERROR(ENXIO);
+ goto ioctl_out; /* Invalid device ID */
+ :: else ->
+ skip;
+ fi
+
+ if
+ :: (_cmd == CBDIO_ATTACH_BACKEND) ->
+ /* Simulate attaching a backing device */
+ if
+ :: cbd_backed(_dev) -> /* Already backed. */
+ SETERROR(EBUSY); /* return EBUSY */
+ goto ioctl_out;
+ :: else ->
+ skip;
+ fi
+ if
+ :: !is_set(_flag, FWRITE) -> /* Semantic inconsistency */
+ SETERROR(EPERM);
+ goto ioctl_out;
+ :: else ->
+ skip;
+ fi
+ cbd_attach_backend(_dev, _data);
+ goto ioctl_out;
+
+ :: (_cmd == CBDIO_DETACH_BACKEND) ->
+ /* Simulate detaching a backing device */
+ if
+ :: !cbd_backed(_dev) -> /* Nothing to detach. */
+ SETERROR(ENODEV);
+ goto ioctl_out;
+ :: else ->
+ skip;
+ fi
+ if
+ :: !is_set(_flag, FWRITE) -> /* Semantic inconsistency */
+ SETERROR(EPERM);
+ goto ioctl_out;
+ :: else ->
+ skip;
+ fi
+ cbd_detach_backend(_dev);
+ goto ioctl_out;
+
+ :: (_cmd == CBDIO_GET_STATUS) ->
+ /* Simulate Query */
+ /* dev "handle" as proxy for "status" */
+ _data = cbd_status(_dev);
+ goto ioctl_out; /* NOP */
+ :: (_cmd == CBDIO_INVAL) -> /* Adversarial/careless user */
+ SETERROR(EINVAL);
+ goto ioctl_out;
+ :: else ->
+ skip; /* Pass through to backing device ioctl() */
+ fi
+
+ /* Anything else is passed to the backing disk */
+ ioctl_backing(backingdev, _cmd, _data, _flag);
+
+ioctl_out:
+ mutex_exit(sc_lock);
+}
+
+/* For now, we just check re-entrancy consistency.
+ * XXX: uio validation
+ */
+inline
+cbdread(_dev, _uio)
+{
+ SETERROR(0);
+
+ /* Check for backing disk */
+ mutex_enter(sc_lock);
+ if
+ :: !cbd_configured(_dev) ->
+ SETERROR(ENXIO);
+ mutex_exit(sc_lock);
+ goto cbdread_out;
+ :: !cbd_backed(_dev) ->
+ SETERROR(ENOENT);
+ mutex_exit(sc_lock);
+ goto cbdread_out;
+ :: else ->
+ skip;
+ fi
+ mutex_exit(sc_lock);
+
+ /* simulate uio -> buf translation */
+ buf_t rd_buf;
+ rd_buf.direction = B_READ;
+ rd_buf.error = 0;
+ rd_buf.length = _uio.length;
+ rd_buf.address = _uio.address;
+ rd_buf.spooled = 0;
+ cbdstrategy(rd_buf);
+
+cbdread_out:
+}
+
+inline
+cbdwrite(_dev, _uio)
+{
+ SETERROR(0);
+
+ /* Check for backing disk */
+ mutex_enter(sc_lock);
+ if
+ :: !cbd_configured(_dev) ->
+ SETERROR(ENXIO);
+ mutex_exit(sc_lock);
+ goto cbdwrite_out;
+ :: !cbd_backed(_dev) ->
+ SETERROR(ENOENT);
+ mutex_exit(sc_lock);
+ goto cbdwrite_out;
+ :: else ->
+ skip;
+ fi
+ mutex_exit(sc_lock);
+
+ /* simulate uio -> buf translation */
+ buf_t rw_buf;
+ rw_buf.direction = B_WRITE;
+ rw_buf.error = 0;
+ rw_buf.length = _uio.length;
+ rw_buf.address = _uio.address;
+ rw_buf.spooled = 0;
+
+ cbdstrategy(rw_buf);
+cbdwrite_out:
+}
+
diff -urN cbd.null/cbd.prx cbd/cbd.prx
--- cbd.null/cbd.prx 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbd.prx 2023-10-19 20:12:35.838587981 +0000
@@ -0,0 +1,218 @@
+// Spin model extractor harness written by cherry
+//
+%F linux/cbd.c
+%X -n modexbug // Somehow this is needed to activate per-func LUT
+
+// See cbdstrategy for translation table only for *function()*s
+%X -L cbdstrategy -n cbd_submit_bio
+// See cbdioctl.lut for translation table only for *function()*s
+%X -L cbdioctl -n cbd_ioctl
+
+%X -n cbd_attach_backend
+%X -n cbd_detach_backend
+%H
+// Disable effects of all included files and try to implement a subset of the APIs they provide.
+#define _LINUX_BLKDEV_H
+
+// linux/blk_types.h
+#define SECTOR_SHIFT 9
+typedef void struct block_device;
+typedef void blk_mode_t;
+
+#define __LINUX_MUTEX_H
+struct mutex {
+ void *dummy;
+};
+
+// sys/null.h
+#define NULL 0
+
+#define true 1
+#define false 0
+
+typedef int bool;
+
+/* Linuxisms */
+#define __user
+#define __init
+#define __exit
+
+#define module_init(x)
+#define module_exit(x)
+#define MODULE_LICENSE(L)
+
+%%
+//%C // c_code {}
+//%%
+//%D // c_cdecl {}
+//%%
+%L // XXX: Looks like per function tables are broken for now.
+// Common transformations
+
+NonState hidden cbd_major global
+NonState hidden exclusive global
+NonState hidden cbd_fops global
+NonState hidden exclusive global
+
+mutex_lock(... mutex_enter(sc_lock)
+mutex_unlock(... mutex_exit(sc_lock)
+
+// cbd_submit_bio(...) -> cbdstrategy(...)
+NonState hidden bio cbd_submit_bio
+NonState hidden cbp cbd_submit_bio
+NonState hidden bdev cbd_submit_bio
+NonState hidden bio_offset cbd_submit_bio
+NonState hidden bdevsz cbd_submit_bio
+
+// Gen-ed Code substitution. This is due to a lack of a sensible parser harness language with the right semantics.
+Substitute c_expr { ((bio_offset+bio->bi_iter.bi_size)>bdevsz) } ((bp.offset + bp.length) > cbdevsz)
+Substitute c_expr { (bio_op(bio)&REQ_OP_READ) } is_set(bp.direction, B_READ)
+Substitute c_code { bio_offset=((bio_offset>bdevsz) ? bdevsz : bio_offset); } bp.offset = ((bp.offset > cbdevsz) -> cbdevsz : bp.offset)
+Substitute c_code [bio] { bio->bi_iter.bi_size=(bdevsz-bio_offset); } bp.length = (cbdevsz - bp.offset)
+Substitute c_code [bio] { bio->bi_iter.bi_sector=(bio_offset>>9); } skip
+Substitute c_code [bio] { bio->bi_status=BLK_STS_IOERR; } bp.error = EIO;
+Substitute c_code [bio] { bio->bi_status=BLK_STS_OFFLINE; } bp.error = ENXIO
+
+// cbd_ioctl(...) -> cbdioctl(...)
+NonState hidden userdata cbd_ioctl
+NonState hidden arg cbd_ioctl
+NonState hidden cmd cbd_ioctl
+NonState hidden mode cbd_ioctl
+NonState hidden bdev cbd_ioctl
+NonState hidden cbp cbd_ioctl
+NonState hidden err cbd_ioctl
+
+
+Substitute c_expr { (!cbd_configured(cbp)) } cbd_configured(_dev)
+Substitute c_code { err=(-ENXIO); } SETERROR(ENXIO)
+Substitute c_expr { 222 == cmd } (_cmd == CBDIO_ATTACH_BACKEND)
+Substitute c_expr { 333 == cmd } (_cmd == CBDIO_DETACH_BACKEND)
+Substitute c_code [bdev->bd_disk && bdev] { cbp=bdev->bd_disk->private_data; } skip
+Substitute c_code { err=cbd_attach_backend(cbp,userdata); } cbd_attach_backend(_data)
+Substitute c_code { err=cbd_detach_backend(cbp); } cbd_detach_backend(_data)
+Substitute c_code [cbp && cbp->backingdev->bd_disk->fops && cbp->backingdev->bd_disk && cbp->backingdev && cbp] { err=cbp->backingdev->bd_disk->fops->ioctl(cbp->backingdev,mode,cmd,arg); } ioctl_backing(backingdev, _cmd, _data, _flag)
+Substitute c_code { err=(-ENOTTY); } skip
+
+// cbd_attach_backend(...) -> cbd_attach_backend()
+NonState hidden lockedcbp cbd_attach_backend
+NonState hidden pathname cbd_attach_backend
+NonState hidden userdata cbd_attach_backend
+NonState hidden backingdev cbd_attach_backend
+
+Substitute c_code { pathname=kzalloc(PATH_MAX,GFP_KERNEL); } skip;
+Substitute c_expr { (pathname==0) } false
+Substitute c_expr { (strncpy_from_user(pathname,(void *)userdata,PATH_MAX)==(-EFAULT)) } (_data != CBD_DEVID_BACKING); SETERROR(EINVAL)
+Substitute c_code { kfree(pathname); } skip
+Substitute c_code { backingdev=blkdev_get_by_path(strim(pathname),((BLK_OPEN_READ|BLK_OPEN_WRITE)|BLK_OPEN_EXCL),exclusive,0); } skip
+Substitute c_expr { IS_ERR(backingdev) } false
+Substitute c_code [lockedcbp] { lockedcbp->backingdev=backingdev; } backingdev = CBD_DEVID_BACKING
+Substitute c_code [lockedcbp->cbdev && lockedcbp] { set_capacity(lockedcbp->cbdev->bd_disk,bdev_nr_sectors(backingdev)); } cbdevsz = CBD_BACKING_DISKSIZE
+
+NonState hidden lockedcbp cbd_detach_backend
+Substitute c_code [lockedcbp] { blkdev_put(lockedcbp->backingdev,exclusive); } skip
+Substitute c_code [lockedcbp] { lockedcbp->backingdev=0; } backingdev = CBD_DEVID_NONE
+Substitute c_code [lockedcbp->cbdev && lockedcbp] { set_capacity(lockedcbp->cbdev->bd_disk,0); } cbdevsz = 0
+%%
+
+%P
+hidden int tmpdev; /* Saved during passthrough */
+
+inline
+cbdstrategy(bp)
+{
+#include "_modex_cbd_submit_bio.pml"
+}
+
+inline
+cbdioctl(_dev, _cmd, _data, _flag)
+{
+#include "_modex_cbd_ioctl.pml"
+}
+
+inline
+cbd_attach_backend(_data)
+{
+ SETERROR(0);
+#include "_modex_cbd_attach_backend.pml"
+}
+
+inline
+cbd_detach_backend(_data)
+{
+ SETERROR(0);
+#include "_modex_cbd_detach_backend.pml"
+}
+
+// linux/cbd.c doesn't have read/write fops, so we
+// fake it to allow cbd.drv to be parsed
+// However See: cbd.drv: -D EXPLORE_READWRITESTATE
+/* For now, we just check re-entrancy consistency.
+ * XXX: uio validation
+ */
+inline
+cbdread(_dev, _uio)
+{
+ SETERROR(0);
+
+ /* Check for backing disk */
+ mutex_enter(sc_lock);
+ if
+ :: !cbd_configured(_dev) ->
+ SETERROR(EINVAL);
+ mutex_exit(sc_lock);
+ goto cbdread_out;
+ :: !cbd_backed(_dev) ->
+ SETERROR(ENOENT);
+ mutex_exit(sc_lock);
+ goto cbdread_out;
+ :: else ->
+ skip;
+ fi
+ mutex_exit(sc_lock);
+
+ /* simulate uio -> buf translation */
+ buf_t rd_buf;
+ rd_buf.direction = B_READ;
+ rd_buf.error = 0;
+ rd_buf.length = _uio.length;
+ rd_buf.address = _uio.address;
+ rd_buf.spooled = 0;
+ cbdstrategy(rd_buf);
+
+cbdread_out:
+}
+
+inline
+cbdwrite(_dev, _uio)
+{
+ SETERROR(0);
+
+ /* Check for backing disk */
+ mutex_enter(sc_lock);
+ if
+ :: !cbd_configured(_dev) ->
+ SETERROR(EINVAL);
+ mutex_exit(sc_lock);
+ goto cbdwrite_out;
+ :: !cbd_backed(_dev) ->
+ SETERROR(ENOENT);
+ mutex_exit(sc_lock);
+ goto cbdwrite_out;
+ :: else ->
+ skip;
+ fi
+ mutex_exit(sc_lock);
+
+ /* simulate uio -> buf translation */
+ buf_t rw_buf;
+ rw_buf.direction = B_WRITE;
+ rw_buf.error = 0;
+ rw_buf.length = _uio.length;
+ rw_buf.address = _uio.address;
+ rw_buf.spooled = 0;
+
+ cbdstrategy(rw_buf);
+cbdwrite_out:
+}
+
+%%
\ No newline at end of file
diff -urN cbd.null/cbdstrategy.linux.lut cbd/cbdstrategy.linux.lut
--- cbd.null/cbdstrategy.linux.lut 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbdstrategy.linux.lut 2023-10-19 13:12:04.576568138 +0000
@@ -0,0 +1,7 @@
+// cbd_submit_bio(...) -> cbdstrategy(...)
+
+cbd_backed(cbp) cbd_backed(bp.dev)
+bio_set_dev(... tmpdev = bp.dev; bp.dev = backingdev
+submit_bio_noacct(bio) io_backing(backingdev, bp); bp.dev = tmpdev
+
+
diff -urN cbd.null/cbdstrategy.NetBSD.lut cbd/cbdstrategy.NetBSD.lut
--- cbd.null/cbdstrategy.NetBSD.lut 1970-01-01 00:00:00.000000000 +0000
+++ cbd/cbdstrategy.NetBSD.lut 2023-10-23 08:23:18.598269036 +0000
@@ -0,0 +1,10 @@
+// cbd_strategy(...) -> cbdstrategy(...)
+
+cbd_backed(sc) cbd_backed(bp.dev)
+VOP_STRATEGY(sc->sc_tvn,buf) tmpdev = bp.dev; bp.dev = backingdev; io_backing(backingdev, bp); bp.dev = tmpdev
+
+Substitute c_expr { (sc==0) } false
+
+
+
+
diff -urN cbd.null/linux/cbd.c cbd/linux/cbd.c
--- cbd.null/linux/cbd.c 1970-01-01 00:00:00.000000000 +0000
+++ cbd/linux/cbd.c 2023-10-22 21:10:18.387421555 +0000
@@ -0,0 +1,315 @@
+#include <linux/mutex.h>
+#include <linux/blkdev.h>
+
+#define CBD_ATTACH_BACKEND 222 /* XXX: fit into ioctl namespace */
+#define CBD_DETACH_BACKEND 333 /* XXX: ditto */
+
+
+/* Driver-wide values that don't change over the life of the module */
+
+/* At the moment, we assume that there's 1:1 correspondence between a
+ * driver instance (ie; module_init/exit pair) and a device
+ * instance. Thus we use cbd_major below as the primary "key" to
+ * lookup device state, via blk_get_no_open(). Note - this is not
+ * related to "minor" devices, which can still use cbd_major as
+ * above.
+ * XXX: Fix this.
+ */
+static int cbd_major = 0;
+static char *exclusive = "cbd_ioctl";
+
+struct cbd_private {
+ struct block_device *cbdev; /* Back ref for state tracking
+ * - see cbd_configured()
+ * below
+ */
+ struct block_device *backingdev;
+ struct mutex cbdlock;
+};
+
+/* Driver state enquiry */
+static bool
+cbd_configured(struct cbd_private *cbp)
+{
+ return (cbp->cbdev != NULL);
+}
+
+/* Implies attached */
+static bool
+cbd_backed(struct cbd_private *cbp)
+{
+ // XXX: check sc->sc_size == backing disk size
+ return (cbd_configured(cbp) && cbp->backingdev != NULL);
+}
+
+static int
+cbd_attach_backend(struct cbd_private *lockedcbp, void *userdata)
+{
+ char *pathname;
+ struct block_device *backingdev;
+
+ /* XXX: proper dev validation */
+ if (lockedcbp->cbdev == NULL) {
+ return -ENODEV;
+ }
+
+ pathname = kzalloc(PATH_MAX, GFP_KERNEL);
+
+ if (pathname == NULL) {
+ return -ENOMEM;
+ }
+
+ if (strncpy_from_user(pathname, (void __user *)userdata,
+ PATH_MAX) == -EFAULT) {
+ kfree(pathname);
+ return -EFAULT;
+ }
+
+ backingdev = blkdev_get_by_path(strim(pathname),
+ BLK_OPEN_READ|BLK_OPEN_WRITE|BLK_OPEN_EXCL,
+ exclusive, NULL);
+
+ kfree(pathname);
+
+ if (IS_ERR(backingdev)) {
+ return PTR_ERR(backingdev);
+ }
+
+ set_capacity(lockedcbp->cbdev->bd_disk, bdev_nr_sectors(backingdev));
+ lockedcbp->backingdev = backingdev;
+
+
+ return 0;
+}
+
+static int
+cbd_detach_backend(struct cbd_private *lockedcbp)
+{
+ /* XXX: proper dev state validity */
+ if (lockedcbp->cbdev == NULL ||
+ lockedcbp->backingdev == NULL) {
+ return -ENODEV;
+ }
+
+ blkdev_put(lockedcbp->backingdev, exclusive);
+
+ lockedcbp->backingdev = NULL;
+ set_capacity(lockedcbp->cbdev->bd_disk, 0);
+
+ return 0;
+}
+
+static int cbd_ioctl(struct block_device *bdev, blk_mode_t mode,
+ unsigned cmd, unsigned long arg)
+{
+ int err = 0;
+ void __user *userdata = (void __user *)arg;
+ struct cbd_private *cbp;
+ cbp = bdev->bd_disk->private_data;
+
+ if (cbp == NULL) {
+ err = -ENXIO;
+ goto ioctl_out;
+ }
+
+ mutex_lock(&cbp->cbdlock);
+
+ if (!cbd_configured(cbp)) {
+ err = -ENXIO;
+ goto ioctl_out;
+ }
+
+ switch(cmd) {
+ case CBD_ATTACH_BACKEND:
+ if (cbd_backed(cbp)) {
+ err = -EBUSY;
+ break;
+ }
+ if (!(mode & BLK_OPEN_WRITE)) {
+ err = -EPERM;
+ break;
+ }
+ err = cbd_attach_backend(cbp, userdata);
+ break;
+ case CBD_DETACH_BACKEND: /* XXX: */
+ if (!cbd_backed(cbp)) {
+ err = -ENODEV;
+ break;
+ }
+ if (!(mode & BLK_OPEN_WRITE)) {
+ err = -EPERM;
+ break;
+ }
+ err = cbd_detach_backend(cbp);
+ break;
+ default: /*Pass through */
+ if (cbd_backed(cbp)) {
+ err = cbp->backingdev->bd_disk->fops->
+ ioctl(cbp->backingdev, mode, cmd, arg);
+ } else {
+ err = -ENOTTY;
+ }
+ break;
+ }
+ioctl_out:
+ mutex_unlock(&cbp->cbdlock);
+ return err;
+}
+
+static void cbd_submit_bio(struct bio *bio)
+{
+ struct block_device *bdev = bio->bi_bdev;
+ struct cbd_private *cbp = bdev->bd_disk->private_data;
+
+ size_t bio_offset = bio->bi_iter.bi_sector << SECTOR_SHIFT;
+ /* Model assumes this is cached in driver instance after
+ * backing device is attached.
+ */
+ size_t bdevsz = bdev_nr_bytes(bdev);
+
+ /* sanity check bio limits */
+ if ((bio_offset + bio->bi_iter.bi_size) > bdevsz) {
+ if (bio_op(bio) & REQ_OP_READ) {
+ /* Clip bio */
+ bio_offset = ((bio_offset > bdevsz) ? bdevsz : bio_offset);
+ /* Update the bio */
+ bio->bi_iter.bi_size = bdevsz - bio_offset;
+
+ bio->bi_iter.bi_sector = (bio_offset >> SECTOR_SHIFT);
+ }
+ } else {
+ bio->bi_status = BLK_STS_NOTSUPP;
+ return;
+ }
+
+ if (cbd_backed(cbp)) {
+ mutex_lock(&cbp->cbdlock);
+ /* Pass through to backing dev */
+ bio_set_dev(bio, cbp->backingdev);
+ mutex_unlock(&cbp->cbdlock);
+ submit_bio_noacct(bio);
+ } else {
+ bio->bi_status = BLK_STS_OFFLINE;
+ }
+
+ return;
+
+}
+
+static const struct block_device_operations cbd_fops = {
+ .owner = THIS_MODULE,
+ .ioctl = cbd_ioctl,
+ .submit_bio = cbd_submit_bio,
+};
+
+
+static int __init cbd_init(void)
+{
+ int err = 0;
+ struct block_device *bdev;
+ struct gendisk *gendisk;
+ struct cbd_private *cbp;
+
+ cbp = kzalloc(sizeof(struct cbd_private),
+ GFP_KERNEL);
+
+ if (cbp == NULL) {
+ return -ENOMEM;
+ }
+
+ gendisk = blk_alloc_disk(NUMA_NO_NODE);
+
+ if (gendisk == NULL) {
+ kfree(cbp);
+ goto out;
+ }
+
+ sprintf(gendisk->disk_name, "cbd");
+ gendisk->fops = &cbd_fops;
+
+ /* Since we pass through for now. XXX: review */
+ blk_queue_flag_set(QUEUE_FLAG_SYNCHRONOUS, gendisk->queue);
+ blk_queue_flag_set(QUEUE_FLAG_NOWAIT, gendisk->queue);
+
+ gendisk->private_data = cbp;
+
+ /* Note: set_capacity() is called later when the backend is attached.
+ * XXX: Will this work ?
+ */
+ err = add_disk(gendisk);
+ if (err) {
+ put_disk(gendisk);
+ kfree(cbp);
+ goto out;
+ }
+
+ cbd_major = register_blkdev(0, "cachedrive");
+ if (cbd_major < 0) {
+ err = cbd_major;
+ put_disk(gendisk);
+ kfree(cbp);
+ goto out;
+ } else {
+ bdev = blkdev_get_by_dev(MKDEV(cbd_major, 0),
+ BLK_OPEN_READ|BLK_OPEN_WRITE, exclusive, NULL);
+ if (IS_ERR(bdev)) {
+ err = PTR_ERR(bdev);
+ put_disk(gendisk);
+ kfree(cbp);
+ goto out;
+ }
+
+ if (bdev == NULL) {
+ put_disk(gendisk);
+ kfree(cbp);
+ goto out;
+ }
+
+ bdev->bd_disk = gendisk;
+ mutex_init(&cbp->cbdlock);
+ mutex_lock(&cbp->cbdlock);
+ cbp->cbdev = bdev;
+ mutex_unlock(&cbp->cbdlock);
+ err = 0; /* Mark success */
+ }
+
+out:
+ return err;
+}
+
+static void __exit cbd_exit(void)
+{
+ struct block_device *bdev;
+ struct gendisk *gendisk;
+ struct cbd_private *cbp;
+
+ bdev = blkdev_get_by_dev(MKDEV(cbd_major, 0),
+ BLK_OPEN_READ|BLK_OPEN_WRITE, exclusive, NULL);
+
+ if (bdev == NULL) {
+ // XXX: panic() or something else ?
+ panic("can't find allocated block device major number %d\n", cbd_major);
+ }
+
+ gendisk = bdev->bd_disk;
+ cbp = gendisk->private_data;
+
+ mutex_lock(&cbp->cbdlock);
+ if (cbp->backingdev != NULL) {
+ mutex_unlock(&cbp->cbdlock);
+ cbd_detach_backend(cbp);
+ }
+ mutex_unlock(&cbp->cbdlock);
+
+ put_disk(gendisk);
+ kfree(cbp);
+ blkdev_put(bdev, exclusive);
+
+ unregister_blkdev(cbd_major, "cbd");
+}
+
+module_init(cbd_init);
+module_exit(cbd_exit);
+
+MODULE_LICENSE("GPL");
+
diff -urN cbd.null/linux/Makefile cbd/linux/Makefile
--- cbd.null/linux/Makefile 1970-01-01 00:00:00.000000000 +0000
+++ cbd/linux/Makefile 2023-10-16 09:20:14.033674649 +0000
@@ -0,0 +1,10 @@
+obj-m += cbd.o
+LINUXSRC ?= $(HOME)/linux #For dev, for now.
+# The "standard seems to be"
+#LINUXSRC?=/lib/modules/$(shell uname -r)/build
+
+all:
+ make -C $(LINUXSRC) M=$(PWD) modules
+
+clean:
+ make -C $(LINUXSRC) M=$(PWD) clean
diff -urN cbd.null/Makefile cbd/Makefile
--- cbd.null/Makefile 1970-01-01 00:00:00.000000000 +0000
+++ cbd/Makefile 2023-10-21 17:25:46.201831270 +0000
@@ -0,0 +1,96 @@
+# This set of spinroot related files were written by MattC
+# <[email protected]> in the Gregorian Calendar year AD.2023, in the month
+# of February that year.
+#
+# We have two specification files and a properties file (".inv")
+#
+# The properties file contains a "constraint" sections
+# such as ltl or never claims (either or, not both).
+#
+# The specification is divided into two files:
+# the file with suffix '.drv' is a "driver" which
+# instantiates processes that will ultimately "drive" the
+# models under test.
+# The file with the suffix '.pml' contains the process
+# model code, which, is intended to be the formal specification
+# for the code we are interested in writing in C.
+#
+# A file with a '.pmh' extension is a "header" file which
+# both the driver and the model can use.
+#
+# We process these files in slightly different ways during
+# the dev cycle, but broadly speaking, the idea is to generate
+# a file called 'spinmodel.pml' which contains the final
+# model file that is fed to spin.
+#
+# During a model extraction verification run, "modex" is run to
+# extract the 'specification' from C code written to implement
+# the model defined above. We use a 'harness' file (see file with
+# suffix '.prx' below.
+#
+# Once the harness has been run, spinmodel.pml should be
+# synthesised and processed as usual.
+#
+# The broad idea is that software development starts by writing the
+# spec first, validating the model, and then implementing the model in
+# C, after which we come back to extract the model from the C file
+# and cross check our implementation using spin.
+#
+# If things go well, the constraints specified in the '.inv' file
+# should hold exactly for both the handwritten model, and the
+# extracted one.
+
+spin-gen: cbd.pml cbd.drv cbd.inv
+ cp cbd.pml model #mimic modex - see below.
+ cat model >> spinmodel.pml;cat cbd.drv >> spinmodel.pml;cat cbd.inv >> spinmodel.pml;
+ spin -am spinmodel.pml
+ cc -DVECTORSZ=65536 -o cbd.pan pan.c
+
+all: spin-gen prog
+
+# Verification related targets.
+spin-run: spinmodel.pml
+ ./cbd.pan -a #Generate cbd.pml.trail on error
+ @test -f spinmodel.pml.trail && spin -t spinmodel.pml -p -g -l && ./cbd.pan -r spinmodel.pml.trail -g || true
+
+# Modex Extracts from C code to 'model' - see cbd.linux.prx
+modex-linux: cbd.linux.prx cbd.drv cbd.inv linux/cbd.c
+ modex -v -w cbd.linux.prx
+ echo "//Included via command line - see Makefile:modex-*:\n\n\n#include \"cbd.pmh\"\n\n" >> spinmodel.pml #include "cbd.pmh" - avoid modex cpp expansion
+ cat model >> spinmodel.pml;cat cbd.drv >> spinmodel.pml;cat cbd.inv >> spinmodel.pml;
+ spin -am spinmodel.pml #Sanity check
+ cc -DVECTORSZ=65536 -o cbd.pan pan.c
+
+# Modex Extracts from C code to 'model' - see cbd.NetBSD.prx
+modex-NetBSD: cbd.NetBSD.prx cbd.drv cbd.inv NetBSD/cbd.c
+ modex -v -w cbd.NetBSD.prx
+ echo "//Included via command line - see Makefile:modex-*:\n\n\n#include \"cbd.pmh\"\n\n" >> spinmodel.pml #include "cbd.pmh" - avoid modex cpp expansion
+ cat model >> spinmodel.pml;cat cbd.drv >> spinmodel.pml;cat cbd.inv >> spinmodel.pml;
+ spin -am spinmodel.pml #Sanity check
+ cc -DVECTORSZ=65536 -o cbd.pan pan.c
+
+# Housekeeping
+modex-gen-clean:
+ rm -f spinmodel.pml # Our consolidated model file
+ rm -f _spin_nvr.tmp # Never claim file
+ rm -f model # modex generated intermediate "model" file
+ rm -f pan.* # Spin generated source files
+ rm -f _modex* # modex generated script files
+ rm -f linux/cbd.[IM]
+ rm -f NetBSD/cbd.[IM]
+
+prog-clean:
+ rm -f cbd
+
+spin-run-clean:
+ rm -f spinmodel.pml.trail
+
+spin-gen-clean:
+ rm -f cbd.pan # model executables.
+ rm -f spinmodel.pml # Consolidated model source files
+ rm -f _spin_nvr.tmp # Never claim file
+ rm -f model # Intermediate "model" file
+ rm -f pan.* # Spin generated source files
+
+clean: modex-gen-clean spin-gen-clean spin-run-clean prog-clean
+ rm -f *~
diff -urN cbd.null/NetBSD/cbd.c cbd/NetBSD/cbd.c
--- cbd.null/NetBSD/cbd.c 1970-01-01 00:00:00.000000000 +0000
+++ cbd/NetBSD/cbd.c 2023-10-23 06:37:45.843939111 +0000
@@ -0,0 +1,588 @@
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/buf.h>
+#include <sys/bufq.h>
+#include <sys/conf.h>
+#include <sys/disk.h>
+#include <sys/disklabel.h>
+#include <sys/errno.h>
+
+#include <uvm/uvm_prot.h>
+#include <sys/acl.h>
+#include <sys/vnode_if.h>
+#include <sys/vnode.h>
+
+#include <sys/module.h>
+
+#include <sys/fcntl.h>
+
+#define CBD_ATTACH_BACKEND 222 /* XXX: fit into ioctl namespace */
+#define CBD_DETACH_BACKEND 333 /* XXX: ditto */
+
+static void cbd_attach(device_t, device_t, void *);
+static int cbd_detach(device_t, int);
+
+static dev_type_open(cbd_open);
+static dev_type_close(cbd_close);
+static dev_type_read(cbd_read);
+static dev_type_write(cbd_write);
+static dev_type_ioctl(cbd_ioctl);
+static dev_type_strategy(cbd_strategy);
+static dev_type_size(cbd_size);
+
+struct cbd_ioctl {
+ const char *diskpath; /* Path to backing block disk dev node */
+ struct lwp *l; /* Userland caller lwp */
+};
+
+const struct dkdriver cbddkdriver = {
+ .d_open = cbd_open,
+ .d_close = cbd_close,
+ .d_strategy = cbd_strategy,
+ .d_minphys = minphys,
+};
+
+struct cbd_softc {
+ device_t sc_dev; /* Back ref to autoconf dev ptr. */
+ struct disk sc_dk; /* generic disk information */
+ kmutex_t sc_lock; /* Protects access to instance of this struct. */
+ struct bufq_state *sc_buflist; /* transaction request/reply queue */
+ struct vnode *sc_tvn; /* backing device
+ * vnode */
+ int sc_tvn_bsize; /* Backing device size, which
+ * is also the size of our
+ * device
+ */
+ struct kauth_cred *sc_tvn_l_cred; /* Creds used with sc_tvn */
+};
+
+const struct bdevsw cbd_bdevsw = {
+ .d_open = cbd_open,
+ .d_close = cbd_close,
+ .d_strategy = cbd_strategy,
+ .d_ioctl = cbd_ioctl,
+ .d_dump = nodump,
+ .d_psize = cbd_size,
+ .d_discard = nodiscard,
+ .d_flag = D_DISK | D_MPSAFE
+};
+
+const struct cdevsw cbd_cdevsw = {
+ .d_open = cbd_open,
+ .d_close = cbd_close,
+ .d_read = cbd_read,
+ .d_write = cbd_write,
+ .d_ioctl = cbd_ioctl,
+ .d_stop = nostop,
+ .d_tty = notty,
+ .d_poll = nopoll,
+ .d_mmap = nommap,
+ .d_kqfilter = nokqfilter,
+ .d_discard = nodiscard,
+ .d_flag = D_DISK | D_MPSAFE
+};
+
+CFATTACH_DECL3_NEW(cbd, sizeof(struct cbd_softc),
+ 0, cbd_attach, cbd_detach, NULL, NULL, NULL, DVF_DETACH_SHUTDOWN);
+
+#ifdef _MODULE
+MODULE(MODULE_CLASS_DRIVER, cbd, NULL);
+CFDRIVER_DECL(cbd, DV_DISK, NULL);
+
+
+
+static int
+cbd_modcmd(modcmd_t cmd, void *arg)
+{
+ devmajor_t bmajor = -1, cmajor = -1;
+
+ int error = 0;
+ switch (cmd) {
+ case MODULE_CMD_INIT:
+
+ error = devsw_attach("cbd", &cbd_bdevsw, &bmajor,
+ &cbd_cdevsw, &cmajor);
+
+ if (error) {
+ break;
+ }
+
+ error = config_cfdriver_attach(&cbd_cd);
+
+ if (error) {
+ devsw_detach(&cbd_bdevsw, &cbd_cdevsw);
+ break;
+ }
+
+ error = config_cfattach_attach(cbd_cd.cd_name, &cbd_ca);
+ if (error) {
+ config_cfdriver_detach(&cbd_cd);
+ devsw_detach(&cbd_bdevsw, &cbd_cdevsw);
+ break;
+ }
+
+ break;
+
+ case MODULE_CMD_FINI:
+ error = config_cfattach_detach(cbd_cd.cd_name, &cbd_ca);
+ if (error) {
+ break;
+ }
+ error = config_cfdriver_detach(&cbd_cd);
+ if (error) {
+ break;
+ }
+
+ devsw_detach(&cbd_bdevsw, &cbd_cdevsw);
+ break;
+ case MODULE_CMD_STAT:
+ error = ENOTTY;
+ break;
+
+ default:
+ error = ENOTTY;
+ break;
+ }
+
+ return error;
+
+}
+
+
+#endif
+
+static void cbd_attach(device_t, device_t, void *);
+static int cbd_detach(device_t, int);
+static bool
+is_read(struct buf *buf)
+{
+ return ((buf->b_flags & B_READ) == B_READ);
+}
+
+#if notyet
+static bool
+is_write(struct buf *buf)
+{
+ return ((buf->b_flags & B_WRITE) == B_WRITE);
+}
+#endif
+
+/*
+ * The idea here is that the state sequentially progresses
+ * forwards as configuration completes.
+ *
+ * So for eg: a device that has a caching device setup, implies it is
+ * already cbd_backed(), but not vice-versa, because an unfronted
+ * backed device is merely an uncached "passthrough" situation.
+ */
+
+static bool
+cbd_configured(struct cbd_softc *sc)
+{
+ /*
+ * Note: device_t is a pointer, no specific "NULL" typedef
+ * available
+ */
+ return (sc->sc_dev != NULL);
+}
+
+/* Implies attached */
+static bool
+cbd_backed(struct cbd_softc *sc)
+{
+ // XXX: check sc->sc_size == backing disk size
+ return (cbd_configured(sc) && sc->sc_tvn != NULL);
+}
+
+static void
+cbd_attach(device_t parent, device_t self, void *aux)
+{
+ struct cbd_softc *sc = device_private(self);
+
+ mutex_init(&sc->sc_lock, MUTEX_DEFAULT, IPL_NONE);
+
+ mutex_enter(&sc->sc_lock);
+
+ if (cbd_configured(sc)) { /* Sorry, busy */
+ mutex_exit(&sc->sc_lock);
+ }
+
+ sc->sc_dev = self;
+
+ /* Initialize and attach the disk structure. */
+ disk_init(&sc->sc_dk, device_xname(self), &cbddkdriver);
+ disk_attach(&sc->sc_dk);
+
+ bufq_alloc(&sc->sc_buflist, "fcfs", 0);
+
+ mutex_exit(&sc->sc_lock);
+
+ if (!pmf_device_register(self, NULL, NULL))
+ aprint_error_dev(self, "couldn't establish power handler\n");
+}
+
+static int
+cbd_detach(device_t self, int flags)
+{
+ struct cbd_softc *sc = device_private(self);
+
+ mutex_enter(&sc->sc_lock);
+
+ if (!cbd_configured(sc)) {
+ mutex_exit(&sc->sc_lock);
+ return ENXIO;
+ }
+
+ /* We assume the !(flags & DETACH_FORCE) case here. */
+ if (((flags & DETACH_FORCE) == 0) &&
+ (bufq_peek(sc->sc_buflist) != NULL)) {
+ mutex_exit(&sc->sc_lock);
+ return EBUSY; /* There are pending transactions */
+ } else { /* DETACH_FORCE */
+ bufq_drain(sc->sc_buflist);
+ //XXX: Tell backing device to drain/flush etc ?
+ }
+
+ // XXX: detach backed/fronted devices ?
+ // quiesce everything first, then sc->sc_tvn = NULL; ?
+ // All threads need to cease related activity.
+
+ mutex_exit(&sc->sc_lock);
+
+ pmf_device_deregister(self);
+ bufq_free(sc->sc_buflist);
+ disk_detach(&sc->sc_dk);
+
+ mutex_destroy(&sc->sc_lock);
+
+ return 0;
+}
+
+static int
+cbd_open(dev_t dev, int flag, int ifmt, struct lwp *l)
+{
+ /* Sanity check params */
+ switch(ifmt) {
+ case S_IFCHR:
+ case S_IFBLK:
+ break;
+ default:
+ return EINVAL;
+ }
+
+ int unit;
+ struct cbd_softc *sc;
+
+ unit = DISKUNIT(dev);
+ sc = device_lookup_private(&cbd_cd, unit);
+
+ /*
+ * Trying to open unconfigured dev.
+ * cbdconfig(8) should be run first.
+ */
+ if (sc == NULL) {
+ return ENXIO;
+ }
+
+ mutex_enter(&sc->sc_lock);
+
+ if (!cbd_backed(sc)) {
+ mutex_exit(&sc->sc_lock);
+ return ENXIO;
+ }
+
+ mutex_exit(&sc->sc_lock);
+
+ return 0;
+}
+
+
+static int
+cbd_close(dev_t dev, int flag, int ifmt, struct lwp *l)
+{
+ /* Sanity check params */
+ switch(ifmt) {
+ case S_IFCHR:
+ case S_IFBLK:
+ break;
+ default:
+ return EINVAL;
+ }
+
+ int unit;
+ struct cbd_softc *sc;
+
+ unit = DISKUNIT(dev);
+ sc = device_lookup_private(&cbd_cd, unit);
+
+ /*
+ * Trying to close unconfigured dev.
+ * cbdconfig(8) should be run first.
+ */
+ if (sc == NULL) {
+ return ENXIO;
+ }
+
+ mutex_enter(&sc->sc_lock);
+
+ if (!cbd_backed(sc)) {
+ mutex_exit(&sc->sc_lock);
+ return ENXIO;
+ }
+
+ mutex_exit(&sc->sc_lock);
+ return 0;
+}
+
+
+static void
+cbd_strategy(struct buf *buf)
+{
+
+ dev_t dev = buf->b_dev;
+
+ int unit;
+ struct cbd_softc *sc;
+
+ unit = DISKUNIT(dev);
+ sc = device_lookup_private(&cbd_cd, unit);
+
+ if (sc == NULL) {
+ return;
+ }
+
+ /* Sanity check buffer */
+ if ((dbtob(buf->b_blkno) + buf->b_bcount) > sc->sc_tvn_bsize) {
+ if (BUF_ISREAD(buf)) {
+ /* Clip offset */
+ buf->b_blkno = (dbtob(buf->b_blkno) > sc->sc_tvn_bsize) ? btodb(sc->sc_tvn_bsize) : buf->b_blkno;
+ /* Update the request length */
+ buf->b_bcount = (sc->sc_tvn_bsize - dbtob(buf->b_blkno));
+ } else {
+ buf->b_error = EINVAL;
+ biodone(buf);
+ return;
+ }
+ }
+
+
+ if (cbd_backed(sc)) {
+ mutex_enter(&sc->sc_lock);
+ disk_busy(&sc->sc_dk);
+ VOP_STRATEGY(sc->sc_tvn, buf);
+ disk_unbusy(&sc->sc_dk, buf->b_bcount, is_read(buf));
+ mutex_exit(&sc->sc_lock);
+ } else {
+ buf->b_error = ENXIO;
+ biodone(buf);
+ return;
+ }
+
+ /* We assume that the backing driver has frobbed buf
+ * appropriately, for return;
+ */
+}
+
+static int
+cbd_attach_backend(struct cbd_softc *lockedsc, void *data)
+{
+ struct cbd_ioctl *cbdioctl = data;
+
+ int error = 0;
+ struct pathbuf *pbuf;
+ struct vnode *vp;
+
+ uint64_t numsec;
+ unsigned int secsz;
+
+ /* XXX: proper dev validation */
+ if (lockedsc->sc_dev == NULL) {
+ return -ENODEV;
+ }
+
+ error = pathbuf_copyin(cbdioctl->diskpath, &pbuf);
+ if (error != 0) {
+ return error;
+ }
+
+ error = vn_bdev_openpath(pbuf, &vp, cbdioctl->l);
+ pathbuf_destroy(pbuf);
+ if (error != 0) {
+ return error;
+ }
+
+ error = getdisksize(vp, &numsec, &secsz);
+ if (error != 0) {
+ vn_close(vp, /* See vn_bdev_openpath() */ FREAD | FWRITE, cbdioctl->l->l_cred);
+ return error;
+ }
+
+ lockedsc->sc_tvn = vp;
+ lockedsc->sc_tvn_l_cred = cbdioctl->l->l_cred;
+ lockedsc->sc_tvn_bsize = numsec * secsz;
+
+ return error;
+}
+
+static int
+cbd_detach_backend(struct cbd_softc *lockedsc)
+{
+ /* XXX: proper dev state validity */
+ if (lockedsc->sc_dev == NULL ||
+ lockedsc->sc_tvn == NULL ||
+ lockedsc->sc_tvn_bsize == 0 ||
+ lockedsc->sc_tvn_l_cred == NULL) {
+ return -ENODEV;
+ }
+
+ vn_close(lockedsc->sc_tvn, /* See vn_bdev_openpath() */ FREAD | FWRITE,
+ lockedsc->sc_tvn_l_cred);
+
+ lockedsc->sc_tvn_bsize = 0;
+ lockedsc->sc_tvn_l_cred = NULL;
+ lockedsc->sc_tvn = NULL; /* reset */
+
+ return 0;
+}
+
+static int
+cbd_ioctl(dev_t dev, u_long cmd, void *data, int flag, struct lwp *l)
+{
+ int unit, error = 0;
+ struct cbd_softc *sc;
+ struct cbd_ioctl cbdioctlargs;
+
+ unit = DISKUNIT(dev);
+ sc = device_lookup_private(&cbd_cd, unit);
+
+ if (sc == NULL) {
+ return ENXIO;
+ }
+
+ mutex_enter(&sc->sc_lock);
+
+ if (!cbd_configured(sc)) {
+ mutex_exit(&sc->sc_lock);
+ return ENXIO;
+ }
+
+ switch(cmd) {
+ case CBD_ATTACH_BACKEND:
+ if (cbd_backed(sc)) {
+ error = EBUSY;
+ break;
+ }
+ if (!(flag & FWRITE)) {
+ error = EPERM;
+ }
+ cbdioctlargs.diskpath = data;
+ cbdioctlargs.l = l;
+ error = cbd_attach_backend(sc, &cbdioctlargs);
+ break;
+ case CBD_DETACH_BACKEND:
+ if (!cbd_backed(sc)) {
+ error = ENODEV;
+ break;
+ }
+ if (!(flag & FWRITE)) {
+ error = EPERM;
+ break;
+ }
+ error = cbd_detach_backend(sc);
+ break;
+
+ /* default: Pass through */
+ }
+
+ error = VOP_IOCTL(sc->sc_tvn, cmd, data, flag, l->l_cred);
+
+ mutex_exit(&sc->sc_lock);
+
+ return error;
+}
+
+static int
+cbd_read(dev_t dev, struct uio *uio, int flags)
+{
+ int unit;
+ struct cbd_softc *sc;
+
+ unit = DISKUNIT(dev);
+ sc = device_lookup_private(&cbd_cd, unit);
+
+ if (sc == NULL) {
+ return ENXIO;
+ }
+
+ mutex_enter(&sc->sc_lock);
+
+ if (!cbd_backed(sc)) {
+ mutex_exit(&sc->sc_lock);
+ return ENXIO;
+ }
+
+ mutex_exit(&sc->sc_lock);
+
+ return physio(cbd_strategy, NULL, dev, B_READ, minphys, uio);
+}
+
+static int
+cbd_write(dev_t dev, struct uio *uio, int flags)
+{
+ int unit;
+ struct cbd_softc *sc;
+
+ unit = DISKUNIT(dev);
+ sc = device_lookup_private(&cbd_cd, unit);
+
+ /*
+ * Trying to close unconfigured dev.
+ * cbdconfig(8) should be run first.
+ */
+ if (sc == NULL) {
+ return ENXIO;
+ }
+
+ mutex_enter(&sc->sc_lock);
+
+ if (!cbd_backed(sc)) {
+ mutex_exit(&sc->sc_lock);
+ return ENXIO;
+ }
+
+ mutex_exit(&sc->sc_lock);
+
+ return physio(cbd_strategy, NULL, dev, B_WRITE, minphys, uio);
+}
+
+
+static int
+cbd_size(dev_t dev)
+{
+ int unit;
+ struct cbd_softc *sc;
+
+ unit = DISKUNIT(dev);
+ sc = device_lookup_private(&cbd_cd, unit);
+
+ /*
+ * Trying to close unconfigured dev.
+ * cbdconfig(8) should be run first.
+ */
+ if (sc == NULL) {
+ return ENXIO;
+ }
+
+ mutex_enter(&sc->sc_lock);
+
+ if (!cbd_backed(sc)) {
+ mutex_exit(&sc->sc_lock);
+ return ENXIO;
+ }
+
+ // XXPass through.
+
+ mutex_exit(&sc->sc_lock);
+
+ return -1; // XXX:
+}
diff -urN cbd.null/NetBSD/cbd.ioconf cbd/NetBSD/cbd.ioconf
--- cbd.null/NetBSD/cbd.ioconf 1970-01-01 00:00:00.000000000 +0000
+++ cbd/NetBSD/cbd.ioconf 2023-10-20 15:40:16.888208026 +0000
@@ -0,0 +1,8 @@
+# $NetBSD$
+
+ioconf cbd
+
+include "conf/files"
+
+pseudo-device cbd
+
diff -urN cbd.null/NetBSD/Makefile cbd/NetBSD/Makefile
--- cbd.null/NetBSD/Makefile 1970-01-01 00:00:00.000000000 +0000
+++ cbd/NetBSD/Makefile 2023-10-20 15:41:52.334710787 +0000
@@ -0,0 +1,10 @@
+# $NetBSD$
+# Please build using crossbuild nbmake-$ARCH
+
+.include "${NETBSDSRCDIR}/sys/modules/Makefile.inc"
+
+KMOD= cbd
+IOCONF= cbd.ioconf
+SRCS= cbd.c
+
+.include <bsd.kmodule.mk>


2023-10-30 10:48:40

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: [RFC+Patch] Formal models as source of truth for Software Architects.

Hi Mathew,

I'm not a kernel hacker, but your mail still floated into my inbox
(probably because you used a combination of "memory" and "model"
somewhere throughout your text).

The company I currently work at is applying formal methods at various
places on production code.

A few years ago we also built systems to extract models from C code (for
TLA+ back then), but those systems are now totally unused (and
unmaintained) due to poor scalability of exploratory state-based model
checking.

And then there's the issue that many engineers don't know how to write
reasonable temporal assertions, especially not ones that can give
sufficient guarantees to be worth the extra effort and maintenance.

If we use a simplified model of what the system should be doing to
specify, then usually we are not looking for equivalence but only for
uni-directional simulation: the spec usually needs to have more degrees
of freedom than the implementation, which is just a single implementation.

What has worked well for us is anything that works

- directly on the code

- doesn't require learning any kind of math to find bugs with

- is scalable enough to find bugs on real code

- doesn't require a lot of extra effort from the engineers (compared to
the extra assurance you get)

- doesn't need to be adapted whenever the code changes


I find in software, usually blueprint and skyscraper are in some sense
the same thing.
Or in some sense, the skyscraper is extracted automatically by a
compiler/interpreter and installation system - we never actually dig
anything ourselves, but have the ability to experiment directly on the
blueprint as we build it.

That's because languages already try to provide us reasonable
abstraction mechanisms by which we can simplify the code and make it
more "model-like".

So writing the blueprint twice - once in some modelling language and
once in a language meant for execution - doesn't really provide a lot of
benefit.
But it requires doing changes at two levels whenever the software design
changes. It just violates DRY.

What makes sense in our experience is identifying properties you would
like the system to have and devising a specific checking mechanism for that.
For example, you might specify the protocol of how the device
communicates with the system and what assumptions it has (like "don't
start a second command while the first one hasn't finished"), and then
check that the driver code obeys this policy.
Then you can change the software code without touching the protocol
specification, and just re-verify.

In the easiest case, the protocol can be a mock device process that is
inserted via a form of dependency injection to talk to the real driver
code during verification, and if the code isn't too complex you can
exhaustively check several test cases.


Only if the code is highly unreadable for specific reasons like
performance or talking to legacy ABIs, it can make sense to have an
"executable comment" that implements the same functional behavior in a
very concise, readable way (with crud performance/mock APIs replacing
the actual behavior), and check that the real code and the "executable
comment" do "the same thing".


The main issues I see when we look at it through the skyscraper analogy
is that usually people in other engineering disciplines have CAD tools -
backed by very powerful computers - to really systematically and
thoroughly check the design.
What we usually have is stress testing and some test suites built by
hand that miss many cases.


Anyways, that's my (limited) experience.

Hope it helps,

jonas


Am 10/23/2023 um 5:53 PM schrieb Mathew, Cherry G.*:
> Hi Hackers,
>
> I wanted to share with you some work I've been doing for a couple of
> months with the spin verifier and OS C code. [1]
>
> Although not directly and immediately relevant to current kernel
> development, this may be relevant to folks thinking about architecture
> and future directions in managing design complexity.
>
> As I was working on plumbing the ARC code [1] to an actual block driver,
> it dawned on me that there may be additional advantages to writing code
> based on formal models, other than just consistency verification of
> existing production code. So while I'm getting ahead of myself, let me
> re-start in a reasonable middle of this narrative, and describe how I
> got here.
>
> Intro:
> Software must be the only "Engineering" discipline, where the equivalent
> of drawing up an architectural masterplan before you build a skyscraper
> in software, is to start by getting some shovels and start digging the
> basement. While there are a few efforts driven by industry ("inspired"
> by regulation, possibly) [2] [3], I really couldn't find any open source
> attempts at ongoing, continuously integrated work that is in lockstep
> with production sourcecode, with the exception of academic work in
> kernel code verification, eg: [4], or much smaller codebases such as
> seL4 [5]. (Please feel free to point me at everything I've surely
> missed).
>
> I think we're thinking about the problem from the wrong end. Inspired by
> "Use the Source, Luke", I realised that perhaps we should approach the
> problem of systems software development using the old unix/engineering
> principles of modularity and clean interfaces.
>
> "D3" - Design Driven Development:
> Let's say we want to develop a new block driver interface that "front"s
> an existing block device, say for eg: as a first step to implementing a
> cache of some sort. In addition, let's assume that we've been given the
> requirement that this driver needs to be available on several OSs. As a
> software architect, how do you put something concrete down on paper,
> which serves as the "Source of Truth", before hammering on the actual
> drivers ?
>
> I propose, that a formal model, specified using a modelling language can
> serve this purpose. However, a formal model alone, seems to be
> insufficient. Most formal models seem to exist in a sort of echo chamber
> of their own, where the model sits aside from actual production code,
> or the other extreme, where implementation code itself serves as the
> source of truth, instead of a more abstract formal spec.
>
> While looking for a verifier tool that addresses this problem of the
> "gap" between a model and the implementation code written to express
> that model, I came across the model extractor from the spin
> verifier. [6]. Modex really blew my mind. For the first time, I could
> write a complete loop starting from:
>
> a) A Formal model to represent concurrent processes and the joint
> evolution of their state.
> b) Invariants expressed in linear temporal logic, to reason about bounds
> within which this model should operate.
> c) An implementation in a programming language based on eyeballed model
> code from "a)" above.
> d) A model extractor that could parse the above implementation code, and
> pull it back into a formal model representation such as in "a)"
> above.
> e) And finally the ability to apply invariants in "b)" , originally
> meant for the abstract model in "a)" over the extracted model from
> "d)".
>
> If there were inconsistencies in step "e)", while none in "b)", this
> could mean that either the extractor hadn't pulled out the right model,
> or the implementation had strayed from that specified in "a)"
> above. This would then require further eyeballing, patching, and
> repeating steps "c)" -> "e)" until the extracted model "d)" and the hand
> crafted model "a)" satisfied the invariants in "b)".
>
> The key assumption here (which I have no way of proving is true) is that
> the satisfaction of invariants in "b)" above, could stand in place for
> equivalence of model and code. Thus we have a development process that
> I'm calling "Design Driven Development" (or "D3" in short form) which
> allows for iterative tightening up of implemented code to a formal
> specification.
>
> Discussion:
> If one were a software architect in a larger project, one could just
> focus on the correctness of one's model, while software engineers could
> focus on implementing the model faithfully, using the D3 loop above to
> ensure fidelity.
>
> I believe this is really powerful - as I'm new to this area, I would
> love to hear from folks that have used this or analogous techniques in
> industry or open-source (ideally both), to get a sense of where my
> current thinking is, with respect to the industry.
>
> Please find attached, a long-ish patch that is heavily WIP. It attempts
> to model a "generic" block driver in spin's promela language. The model
> is in cbd.pm[hl] , the invariants are in cbd.inv. The implementation of
> the model for linux is in linux/cbd.c ( you can use the Makefile to
> compile it as an actual linux module). The implementation of the model
> for NetBSD is in NetBSD/cbd.c (ditto - instructions in respective
> Makefile). Note that I haven't actually tested the driver - it will most
> likely crash, as I haven't really looked at the engineering side of
> things yet - but thought I should publish this right now, to get early
> feedback, and also give a sense of why actual CI testing is still needed
> to ensure that the final implementation plumbing is right.
>
> The "extraction" is specified in a "harness" defined in files with
> extensions "*.prx", suplimented by LUTs to help translate C code back to
> the extracted model. This is where the age of spin/modex becomes evident
> - clearly the extraction language needs to be a lot more expressive (I
> see it analagous to the expressivity in a regex machine, vs. a
> parser). This is an area I'd love to see more work on, to the point
> where we have a tool (if not already available) that can enable
> something like D3 as the gold standard for software quality and
> reliability.
>
> For architects, it takes off the burden of having to rely
> on their memory and random tests and bespoke models to keep state. For
> engineers, it provides a clear "source of truth" and a process based
> mechanism to verify their implementation based on this source. And for
> project managers, there's much better visibility over the entire QA,
> analogous to but much more powerful than a simple TDD style development
> process.
>
> Thank you for your attention - I hope to hear back from you - do please
> Cc: me if you're replying to list. Especially if you actively work on
> formally verified infrastructure code where QA is a regulatory need
> (avionics ? MISRA-C related constraints ? healthcare ?)
>
> Patch inlined below. To try it out, you need to have 'spin' installed
> using your favourite package manager, and modex [6] compiled , with both
> binaries in $PATH.
>
> Then you can try things out:
>
> 1) Deploy sources to local directory: $ patch -tp1 < cbd.diff
>
> 2) Check out the model. $ make spin-gen; less spinmodel.pml
>
> 3) Try to verify the model (see various commented out #defines in
> spinmodel.pml if you want to spread your wings.):
> $ make spin-run
>
> 4) Try to extract the model from linux/cbd.c:
> $ make clean modex-linux; less spinmodel.pml
> You can try to verify this extracted spinmodel.pml via 3) above.
>
> 5) Try to extract the model from NetBSD/cbd.c:
> $ make clean modex-NetBSD; less spinmodel.pml
> You can try to verify this extracted spinmodel.pml via 3) above.
>
> Looking forward!
>
> Many Thanks,
>
> MattC (/~cherry)
>
> [1]https://mail-index.netbsd.org/tech-kern/2023/09/28/msg029203.html
> [2]https://en.wikipedia.org/wiki/Model-based_systems_engineering
> [3]https://en.wikipedia.org/wiki/Model-driven_engineering
> [4]https://unsat.cs.washington.edu/papers/nelson-hyperkernel.pdf
> [5]https://sel4.systems/
> [6]https://github.com/nimble-code/Modex
>
>
> diff -urN cbd.null/cbd.drv cbd/cbd.drv
> --- cbd.null/cbd.drv 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbd.drv 2023-10-19 17:56:08.023048865 +0000
> @@ -0,0 +1,299 @@
> +/* Spin model driver for NetBSD cbd(4) cbd.c */
> +
> +/*
> + * Copyright 2023, Mathew, Cherry G.<[email protected]>
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * 1. Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in the
> + * documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include "cbd.pmh"
> +
> +/* We drive the routines concurrently */
> +#define NPROCS 2
> +#define index_pid (_pid - 1)
> +
> +/* Per-device instance state. Currently we only have one instance */
> +bit sc_lock; /* Driver instance lock */
> +unsigned cbdevsz : DEVSZBITS; /* Same as size of the backing block dev */
> +
> +unsigned cbdev : DEVIDBITS; /* Our Device "handle" */
> +unsigned backingdev : DEVIDBITS; /* Backing device "handle" */
> +unsigned cachingdev : DEVIDBITS; /* Caching device "handle" */
> +
> +/* Aggregated traffic stats (per-device) */
> +unsigned cbdbytesread : DEVSZBITS;
> +unsigned cbdbyteswritten : DEVSZBITS;;
> +
> +/*
> + * Helper functions - better as inline because:
> + * - spin parser doesn't care about location vs. preprocessor does.
> + * - trace output looks better
> + */
> +hidden int error[NPROCS]; /* Error indicator for last op */
> +
> +inline SETERROR(_errno)
> +{
> + error[index_pid] = _errno;
> +}
> +
> +
> +inline
> +mutex_enter(_mutex)
> +{
> + atomic {
> + (_mutex == 0) -> _mutex = 1;
> + }
> +}
> +
> +inline
> +mutex_exit(_mutex)
> +{
> + atomic {
> + (_mutex == 1) -> _mutex = 0;
> + }
> +}
> +
> +/* Set state to "Configured", so that ioctl() can work */
> +inline
> +cbd_init()
> +{
> + if
> + :: cbd_unconfigured() ->
> + cbdev = CBD_DEVID_CBD1;
> + :: else -> SETERROR(EBUSY);
> + skip;
> + fi
> +}
> +
> +/* Undo init() */
> +inline
> +cbd_fini()
> +{
> + do
> + :: cbd_backed(cbdev) ->
> + mutex_enter(sc_lock);
> + cbd_detach_backend(CBD_DEVID_BACKING);
> + mutex_exit(sc_lock);
> + :: cbd_configured(cbdev) ->
> + mutex_enter(sc_lock);
> + if
> + :: cbd_backed(cbdev) -> /* Lost race. Retry */
> + mutex_exit(sc_lock);
> + :: else ->
> + cbdev = CBD_DEVID_NONE;
> + mutex_exit(sc_lock);
> + break;
> + fi
> + :: else -> skip; /* Retry until things quiesce */
> + od
> +}
> +
> +inline
> +io_backing(backingdev, bp)
> +{
> + /* Aggregate i/o data */
> + if
> + :: is_set(bp.direction, B_READ) ->
> + /* Increment bytes read */
> + cbdbytesread = cbdbytesread + bp.spooled;
> + :: else -> /* Assume B_WRITE */
> + /* Increment bytes written */
> + cbdbyteswritten = cbdbyteswritten + bp.spooled;
> + fi
> +
> +}
> +
> +inline
> +ioctl_backing(_dev, _cmd, _data, _flag)
> +{
> + /* Wouldn't have got here without specific intention.
> + * Therefore we just make sure we're in the right state.
> + */
> + SETERROR((cbd_backed(_dev) -> 0 : ENXIO));
> +}
> +
> +inline
> +reset_cbdstate()
> +{
> + sc_lock = 0;
> + cbdbytesread = 0;
> + cbdbyteswritten = 0;
> + cbdev = CBD_DEVID_NONE;
> + cbdevsz = 0;
> + backingdev = CBD_DEVID_NONE;
> + cachingdev = CBD_DEVID_NONE;
> +}
> +
> +bit strategy = 0;
> +bit ioctl = 0;
> +bit configured = 0;
> +bit rdwr = 0;
> +
> +inline
> +drive_cbd(_dev, _flags, _buffer,
> + _cmd, _data)
> +{
> + /* Start with state that covers max code */
> + _dev = CBD_DEVID_CBD1;
> + set_flag(_flags, FREAD | FWRITE);
> +
> + _buffer.error = 0;
> + _buffer.address = 0;
> + _buffer.offset = 0;
> + _buffer.length = CBD_BACKING_DISKSIZE; /* Try moving the whole buffer */
> + _buffer.spooled = 0;
> + _buffer.dev = CBD_DEVID_NONE;
> + set_flag(_buffer.direction, B_READ);
> +
> + _cmd = CBDIO_ATTACH_BACKEND;
> + _data = CBD_DEVID_BACKING;
> +
> + do
> + ::
> + if
> + :: (!configured) ->
> + cbd_init();
> + configured = 1;
> + :: (ioctl) ->
> + cbdioctl(_dev, _cmd, _data, _flags);
> + if
> +//#define EXPLORE_IOCTL
> +#ifdef EXPLORE_IOCTL
> + :: (_dev > CBD_DEVID_NONE) -> _dev--;
> + :: (_dev < CBD_DEVID_CACHING) -> _dev++;
> +#endif
> + :: (_cmd > CBDIO_INVAL) -> _cmd--;
> + :: (_cmd < CBDIO_GET_STATUS) -> _cmd++;
> + :: (_data > CBD_DEVID_NONE) -> _data--;
> + :: (_data < CBD_DEVID_CACHING) -> _data++;
> + :: is_set(_flags, FREAD) -> clear_flag(_flags, FREAD); set_flag(_flags, FWRITE);
> + :: is_set(_flags, FWRITE) -> clear_flag(_flags, FWRITE); set_flag(_flags, FREAD);
> + :: else -> set_flag(_flags, FREAD);
> + fi
> + ioctl = 0;
> +
> + :: (!ioctl) -> ioctl = 1;
> + :: (strategy) ->
> + mutex_enter(sc_lock);
> + cbdstrategy(_buffer);
> + mutex_exit(sc_lock);
> +//#define EXPLORE_STRATEGY
> +#ifdef EXPLORE_STRATEGY
> + if
> + :: is_set(_buffer.direction, B_WRITE) ->
> + clear_flag(_buffer.direction, B_WRITE);
> + set_flag(_buffer.direction, B_READ);
> + :: is_set(_buffer.direction, B_READ) ->
> + clear_flag(_buffer.direction, B_READ);
> + set_flag(_buffer.direction, B_WRITE);
> + :: else ->
> + set_flag(_buffer.direction, B_READ);
> + fi
> + strategy = 0;
> +# endif
> + :: (!strategy) -> strategy = 1;
> +//#define EXPLORE_BUFFERSTATE
> +#ifdef EXPLORE_BUFFERSTATE
> + :: (_buffer.offset < (CBD_BACKING_DISKSIZE + 1)) ->
> + _buffer.offset++;
> + :: (_buffer.length < (CBD_BACKING_DISKSIZE + 2)) ->
> + _buffer.length++;
> + :: (_buffer.address < VAMAX) -> _buffer.address++;
> + :: (_buffer.spooled < (CBD_BACKING_DISKSIZE + 2)) ->
> + _buffer.spooled++;
> + :: (_buffer.offset > 0) -> _buffer.length--;
> + :: (_buffer.length > 0) -> _buffer.length--;
> + :: (_buffer.spooled > 0) -> _buffer.spooled--;
> + :: (_buffer.address > 0) -> _buffer.address--;
> +#endif
> +//#define EXPLORE_READWRITESTATE
> +#ifdef EXPLORE_READWRITESTATE
> + :: (rdwr) ->
> + cbdread(_dev, _buffer);
> + cbdwrite(_dev, _buffer);
> + rdwr = 0;
> + :: (!rdwr) -> rdwr = 1;
> +
> + :: (configured) ->
> + cbd_fini();
> + configured = 0;
> +#endif
> + :: else ->
> + break;
> + fi
> +
> + od
> +}
> +
> +/* Per proc data to drive the state machine */
> +typedef user_vars_t {
> + unsigned dev : DEVIDBITS;
> + unsigned flags : FLAGBITS;
> + buf_t buffer;
> + unsigned cmd : IOCMDBITS;
> +
> + /* IRL, this would be eg: the backing device node path.
> + * For now we just pass a fake DEVIDXXX "handle" that
> + * represents the backing device.
> + */
> +
> + unsigned data : DEVIDBITS;
> +};
> +
> +/* Drive functions through various state variations */
> +proctype
> +p_drive_cbd()
> +{
> + /* Setup initial per-proc caller values. These will get
> + * iterated in the inline callees below below
> + */
> + user_vars_t user_vars[NPROCS];
> +
> + drive_cbd(user_vars[index_pid].dev, user_vars[index_pid].flags,
> + user_vars[index_pid].buffer, user_vars[index_pid].cmd,
> + user_vars[index_pid].data);
> +
> +}
> +
> +/* Drive the procs */
> +init {
> +
> + /* Initialise Device params */
> + reset_cbdstate();
> +
> + pid proc;
> +
> + /* Spin up NPROCS concurrent processes */
> + atomic {
> + proc = 0;
> + do
> + :: proc < NPROCS ->
> + run p_drive_cbd();
> + proc++
> + :: else ->
> + break
> + od
> + }
> +
> +}
> diff -urN cbd.null/cbd.inv cbd/cbd.inv
> --- cbd.null/cbd.inv 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbd.inv 2023-10-18 17:08:10.842776509 +0000
> @@ -0,0 +1,49 @@
> +/* Spin process invariants for NetBSD cbd(4) cbd.c */
> +
> +/*
> + * Copyright 2023, MattC<[email protected]>
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * 1. Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in the
> + * documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +ltl
> +{
> +/* Checks while outside critical section - ie; kernel state is stable. */
> + always ((sc_lock == 0) implies (
> + /* Device state basic consistency */
> + (cbd_configured(cbdev) implies
> + !cbd_unconfigured()) &&
> + /* Active ioctl passthrough implies configured device */
> + (cbd_backed(cbdev) implies
> + cbd_configured(cbdev)) &&
> + /* There will be I/O eventually */
> + (eventually always (cbdbytesread + cbdbyteswritten) > 0)
> + ) /* (sc_lock == 0) */
> + ) /* always */ //&&
> +// always ((sc_lock == 1) implies (
> + /* Lock must always be released */
> +// (eventually (sc_lock == 0))
> +// ) /* (sc_lock == 1) */
> +// ) /* always */ // &&
> +}
> diff -urN cbd.null/cbdioctl.linux.lut cbd/cbdioctl.linux.lut
> --- cbd.null/cbdioctl.linux.lut 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbdioctl.linux.lut 2023-10-23 08:14:11.902411553 +0000
> @@ -0,0 +1,15 @@
> +// cbd_ioctl(...) -> cbdioctl(...)
> +
> +cbd_backed(cbp) cbd_backed(_dev)
> +
> +Substitute c_expr { 222 == cmd } (_cmd == CBDIO_ATTACH_BACKEND)
> +Substitute c_expr { 333 == cmd } (_cmd == CBDIO_DETACH_BACKEND)
> +
> +Substitute c_expr { (cbp==0) } false
> +Substitute c_expr { (!cbd_configured(cbp)) } !cbd_configured(_dev)
> +Substitute c_expr { (!cbd_backed(cbp)) } !cbd_backed(_dev)
> +Substitute c_code { err=(-ENXIO); } SETERROR(ENXIO)
> +Substitute c_code { err=(-EBUSY); } SETERROR(EBUSY)
> +Substitute c_code { err=(-ENODEV); } SETERROR(ENODEV)
> +Substitute c_code { err=(-EPERM); } SETERROR(EPERM)
> +Substitute c_code { err=(-ENOTTY); } skip
> diff -urN cbd.null/cbdioctl.NetBSD.lut cbd/cbdioctl.NetBSD.lut
> --- cbd.null/cbdioctl.NetBSD.lut 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbdioctl.NetBSD.lut 2023-10-23 08:23:23.430620615 +0000
> @@ -0,0 +1,16 @@
> +// cbd_ioctl(...) -> cbdioctl(...)
> +
> +cbd_backed(sc) cbd_backed(_dev)
> +
> +Substitute c_expr { (sc==0) } false
> +Substitute c_expr { 222 == cmd } (_cmd == CBDIO_ATTACH_BACKEND)
> +Substitute c_expr { 333 == cmd } (_cmd == CBDIO_DETACH_BACKEND)
> +Substitute c_code { error=EBUSY; } SETERROR(EBUSY)
> +Substitute c_code { error=ENODEV; } SETERROR(ENODEV)
> +Substitute c_code { error=EPERM; } SETERROR(EPERM)
> +
> +Substitute c_code { error=cbd_attach_backend(sc,(&(cbdioctlargs))); } cbd_attach_backend(_dev, _data)
> +Substitute c_code { error=cbd_detach_backend(sc); } cbd_detach_backend(_data)
> +Substitute c_expr { (!cbd_backed(sc)) } !cbd_backed(_dev)
> +
> +
> diff -urN cbd.null/cbdattachbackend.linux.lut cbd/cbdattachbackend.linux.lut
> --- cbd.null/cbdattachbackend.linux.lut 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbdattachbackend.linux.lut 2023-10-23 08:12:51.340517105 +0000
> @@ -0,0 +1 @@
> +Substitute c_expr { IS_ERR(backingdev) } false
> diff -urN cbd.null/cbdattachbackend.NetBSD.lut cbd/cbdattachbackend.NetBSD.lut
> --- cbd.null/cbdattachbackend.NetBSD.lut 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbdattachbackend.NetBSD.lut 2023-10-23 06:56:21.559519110 +0000
> @@ -0,0 +1,2 @@
> +// Hard hammer - XXX: Find a better way to handle error ?
> +Substitute c_expr { (error!=0) } false
> diff -urN cbd.null/cbd.linux.prx cbd/cbd.linux.prx
> --- cbd.null/cbd.linux.prx 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbd.linux.prx 2023-10-23 08:16:16.155489259 +0000
> @@ -0,0 +1,217 @@
> +// Spin model extractor harness for linux/cbd.c written by cherry
> +//
> +%F linux/cbd.c
> +%X -n modexbug // Somehow this is needed to activate per-func LUT
> +%X -L cbdstrategy.linux -n cbd_submit_bio
> +%X -L cbdioctl.linux -n cbd_ioctl
> +%X -L cbdattachbackend.linux -n cbd_attach_backend
> +%X -n cbd_detach_backend
> +%H
> +// Disable effects of all included files and try to implement a subset of the APIs they provide.
> +#define _LINUX_BLKDEV_H
> +
> +// linux/blk_types.h
> +#define SECTOR_SHIFT 9
> +typedef void struct block_device;
> +typedef void blk_mode_t;
> +
> +#define __LINUX_MUTEX_H
> +struct mutex {
> + void *dummy;
> +};
> +
> +// sys/null.h
> +#define NULL 0
> +
> +#define true 1
> +#define false 0
> +
> +typedef int bool;
> +
> +/* Linuxisms */
> +#define __user
> +#define __init
> +#define __exit
> +
> +#define module_init(x)
> +#define module_exit(x)
> +#define MODULE_LICENSE(L)
> +
> +%%
> +//%C // c_code {}
> +//%%
> +//%D // c_cdecl {}
> +//%%
> +%L // XXX: Looks like per function tables are broken for now.
> +// Common transformations
> +
> +NonState hidden cbd_major global
> +NonState hidden exclusive global
> +NonState hidden cbd_fops global
> +NonState hidden exclusive global
> +
> +mutex_lock(... mutex_enter(sc_lock)
> +mutex_unlock(... mutex_exit(sc_lock)
> +
> +// cbd_submit_bio(...) -> cbdstrategy(...)
> +// See cbdstrategy.linux.lut broad substs which need scope limiting.
> +NonState hidden bio cbd_submit_bio
> +NonState hidden cbp cbd_submit_bio
> +NonState hidden bdev cbd_submit_bio
> +NonState hidden bio_offset cbd_submit_bio
> +NonState hidden bdevsz cbd_submit_bio
> +
> +// Gen-ed Code substitution. This is due to a lack of a sensible parser harness language with the right semantics.
> +Substitute c_expr { ((bio_offset+bio->bi_iter.bi_size)>bdevsz) } ((bp.offset + bp.length) > cbdevsz)
> +Substitute c_expr { (bio_op(bio)&REQ_OP_READ) } is_set(bp.direction, B_READ)
> +Substitute c_code { bio_offset=((bio_offset>bdevsz) ? bdevsz : bio_offset); } bp.offset = ((bp.offset > cbdevsz) -> cbdevsz : bp.offset)
> +Substitute c_code [bio] { bio->bi_iter.bi_size=(bdevsz-bio_offset); } bp.length = (cbdevsz - bp.offset)
> +Substitute c_code [bio] { bio->bi_iter.bi_sector=(bio_offset>>9); } skip
> +Substitute c_code [bio] { bio->bi_status=BLK_STS_NOTSUPP; } bp.error = EINVAL;
> +Substitute c_code [bio] { bio->bi_status=BLK_STS_OFFLINE; } bp.error = ENXIO
> +
> +// cbd_ioctl(...) -> cbdioctl(...)
> +// See cbdioctl.linux.lut broad substs which need scope limiting.
> +NonState hidden userdata cbd_ioctl
> +NonState hidden arg cbd_ioctl
> +NonState hidden cmd cbd_ioctl
> +NonState hidden mode cbd_ioctl
> +NonState hidden bdev cbd_ioctl
> +NonState hidden cbp cbd_ioctl
> +NonState hidden err cbd_ioctl
> +
> +
> +Substitute c_expr { (!(mode&BLK_OPEN_WRITE)) } !is_set(_flag, FWRITE)
> +Substitute c_code [bdev->bd_disk && bdev] { cbp=bdev->bd_disk->private_data; } skip
> +Substitute c_code { err=cbd_attach_backend(cbp,userdata); } cbd_attach_backend(_dev,_data)
> +Substitute c_code { err=cbd_detach_backend(cbp); } cbd_detach_backend(_data)
> +Substitute c_code [cbp && cbp->backingdev->bd_disk->fops && cbp->backingdev->bd_disk && cbp->backingdev && cbp] { err=cbp->backingdev->bd_disk->fops->ioctl(cbp->backingdev,mode,cmd,arg); } ioctl_backing(backingdev, _cmd, _data, _flag)
> +
> +// cbd_attach_backend(...) -> cbd_attach_backend()
> +// See cbdattachbackend.linux.lut broad substs which need scope limiting.
> +NonState hidden lockedcbp cbd_attach_backend
> +NonState hidden pathname cbd_attach_backend
> +NonState hidden userdata cbd_attach_backend
> +NonState hidden backingdev cbd_attach_backend
> +
> +Substitute c_expr { (lockedcbp->cbdev==0) } (_dev != CBD_DEVID_CBD1)-> SETERROR(ENODEV)
> +Substitute c_code { pathname=kzalloc(PATH_MAX,GFP_KERNEL); } skip;
> +Substitute c_expr { (pathname==0) } false
> +Substitute c_expr { (strncpy_from_user(pathname,(void *)userdata,PATH_MAX)==(-EFAULT)) } (_data != CBD_DEVID_BACKING); SETERROR(EINVAL)
> +Substitute c_code { kfree(pathname); } skip
> +Substitute c_code { backingdev=blkdev_get_by_path(strim(pathname),((BLK_OPEN_READ|BLK_OPEN_WRITE)|BLK_OPEN_EXCL),exclusive,0); } skip
> +Substitute c_code [lockedcbp] { lockedcbp->backingdev=backingdev; } backingdev = CBD_DEVID_BACKING
> +Substitute c_code [lockedcbp->cbdev && lockedcbp] { set_capacity(lockedcbp->cbdev->bd_disk,bdev_nr_sectors(backingdev)); } cbdevsz = CBD_BACKING_DISKSIZE
> +
> +// cbd_detach_backend(...) -> cbd_detach_backend()
> +NonState hidden lockedcbp cbd_detach_backend
> +
> +Substitute c_expr { ((lockedcbp->cbdev==0)||(lockedcbp->backingdev==0)) } (_dev != CBD_DEVID_CBD1 || backingdev != CBD_DEVID_BACKING) -> SETERROR(ENODEV)
> +Substitute c_code [lockedcbp] { blkdev_put(lockedcbp->backingdev,exclusive); } skip
> +Substitute c_code [lockedcbp] { lockedcbp->backingdev=0; } backingdev = CBD_DEVID_NONE
> +Substitute c_code [lockedcbp->cbdev && lockedcbp] { set_capacity(lockedcbp->cbdev->bd_disk,0); } cbdevsz = 0
> +%%
> +
> +%P
> +hidden int tmpdev; /* Saved during passthrough */
> +
> +inline
> +cbdstrategy(bp)
> +{
> + #include "_modex_cbd_submit_bio.pml"
> +}
> +
> +inline
> +cbdioctl(_dev, _cmd, _data, _flag)
> +{
> + SETERROR(0); /* default */
> + #include "_modex_cbd_ioctl.pml"
> +}
> +
> +inline
> +cbd_attach_backend(_dev, _data)
> +{
> + SETERROR(0);
> +#include "_modex_cbd_attach_backend.pml"
> +}
> +
> +inline
> +cbd_detach_backend(_dev)
> +{
> + SETERROR(0);
> +#include "_modex_cbd_detach_backend.pml"
> +}
> +
> +// linux/cbd.c doesn't have read/write fops, so we
> +// fake it to allow cbd.drv to be parsed
> +// However See: cbd.drv: -D EXPLORE_READWRITESTATE
> +/* For now, we just check re-entrancy consistency.
> + * XXX: uio validation
> + */
> +inline
> +cbdread(_dev, _uio)
> +{
> + SETERROR(0);
> +
> + /* Check for backing disk */
> + mutex_enter(sc_lock);
> + if
> + :: !cbd_configured(_dev) ->
> + SETERROR(EINVAL);
> + mutex_exit(sc_lock);
> + goto cbdread_out;
> + :: !cbd_backed(_dev) ->
> + SETERROR(ENOENT);
> + mutex_exit(sc_lock);
> + goto cbdread_out;
> + :: else ->
> + skip;
> + fi
> + mutex_exit(sc_lock);
> +
> + /* simulate uio -> buf translation */
> + buf_t rd_buf;
> + rd_buf.direction = B_READ;
> + rd_buf.error = 0;
> + rd_buf.length = _uio.length;
> + rd_buf.address = _uio.address;
> + rd_buf.spooled = 0;
> + cbdstrategy(rd_buf);
> +
> +cbdread_out:
> +}
> +
> +inline
> +cbdwrite(_dev, _uio)
> +{
> + SETERROR(0);
> +
> + /* Check for backing disk */
> + mutex_enter(sc_lock);
> + if
> + :: !cbd_configured(_dev) ->
> + SETERROR(EINVAL);
> + mutex_exit(sc_lock);
> + goto cbdwrite_out;
> + :: !cbd_backed(_dev) ->
> + SETERROR(ENOENT);
> + mutex_exit(sc_lock);
> + goto cbdwrite_out;
> + :: else ->
> + skip;
> + fi
> + mutex_exit(sc_lock);
> +
> + /* simulate uio -> buf translation */
> + buf_t rw_buf;
> + rw_buf.direction = B_WRITE;
> + rw_buf.error = 0;
> + rw_buf.length = _uio.length;
> + rw_buf.address = _uio.address;
> + rw_buf.spooled = 0;
> +
> + cbdstrategy(rw_buf);
> +cbdwrite_out:
> +}
> +
> +%%
> \ No newline at end of file
> diff -urN cbd.null/cbd.NetBSD.prx cbd/cbd.NetBSD.prx
> --- cbd.null/cbd.NetBSD.prx 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbd.NetBSD.prx 2023-10-23 08:20:37.150517408 +0000
> @@ -0,0 +1,232 @@
> +// Spin model extractor harness for NetBSD/cbd.c written by cherry
> +//
> +%F NetBSD/cbd.c
> +%X -n modexbug // Somehow this is needed to activate per-func LUT
> +%X -L cbdstrategy.NetBSD -n cbd_strategy
> +%X -L cbdioctl.NetBSD -n cbd_ioctl
> +%X -L cbdattachbackend.NetBSD -n cbd_attach_backend
> +%X -n cbd_detach_backend
> +%H
> +// Disable effects of all included files and try to implement a subset of the APIs they provide.
> +
> +#define _SYS_TYPES_H_
> +#define _SYS_STAT_H_
> +#define _SYS_BUF_H
> +#define _SYS_BUFQ_H
> +#define _SYS_CONF_H
> +#define _SYS_DISK_H
> +#define _SYS_DISKLABEL_H
> +#define _SYS_ERRNO_H
> +#define _UVM_PROT_
> +#define _SYS_ACL_H
> +#define _SYS_VNODE_IF_H_
> +#define _SYS_VNODE_H_
> +
> +typedef void device_t;
> +//typedef void *kmutex_t; //Kludgy hack, see below
> +#define kmutex_t (void *)
> +typedef void dev_t;
> +typedef void u_long;
> +typedef void buf_t;
> +typedef void uint64_t;
> +
> +/* Override switch arrays */
> +void cbd_bdevsw;
> +void cbd_cdevsw;
> +void cbddkdriver;
> +
> +#define CFATTACH_DECL3_NEW(_ign1, _ign2, _ign3, _ign4, _ign5, _ign6, _ign7, _ign8, _ign9)
> +
> +// sys/null.h
> +#define NULL 0
> +
> +#define true 1
> +#define false 0
> +
> +typedef int bool;
> +
> +#define disk_busy(_dk)
> +#define disk_unbusy(_dk, _count, _flags)
> +
> +#define biodone(_bp)
> +
> +%%
> +//%C // c_code {}
> +//%%
> +//%D // c_cdecl {}
> +//%%
> +%L // XXX: Looks like per function tables are broken for now.
> +// Common transformations
> +
> +NonState hidden cbd_major global
> +NonState hidden cbd_cdevsw global
> +NonState hidden sc_lock global
> +
> +mutex_enter(... mutex_enter(sc_lock)
> +mutex_exit(... mutex_exit(sc_lock)
> +
> +Substitute c_code { unit=DISKUNIT(dev); } skip
> +Substitute c_code { sc=device_lookup_private((&(cbd_cd)),unit); } skip
> +
> +// cbd_strategy(...) -> cbdstrategy(...)
> +NonState hidden sc cbd_strategy
> +NonState hidden buf cbd_strategy
> +NonState hidden dev cbd_strategy
> +NonState hidden unit cbd_strategy
> +
> +// Gen-ed Code substitution. This is due to a lack of a sensible parser harness language with the right semantics.
> +Substitute c_expr { ((dbtob(buf->b_blkno)+buf->b_bcount)>sc->sc_tvn_bsize) } ((bp.offset + bp.length) > cbdevsz)
> +Substitute c_expr { BUF_ISREAD(buf) } is_set(bp.direction, B_READ)
> +Substitute c_code [sc && sc && buf && buf] { buf->b_blkno=((dbtob(buf->b_blkno)>sc->sc_tvn_bsize) ? F: btodb(sc->sc_tvn_bsize) : bufb_blkno); } bp.offset = ((bp.offset > cbdevsz) -> cbdevsz : bp.offset)
> +Substitute c_code [buf && sc && buf] { buf->b_bcount=(sc->sc_tvn_bsize-dbtob(buf->b_blkno)); } bp.length = (cbdevsz - bp.offset)
> +Substitute c_code [buf] { buf->b_error=EINVAL; } bp.error = EINVAL
> +Substitute c_code [buf] { buf->b_error=ENXIO; } bp.error = ENXIO
> +
> +// cbd_ioctl(...) -> cbdioctl(...)
> +NonState hidden sc cbd_ioctl
> +NonState hidden dev cbd_ioctl
> +NonState hidden cmd cbd_ioctl
> +NonState hidden data cbd_ioctl
> +NonState hidden flag cbd_ioctl
> +NonState hidden l cbd_ioctl
> +NonState hidden unit cbd_ioctl
> +NonState hidden error cbd_ioctl
> +NonState hidden cbdioctlargs cbd_ioctl
> +
> +Substitute c_code { cbdioctlargs.diskpath=data; } skip
> +Substitute c_code { cbdioctlargs.l=l; } skip
> +Substitute c_expr { (!cbd_configured(sc)) } !cbd_configured(_dev) -> SETERROR(ENXIO)
> +Substitute c_expr { (!(flag&FWRITE)) } !is_set(_flag, FWRITE)
> +Substitute c_code [l && sc] { error=VOP_IOCTL(sc->sc_tvn,cmd,data,flag,l->l_cred); } ioctl_backing(backingdev, _cmd, _data, _flag)
> +
> +// cbd_attach_backend(...) -> cbd_attach_backend()
> +NonState hidden lockedsc cbd_attach_backend
> +NonState hidden cbdioctl cbd_attach_backend
> +NonState hidden data cbd_attach_backend
> +NonState hidden pbuf cbd_attach_backend
> +NonState hidden error cbd_attach_backend
> +NonState hidden vp cbd_attach_backend
> +NonState hidden numsec cbd_attach_backend
> +NonState hidden secsz cbd_attach_backend
> +
> +Substitute c_expr { (lockedsc->sc_dev==0) } (_dev != CBD_DEVID_CBD1)-> SETERROR(ENODEV)
> +Substitute c_code [cbdioctl] { error=pathbuf_copyin(cbdioctl->diskpath,(&(pbuf))); } skip
> +Substitute c_code [cbdioctl] { error=vn_bdev_openpath(pbuf,(&(vp)),cbdioctl->l); } if :: (_data != CBD_DEVID_BACKING) -> SETERROR(ENODEV) :: else; fi
> +Substitute c_code { pathbuf_destroy(pbuf); } skip
> +Substitute c_code { error=getdisksize(vp,(&(numsec)),(&(secsz))); } skip
> +Substitute c_code [lockedsc] { lockedsc->sc_tvn=vp; } backingdev = CBD_DEVID_BACKING
> +Substitute c_code [lockedsc] { lockedsc->sc_tvn_bsize=(numsec*secsz); } cbdevsz = CBD_BACKING_DISKSIZE
> +Substitute c_code [cbdioctl->l && cbdioctl && lockedsc] { lockedsc->sc_tvn_l_cred=cbdioctl->l->l_cred; } skip
> +
> +
> +// cbd_detach_backend(...) -> cbd_detach_backend()
> +NonState hidden lockedsc cbd_detach_backend
> +
> +Substitute c_expr { ((((lockedsc->sc_dev==0)||(lockedsc->sc_tvn==0))||(lockedsc->sc_tvn_bsize==0))||(lockedsc->sc_tvn_l_cred==0)) } (_dev != CBD_DEVID_CBD1 || backingdev != CBD_DEVID_BACKING) -> SETERROR(ENODEV);
> +Substitute c_code [lockedsc && lockedsc] { vn_close(lockedsc->sc_tvn,(FREAD|FWRITE),lockedsc->sc_tvn_l_cred); } skip
> +Substitute c_code [lockedsc] { lockedsc->sc_tvn_bsize=0; } cbdevsz = 0
> +Substitute c_code [lockedsc] { lockedsc->sc_tvn_l_cred=0; } skip
> +Substitute c_code [lockedsc] { lockedsc->sc_tvn=0; } backingdev = CBD_DEVID_NONE
> +%%
> +
> +%P
> +hidden int tmpdev; /* Saved during passthrough */
> +
> +inline
> +cbdstrategy(bp)
> +{
> + #include "_modex_cbd_strategy.pml"
> +}
> +
> +inline
> +cbdioctl(_dev, _cmd, _data, _flag)
> +{
> + SETERROR(0); /* default */
> + #include "_modex_cbd_ioctl.pml"
> +}
> +
> +inline
> +cbd_attach_backend(_dev, _data)
> +{
> + SETERROR(0);
> +#include "_modex_cbd_attach_backend.pml"
> +}
> +
> +inline
> +cbd_detach_backend(_dev)
> +{
> + SETERROR(0);
> +#include "_modex_cbd_detach_backend.pml"
> +}
> +
> +// XXX: please extract below functions.
> +/* For now, we just check re-entrancy consistency.
> + * XXX: uio validation
> + */
> +inline
> +cbdread(_dev, _uio)
> +{
> + SETERROR(0);
> +
> + /* Check for backing disk */
> + mutex_enter(sc_lock);
> + if
> + :: !cbd_configured(_dev) ->
> + SETERROR(EINVAL);
> + mutex_exit(sc_lock);
> + goto cbdread_out;
> + :: !cbd_backed(_dev) ->
> + SETERROR(ENOENT);
> + mutex_exit(sc_lock);
> + goto cbdread_out;
> + :: else ->
> + skip;
> + fi
> + mutex_exit(sc_lock);
> +
> + /* simulate uio -> buf translation */
> + buf_t rd_buf;
> + rd_buf.direction = B_READ;
> + rd_buf.error = 0;
> + rd_buf.length = _uio.length;
> + rd_buf.address = _uio.address;
> + rd_buf.spooled = 0;
> + cbdstrategy(rd_buf);
> +
> +cbdread_out:
> +}
> +
> +inline
> +cbdwrite(_dev, _uio)
> +{
> + SETERROR(0);
> +
> + /* Check for backing disk */
> + mutex_enter(sc_lock);
> + if
> + :: !cbd_configured(_dev) ->
> + SETERROR(EINVAL);
> + mutex_exit(sc_lock);
> + goto cbdwrite_out;
> + :: !cbd_backed(_dev) ->
> + SETERROR(ENOENT);
> + mutex_exit(sc_lock);
> + goto cbdwrite_out;
> + :: else ->
> + skip;
> + fi
> + mutex_exit(sc_lock);
> +
> + /* simulate uio -> buf translation */
> + buf_t rw_buf;
> + rw_buf.direction = B_WRITE;
> + rw_buf.error = 0;
> + rw_buf.length = _uio.length;
> + rw_buf.address = _uio.address;
> + rw_buf.spooled = 0;
> +
> + cbdstrategy(rw_buf);
> +cbdwrite_out:
> +}
> +
> +%%
> \ No newline at end of file
> diff -urN cbd.null/cbd.pmh cbd/cbd.pmh
> --- cbd.null/cbd.pmh 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbd.pmh 2023-10-22 19:48:38.564639937 +0000
> @@ -0,0 +1,120 @@
> +/*
> + * Copyright 2023, MattC<[email protected]>
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * 1. Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in the
> + * documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _CBD_PMH_
> +#define _CBD_PMH_
> +
> +#define ERRBITS 6 /* Add as list below grows */
> +
> +/* Modelled on NetBSD sys/errno.h */
> +#define EPERM 1 /* Operation not permitted */
> +#define ENOENT 2 /* No such file or directory */
> +#define EIO 5 /* Input/output error */
> +#define ENXIO 6 /* Device not configured */
> +#define EBADF 9 /* Bad file descriptor */
> +#define EBUSY 16 /* Device busy */
> +#define ENODEV 19 /* Operation not supported by device */
> +#define EINVAL 22 /* Invalid argument */
> +
> +#define DEVSZBITS 10 /* Max block device log2(sz) */
> +#define DEVIDBITS 4 /* Add more as list below grows */
> +
> +/* handle IDs for "device"s */
> +#define CBD_DEVID_NONE 0
> +#define CBD_DEVID_CBD1 1
> +#define CBD_DEVID_BACKING 2
> +#define CBD_DEVID_CACHING 3
> +
> +/* "State" values, to report current states, on query */
> +#define CBD_UNCONFIGURED 0
> +#define CBD_CONFIGURED 1
> +#define CBD_BACKED 2
> +#define CBD_CACHED 3
> +
> +#define FLAGBITS 4
> +/* Buffer direction flag (mutually exclusive) */
> +#define B_READ 1 /* Read flag. */
> +#define B_WRITE 2 /* Write flag. */
> +
> +/* dev read()/write(). Write flag imples Read flag */
> +#define FREAD 1
> +#define FWRITE 3
> +
> +inline set_flag(_flags, _FLAG) { _flags = (_flags | _FLAG) }
> +inline clear_flag(_flags, _FLAG) { _flags = _flags & ~_FLAG }
> +#define is_set(_flags, _FLAG) ((_flags & _FLAG) && 1)
> +
> +#define IOCMDBITS 4
> +
> +/* ioctl related command codes */
> +#define CBDIO_INVAL 0
> +#define CBDIO_ATTACH_BACKEND 1
> +#define CBDIO_DETACH_BACKEND 2
> +#define CBDIO_GET_STATUS 3
> +
> +#define VABITS DEVSZBITS
> +#define VAMAX (1 << (VABITS - 1)) /* A boundary for a "valid" address */
> +#define is_valid_address(_addr) (_addr <= VAMAX)
> +
> +#define CBD_BACKING_DISKSIZE (VAMAX)
> +
> +/* Device state queries */
> +#define cbd_unconfigured() (cbdev == CBD_DEVID_NONE)
> +
> +/* In a multi-device scenario (out of scope of this
> + * model), this would be a search for the device matching
> + * the arg _dev.
> + */
> +
> +#define cbd_configured(_dev) ((cbdev == CBD_DEVID_CBD1) && (_dev == cbdev))
> +
> +#define cbd_backed(_dev) (cbd_configured(_dev) && (backingdev == CBD_DEVID_BACKING))
> +
> +#define cbd_cached(_dev) (cbd_backed() && cachingdev == CBD_DEVID_CACHING)
> +
> +#define cbd_status(_dev) \
> + ((_dev == CBD_DEVID_NONE) -> CBD_UNCONFIGURED : \
> + ((cbdev == CBD_DEVID_CBD1) -> \
> + ((backingdev == CBD_DEVID_BACKING) -> \
> + ((cachingdev == CBD_DEVID_CACHING) -> \
> + CBD_CACHED : CBD_BACKED) : \
> + CBD_CONFIGURED) : \
> + CBD_UNCONFIGURED))
> +
> +
> +typedef buf_t {
> + unsigned error : ERRBITS; /* error flagged during processing */
> + unsigned address : VABITS; /* address representation */
> + unsigned offset : DEVSZBITS; /* Offset into device */
> + unsigned length : DEVSZBITS; /* total length of buffer */
> + unsigned spooled : DEVSZBITS; /* length processed so far */
> + unsigned direction : FLAGBITS; /* whether callee is to read from or write to */
> + unsigned dev : DEVIDBITS; /* Device involved */
> +};
> +
> +
> +#endif /* _CBD_PMH_ */
> \ No newline at end of file
> diff -urN cbd.null/cbd.pml cbd/cbd.pml
> --- cbd.null/cbd.pml 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbd.pml 2023-10-22 21:00:12.814145286 +0000
> @@ -0,0 +1,261 @@
> +/* Spin process models for NetBSD cbd(4) cbd.c */
> +
> +/*
> + * Copyright 2023, MattC<[email protected]>
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * 1. Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in the
> + * documentation and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include "cbd.pmh"
> +
> +hidden int tmpdev; /* Saved during passthrough */
> +inline
> +cbdstrategy(/* buf_t */ bp)
> +{
> +
> +
> + /* sanity check buffer */
> +
> + if
> + :: ((bp.offset + bp.length) > cbdevsz) ->
> + if
> + :: is_set(bp.direction, B_READ) ->
> + /* Clip the requested length.
> + * Total length effectively read will be in
> + * bp.spooled
> + */
> + bp.offset = ((bp.offset > cbdevsz) -> cbdevsz :
> + bp.offset);
> + bp.length = (cbdevsz - bp.offset);
> +
> + skip;
> +
> + :: is_set(bp.direction, B_WRITE) ->
> + bp.error = EINVAL;
> + fi
> + goto strategy_out;
> + :: else ->
> + skip;
> + fi
> +
> + if /* Always check if device is still active before acting */
> + :: cbd_backed(bp.dev) ->
> + tmpdev = bp.dev;
> + mutex_enter(sc_lock);
> + bp.dev = backingdev; /* Set for callee */
> + io_backing(backingdev, bp);
> + bp.dev = tmpdev; /* Reset for caller */
> + mutex_exit(sc_lock);
> + :: else -> /* Fallback - don't block */
> + bp.error = ENXIO;
> + goto strategy_out;
> + fi
> +strategy_out:
> +
> +}
> +
> +/* Attach backing disk device specified in data */
> +
> +inline
> +cbd_attach_backend(_dev, _data)
> +{
> + SETERROR(0);
> +
> + /* Simulate data validation */
> + if
> + :: (_dev != CBD_DEVID_CBD1 || /* Device validity */
> + _data != CBD_DEVID_BACKING) ->
> + SETERROR(ENODEV);
> + goto attach_out;
> + :: else ->
> + /* IRL, this would come from the backing dev */
> + cbdevsz = CBD_BACKING_DISKSIZE;
> + backingdev = _data;
> + fi
> +
> +attach_out:
> +}
> +
> +/* Detach backing disk device specified in data */
> +inline
> +cbd_detach_backend(_dev)
> +{
> + SETERROR(0);
> +
> + /* Simulate dev state validation */
> +
> + if
> + :: (_dev != CBD_DEVID_CBD1 ||
> + backingdev != CBD_DEVID_BACKING) ->
> + SETERROR(ENODEV);
> + goto detach_out;
> + :: else ->
> + backingdev = CBD_DEVID_NONE;
> + cbdevsz = 0;
> + fi
> +
> +detach_out:
> +}
> +
> +inline
> +cbdioctl(_dev, _cmd, _data, _flag)
> +{
> + SETERROR(0); /* default */
> +
> + mutex_enter(sc_lock);
> + if
> + :: !cbd_configured(_dev) ->
> + SETERROR(ENXIO);
> + goto ioctl_out; /* Invalid device ID */
> + :: else ->
> + skip;
> + fi
> +
> + if
> + :: (_cmd == CBDIO_ATTACH_BACKEND) ->
> + /* Simulate attaching a backing device */
> + if
> + :: cbd_backed(_dev) -> /* Already backed. */
> + SETERROR(EBUSY); /* return EBUSY */
> + goto ioctl_out;
> + :: else ->
> + skip;
> + fi
> + if
> + :: !is_set(_flag, FWRITE) -> /* Semantic inconsistency */
> + SETERROR(EPERM);
> + goto ioctl_out;
> + :: else ->
> + skip;
> + fi
> + cbd_attach_backend(_dev, _data);
> + goto ioctl_out;
> +
> + :: (_cmd == CBDIO_DETACH_BACKEND) ->
> + /* Simulate detaching a backing device */
> + if
> + :: !cbd_backed(_dev) -> /* Nothing to detach. */
> + SETERROR(ENODEV);
> + goto ioctl_out;
> + :: else ->
> + skip;
> + fi
> + if
> + :: !is_set(_flag, FWRITE) -> /* Semantic inconsistency */
> + SETERROR(EPERM);
> + goto ioctl_out;
> + :: else ->
> + skip;
> + fi
> + cbd_detach_backend(_dev);
> + goto ioctl_out;
> +
> + :: (_cmd == CBDIO_GET_STATUS) ->
> + /* Simulate Query */
> + /* dev "handle" as proxy for "status" */
> + _data = cbd_status(_dev);
> + goto ioctl_out; /* NOP */
> + :: (_cmd == CBDIO_INVAL) -> /* Adversarial/careless user */
> + SETERROR(EINVAL);
> + goto ioctl_out;
> + :: else ->
> + skip; /* Pass through to backing device ioctl() */
> + fi
> +
> + /* Anything else is passed to the backing disk */
> + ioctl_backing(backingdev, _cmd, _data, _flag);
> +
> +ioctl_out:
> + mutex_exit(sc_lock);
> +}
> +
> +/* For now, we just check re-entrancy consistency.
> + * XXX: uio validation
> + */
> +inline
> +cbdread(_dev, _uio)
> +{
> + SETERROR(0);
> +
> + /* Check for backing disk */
> + mutex_enter(sc_lock);
> + if
> + :: !cbd_configured(_dev) ->
> + SETERROR(ENXIO);
> + mutex_exit(sc_lock);
> + goto cbdread_out;
> + :: !cbd_backed(_dev) ->
> + SETERROR(ENOENT);
> + mutex_exit(sc_lock);
> + goto cbdread_out;
> + :: else ->
> + skip;
> + fi
> + mutex_exit(sc_lock);
> +
> + /* simulate uio -> buf translation */
> + buf_t rd_buf;
> + rd_buf.direction = B_READ;
> + rd_buf.error = 0;
> + rd_buf.length = _uio.length;
> + rd_buf.address = _uio.address;
> + rd_buf.spooled = 0;
> + cbdstrategy(rd_buf);
> +
> +cbdread_out:
> +}
> +
> +inline
> +cbdwrite(_dev, _uio)
> +{
> + SETERROR(0);
> +
> + /* Check for backing disk */
> + mutex_enter(sc_lock);
> + if
> + :: !cbd_configured(_dev) ->
> + SETERROR(ENXIO);
> + mutex_exit(sc_lock);
> + goto cbdwrite_out;
> + :: !cbd_backed(_dev) ->
> + SETERROR(ENOENT);
> + mutex_exit(sc_lock);
> + goto cbdwrite_out;
> + :: else ->
> + skip;
> + fi
> + mutex_exit(sc_lock);
> +
> + /* simulate uio -> buf translation */
> + buf_t rw_buf;
> + rw_buf.direction = B_WRITE;
> + rw_buf.error = 0;
> + rw_buf.length = _uio.length;
> + rw_buf.address = _uio.address;
> + rw_buf.spooled = 0;
> +
> + cbdstrategy(rw_buf);
> +cbdwrite_out:
> +}
> +
> diff -urN cbd.null/cbd.prx cbd/cbd.prx
> --- cbd.null/cbd.prx 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbd.prx 2023-10-19 20:12:35.838587981 +0000
> @@ -0,0 +1,218 @@
> +// Spin model extractor harness written by cherry
> +//
> +%F linux/cbd.c
> +%X -n modexbug // Somehow this is needed to activate per-func LUT
> +
> +// See cbdstrategy for translation table only for *function()*s
> +%X -L cbdstrategy -n cbd_submit_bio
> +// See cbdioctl.lut for translation table only for *function()*s
> +%X -L cbdioctl -n cbd_ioctl
> +
> +%X -n cbd_attach_backend
> +%X -n cbd_detach_backend
> +%H
> +// Disable effects of all included files and try to implement a subset of the APIs they provide.
> +#define _LINUX_BLKDEV_H
> +
> +// linux/blk_types.h
> +#define SECTOR_SHIFT 9
> +typedef void struct block_device;
> +typedef void blk_mode_t;
> +
> +#define __LINUX_MUTEX_H
> +struct mutex {
> + void *dummy;
> +};
> +
> +// sys/null.h
> +#define NULL 0
> +
> +#define true 1
> +#define false 0
> +
> +typedef int bool;
> +
> +/* Linuxisms */
> +#define __user
> +#define __init
> +#define __exit
> +
> +#define module_init(x)
> +#define module_exit(x)
> +#define MODULE_LICENSE(L)
> +
> +%%
> +//%C // c_code {}
> +//%%
> +//%D // c_cdecl {}
> +//%%
> +%L // XXX: Looks like per function tables are broken for now.
> +// Common transformations
> +
> +NonState hidden cbd_major global
> +NonState hidden exclusive global
> +NonState hidden cbd_fops global
> +NonState hidden exclusive global
> +
> +mutex_lock(... mutex_enter(sc_lock)
> +mutex_unlock(... mutex_exit(sc_lock)
> +
> +// cbd_submit_bio(...) -> cbdstrategy(...)
> +NonState hidden bio cbd_submit_bio
> +NonState hidden cbp cbd_submit_bio
> +NonState hidden bdev cbd_submit_bio
> +NonState hidden bio_offset cbd_submit_bio
> +NonState hidden bdevsz cbd_submit_bio
> +
> +// Gen-ed Code substitution. This is due to a lack of a sensible parser harness language with the right semantics.
> +Substitute c_expr { ((bio_offset+bio->bi_iter.bi_size)>bdevsz) } ((bp.offset + bp.length) > cbdevsz)
> +Substitute c_expr { (bio_op(bio)&REQ_OP_READ) } is_set(bp.direction, B_READ)
> +Substitute c_code { bio_offset=((bio_offset>bdevsz) ? bdevsz : bio_offset); } bp.offset = ((bp.offset > cbdevsz) -> cbdevsz : bp.offset)
> +Substitute c_code [bio] { bio->bi_iter.bi_size=(bdevsz-bio_offset); } bp.length = (cbdevsz - bp.offset)
> +Substitute c_code [bio] { bio->bi_iter.bi_sector=(bio_offset>>9); } skip
> +Substitute c_code [bio] { bio->bi_status=BLK_STS_IOERR; } bp.error = EIO;
> +Substitute c_code [bio] { bio->bi_status=BLK_STS_OFFLINE; } bp.error = ENXIO
> +
> +// cbd_ioctl(...) -> cbdioctl(...)
> +NonState hidden userdata cbd_ioctl
> +NonState hidden arg cbd_ioctl
> +NonState hidden cmd cbd_ioctl
> +NonState hidden mode cbd_ioctl
> +NonState hidden bdev cbd_ioctl
> +NonState hidden cbp cbd_ioctl
> +NonState hidden err cbd_ioctl
> +
> +
> +Substitute c_expr { (!cbd_configured(cbp)) } cbd_configured(_dev)
> +Substitute c_code { err=(-ENXIO); } SETERROR(ENXIO)
> +Substitute c_expr { 222 == cmd } (_cmd == CBDIO_ATTACH_BACKEND)
> +Substitute c_expr { 333 == cmd } (_cmd == CBDIO_DETACH_BACKEND)
> +Substitute c_code [bdev->bd_disk && bdev] { cbp=bdev->bd_disk->private_data; } skip
> +Substitute c_code { err=cbd_attach_backend(cbp,userdata); } cbd_attach_backend(_data)
> +Substitute c_code { err=cbd_detach_backend(cbp); } cbd_detach_backend(_data)
> +Substitute c_code [cbp && cbp->backingdev->bd_disk->fops && cbp->backingdev->bd_disk && cbp->backingdev && cbp] { err=cbp->backingdev->bd_disk->fops->ioctl(cbp->backingdev,mode,cmd,arg); } ioctl_backing(backingdev, _cmd, _data, _flag)
> +Substitute c_code { err=(-ENOTTY); } skip
> +
> +// cbd_attach_backend(...) -> cbd_attach_backend()
> +NonState hidden lockedcbp cbd_attach_backend
> +NonState hidden pathname cbd_attach_backend
> +NonState hidden userdata cbd_attach_backend
> +NonState hidden backingdev cbd_attach_backend
> +
> +Substitute c_code { pathname=kzalloc(PATH_MAX,GFP_KERNEL); } skip;
> +Substitute c_expr { (pathname==0) } false
> +Substitute c_expr { (strncpy_from_user(pathname,(void *)userdata,PATH_MAX)==(-EFAULT)) } (_data != CBD_DEVID_BACKING); SETERROR(EINVAL)
> +Substitute c_code { kfree(pathname); } skip
> +Substitute c_code { backingdev=blkdev_get_by_path(strim(pathname),((BLK_OPEN_READ|BLK_OPEN_WRITE)|BLK_OPEN_EXCL),exclusive,0); } skip
> +Substitute c_expr { IS_ERR(backingdev) } false
> +Substitute c_code [lockedcbp] { lockedcbp->backingdev=backingdev; } backingdev = CBD_DEVID_BACKING
> +Substitute c_code [lockedcbp->cbdev && lockedcbp] { set_capacity(lockedcbp->cbdev->bd_disk,bdev_nr_sectors(backingdev)); } cbdevsz = CBD_BACKING_DISKSIZE
> +
> +NonState hidden lockedcbp cbd_detach_backend
> +Substitute c_code [lockedcbp] { blkdev_put(lockedcbp->backingdev,exclusive); } skip
> +Substitute c_code [lockedcbp] { lockedcbp->backingdev=0; } backingdev = CBD_DEVID_NONE
> +Substitute c_code [lockedcbp->cbdev && lockedcbp] { set_capacity(lockedcbp->cbdev->bd_disk,0); } cbdevsz = 0
> +%%
> +
> +%P
> +hidden int tmpdev; /* Saved during passthrough */
> +
> +inline
> +cbdstrategy(bp)
> +{
> +#include "_modex_cbd_submit_bio.pml"
> +}
> +
> +inline
> +cbdioctl(_dev, _cmd, _data, _flag)
> +{
> +#include "_modex_cbd_ioctl.pml"
> +}
> +
> +inline
> +cbd_attach_backend(_data)
> +{
> + SETERROR(0);
> +#include "_modex_cbd_attach_backend.pml"
> +}
> +
> +inline
> +cbd_detach_backend(_data)
> +{
> + SETERROR(0);
> +#include "_modex_cbd_detach_backend.pml"
> +}
> +
> +// linux/cbd.c doesn't have read/write fops, so we
> +// fake it to allow cbd.drv to be parsed
> +// However See: cbd.drv: -D EXPLORE_READWRITESTATE
> +/* For now, we just check re-entrancy consistency.
> + * XXX: uio validation
> + */
> +inline
> +cbdread(_dev, _uio)
> +{
> + SETERROR(0);
> +
> + /* Check for backing disk */
> + mutex_enter(sc_lock);
> + if
> + :: !cbd_configured(_dev) ->
> + SETERROR(EINVAL);
> + mutex_exit(sc_lock);
> + goto cbdread_out;
> + :: !cbd_backed(_dev) ->
> + SETERROR(ENOENT);
> + mutex_exit(sc_lock);
> + goto cbdread_out;
> + :: else ->
> + skip;
> + fi
> + mutex_exit(sc_lock);
> +
> + /* simulate uio -> buf translation */
> + buf_t rd_buf;
> + rd_buf.direction = B_READ;
> + rd_buf.error = 0;
> + rd_buf.length = _uio.length;
> + rd_buf.address = _uio.address;
> + rd_buf.spooled = 0;
> + cbdstrategy(rd_buf);
> +
> +cbdread_out:
> +}
> +
> +inline
> +cbdwrite(_dev, _uio)
> +{
> + SETERROR(0);
> +
> + /* Check for backing disk */
> + mutex_enter(sc_lock);
> + if
> + :: !cbd_configured(_dev) ->
> + SETERROR(EINVAL);
> + mutex_exit(sc_lock);
> + goto cbdwrite_out;
> + :: !cbd_backed(_dev) ->
> + SETERROR(ENOENT);
> + mutex_exit(sc_lock);
> + goto cbdwrite_out;
> + :: else ->
> + skip;
> + fi
> + mutex_exit(sc_lock);
> +
> + /* simulate uio -> buf translation */
> + buf_t rw_buf;
> + rw_buf.direction = B_WRITE;
> + rw_buf.error = 0;
> + rw_buf.length = _uio.length;
> + rw_buf.address = _uio.address;
> + rw_buf.spooled = 0;
> +
> + cbdstrategy(rw_buf);
> +cbdwrite_out:
> +}
> +
> +%%
> \ No newline at end of file
> diff -urN cbd.null/cbdstrategy.linux.lut cbd/cbdstrategy.linux.lut
> --- cbd.null/cbdstrategy.linux.lut 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbdstrategy.linux.lut 2023-10-19 13:12:04.576568138 +0000
> @@ -0,0 +1,7 @@
> +// cbd_submit_bio(...) -> cbdstrategy(...)
> +
> +cbd_backed(cbp) cbd_backed(bp.dev)
> +bio_set_dev(... tmpdev = bp.dev; bp.dev = backingdev
> +submit_bio_noacct(bio) io_backing(backingdev, bp); bp.dev = tmpdev
> +
> +
> diff -urN cbd.null/cbdstrategy.NetBSD.lut cbd/cbdstrategy.NetBSD.lut
> --- cbd.null/cbdstrategy.NetBSD.lut 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/cbdstrategy.NetBSD.lut 2023-10-23 08:23:18.598269036 +0000
> @@ -0,0 +1,10 @@
> +// cbd_strategy(...) -> cbdstrategy(...)
> +
> +cbd_backed(sc) cbd_backed(bp.dev)
> +VOP_STRATEGY(sc->sc_tvn,buf) tmpdev = bp.dev; bp.dev = backingdev; io_backing(backingdev, bp); bp.dev = tmpdev
> +
> +Substitute c_expr { (sc==0) } false
> +
> +
> +
> +
> diff -urN cbd.null/linux/cbd.c cbd/linux/cbd.c
> --- cbd.null/linux/cbd.c 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/linux/cbd.c 2023-10-22 21:10:18.387421555 +0000
> @@ -0,0 +1,315 @@
> +#include <linux/mutex.h>
> +#include <linux/blkdev.h>
> +
> +#define CBD_ATTACH_BACKEND 222 /* XXX: fit into ioctl namespace */
> +#define CBD_DETACH_BACKEND 333 /* XXX: ditto */
> +
> +
> +/* Driver-wide values that don't change over the life of the module */
> +
> +/* At the moment, we assume that there's 1:1 correspondence between a
> + * driver instance (ie; module_init/exit pair) and a device
> + * instance. Thus we use cbd_major below as the primary "key" to
> + * lookup device state, via blk_get_no_open(). Note - this is not
> + * related to "minor" devices, which can still use cbd_major as
> + * above.
> + * XXX: Fix this.
> + */
> +static int cbd_major = 0;
> +static char *exclusive = "cbd_ioctl";
> +
> +struct cbd_private {
> + struct block_device *cbdev; /* Back ref for state tracking
> + * - see cbd_configured()
> + * below
> + */
> + struct block_device *backingdev;
> + struct mutex cbdlock;
> +};
> +
> +/* Driver state enquiry */
> +static bool
> +cbd_configured(struct cbd_private *cbp)
> +{
> + return (cbp->cbdev != NULL);
> +}
> +
> +/* Implies attached */
> +static bool
> +cbd_backed(struct cbd_private *cbp)
> +{
> + // XXX: check sc->sc_size == backing disk size
> + return (cbd_configured(cbp) && cbp->backingdev != NULL);
> +}
> +
> +static int
> +cbd_attach_backend(struct cbd_private *lockedcbp, void *userdata)
> +{
> + char *pathname;
> + struct block_device *backingdev;
> +
> + /* XXX: proper dev validation */
> + if (lockedcbp->cbdev == NULL) {
> + return -ENODEV;
> + }
> +
> + pathname = kzalloc(PATH_MAX, GFP_KERNEL);
> +
> + if (pathname == NULL) {
> + return -ENOMEM;
> + }
> +
> + if (strncpy_from_user(pathname, (void __user *)userdata,
> + PATH_MAX) == -EFAULT) {
> + kfree(pathname);
> + return -EFAULT;
> + }
> +
> + backingdev = blkdev_get_by_path(strim(pathname),
> + BLK_OPEN_READ|BLK_OPEN_WRITE|BLK_OPEN_EXCL,
> + exclusive, NULL);
> +
> + kfree(pathname);
> +
> + if (IS_ERR(backingdev)) {
> + return PTR_ERR(backingdev);
> + }
> +
> + set_capacity(lockedcbp->cbdev->bd_disk, bdev_nr_sectors(backingdev));
> + lockedcbp->backingdev = backingdev;
> +
> +
> + return 0;
> +}
> +
> +static int
> +cbd_detach_backend(struct cbd_private *lockedcbp)
> +{
> + /* XXX: proper dev state validity */
> + if (lockedcbp->cbdev == NULL ||
> + lockedcbp->backingdev == NULL) {
> + return -ENODEV;
> + }
> +
> + blkdev_put(lockedcbp->backingdev, exclusive);
> +
> + lockedcbp->backingdev = NULL;
> + set_capacity(lockedcbp->cbdev->bd_disk, 0);
> +
> + return 0;
> +}
> +
> +static int cbd_ioctl(struct block_device *bdev, blk_mode_t mode,
> + unsigned cmd, unsigned long arg)
> +{
> + int err = 0;
> + void __user *userdata = (void __user *)arg;
> + struct cbd_private *cbp;
> + cbp = bdev->bd_disk->private_data;
> +
> + if (cbp == NULL) {
> + err = -ENXIO;
> + goto ioctl_out;
> + }
> +
> + mutex_lock(&cbp->cbdlock);
> +
> + if (!cbd_configured(cbp)) {
> + err = -ENXIO;
> + goto ioctl_out;
> + }
> +
> + switch(cmd) {
> + case CBD_ATTACH_BACKEND:
> + if (cbd_backed(cbp)) {
> + err = -EBUSY;
> + break;
> + }
> + if (!(mode & BLK_OPEN_WRITE)) {
> + err = -EPERM;
> + break;
> + }
> + err = cbd_attach_backend(cbp, userdata);
> + break;
> + case CBD_DETACH_BACKEND: /* XXX: */
> + if (!cbd_backed(cbp)) {
> + err = -ENODEV;
> + break;
> + }
> + if (!(mode & BLK_OPEN_WRITE)) {
> + err = -EPERM;
> + break;
> + }
> + err = cbd_detach_backend(cbp);
> + break;
> + default: /*Pass through */
> + if (cbd_backed(cbp)) {
> + err = cbp->backingdev->bd_disk->fops->
> + ioctl(cbp->backingdev, mode, cmd, arg);
> + } else {
> + err = -ENOTTY;
> + }
> + break;
> + }
> +ioctl_out:
> + mutex_unlock(&cbp->cbdlock);
> + return err;
> +}
> +
> +static void cbd_submit_bio(struct bio *bio)
> +{
> + struct block_device *bdev = bio->bi_bdev;
> + struct cbd_private *cbp = bdev->bd_disk->private_data;
> +
> + size_t bio_offset = bio->bi_iter.bi_sector << SECTOR_SHIFT;
> + /* Model assumes this is cached in driver instance after
> + * backing device is attached.
> + */
> + size_t bdevsz = bdev_nr_bytes(bdev);
> +
> + /* sanity check bio limits */
> + if ((bio_offset + bio->bi_iter.bi_size) > bdevsz) {
> + if (bio_op(bio) & REQ_OP_READ) {
> + /* Clip bio */
> + bio_offset = ((bio_offset > bdevsz) ? bdevsz : bio_offset);
> + /* Update the bio */
> + bio->bi_iter.bi_size = bdevsz - bio_offset;
> +
> + bio->bi_iter.bi_sector = (bio_offset >> SECTOR_SHIFT);
> + }
> + } else {
> + bio->bi_status = BLK_STS_NOTSUPP;
> + return;
> + }
> +
> + if (cbd_backed(cbp)) {
> + mutex_lock(&cbp->cbdlock);
> + /* Pass through to backing dev */
> + bio_set_dev(bio, cbp->backingdev);
> + mutex_unlock(&cbp->cbdlock);
> + submit_bio_noacct(bio);
> + } else {
> + bio->bi_status = BLK_STS_OFFLINE;
> + }
> +
> + return;
> +
> +}
> +
> +static const struct block_device_operations cbd_fops = {
> + .owner = THIS_MODULE,
> + .ioctl = cbd_ioctl,
> + .submit_bio = cbd_submit_bio,
> +};
> +
> +
> +static int __init cbd_init(void)
> +{
> + int err = 0;
> + struct block_device *bdev;
> + struct gendisk *gendisk;
> + struct cbd_private *cbp;
> +
> + cbp = kzalloc(sizeof(struct cbd_private),
> + GFP_KERNEL);
> +
> + if (cbp == NULL) {
> + return -ENOMEM;
> + }
> +
> + gendisk = blk_alloc_disk(NUMA_NO_NODE);
> +
> + if (gendisk == NULL) {
> + kfree(cbp);
> + goto out;
> + }
> +
> + sprintf(gendisk->disk_name, "cbd");
> + gendisk->fops = &cbd_fops;
> +
> + /* Since we pass through for now. XXX: review */
> + blk_queue_flag_set(QUEUE_FLAG_SYNCHRONOUS, gendisk->queue);
> + blk_queue_flag_set(QUEUE_FLAG_NOWAIT, gendisk->queue);
> +
> + gendisk->private_data = cbp;
> +
> + /* Note: set_capacity() is called later when the backend is attached.
> + * XXX: Will this work ?
> + */
> + err = add_disk(gendisk);
> + if (err) {
> + put_disk(gendisk);
> + kfree(cbp);
> + goto out;
> + }
> +
> + cbd_major = register_blkdev(0, "cachedrive");
> + if (cbd_major < 0) {
> + err = cbd_major;
> + put_disk(gendisk);
> + kfree(cbp);
> + goto out;
> + } else {
> + bdev = blkdev_get_by_dev(MKDEV(cbd_major, 0),
> + BLK_OPEN_READ|BLK_OPEN_WRITE, exclusive, NULL);
> + if (IS_ERR(bdev)) {
> + err = PTR_ERR(bdev);
> + put_disk(gendisk);
> + kfree(cbp);
> + goto out;
> + }
> +
> + if (bdev == NULL) {
> + put_disk(gendisk);
> + kfree(cbp);
> + goto out;
> + }
> +
> + bdev->bd_disk = gendisk;
> + mutex_init(&cbp->cbdlock);
> + mutex_lock(&cbp->cbdlock);
> + cbp->cbdev = bdev;
> + mutex_unlock(&cbp->cbdlock);
> + err = 0; /* Mark success */
> + }
> +
> +out:
> + return err;
> +}
> +
> +static void __exit cbd_exit(void)
> +{
> + struct block_device *bdev;
> + struct gendisk *gendisk;
> + struct cbd_private *cbp;
> +
> + bdev = blkdev_get_by_dev(MKDEV(cbd_major, 0),
> + BLK_OPEN_READ|BLK_OPEN_WRITE, exclusive, NULL);
> +
> + if (bdev == NULL) {
> + // XXX: panic() or something else ?
> + panic("can't find allocated block device major number %d\n", cbd_major);
> + }
> +
> + gendisk = bdev->bd_disk;
> + cbp = gendisk->private_data;
> +
> + mutex_lock(&cbp->cbdlock);
> + if (cbp->backingdev != NULL) {
> + mutex_unlock(&cbp->cbdlock);
> + cbd_detach_backend(cbp);
> + }
> + mutex_unlock(&cbp->cbdlock);
> +
> + put_disk(gendisk);
> + kfree(cbp);
> + blkdev_put(bdev, exclusive);
> +
> + unregister_blkdev(cbd_major, "cbd");
> +}
> +
> +module_init(cbd_init);
> +module_exit(cbd_exit);
> +
> +MODULE_LICENSE("GPL");
> +
> diff -urN cbd.null/linux/Makefile cbd/linux/Makefile
> --- cbd.null/linux/Makefile 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/linux/Makefile 2023-10-16 09:20:14.033674649 +0000
> @@ -0,0 +1,10 @@
> +obj-m += cbd.o
> +LINUXSRC ?= $(HOME)/linux #For dev, for now.
> +# The "standard seems to be"
> +#LINUXSRC?=/lib/modules/$(shell uname -r)/build
> +
> +all:
> + make -C $(LINUXSRC) M=$(PWD) modules
> +
> +clean:
> + make -C $(LINUXSRC) M=$(PWD) clean
> diff -urN cbd.null/Makefile cbd/Makefile
> --- cbd.null/Makefile 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/Makefile 2023-10-21 17:25:46.201831270 +0000
> @@ -0,0 +1,96 @@
> +# This set of spinroot related files were written by MattC
> +#<[email protected]> in the Gregorian Calendar year AD.2023, in the month
> +# of February that year.
> +#
> +# We have two specification files and a properties file (".inv")
> +#
> +# The properties file contains a "constraint" sections
> +# such as ltl or never claims (either or, not both).
> +#
> +# The specification is divided into two files:
> +# the file with suffix '.drv' is a "driver" which
> +# instantiates processes that will ultimately "drive" the
> +# models under test.
> +# The file with the suffix '.pml' contains the process
> +# model code, which, is intended to be the formal specification
> +# for the code we are interested in writing in C.
> +#
> +# A file with a '.pmh' extension is a "header" file which
> +# both the driver and the model can use.
> +#
> +# We process these files in slightly different ways during
> +# the dev cycle, but broadly speaking, the idea is to generate
> +# a file called 'spinmodel.pml' which contains the final
> +# model file that is fed to spin.
> +#
> +# During a model extraction verification run, "modex" is run to
> +# extract the 'specification' from C code written to implement
> +# the model defined above. We use a 'harness' file (see file with
> +# suffix '.prx' below.
> +#
> +# Once the harness has been run, spinmodel.pml should be
> +# synthesised and processed as usual.
> +#
> +# The broad idea is that software development starts by writing the
> +# spec first, validating the model, and then implementing the model in
> +# C, after which we come back to extract the model from the C file
> +# and cross check our implementation using spin.
> +#
> +# If things go well, the constraints specified in the '.inv' file
> +# should hold exactly for both the handwritten model, and the
> +# extracted one.
> +
> +spin-gen: cbd.pml cbd.drv cbd.inv
> + cp cbd.pml model #mimic modex - see below.
> + cat model >> spinmodel.pml;cat cbd.drv >> spinmodel.pml;cat cbd.inv >> spinmodel.pml;
> + spin -am spinmodel.pml
> + cc -DVECTORSZ=65536 -o cbd.pan pan.c
> +
> +all: spin-gen prog
> +
> +# Verification related targets.
> +spin-run: spinmodel.pml
> + ./cbd.pan -a #Generate cbd.pml.trail on error
> + @test -f spinmodel.pml.trail && spin -t spinmodel.pml -p -g -l && ./cbd.pan -r spinmodel.pml.trail -g || true
> +
> +# Modex Extracts from C code to 'model' - see cbd.linux.prx
> +modex-linux: cbd.linux.prx cbd.drv cbd.inv linux/cbd.c
> + modex -v -w cbd.linux.prx
> + echo "//Included via command line - see Makefile:modex-*:\n\n\n#include \"cbd.pmh\"\n\n" >> spinmodel.pml #include "cbd.pmh" - avoid modex cpp expansion
> + cat model >> spinmodel.pml;cat cbd.drv >> spinmodel.pml;cat cbd.inv >> spinmodel.pml;
> + spin -am spinmodel.pml #Sanity check
> + cc -DVECTORSZ=65536 -o cbd.pan pan.c
> +
> +# Modex Extracts from C code to 'model' - see cbd.NetBSD.prx
> +modex-NetBSD: cbd.NetBSD.prx cbd.drv cbd.inv NetBSD/cbd.c
> + modex -v -w cbd.NetBSD.prx
> + echo "//Included via command line - see Makefile:modex-*:\n\n\n#include \"cbd.pmh\"\n\n" >> spinmodel.pml #include "cbd.pmh" - avoid modex cpp expansion
> + cat model >> spinmodel.pml;cat cbd.drv >> spinmodel.pml;cat cbd.inv >> spinmodel.pml;
> + spin -am spinmodel.pml #Sanity check
> + cc -DVECTORSZ=65536 -o cbd.pan pan.c
> +
> +# Housekeeping
> +modex-gen-clean:
> + rm -f spinmodel.pml # Our consolidated model file
> + rm -f _spin_nvr.tmp # Never claim file
> + rm -f model # modex generated intermediate "model" file
> + rm -f pan.* # Spin generated source files
> + rm -f _modex* # modex generated script files
> + rm -f linux/cbd.[IM]
> + rm -f NetBSD/cbd.[IM]
> +
> +prog-clean:
> + rm -f cbd
> +
> +spin-run-clean:
> + rm -f spinmodel.pml.trail
> +
> +spin-gen-clean:
> + rm -f cbd.pan # model executables.
> + rm -f spinmodel.pml # Consolidated model source files
> + rm -f _spin_nvr.tmp # Never claim file
> + rm -f model # Intermediate "model" file
> + rm -f pan.* # Spin generated source files
> +
> +clean: modex-gen-clean spin-gen-clean spin-run-clean prog-clean
> + rm -f *~
> diff -urN cbd.null/NetBSD/cbd.c cbd/NetBSD/cbd.c
> --- cbd.null/NetBSD/cbd.c 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/NetBSD/cbd.c 2023-10-23 06:37:45.843939111 +0000
> @@ -0,0 +1,588 @@
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <sys/buf.h>
> +#include <sys/bufq.h>
> +#include <sys/conf.h>
> +#include <sys/disk.h>
> +#include <sys/disklabel.h>
> +#include <sys/errno.h>
> +
> +#include <uvm/uvm_prot.h>
> +#include <sys/acl.h>
> +#include <sys/vnode_if.h>
> +#include <sys/vnode.h>
> +
> +#include <sys/module.h>
> +
> +#include <sys/fcntl.h>
> +
> +#define CBD_ATTACH_BACKEND 222 /* XXX: fit into ioctl namespace */
> +#define CBD_DETACH_BACKEND 333 /* XXX: ditto */
> +
> +static void cbd_attach(device_t, device_t, void *);
> +static int cbd_detach(device_t, int);
> +
> +static dev_type_open(cbd_open);
> +static dev_type_close(cbd_close);
> +static dev_type_read(cbd_read);
> +static dev_type_write(cbd_write);
> +static dev_type_ioctl(cbd_ioctl);
> +static dev_type_strategy(cbd_strategy);
> +static dev_type_size(cbd_size);
> +
> +struct cbd_ioctl {
> + const char *diskpath; /* Path to backing block disk dev node */
> + struct lwp *l; /* Userland caller lwp */
> +};
> +
> +const struct dkdriver cbddkdriver = {
> + .d_open = cbd_open,
> + .d_close = cbd_close,
> + .d_strategy = cbd_strategy,
> + .d_minphys = minphys,
> +};
> +
> +struct cbd_softc {
> + device_t sc_dev; /* Back ref to autoconf dev ptr. */
> + struct disk sc_dk; /* generic disk information */
> + kmutex_t sc_lock; /* Protects access to instance of this struct. */
> + struct bufq_state *sc_buflist; /* transaction request/reply queue */
> + struct vnode *sc_tvn; /* backing device
> + * vnode */
> + int sc_tvn_bsize; /* Backing device size, which
> + * is also the size of our
> + * device
> + */
> + struct kauth_cred *sc_tvn_l_cred; /* Creds used with sc_tvn */
> +};
> +
> +const struct bdevsw cbd_bdevsw = {
> + .d_open = cbd_open,
> + .d_close = cbd_close,
> + .d_strategy = cbd_strategy,
> + .d_ioctl = cbd_ioctl,
> + .d_dump = nodump,
> + .d_psize = cbd_size,
> + .d_discard = nodiscard,
> + .d_flag = D_DISK | D_MPSAFE
> +};
> +
> +const struct cdevsw cbd_cdevsw = {
> + .d_open = cbd_open,
> + .d_close = cbd_close,
> + .d_read = cbd_read,
> + .d_write = cbd_write,
> + .d_ioctl = cbd_ioctl,
> + .d_stop = nostop,
> + .d_tty = notty,
> + .d_poll = nopoll,
> + .d_mmap = nommap,
> + .d_kqfilter = nokqfilter,
> + .d_discard = nodiscard,
> + .d_flag = D_DISK | D_MPSAFE
> +};
> +
> +CFATTACH_DECL3_NEW(cbd, sizeof(struct cbd_softc),
> + 0, cbd_attach, cbd_detach, NULL, NULL, NULL, DVF_DETACH_SHUTDOWN);
> +
> +#ifdef _MODULE
> +MODULE(MODULE_CLASS_DRIVER, cbd, NULL);
> +CFDRIVER_DECL(cbd, DV_DISK, NULL);
> +
> +
> +
> +static int
> +cbd_modcmd(modcmd_t cmd, void *arg)
> +{
> + devmajor_t bmajor = -1, cmajor = -1;
> +
> + int error = 0;
> + switch (cmd) {
> + case MODULE_CMD_INIT:
> +
> + error = devsw_attach("cbd", &cbd_bdevsw, &bmajor,
> + &cbd_cdevsw, &cmajor);
> +
> + if (error) {
> + break;
> + }
> +
> + error = config_cfdriver_attach(&cbd_cd);
> +
> + if (error) {
> + devsw_detach(&cbd_bdevsw, &cbd_cdevsw);
> + break;
> + }
> +
> + error = config_cfattach_attach(cbd_cd.cd_name, &cbd_ca);
> + if (error) {
> + config_cfdriver_detach(&cbd_cd);
> + devsw_detach(&cbd_bdevsw, &cbd_cdevsw);
> + break;
> + }
> +
> + break;
> +
> + case MODULE_CMD_FINI:
> + error = config_cfattach_detach(cbd_cd.cd_name, &cbd_ca);
> + if (error) {
> + break;
> + }
> + error = config_cfdriver_detach(&cbd_cd);
> + if (error) {
> + break;
> + }
> +
> + devsw_detach(&cbd_bdevsw, &cbd_cdevsw);
> + break;
> + case MODULE_CMD_STAT:
> + error = ENOTTY;
> + break;
> +
> + default:
> + error = ENOTTY;
> + break;
> + }
> +
> + return error;
> +
> +}
> +
> +
> +#endif
> +
> +static void cbd_attach(device_t, device_t, void *);
> +static int cbd_detach(device_t, int);
> +static bool
> +is_read(struct buf *buf)
> +{
> + return ((buf->b_flags & B_READ) == B_READ);
> +}
> +
> +#if notyet
> +static bool
> +is_write(struct buf *buf)
> +{
> + return ((buf->b_flags & B_WRITE) == B_WRITE);
> +}
> +#endif
> +
> +/*
> + * The idea here is that the state sequentially progresses
> + * forwards as configuration completes.
> + *
> + * So for eg: a device that has a caching device setup, implies it is
> + * already cbd_backed(), but not vice-versa, because an unfronted
> + * backed device is merely an uncached "passthrough" situation.
> + */
> +
> +static bool
> +cbd_configured(struct cbd_softc *sc)
> +{
> + /*
> + * Note: device_t is a pointer, no specific "NULL" typedef
> + * available
> + */
> + return (sc->sc_dev != NULL);
> +}
> +
> +/* Implies attached */
> +static bool
> +cbd_backed(struct cbd_softc *sc)
> +{
> + // XXX: check sc->sc_size == backing disk size
> + return (cbd_configured(sc) && sc->sc_tvn != NULL);
> +}
> +
> +static void
> +cbd_attach(device_t parent, device_t self, void *aux)
> +{
> + struct cbd_softc *sc = device_private(self);
> +
> + mutex_init(&sc->sc_lock, MUTEX_DEFAULT, IPL_NONE);
> +
> + mutex_enter(&sc->sc_lock);
> +
> + if (cbd_configured(sc)) { /* Sorry, busy */
> + mutex_exit(&sc->sc_lock);
> + }
> +
> + sc->sc_dev = self;
> +
> + /* Initialize and attach the disk structure. */
> + disk_init(&sc->sc_dk, device_xname(self), &cbddkdriver);
> + disk_attach(&sc->sc_dk);
> +
> + bufq_alloc(&sc->sc_buflist, "fcfs", 0);
> +
> + mutex_exit(&sc->sc_lock);
> +
> + if (!pmf_device_register(self, NULL, NULL))
> + aprint_error_dev(self, "couldn't establish power handler\n");
> +}
> +
> +static int
> +cbd_detach(device_t self, int flags)
> +{
> + struct cbd_softc *sc = device_private(self);
> +
> + mutex_enter(&sc->sc_lock);
> +
> + if (!cbd_configured(sc)) {
> + mutex_exit(&sc->sc_lock);
> + return ENXIO;
> + }
> +
> + /* We assume the !(flags & DETACH_FORCE) case here. */
> + if (((flags & DETACH_FORCE) == 0) &&
> + (bufq_peek(sc->sc_buflist) != NULL)) {
> + mutex_exit(&sc->sc_lock);
> + return EBUSY; /* There are pending transactions */
> + } else { /* DETACH_FORCE */
> + bufq_drain(sc->sc_buflist);
> + //XXX: Tell backing device to drain/flush etc ?
> + }
> +
> + // XXX: detach backed/fronted devices ?
> + // quiesce everything first, then sc->sc_tvn = NULL; ?
> + // All threads need to cease related activity.
> +
> + mutex_exit(&sc->sc_lock);
> +
> + pmf_device_deregister(self);
> + bufq_free(sc->sc_buflist);
> + disk_detach(&sc->sc_dk);
> +
> + mutex_destroy(&sc->sc_lock);
> +
> + return 0;
> +}
> +
> +static int
> +cbd_open(dev_t dev, int flag, int ifmt, struct lwp *l)
> +{
> + /* Sanity check params */
> + switch(ifmt) {
> + case S_IFCHR:
> + case S_IFBLK:
> + break;
> + default:
> + return EINVAL;
> + }
> +
> + int unit;
> + struct cbd_softc *sc;
> +
> + unit = DISKUNIT(dev);
> + sc = device_lookup_private(&cbd_cd, unit);
> +
> + /*
> + * Trying to open unconfigured dev.
> + * cbdconfig(8) should be run first.
> + */
> + if (sc == NULL) {
> + return ENXIO;
> + }
> +
> + mutex_enter(&sc->sc_lock);
> +
> + if (!cbd_backed(sc)) {
> + mutex_exit(&sc->sc_lock);
> + return ENXIO;
> + }
> +
> + mutex_exit(&sc->sc_lock);
> +
> + return 0;
> +}
> +
> +
> +static int
> +cbd_close(dev_t dev, int flag, int ifmt, struct lwp *l)
> +{
> + /* Sanity check params */
> + switch(ifmt) {
> + case S_IFCHR:
> + case S_IFBLK:
> + break;
> + default:
> + return EINVAL;
> + }
> +
> + int unit;
> + struct cbd_softc *sc;
> +
> + unit = DISKUNIT(dev);
> + sc = device_lookup_private(&cbd_cd, unit);
> +
> + /*
> + * Trying to close unconfigured dev.
> + * cbdconfig(8) should be run first.
> + */
> + if (sc == NULL) {
> + return ENXIO;
> + }
> +
> + mutex_enter(&sc->sc_lock);
> +
> + if (!cbd_backed(sc)) {
> + mutex_exit(&sc->sc_lock);
> + return ENXIO;
> + }
> +
> + mutex_exit(&sc->sc_lock);
> + return 0;
> +}
> +
> +
> +static void
> +cbd_strategy(struct buf *buf)
> +{
> +
> + dev_t dev = buf->b_dev;
> +
> + int unit;
> + struct cbd_softc *sc;
> +
> + unit = DISKUNIT(dev);
> + sc = device_lookup_private(&cbd_cd, unit);
> +
> + if (sc == NULL) {
> + return;
> + }
> +
> + /* Sanity check buffer */
> + if ((dbtob(buf->b_blkno) + buf->b_bcount) > sc->sc_tvn_bsize) {
> + if (BUF_ISREAD(buf)) {
> + /* Clip offset */
> + buf->b_blkno = (dbtob(buf->b_blkno) > sc->sc_tvn_bsize) ? btodb(sc->sc_tvn_bsize) : buf->b_blkno;
> + /* Update the request length */
> + buf->b_bcount = (sc->sc_tvn_bsize - dbtob(buf->b_blkno));
> + } else {
> + buf->b_error = EINVAL;
> + biodone(buf);
> + return;
> + }
> + }
> +
> +
> + if (cbd_backed(sc)) {
> + mutex_enter(&sc->sc_lock);
> + disk_busy(&sc->sc_dk);
> + VOP_STRATEGY(sc->sc_tvn, buf);
> + disk_unbusy(&sc->sc_dk, buf->b_bcount, is_read(buf));
> + mutex_exit(&sc->sc_lock);
> + } else {
> + buf->b_error = ENXIO;
> + biodone(buf);
> + return;
> + }
> +
> + /* We assume that the backing driver has frobbed buf
> + * appropriately, for return;
> + */
> +}
> +
> +static int
> +cbd_attach_backend(struct cbd_softc *lockedsc, void *data)
> +{
> + struct cbd_ioctl *cbdioctl = data;
> +
> + int error = 0;
> + struct pathbuf *pbuf;
> + struct vnode *vp;
> +
> + uint64_t numsec;
> + unsigned int secsz;
> +
> + /* XXX: proper dev validation */
> + if (lockedsc->sc_dev == NULL) {
> + return -ENODEV;
> + }
> +
> + error = pathbuf_copyin(cbdioctl->diskpath, &pbuf);
> + if (error != 0) {
> + return error;
> + }
> +
> + error = vn_bdev_openpath(pbuf, &vp, cbdioctl->l);
> + pathbuf_destroy(pbuf);
> + if (error != 0) {
> + return error;
> + }
> +
> + error = getdisksize(vp, &numsec, &secsz);
> + if (error != 0) {
> + vn_close(vp, /* See vn_bdev_openpath() */ FREAD | FWRITE, cbdioctl->l->l_cred);
> + return error;
> + }
> +
> + lockedsc->sc_tvn = vp;
> + lockedsc->sc_tvn_l_cred = cbdioctl->l->l_cred;
> + lockedsc->sc_tvn_bsize = numsec * secsz;
> +
> + return error;
> +}
> +
> +static int
> +cbd_detach_backend(struct cbd_softc *lockedsc)
> +{
> + /* XXX: proper dev state validity */
> + if (lockedsc->sc_dev == NULL ||
> + lockedsc->sc_tvn == NULL ||
> + lockedsc->sc_tvn_bsize == 0 ||
> + lockedsc->sc_tvn_l_cred == NULL) {
> + return -ENODEV;
> + }
> +
> + vn_close(lockedsc->sc_tvn, /* See vn_bdev_openpath() */ FREAD | FWRITE,
> + lockedsc->sc_tvn_l_cred);
> +
> + lockedsc->sc_tvn_bsize = 0;
> + lockedsc->sc_tvn_l_cred = NULL;
> + lockedsc->sc_tvn = NULL; /* reset */
> +
> + return 0;
> +}
> +
> +static int
> +cbd_ioctl(dev_t dev, u_long cmd, void *data, int flag, struct lwp *l)
> +{
> + int unit, error = 0;
> + struct cbd_softc *sc;
> + struct cbd_ioctl cbdioctlargs;
> +
> + unit = DISKUNIT(dev);
> + sc = device_lookup_private(&cbd_cd, unit);
> +
> + if (sc == NULL) {
> + return ENXIO;
> + }
> +
> + mutex_enter(&sc->sc_lock);
> +
> + if (!cbd_configured(sc)) {
> + mutex_exit(&sc->sc_lock);
> + return ENXIO;
> + }
> +
> + switch(cmd) {
> + case CBD_ATTACH_BACKEND:
> + if (cbd_backed(sc)) {
> + error = EBUSY;
> + break;
> + }
> + if (!(flag & FWRITE)) {
> + error = EPERM;
> + }
> + cbdioctlargs.diskpath = data;
> + cbdioctlargs.l = l;
> + error = cbd_attach_backend(sc, &cbdioctlargs);
> + break;
> + case CBD_DETACH_BACKEND:
> + if (!cbd_backed(sc)) {
> + error = ENODEV;
> + break;
> + }
> + if (!(flag & FWRITE)) {
> + error = EPERM;
> + break;
> + }
> + error = cbd_detach_backend(sc);
> + break;
> +
> + /* default: Pass through */
> + }
> +
> + error = VOP_IOCTL(sc->sc_tvn, cmd, data, flag, l->l_cred);
> +
> + mutex_exit(&sc->sc_lock);
> +
> + return error;
> +}
> +
> +static int
> +cbd_read(dev_t dev, struct uio *uio, int flags)
> +{
> + int unit;
> + struct cbd_softc *sc;
> +
> + unit = DISKUNIT(dev);
> + sc = device_lookup_private(&cbd_cd, unit);
> +
> + if (sc == NULL) {
> + return ENXIO;
> + }
> +
> + mutex_enter(&sc->sc_lock);
> +
> + if (!cbd_backed(sc)) {
> + mutex_exit(&sc->sc_lock);
> + return ENXIO;
> + }
> +
> + mutex_exit(&sc->sc_lock);
> +
> + return physio(cbd_strategy, NULL, dev, B_READ, minphys, uio);
> +}
> +
> +static int
> +cbd_write(dev_t dev, struct uio *uio, int flags)
> +{
> + int unit;
> + struct cbd_softc *sc;
> +
> + unit = DISKUNIT(dev);
> + sc = device_lookup_private(&cbd_cd, unit);
> +
> + /*
> + * Trying to close unconfigured dev.
> + * cbdconfig(8) should be run first.
> + */
> + if (sc == NULL) {
> + return ENXIO;
> + }
> +
> + mutex_enter(&sc->sc_lock);
> +
> + if (!cbd_backed(sc)) {
> + mutex_exit(&sc->sc_lock);
> + return ENXIO;
> + }
> +
> + mutex_exit(&sc->sc_lock);
> +
> + return physio(cbd_strategy, NULL, dev, B_WRITE, minphys, uio);
> +}
> +
> +
> +static int
> +cbd_size(dev_t dev)
> +{
> + int unit;
> + struct cbd_softc *sc;
> +
> + unit = DISKUNIT(dev);
> + sc = device_lookup_private(&cbd_cd, unit);
> +
> + /*
> + * Trying to close unconfigured dev.
> + * cbdconfig(8) should be run first.
> + */
> + if (sc == NULL) {
> + return ENXIO;
> + }
> +
> + mutex_enter(&sc->sc_lock);
> +
> + if (!cbd_backed(sc)) {
> + mutex_exit(&sc->sc_lock);
> + return ENXIO;
> + }
> +
> + // XXPass through.
> +
> + mutex_exit(&sc->sc_lock);
> +
> + return -1; // XXX:
> +}
> diff -urN cbd.null/NetBSD/cbd.ioconf cbd/NetBSD/cbd.ioconf
> --- cbd.null/NetBSD/cbd.ioconf 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/NetBSD/cbd.ioconf 2023-10-20 15:40:16.888208026 +0000
> @@ -0,0 +1,8 @@
> +# $NetBSD$
> +
> +ioconf cbd
> +
> +include "conf/files"
> +
> +pseudo-device cbd
> +
> diff -urN cbd.null/NetBSD/Makefile cbd/NetBSD/Makefile
> --- cbd.null/NetBSD/Makefile 1970-01-01 00:00:00.000000000 +0000
> +++ cbd/NetBSD/Makefile 2023-10-20 15:41:52.334710787 +0000
> @@ -0,0 +1,10 @@
> +# $NetBSD$
> +# Please build using crossbuild nbmake-$ARCH
> +
> +.include "${NETBSDSRCDIR}/sys/modules/Makefile.inc"
> +
> +KMOD= cbd
> +IOCONF= cbd.ioconf
> +SRCS= cbd.c
> +
> +.include <bsd.kmodule.mk>

2023-11-05 21:02:20

by Mathew, Cherry G.*

[permalink] [raw]
Subject: Re: [RFC+Patch] Formal models as source of truth for Software Architects.

Hi Jonas,

First of all, sorry for my late reply - I note with thanks, that you
spent some time and attention on your email, so I wanted to spend some
time composing a thoughtful reply. I'm replying in context below:


>>>>> Jonas Oberhauser <[email protected]> writes:


[...]


> A few years ago we also built systems to extract models from C
> code (for TLA+ back then), but those systems are now totally
> unused (and unmaintained) due to poor scalability of exploratory
> state-based model checking.

> And then there's the issue that many engineers don't know how to
> write reasonable temporal assertions, especially not ones that can
> give sufficient guarantees to be worth the extra effort and
> maintenance.

So my arrival at "D3" and spin was a progression of looking at what's
available (in the open domain), current consensus around development
methodology, and progress in hardware capacity.

I've used TLA+ for protocol verification, and while it's great for that,
I didn't see it translate to and from implementation code. The best I
was able to do was to use a functional language to mirror the lambda
calculus - see: https://github.com/byisystems/byihive for eg:

One of the main motivations for my current "D3" path is to provide a
cleaner separation between design and implementation - where the
"Architect" role does design, and the "Engineer" role does
implementation. This allows for each role to focus on what they do best,
while the machine is tasked with maintaining functional equivalence.

I suspect that your experience was due to the fact that engineers had to
get involved in design, without understanding or consenting to this.

> If we use a simplified model of what the system should be doing to
> specify, then usually we are not looking for equivalence but only
> for uni-directional simulation: the spec usually needs to have
> more degrees of freedom than the implementation, which is just a
> single implementation.

So spin has two approaches - one uses the inline "C" to glue the spec as
a sort of "test driver" for implementation code if required. The other
uses a separate source level model extractor, to rebuild the implicit
model in the implementation C code. I am leaning towards the latter,
because it seems to help with streamlining the software development
process as an iteration - which helps especially in early nascent stages
of design & development. What I'm beginning to see is that once design
"settles down", then the iterations become less disruptive. But
initially, the Architect/Engineer hats need very good interaction.


> What has worked well for us is anything that works

> - directly on the code

> - doesn't require learning any kind of math to find bugs with

> - is scalable enough to find bugs on real code

> - doesn't require a lot of extra effort from the engineers
> (compared to the extra assurance you get)

> - doesn't need to be adapted whenever the code changes

Have you seen spoq ?

https://www.usenix.org/conference/osdi23/presentation/li-xupeng

They use this approach - one of the reasons I'm not fully convinced by
this approach is because, I believe that the source of truth being in
implementation code limits flexibility, for eg: reuse. If you notice my
original patch, the model is a reference for both the NetBSD and Linux
drivers - nothing stops it from further OSs (as long as the codeflow
logic doesn't vary too much - and even then, the model could itself be
made modular to accommodate different FSMs)

> I find in software, usually blueprint and skyscraper are in some
> sense the same thing. Or in some sense, the skyscraper is
> extracted automatically by a compiler/interpreter and installation
> system - we never actually dig anything ourselves, but have the
> ability to experiment directly on the blueprint as we build it.

> That's because languages already try to provide us reasonable
> abstraction mechanisms by which we can simplify the code and make
> it more "model-like".

Can you give me an example of how this works/a pre-existing tool that
helps with this simplification ? I'm actually currently looking at
re-writing modex to do precisely this (but with the translation
end-point being something close to the original model).

> So writing the blueprint twice - once in some modelling language
> and once in a language meant for execution - doesn't really
> provide a lot of benefit. But it requires doing changes at two
> levels whenever the software design changes. It just violates DRY.

IMHO, this would be true if:

a) The implementation language was expressive enough to cleanly
encapsulate the model.

b) If the toolchain around it were able to do model verification (eg: by
exploring LTL bounds).

I've heard good things about rustlang wrt. a), but not b) - would be
keen to know more.

> What makes sense in our experience is identifying properties you
> would like the system to have and devising a specific checking
> mechanism for that. For example, you might specify the protocol
> of how the device communicates with the system and what
> assumptions it has (like "don't start a second command while the
> first one hasn't finished"), and then check that the driver code
> obeys this policy. Then you can change the software code without
> touching the protocol specification, and just re-verify.

> In the easiest case, the protocol can be a mock device process
> that is inserted via a form of dependency injection to talk to the
> real driver code during verification, and if the code isn't too
> complex you can exhaustively check several test cases.


Ok, from what I understand, this is a "system model" - when I talk about
D3, I'm purely talking about the software model. This model may be part
of a larger scope "system model".

Event-B
http://www.event-b.org/index.html
https://web-archive.southampton.ac.uk/deploy-eprints.ecs.soton.ac.uk/111/1/sld.ch1.intro.pdf

is an interesting (closed) industry take on what you seem to be
describing - and they seem to have taken a more "D3" like approach -
I don't have much insight into their experiences though, although I'm
speaking to one of their practitioners.


> Only if the code is highly unreadable for specific reasons like
> performance or talking to legacy ABIs, it can make sense to have
> an "executable comment" that implements the same functional
> behavior in a very concise, readable way (with crud
> performance/mock APIs replacing the actual behavior), and check
> that the real code and the "executable comment" do "the same
> thing".

I'm a bit skeptical about the "doxygen" approach (i've seen this in some
versions of Java iirc, as well) - basically you're bypassing all the
goodness that the language parser provides by hiding from it - I'm not
sure how this is useful other than to shimmy in a specific bandaid
fix. Which is fine, if you're fighting fires, but I think for a more
structured approach, I'm not seeing what you're seeing, I'm afraid. Open
to be signposted though.


> The main issues I see when we look at it through the skyscraper
> analogy is that usually people in other engineering disciplines
> have CAD tools - backed by very powerful computers - to really
> systematically and thoroughly check the design. What we usually
> have is stress testing and some test suites built by hand that
> miss many cases.

Interestingly, I was surprised at the amount of compute available "off
the shelf" - the other day I sent a model for help, saying with a bit of
trepidation that that it might take about 25G of RAM and thrash the
machine, so please be careful - the volunteer came back to me and said
he has 128G RAM available, and 1TB RAM on a pinch! What takes an hour
to run on my i5 + 16GB laptop, took him 3minutes!

So in some sense, I believe we're nearing the "Deep Neural Nets" moment,
when GPU computation took inference to the next level.


> Anyways, that's my (limited) experience.

> Hope it helps,

Yes - thank you so much for your interest! I look forward to further
comments - and also, if there are industry applications that benefit
from the broad intersect, any signposts will be much appreciated!

Best,
--
MattC

2023-11-05 21:07:11

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: [RFC+Patch] Formal models as source of truth for Software Architects.

Hi Mathew,


Am 11/5/2023 um 8:20 AM schrieb Mathew, Cherry G.*:
> Hi Jonas,
>
> First of all, sorry for my late reply - I note with thanks, that you
> spent some time and attention on your email, so I wanted to spend some
> time composing a thoughtful reply. I'm replying in context below:
>
>
>>>>>> Jonas Oberhauser<[email protected]> writes:
> [...]
>
>
> > A few years ago we also built systems to extract models from C
> > code (for TLA+ back then), but those systems are now totally
> > unused (and unmaintained) due to poor scalability of exploratory
> > state-based model checking.
>
> > And then there's the issue that many engineers don't know how to
> > write reasonable temporal assertions, especially not ones that can
> > give sufficient guarantees to be worth the extra effort and
> > maintenance.
>
> So my arrival at "D3" and spin was a progression of looking at what's
> available (in the open domain), current consensus around development
> methodology, and progress in hardware capacity.
>
> I've used TLA+ for protocol verification, and while it's great for that,
> I didn't see it translate to and from implementation code.


I meant that we implemented an internal tool to transpile from C to PlusCal.

It sounded like a great idea at the time. But then it quickly fell out
of use.


> > If we use a simplified model of what the system should be doing to
> > specify, then usually we are not looking for equivalence but only
> > for uni-directional simulation: the spec usually needs to have
> > more degrees of freedom than the implementation, which is just a
> > single implementation.
>
> So spin has two approaches - one uses the inline "C" to glue the spec as
> a sort of "test driver" for implementation code if required. The other
> uses a separate source level model extractor, to rebuild the implicit
> model in the implementation C code. I am leaning towards the latter,
> because it seems to help with streamlining the software development
> process as an iteration - which helps especially in early nascent stages
> of design & development. What I'm beginning to see is that once design
> "settles down", then the iterations become less disruptive. But
> initially, the Architect/Engineer hats need very good interaction.
>
>
> > What has worked well for us is anything that works
>
> > - directly on the code
>
> > - doesn't require learning any kind of math to find bugs with
>
> > - is scalable enough to find bugs on real code
>
> > - doesn't require a lot of extra effort from the engineers
> > (compared to the extra assurance you get)
>
> > - doesn't need to be adapted whenever the code changes
>
> Have you seen spoq ?
>
> https://www.usenix.org/conference/osdi23/presentation/li-xupeng
>
> They use this approach - one of the reasons I'm not fully convinced by
> this approach is because, I believe that the source of truth being in
> implementation code limits flexibility, for eg: reuse. If you notice my
> original patch, the model is a reference for both the NetBSD and Linux
> drivers - nothing stops it from further OSs (as long as the codeflow
> logic doesn't vary too much - and even then, the model could itself be
> made modular to accommodate different FSMs)


I have seen it, but it's not exactly what I have in mind. I wouldn't say
that it doesn't require learning any kind of math to find bugs with.
You're still writing Coq proofs, just with more automation. Nevertheless
it's going in the direction I'm thinking of.

For reuse, I think the main issue is that implementation code is always
a source of truth - the truth of what's really going to be executed.
If we want to have a second source of truth, it should be a different
truth, such as "assumptions of the other parts of the system".

Since you already have this source of truth, if you make a different
implementation in another kernel, you can compare what the original
driver was doing with what your new implementation is doing.
There's no need to have yet another copy of what the driver might be doing.



> > I find in software, usually blueprint and skyscraper are in some
> > sense the same thing. Or in some sense, the skyscraper is
> > extracted automatically by a compiler/interpreter and installation
> > system - we never actually dig anything ourselves, but have the
> > ability to experiment directly on the blueprint as we build it.
>
> > That's because languages already try to provide us reasonable
> > abstraction mechanisms by which we can simplify the code and make
> > it more "model-like".
>
> Can you give me an example of how this works/a pre-existing tool that
> helps with this simplification ? I'm actually currently looking at
> re-writing modex to do precisely this (but with the translation
> end-point being something close to the original model).


I think any higher level language, including C, goes into this
direction. Some are just a lot better at building abstractions and
describing the code more model-like than tiniest-implementation-detail-like.

(and sometimes that's not what you want).


> > So writing the blueprint twice - once in some modelling language
> > and once in a language meant for execution - doesn't really
> > provide a lot of benefit. But it requires doing changes at two
> > levels whenever the software design changes. It just violates DRY.
>
> IMHO, this would be true if:
>
> a) The implementation language was expressive enough to cleanly
> encapsulate the model.
>
> b) If the toolchain around it were able to do model verification (eg: by
> exploring LTL bounds).


We are building internal tools for such b) things. Not quite exhaustive
formal verification tools, but tools that can express and check more
complex properties at source level without false positives.

(They may have false negatives, but that's not really a showstopper.
False positives would be.)


> I've heard good things about rustlang wrt. a), but not b) - would be
> keen to know more.
>
> > What makes sense in our experience is identifying properties you
> > would like the system to have and devising a specific checking
> > mechanism for that. For example, you might specify the protocol
> > of how the device communicates with the system and what
> > assumptions it has (like "don't start a second command while the
> > first one hasn't finished"), and then check that the driver code
> > obeys this policy. Then you can change the software code without
> > touching the protocol specification, and just re-verify.
>
> > In the easiest case, the protocol can be a mock device process
> > that is inserted via a form of dependency injection to talk to the
> > real driver code during verification, and if the code isn't too
> > complex you can exhaustively check several test cases.
>
>
> Ok, from what I understand, this is a "system model" - when I talk about
> D3, I'm purely talking about the software model. This model may be part
> of a larger scope "system model".


I'm not using the word "system" in any "full system" kind of way. I just
mean any component that interacts with some outside world, such as an
algorithm, a driver, or the whole kernel.

Usually we are looking for two types of bugs:

 1) the code for some subtle reason isn't doing what one would expect
from reading the code, e.g., UB, some wrong assumption of atomicity, etc.

 2) the code is doing what it seems to be doing, but that's not the
right thing

The former can usually be specified in a semi-generic way without
knowing anything about the application, and without modeling too much of
the external system.
In the easiest cases, things like -fsanitize=undefined may already catch
most of these issues, especially when used in combination with some
other tools (e.g. fuzzers).

But for the latter, we need some way to specify at least what the wrong
things are, even if we can't specify what the right thing is.
So there it makes sense to specify some of the assumptions of the other
systems that interact with our system.

E.g., I might not be able to specify a video game, but at least I can
check that its sequence of opengl calls is well-formed (no half-defined
objects for example).

But for that I need to explain what it means for a sequence of opengl
calls to be well-formed.


> Event-B
> http://www.event-b.org/index.html
> https://web-archive.southampton.ac.uk/deploy-eprints.ecs.soton.ac.uk/111/1/sld.ch1.intro.pdf
>
> is an interesting (closed) industry take on what you seem to be
> describing - and they seem to have taken a more "D3" like approach -
> I don't have much insight into their experiences though, although I'm
> speaking to one of their practitioners.


Thanks. I haven't had an opportunity to look into event-b more closely
yet. Maybe at some point in the future.


> > Only if the code is highly unreadable for specific reasons like
> > performance or talking to legacy ABIs, it can make sense to have
> > an "executable comment" that implements the same functional
> > behavior in a very concise, readable way (with crud
> > performance/mock APIs replacing the actual behavior), and check
> > that the real code and the "executable comment" do "the same
> > thing".
>
> I'm a bit skeptical about the "doxygen" approach (i've seen this in some
> versions of Java iirc, as well) - basically you're bypassing all the
> goodness that the language parser provides by hiding from it - I'm not
> sure how this is useful other than to shimmy in a specific bandaid
> fix. Which is fine, if you're fighting fires, but I think for a more
> structured approach, I'm not seeing what you're seeing, I'm afraid. Open
> to be signposted though.


By executable comment I mean a comment that has a formal semantics that
can be executed.

Think more pre&post conditions + ghost code. E.g., for a tricky sort
algorithm like timsort the comment might be something like

var __ghost_array = copy(input_array);

.... // complicated timsort code here

insertion_sort(&__ghost_array); // timsort should give the same output
as insertion stort, note that both are stable

for (i in 0...array_len) {
    assert (sorted_array[i] == __ghost_array[i]);
}


This is probably not going to help you find the well-known timsort bug,
but it might be enough to find more trivial mistakes when rolling your
own timsort.
Anyways this is what I mean by executable comment - a more readable,
maintainable implementation of the code that tells you what the code
ought to be doing + some checks to say that they're really doing the
same thing.

As I understand, doxygen is just a way to link human-readable
documentation with code, but maybe I'm wrong about this.


> > The main issues I see when we look at it through the skyscraper
> > analogy is that usually people in other engineering disciplines
> > have CAD tools - backed by very powerful computers - to really
> > systematically and thoroughly check the design. What we usually
> > have is stress testing and some test suites built by hand that
> > miss many cases.
>
> Interestingly, I was surprised at the amount of compute available "off
> the shelf" - the other day I sent a model for help, saying with a bit of
> trepidation that that it might take about 25G of RAM and thrash the
> machine, so please be careful - the volunteer came back to me and said
> he has 128G RAM available, and 1TB RAM on a pinch! What takes an hour
> to run on my i5 + 16GB laptop, took him 3minutes!
>
> So in some sense, I believe we're nearing the "Deep Neural Nets" moment,
> when GPU computation took inference to the next level.


Yeah, it's pretty impressive. Actually even the power of laptops is
pretty impressive.
Things like GenMC or Dat3M can do bounded verification of (slightly
massaged) C algorithms from the linux kernel by more or less exploring
the full execution space (of a bounded client) within a couple of
minutes on a laptop.

But for many scenarios, we are still a factor of at least trillions away
from using something where exhaustive exploration works - not something
that we'll fix by tying a bunch of supercomputers together...


best wishes,

jonas


2023-11-06 06:15:12

by Mathew, Cherry G.*

[permalink] [raw]
Subject: Re: [RFC+Patch] Formal models as source of truth for Software Architects.

Hi Jonas,

>>>>> Jonas Oberhauser <[email protected]> writes:


[...]


> I meant that we implemented an internal tool to transpile from C
> to PlusCal.

I'm curious about the design/architecture here - how did you manage the
logical mappings from C->PlusCal - did you have a third language to
specify the mappings, or did you use heuristics with inherent
assumptions ?

> It sounded like a great idea at the time. But then it quickly fell
> out of use.

This is something I'm keen to understand why - was this because
programmers focussed on the C code, and the transpiler+constraints
became a "testing problem" which ended up in bitrot ? Or is there
something related to the software methodology/development process ? Or
perhaps the percieved ROI of formal verification wasn't as much as
initially thought ? Something else ?


[...]


> For reuse, I think the main issue is that implementation code is
> always a source of truth - the truth of what's really going to be
> executed. If we want to have a second source of truth, it should
> be a different truth, such as "assumptions of the other parts of
> the system".

> Since you already have this source of truth, if you make a
> different implementation in another kernel, you can compare what
> the original driver was doing with what your new implementation is
> doing. There's no need to have yet another copy of what the
> driver might be doing.


I understand what you're saying, but there are a few points that I'm
probably not able to express clearly.

Just to set context, and not to state the obvious - as you likely
already know, Formal languages such as pluscal or promela have an
"execution model" that is different from a programming language - in
that, when one writes code in them, one's mental model needs to pay
attention to behaviour, whereas function becomes a more abstract
problem, as you pointed out. I wrote a very hand wavy description of
this in the context of spin:

https://mail-index.netbsd.org/tech-kern/2023/09/28/msg029203.html

I believe this kind of mental frame needs its own discipline, and is an
opportunity to divide concerns - that of design/architecture vs. that of
engineering/implementation.

So to return to your concern about code duplication, in the context of
codegen, one could make the same argument about compiled or transpiled
code - if it were manually transpiled. And yet we are comfortable as
programmers, assuming that the "higher level language" is the source of
truth, while happily stepping "down" to gcc __inline__ __asm__ {} when
needed. So, for eg: (and I believe there are tools out there that can
do this to some degree) - if the programming code could be
auto-generated/"compiled" from the formal specification, then this would
become directly analogous.


[...]

>> Can you give me an example of how this works/a pre-existing tool
>> that helps with this simplification ? I'm actually currently
>> looking at re-writing modex to do precisely this (but with the
>> translation end-point being something close to the original
>> model).


> I think any higher level language, including C, goes into this
> direction. Some are just a lot better at building abstractions and
> describing the code more model-like than
> tiniest-implementation-detail-like.

C is problematic because it doesn't for eg: define concurrency or
consistency models - in many cases, even the data types are not clearly
defined (eg: "integer" is machine word size dependant). So it's really
hard to specify something formal at the level of C that is not very
context (OS/CPU arch) specific. This is one of the reasons why for eg:
in spin's promela, data types are extremely limited, and very precisely
defined. I'm sure there are several other differences, I'm not expert
enough to comment - just sharing my observations so far. The point being
that programming languages such as C are probably not expressive enough
to encapsulate formal models precisely enough.

> (and sometimes that's not what you want).

>> > So writing the blueprint twice - once in some modelling
>> language > and once in a language meant for execution - doesn't
>> really > provide a lot of benefit. But it requires doing changes
>> at two > levels whenever the software design changes. It just
>> violates DRY.
>>
>> IMHO, this would be true if:
>>
>> a) The implementation language was expressive enough to cleanly
>> encapsulate the model.
>>
>> b) If the toolchain around it were able to do model verification
>> (eg: by exploring LTL bounds).


> We are building internal tools for such b) things. Not quite
> exhaustive formal verification tools, but tools that can express
> and check more complex properties at source level without false
> positives.

> (They may have false negatives, but that's not really a
> showstopper. False positives would be.)

Fair enough - this is the level of tradeoff that only someone with a
clear industrial application would be able to make, as I imagine you
are. This is also where I believe the gap between theory (of formal
methods) and practise is, so this is insightful - I'd be curious to know
more, if you're able to share.


[...]

>>
>> I'm a bit skeptical about the "doxygen" approach (i've seen this
>> in some versions of Java iirc, as well) - basically you're
>> bypassing all the goodness that the language parser provides by
>> hiding from it - I'm not sure how this is useful other than to
>> shimmy in a specific bandaid fix. Which is fine, if you're
>> fighting fires, but I think for a more structured approach, I'm
>> not seeing what you're seeing, I'm afraid. Open to be signposted
>> though.


> By executable comment I mean a comment that has a formal semantics
> that can be executed.

> Think more pre&post conditions + ghost code. E.g., for a tricky
> sort algorithm like timsort the comment might be something like

> var __ghost_array = copy(input_array);

> .... // complicated timsort code here

> insertion_sort(&__ghost_array); // timsort should give the same
> output as insertion stort, note that both are stable

> for (i in 0...array_len) {     assert (sorted_array[i] ==
> __ghost_array[i]); }


> This is probably not going to help you find the well-known timsort
> bug, but it might be enough to find more trivial mistakes when
> rolling your own timsort. Anyways this is what I mean by
> executable comment - a more readable, maintainable implementation
> of the code that tells you what the code ought to be doing + some
> checks to say that they're really doing the same thing.

This looks closer to testing to me - the assertions for eg: seems to be
atemporal ie; only concerned about "immedate" values, unlike LTL which
can check behaviour across an "execution sequence" (in the set of all
possible execution sequences). So from an FV perspective, I would write
the assertion to look more like: "eventually is_sorted(array)", where
is_sorted() has magic to check to if the array elements are sorted in
whatever required order.

> As I understand, doxygen is just a way to link human-readable
> documentation with code, but maybe I'm wrong about this.

Yes, I just meant that as a way to put parsable code in comments.

[...]

>>
>> So in some sense, I believe we're nearing the "Deep Neural Nets"
>> moment, when GPU computation took inference to the next level.


> Yeah, it's pretty impressive. Actually even the power of laptops
> is pretty impressive. Things like GenMC or Dat3M can do bounded
> verification of (slightly massaged) C algorithms from the linux
> kernel by more or less exploring the full execution space (of a
> bounded client) within a couple of minutes on a laptop.

> But for many scenarios, we are still a factor of at least
> trillions away from using something where exhaustive exploration
> works - not something that we'll fix by tying a bunch of
> supercomputers together...


I agree - but it's still quite impressive - plus there are ways to be
smart about bounding the search space - eg: using modularity and
inter-module communications interfaces, optimisation techniques such as
partial order reduction, etc. I'm sure you've made those optimisations -
and I'm curious to know what kind of "CI" performance is possible in
state of the art.

Best,

--
MattC/(~cherry)

2024-05-06 09:09:54

by Jonas Oberhauser

[permalink] [raw]
Subject: Re: [RFC+Patch] Formal models as source of truth for Software Architects.



Am 11/6/2023 um 7:12 AM schrieb Mathew, Cherry G.*:
> Hi Jonas,
>
>>>>>> Jonas Oberhauser <[email protected]> writes:
>
>
> [...]
>
>
> > I meant that we implemented an internal tool to transpile from C
> > to PlusCal.
>
> I'm curious about the design/architecture here - how did you manage the
> logical mappings from C->PlusCal - did you have a third language to
> specify the mappings, or did you use heuristics with inherent
> assumptions ?


We stayed within a subset of C where we were quite certain about what
the semantics is on our architecture. You can add assertions for UB
conditions.


>
> > It sounded like a great idea at the time. But then it quickly fell
> > out of use.
>
> This is something I'm keen to understand why - was this because
> programmers focussed on the C code, and the transpiler+constraints
> became a "testing problem" which ended up in bitrot ? Or is there
> something related to the software methodology/development process ? Or
> perhaps the percieved ROI of formal verification wasn't as much as
> initially thought ? Something else ?
>

Mostly because TLA became too slow. We are using much faster stateless
modelcheckers
with optimal DPOR or fast SMT backends now (
https://gitee.com/s4c/vsyncer ) which work on llvm-ir level, so the need
to transpile into yet another language disappeared.

>
> [...]
>
>
> > For reuse, I think the main issue is that implementation code is
> > always a source of truth - the truth of what's really going to be
> > executed. If we want to have a second source of truth, it should
> > be a different truth, such as "assumptions of the other parts of
> > the system".
>
> > Since you already have this source of truth, if you make a
> > different implementation in another kernel, you can compare what
> > the original driver was doing with what your new implementation is
> > doing. There's no need to have yet another copy of what the
> > driver might be doing.
>
>
> I understand what you're saying, but there are a few points that I'm
> probably not able to express clearly.
>
> Just to set context, and not to state the obvious - as you likely
> already know, Formal languages such as pluscal or promela have an
> "execution model" that is different from a programming language - in
> that, when one writes code in them, one's mental model needs to pay
> attention to behaviour, whereas function becomes a more abstract
> problem[...]. I wrote a very hand wavy description of
> this in the context of spin:
>
> https://mail-index.netbsd.org/tech-kern/2023/09/28/msg029203.html


I don't understand this paragraph.

>
> So to return to your concern about code duplication, in the context of
> codegen, one could make the same argument about compiled or transpiled
> code - if it were manually transpiled. And yet we are comfortable as
> programmers, assuming that the "higher level language" is the source of
> truth, while happily stepping "down" to gcc __inline__ __asm__ {} when
> needed. So, for eg: (and I believe there are tools out there that can
> do this to some degree) - if the programming code could be
> auto-generated/"compiled" from the formal specification, then this would
> become directly analogous.


In system software, there are a lot of implementation concerns that go
beyond what you'd normally express in a formal spec. Once you add all of
that detail, it becomes an implementation.


>
>
> [...]
>
> >> Can you give me an example of how this works/a pre-existing tool
> >> that helps with this simplification ? I'm actually currently
> >> looking at re-writing modex to do precisely this (but with the
> >> translation end-point being something close to the original
> >> model).
>
>
> > I think any higher level language, including C, goes into this
> > direction. Some are just a lot better at building abstractions and
> > describing the code more model-like than
> > tiniest-implementation-detail-like.
>
> C is problematic because it doesn't for eg: define concurrency or
> consistency models - in many cases, even the data types are not clearly
> defined (eg: "integer" is machine word size dependant). So it's really
> hard to specify something formal at the level of C that is not very
> context (OS/CPU arch) specific.


In practice this is a non-issue.
Either prove that it works for the least common denominator. E.g. if you
prove that your int values stay within 16 bit, or you know your
architectures all have 32 bit int, there's no issue. Or you use a u32 or
uint32_t.

If you're paranoid about future use, you can write some assertions
checking that the assumptions you make here are really valid in your
real environment.

And C has concurrency models, both in C11+ and in LKMM.


>
> > By executable comment I mean a comment that has a formal semantics
> > that can be executed.
>
> > Think more pre&post conditions + ghost code. E.g., for a tricky
> > sort algorithm like timsort the comment might be something like
>
> > var __ghost_array = copy(input_array);
>
> > .... // complicated timsort code here
>
> > insertion_sort(&__ghost_array); // timsort should give the same
> > output as insertion stort, note that both are stable
>
> > for (i in 0...array_len) {     assert (sorted_array[i] ==
> > __ghost_array[i]); }
>
>
> > This is probably not going to help you find the well-known timsort
> > bug, but it might be enough to find more trivial mistakes when
> > rolling your own timsort. Anyways this is what I mean by
> > executable comment - a more readable, maintainable implementation
> > of the code that tells you what the code ought to be doing + some
> > checks to say that they're really doing the same thing.
>
> This looks closer to testing to me - the assertions for eg: seems to be
> atemporal ie; only concerned about "immedate" values, unlike LTL which
> can check behaviour across an "execution sequence" (in the set of all
> possible execution sequences). So from an FV perspective, I would write
> the assertion to look more like: "eventually is_sorted(array)", where
> is_sorted() has magic to check to if the array elements are sorted in
> whatever required order.

In principle you can extend the assertion language to temporal formulas
as well. There are some cases where we write assertions over a whole
trace. But it's easier to find engineers who write some non-temporal
assertions than those who write temporal assertions.

Best wishes,

jonas