2022-10-03 22:26:48

by Ali Raza

Subject: [RFC UKL 10/10] Kconfig: Add config option for enabling and sample for testing UKL

Add the Kconfig file that enables building UKL. The documentation
introduces the technical details of how UKL works and the motivation
for it. The sample provides a simple program that still uses the
standard system call interface but does not require a modified C
library.

Cc: Jonathan Corbet <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Segall <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Josh Poimboeuf <[email protected]>

Co-developed-by: Eric B Munson <[email protected]>
Signed-off-by: Eric B Munson <[email protected]>
Co-developed-by: Ali Raza <[email protected]>
Signed-off-by: Ali Raza <[email protected]>
---
Documentation/index.rst | 1 +
Documentation/ukl/ukl.rst | 104 ++++++++++++++++++++++++++++++++++++++
Kconfig | 2 +
kernel/Kconfig.ukl | 41 +++++++++++++++
samples/ukl/Makefile | 16 ++++++
samples/ukl/README | 17 +++++++
samples/ukl/syscall.S | 28 ++++++++++
samples/ukl/tcp_server.c | 99 ++++++++++++++++++++++++++++++++++++
8 files changed, 308 insertions(+)
create mode 100644 Documentation/ukl/ukl.rst
create mode 100644 kernel/Kconfig.ukl
create mode 100644 samples/ukl/Makefile
create mode 100644 samples/ukl/README
create mode 100644 samples/ukl/syscall.S
create mode 100644 samples/ukl/tcp_server.c

diff --git a/Documentation/index.rst b/Documentation/index.rst
index 4737c18c97ff..42f8cb7d4cae 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -167,6 +167,7 @@ to ReStructured Text format, or are simply too old.

tools/index
staging/index
+ ukl/ukl.rst


Translations
diff --git a/Documentation/ukl/ukl.rst b/Documentation/ukl/ukl.rst
new file mode 100644
index 000000000000..a07ebb51169e
--- /dev/null
+++ b/Documentation/ukl/ukl.rst
@@ -0,0 +1,104 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Unikernel Linux (UKL)
+=====================
+
+Unikernel Linux (UKL) is a research project aimed at integrating
+application-specific optimizations into the Linux kernel. This RFC aims to
+introduce this research to the community. Any feedback regarding the idea,
+goals, implementation and research is highly appreciated.
+
+Unikernels are specialized operating systems where an application is linked
+directly with the kernel and runs in supervisor mode. This allows the
+developers to implement application specific optimizations to the kernel,
+which can be directly invoked by the application (without going through the
+syscall path). An application can control scheduling and resource
+management and directly access the hardware. The application and the kernel
+can be co-optimized, e.g., through LTO and PGO. These optimizations, and
+others, can give applications large performance benefits over general
+purpose operating systems.
+
+Linux is the de facto operating system of today. Applications depend on its
+battle tested code base, large developer community, support for legacy
+code, a huge ecosystem of tools and utilities, and a wide range of
+compatible hardware and device drivers. Linux also allows some degree of
+application specific optimizations through build time config options,
+runtime configuration, and recently through eBPF. But still, there is a
+need for even more fine-grained application specific optimizations, and
+some developers resort to kernel bypass techniques.
+
+Unikernel Linux (UKL) aims to get the best of both worlds by bringing
+application specific optimizations to the Linux ecosystem. This way,
+unmodified applications can keep getting the benefits of Linux while taking
+advantage of the unikernel-style optimizations. Optionally, applications
+can be modified to invoke deeper optimizations.
+
+There are two steps to unikernel-izing Linux: first, equip Linux with
+a unikernel model, and second, actually use that model to implement
+application specific optimizations. This patch focuses on the first part.
+Through this patch, unmodified applications can be built as Linux
+unikernels, albeit with only modest performance advantages. Like
+unikernels, UKL would allow an application to be statically linked into the
+kernel and executed in supervisor mode. However, UKL preserves most of the
+invariants and design of Linux, including a separate page-able application
+portion of the address space and a pinned kernel portion, the ability to
+run multiple processes, and distinct execution modes for application and
+kernel code. Kernel execution mode and application execution mode are
+different, e.g., the application execution mode allows application threads
+to be scheduled, handle signals, etc., which do not apply to kernel
+threads. An application built as a Linux unikernel has its text and data
+loaded with the kernel at boot time, while the rest of the address space
+would remain unchanged. These applications invoke the system call
+functionality through a function call into the kernel system call entry
+point instead of through the syscall assembly instruction. UKL would
+support a normal userspace so the UKL application can be started, managed,
+profiled, etc., using normal command line utilities.
+
+Once Linux has a unikernel model, different application specific
+optimizations are possible. We have tried a few, e.g., fast system call
+transitions, shared stacks to allow LTO, invoking kernel functions
+directly, etc. We have seen huge performance benefits, details of which are
+not relevant to this patch and can be found in our paper.
+(https://arxiv.org/pdf/2206.00789.pdf)
+
+UKL differs significantly from previous projects, e.g., UML, KML and LKL.
+User Mode Linux (UML) is a virtual machine monitor implemented on top of
+the syscall interface, a very different goal from UKL. Kernel Mode Linux
+(KML) allows applications to run in kernel mode and replaces syscalls with
+function calls. While KML stops there, UKL goes further: it links the
+application and kernel together, which allows further optimizations, e.g.,
+fast system call transitions, shared stacks to allow LTO, and invoking
+kernel functions directly. Linux Kernel Library (LKL) harvests
+arch-independent code from Linux and takes it to userspace as a library to
+be linked with applications; a host needs to provide the arch-dependent
+functionality. This model is very different from UKL. A detailed discussion
+of related work is present in the paper linked above.
+
+See samples/ukl for a simple TCP echo server example which can be built as
+a normal user space application and also as a UKL application. In the Linux
+config options, a path to the compiled and partially linked application
+binary can be specified. A kernel built with UKL enabled looks for the
+binary at this location and links it with the kernel. Applications and
+required libraries need to be compiled with the -mno-red-zone and
+-mcmodel=kernel flags: kernel mode execution can trample on application red
+zones, and the application needs the kernel memory model in order to link
+with the kernel and be loaded in the high end of the address space.
+Examples of other applications, such as Redis and Memcached, along with
+glibc and libgcc, can be found at https://github.com/unikernelLinux/ukl
+
+List of authors and contributors:
+=================================
+
+Ali Raza - [email protected]
+Thomas Unger - [email protected]
+Matthew Boyd - [email protected]
+Eric Munson - [email protected]
+Parul Sohal - [email protected]
+Ulrich Drepper - [email protected]
+Richard Jones - [email protected]
+Daniel Bristot de Oliveira - [email protected]
+Larry Woodman - [email protected]
+Renato Mancuso - [email protected]
+Jonathan Appavoo - [email protected]
+Orran Krieger - [email protected]
+
diff --git a/Kconfig b/Kconfig
index 745bc773f567..2a4594ae472c 100644
--- a/Kconfig
+++ b/Kconfig
@@ -29,4 +29,6 @@ source "lib/Kconfig"

source "lib/Kconfig.debug"

+source "kernel/Kconfig.ukl"
+
source "Documentation/Kconfig"
diff --git a/kernel/Kconfig.ukl b/kernel/Kconfig.ukl
new file mode 100644
index 000000000000..c2c5e1003605
--- /dev/null
+++ b/kernel/Kconfig.ukl
@@ -0,0 +1,41 @@
+menuconfig UNIKERNEL_LINUX
+ bool "Unikernel Linux"
+ depends on X86_64 && !RANDOMIZE_BASE && !PAGE_TABLE_ISOLATION
+ help
+ Unikernel Linux allows for a single, privileged process to be
+ linked with the kernel binary and be executed in place of or
+ alongside a more traditional user space.
+
+ If you don't know what this is, say N.
+
+config UKL_TLS
+ bool "Enable TLS for UKL application"
+ depends on UNIKERNEL_LINUX
+ default y
+ help
+ Not all applications will make use of thread local storage,
+ but we need to account for it in the linker script if used.
+ For the application in samples/ this should be disabled, but
+ if you are working with glibc this should be 'Y'.
+
+ If unsure say 'Y' here
+
+config UKL_NAME
+ string "UKL Exec target"
+ depends on UNIKERNEL_LINUX
+ default "/UKL"
+ help
+ We need a way to trigger the start of the UKL application,
+ either by the kernel in place of init or by userspace when setup
+ is finished. The value given here is compared against the
+ filename passed to exec and if they match UKL is started.
+ For a more 'traditional' unikernel model, the value set here
+ should be given to the init= boot parameter.
+
+config UKL_ARCHIVE_PATH
+ string "Path to static application archive"
+ depends on UNIKERNEL_LINUX
+ default "../UKL.a"
+ help
+ Where the linker should look for the statically linked application
+ and dependency archive.
diff --git a/samples/ukl/Makefile b/samples/ukl/Makefile
new file mode 100644
index 000000000000..93beb7750d4b
--- /dev/null
+++ b/samples/ukl/Makefile
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0
+
+CFLAGS += -I usr/include -fno-PIC -mno-red-zone -mcmodel=kernel
+
+UKL.a: tcp_server.o syscall.o userspace
+ $(AR) cr UKL.a tcp_server.o syscall.o
+ objcopy --prefix-symbols=ukl_ UKL.a
+
+tcp_server.o: tcp_server.c
+syscall.o: syscall.S
+
+userspace:
+ gcc -o tcp_server tcp_server.c
+
+clean:
+ rm -f UKL.a tcp_server.o syscall.o tcp_server
diff --git a/samples/ukl/README b/samples/ukl/README
new file mode 100644
index 000000000000..fbb771da033a
--- /dev/null
+++ b/samples/ukl/README
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+UKL test program
+================
+
+tcp_server.c is an epoll-based TCP echo server written in C which uses port
+5555 by default. syscall.S implements the syscall() function as a call
+instruction into the kernel's system call entry point. Normally, C
+libraries provide a syscall() function that translates into the syscall
+assembly instruction. Run `make` and it will create UKL.a and tcp_server.
+UKL.a can then be copied to where the UKL Linux build expects it to be
+present. This location can be changed through the Linux config options (by
+running `make menuconfig` etc.). The resulting Linux kernel can be run, and
+once the userspace comes up, the echo server can be started by running the
+UKL exec command, again chosen through the Linux config options. tcp_server
+is a userspace binary of the same echo server which can be run normally.
+This shows that UKL can run code that also runs unmodified in userspace.
diff --git a/samples/ukl/syscall.S b/samples/ukl/syscall.S
new file mode 100644
index 000000000000..95d1c177fb05
--- /dev/null
+++ b/samples/ukl/syscall.S
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+ .global _start
+_start:
+ jmp main
+
+ .global syscall
+
+/* Usage: long syscall (syscall_number, arg1, arg2, arg3, arg4, arg5, arg6)
+ We need to do some arg shifting, the syscall_number will be in
+ rax. */
+
+ .text
+syscall:
+ movq %rdi, %rax /* Syscall number -> rax. */
+ movq %rsi, %rdi /* shift arg1 - arg5. */
+ movq %rdx, %rsi
+ movq %rcx, %rdx
+ movq %r8, %r10
+ movq %r9, %r8
+ movq 8(%rsp),%r9 /* arg6 is on the stack. */
+ call entry_SYSCALL_64 /* Do the system call. */
+ cmpq $-4095, %rax /* Check %rax for error. */
+ jae loop /* Jump to error handler if error. */
+ ret /* Return to caller. */
+
+loop:
+ jmp loop
diff --git a/samples/ukl/tcp_server.c b/samples/ukl/tcp_server.c
new file mode 100644
index 000000000000..abf1a0e2bb79
--- /dev/null
+++ b/samples/ukl/tcp_server.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <sys/epoll.h>
+#include <arpa/inet.h>
+#include <netinet/tcp.h>
+
+#define BACKLOG 512
+#define MAX_EVENTS 128
+#define MAX_MESSAGE_LEN 2048
+
+void error(char *msg);
+extern long syscall(long number, ...);
+
+int main(void)
+{
+ // some variables we need
+ struct sockaddr_in server_addr, client_addr;
+ socklen_t client_len = sizeof(client_addr);
+ int bytes_received;
+ char buffer[MAX_MESSAGE_LEN];
+ int on;
+ int result;
+ int sock_listen_fd, newsockfd;
+
+ // setup socket
+ sock_listen_fd = syscall(41, AF_INET, SOCK_STREAM, 0);
+ if (sock_listen_fd < 0)
+ error("Error creating socket..\n");
+
+ server_addr.sin_family = AF_INET;
+ server_addr.sin_port = 45845; // htons(5555) on little-endian x86-64
+ server_addr.sin_addr.s_addr = INADDR_ANY;
+
+ // set TCP NODELAY
+ on = 1;
+ result = syscall(54, sock_listen_fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
+ if (result < 0)
+ error("Can't set TCP_NODELAY to on");
+
+ // bind socket and listen for connections
+ if (syscall(49, sock_listen_fd, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0)
+ error("Error binding socket..\n");
+
+ if (syscall(50, sock_listen_fd, BACKLOG) < 0)
+ error("Error listening..\n");
+
+ struct epoll_event ev, events[MAX_EVENTS];
+ int new_events, sock_conn_fd, epollfd;
+
+ epollfd = syscall(213, MAX_EVENTS);
+ if (epollfd < 0)
+ error("Error creating epoll..\n");
+
+ ev.events = EPOLLIN;
+ ev.data.fd = sock_listen_fd;
+
+ if (syscall(233, epollfd, EPOLL_CTL_ADD, sock_listen_fd, &ev) == -1)
+ error("Error adding new listening socket to epoll..\n");
+
+ while (1) {
+ new_events = syscall(232, epollfd, events, MAX_EVENTS, -1);
+
+ if (new_events == -1)
+ error("Error in epoll_wait..\n");
+
+ for (int i = 0; i < new_events; ++i) {
+ if (events[i].data.fd == sock_listen_fd) {
+ sock_conn_fd = syscall(288, sock_listen_fd,
+ (struct sockaddr *)&client_addr,
+ &client_len, SOCK_NONBLOCK);
+ if (sock_conn_fd == -1)
+ error("Error accepting new connection..\n");
+
+ ev.events = EPOLLIN | EPOLLET;
+ ev.data.fd = sock_conn_fd;
+ if (syscall(233, epollfd, EPOLL_CTL_ADD, sock_conn_fd, &ev) == -1)
+ error("Error adding new event to epoll..\n");
+ } else {
+ newsockfd = events[i].data.fd;
+ bytes_received = syscall(45, newsockfd, buffer, MAX_MESSAGE_LEN,
+ 0, NULL, NULL);
+ if (bytes_received <= 0) {
+ syscall(233, epollfd, EPOLL_CTL_DEL, newsockfd, NULL);
+ syscall(48, newsockfd, SHUT_RDWR);
+ } else {
+ syscall(44, newsockfd, buffer, bytes_received, 0, NULL, 0);
+ }
+ }
+ }
+ }
+}
+
+void error(char *msg)
+{
+ syscall(1, 1, msg, 15);
+ syscall(60, 1);
+}
--
2.21.3
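
The `cmpq $-4095, %rax` / `jae` pair in samples/ukl/syscall.S relies on the
Linux convention that a failing system call returns a negated errno value in
the range -4095..-1, so a single unsigned comparison detects an error. A
minimal C sketch of the same check follows. The helper names are
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: the C equivalent of
 * "cmpq $-4095, %rax; jae ...". Viewed as an unsigned number, every value
 * in -4095..-1 is >= (unsigned long)-4095, so one comparison covers all
 * encodable error codes.
 */
static inline bool ukl_ret_is_error(long ret)
{
	return (unsigned long)ret >= (unsigned long)-4095L;
}

/* Boundary cases, written as assertions. */
static inline void ukl_ret_is_error_examples(void)
{
	assert(ukl_ret_is_error(-1));      /* -EPERM */
	assert(ukl_ret_is_error(-4095));   /* lowest encodable error */
	assert(!ukl_ret_is_error(-4096));  /* just outside the error range */
	assert(!ukl_ret_is_error(0));      /* success */
}
```

As written, the sample's assembly appears to spin in `loop` on error rather
than return, so in the UKL build the C-level error checks in tcp_server.c
would not be reached; they matter for the userspace build, where the C
library's syscall() is linked instead.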

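samples/ukl/tcp_server.c passes raw x86-64 system call numbers and a
pre-swapped port value. The sketch below, which is not part of the patch,
checks those numbers against the standard `SYS_*` macros from
`<sys/syscall.h>` and shows why 45845 corresponds to port 5555. The
`ukl_sample_swap16` name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>

/*
 * Swap the two bytes of a 16-bit value. This is what htons() does on a
 * little-endian host such as x86-64.
 */
static inline uint16_t ukl_sample_swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

#if defined(__x86_64__) && !defined(__ILP32__)
/* The literals used by the sample, checked against the x86-64 table. */
_Static_assert(SYS_write == 1, "write");
_Static_assert(SYS_socket == 41, "socket");
_Static_assert(SYS_sendto == 44, "sendto");
_Static_assert(SYS_recvfrom == 45, "recvfrom");
_Static_assert(SYS_shutdown == 48, "shutdown");
_Static_assert(SYS_bind == 49, "bind");
_Static_assert(SYS_listen == 50, "listen");
_Static_assert(SYS_setsockopt == 54, "setsockopt");
_Static_assert(SYS_exit == 60, "exit");
_Static_assert(SYS_epoll_create == 213, "epoll_create");
_Static_assert(SYS_epoll_wait == 232, "epoll_wait");
_Static_assert(SYS_epoll_ctl == 233, "epoll_ctl");
_Static_assert(SYS_accept4 == 288, "accept4");
#endif
```

Port 5555 is 0x15B3; with its bytes swapped it becomes 0xB315, which is
45845, the value assigned to `sin_port` in the sample. Using the `SYS_*`
names in the sample itself would be one way to make each call
self-describing, assuming the header is available to the UKL build.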

2022-10-04 02:39:14

by Bagas Sanjaya

Subject: Re: [RFC UKL 10/10] Kconfig: Add config option for enabling and sample for testing UKL

On 10/4/22 05:21, Ali Raza wrote:
> Add the KConfig file that will enable building UKL. Documentation
> introduces the technical details for how UKL works and the motivations
> behind why it is useful. Sample provides a simple program that still uses
> the standard system call interface, but does not require a modified C
> library.
>
<snipped>
> Documentation/index.rst | 1 +
> Documentation/ukl/ukl.rst | 104 ++++++++++++++++++++++++++++++++++++++
> Kconfig | 2 +
> kernel/Kconfig.ukl | 41 +++++++++++++++
> samples/ukl/Makefile | 16 ++++++
> samples/ukl/README | 17 +++++++
> samples/ukl/syscall.S | 28 ++++++++++
> samples/ukl/tcp_server.c | 99 ++++++++++++++++++++++++++++++++++++
> 8 files changed, 308 insertions(+)
> create mode 100644 Documentation/ukl/ukl.rst
> create mode 100644 kernel/Kconfig.ukl
> create mode 100644 samples/ukl/Makefile
> create mode 100644 samples/ukl/README
> create mode 100644 samples/ukl/syscall.S
> create mode 100644 samples/ukl/tcp_server.c

Shouldn't the documentation be split into its own patch?

--
An old man doll... just what I always wanted! - Clara

2022-10-06 22:03:59

by Ali Raza

Subject: Re: [RFC UKL 10/10] Kconfig: Add config option for enabling and sample for testing UKL

On 10/3/22 22:11, Bagas Sanjaya wrote:
> On 10/4/22 05:21, Ali Raza wrote:
>> Add the KConfig file that will enable building UKL. Documentation
>> introduces the technical details for how UKL works and the motivations
>> behind why it is useful. Sample provides a simple program that still uses
>> the standard system call interface, but does not require a modified C
>> library.
>>
> <snipped>
>> Documentation/index.rst | 1 +
>> Documentation/ukl/ukl.rst | 104 ++++++++++++++++++++++++++++++++++++++
>> Kconfig | 2 +
>> kernel/Kconfig.ukl | 41 +++++++++++++++
>> samples/ukl/Makefile | 16 ++++++
>> samples/ukl/README | 17 +++++++
>> samples/ukl/syscall.S | 28 ++++++++++
>> samples/ukl/tcp_server.c | 99 ++++++++++++++++++++++++++++++++++++
>> 8 files changed, 308 insertions(+)
>> create mode 100644 Documentation/ukl/ukl.rst
>> create mode 100644 kernel/Kconfig.ukl
>> create mode 100644 samples/ukl/Makefile
>> create mode 100644 samples/ukl/README
>> create mode 100644 samples/ukl/syscall.S
>> create mode 100644 samples/ukl/tcp_server.c
>
> Shouldn't the documentation be split into its own patch?
>
Thanks for pointing that out.

--Ali

2022-10-07 10:36:03

by Masahiro Yamada

Subject: Re: [RFC UKL 10/10] Kconfig: Add config option for enabling and sample for testing UKL

On Fri, Oct 7, 2022 at 6:29 AM Ali Raza <[email protected]> wrote:
>
> On 10/3/22 22:11, Bagas Sanjaya wrote:
> > On 10/4/22 05:21, Ali Raza wrote:
> >> Add the KConfig file that will enable building UKL. Documentation
> >> introduces the technical details for how UKL works and the motivations
> >> behind why it is useful. Sample provides a simple program that still uses
> >> the standard system call interface, but does not require a modified C
> >> library.
> >>
> > <snipped>
> >> Documentation/index.rst | 1 +
> >> Documentation/ukl/ukl.rst | 104 ++++++++++++++++++++++++++++++++++++++
> >> Kconfig | 2 +
> >> kernel/Kconfig.ukl | 41 +++++++++++++++
> >> samples/ukl/Makefile | 16 ++++++
> >> samples/ukl/README | 17 +++++++
> >> samples/ukl/syscall.S | 28 ++++++++++
> >> samples/ukl/tcp_server.c | 99 ++++++++++++++++++++++++++++++++++++
> >> 8 files changed, 308 insertions(+)
> >> create mode 100644 Documentation/ukl/ukl.rst
> >> create mode 100644 kernel/Kconfig.ukl
> >> create mode 100644 samples/ukl/Makefile
> >> create mode 100644 samples/ukl/README
> >> create mode 100644 samples/ukl/syscall.S
> >> create mode 100644 samples/ukl/tcp_server.c
> >
> > Shouldn't the documentation be split into its own patch?
> >
> Thanks for pointing that out.
>
> --Ali
>


The commit subject "Kconfig:" is used for changes
under scripts/kconfig/.

Please use something else.


--
Best Regards
Masahiro Yamada

2022-10-13 17:30:13

by Ali Raza

Subject: Re: [RFC UKL 10/10] Kconfig: Add config option for enabling and sample for testing UKL

On 10/7/22 06:21, Masahiro Yamada wrote:
> On Fri, Oct 7, 2022 at 6:29 AM Ali Raza <[email protected]> wrote:
>>
>> On 10/3/22 22:11, Bagas Sanjaya wrote:
>>> On 10/4/22 05:21, Ali Raza wrote:
>>>> Add the KConfig file that will enable building UKL. Documentation
>>>> introduces the technical details for how UKL works and the motivations
>>>> behind why it is useful. Sample provides a simple program that still uses
>>>> the standard system call interface, but does not require a modified C
>>>> library.
>>>>
>>> <snipped>
>>>> Documentation/index.rst | 1 +
>>>> Documentation/ukl/ukl.rst | 104 ++++++++++++++++++++++++++++++++++++++
>>>> Kconfig | 2 +
>>>> kernel/Kconfig.ukl | 41 +++++++++++++++
>>>> samples/ukl/Makefile | 16 ++++++
>>>> samples/ukl/README | 17 +++++++
>>>> samples/ukl/syscall.S | 28 ++++++++++
>>>> samples/ukl/tcp_server.c | 99 ++++++++++++++++++++++++++++++++++++
>>>> 8 files changed, 308 insertions(+)
>>>> create mode 100644 Documentation/ukl/ukl.rst
>>>> create mode 100644 kernel/Kconfig.ukl
>>>> create mode 100644 samples/ukl/Makefile
>>>> create mode 100644 samples/ukl/README
>>>> create mode 100644 samples/ukl/syscall.S
>>>> create mode 100644 samples/ukl/tcp_server.c
>>>
>>> Shouldn't the documentation be split into its own patch?
>>>
>> Thanks for pointing that out.
>>
>> --Ali
>>
>
>
> The commit subject "Kconfig:" is used for changes
> under scripts/kconfig/.
>
> Please use something else.
>
>
Will do, thank you!

--Ali