Resending the whole set, as Andrew asked.
The only change is 'Reviewed-by' added to the commit log for 1/2;
there are no other changes.
The v3 patch for 2/2 is the same as before:
http://marc.info/?l=linux-mm&m=150148267803458&w=2
Previous patch set here:
http://marc.info/?l=linux-mm&m=150095815413275&w=2
Prakash Sangappa (2):
  userfaultfd: Add feature to request for a signal delivery
  userfaultfd: selftest: Add tests for UFFD_FEATURE_SIGBUS feature

 fs/userfaultfd.c                         |   3 +
 include/uapi/linux/userfaultfd.h         |  10 ++-
 tools/testing/selftests/vm/userfaultfd.c | 127 +++++++++++++++++++++++++++++-
 3 files changed, 136 insertions(+), 4 deletions(-)

In some cases, the userfaultfd mechanism should just deliver a SIGBUS
signal to the faulting process instead of a page-fault event. Handling
page-fault events in a monitor thread is unnecessary overhead in these
cases. For example, applications such as a database could use the
signal for robustness.
The database uses hugetlbfs for performance reasons. Files on the
hugetlbfs filesystem are created and huge pages are allocated using the
fallocate() API.
Pages are deallocated/freed using fallocate() hole punching support.
These files are mmapped and accessed by many processes as shared memory.
The database keeps track of which offsets in the hugetlbfs file have
pages allocated.
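
The allocate/free scheme described above can be sketched roughly as
follows. This is only an illustration, not code from the database or
from this patch: the helper names range_alloc() and range_free() are
made up here, and on hugetlbfs the offsets and lengths are assumed to
be multiples of the huge page size.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>	/* fallocate(), FALLOC_FL_* (needs _GNU_SOURCE) */
#include <sys/types.h>

/*
 * Hypothetical helper: allocate backing pages for [offset, offset + len)
 * of an open file. Returns 0 on success or a negative errno value.
 */
static int range_alloc(int fd, off_t offset, off_t len)
{
	return fallocate(fd, 0, offset, len) ? -errno : 0;
}

/*
 * Hypothetical helper: free the pages backing [offset, offset + len) by
 * punching a hole. FALLOC_FL_KEEP_SIZE leaves the file size unchanged.
 * Returns 0 on success or a negative errno value.
 */
static int range_free(int fd, off_t offset, off_t len)
{
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 offset, len) ? -errno : 0;
}
```

An application would call range_alloc() before handing an offset out to
its processes and range_free() once the offset is no longer in use, and
would record the state of each offset itself.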
Any access to a mapped address over a hole in the file, which can occur
due to bugs in the application, is considered invalid, and the process
is expected to simply receive a SIGBUS. However, currently when a hole
in the file is accessed via the mapped address, kernel/mm attempts to
automatically allocate a page at page fault time, implicitly filling
the hole in the file. This may not be the desired behavior for
applications like the database that want to explicitly manage page
allocations of hugetlbfs files.
By using the userfaultfd mechanism with this support to get a signal,
the database application can prevent pages from being allocated
implicitly when processes access mapped addresses over holes in the
file.
This patch adds the UFFD_FEATURE_SIGBUS feature to the userfaultfd
mechanism to request a SIGBUS signal.
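
For illustration, an application might enable the feature along these
lines. This is a minimal sketch and not part of the patch: the helper
name uffd_sigbus_register() is made up, and the fallback #define only
mirrors the value this patch adds, for building against older headers.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef UFFD_FEATURE_SIGBUS
#define UFFD_FEATURE_SIGBUS (1<<7)	/* value added by this patch */
#endif

/*
 * Hypothetical helper: open a userfaultfd with UFFD_FEATURE_SIGBUS and
 * register [addr, addr + len) for missing-page faults.
 * Returns the userfaultfd descriptor, or -1 on any failure.
 */
static int uffd_sigbus_register(void *addr, size_t len)
{
	struct uffdio_register reg;
	struct uffdio_api api;
	int fd;

	fd = (int)syscall(__NR_userfaultfd, O_CLOEXEC);
	if (fd < 0)
		return -1;

	/* Request the feature during the API handshake. */
	memset(&api, 0, sizeof(api));
	api.api = UFFD_API;
	api.features = UFFD_FEATURE_SIGBUS;
	if (ioctl(fd, UFFDIO_API, &api) < 0)
		goto fail;

	/* Faults on missing pages in this range are now covered. */
	memset(&reg, 0, sizeof(reg));
	reg.range.start = (unsigned long)addr;
	reg.range.len = len;
	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
	if (ioctl(fd, UFFDIO_REGISTER, &reg) < 0)
		goto fail;

	return fd;
fail:
	close(fd);
	return -1;
}
```

With the range registered this way, a process touching a hole in it is
expected to get SIGBUS rather than having a page allocated. No monitor
thread reading page-fault events is needed for that.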
See the following for the previous discussion about the database
requirement that led to this proposal, as suggested by Andrea.
http://www.spinics.net/lists/linux-mm/msg129224.html
Signed-off-by: Prakash Sangappa <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
---
fs/userfaultfd.c | 3 +++
include/uapi/linux/userfaultfd.h | 10 +++++++++-
2 files changed, 12 insertions(+), 1 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 1d622f2..0bbe7df 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -371,6 +371,9 @@ int handle_userfault(struct vm_fault *vmf, unsigned long reason)
VM_BUG_ON(reason & ~(VM_UFFD_MISSING|VM_UFFD_WP));
VM_BUG_ON(!(reason & VM_UFFD_MISSING) ^ !!(reason & VM_UFFD_WP));
+ if (ctx->features & UFFD_FEATURE_SIGBUS)
+ goto out;
+
/*
* If it's already released don't get it. This avoids to loop
* in __get_user_pages if userfaultfd_release waits on the
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 3b05953..d39d5db 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -23,7 +23,8 @@
UFFD_FEATURE_EVENT_REMOVE | \
UFFD_FEATURE_EVENT_UNMAP | \
UFFD_FEATURE_MISSING_HUGETLBFS | \
- UFFD_FEATURE_MISSING_SHMEM)
+ UFFD_FEATURE_MISSING_SHMEM | \
+ UFFD_FEATURE_SIGBUS)
#define UFFD_API_IOCTLS \
((__u64)1 << _UFFDIO_REGISTER | \
(__u64)1 << _UFFDIO_UNREGISTER | \
@@ -153,6 +154,12 @@ struct uffdio_api {
* UFFD_FEATURE_MISSING_SHMEM works the same as
* UFFD_FEATURE_MISSING_HUGETLBFS, but it applies to shmem
* (i.e. tmpfs and other shmem based APIs).
+ *
+ * UFFD_FEATURE_SIGBUS feature means no page-fault
+ * (UFFD_EVENT_PAGEFAULT) event will be delivered, instead
+ * a SIGBUS signal will be sent to the faulting process.
+ * The application process can enable this behavior by adding
+ * it to uffdio_api.features.
*/
#define UFFD_FEATURE_PAGEFAULT_FLAG_WP (1<<0)
#define UFFD_FEATURE_EVENT_FORK (1<<1)
@@ -161,6 +168,7 @@ struct uffdio_api {
#define UFFD_FEATURE_MISSING_HUGETLBFS (1<<4)
#define UFFD_FEATURE_MISSING_SHMEM (1<<5)
#define UFFD_FEATURE_EVENT_UNMAP (1<<6)
+#define UFFD_FEATURE_SIGBUS (1<<7)
__u64 features;
__u64 ioctls;
--
1.7.1
This patch adds tests for the UFFD_FEATURE_SIGBUS feature. The tests
verify that a signal is delivered instead of userfault events. They
also test the use of UFFDIO_COPY to allocate memory and retry the
access to the monitored area after signal delivery.
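
The catch-and-retry pattern used by the test can be shown in a
standalone form. The sketch below is an analogy, not the selftest
itself: it runs without userfaultfd by provoking SIGBUS through a
mapping that extends past the end of an empty temporary file, and
ftruncate() stands in for UFFDIO_COPY as the step that supplies the
missing page. The names on_sigbus() and read_with_sigbus_retry() are
made up for this example.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_env;

/* Jump back to the sigsetjmp() point instead of returning to the fault. */
static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(sigbus_env, 1);
}

/*
 * Read one byte from a page mapped over an empty temporary file.
 * The first access raises SIGBUS; the page is then supplied by
 * extending the file and the access is retried once.
 * Returns the number of signals caught, or -1 on a setup error or
 * if the signal repeats.
 */
static int read_with_sigbus_retry(void)
{
	struct sigaction act, old;
	volatile int signals = 0;	/* changes between setjmp and longjmp */
	long page = sysconf(_SC_PAGESIZE);
	volatile char *map;
	int fd, ret = -1;
	FILE *tmp;

	tmp = tmpfile();
	if (!tmp)
		return -1;
	fd = fileno(tmp);

	map = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_file;

	memset(&act, 0, sizeof(act));
	act.sa_handler = on_sigbus;
	sigemptyset(&act.sa_mask);
	if (sigaction(SIGBUS, &act, &old))
		goto out_map;

	if (sigsetjmp(sigbus_env, 1) != 0) {
		/* We get here from the handler. */
		if (++signals > 1)
			goto out_sig;		/* signal repeated: give up */
		if (ftruncate(fd, page))	/* supply the missing page */
			goto out_sig;
	}

	/* Faults on the first pass, succeeds on the retry. */
	if (map[0] == 0)
		ret = signals;
out_sig:
	sigaction(SIGBUS, &old, NULL);
out_map:
	munmap((void *)map, page);
out_file:
	fclose(tmp);
	return ret;
}
```

sigsetjmp() is called with a non-zero savemask so that the signal mask
is restored by siglongjmp(). Otherwise SIGBUS would stay blocked after
the first jump out of the handler.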
This patch also fixes a bug in uffd_poll_thread() where 'uffd'
is leaked.
Signed-off-by: Prakash Sangappa <[email protected]>
---
Change log
v3: Eliminated use of sig_repeat variable and simplified error return.
v2:
- Added comments to explain the tests.
- Fixed test to fail immediately if signal repeats.
- Addressed other review comments.
v1: https://lkml.org/lkml/2017/7/26/101
---
tools/testing/selftests/vm/userfaultfd.c | 127 +++++++++++++++++++++++++++++-
1 files changed, 124 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 1eae79a..52740ae 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -66,6 +66,7 @@
#include <sys/wait.h>
#include <pthread.h>
#include <linux/userfaultfd.h>
+#include <setjmp.h>
#ifdef __NR_userfaultfd
@@ -408,6 +409,7 @@ static int copy_page(int ufd, unsigned long offset)
userfaults++;
break;
case UFFD_EVENT_FORK:
+ close(uffd);
uffd = msg.arg.fork.ufd;
pollfd[0].fd = uffd;
break;
@@ -572,6 +574,17 @@ static int userfaultfd_open(int features)
return 0;
}
+sigjmp_buf jbuf, *sigbuf;
+
+static void sighndl(int sig, siginfo_t *siginfo, void *ptr)
+{
+ if (sig == SIGBUS) {
+ if (sigbuf)
+ siglongjmp(*sigbuf, 1);
+ abort();
+ }
+}
+
/*
* For non-cooperative userfaultfd test we fork() a process that will
* generate pagefaults, will mremap the area monitored by the
@@ -585,19 +598,59 @@ static int userfaultfd_open(int features)
* The release of the pages currently generates event for shmem and
* anonymous memory (UFFD_EVENT_REMOVE), hence it is not checked
* for hugetlb.
+ * For signal test(UFFD_FEATURE_SIGBUS), signal_test = 1, we register
+ * monitored area, generate pagefaults and test that signal is delivered.
+ * Use UFFDIO_COPY to allocate missing page and retry. For signal_test = 2
+ * test robustness use case - we release monitored area, fork a process
+ * that will generate pagefaults and verify signal is generated.
+ * This also tests UFFD_FEATURE_EVENT_FORK event along with the signal
+ * feature. Using monitor thread, verify no userfault events are generated.
*/
-static int faulting_process(void)
+static int faulting_process(int signal_test)
{
unsigned long nr;
unsigned long long count;
unsigned long split_nr_pages;
+ unsigned long lastnr;
+ struct sigaction act;
+ unsigned long signalled = 0;
if (test_type != TEST_HUGETLB)
split_nr_pages = (nr_pages + 1) / 2;
else
split_nr_pages = nr_pages;
+ if (signal_test) {
+ sigbuf = &jbuf;
+ memset(&act, 0, sizeof(act));
+ act.sa_sigaction = sighndl;
+ act.sa_flags = SA_SIGINFO;
+ if (sigaction(SIGBUS, &act, 0)) {
+ perror("sigaction");
+ return 1;
+ }
+ lastnr = (unsigned long)-1;
+ }
+
for (nr = 0; nr < split_nr_pages; nr++) {
+ if (signal_test) {
+ if (sigsetjmp(*sigbuf, 1) != 0) {
+ if (nr == lastnr) {
+ fprintf(stderr, "Signal repeated\n");
+ return 1;
+ }
+
+ lastnr = nr;
+ if (signal_test == 1) {
+ if (copy_page(uffd, nr * page_size))
+ signalled++;
+ } else {
+ signalled++;
+ continue;
+ }
+ }
+ }
+
count = *area_count(area_dst, nr);
if (count != count_verify[nr]) {
fprintf(stderr,
@@ -607,6 +660,9 @@ static int faulting_process(void)
}
}
+ if (signal_test)
+ return signalled != split_nr_pages;
+
if (test_type == TEST_HUGETLB)
return 0;
@@ -761,7 +817,7 @@ static int userfaultfd_events_test(void)
perror("fork"), exit(1);
if (!pid)
- return faulting_process();
+ return faulting_process(0);
waitpid(pid, &err, 0);
if (err)
@@ -778,6 +834,70 @@ static int userfaultfd_events_test(void)
return userfaults != nr_pages;
}
+static int userfaultfd_sig_test(void)
+{
+ struct uffdio_register uffdio_register;
+ unsigned long expected_ioctls;
+ unsigned long userfaults;
+ pthread_t uffd_mon;
+ int err, features;
+ pid_t pid;
+ char c;
+
+ printf("testing signal delivery: ");
+ fflush(stdout);
+
+ if (uffd_test_ops->release_pages(area_dst))
+ return 1;
+
+ features = UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_SIGBUS;
+ if (userfaultfd_open(features) < 0)
+ return 1;
+ fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
+
+ uffdio_register.range.start = (unsigned long) area_dst;
+ uffdio_register.range.len = nr_pages * page_size;
+ uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+ if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
+ fprintf(stderr, "register failure\n"), exit(1);
+
+ expected_ioctls = uffd_test_ops->expected_ioctls;
+ if ((uffdio_register.ioctls & expected_ioctls) !=
+ expected_ioctls)
+ fprintf(stderr,
+ "unexpected missing ioctl for anon memory\n"),
+ exit(1);
+
+ if (faulting_process(1))
+ fprintf(stderr, "faulting process failed\n"), exit(1);
+
+ if (uffd_test_ops->release_pages(area_dst))
+ return 1;
+
+ if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
+ perror("uffd_poll_thread create"), exit(1);
+
+ pid = fork();
+ if (pid < 0)
+ perror("fork"), exit(1);
+
+ if (!pid)
+ exit(faulting_process(2));
+
+ waitpid(pid, &err, 0);
+ if (err)
+ fprintf(stderr, "faulting process failed\n"), exit(1);
+
+ if (write(pipefd[1], &c, sizeof(c)) != sizeof(c))
+ perror("pipe write"), exit(1);
+ if (pthread_join(uffd_mon, (void **)&userfaults))
+ return 1;
+
+ printf("done.\n");
+ printf(" Signal test userfaults: %ld\n", userfaults);
+ close(uffd);
+ return userfaults != 0;
+}
static int userfaultfd_stress(void)
{
void *area;
@@ -946,7 +1066,8 @@ static int userfaultfd_stress(void)
return err;
close(uffd);
- return userfaultfd_zeropage_test() || userfaultfd_events_test();
+ return userfaultfd_zeropage_test() || userfaultfd_sig_test()
+ || userfaultfd_events_test();
}
/*
--
1.7.1
On Mon, Jul 31, 2017 at 09:54:06PM -0400, Prakash Sangappa wrote:
> This patch adds tests for UFFD_FEATURE_SIGBUS feature. The
> tests will verify signal delivery instead of userfault events.
> Also, test use of UFFDIO_COPY to allocate memory and retry
> accessing monitored area after signal delivery.
>
> This patch also fixes a bug in uffd_poll_thread() where 'uffd'
> is leaked.
>
> Signed-off-by: Prakash Sangappa <[email protected]>
> ---
Reviewed-by: Mike Rapoport <[email protected]>
--
Sincerely yours,
Mike.