2015-02-06 05:22:24

by Rusty Russell

Subject: [PATCH 00/29] lguest virtio PCI 1.0 adaptation.

Hi all!

I've just spent a week changing lguest from its own custom
virtio bus to PCI, and from legacy virtio to modern (1.0). I did this
mainly to test the Linux implementation.

The first 11 patches set up the framework for routing more traps to
the Lguest launcher, so we can intercept ioport and MMIO instructions.

Patches 12-15 implement emulation of virtio over PCI devices, and then
16-19 convert the four devices we support. 20-23 remove the obsolete
lguest bus support.

Patches 24-27 change the guest to bang the emerg_wr register (via the
PCI config space MMIO accessor window) to emit early boot messages.
It's slow and ugly, but it works.

Finally, the last two patches remove the last of the old hypercall
notification mechanism.

Cheers!
Rusty.

Rusty Russell (29):
  lguest: have --rng read from /dev/urandom not /dev/random.
  lguest: add operations to get/set a register from the Launcher.
  lguest: write more information to userspace about pending traps.
  lguest: add infrastructure for userspace to deliver a trap to the
    guest.
  lguest: add infrastructure to check mappings.
  lguest: send trap 13 through to userspace.
  lguest: suppress PS/2 keyboard polling.
  lguest: don't disable iospace.
  lguest: add iomem region, where guest page faults get sent to
    userspace.
  lguest: disable ACPI explicitly.
  lguest: Override pcibios_enable_irq/pcibios_disable_irq to our stupid
    PIC
  lguest: add PCI emulation.
  lguest: implement virtio-PCI MMIO accesses.
  lguest: fix failure to find linux/virtio_types.h
  lguest: add a dummy PCI host bridge.
  lguest: Convert block device to virtio 1.0 PCI.
  lguest: Convert net device to virtio 1.0 PCI.
  lguest: Convert entropy device to virtio 1.0 PCI.
  lguest: Convert console device to virtio 1.0 PCI.
  lguest: define VIRTIO_CONFIG_NO_LEGACY in example launcher.
  lguest: remove support for lguest bus.
  lguest: remove support for lguest bus in demonstration launcher.
  lguest: remove lguest bus definitions from header.
  lguest: support emerg_wr in console device in example launcher.
  lguest: support backdoor window.
  lguest: always put console in PCI slot #1.
  lguest: use the PCI console device's emerg_wr for early boot messages.
  lguest: remove NOTIFY facility from demonstration launcher.
  lguest: remove NOTIFY call and eventfd facility.

arch/x86/include/asm/lguest_hcall.h | 1 -
arch/x86/lguest/boot.c | 182 ++++-
drivers/lguest/Makefile | 3 -
drivers/lguest/core.c | 29 +-
drivers/lguest/hypercalls.c | 7 +-
drivers/lguest/lg.h | 26 +-
drivers/lguest/lguest_device.c | 540 -------------
drivers/lguest/lguest_user.c | 221 ++----
drivers/lguest/page_tables.c | 75 +-
drivers/lguest/x86/core.c | 198 ++---
include/linux/lguest_launcher.h | 61 +-
tools/lguest/Makefile | 8 +-
tools/lguest/lguest.c | 1443 +++++++++++++++++++++++++++--------
13 files changed, 1572 insertions(+), 1222 deletions(-)
delete mode 100644 drivers/lguest/lguest_device.c

--
2.1.0


2015-02-06 05:24:33

by Rusty Russell

Subject: [PATCH 01/29] lguest: have --rng read from /dev/urandom not /dev/random.

Theoretical debates aside, now it boots.

Signed-off-by: Rusty Russell <[email protected]>
---
tools/lguest/lguest.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/lguest/lguest.c b/tools/lguest/lguest.c
index 32cf2ce15d69..3f7f2326cd9a 100644
--- a/tools/lguest/lguest.c
+++ b/tools/lguest/lguest.c
@@ -1733,9 +1733,9 @@ static void setup_block_file(const char *filename)
}

/*L:211
- * Our random number generator device reads from /dev/random into the Guest's
+ * Our random number generator device reads from /dev/urandom into the Guest's
* input buffers. The usual case is that the Guest doesn't want random numbers
- * and so has no buffers although /dev/random is still readable, whereas
+ * and so has no buffers although /dev/urandom is still readable, whereas
* console is the reverse.
*
* The same logic applies, however.
@@ -1763,7 +1763,7 @@ static void rng_input(struct virtqueue *vq)
while (!iov_empty(iov, in_num)) {
len = readv(rng_info->rfd, iov, in_num);
if (len <= 0)
- err(1, "Read from /dev/random gave %i", len);
+ err(1, "Read from /dev/urandom gave %i", len);
iov_consume(iov, in_num, NULL, len);
totlen += len;
}
@@ -1780,8 +1780,8 @@ static void setup_rng(void)
struct device *dev;
struct rng_info *rng_info = malloc(sizeof(*rng_info));

- /* Our device's privat info simply contains the /dev/random fd. */
- rng_info->rfd = open_or_die("/dev/random", O_RDONLY);
+ /* Our device's private info simply contains the /dev/urandom fd. */
+ rng_info->rfd = open_or_die("/dev/urandom", O_RDONLY);

/* Create the new device. */
dev = new_device("rng", VIRTIO_ID_RNG);
--
2.1.0

2015-02-06 05:23:59

by Rusty Russell

Subject: [PATCH 02/29] lguest: add operations to get/set a register from the Launcher.

We use the ptrace API struct (struct pt_regs) to name registers, and we
currently don't let the Launcher set anything but the normal registers
(we'd have to filter the others).

Signed-off-by: Rusty Russell <[email protected]>
---
drivers/lguest/core.c | 8 +++++++
drivers/lguest/lg.h | 3 +++
drivers/lguest/lguest_user.c | 49 +++++++++++++++++++++++++++++++++++++++++
drivers/lguest/x86/core.c | 46 ++++++++++++++++++++++++++++++++++++++
include/linux/lguest_launcher.h | 2 ++
5 files changed, 108 insertions(+)

diff --git a/drivers/lguest/core.c b/drivers/lguest/core.c
index 6590558d1d31..cdb2f9aa5860 100644
--- a/drivers/lguest/core.c
+++ b/drivers/lguest/core.c
@@ -208,6 +208,14 @@ void __lgwrite(struct lg_cpu *cpu, unsigned long addr, const void *b,
*/
int run_guest(struct lg_cpu *cpu, unsigned long __user *user)
{
+ /* If the launcher asked for a register with LHREQ_GETREG */
+ if (cpu->reg_read) {
+ if (put_user(*cpu->reg_read, user))
+ return -EFAULT;
+ cpu->reg_read = NULL;
+ return sizeof(*cpu->reg_read);
+ }
+
/* We stop running once the Guest is dead. */
while (!cpu->lg->dead) {
unsigned int irq;
diff --git a/drivers/lguest/lg.h b/drivers/lguest/lg.h
index 2eef40be4c04..1c98bf74fd68 100644
--- a/drivers/lguest/lg.h
+++ b/drivers/lguest/lg.h
@@ -52,6 +52,8 @@ struct lg_cpu {

unsigned long pending_notify; /* pfn from LHCALL_NOTIFY */

+ unsigned long *reg_read; /* register from LHREQ_GETREG */
+
/* At end of a page shared mapped over lguest_pages in guest. */
unsigned long regs_page;
struct lguest_regs *regs;
@@ -210,6 +212,7 @@ void lguest_arch_handle_trap(struct lg_cpu *cpu);
int lguest_arch_init_hypercalls(struct lg_cpu *cpu);
int lguest_arch_do_hcall(struct lg_cpu *cpu, struct hcall_args *args);
void lguest_arch_setup_regs(struct lg_cpu *cpu, unsigned long start);
+unsigned long *lguest_arch_regptr(struct lg_cpu *cpu, size_t reg_off, bool any);

/* <arch>/switcher.S: */
extern char start_switcher_text[], end_switcher_text[], switch_to_guest[];
diff --git a/drivers/lguest/lguest_user.c b/drivers/lguest/lguest_user.c
index 4263f4cc8c55..7f14c152dd23 100644
--- a/drivers/lguest/lguest_user.c
+++ b/drivers/lguest/lguest_user.c
@@ -173,6 +173,51 @@ static int attach_eventfd(struct lguest *lg, const unsigned long __user *input)
return err;
}

+/* The Launcher can get the registers, and also set some of them. */
+static int getreg_setup(struct lg_cpu *cpu, const unsigned long __user *input)
+{
+ unsigned long which;
+
+ /* We re-use the ptrace structure to specify which register to read. */
+ if (get_user(which, input) != 0)
+ return -EFAULT;
+
+ /*
+ * We set up the cpu register pointer, and their next read will
+ * actually get the value (instead of running the guest).
+ *
+ * The last argument 'true' says we can access any register.
+ */
+ cpu->reg_read = lguest_arch_regptr(cpu, which, true);
+ if (!cpu->reg_read)
+ return -ENOENT;
+
+ /* And because this is a write() call, we return the length used. */
+ return sizeof(unsigned long) * 2;
+}
+
+static int setreg(struct lg_cpu *cpu, const unsigned long __user *input)
+{
+ unsigned long which, value, *reg;
+
+ /* We re-use the ptrace structure to specify which register to set. */
+ if (get_user(which, input) != 0)
+ return -EFAULT;
+ input++;
+ if (get_user(value, input) != 0)
+ return -EFAULT;
+
+ /* The last argument 'false' means we can't access all registers. */
+ reg = lguest_arch_regptr(cpu, which, false);
+ if (!reg)
+ return -ENOENT;
+
+ *reg = value;
+
+ /* And because this is a write() call, we return the length used. */
+ return sizeof(unsigned long) * 3;
+}
+
/*L:050
* Sending an interrupt is done by writing LHREQ_IRQ and an interrupt
* number to /dev/lguest.
@@ -434,6 +479,10 @@ static ssize_t write(struct file *file, const char __user *in,
return user_send_irq(cpu, input);
case LHREQ_EVENTFD:
return attach_eventfd(lg, input);
+ case LHREQ_GETREG:
+ return getreg_setup(cpu, input);
+ case LHREQ_SETREG:
+ return setreg(cpu, input);
default:
return -EINVAL;
}
diff --git a/drivers/lguest/x86/core.c b/drivers/lguest/x86/core.c
index 922a1acbf652..f7a16b4ea456 100644
--- a/drivers/lguest/x86/core.c
+++ b/drivers/lguest/x86/core.c
@@ -181,6 +181,52 @@ static void run_guest_once(struct lg_cpu *cpu, struct lguest_pages *pages)
}
/*:*/

+unsigned long *lguest_arch_regptr(struct lg_cpu *cpu, size_t reg_off, bool any)
+{
+ switch (reg_off) {
+ case offsetof(struct pt_regs, bx):
+ return &cpu->regs->ebx;
+ case offsetof(struct pt_regs, cx):
+ return &cpu->regs->ecx;
+ case offsetof(struct pt_regs, dx):
+ return &cpu->regs->edx;
+ case offsetof(struct pt_regs, si):
+ return &cpu->regs->esi;
+ case offsetof(struct pt_regs, di):
+ return &cpu->regs->edi;
+ case offsetof(struct pt_regs, bp):
+ return &cpu->regs->ebp;
+ case offsetof(struct pt_regs, ax):
+ return &cpu->regs->eax;
+ case offsetof(struct pt_regs, ip):
+ return &cpu->regs->eip;
+ case offsetof(struct pt_regs, sp):
+ return &cpu->regs->esp;
+ }
+
+ /* Launcher can read these, but we don't allow any setting. */
+ if (any) {
+ switch (reg_off) {
+ case offsetof(struct pt_regs, ds):
+ return &cpu->regs->ds;
+ case offsetof(struct pt_regs, es):
+ return &cpu->regs->es;
+ case offsetof(struct pt_regs, fs):
+ return &cpu->regs->fs;
+ case offsetof(struct pt_regs, gs):
+ return &cpu->regs->gs;
+ case offsetof(struct pt_regs, cs):
+ return &cpu->regs->cs;
+ case offsetof(struct pt_regs, flags):
+ return &cpu->regs->eflags;
+ case offsetof(struct pt_regs, ss):
+ return &cpu->regs->ss;
+ }
+ }
+
+ return NULL;
+}
+
/*M:002
* There are hooks in the scheduler which we can register to tell when we
* get kicked off the CPU (preempt_notifier_register()). This would allow us
diff --git a/include/linux/lguest_launcher.h b/include/linux/lguest_launcher.h
index 495203ff221c..f27cae27b0c1 100644
--- a/include/linux/lguest_launcher.h
+++ b/include/linux/lguest_launcher.h
@@ -63,6 +63,8 @@ enum lguest_req
LHREQ_IRQ, /* + irq */
LHREQ_BREAK, /* No longer used */
LHREQ_EVENTFD, /* + address, fd. */
+ LHREQ_GETREG, /* + offset within struct pt_regs (then read value). */
+ LHREQ_SETREG, /* + offset within struct pt_regs, value. */
};

/*
--
2.1.0
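
The filtering rule in lguest_arch_regptr() above (general registers, eip
and esp are always reachable; segment registers and flags only on the
read path) can be sketched in plain userspace C. This is only an
illustration under stated assumptions: struct demo_regs and the demo_*
helpers are hypothetical names, and the layout is not the kernel's
struct pt_regs or struct lguest_regs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the guest register block (not the kernel layout). */
struct demo_regs {
    uint32_t bx, cx, dx, si, di, bp, ax;
    uint32_t ds, es, fs, gs;
    uint32_t ip, cs, flags, sp, ss;
};

/*
 * Map a byte offset within struct demo_regs to a pointer to that field.
 * "Normal" registers are always reachable; segment registers and flags
 * are only reachable when 'any' is true, which the read path passes.
 */
static uint32_t *demo_regptr(struct demo_regs *regs, size_t off, bool any)
{
    switch (off) {
    case offsetof(struct demo_regs, bx): return &regs->bx;
    case offsetof(struct demo_regs, cx): return &regs->cx;
    case offsetof(struct demo_regs, dx): return &regs->dx;
    case offsetof(struct demo_regs, si): return &regs->si;
    case offsetof(struct demo_regs, di): return &regs->di;
    case offsetof(struct demo_regs, bp): return &regs->bp;
    case offsetof(struct demo_regs, ax): return &regs->ax;
    case offsetof(struct demo_regs, ip): return &regs->ip;
    case offsetof(struct demo_regs, sp): return &regs->sp;
    }

    if (any) {
        switch (off) {
        case offsetof(struct demo_regs, ds): return &regs->ds;
        case offsetof(struct demo_regs, es): return &regs->es;
        case offsetof(struct demo_regs, fs): return &regs->fs;
        case offsetof(struct demo_regs, gs): return &regs->gs;
        case offsetof(struct demo_regs, cs): return &regs->cs;
        case offsetof(struct demo_regs, flags): return &regs->flags;
        case offsetof(struct demo_regs, ss): return &regs->ss;
        }
    }

    /* Not a register boundary, or a filtered register. */
    return NULL;
}

/* Set a register by offset; fails for filtered or unknown registers. */
static bool demo_setreg(struct demo_regs *regs, size_t off, uint32_t val)
{
    uint32_t *reg = demo_regptr(regs, off, false);

    if (!reg)
        return false;
    *reg = val;
    return true;
}

/* Read a register by offset; any known register may be read. */
static bool demo_getreg(struct demo_regs *regs, size_t off, uint32_t *val)
{
    uint32_t *reg = demo_regptr(regs, off, true);

    if (!reg)
        return false;
    *val = *reg;
    return true;
}
```

With this model, demo_setreg() on the ax offset succeeds while the same
call on the cs offset is refused, which mirrors the -ENOENT that
setreg() returns in the patch.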

2015-02-06 05:23:58

by Rusty Russell

Subject: [PATCH 03/29] lguest: write more information to userspace about pending traps.

This is preparation for userspace handling MMIO and ioport accesses.

Signed-off-by: Rusty Russell <[email protected]>
---
drivers/lguest/core.c | 7 ++++---
drivers/lguest/hypercalls.c | 7 ++++---
drivers/lguest/lg.h | 3 ++-
drivers/lguest/lguest_user.c | 14 +++++++++-----
include/linux/lguest_launcher.h | 13 +++++++++++++
tools/lguest/lguest.c | 16 ++++++++++------
6 files changed, 42 insertions(+), 18 deletions(-)

diff --git a/drivers/lguest/core.c b/drivers/lguest/core.c
index cdb2f9aa5860..9159dbc583f6 100644
--- a/drivers/lguest/core.c
+++ b/drivers/lguest/core.c
@@ -229,16 +229,17 @@ int run_guest(struct lg_cpu *cpu, unsigned long __user *user)
* It's possible the Guest did a NOTIFY hypercall to the
* Launcher.
*/
- if (cpu->pending_notify) {
+ if (cpu->pending.trap) {
/*
* Does it just needs to write to a registered
* eventfd (ie. the appropriate virtqueue thread)?
*/
if (!send_notify_to_eventfd(cpu)) {
/* OK, we tell the main Launcher. */
- if (put_user(cpu->pending_notify, user))
+ if (copy_to_user(user, &cpu->pending,
+ sizeof(cpu->pending)))
return -EFAULT;
- return sizeof(cpu->pending_notify);
+ return sizeof(cpu->pending);
}
}

diff --git a/drivers/lguest/hypercalls.c b/drivers/lguest/hypercalls.c
index 83511eb0923d..5dd1fb8a6610 100644
--- a/drivers/lguest/hypercalls.c
+++ b/drivers/lguest/hypercalls.c
@@ -118,7 +118,8 @@ static void do_hcall(struct lg_cpu *cpu, struct hcall_args *args)
cpu->halted = 1;
break;
case LHCALL_NOTIFY:
- cpu->pending_notify = args->arg1;
+ cpu->pending.trap = LGUEST_TRAP_ENTRY;
+ cpu->pending.addr = args->arg1;
break;
default:
/* It should be an architecture-specific hypercall. */
@@ -189,7 +190,7 @@ static void do_async_hcalls(struct lg_cpu *cpu)
* Stop doing hypercalls if they want to notify the Launcher:
* it needs to service this first.
*/
- if (cpu->pending_notify)
+ if (cpu->pending.trap)
break;
}
}
@@ -280,7 +281,7 @@ void do_hypercalls(struct lg_cpu *cpu)
* NOTIFY to the Launcher, we want to return now. Otherwise we do
* the hypercall.
*/
- if (!cpu->pending_notify) {
+ if (!cpu->pending.trap) {
do_hcall(cpu, cpu->hcall);
/*
* Tricky point: we reset the hcall pointer to mark the
diff --git a/drivers/lguest/lg.h b/drivers/lguest/lg.h
index 1c98bf74fd68..020fec5bb072 100644
--- a/drivers/lguest/lg.h
+++ b/drivers/lguest/lg.h
@@ -50,7 +50,8 @@ struct lg_cpu {
/* Bitmap of what has changed: see CHANGED_* above. */
int changed;

- unsigned long pending_notify; /* pfn from LHCALL_NOTIFY */
+ /* Pending operation. */
+ struct lguest_pending pending;

unsigned long *reg_read; /* register from LHREQ_GETREG */

diff --git a/drivers/lguest/lguest_user.c b/drivers/lguest/lguest_user.c
index 7f14c152dd23..dcf9efd94cf4 100644
--- a/drivers/lguest/lguest_user.c
+++ b/drivers/lguest/lguest_user.c
@@ -29,6 +29,10 @@ bool send_notify_to_eventfd(struct lg_cpu *cpu)
unsigned int i;
struct lg_eventfd_map *map;

+ /* We only connect LHCALL_NOTIFY to event fds, not other traps. */
+ if (cpu->pending.trap != LGUEST_TRAP_ENTRY)
+ return false;
+
/*
* This "rcu_read_lock()" helps track when someone is still looking at
* the (RCU-using) eventfds array. It's not actually a lock at all;
@@ -52,9 +56,9 @@ bool send_notify_to_eventfd(struct lg_cpu *cpu)
* we'll continue to use the old array and just won't see the new one.
*/
for (i = 0; i < map->num; i++) {
- if (map->map[i].addr == cpu->pending_notify) {
+ if (map->map[i].addr == cpu->pending.addr) {
eventfd_signal(map->map[i].event, 1);
- cpu->pending_notify = 0;
+ cpu->pending.trap = 0;
break;
}
}
@@ -62,7 +66,7 @@ bool send_notify_to_eventfd(struct lg_cpu *cpu)
rcu_read_unlock();

/* If we cleared the notification, it's because we found a match. */
- return cpu->pending_notify == 0;
+ return cpu->pending.trap == 0;
}

/*L:055
@@ -282,8 +286,8 @@ static ssize_t read(struct file *file, char __user *user, size_t size,loff_t*o)
* If we returned from read() last time because the Guest sent I/O,
* clear the flag.
*/
- if (cpu->pending_notify)
- cpu->pending_notify = 0;
+ if (cpu->pending.trap)
+ cpu->pending.trap = 0;

/* Run the Guest until something interesting happens. */
return run_guest(cpu, (unsigned long __user *)user);
diff --git a/include/linux/lguest_launcher.h b/include/linux/lguest_launcher.h
index f27cae27b0c1..c4451ebece47 100644
--- a/include/linux/lguest_launcher.h
+++ b/include/linux/lguest_launcher.h
@@ -68,6 +68,19 @@ enum lguest_req
};

/*
+ * This is what read() of the lguest fd populates. trap ==
+ * LGUEST_TRAP_ENTRY for an LHCALL_NOTIFY (addr is the
+ * argument), 14 for a page fault in the MMIO region (addr is
+ * the trap address, insn is the instruction), or 13 for a GPF
+ * (insn is the instruction).
+ */
+struct lguest_pending {
+ __u8 trap;
+ __u8 insn[7];
+ __u32 addr;
+};
+
+/*
* The alignment to use between consumer and producer parts of vring.
* x86 pagesize for historical reasons.
*/
diff --git a/tools/lguest/lguest.c b/tools/lguest/lguest.c
index 3f7f2326cd9a..0e754d04876d 100644
--- a/tools/lguest/lguest.c
+++ b/tools/lguest/lguest.c
@@ -1820,17 +1820,21 @@ static void __attribute__((noreturn)) restart_guest(void)
static void __attribute__((noreturn)) run_guest(void)
{
for (;;) {
- unsigned long notify_addr;
+ struct lguest_pending notify;
int readval;

/* We read from the /dev/lguest device to run the Guest. */
- readval = pread(lguest_fd, &notify_addr,
- sizeof(notify_addr), cpu_id);
+ readval = pread(lguest_fd, &notify, sizeof(notify), cpu_id);

/* One unsigned long means the Guest did HCALL_NOTIFY */
- if (readval == sizeof(notify_addr)) {
- verbose("Notify on address %#lx\n", notify_addr);
- handle_output(notify_addr);
+ if (readval == sizeof(notify)) {
+ if (notify.trap == 0x1F) {
+ verbose("Notify on address %#08x\n",
+ notify.addr);
+ handle_output(notify.addr);
+ } else
+ errx(1, "Unknown trap %i addr %#08x\n",
+ notify.trap, notify.addr);
/* ENOENT means the Guest died. Reading tells us why. */
} else if (errno == ENOENT) {
char reason[1024] = { 0 };
--
2.1.0
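
As a rough illustration of what read() on the lguest fd now hands back,
the sketch below mirrors struct lguest_pending with stdint types and
classifies the trap field the way the header comment in this patch
describes it. The demo_* names and the enum are hypothetical; the value
0x1F is the one the example launcher compares against above, and at
this point in the series only the notify case is actually handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of struct lguest_pending from the patch (stdint types). */
struct demo_pending {
    uint8_t trap;      /* trap number, or the NOTIFY marker */
    uint8_t insn[7];   /* instruction bytes, for traps 13 and 14 */
    uint32_t addr;     /* NOTIFY argument or faulting address */
};

/* Value the example launcher tests for an LHCALL_NOTIFY. */
#define DEMO_TRAP_ENTRY 0x1F

/* Hypothetical labels for what the Launcher should do next. */
enum demo_action {
    DEMO_NOTIFY,        /* LHCALL_NOTIFY: addr is the argument */
    DEMO_EMULATE_INSN,  /* trap 13 (GPF): insn holds the instruction */
    DEMO_MMIO,          /* trap 14: page fault in the MMIO region */
    DEMO_UNKNOWN
};

/* Decide how a pending record should be handled. */
static enum demo_action demo_classify(const struct demo_pending *p)
{
    switch (p->trap) {
    case DEMO_TRAP_ENTRY:
        return DEMO_NOTIFY;
    case 13:
        return DEMO_EMULATE_INSN;
    case 14:
        return DEMO_MMIO;
    default:
        return DEMO_UNKNOWN;
    }
}
```

One byte of trap plus seven instruction bytes puts addr at offset 8
with no padding, so the record is 12 bytes on the usual ABIs and the
launcher's "readval == sizeof(notify)" test checks for exactly that.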

2015-02-06 05:22:23

by Rusty Russell

Subject: [PATCH 04/29] lguest: add infrastructure for userspace to deliver a trap to the guest.

This is required for instruction emulation to move to userspace.

Signed-off-by: Rusty Russell <[email protected]>
---
drivers/lguest/lguest_user.c | 19 +++++++++++++++++++
include/linux/lguest_launcher.h | 1 +
2 files changed, 20 insertions(+)

diff --git a/drivers/lguest/lguest_user.c b/drivers/lguest/lguest_user.c
index dcf9efd94cf4..be996d173615 100644
--- a/drivers/lguest/lguest_user.c
+++ b/drivers/lguest/lguest_user.c
@@ -243,6 +243,23 @@ static int user_send_irq(struct lg_cpu *cpu, const unsigned long __user *input)
return 0;
}

+/*L:053
+ * Deliver a trap: this is used by the Launcher if it can't emulate
+ * an instruction.
+ */
+static int trap(struct lg_cpu *cpu, const unsigned long __user *input)
+{
+ unsigned long trapnum;
+
+ if (get_user(trapnum, input) != 0)
+ return -EFAULT;
+
+ if (!deliver_trap(cpu, trapnum))
+ return -EINVAL;
+
+ return 0;
+}
+
/*L:040
* Once our Guest is initialized, the Launcher makes it run by reading
* from /dev/lguest.
@@ -487,6 +504,8 @@ static ssize_t write(struct file *file, const char __user *in,
return getreg_setup(cpu, input);
case LHREQ_SETREG:
return setreg(cpu, input);
+ case LHREQ_TRAP:
+ return trap(cpu, input);
default:
return -EINVAL;
}
diff --git a/include/linux/lguest_launcher.h b/include/linux/lguest_launcher.h
index c4451ebece47..3c402b843e03 100644
--- a/include/linux/lguest_launcher.h
+++ b/include/linux/lguest_launcher.h
@@ -65,6 +65,7 @@ enum lguest_req
LHREQ_EVENTFD, /* + address, fd. */
LHREQ_GETREG, /* + offset within struct pt_regs (then read value). */
LHREQ_SETREG, /* + offset within struct pt_regs, value. */
+ LHREQ_TRAP, /* + trap number to deliver to guest. */
};

/*
--
2.1.0

2015-02-06 05:23:35

by Rusty Russell

Subject: [PATCH 05/29] lguest: add infrastructure to check mappings.

We normally abort the guest unconditionally when it gives us a bad address,
but in the next patch we want to copy some bytes which may not be mapped.

Signed-off-by: Rusty Russell <[email protected]>
---
drivers/lguest/lg.h | 1 +
drivers/lguest/page_tables.c | 42 +++++++++++++++++++++++++++++-------------
2 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/drivers/lguest/lg.h b/drivers/lguest/lg.h
index 020fec5bb072..9da4f351e077 100644
--- a/drivers/lguest/lg.h
+++ b/drivers/lguest/lg.h
@@ -202,6 +202,7 @@ void guest_set_pte(struct lg_cpu *cpu, unsigned long gpgdir,
void map_switcher_in_guest(struct lg_cpu *cpu, struct lguest_pages *pages);
bool demand_page(struct lg_cpu *cpu, unsigned long cr2, int errcode);
void pin_page(struct lg_cpu *cpu, unsigned long vaddr);
+bool __guest_pa(struct lg_cpu *cpu, unsigned long vaddr, unsigned long *paddr);
unsigned long guest_pa(struct lg_cpu *cpu, unsigned long vaddr);
void page_table_guest_data_init(struct lg_cpu *cpu);

diff --git a/drivers/lguest/page_tables.c b/drivers/lguest/page_tables.c
index e8b55c3a6170..69c35caa955a 100644
--- a/drivers/lguest/page_tables.c
+++ b/drivers/lguest/page_tables.c
@@ -647,7 +647,7 @@ void guest_pagetable_flush_user(struct lg_cpu *cpu)
/*:*/

/* We walk down the guest page tables to get a guest-physical address */
-unsigned long guest_pa(struct lg_cpu *cpu, unsigned long vaddr)
+bool __guest_pa(struct lg_cpu *cpu, unsigned long vaddr, unsigned long *paddr)
{
pgd_t gpgd;
pte_t gpte;
@@ -656,31 +656,47 @@ unsigned long guest_pa(struct lg_cpu *cpu, unsigned long vaddr)
#endif

/* Still not set up? Just map 1:1. */
- if (unlikely(cpu->linear_pages))
- return vaddr;
+ if (unlikely(cpu->linear_pages)) {
+ *paddr = vaddr;
+ return true;
+ }

/* First step: get the top-level Guest page table entry. */
gpgd = lgread(cpu, gpgd_addr(cpu, vaddr), pgd_t);
/* Toplevel not present? We can't map it in. */
- if (!(pgd_flags(gpgd) & _PAGE_PRESENT)) {
- kill_guest(cpu, "Bad address %#lx", vaddr);
- return -1UL;
- }
+ if (!(pgd_flags(gpgd) & _PAGE_PRESENT))
+ goto fail;

#ifdef CONFIG_X86_PAE
gpmd = lgread(cpu, gpmd_addr(gpgd, vaddr), pmd_t);
- if (!(pmd_flags(gpmd) & _PAGE_PRESENT)) {
- kill_guest(cpu, "Bad address %#lx", vaddr);
- return -1UL;
- }
+ if (!(pmd_flags(gpmd) & _PAGE_PRESENT))
+ goto fail;
gpte = lgread(cpu, gpte_addr(cpu, gpmd, vaddr), pte_t);
#else
gpte = lgread(cpu, gpte_addr(cpu, gpgd, vaddr), pte_t);
#endif
if (!(pte_flags(gpte) & _PAGE_PRESENT))
- kill_guest(cpu, "Bad address %#lx", vaddr);
+ goto fail;
+
+ *paddr = pte_pfn(gpte) * PAGE_SIZE | (vaddr & ~PAGE_MASK);
+ return true;
+
+fail:
+ *paddr = -1UL;
+ return false;
+}

- return pte_pfn(gpte) * PAGE_SIZE | (vaddr & ~PAGE_MASK);
+/*
+ * This is the version we normally use: kills the Guest if it uses a
+ * bad address
+ */
+unsigned long guest_pa(struct lg_cpu *cpu, unsigned long vaddr)
+{
+ unsigned long paddr;
+
+ if (!__guest_pa(cpu, vaddr, &paddr))
+ kill_guest(cpu, "Bad address %#lx", vaddr);
+ return paddr;
}

/*
--
2.1.0
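
The split between __guest_pa() (report failure) and guest_pa() (kill
the Guest on failure) can be sketched with a toy single-level page
table. Everything named demo_* here is hypothetical and heavily
simplified: the real code walks two or three levels of Guest page
tables via lgread(), and "dead" stands in for kill_guest().

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))
#define DEMO_NPAGES    4
#define DEMO_PRESENT   0x1UL

/* Toy guest: one flat table, entry = (pfn << 12) | flags. */
struct demo_guest {
    unsigned long pte[DEMO_NPAGES];
    bool dead;
};

/* Checked translation: reports failure instead of killing the guest. */
static bool demo_guest_pa_checked(const struct demo_guest *g,
                                  unsigned long vaddr, unsigned long *paddr)
{
    unsigned long idx = vaddr / DEMO_PAGE_SIZE;

    if (idx >= DEMO_NPAGES || !(g->pte[idx] & DEMO_PRESENT)) {
        *paddr = -1UL;
        return false;
    }

    /* Frame base from the entry, page offset from the virtual address. */
    *paddr = (g->pte[idx] & DEMO_PAGE_MASK) | (vaddr & ~DEMO_PAGE_MASK);
    return true;
}

/* The version normally used: a bad address marks the guest dead. */
static unsigned long demo_guest_pa(struct demo_guest *g, unsigned long vaddr)
{
    unsigned long paddr;

    if (!demo_guest_pa_checked(g, vaddr, &paddr))
        g->dead = true;
    return paddr;
}
```

Callers that merely probe (such as the instruction copy in the next
patch) would use the checked variant; everything else keeps the
original kill-on-error behaviour through the wrapper.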

2015-02-06 05:23:03

by Rusty Russell

Subject: [PATCH 06/29] lguest: send trap 13 through to userspace.

We copy 7 bytes at eip for userspace's instruction decode; we have to
carefully handle the case where eip is at the end of a page. We can't
leave this to userspace since the kernel has all the page table decode
logic.

The decode logic moves to userspace, basically unchanged.

Signed-off-by: Rusty Russell <[email protected]>
---
drivers/lguest/x86/core.c | 133 +++++++++++++----------------------------
tools/lguest/lguest.c | 149 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 192 insertions(+), 90 deletions(-)

diff --git a/drivers/lguest/x86/core.c b/drivers/lguest/x86/core.c
index f7a16b4ea456..42e87bf14113 100644
--- a/drivers/lguest/x86/core.c
+++ b/drivers/lguest/x86/core.c
@@ -314,95 +314,52 @@ void lguest_arch_run_guest(struct lg_cpu *cpu)
* usually attached to a PC.
*
* When the Guest uses one of these instructions, we get a trap (General
- * Protection Fault) and come here. We see if it's one of those troublesome
- * instructions and skip over it. We return true if we did.
+ * Protection Fault) and come here. We queue this to be sent out to the
+ * Launcher to handle.
*/
-static int emulate_insn(struct lg_cpu *cpu)
-{
- u8 insn;
- unsigned int insnlen = 0, in = 0, small_operand = 0;
- /*
- * The eip contains the *virtual* address of the Guest's instruction:
- * walk the Guest's page tables to find the "physical" address.
- */
- unsigned long physaddr = guest_pa(cpu, cpu->regs->eip);
-
- /*
- * This must be the Guest kernel trying to do something, not userspace!
- * The bottom two bits of the CS segment register are the privilege
- * level.
- */
- if ((cpu->regs->cs & 3) != GUEST_PL)
- return 0;

- /* Decoding x86 instructions is icky. */
- insn = lgread(cpu, physaddr, u8);
-
- /*
- * Around 2.6.33, the kernel started using an emulation for the
- * cmpxchg8b instruction in early boot on many configurations. This
- * code isn't paravirtualized, and it tries to disable interrupts.
- * Ignore it, which will Mostly Work.
- */
- if (insn == 0xfa) {
- /* "cli", or Clear Interrupt Enable instruction. Skip it. */
- cpu->regs->eip++;
- return 1;
+/*
+ * The eip contains the *virtual* address of the Guest's instruction:
+ * we copy the instruction here so the Launcher doesn't have to walk
+ * the page tables to decode it. We handle the case (eg. in a kernel
+ * module) where the instruction is over two pages, and the pages are
+ * virtually but not physically contiguous.
+ *
+ * The longest possible x86 instruction is 15 bytes, but we don't handle
+ * anything that strange.
+ */
+static void copy_from_guest(struct lg_cpu *cpu,
+ void *dst, unsigned long vaddr, size_t len)
+{
+ size_t to_page_end = PAGE_SIZE - (vaddr % PAGE_SIZE);
+ unsigned long paddr;
+
+ BUG_ON(len > PAGE_SIZE);
+
+ /* If it goes over a page, copy in two parts. */
+ if (len > to_page_end) {
+ /* But make sure the next page is mapped! */
+ if (__guest_pa(cpu, vaddr + to_page_end, &paddr))
+ copy_from_guest(cpu, dst + to_page_end,
+ vaddr + to_page_end,
+ len - to_page_end);
+ else
+ /* Otherwise fill with zeroes. */
+ memset(dst + to_page_end, 0, len - to_page_end);
+ len = to_page_end;
}

- /*
- * 0x66 is an "operand prefix". It means a 16, not 32 bit in/out.
- */
- if (insn == 0x66) {
- small_operand = 1;
- /* The instruction is 1 byte so far, read the next byte. */
- insnlen = 1;
- insn = lgread(cpu, physaddr + insnlen, u8);
- }
+ /* This will kill the guest if it isn't mapped, but that
+ * shouldn't happen. */
+ __lgread(cpu, dst, guest_pa(cpu, vaddr), len);
+}

- /*
- * We can ignore the lower bit for the moment and decode the 4 opcodes
- * we need to emulate.
- */
- switch (insn & 0xFE) {
- case 0xE4: /* in <next byte>,%al */
- insnlen += 2;
- in = 1;
- break;
- case 0xEC: /* in (%dx),%al */
- insnlen += 1;
- in = 1;
- break;
- case 0xE6: /* out %al,<next byte> */
- insnlen += 2;
- break;
- case 0xEE: /* out %al,(%dx) */
- insnlen += 1;
- break;
- default:
- /* OK, we don't know what this is, can't emulate. */
- return 0;
- }

- /*
- * If it was an "IN" instruction, they expect the result to be read
- * into %eax, so we change %eax. We always return all-ones, which
- * traditionally means "there's nothing there".
- */
- if (in) {
- /* Lower bit tells means it's a 32/16 bit access */
- if (insn & 0x1) {
- if (small_operand)
- cpu->regs->eax |= 0xFFFF;
- else
- cpu->regs->eax = 0xFFFFFFFF;
- } else
- cpu->regs->eax |= 0xFF;
- }
- /* Finally, we've "done" the instruction, so move past it. */
- cpu->regs->eip += insnlen;
- /* Success! */
- return 1;
+static void setup_emulate_insn(struct lg_cpu *cpu)
+{
+ cpu->pending.trap = 13;
+ copy_from_guest(cpu, cpu->pending.insn, cpu->regs->eip,
+ sizeof(cpu->pending.insn));
}

/*H:050 Once we've re-enabled interrupts, we look at why the Guest exited. */
@@ -410,14 +367,10 @@ void lguest_arch_handle_trap(struct lg_cpu *cpu)
{
switch (cpu->regs->trapnum) {
case 13: /* We've intercepted a General Protection Fault. */
- /*
- * Check if this was one of those annoying IN or OUT
- * instructions which we need to emulate. If so, we just go
- * back into the Guest after we've done it.
- */
+ /* Hand to Launcher to emulate those pesky IN and OUT insns */
if (cpu->regs->errcode == 0) {
- if (emulate_insn(cpu))
- return;
+ setup_emulate_insn(cpu);
+ return;
}
break;
case 14: /* We've intercepted a Page Fault. */
diff --git a/tools/lguest/lguest.c b/tools/lguest/lguest.c
index 0e754d04876d..b2217657f62c 100644
--- a/tools/lguest/lguest.c
+++ b/tools/lguest/lguest.c
@@ -41,6 +41,7 @@
#include <signal.h>
#include <pwd.h>
#include <grp.h>
+#include <sys/user.h>

#ifndef VIRTIO_F_ANY_LAYOUT
#define VIRTIO_F_ANY_LAYOUT 27
@@ -1143,6 +1144,150 @@ static void handle_output(unsigned long addr)
strnlen(from_guest_phys(addr), guest_limit - addr));
}

+/*L:216
+ * This is where we emulate a handful of Guest instructions. It's ugly
+ * and we used to do it in the kernel but it grew over time.
+ */
+
+/*
+ * We use the ptrace syscall's pt_regs struct to talk about registers
+ * to lguest: these macros convert the names to the offsets.
+ */
+#define getreg(name) getreg_off(offsetof(struct user_regs_struct, name))
+#define setreg(name, val) \
+ setreg_off(offsetof(struct user_regs_struct, name), (val))
+
+static u32 getreg_off(size_t offset)
+{
+ u32 r;
+ unsigned long args[] = { LHREQ_GETREG, offset };
+
+ if (pwrite(lguest_fd, args, sizeof(args), cpu_id) < 0)
+ err(1, "Getting register %u", offset);
+ if (pread(lguest_fd, &r, sizeof(r), cpu_id) != sizeof(r))
+ err(1, "Reading register %u", offset);
+
+ return r;
+}
+
+static void setreg_off(size_t offset, u32 val)
+{
+ unsigned long args[] = { LHREQ_SETREG, offset, val };
+
+ if (pwrite(lguest_fd, args, sizeof(args), cpu_id) < 0)
+ err(1, "Setting register %u", offset);
+}
+
+static void emulate_insn(const u8 insn[])
+{
+ unsigned long args[] = { LHREQ_TRAP, 13 };
+ unsigned int insnlen = 0, in = 0, small_operand = 0, byte_access;
+ unsigned int eax, port, mask;
+ /*
+ * We always return all-ones on IO port reads, which traditionally
+ * means "there's nothing there".
+ */
+ u32 val = 0xFFFFFFFF;
+
+ /*
+ * This must be the Guest kernel trying to do something, not userspace!
+ * The bottom two bits of the CS segment register are the privilege
+ * level.
+ */
+ if ((getreg(xcs) & 3) != 0x1)
+ goto no_emulate;
+
+ /* Decoding x86 instructions is icky. */
+
+ /*
+ * Around 2.6.33, the kernel started using an emulation for the
+ * cmpxchg8b instruction in early boot on many configurations. This
+ * code isn't paravirtualized, and it tries to disable interrupts.
+ * Ignore it, which will Mostly Work.
+ */
+ if (insn[insnlen] == 0xfa) {
+ /* "cli", or Clear Interrupt Enable instruction. Skip it. */
+ insnlen = 1;
+ goto skip_insn;
+ }
+
+ /*
+ * 0x66 is an "operand prefix". It means a 16, not 32 bit in/out.
+ */
+ if (insn[insnlen] == 0x66) {
+ small_operand = 1;
+ /* The instruction is 1 byte so far, read the next byte. */
+ insnlen = 1;
+ }
+
+ /* If the lower bit isn't set, it's a single byte access */
+ byte_access = !(insn[insnlen] & 1);
+
+ /*
+ * Now we can ignore the lower bit and decode the 4 opcodes
+ * we need to emulate.
+ */
+ switch (insn[insnlen] & 0xFE) {
+ case 0xE4: /* in <next byte>,%al */
+ port = insn[insnlen+1];
+ insnlen += 2;
+ in = 1;
+ break;
+ case 0xEC: /* in (%dx),%al */
+ port = getreg(edx) & 0xFFFF;
+ insnlen += 1;
+ in = 1;
+ break;
+ case 0xE6: /* out %al,<next byte> */
+ port = insn[insnlen+1];
+ insnlen += 2;
+ break;
+ case 0xEE: /* out %al,(%dx) */
+ port = getreg(edx) & 0xFFFF;
+ insnlen += 1;
+ break;
+ default:
+ /* OK, we don't know what this is, can't emulate. */
+ goto no_emulate;
+ }
+
+ /* Set a mask of the 1, 2 or 4 bytes, depending on size of IO */
+ if (byte_access)
+ mask = 0xFF;
+ else if (small_operand)
+ mask = 0xFFFF;
+ else
+ mask = 0xFFFFFFFF;
+
+ /*
+ * If it was an "IN" instruction, they expect the result to be read
+ * into %eax, so we change %eax.
+ */
+ eax = getreg(eax);
+
+ if (in) {
+ /* Clear the bits we're about to read */
+ eax &= ~mask;
+ /* Copy bits in from val. */
+ eax |= val & mask;
+ /* Now update the register. */
+ setreg(eax, eax);
+ }
+
+ verbose("IO %s of %x to %u: %#08x\n",
+ in ? "IN" : "OUT", mask, port, eax);
+skip_insn:
+ /* Finally, we've "done" the instruction, so move past it. */
+ setreg(eip, getreg(eip) + insnlen);
+ return;
+
+no_emulate:
+ /* Inject trap into Guest. */
+ if (write(lguest_fd, args, sizeof(args)) < 0)
+ err(1, "Reinjecting trap 13 for fault at %#x", getreg(eip));
+}
+
+
/*L:190
* Device Setup
*
@@ -1832,6 +1977,10 @@ static void __attribute__((noreturn)) run_guest(void)
verbose("Notify on address %#08x\n",
notify.addr);
handle_output(notify.addr);
+ } else if (notify.trap == 13) {
+ verbose("Emulating instruction at %#x\n",
+ getreg(eip));
+ emulate_insn(notify.insn);
} else
errx(1, "Unknown trap %i addr %#08x\n",
notify.trap, notify.addr);
--
2.1.0
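
The decode step in emulate_insn() above can be tried outside the Launcher. What follows is a minimal sketch, not the patch's code: struct io_insn, decode_io_insn() and merge_in_result() are hypothetical names, and the dx parameter stands in for getreg(edx) & 0xFFFF. It assumes the caller supplies at least three instruction bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; the patch keeps these in local variables. */
struct io_insn {
	bool in;        /* true for IN, false for OUT */
	uint32_t mask;  /* 0xFF, 0xFFFF or 0xFFFFFFFF */
	uint16_t port;  /* I/O port number */
	size_t len;     /* bytes consumed by the instruction */
};

/*
 * Decode one of the four IN/OUT opcodes, optionally preceded by 0x66.
 * 'dx' stands in for getreg(edx) & 0xFFFF in the patch.
 * Returns false if the bytes are not an instruction we recognise.
 */
static bool decode_io_insn(const uint8_t *insn, uint16_t dx,
			   struct io_insn *out)
{
	size_t len = 0;
	bool small_operand = false;
	bool byte_access;

	assert(insn != NULL && out != NULL);

	/* 0x66 operand-size prefix: 16-bit rather than 32-bit access. */
	if (insn[len] == 0x66) {
		small_operand = true;
		len = 1;
	}

	/* Low bit clear means a single-byte access. */
	byte_access = !(insn[len] & 1);

	switch (insn[len] & 0xFE) {
	case 0xE4: /* in <next byte>,%al */
		out->in = true;
		out->port = insn[len + 1];
		len += 2;
		break;
	case 0xEC: /* in (%dx),%al */
		out->in = true;
		out->port = dx;
		len += 1;
		break;
	case 0xE6: /* out %al,<next byte> */
		out->in = false;
		out->port = insn[len + 1];
		len += 2;
		break;
	case 0xEE: /* out %al,(%dx) */
		out->in = false;
		out->port = dx;
		len += 1;
		break;
	default:
		return false;
	}

	if (byte_access)
		out->mask = 0xFF;
	else if (small_operand)
		out->mask = 0xFFFF;
	else
		out->mask = 0xFFFFFFFF;
	out->len = len;
	return true;
}

/* Merge the bits read from the device into the old %eax value. */
static uint32_t merge_in_result(uint32_t eax, uint32_t val, uint32_t mask)
{
	return (eax & ~mask) | (val & mask);
}
```

Keeping the mask separate from the opcode switch mirrors the patch: the low opcode bit and the 0x66 prefix select the access width, while the remaining bits select direction and port source.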

2015-02-06 05:23:00

by Rusty Russell

Subject: [PATCH 07/29] lguest: suppress PS/2 keyboard polling.

While hacking on getting I/O out to the lguest launcher, I noticed
that returning 0xFF for the PS/2 keyboard status made it spin for a
while thinking there was a key pending. Fix this by returning 1
instead of 0xFF.

Signed-off-by: Rusty Russell <[email protected]>
---
tools/lguest/lguest.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/tools/lguest/lguest.c b/tools/lguest/lguest.c
index b2217657f62c..485fe13db12e 100644
--- a/tools/lguest/lguest.c
+++ b/tools/lguest/lguest.c
@@ -1259,6 +1259,10 @@ static void emulate_insn(const u8 insn[])
else
mask = 0xFFFFFFFF;

+ /* This is the PS/2 keyboard status; 1 means ready for output */
+ if (port == 0x64)
+ val = 1;
+
/*
* If it was an "IN" instruction, they expect the result to be read
* into %eax, so we change %eax.
--
2.1.0
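
As a rough illustration of the override in this patch, the choice of value for an emulated IN can be isolated in a helper. This is only a sketch: emulated_in_value() and PS2_STATUS_PORT are hypothetical names, and the all-ones default is an assumption based on the commit message saying 0xFF was returned before.

```c
#include <stdint.h>

#define PS2_STATUS_PORT 0x64	/* port number taken from the patch */

/* Hypothetical helper: the value an emulated IN should see for 'port'. */
static uint32_t emulated_in_value(uint16_t port)
{
	/* Assumption: ports with nothing behind them read as all-ones. */
	uint32_t val = 0xFFFFFFFF;

	/* Patch 07: report 1 for the keyboard status instead of 0xFF. */
	if (port == PS2_STATUS_PORT)
		val = 1;

	return val;
}
```

The caller would still apply the 1, 2 or 4 byte mask before writing the result into %eax, as emulate_insn() does.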

2015-02-06 05:22:25

by Rusty Russell

Subject: [PATCH 08/29] lguest: don't disable iospace.

Disabling iospace no longer speeds up boot (IDE got better, I guess), but
it does stop us probing for a PCI bus.

Signed-off-by: Rusty Russell <[email protected]>
---
arch/x86/lguest/boot.c | 8 --------
1 file changed, 8 deletions(-)

diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index c1c1544b8485..47ec7f201d27 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -1400,14 +1400,6 @@ __init void lguest_init(void)
atomic_notifier_chain_register(&panic_notifier_list, &paniced);

/*
- * The IDE code spends about 3 seconds probing for disks: if we reserve
- * all the I/O ports up front it can't get them and so doesn't probe.
- * Other device drivers are similar (but less severe). This cuts the
- * kernel boot time on my machine from 4.1 seconds to 0.45 seconds.
- */
- paravirt_disable_iospace();
-
- /*
* This is messy CPU setup stuff which the native boot code does before
* start_kernel, so we have to do, too:
*/
--
2.1.0

2015-02-06 05:22:40

by Rusty Russell

Subject: [PATCH 09/29] lguest: add iomem region, where guest page faults get sent to userspace.

This lets us implement PCI.

Signed-off-by: Rusty Russell <[email protected]>
---
drivers/lguest/lg.h | 7 ++++++-
drivers/lguest/lguest_user.c | 3 ++-
drivers/lguest/page_tables.c | 33 ++++++++++++++++++++++++++++++---
drivers/lguest/x86/core.c | 19 ++++++++++++++++++-
tools/lguest/lguest.c | 3 ++-
5 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/drivers/lguest/lg.h b/drivers/lguest/lg.h
index 9da4f351e077..eb81abc05995 100644
--- a/drivers/lguest/lg.h
+++ b/drivers/lguest/lg.h
@@ -97,8 +97,12 @@ struct lguest {
struct lg_cpu cpus[NR_CPUS];
unsigned int nr_cpus;

+ /* Valid guest memory pages must be < this. */
u32 pfn_limit;

+ /* Device memory is >= pfn_limit and < device_limit. */
+ u32 device_limit;
+
/*
* This provides the offset to the base of guest-physical memory in the
* Launcher.
@@ -200,7 +204,8 @@ void guest_pagetable_flush_user(struct lg_cpu *cpu);
void guest_set_pte(struct lg_cpu *cpu, unsigned long gpgdir,
unsigned long vaddr, pte_t val);
void map_switcher_in_guest(struct lg_cpu *cpu, struct lguest_pages *pages);
-bool demand_page(struct lg_cpu *cpu, unsigned long cr2, int errcode);
+bool demand_page(struct lg_cpu *cpu, unsigned long cr2, int errcode,
+ unsigned long *iomem);
void pin_page(struct lg_cpu *cpu, unsigned long vaddr);
bool __guest_pa(struct lg_cpu *cpu, unsigned long vaddr, unsigned long *paddr);
unsigned long guest_pa(struct lg_cpu *cpu, unsigned long vaddr);
diff --git a/drivers/lguest/lguest_user.c b/drivers/lguest/lguest_user.c
index be996d173615..c8b0e8575b44 100644
--- a/drivers/lguest/lguest_user.c
+++ b/drivers/lguest/lguest_user.c
@@ -385,7 +385,7 @@ static int initialize(struct file *file, const unsigned long __user *input)
/* "struct lguest" contains all we (the Host) know about a Guest. */
struct lguest *lg;
int err;
- unsigned long args[3];
+ unsigned long args[4];

/*
* We grab the Big Lguest lock, which protects against multiple
@@ -419,6 +419,7 @@ static int initialize(struct file *file, const unsigned long __user *input)
/* Populate the easy fields of our "struct lguest" */
lg->mem_base = (void __user *)args[0];
lg->pfn_limit = args[1];
+ lg->device_limit = args[3];

/* This is the first cpu (cpu 0) and it will start booting at args[2] */
err = lg_cpu_start(&lg->cpus[0], 0, args[2]);
diff --git a/drivers/lguest/page_tables.c b/drivers/lguest/page_tables.c
index 69c35caa955a..e3abebc912c0 100644
--- a/drivers/lguest/page_tables.c
+++ b/drivers/lguest/page_tables.c
@@ -250,6 +250,16 @@ static void release_pte(pte_t pte)
}
/*:*/

+static bool gpte_in_iomem(struct lg_cpu *cpu, pte_t gpte)
+{
+ /* We don't handle large pages. */
+ if (pte_flags(gpte) & _PAGE_PSE)
+ return false;
+
+ return (pte_pfn(gpte) >= cpu->lg->pfn_limit
+ && pte_pfn(gpte) < cpu->lg->device_limit);
+}
+
static bool check_gpte(struct lg_cpu *cpu, pte_t gpte)
{
if ((pte_flags(gpte) & _PAGE_PSE) ||
@@ -374,8 +384,14 @@ static pte_t *find_spte(struct lg_cpu *cpu, unsigned long vaddr, bool allocate,
*
* If we fixed up the fault (ie. we mapped the address), this routine returns
* true. Otherwise, it was a real fault and we need to tell the Guest.
+ *
+ * There's a corner case: they're trying to access memory between
+ * pfn_limit and device_limit, which is I/O memory. In this case, we
+ * return false and set @iomem to the physical address, so the
+ * Launcher can handle the instruction manually.
*/
-bool demand_page(struct lg_cpu *cpu, unsigned long vaddr, int errcode)
+bool demand_page(struct lg_cpu *cpu, unsigned long vaddr, int errcode,
+ unsigned long *iomem)
{
unsigned long gpte_ptr;
pte_t gpte;
@@ -383,6 +399,8 @@ bool demand_page(struct lg_cpu *cpu, unsigned long vaddr, int errcode)
pmd_t gpmd;
pgd_t gpgd;

+ *iomem = 0;
+
/* We never demand page the Switcher, so trying is a mistake. */
if (vaddr >= switcher_addr)
return false;
@@ -459,6 +477,12 @@ bool demand_page(struct lg_cpu *cpu, unsigned long vaddr, int errcode)
if ((errcode & 4) && !(pte_flags(gpte) & _PAGE_USER))
return false;

+ /* If they're accessing io memory, we expect a fault. */
+ if (gpte_in_iomem(cpu, gpte)) {
+ *iomem = (pte_pfn(gpte) << PAGE_SHIFT) | (vaddr & ~PAGE_MASK);
+ return false;
+ }
+
/*
* Check that the Guest PTE flags are OK, and the page number is below
* the pfn_limit (ie. not mapping the Launcher binary).
@@ -553,7 +577,9 @@ static bool page_writable(struct lg_cpu *cpu, unsigned long vaddr)
*/
void pin_page(struct lg_cpu *cpu, unsigned long vaddr)
{
- if (!page_writable(cpu, vaddr) && !demand_page(cpu, vaddr, 2))
+ unsigned long iomem;
+
+ if (!page_writable(cpu, vaddr) && !demand_page(cpu, vaddr, 2, &iomem))
kill_guest(cpu, "bad stack page %#lx", vaddr);
}
/*:*/
@@ -928,7 +954,8 @@ static void __guest_set_pte(struct lg_cpu *cpu, int idx,
* now. This shaves 10% off a copy-on-write
* micro-benchmark.
*/
- if (pte_flags(gpte) & (_PAGE_DIRTY | _PAGE_ACCESSED)) {
+ if ((pte_flags(gpte) & (_PAGE_DIRTY | _PAGE_ACCESSED))
+ && !gpte_in_iomem(cpu, gpte)) {
if (!check_gpte(cpu, gpte))
return;
set_pte(spte,
diff --git a/drivers/lguest/x86/core.c b/drivers/lguest/x86/core.c
index 42e87bf14113..18d841e738bc 100644
--- a/drivers/lguest/x86/core.c
+++ b/drivers/lguest/x86/core.c
@@ -362,9 +362,19 @@ static void setup_emulate_insn(struct lg_cpu *cpu)
sizeof(cpu->pending.insn));
}

+static void setup_iomem_insn(struct lg_cpu *cpu, unsigned long iomem_addr)
+{
+ cpu->pending.trap = 14;
+ cpu->pending.addr = iomem_addr;
+ copy_from_guest(cpu, cpu->pending.insn, cpu->regs->eip,
+ sizeof(cpu->pending.insn));
+}
+
/*H:050 Once we've re-enabled interrupts, we look at why the Guest exited. */
void lguest_arch_handle_trap(struct lg_cpu *cpu)
{
+ unsigned long iomem_addr;
+
switch (cpu->regs->trapnum) {
case 13: /* We've intercepted a General Protection Fault. */
/* Hand to Launcher to emulate those pesky IN and OUT insns */
@@ -385,8 +395,15 @@ void lguest_arch_handle_trap(struct lg_cpu *cpu)
* whether kernel or userspace code.
*/
if (demand_page(cpu, cpu->arch.last_pagefault,
- cpu->regs->errcode))
+ cpu->regs->errcode, &iomem_addr))
+ return;
+
+ /* Was this an access to memory mapped IO? */
+ if (iomem_addr) {
+ /* Tell Launcher, let it handle it. */
+ setup_iomem_insn(cpu, iomem_addr);
return;
+ }

/*
* OK, it's really not there (or not OK): the Guest needs to
diff --git a/tools/lguest/lguest.c b/tools/lguest/lguest.c
index 485fe13db12e..02f353989e6c 100644
--- a/tools/lguest/lguest.c
+++ b/tools/lguest/lguest.c
@@ -548,7 +548,8 @@ static void tell_kernel(unsigned long start)
{
unsigned long args[] = { LHREQ_INITIALIZE,
(unsigned long)guest_base,
- guest_limit / getpagesize(), start };
+ guest_limit / getpagesize(), start,
+ guest_limit / getpagesize() };
verbose("Guest: %p - %p (%#lx)\n",
guest_base, guest_base + guest_limit, guest_limit);
lguest_fd = open_or_die("/dev/lguest", O_RDWR);
--
2.1.0
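
The range check in gpte_in_iomem() and the address computed in demand_page() can be sketched in isolation. The names below (struct limits, pfn_in_iomem(), iomem_addr(), SKETCH_PAGE_SHIFT) are hypothetical, and 4 KiB pages are assumed; the kernel code works on pte_t values and also rejects large pages, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for the two limits kept in struct lguest. */
struct limits {
	uint32_t pfn_limit;	/* guest memory pages are < this */
	uint32_t device_limit;	/* device pages: >= pfn_limit, < this */
};

/* True if the page frame lies in the device (I/O memory) window. */
static bool pfn_in_iomem(const struct limits *l, uint32_t pfn)
{
	return pfn >= l->pfn_limit && pfn < l->device_limit;
}

/* Page frame comes from the guest PTE, offset from the faulting vaddr. */
static unsigned long iomem_addr(uint32_t pfn, unsigned long vaddr)
{
	return ((unsigned long)pfn << SKETCH_PAGE_SHIFT)
		| (vaddr & (SKETCH_PAGE_SIZE - 1));
}
```

Note that tell_kernel() in this patch passes guest_limit / getpagesize() for both limits, so the window is empty as of this patch and pfn_in_iomem() would return false for every page frame.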