Discussion:
[RFC 00/11]: powerKVM, release the compute power of secondary hwthread on host
kernelfans
2014-10-16 19:29:49 UTC
Permalink
Nowadays, when running powerKVM (book3s, HV mode), we have to take the secondary
hwthreads offline. This means that if we run miscellaneous tasks besides dedicated
KVM (e.g. mixing Java and KVM), we lose the compute power of the secondary
hwthreads in the host environment.


This series aims to make powerpc adaptive to miscellaneous tasks on the host.
(This series is just a sketch, with some broken patches. Sorry for bringing it
up in a hurry; I am afraid of having gone too far in the wrong direction, so I
hope to get some advice and feedback early on. I will continue the work on the
"place holder" patches if the idea is reasonable.

Please consider the code as an illustration of the idea.)


How it works internally (a rough C sketch follows this outline):
-1. To enter the guest, the primary hwthread schedules a stopper func on the secondaries to bring them into NAP mode.
The protocol is:

    cpu1                                cpuX
    stop_cpus_async()
                                        bring cpuX to a special state
                                        signal flag and stay trapped
    check for flag
    set up guest env and ipi cpuX

-2. When exiting to the host, the secondary hardcodes a jump back to the stopper func, i.e. back to the host.
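
A rough C sketch of that handshake, just to make the intended flow concrete
(stop_cpus_async() is the interface added later in this series; everything
else is a hypothetical placeholder):

/* Illustrative only -- not part of the patches.  The two stubs stand in
 * for the real NAP/paca handling added by the later patches.
 */
static atomic_t secondaries_ready;

static void prepare_secondary_for_guest(void) { }	/* placeholder */
static void wait_for_primary_ipi(void) { }		/* placeholder */

static int secondary_stopper_fn(void *data)
{
	prepare_secondary_for_guest();		/* bring cpuX to a special state */
	atomic_inc(&secondaries_ready);		/* signal flag ...		 */
	wait_for_primary_ipi();			/* ... and stay trapped		 */
	return 0;
}

static void primary_enter_guest(const struct cpumask *secondaries)
{
	atomic_set(&secondaries_ready, 0);
	stop_cpus_async(secondaries, secondary_stopper_fn, NULL);

	while (atomic_read(&secondaries_ready) != cpumask_weight(secondaries))
		cpu_relax();			/* check for flag */

	/* set up guest env and IPI the secondaries */
}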


Drawbacks that I can think of so far:
-1. It increases the sched interval on the secondaries, but the scheduler does NOT know it. (Can this cause problems?)
-2. It loses some precision of the hrtimer on the secondary hwthreads for the host. (To avoid the primary
getting too small a time slice, we need to impose a threshold, so we may lose precision.)

Any suggestion? Thanks!

Liu Ping Fan (11):
sched: introduce sys_cpumask in tsk to adapt asymmetric system
powerpc: kvm: ensure vcpu-thread run only on primary hwthread
powerpc: kvm: add interface to control kvm function on a core
powerpc: kvm: introduce a kthread on primary thread to anti tickless
sched: introduce stop_cpus_async() to schedule special tsk on cpu
powerpc: kvm: introduce online in paca to indicate whether cpu is
needed by host
powerpc: kvm: the stopper func to cease secondary hwthread
powerpc: kvm: add a flag in vcore to sync primary with secondary
hwthread
powerpc: kvm: handle time base on secondary hwthread
powerpc: kvm: on_primary_thread() force the secondary threads into NAP
mode
powerpc: kvm: Kconfig add an option for enabling secondary hwthread

arch/powerpc/include/asm/kvm_host.h | 6 ++++
arch/powerpc/include/asm/paca.h | 3 ++
arch/powerpc/kernel/asm-offsets.c | 6 ++++
arch/powerpc/kernel/smp.c | 3 ++
arch/powerpc/kernel/sysfs.c | 41 ++++++++++++++++++++++
arch/powerpc/kvm/Kconfig | 4 +++
arch/powerpc/kvm/book3s_hv.c | 39 +++++++++++++++++++++
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 61 +++++++++++++++++++++++++++++++++
arch/powerpc/sysdev/xics/xics-common.c | 12 +++++++
include/linux/init_task.h | 1 +
include/linux/sched.h | 6 ++++
include/linux/stop_machine.h | 2 ++
kernel/sched/core.c | 10 ++++--
kernel/stop_machine.c | 25 +++++++++++---
14 files changed, 212 insertions(+), 7 deletions(-)
--
1.8.3.1
kernelfans
2014-10-16 19:29:50 UTC
Permalink
On some systems such as powerpc, certain tasks (vcpu threads) can only run
on dedicated cpus, since we adopt an asymmetric scheme to manage the whole
physical core (powerKVM only allows the primary hwthread to set up the
runtime env for the secondaries when entering the guest).

Nowadays powerKVM runs with all the secondary hwthreads offline to ensure
that the vcpu threads only run on the primary threads. But we plan to keep
all cpus online when running powerKVM to gain more compute power when
switching back to the host, so introduce a sys_allowed cpumask to reflect
the cpuset on which a vcpu thread is allowed to run.
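
To illustrate how the new mask is meant to be consumed (a sketch, not part
of this patch; it assumes the powerpc threads_per_core variable), a caller
such as the KVM vcpu path could do:

/* Allow the task to run only on the primary hwthread of each core.
 * Afterwards any sched_setaffinity() request is ANDed against
 * sys_allowed by do_set_cpus_allowed().
 */
static void restrict_to_primary_hwthreads(struct task_struct *p)
{
	cpumask_t mask;
	int cpu;

	cpumask_clear(&mask);
	for (cpu = 0; cpu < nr_cpu_ids; cpu += threads_per_core)
		cpumask_set_cpu(cpu, &mask);

	set_cpus_sys_allowed(p, &mask);
	/* cpus_allowed becomes possible & sys_allowed */
	do_set_cpus_allowed(p, cpu_possible_mask);
}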

Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
include/linux/init_task.h | 1 +
include/linux/sched.h | 6 ++++++
kernel/sched/core.c | 10 ++++++++--
3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 2bb4c4f3..c56f69e 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -172,6 +172,7 @@ extern struct task_group root_task_group;
.normal_prio = MAX_PRIO-20, \
.policy = SCHED_NORMAL, \
.cpus_allowed = CPU_MASK_ALL, \
+ .sys_allowed = CPU_MASK_ALL, \
.nr_cpus_allowed= NR_CPUS, \
.mm = NULL, \
.active_mm = &init_mm, \
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..ce429f3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1260,7 +1260,10 @@ struct task_struct {

unsigned int policy;
int nr_cpus_allowed;
+ /* user mask ANDed with sys_allowed */
cpumask_t cpus_allowed;
+ /* on an asymmetric system, some tasks may only run on certain cpus */
+ cpumask_t sys_allowed;

#ifdef CONFIG_PREEMPT_RCU
int rcu_read_lock_nesting;
@@ -2030,6 +2033,9 @@ static inline void tsk_restore_flags(struct task_struct *task,
}

#ifdef CONFIG_SMP
+extern void set_cpus_sys_allowed(struct task_struct *p,
+ const struct cpumask *new_mask);
+
extern void do_set_cpus_allowed(struct task_struct *p,
const struct cpumask *new_mask);

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec1a286..2cd1ae3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4596,13 +4596,19 @@ void init_idle(struct task_struct *idle, int cpu)
}

#ifdef CONFIG_SMP
+void set_cpus_sys_allowed(struct task_struct *p,
+ const struct cpumask *new_mask)
+{
+ cpumask_copy(&p->sys_allowed, new_mask);
+}
+
void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
{
if (p->sched_class && p->sched_class->set_cpus_allowed)
p->sched_class->set_cpus_allowed(p, new_mask);

- cpumask_copy(&p->cpus_allowed, new_mask);
- p->nr_cpus_allowed = cpumask_weight(new_mask);
+ cpumask_and(&p->cpus_allowed, &p->sys_allowed, new_mask);
+ p->nr_cpus_allowed = cpumask_weight(&p->cpus_allowed);
}

/*
--
1.8.3.1
kernelfans
2014-10-16 19:29:51 UTC
Permalink
When a vcpu thread runs for the first time, it ensures that it sticks
to the primary hwthread.

Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
arch/powerpc/include/asm/kvm_host.h | 3 +++
arch/powerpc/kvm/book3s_hv.c | 17 +++++++++++++++++
2 files changed, 20 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 98d9dd5..9a3355e 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -666,6 +666,9 @@ struct kvm_vcpu_arch {
spinlock_t tbacct_lock;
u64 busy_stolen;
u64 busy_preempt;
+#ifdef CONFIG_KVMPPC_ENABLE_SECONDARY
+ bool cpu_selected;
+#endif
#endif
};

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 27cced9..ba258c8 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1909,6 +1909,23 @@ static int kvmppc_vcpu_run_hv(struct kvm_run *run, struct kvm_vcpu *vcpu)
{
int r;
int srcu_idx;
+#ifdef CONFIG_KVMPPC_ENABLE_SECONDARY
+ int cpu = smp_processor_id();
+ int target_cpu;
+ unsigned int i;
+ struct task_struct *p = current;
+
+ if (unlikely(!vcpu->arch.cpu_selected)) {
+ vcpu->arch.cpu_selected = true;
+ cpumask_clear(&p->sys_allowed);
+ for (i = 0; i < NR_CPUS; i += threads_per_core) {
+ cpumask_set_cpu(i, &p->sys_allowed);
+ }
+ if (cpu % threads_per_core != 0) {
+ target_cpu = cpu / threads_per_core * threads_per_core;
+ migrate_task_to(current, target_cpu);
+ }
+ }
+#endif

if (!vcpu->arch.sane) {
run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
--
1.8.3.1
kernelfans
2014-10-16 19:29:52 UTC
Permalink
When KVM is enabled on a core, we migrate all external irqs to the primary
thread, since currently the KVM irq logic is handled by the primary
hwthread.

Todo: this patch lacks re-enabling of irq balancing when KVM is disabled on
the core.
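
For reference, the cpu number of thread 'thr' of core 'core' is simply
core * threads_per_core + thr; a minimal sketch of building the mask of
online secondaries of a core (the helper name is mine) looks like:

/* Collect the online secondary hwthreads of 'core' so stop_machine()
 * can run xics_migrate_irqs_away_secondary() on them.
 */
static void secondaries_of_core(unsigned long core, struct cpumask *mask)
{
	unsigned long thr;

	cpumask_clear(mask);
	for (thr = 1; thr < threads_per_core; thr++)
		if (cpu_online(core * threads_per_core + thr))
			cpumask_set_cpu(core * threads_per_core + thr, mask);
}

Once the attribute exists, enabling KVM on, say, core 2 should just be
"echo 2 > /sys/devices/system/cpu/kvm_enable", given that the file hangs
off cpu_subsys.dev_root.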

Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
arch/powerpc/kernel/sysfs.c | 39 ++++++++++++++++++++++++++++++++++
arch/powerpc/sysdev/xics/xics-common.c | 12 +++++++++++
2 files changed, 51 insertions(+)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 67fd2fd..a2595dd 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -552,6 +552,45 @@ static void sysfs_create_dscr_default(void)
if (cpu_has_feature(CPU_FTR_DSCR))
err = device_create_file(cpu_subsys.dev_root, &dev_attr_dscr_default);
}
+
+#ifdef CONFIG_KVMPPC_ENABLE_SECONDARY
+#define NR_CORES (CONFIG_NR_CPUS/threads_per_core)
+static DECLARE_BITMAP(kvm_on_core, NR_CORES) __read_mostly;
+
+static ssize_t show_kvm_enable(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ /* TODO: dump the kvm_on_core bitmap */
+ return 0;
+}
+
+static ssize_t __used store_kvm_enable(struct device *dev,
+ struct device_attribute *attr, const char *buf,
+ size_t count)
+{
+ struct cpumask stop_cpus;
+ unsigned long core, thr;
+
+ sscanf(buf, "%lx", &core);
+ if (core >= NR_CORES)
+ return -EINVAL;
+ if (!test_bit(core, &kvm_on_core))
+ for (thr = 1; thr< threads_per_core; thr++)
+ if (cpu_online(thr * threads_per_core + thr))
+ cpumask_set_cpu(thr * threads_per_core + thr, &stop_cpus);
+
+ stop_machine(xics_migrate_irqs_away_secondary, NULL, &stop_cpus);
+ set_bit(core, &kvm_on_core);
+ return count;
+}
+
+static DEVICE_ATTR(kvm_enable, 0600,
+ show_kvm_enable, store_kvm_enable);
+
+static void sysfs_create_kvm_enable(void)
+{
+ device_create_file(cpu_subsys.dev_root, &dev_attr_kvm_enable);
+}
+#endif
+
#endif /* CONFIG_PPC64 */

#ifdef HAS_PPC_PMC_PA6T
diff --git a/arch/powerpc/sysdev/xics/xics-common.c b/arch/powerpc/sysdev/xics/xics-common.c
index fe0cca4..68b33d8 100644
--- a/arch/powerpc/sysdev/xics/xics-common.c
+++ b/arch/powerpc/sysdev/xics/xics-common.c
@@ -258,6 +258,18 @@ unlock:
raw_spin_unlock_irqrestore(&desc->lock, flags);
}
}
+
+int xics_migrate_irqs_away_secondary(void *data)
+{
+ int cpu = smp_processor_id();
+ if (cpu % threads_per_core != 0) {
+ WARN(1, "xics_migrate_irqs_away_secondary() on cpu %d\n", cpu);
+ return 0;
+ }
+ /* In fact, if we can migrate the primary, it will be more fine */
+ xics_migrate_irqs_away();
+ return 0;
+}
#endif /* CONFIG_HOTPLUG_CPU */

#ifdef CONFIG_SMP
--
1.8.3.1
kernelfans
2014-10-16 19:29:53 UTC
Permalink
(This patch is a place holder.)

If only one vcpu thread is ready (the other vcpu threads can wait for it
to execute), the primary thread can enter tickless mode, which keeps the
primary running, so the secondaries get no opportunity to exit to the
host, even when they have other tasks on them.

Introduce a kthread (anti_tickless) on the primary, so that when there is
only one vcpu thread on the primary, the secondaries can rely on
anti_tickless to keep the primary out of tickless mode.
(I think the anti_tickless thread can itself go to NAP, so we can still
let the secondaries run.)
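
A very rough sketch of what such a kthread could look like (purely
hypothetical; the name follows the commented-out kthread_create_on_cpu()
call below, and the one-jiffy period is made up):

/* Hypothetical anti_tickless worker: wake up every jiffy so the primary
 * hwthread never satisfies the conditions for stopping its periodic
 * tick while KVM is enabled on the core.
 */
static int ppckvm_prevent_tickless(void *data)
{
	while (!kthread_should_stop())
		schedule_timeout_interruptible(1);
	return 0;
}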

Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
arch/powerpc/kernel/sysfs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index a2595dd..f0b110e 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -575,9 +575,11 @@ static ssize_t __used store_kvm_enable(struct device *dev,
if (!test_bit(core, &kvm_on_core))
for (thr = 1; thr< threads_per_core; thr++)
if (cpu_online(thr * threads_per_core + thr))
- cpumask_set_cpu(thr * threads_per_core + thr, &stop_cpus);
+ cpumask_set_cpu(core * threads_per_core + thr, &stop_cpus);

stop_machine(xics_migrate_irqs_away_secondary, NULL, &stop_cpus);
+ /* fixme, create a kthread on primary hwthread to handle tickless mode */
+ //kthread_create_on_cpu(prevent_tickless, NULL, core * threads_per_core, "ppckvm_prevent_tickless");
set_bit(core, &kvm_on_core);
return count;
}
--
1.8.3.1
kernelfans
2014-10-16 19:29:54 UTC
Permalink
The protocol is:

    cpu1                                cpuX
    stop_cpus_async()
                                        bring cpuX to a special state
                                        signal flag and stay trapped
    check for flag

This function lets powerpc reuse the cpu_stopper_task scheme to force
the secondary hwthreads into NAP state, in which a cpu no longer runs
until the master cpu tells it to go.
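
A minimal caller-side sketch of the new interface (the helper and the
power-of-two assumption about threads_per_core are mine):

/* Fire-and-forget a stopper on the secondary hwthreads of the calling
 * cpu's core.  stop_cpus_async() queues the work and returns at once,
 * so the caller needs its own "ready" handshake with the secondaries.
 */
static void cease_my_secondaries(cpu_stop_fn_t fn, void *arg)
{
	struct cpumask mask;
	int base = smp_processor_id() & ~(threads_per_core - 1);
	int thr;

	cpumask_clear(&mask);
	for (thr = 1; thr < threads_per_core; thr++)
		cpumask_set_cpu(base + thr, &mask);

	stop_cpus_async(&mask, fn, arg);
}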

Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
include/linux/stop_machine.h | 2 ++
kernel/stop_machine.c | 25 ++++++++++++++++++++-----
2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index d2abbdb..871c1bf 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -32,6 +32,8 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *
void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg,
struct cpu_stop_work *work_buf);
int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);
+int stop_cpus_async(const struct cpumask *cpumask, cpu_stop_fn_t fn,
+ void *arg);
int try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg);

#else /* CONFIG_SMP */
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 695f0c6..d26fd6a 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -354,13 +354,15 @@ static void queue_stop_cpus_work(const struct cpumask *cpumask,
}

static int __stop_cpus(const struct cpumask *cpumask,
- cpu_stop_fn_t fn, void *arg)
+ cpu_stop_fn_t fn, void *arg, bool sync)
{
struct cpu_stop_done done;

- cpu_stop_init_done(&done, cpumask_weight(cpumask));
+ if (sync)
+ cpu_stop_init_done(&done, cpumask_weight(cpumask));
queue_stop_cpus_work(cpumask, fn, arg, &done);
- wait_for_completion(&done.completion);
+ if (sync)
+ wait_for_completion(&done.completion);
return done.executed ? done.ret : -ENOENT;
}

@@ -398,7 +400,20 @@ int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg)

/* static works are used, process one request at a time */
mutex_lock(&stop_cpus_mutex);
- ret = __stop_cpus(cpumask, fn, arg);
+ ret = __stop_cpus(cpumask, fn, arg, true);
+ mutex_unlock(&stop_cpus_mutex);
+ return ret;
+}
+
+/* similar to stop_cpus(), but not wait for the ack. */
+int stop_cpus_async(const struct cpumask *cpumask, cpu_stop_fn_t fn,
+ void *arg)
+{
+ int ret;
+
+ /* static works are used, process one request at a time */
+ mutex_lock(&stop_cpus_mutex);
+ ret = __stop_cpus(cpumask, fn, arg, false);
mutex_unlock(&stop_cpus_mutex);
return ret;
}
@@ -428,7 +443,7 @@ int try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg)
/* static works are used, process one request at a time */
if (!mutex_trylock(&stop_cpus_mutex))
return -EAGAIN;
- ret = __stop_cpus(cpumask, fn, arg);
+ ret = __stop_cpus(cpumask, fn, arg, true);
mutex_unlock(&stop_cpus_mutex);
return ret;
}
--
1.8.3.1
kernelfans
2014-10-16 19:29:55 UTC
Permalink
Nowadays, powerKVM runs with the secondary hwthreads offline. Although
we can bring all secondary hwthreads online later, we still preserve
this behavior for a dedicated KVM env. Achieve this by leaving
paca->online false.

Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
arch/powerpc/include/asm/paca.h | 3 +++
arch/powerpc/kernel/asm-offsets.c | 3 +++
arch/powerpc/kernel/smp.c | 3 +++
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 12 ++++++++++++
4 files changed, 21 insertions(+)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index a5139ea..67c2500 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -84,6 +84,9 @@ struct paca_struct {
u8 cpu_start; /* At startup, processor spins until */
/* this becomes non-zero. */
u8 kexec_state; /* set when kexec down has irqs off */
+#ifdef CONFIG_KVMPPC_ENABLE_SECONDARY
+ u8 online;
+#endif
#ifdef CONFIG_PPC_STD_MMU_64
struct slb_shadow *slb_shadow_ptr;
struct dtl_entry *dispatch_log;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 9d7dede..0faa8fe 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -182,6 +182,9 @@ int main(void)
DEFINE(PACATOC, offsetof(struct paca_struct, kernel_toc));
DEFINE(PACAKBASE, offsetof(struct paca_struct, kernelbase));
DEFINE(PACAKMSR, offsetof(struct paca_struct, kernel_msr));
+#ifdef CONFIG_KVMPPC_ENABLE_SECONDARY
+ DEFINE(PACAONLINE, offsetof(struct paca_struct, online));
+#endif
DEFINE(PACASOFTIRQEN, offsetof(struct paca_struct, soft_enabled));
DEFINE(PACAIRQHAPPENED, offsetof(struct paca_struct, irq_happened));
DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context.id));
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index a0738af..4c3843e 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -736,6 +736,9 @@ void start_secondary(void *unused)

+#ifdef CONFIG_KVMPPC_ENABLE_SECONDARY
+ get_paca()->online = true;
+#endif
cpu_startup_entry(CPUHP_ONLINE);

BUG();
}

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index f0c4db7..d5594b0 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -322,6 +322,13 @@ kvm_no_guest:
li r0, KVM_HWTHREAD_IN_NAP
stb r0, HSTATE_HWTHREAD_STATE(r13)
kvm_do_nap:
+#ifdef PPCKVM_ENABLE_SECONDARY
+ /* check the cpu is needed by host or not */
+ lbz r2, PACAONLINE(r13)
+ cmpwi r2, 0
+ bne kvm_secondary_exit_trampoline
+#endif
/* Clear the runlatch bit before napping */
mfspr r2, SPRN_CTRLF
clrrdi r2, r2, 1
@@ -340,6 +347,11 @@ kvm_do_nap:
nap
b .

+#ifdef PPCKVM_ENABLE_SECONDARY
+kvm_secondary_exit_trampoline:
+ b .
+#endif
+
/******************************************************************************
* *
* Entry code *
--
1.8.3.1
kernelfans
2014-10-16 19:29:56 UTC
Permalink
To enter the guest, the primary hwthread schedules the stopper func on
the secondary hwthreads and forces them into NAP mode.
When exiting to the host, the secondary hwthreads hardcode a jump that
restores the stack, then switch back to the stopper func, i.e. the host.

Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
arch/powerpc/kvm/book3s_hv.c | 15 +++++++++++++++
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 34 +++++++++++++++++++++++++++++++++
2 files changed, 49 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ba258c8..4348abd 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1486,6 +1486,21 @@ static void kvmppc_remove_runnable(struct kvmppc_vcore *vc,
list_del(&vcpu->arch.run_list);
}

+#ifdef KVMPPC_ENABLE_SECONDARY
+
+extern void kvmppc_secondary_stopper_enter(void);
+
+static int kvmppc_secondary_stopper(void *data)
+{
+ int cpu = smp_processor_id();
+
+ /* must be called on a secondary hwthread only */
+ BUG_ON(!(cpu % threads_per_core));
+
+ kvmppc_secondary_stopper_enter();
+ return 0;
+}
+
+#endif
+
static int kvmppc_grab_hwthread(int cpu)
{
struct paca_struct *tpaca;
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index d5594b0..254038b 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -349,7 +349,41 @@ kvm_do_nap:

#ifdef PPCKVM_ENABLE_SECONDARY
kvm_secondary_exit_trampoline:
+
+ /* all register is free to use, later kvmppc_secondary_stopper_exit set up them*/
+ //loop-wait for the primary to signal that host env is ready
+
+ LOAD_REG_ADDR(r5, kvmppc_secondary_stopper_exit)
+ /* fixme, load msr from lpaca stack */
+ li r6, MSR_IR | MSR_DR
+ mtsrr0 r5
+ mtsrr1 r6
+ RFI
+
+_GLOBAL_TOC(kvmppc_secondary_stopper_enter)
+ mflr r0
+ std r0, PPC_LR_STKOFF(r1)
+ stdu r1, -112(r1)
+
+ /* fixme: store other register such as msr */
+
+ /* prevent us to enter kernel */
+ li r0, 1
+ stb r0, HSTATE_HWTHREAD_REQ(r13)
+ /* tell the primary that we are ready */
+ li r0,KVM_HWTHREAD_IN_KERNEL
+ stb r0,HSTATE_HWTHREAD_STATE(r13)
+ nap
b .
+
+/* enter with vmode */
+kvmppc_secondary_stopper_exit:
+ /* fixme, restore the stack which we store on lpaca */
+
+ ld r0, 112+PPC_LR_STKOFF(r1)
+ addi r1, r1, 112
+ mtlr r0
+ blr
#endif

/******************************************************************************
--
1.8.3.1
kernelfans
2014-10-16 19:29:57 UTC
Permalink
The secondary hwthreads can only jump back to the host after the primary
has set up the host env. Add a host_ready field in kvmppc_vcore to
synchronize this.
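
In C terms the handshake below boils down to something like this (sketch
only; the real thing runs in real-mode assembly, and ordering barriers
are omitted here):

/* Secondary side: spin until the primary says the host env is back. */
static void secondary_wait_for_host(struct kvmppc_vcore *vc)
{
	while (!ACCESS_ONCE(vc->host_ready))
		cpu_relax();
}

/* Primary side: called once the host MMU/timer state is restored. */
static void primary_signal_host_ready(struct kvmppc_vcore *vc)
{
	ACCESS_ONCE(vc->host_ready) = 1;
}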

Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
arch/powerpc/include/asm/kvm_host.h | 3 +++
arch/powerpc/kernel/asm-offsets.c | 3 +++
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 ++++++++++-
3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 9a3355e..1310e03 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -305,6 +305,9 @@ struct kvmppc_vcore {
u32 arch_compat;
ulong pcr;
ulong dpdes; /* doorbell state (POWER8) */
+#ifdef CONFIG_KVMPPC_ENABLE_SECONDARY
+ u8 host_ready;
+#endif
void *mpp_buffer; /* Micro Partition Prefetch buffer */
bool mpp_buffer_is_valid;
};
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 0faa8fe..9c04ac2 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -562,6 +562,9 @@ int main(void)
DEFINE(VCORE_LPCR, offsetof(struct kvmppc_vcore, lpcr));
DEFINE(VCORE_PCR, offsetof(struct kvmppc_vcore, pcr));
DEFINE(VCORE_DPDES, offsetof(struct kvmppc_vcore, dpdes));
+#ifdef CONFIG_KVMPPC_ENABLE_SECONDARY
+ DEFINE(VCORE_HOST_READY, offsetof(struct kvmppc_vcore, host_ready));
+#endif
DEFINE(VCPU_SLB_E, offsetof(struct kvmppc_slb, orige));
DEFINE(VCPU_SLB_V, offsetof(struct kvmppc_slb, origv));
DEFINE(VCPU_SLB_SIZE, sizeof(struct kvmppc_slb));
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 254038b..89ea16c 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -351,7 +351,11 @@ kvm_do_nap:
kvm_secondary_exit_trampoline:

/* all register is free to use, later kvmppc_secondary_stopper_exit set up them*/
- //loop-wait for the primary to signal that host env is ready
+ /* wait until the primary to set up host env */
+ ld r5, HSTATE_KVM_VCORE(r13)
+ lbz r0, VCORE_HOST_READY(r5)
+ cmpwi r0, 1 /* primary is ready? */
+ bne kvm_secondary_exit_trampoline

LOAD_REG_ADDR(r5, kvmppc_secondary_stopper_exit)
/* fixme, load msr from lpaca stack */
@@ -1821,6 +1825,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
li r0, KVM_GUEST_MODE_NONE
stb r0, HSTATE_IN_GUEST(r13)

+#ifdef PPCKVM_ENABLE_SECONDARY
+ /* signal the secondary that host env is ready */
+ li r0, 1
+ stb r0, VCORE_HOST_READY(r5)
+#endif
ld r0, 112+PPC_LR_STKOFF(r1)
addi r1, r1, 112
mtlr r0
--
1.8.3.1
kernelfans
2014-10-16 19:29:58 UTC
Permalink
(This is a place holder patch.)
We need to store the time base for the host on the secondary hwthreads.
Later, when switching back, we need to reprogram the timer with the
elapsed time.
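
A sketch of the book-keeping this placeholder is asking for (the per-cpu
field and helper names are mine, and the real fix-up has to live in the
real-mode enter/exit paths):

static DEFINE_PER_CPU(u64, host_dec_expire_tb);	/* hypothetical */

static void secondary_timer_enter(void)
{
	/* remember when the host decrementer would have fired ... */
	__this_cpu_write(host_dec_expire_tb, mftb() + mfspr(SPRN_DEC));
	/* ... then "set it as MAX" so the host tick stays quiet */
	mtspr(SPRN_DEC, 0x7fffffff);
}

static void secondary_timer_exit(void)
{
	u64 now = mftb();
	u64 exp = __this_cpu_read(host_dec_expire_tb);

	/* re-arm so the host hrtimer can fire as soon as possible */
	mtspr(SPRN_DEC, now < exp ? exp - now : 1);
}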

Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 89ea16c..a817ba6 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -371,6 +371,8 @@ _GLOBAL_TOC(kvmppc_secondary_stopper_enter)

/* fixme: store other register such as msr */

+ /* fixme: store the tb, and set it as MAX, so we cease the tick on secondary */
+
/* prevent us to enter kernel */
li r0, 1
stb r0, HSTATE_HWTHREAD_REQ(r13)
@@ -382,6 +384,10 @@ _GLOBAL_TOC(kvmppc_secondary_stopper_enter)

/* enter with vmode */
kvmppc_secondary_stopper_exit:
+ /* fixme: restore the tb, with the orig val plus time elapse
+ * so we can fire the hrtimer as soon as possible
+ */
+
/* fixme, restore the stack which we store on lpaca */

ld r0, 112+PPC_LR_STKOFF(r1)
--
1.8.3.1
kernelfans
2014-10-16 19:29:59 UTC
Permalink
The primary hwthread stops the scheduler on the secondary hwthreads by
bringing them into NAP. Then the secondaries are ready for the guest.

Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
arch/powerpc/kvm/book3s_hv.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 4348abd..7896c31 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1593,15 +1593,22 @@ static int on_primary_thread(void)
{
int cpu = smp_processor_id();
int thr;
+ struct cpumask msk;

/* Are we on a primary subcore? */
if (cpu_thread_in_subcore(cpu))
return 0;

thr = 0;
+#ifdef KVMPPC_ENABLE_SECONDARY
+ cpumask_clear(&msk);
+ while (++thr < threads_per_subcore)
+ cpumask_set_cpu(cpu + thr, &msk);
+ stop_cpus_async(&msk, kvmppc_secondary_stopper, NULL);
+#else
while (++thr < threads_per_subcore)
if (cpu_online(cpu + thr))
return 0;
+#endif

/* Grab all hw threads so they can't go into the kernel */
for (thr = 1; thr < threads_per_subcore; ++thr) {
--
1.8.3.1
kernelfans
2014-10-16 19:30:00 UTC
Permalink
Signed-off-by: Liu Ping Fan <pingfank at linux.vnet.ibm.com>
---
arch/powerpc/kvm/Kconfig | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 602eb51..de38566 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -93,6 +93,10 @@ config KVM_BOOK3S_64_HV

If unsure, say N.

+config KVMPPC_ENABLE_SECONDARY
+ tristate "KVM support for running on secondary hwthread in host"
+ depends on KVM_BOOK3S_64_HV
+
config KVM_BOOK3S_64_PR
tristate "KVM support without using hypervisor mode in host"
depends on KVM_BOOK3S_64
--
1.8.3.1