Subject: sched: Fix idle_cpu()
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 15 Sep 2011 15:32:06 +0200
Patch-mainline: v3.2-rc1
Git-commit: 908a3283728d92df36e0c7cd63304fd35e93a8a9
References: bnc#726023

On -rt we observed hackbench waking all 400 tasks to a single cpu.
This is because of select_idle_sibling()'s interaction with the new
ipi based wakeup scheme.

The existing idle_cpu() test only checks to see if the current task on
that cpu is the idle task, it does not take already queued tasks into
account, nor does it take queued to be woken tasks into account.

If the remote wakeup IPIs come hard enough, there won't be time to
schedule away from the idle task, and would thus keep thinking the cpu
was in fact idle, regardless of the fact that there were already
several hundred tasks runnable.

We couldn't reproduce on mainline, but there's no reason it couldn't
happen.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-3o30p18b2paswpc9ohy2gltp@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Mike Galbraith <mgalbraith@suse.de>

---
 kernel/sched.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Index: linux-3.0-SLE11-SP2/kernel/sched.c
===================================================================
--- linux-3.0-SLE11-SP2.orig/kernel/sched.c
+++ linux-3.0-SLE11-SP2/kernel/sched.c
@@ -5114,7 +5114,20 @@ EXPORT_SYMBOL(task_nice);
  */
 int idle_cpu(int cpu)
 {
-	return cpu_curr(cpu) == cpu_rq(cpu)->idle;
+	struct rq *rq = cpu_rq(cpu);
+
+	if (rq->curr != rq->idle)
+		return 0;
+
+	if (rq->nr_running)
+		return 0;
+
+#ifdef CONFIG_SMP
+	if (rq->wake_list)
+		return 0;
+#endif
+
+	return 1;
 }
 
 /**
