Currently, all non-coldboot HARTs busy spin in wait_for_coldboot()
until the entire coldboot init sequence is completed.
This means:
1) On QEMU, all non-coldboot HARTs will eat host CPU time and
also slow down the coldboot HART until the entire coldboot
init sequence is completed.
2) On real HW, all non-coldboot HARTs will consume more CPU
power until the entire coldboot init sequence is completed.
To address this, wake up all non-coldboot HARTs as early as
possible in the coldboot init sequence.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>