RtOS – implementing it: conserving power

Hi again!

In a previous blog entry I showed You how to make Your tasks wait for events. Now it is time to make this waiting useful.

First, let me remind You what the main task-switching kernel loop looks like:

subroutine yield()
  push "called save" registers
  TaskTable[CTP].SP = SP
  for(;;)
  {
    CTP--
    if (carry over/borrow) CTP = NUMBER_OF_TASKS-1
    if ((TaskTable[CTP].event_flags &
         TaskTable[CTP].event_mask ) !=0)
        {
        SP = TaskTable[CTP].SP
        pop "called save" registers
        return
        }
  }

It is spinning all the way round, isn’t it? Spinning and spinning forever. But what if we alter it a bit:

subroutine yield()
  push "called save" registers
  TaskTable[CTP].SP = SP
  for(;;)
  {
   for(i=NUMBER_OF_TASKS;--i>=0;)
   {
    CTP--
    if (carry over/borrow) CTP = NUMBER_OF_TASKS-1
    if ((TaskTable[CTP].event_flags &
         TaskTable[CTP].event_mask ) !=0)
        {
        SP = TaskTable[CTP].SP
        pop "called save" registers
        return
        }
   }
   *** this is a sweet spot ***
  }

Not much has changed. We now have an inner loop, which checks whether any of the existing tasks is ready to be awoken, and an outer loop, which repeats it forever.
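One pass of the inner loop can be sketched in C as a small function. This is only an illustration: the trimmed-down `struct task`, the function name `scan_once` and the value of `NUMBER_OF_TASKS` are my assumptions, and the stack switching is left out entirely.

```c
#include <stdint.h>

#define NUMBER_OF_TASKS 4   /* assumed value, for illustration only */

/* Hypothetical, trimmed-down task record: only the fields the scan needs. */
struct task {
    uint8_t event_flags; /* events signalled to the task */
    uint8_t event_mask;  /* events the task is waiting for */
};

/*
 * One full pass of the inner loop, starting just below `ctp` and wrapping
 * around. Returns the index of the first ready task, or -1 when the whole
 * pass completes without finding one (the "sweet spot").
 */
static int scan_once(const struct task *table, int ctp)
{
    for (int i = NUMBER_OF_TASKS; --i >= 0;) {
        if (--ctp < 0)
            ctp = NUMBER_OF_TASKS - 1;
        if ((table[ctp].event_flags & table[ctp].event_mask) != 0)
            return ctp;
    }
    return -1;
}
```

Note that the task which called yield() is the last one checked in the pass, so it gets the CPU back only if nobody else is ready.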

But let us think about it for a moment: what exactly does it mean when the program reaches the “sweet spot”?

That there was no task ready to be awoken.

Hmm….

What could it mean? Since no task is ready to be awoken, including the task which has just called yield(), is there any way for any task ever to be brought up again?

Surely no task can set any bit in any of the event_flags variables, because no task will be running.

Does it simply mean that we are stuck?

Well….

Not if there are any interrupts.

And when we are talking about interrupts…

Interrupts and power saving modes

Nowadays most micro-controllers, and even most microprocessors, have some kind of “power saving” functionality. In the 80586, as far as I recall, there was an instruction which just paused the CPU for one cycle in low power mode. Nothing fancy, honestly. In modern micro-controllers we have a much, much wider choice: from just putting the CPU execution core to sleep, through shutting down some sub-systems, then shutting down some clocks, down to a deep sleep from which only a reset can pull us out.

Of course the “deep sleep” is not what we are aiming at, but some intermediate sleep modes will be fine.

But what does this have to do with interrupts?

Everything!

It is an interrupt that wakes the CPU: the hardware gets back online, starts executing the interrupt service routine, gets to the end of it and then…

Exactly… what happens then?

The details depend on the CPU, but if a processor has useful power saving modes, it is able to perform three atomic operations:

  • to put the processor into sleep mode X and atomically enable interrupts;
  • to atomically return from an interrupt, restore the CPU to its previous sleep mode and enable interrupts;
  • to atomically return from an interrupt, put the CPU into active mode and enable interrupts.

I will call the first operation sleep(X), the second one reti (because the active mode is just one of the sleep modes, right? So a regular return from interrupt should be able to restore the sleep mode), and the last one awake_reti.
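To see why these three operations fit together so naturally, here is a toy model in C. It is not real hardware code. It assumes a CPU where the sleep-mode bits and the interrupt-enable bit share one status register which is stacked on interrupt entry (MSP430-style cores work roughly this way); the bit values, `struct cpu` and all function names are made up for the illustration.

```c
#include <stdint.h>

/* Toy model only: illustrative bit layout, not taken from any datasheet. */
#define SR_GIE        0x01u  /* global interrupt enable */
#define SR_SLEEP_MASK 0xF0u  /* sleep-mode bits; all zero means "active" */

struct cpu {
    uint8_t sr;        /* live status register */
    uint8_t saved_sr;  /* copy stacked on interrupt entry */
};

/* sleep(X): one register write selects the mode and enables interrupts. */
static void sleep_mode(struct cpu *c, uint8_t x)
{
    c->sr = (uint8_t)((x & SR_SLEEP_MASK) | SR_GIE);
}

/* Interrupt entry: the status register is stacked, then cleared, so the
 * handler runs awake and with interrupts disabled. */
static void irq_enter(struct cpu *c)
{
    c->saved_sr = c->sr;
    c->sr = 0;
}

/* reti: restores the stacked register, so the previous sleep mode returns. */
static void reti(struct cpu *c)
{
    c->sr = c->saved_sr;
}

/* awake_reti: clears the sleep bits in the stacked copy, then restores it. */
static void awake_reti(struct cpu *c)
{
    c->saved_sr &= (uint8_t)~SR_SLEEP_MASK;
    reti(c);
}
```

In this model a plain reti after sleep_mode(&c, 0x30) puts 0x31 back into the register, so the CPU sleeps again, while awake_reti leaves only the interrupt-enable bit set.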

So let us imagine we do something like this:

subroutine yield()
  push "called save" registers
  TaskTable[CTP].SP = SP
  for(;;)
  {
   for(i=NUMBER_OF_TASKS;--i>=0;)
   {
    CTP--
    if (carry over/borrow) CTP = NUMBER_OF_TASKS-1
    if ((TaskTable[CTP].event_flags &
         TaskTable[CTP].event_mask ) !=0)
        {
        SP = TaskTable[CTP].SP
        pop "called save" registers
        return
        }
   }
  sleep(X)
  }

What have we just done?

Basically we said: “If there is no task ready to run, put the CPU to sleep with interrupts enabled.”

Of course, to bring any of the tasks back to life we need some interrupt code. This code must both signal an event and wake the CPU so that the loop continues.

For example, like this:

interrupt handling routine()
{
  .... blah blah blah
  TaskTable[1].event_flags |= 0b1000_0000; //signal some event to some task
  awake_reti
  }

And we are done!

Well… not really in fact, but let us pretend for a moment we are indeed done. What exactly happens?

The main kernel loop checks all tasks and finds that there is nothing to do, so it puts the CPU to sleep. Then an interrupt happens and decides to awake some task, in this case task number 1. So it sets an event signal and returns from the interrupt in such a manner that the CPU does not return to sleep mode and instead executes code normally. In effect it re-runs the kernel loop. This time the loop will find the task to run, so it will awake it.

Case closed, hurray?

Race condition

Sadly, no, the case is not closed.

Why?

Because even though sleep(X) is atomic, the event_flags testing loop is not. And the following sequence of events is possible:

  1. Task 0 is checked: no, it is not to be awoken.
  2. Task 1 is checked: no, it is not to be awoken.
  3. Task 2 is checked: no, it is not to be awoken.
  4. An interrupt happens and does: TaskTable[1].event_flags |= 0b1000_0000;
  5. Task 3 is checked: no, it is not to be awoken.
  6. Nothing to awake, so sleep(X) is executed. We should NOT be doing that, right?

Because the task checking loop is not atomic against interrupts, an interrupt can slip in during the loop and alter the state of a task which has already been checked. As a result the CPU will be put to sleep when in fact it should not be. In most cases this kind of race may go unnoticed, because sooner or later another interrupt will happen during sleep(X) and wake the CPU, and then all pending events will be noticed and all pending tasks will be awoken. But a delay will be introduced, and in some rare cases, when this is the only interrupt which could awake anything, we will get stuck.
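The sequence above can be replayed deterministically on a desktop machine. The sketch below is a simulation, not kernel code: the "interrupt" is just a line of C fired after a chosen number of checks, the tasks are scanned in the 0..3 order used in the list, and `scan_would_sleep` together with its parameters is a name I made up for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUMBER_OF_TASKS 4   /* assumed value, for illustration only */

struct task {
    uint8_t event_flags;
    uint8_t event_mask;
};

/*
 * Scans tasks 0..3 in order. A simulated interrupt fires right after
 * `irq_after` tasks have been checked and signals task `irq_target`.
 * Returns true when the scan ends with "nothing to awake", which is the
 * point where the kernel would execute sleep(X).
 */
static bool scan_would_sleep(struct task *t, int irq_after, int irq_target)
{
    for (int i = 0; i < NUMBER_OF_TASKS; i++) {
        if ((t[i].event_flags & t[i].event_mask) != 0)
            return false;                       /* ready task found */
        if (i + 1 == irq_after)
            t[irq_target].event_flags |= 0x80;  /* the interrupt slips in */
    }
    return true;                                /* would go to sleep */
}
```

With every mask set to 0x80, firing the interrupt after three checks and targeting task 1 makes the function return true even though task 1 is now ready. That is exactly the lost wake-up.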

The obvious solution is to make the entire testing loop atomic against interrupts:

subroutine yield()
  push "called save" registers
  TaskTable[CTP].SP = SP
  for(;;)
  {
   disable interrupts
   for(i=NUMBER_OF_TASKS;--i>=0;)
   {
    CTP--
    if (carry over/borrow) CTP = NUMBER_OF_TASKS-1
    if ((TaskTable[CTP].event_flags &
         TaskTable[CTP].event_mask ) !=0)
        {
        enable interrupts
        SP = TaskTable[CTP].SP
        pop "called save" registers
        return
        }
   }
  sleep(X) //note: Interrupts are enabled as a side effect of entering sleep mode.
  }

but I hate this solution.

Why do I hate it?

Because it adds a lot to the interrupt latency. This is a significant block of code which does not have to be atomic and which loops very frequently. It is really not worth paying that latency cost if there is a better solution.

Which is something like this:

boolean interrupt_updated_an_event;

subroutine yield()
  push "called save" registers
  TaskTable[CTP].SP = SP
  for(;;)
  {
   disable interrupts
   interrupt_updated_an_event=false
   enable interrupts
   for(i=NUMBER_OF_TASKS;--i>=0;)
   {
   .....
   }
   disable interrupts
   if (not interrupt_updated_an_event)
   {
      sleep(X) //note: Interrupts are enabled as a side effect of entering sleep mode.
   }
  }

And inside an interrupt we just add:

interrupt handling routine()
{
  .... blah blah blah
  TaskTable[1].event_flags |= 0b1000_0000; //signal some event to some task
  interrupt_updated_an_event = true;
  awake_reti
  }

We just added one global flag which indicates that some interrupt adjusted some event flag during the task scanning loop. Of course the interrupt does not care whether it actually happened during the loop or not; it just sets the flag to true. Our main kernel loop atomically clears the flag when it is sure that it will scan all tasks again. Once the scan finishes without awaking any task, the loop enters the block which is atomic against interrupts and tests the flag before going to sleep. If the flag is not set, it is safe to go to sleep. If it is set, all tasks need to be checked again.
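The effect of the flag can be checked with a small desktop simulation. Again this is only a sketch under my own assumptions: `fake_isr` and `kernel_loop` are hypothetical names, the interrupt is fired by hand after a chosen number of checks, the tasks are scanned in 0..3 order for readability, and the places where real code would disable and enable interrupts are marked only by comments.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUMBER_OF_TASKS 4   /* assumed value, for illustration only */

struct task {
    uint8_t event_flags;
    uint8_t event_mask;
};

static volatile bool interrupt_updated_an_event;

/* Simulated interrupt handler: signal an event, then raise the flag. */
static void fake_isr(struct task *t, int target)
{
    t[target].event_flags |= 0x80;
    interrupt_updated_an_event = true;
}

/*
 * Runs the kernel loop until it either finds a ready task (returns its
 * index) or reaches the point where sleeping is safe (returns -1).
 * The simulated interrupt fires once, after `irq_after` checks in total;
 * pass 0 to never fire it. `*passes` counts how many scans were needed.
 */
static int kernel_loop(struct task *t, int irq_after, int irq_target,
                       int *passes)
{
    int checks = 0;
    *passes = 0;
    for (;;) {
        /* real code: disable interrupts, clear the flag, enable interrupts */
        interrupt_updated_an_event = false;
        ++*passes;
        for (int i = 0; i < NUMBER_OF_TASKS; i++) {
            if ((t[i].event_flags & t[i].event_mask) != 0)
                return i;                      /* task to awake */
            if (++checks == irq_after)
                fake_isr(t, irq_target);       /* interrupt slips in */
        }
        /* real code: disable interrupts before testing the flag */
        if (!interrupt_updated_an_event)
            return -1;                         /* safe to sleep(X) here */
        /* flag was set: skip the sleep and scan again */
    }
}
```

Replaying the race scenario (interrupt after three checks, targeting task 1) now costs one extra scan instead of a lost wake-up: the first pass finds nothing, sees the flag, and the second pass returns task 1.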

Selecting X in sleep(X)

As You have probably noticed, we now have one centralized location in our entire program at which the CPU enters the energy saving mode. No more complex decision trees, no more pondering whether I can put the CPU to sleep or not. If there is no task to run, it goes to sleep. If there is a task to run, it stays awake. Plain and simple.

I like it. I hope You will like it too. But what if we can squeeze even more from it?

Power saving woes

The vast number of power saving modes does not come without a price. Sure, I can turn off the Auxiliary Clock signal. Sure, I can turn off the main oscillator. Or a temperature-compensating frequency locked loop. But if I turn them off, they will not work.

For example, my beloved MSP430 will stop background transfers from the USB endpoint incoming shift register to memory if the main system clock is disabled. The entire USB machine works, but the data is not transferred to memory. But if USB is not running, I can save much more power by disabling the main system clock.

Or, in the same CPU, the FLL which stabilizes the RC system clock against a low power watch quartz crystal must run for at least 10 ms each minute, or the clock will drift too much if the temperature changes.

Or… You can imagine.

With our centralized system it is enough to slightly modify the atomic piece which calls sleep(X). For example, like this:

   
subroutine yield()
  ...
   if (not interrupt_updated_an_event)
   {
     if USB.is_on
         sleep(main_clock_on)
     else
         sleep(main_clock_off)
   }
  ....

Again we have a single point where all decisions about power saving are made. This is really, really good for product quality.
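One way to keep that single point tidy is to move the decision into a small pure function which the atomic block calls just before sleep(X). The sketch below is only an illustration: the enum, `struct power_state` and `select_sleep_mode` are hypothetical names, and the second condition (`fll_must_run`) assumes, purely for the sake of the example, that the FLL also needs the main clock kept on. Check Your chip's documentation for the real dependencies.

```c
#include <stdbool.h>

/* Hypothetical sleep modes, mirroring the two used in the pseudocode. */
enum sleep_mode {
    SLEEP_MAIN_CLOCK_ON,
    SLEEP_MAIN_CLOCK_OFF
};

/* Hypothetical snapshot of what currently needs the main clock. */
struct power_state {
    bool usb_is_on;     /* USB needs the main clock to move data to memory */
    bool fll_must_run;  /* assumption: FLL stabilisation window is open */
};

/* Picks the deepest sleep mode which does not break anything in use. */
static enum sleep_mode select_sleep_mode(const struct power_state *s)
{
    if (s->usb_is_on || s->fll_must_run)
        return SLEEP_MAIN_CLOCK_ON;
    return SLEEP_MAIN_CLOCK_OFF;
}
```

A function like this has no side effects, so every combination of conditions can be tested on a desktop machine before it ever runs on the target.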

Summary

After getting through this whole blog entry You should now be able to squeeze as much power saving from Your CPU as possible with just one plain and simple piece of code. Everything related to power saving is kept in one place, and the decision to put the processor to sleep happens transparently, without any special action from You. If there is nothing to do, the processor sleeps. If there is something to do, it stays up.

In the next blog entry I will show You how to introduce elementary protection against hangs into the RtOS, and I will explain why, with an RtOS, the standard methods no longer work.
