System to model and assumptions

In this recipe, we describe an architecture where computing units (i.e. cores) have no shared hardware resources (i.e. cache or memory units) and tasks are statically allocated to the cores. Tasks cannot migrate from one core to another. Tasks assigned to different computing units cannot share software resources because there is no shared memory. This kind of architecture is called a partitioned multiprocessor.

Typical solution

We introduce 3 typical solutions for this recipe. In each of them, a processor is assumed to have a single computing unit:

A partitioned multiprocessor solution with independent processors. Each processor has its own memory unit and its own scheduler. Each thread is located on a given processor, cannot migrate, and can only share resources with the other threads of the same processor. In this solution, a thread competes only with the threads located on the same processor to access it. This solution can be used to model, for example, a set of boards connected by a bus where communication latency is negligible and where each board hosts a unicore processor.
A partitioned multiprocessor solution with processors sharing the same memory unit. This solution extends the previous one with a common memory unit between the processors.
A global multiprocessor solution. This solution has the highest level of flexibility. Processors share the same memory unit. There is only one scheduler, which allocates all the threads to all the processors. Threads can therefore migrate and may share various software resources. Processors and memory units can be connected by a bus, a crossbar switch or a network-on-chip. This solution can be used to model, for example, a multicore architecture or an SMP with global scheduling.

Cheddar properties

To model these architectures, Cheddar provides the set of properties given below:

    ------------------------------------------------------------
    -- This property is used to tag a system component
    -- to specify if it implements a SoC and what kind of SoC
    ------------------------------------------------------------
    Supported_Soc_Type : type enumeration (
        SoC_Processing_Unit, SoC_Memory_Unit, Soc_Interconnection_Unit);

    System_Soc_Type : Supported_Soc_Type applies to (system);

    ---------------------------------------------------
    -- Properties for processing units modeling
    ---------------------------------------------------
    Supported_Instruction_Set_Architecture_Type : type enumeration (
        i386, powerpc, risc5, sparc1, sparc2, sparc3,
        sparc4, sparc5, sparc6, sparc7, sparc8);
    Instruction_Set_Architecture_Type : Cheddar_Multicore_Properties::Supported_Instruction_Set_Architecture_Type applies to (memory, system);

    Supported_Processors_type : type enumeration (
        Unicore_type, Identical_Multicores_Type, Uniform_Multicores_Type,
        Unrelated_Multicores_Type);
    Processors_type : Cheddar_Multicore_Properties::Supported_Processors_type applies to (memory, system);

    Supported_Migrations_Type : type enumeration (
        No_Migration_Type, Job_Level_Migration_Type, Time_Unit_Migration_Type);
    Migrations_Type : Cheddar_Multicore_Properties::Supported_Migrations_Type applies to (memory, system);

    Supported_Multiprocessors_Type : type enumeration (
        Identical, Homogeneous, Heterogeneous);
    Multiprocessors_Type : Cheddar_Multicore_Properties::Supported_Multiprocessors_Type applies to (memory, system);

    -- Memory space isolation
    -- Defines whether a processor may have its own private addressing space.
    -- Related to the capability of the hardware to isolate the memory access
    -- flows of threads (virtual memory context, address space numbers, ...)
    Implement_Runtime_Protection : aadlboolean applies to (processor, system);

    -- Maximum number of instructions run per second,
    -- i.e. Peak_MIPS is an upper bound of the processor speed.
    -- This is the worst means of expressing the speed of a processor,
    -- but everyone knows the limits of that metric.
    Peak_MIPS : aadlinteger applies to (processor, system);

1. Partitioned multiprocessor solution with independent processors

An example of the solution with independent processors and private memory (which can be downloaded here) is:

package partitionned
public
processor core
 properties
    Scheduling_Protocol => RM;
end core;
  	
thread get_line
 properties
    Dispatch_Protocol => Periodic;
    Period => 125 ms;
    Deadline => 125 ms;
    Compute_Execution_Time => 1 ms .. 25 ms;
    Priority => 10 ;
end get_line;
thread implementation get_line.Impl
end get_line.Impl;

thread edge
 properties
    Dispatch_Protocol => Periodic;
    Period => 125 ms;
    Deadline => 125 ms;
    Compute_Execution_Time => 25 ms .. 25 ms;
    Priority => 9 ;
end edge;
thread implementation edge.Impl
end edge.Impl;
  
thread sharp
 properties
    Dispatch_Protocol => Periodic;
    Period => 250 ms;
    Deadline => 250 ms;
    Compute_Execution_Time => 25 ms .. 25 ms;
    Priority => 8 ;
end sharp;
thread implementation sharp.Impl
end sharp.Impl;

process Soft
end Soft;

process implementation Soft.Impl
 subcomponents
    edge     : thread edge.Impl;
    get_line : thread get_line.Impl;
    sharp    : thread sharp.Impl;
end Soft.Impl;

system partitionned
end partitionned;

system implementation partitionned.Impl
 subcomponents
    soft1 : process   soft.Impl;
    soft2 : process   soft.Impl;
    cpu1  : processor core;
    cpu2  : processor core;
 properties
    Actual_Processor_Binding 
          => (reference(cpu1)) 
          applies to soft1;
    Actual_Processor_Binding 
          => (reference(cpu2)) 
          applies to soft2;
end partitionned.Impl;
 	   
end partitionned;

This example is fully compliant with AADL V2 and does not need any Cheddar-specific property. The only noteworthy part is the component partitionned.Impl, in which the partitioning of the threads is defined by two processes (called soft1 and soft2) and two processors (called cpu1 and cpu2). The assignment of each process to its processor is given by the two following property associations:

    Actual_Processor_Binding 
          => (reference(cpu1)) applies to soft1;
    Actual_Processor_Binding 
          => (reference(cpu2)) applies to soft2;

2. Partitioned multiprocessor solution with processors sharing the same memory unit

We now define solution 2 by extending solution 1 with a common memory unit shared by the processors. A typical model for this solution (which can be downloaded here) is:

package core_affinity
public	
  with multicore_crossbar_units;

  thread ordo_bus
   properties
    Dispatch_Protocol => Periodic;
    Period => 125 ms;
    Deadline => 125 ms;
    Compute_Execution_Time => 1 ms .. 25 ms;
    Priority => 10 ;
  end ordo_bus;
  thread implementation ordo_bus.Impl
  end ordo_bus.Impl;

  thread pilotage
   properties
    Dispatch_Protocol => Periodic;
    Period => 250 ms;
    Deadline => 250 ms;
    Compute_Execution_Time => 25 ms .. 25 ms;
    Priority => 8 ;
  end pilotage;
  thread implementation pilotage.Impl
  end pilotage.Impl;
  
  system implementation core_affinity.Impl
   subcomponents
    a_process : process  application.Impl;
    cpu       : system   multicore_crossbar_units::dual_core.impl;
   properties
    Actual_Processor_Binding 
          => (reference(cpu.core1)) 
          applies to a_process.t1;
    Actual_Processor_Binding 
          => (reference(cpu.core2)) 
          applies to a_process.t2;
  end core_affinity.Impl;

  system core_affinity
  end core_affinity;
  
  process application
  end application;

  process implementation Application.Impl
   subcomponents
    t1 : thread pilotage.Impl;
    t2 : thread ordo_bus.Impl;
  end Application.Impl;
  
end core_affinity;

This second solution makes use of the following components (which can be downloaded here):

package multicore_crossbar_units
public
  with Cheddar_Multicore_Properties;
	
processor uni_core
properties
   Scheduling_Protocol=>(POSIX_1003_HIGHEST_PRIORITY_FIRST_PROTOCOL);
end uni_core;

system dual_core
end dual_core;
  
system implementation dual_core.impl
 subcomponents
   core1 : processor uni_core;
   core2 : processor uni_core;
 properties
   Cheddar_Multicore_Properties::System_Soc_Type => SoC_Processing_Unit;	
   Cheddar_Multicore_Properties::SoC_Interconnection_Type => Crossbar; 		 
end dual_core.impl;

system quad_core
end quad_core;
  
system implementation quad_core.impl
 subcomponents
  cores : processor uni_core [4];
 properties
  Cheddar_Multicore_Properties::System_Soc_Type => SoC_Processing_Unit;	
  Cheddar_Multicore_Properties::SoC_Interconnection_Type => Crossbar; 		 
end quad_core.impl;
  
end multicore_crossbar_units;

3. Global multiprocessor

Finally, the third example (which can be downloaded here) is a global multiprocessor solution, where threads can migrate from one computing unit to another and may share various software resources with each other:

package global_scheduling

public
with multicore_crossbar_units;

  thread ordo_bus
  properties
    Dispatch_Protocol => Periodic;
    Period => 125 ms;
    Deadline => 125 ms;
    Compute_Execution_Time => 1 ms .. 25 ms;
    Priority => 10 ;
  end ordo_bus;
  thread implementation ordo_bus.Impl
  end ordo_bus.Impl;

  thread donnees
  properties
    Dispatch_Protocol => Periodic;
    Period => 125 ms;
    Deadline => 125 ms;
    Compute_Execution_Time => 25 ms .. 25 ms;
    Priority => 9 ;
  end donnees;
  thread implementation donnees.Impl
  end donnees.Impl;
  
  thread pilotage
  properties
    Dispatch_Protocol => Periodic;
    Period => 250 ms;
    Deadline => 250 ms;
    Compute_Execution_Time => 25 ms .. 25 ms;
    Priority => 8 ;
  end pilotage;
  thread implementation pilotage.Impl
  end pilotage.Impl;

  process Application
  end Application;

  process implementation Application.Impl
  subcomponents
    ordo_bus : thread ordo_bus.Impl;
    donnees : thread donnees.Impl;
    pilotage : thread pilotage.Impl;
  end Application.Impl;
  
  	
  system product
  end product;
	
  System implementation product.impl
  subcomponents
    hard : system multicore_crossbar_units::dual_core.impl;
    soft : process Application.impl;
  properties
    actual_processor_binding => (reference(hard)) applies to soft;
    Scheduling_Protocol => (RMS) applies to hard;
  end product.impl;

end global_scheduling;

Possible analysis

Migrations, number of context switches and number of preemptions.

Case study

Global scheduling versus partitioned scheduling

We want to analyze scheduling of an architecture composed of 5 threads defined by the following parameters:

Threads	Execution times	Periods
T1	3 ms	6 ms
T2	2 ms	24 ms
T3	2 ms	10 ms
T4	1 ms	6 ms
T5	2 ms	5 ms

Deadlines are equal to periods. All threads are first released at the same time. We want to apply preemptive fixed-priority scheduling with a Rate Monotonic priority assignment.

We assume an architecture with 2 processors and we want to compare global and partitioned scheduling.

1. First, we apply a partitioning approach. T3 and T5 are located on the first processor while the other threads run on the second processor. Design an AADL model of this architecture with AADLInspector or OSATE and compute the scheduling during the first 24 ms with these tools. What is the worst-case response time of each thread?
2. Second, we apply a global approach. Design an AADL model of this second architecture with AADLInspector or OSATE and compute the scheduling during the first 24 ms with these tools. What is the worst-case response time of each thread?
3. For this thread set, which approach is better: partitioned or global scheduling? In general, which of the two approaches is better?
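
As a cross-check of what the tools report for the partitioned and global questions above, both approaches can be sketched with a small discrete-time simulator. This is a minimal Python sketch, not the algorithm used by Cheddar, AADLInspector or OSATE: the function name simulate, the 1 ms time step and the tie-break by thread name (T1 before T4, which share the same period) are assumptions made for illustration. The global case does not track which processor runs which job, which amounts to allowing migration at every time unit.

```python
# Thread set of the case study: name -> (execution time, period), in ms.
# Deadlines are equal to periods and all threads are first released at t = 0.
THREADS = {"T1": (3, 6), "T2": (2, 24), "T3": (2, 10), "T4": (1, 6), "T5": (2, 5)}


def simulate(threads, n_cpus, horizon):
    """Preemptive fixed-priority simulation with 1 ms steps.

    Priorities follow Rate Monotonic (shorter period first); ties are
    broken by thread name, which is an assumption of this sketch.
    Returns the longest response time observed for each thread among
    the jobs that complete before `horizon`.
    """
    by_priority = sorted(threads, key=lambda n: (threads[n][1], n))
    remaining = {n: 0 for n in threads}  # work left in the current job
    release = {n: 0 for n in threads}    # release time of the current job
    worst = {n: 0 for n in threads}

    for t in range(horizon):
        # Release a new job of every thread whose period starts now.
        for name, (wcet, period) in threads.items():
            if t % period == 0:
                if remaining[name] > 0:
                    raise RuntimeError(f"{name} missed its deadline at t={t}")
                remaining[name] = wcet
                release[name] = t
        # Run the n_cpus highest-priority ready jobs for one time unit.
        ready = [n for n in by_priority if remaining[n] > 0]
        for name in ready[:n_cpus]:
            remaining[name] -= 1
            if remaining[name] == 0:
                worst[name] = max(worst[name], t + 1 - release[name])
    return worst


def subset(names):
    """Pick some threads of the case study by name."""
    return {n: THREADS[n] for n in names}


# Partitioned: each processor is simulated on its own with its own threads.
partitioned = {
    **simulate(subset(["T3", "T5"]), n_cpus=1, horizon=24),
    **simulate(subset(["T1", "T2", "T4"]), n_cpus=1, horizon=24),
}
# Global: a single scheduler dispatches all threads on the two processors.
global_ = simulate(THREADS, n_cpus=2, horizon=24)

for name in sorted(THREADS):
    print(f"{name}: partitioned {partitioned[name]} ms, global {global_[name]} ms")
```

The values printed are only the response times observed during the first 24 ms after a simultaneous first release. For global scheduling, this window is not guaranteed to contain the worst case, so the result should be compared with the schedule produced by the tools and not taken as a proof.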

Possible solution/AADL model for this case study here.

Migrations and global scheduling

We want to analyze scheduling of an architecture composed of 4 threads defined by the following parameters:

Threads	Execution times	Periods
T1	2 ms	5 ms
T2	8 ms	20 ms
T3	2 ms	25 ms
T4	3 ms	6 ms

Deadlines are equal to periods. All threads are first released at the same time. We want to apply preemptive fixed-priority scheduling with a Rate Monotonic priority assignment.

1. Design an AADL model for this architecture.
2. Without computing the scheduling of the thread set, which analysis feature of OSATE or AADLInspector can we use to prove that the threads cannot meet their deadlines on one processor?
3. With AADLInspector or OSATE, compute the scheduling during the first 20 ms and verify that your answer to question 2 is correct.
4. In order to meet the deadlines, we now modify the architecture by adding a second processor. Scheduling on the two processors follows a global approach. Each thread is allowed to migrate only when it starts the execution of a job/activation, i.e. when the thread executes its first instruction after each periodic release. With AADLInspector or OSATE, modify your previous AADL model and compute the scheduling during the first 20 ms.
5. Modify your AADL model again to allow the threads to migrate at any time, and then compute the scheduling of the threads during the first 20 ms.
6. Compare the results of questions 4 and 5. In which case do the threads have the shortest worst-case response times? Which migration policy best reduces the worst-case response times for this thread set? More generally, which migration policy best reduces the worst-case response times?
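
One way to reason about the single-processor question without drawing a schedule is to compute the processor utilization factor, i.e. the sum of execution time over period for all threads. The following is a minimal Python sketch of that arithmetic, not the implementation found in OSATE or AADLInspector; the names utilization and may_fit are hypothetical. The test it applies is only a necessary condition: a utilization above the number of processors proves that deadlines will be missed, while a utilization below it proves nothing by itself.

```python
from fractions import Fraction

# Thread set of the case study: name -> (execution time, period), in ms.
THREADS = {"T1": (2, 5), "T2": (8, 20), "T3": (2, 25), "T4": (3, 6)}


def utilization(threads):
    """Sum of execution time / period, kept exact with Fraction."""
    return sum(Fraction(wcet, period) for wcet, period in threads.values())


def may_fit(threads, n_cpus):
    """Necessary (not sufficient) condition: total utilization <= n_cpus."""
    return utilization(threads) <= n_cpus


u = utilization(THREADS)
print(f"U = {u} = {float(u):.2f}")
print("may fit on one processor: ", may_fit(THREADS, 1))
print("may fit on two processors:", may_fit(THREADS, 2))
```

Because the necessary condition holds for two processors but says nothing about fixed-priority global scheduling in particular, the schedules computed with the tools in the following questions are still needed to conclude.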

Possible solution/AADL model for this case study here.