Synchronous Modeling of Data-Intensive Applications
Huafeng Yu, A. Gamatié, É. Rutten, P. Boulet and J.-L. Dekeyser
{Yu, Gamatie, Boulet, Dekeyser}@lifl.fr, Eric.Rutten@inrialpes.fr
DART project, INRIA Futurs / WEST group, LIFL


slide-1
SLIDE 1

Synchronous Modeling of Data-Intensive Applications

  • Huafeng Yu, A. Gamatié,
  • É. Rutten,
  • P. Boulet and J.-L. Dekeyser

{Yu, Gamatie, Boulet, Dekeyser}@lifl.fr Eric.Rutten@inrialpes.fr DART project, INRIA Futurs / WEST group, LIFL

slide-2
SLIDE 2

This talk :

◮ Detailed presentation of Array-OL/Gaspard
◮ First results of the study

slide-3
SLIDE 3

Plan

Introduction
Gaspard methodology
General scheme
Data-intensive processing
Array-OL language
Existing works
Simple synchronous modeling of Gaspard models
Parallel model
Serialized model
Validation issues
Conclusions

slide-4
SLIDE 4

Introduction

Context : data-intensive applications (DIA) in embedded systems

◮ regular multidimensional data processing
◮ parallel processing in Systems-on-Chip (SoC)

Motivations : adequate techniques for

◮ efficient data manipulation
◮ analysis of implementation properties

Approach : combination of

◮ a formalism dedicated to DIA (Array-OL)
◮ data-flow synchronous equation models

slide-5
SLIDE 5

Gaspard methodology

[Figure: the Gaspard MDE flow. Application and architecture PIMs are combined into an Association PIM, then a Deployed PIM, refined into TLM and RTL PIMs; PSMs (High-Perf, Corba, SystemC, VHDL, Verilog) produce Fortran, Corba, SystemC, VHDL and Verilog code; interoperability bridges connect the levels. Supporting tools: refactoring, Array-OL pretty editors, performance analysis, synchronous technologies.]

slide-6
SLIDE 6

General scheme

[Figure: general scheme. Gaspard2 models (Array-OL metamodel) are transformed (transf1, transf2) into an intermediate modeling language of synchronous equations plus control (metamodel); code generators then target Lustre, Lucid Synchrone and Signal (simulation, verification, clock calculus) as well as TLM and RTL flows (transformations, compilation, codesign analysis, simulation); the synchronous side provides formal diagnosis and debug of the specification language.]

slide-7
SLIDE 7

Introduction
Gaspard methodology
General scheme
Data-intensive processing
Array-OL language
Existing works
Simple synchronous modeling of Gaspard models
Parallel model
Serialized model
Validation issues
Conclusions

slide-9
SLIDE 9

Array-OL

Array-OL (Array-Oriented Language) : initially proposed by Thomson Marconi Sonar [DD98]

◮ Specification language for full parallelism
◮ Data manipulation through arrays
◮ Deadlock-free and deterministic by construction
◮ Descriptions independent from implementation platforms
◮ Two types of parallelism in application specifications :
  task parallelism and data parallelism

slide-10
SLIDE 10

Array-OL (cont’d)

Task parallelism and data dependencies :

[Figure: task graph. Task1 : (1920, 1080, ∞) → (720, 1080, ∞) ; Task2 : (1600, 1200, ∞) → (720, 1080, ∞) ; Task3 : (720, 1080, ∞) × (720, 1080, ∞) → (720, 480, ∞).]

slide-13
SLIDE 13

Array-OL (cont’d)

Different task models :

◮ Elementary task : atomic computation block (instantaneous function)
◮ Hierarchical task : task represented by hierarchical acyclic graphs in which
  ◮ each node consists of a task, and
  ◮ edges are labeled by the arrays
◮ Repetition task : expression of data parallelism

slide-14
SLIDE 14

Array-OL (cont’d)

Data parallelism

◮ Repetition element : the subtask to be repeated
◮ Repetition space : bounds the number of repetitions and links inputs to outputs
◮ Interface : input and output arrays
◮ Tiler : defines how sub-arrays are obtained from an input array and how sub-arrays are stored in an output array

slide-15
SLIDE 15

Array-OL (cont’d)

Example of a repetition task

[Figure: a repetition task R repeating an elementary task E over the repetition space (3, 2). The input arrays, of shapes (9, 8) and (11, 6), are tiled into input patterns of shapes (3, 4) and (2, 3); each repetition produces an output pattern of shape (1), stored into the output array of shape (3, 2). Each of the three tilers is defined by its own o, P and F matrices.]

slide-16
SLIDE 16

Array-OL (cont’d)

Tiler specification :

◮ o : original point (origin) of the reference pattern in the array
◮ P : paving matrix (how the array is tiled by patterns)
◮ F : fitting matrix (how patterns are filled by array elements)
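The tiler's addressing can be made concrete. The slides only name o, P and F; the sketch below assumes the standard Array-OL tiling relation, in which repetition point r and fitting point f reference the array element (o + P·r + F·f) mod array_shape. The function name `tile_indices` and the 1-D example values are illustrative, not taken from the slides.

```python
# Sketch of the Array-OL tiling relation (assumed standard definition):
#   element(r, f) = (o + P.r + F.f) mod array_shape
# for r in the repetition space and f in the pattern shape.

from itertools import product

def tile_indices(o, P, F, rep_space, pattern_shape, array_shape):
    """Return, for each repetition point r, the array indices of its pattern."""
    def points(shape):
        return list(product(*[range(s) for s in shape]))
    def mat_vec(M, v):
        return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]
    tiles = {}
    for r in points(rep_space):
        pav = mat_vec(P, r)                      # tile origin offset: P.r
        tiles[r] = [
            tuple((o[i] + pav[i] + mat_vec(F, f)[i]) % array_shape[i]
                  for i in range(len(o)))
            for f in points(pattern_shape)
        ]
    return tiles

# 1-D example: patterns of 2 consecutive elements, paved with stride 2
# over a 6-element array.
tiles = tile_indices(o=(0,), P=[(2,)], F=[(1,)],
                     rep_space=(3,), pattern_shape=(2,), array_shape=(6,))
print(tiles[(0,)])  # [(0,), (1,)]
print(tiles[(2,)])  # [(4,), (5,)]
```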

slide-17
SLIDE 17-20

Array-OL (cont’d)

Paving : how the array is tiled by patterns. The patterns can be computed in any order.
Paving example : o = ( 0 0 )ᵀ, P = ( 2 0 ; 0 3 )

[Figure, built up over four animation frames: the original point; the points obtained by iterating on the first paving vector; those obtained by iterating on the second vector; and those obtained by iterating on both vectors.]

Repetition space : [5,4], limitation of pattern repetitions.
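The paving iteration can be enumerated directly. A small Python sketch with the example's values (o = (0, 0), P = diag(2, 3), repetition space [5, 4]); each repetition point (i, j) yields the tile origin o + P·(i, j):

```python
# Enumerating the paving origins of the example:
# o = (0, 0), P = [[2, 0], [0, 3]], repetition space [5, 4].

from itertools import product

o = (0, 0)
P = [[2, 0], [0, 3]]
rep_space = (5, 4)

origins = [tuple(o[k] + P[k][0] * i + P[k][1] * j for k in range(2))
           for i, j in product(range(rep_space[0]), range(rep_space[1]))]

print(len(origins))   # 20 tiles
print(origins[0])     # (0, 0): the original point
print(origins[-1])    # (8, 9): both paving vectors iterated to the end
```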

slide-21
SLIDE 21-32

Array-OL (cont’d)

Fitting : how each pattern is filled by array elements.
Fitting example 1 : o = ( 0 ), F = ( 1 ) ; frame by frame, the pattern is filled with the five consecutive elements 1, 2, 3, 4, 5 of the ten-element Array 1.
Fitting example 2 : o = ( 0 ), F = ( 2 ) ; the pattern is filled with every second element of Array 2 (the frames highlight elements 2, 4, 6, 8, 10).
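The two fitting examples can be replayed in a few lines of Python. The pattern length (5) is inferred from the animation frames, not stated on the slides; likewise, reproducing the highlighted elements 2, 4, 6, 8, 10 of example 2 requires an origin of 1 when elements are numbered from 1:

```python
# Fitting: pattern element f reads array element o + F.f.
# Pattern length 5 is read off the animation frames (an assumption).

array = list(range(1, 11))          # the 10-element array 1..10

def fit(o, F, pattern_len):
    return [array[(o + F * f) % len(array)] for f in range(pattern_len)]

print(fit(o=0, F=1, pattern_len=5))  # [1, 2, 3, 4, 5]   (example 1)
print(fit(o=1, F=2, pattern_len=5))  # [2, 4, 6, 8, 10]  (example 2; o = 1
                                     # inferred from the highlighted elements)
```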

slide-33
SLIDE 33-40

Array-OL (cont’d)

An example of repetition task : array product

[Figure: A3 := DotProduct(A1, A2), repeated over the repetition space [2, 3].]

Number of instances : 2*3 = 6 ; the animation steps through the repetition points [0,0], [0,1], ..., [1,0], ..., one DotProduct instance per point.
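The repetition can be executed naively. This Python sketch assumes matrix-product tilers, where instance (i, j) reads row i of A1 and column j of A2; the figure suggests this, but the slides do not spell the tilers out:

```python
# The repetition task as 2*3 = 6 independent instances of DotProduct.
# Assumed tilers (not given in the text): instance (i, j) reads row i of A1
# and column j of A2, and writes element (i, j) of A3, i.e. A3 = A1 . A2.

from itertools import product

A1 = [[1, 2], [3, 4]]                      # 2 x 2
A2 = [[1, 0, 2], [0, 1, 3]]                # 2 x 3
A3 = [[0] * 3 for _ in range(2)]

rep_space = (2, 3)
for i, j in product(range(rep_space[0]), range(rep_space[1])):
    # one instance of the elementary task E = DotProduct
    A3[i][j] = sum(A1[i][k] * A2[k][j] for k in range(2))

print(A3)  # [[1, 2, 8], [3, 4, 18]]
```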

slide-41
SLIDE 41

Existing works

Alpha language [Mauras, 1989] (vs. Array-OL) :

◮ multidimensional data structures for data-intensive applications
◮ union of convex polyhedra (vs. arrays)
◮ data access through indices calculated by affine functions (vs. hierarchical and modular patterns)
◮ absence of modulo (vs. presence of modulo)

slide-42
SLIDE 42

Introduction
Gaspard methodology
General scheme
Data-intensive processing
Array-OL language
Existing works
Simple synchronous modeling of Gaspard models
Parallel model
Serialized model
Validation issues
Conclusions

slide-43
SLIDE 43

Synchronous modeling of Gaspard models

Illustration of the modeling by the following example :

[Figure: the repetition task of slide 15: R repeats E over the repetition space (3, 2); the input arrays (9, 8) and (11, 6) are tiled into input patterns (3, 4) and (2, 3), and each repetition stores an output pattern (1) into the output array (3, 2); each tiler carries its own o, P and F matrices.]

slide-44
SLIDE 44

Parallel model

Modeling of repetition task :

∀ j ∈ r,  A3[<ind_3^j>] := E(A1[<ind_1^j>], A2[<ind_2^j>])

◮ j : a point in the repetition space r
◮ <ind_i^j> : the set of indices associated with pattern j
◮ Ai[<ind_i^j>] : the pattern j associated with array Ai

slide-45
SLIDE 45

Parallel model (cont’d)

Decomposition of a repetition

Input tilers :  p_1^j := A1[<ind_1^j>]
                p_2^j := A2[<ind_2^j>]
Task :          p_3^j := E(p_1^j, p_2^j)
Output tiler :  A3[<ind_3^j>] := p_3^j

Introduction of local variables : p_1^j, p_2^j, p_3^j

slide-46
SLIDE 46

Parallel model (cont’d)

Decomposition of a repetition. A complete system of equations :

(| p_1^1 := A1[<ind_1^1>]
 | p_2^1 := A2[<ind_2^1>]
 | p_3^1 := E(p_1^1, p_2^1)
 | A3[<ind_3^1>] := p_3^1
 | ...
 | p_1^k := A1[<ind_1^k>]
 | p_2^k := A2[<ind_2^k>]
 | p_3^k := E(p_1^k, p_2^k)
 | A3[<ind_3^k>] := p_3^k
 |) where p_1^1, p_2^1, p_3^1, ..., p_1^k, p_2^k, p_3^k ; end ;     (1)
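A minimal executable reading of system (1): for every point j of the repetition space, extract the input patterns, apply E, insert the result; the k equation groups are independent, so any order gives the same result. The 1-D contiguous tilers and the elementwise sum standing in for E are hypothetical:

```python
# Sketch of the parallel model of a repetition: one extract/compute/insert
# equation group per repetition point j. Hypothetical 1-D tilers
# (stride = pattern length); E sums its two patterns elementwise.

def extract(a, j, size):            # input tiler: p_j := A[<ind_j>]
    return a[j * size:(j + 1) * size]

def insert(a, j, p):                # output tiler: A[<ind_j>] := p_j
    a[j * len(p):(j + 1) * len(p)] = p

def E(p1, p2):                      # the repeated elementary task
    return [x + y for x, y in zip(p1, p2)]

A1 = [1, 2, 3, 4, 5, 6]
A2 = [10, 20, 30, 40, 50, 60]
A3 = [0] * 6
k, size = 3, 2                      # k repetition points, patterns of 2

for j in range(k):                  # any order would do: the k groups of
    p1 = extract(A1, j, size)       # equations are independent
    p2 = extract(A2, j, size)
    insert(A3, j, E(p1, p2))

print(A3)  # [11, 22, 33, 44, 55, 66]
```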

slide-47
SLIDE 47

Parallel model (cont’d)

Restructuring and finalization of the model

[Figure: the system (1) drawn as a network: for each j = 1..k, the input tiler equations p_1^j = A1[<ind_1^j>] and p_2^j = A2[<ind_2^j>] feed one instance of E producing p_3^j, and the output tiler equations A3[<ind_3^j>] = p_3^j rebuild A3.]

◮ Commutativity and associativity of the composition operator

slide-48
SLIDE 48

Case study : video downscaling

Downscaler

[Figure: the Downscaler transforms a video flow (640, 480, ∞) into (320, 240, ∞); each (8, 8) input pattern yields a (4, 4) output pattern, over a repetition space (80, 60, ∞). Internally, Horizontal maps (8, 8) to (4, 8) by repeating Hfilter : (8) → (4), and Vertical maps (4, 8) to (4, 4) by repeating Vfilter : (8) → (4); each level has its own tilers with o, P and F matrices.]
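The hierarchy can be sketched on one (8, 8) block: Hfilter shrinks each row 8 → 4, then Vfilter shrinks each column 8 → 4. The actual filter coefficients are not given on the slide, so pairwise averaging stands in for both filters:

```python
# Downscaling one (8, 8) block to (4, 4): Hfilter on rows, Vfilter on
# columns. Pairwise averaging is a stand-in for the unspecified filters.

def hfilter(row):                   # (8) -> (4)
    return [(row[2 * i] + row[2 * i + 1]) / 2 for i in range(4)]

def vfilter(col):                   # (8) -> (4)
    return [(col[2 * i] + col[2 * i + 1]) / 2 for i in range(4)]

def downscale_block(block):         # (8, 8) -> (4, 4)
    horizontal = [hfilter(row) for row in block]      # 8 rows of 4
    columns = list(zip(*horizontal))                  # 4 columns of 8
    vertical = [vfilter(list(c)) for c in columns]    # 4 columns of 4
    return [list(r) for r in zip(*vertical)]          # back to rows

block = [[float(8 * y + x) for x in range(8)] for y in range(8)]
out = downscale_block(block)
print(len(out), len(out[0]))  # 4 4
```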

slide-49
SLIDE 49

Generated Signal code

module Downscaler_module =
  process DOWNSCALER = (? type_array_i A_i; ! type_array_o A_o;)
    (| (P_i1,...,P_iN) := HV_TILER_i(A_i)
     | (P_o1,...,P_oN) := R_HV_FILTER(P_i1,...)
     | A_o := HV_TILER_o(P_o1,...)
     |)
    where
      type_pattern_i P_i1,... ;
      type_pattern_o P_o1,... ;
      process HV_TILER_i = (? type_array_i A_i; ! type_pattern_i P_i1,... ;)
        (| P_i1 := HV_PATTERN_i1(A_i) | ... |)
        where
          process HV_PATTERN_i1 = ...
        end %HV_TILER_i% ;
      process R_HV_FILTER = (? type_pattern_i P_i1,... ; ! type_pattern_o P_o1,... ;)
        (| P_o1 := HV_FILTER(P_i1) | ... |)
        where
          process HV_FILTER = (? type_pattern_i P_i; ! type_pattern_o P_o;)
            (| p := H_FILTER(P_i) | P_o := V_FILTER(p) |)
            where
              type_pattern_l p;
              process H_FILTER = ...
              process V_FILTER = ...
            end %HV_FILTER% ;

slide-50
SLIDE 50

Serialized model

◮ Simple parallel model
  ◮ semantically equivalent
  ◮ naive enumeration
◮ Association of the application with the architecture : from repetition to iteration, introduction of flows
◮ Sequentialization at different granularity degrees [Labbani 2006]

slide-51
SLIDE 51

Serialized model

From repetition to iteration : introduction of flows

[Figure: at instants t1, t2, t3, ..., Array-to-Flow nodes turn A1 and A2 into the pattern flows p_1^1 ... p_1^k and p_2^1 ... p_2^k; E consumes them and produces the flow p_3^1 ... p_3^k; a Flow-to-Array node rebuilds A3.]

◮ Array to flow : produces pattern flows from arrays
◮ Flow to array : produces arrays from pattern flows

slide-52
SLIDE 52

Serialized model

Array to flow

[Figure: a memory (Mem) holds the array a; a sequencer (Seq), driven by an oversampled clock, enumerates the indices index_1, index_2, ..., index_n; an Extract node emits the pattern flow p.]

Main components : clock oversampling, sequencer and extraction

slide-53
SLIDE 53

Serialized model

Flow to array

[Figure: a sequencer (Seq), driven by an undersampled clock, enumerates the indices index_1, index_2, ..., index_n; an Insert node stores the pattern flow p into the memory (Mem) holding the array a.]

Main components : clock undersampling, sequencer and insertion
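The two serialization nodes can be sketched as a producer and a consumer of a pattern flow. The Mem/Seq/Extract/Insert decomposition is collapsed into plain Python here, and the 1-D contiguous tilers are hypothetical:

```python
# Serialization sketch: an array-to-flow node emits one pattern per logical
# instant; a flow-to-array node collects k patterns back into an array.

def array_to_flow(a, k, size):
    """Sequencer + Extract: yield pattern j of array a at instant j."""
    for j in range(k):                       # the sequencer enumerates indices
        yield a[j * size:(j + 1) * size]     # Extract reads pattern j

def flow_to_array(flow, k, size):
    """Sequencer + Insert: rebuild the array from k patterns."""
    a = [0] * (k * size)
    for j, p in enumerate(flow):             # one write per pattern
        a[j * size:(j + 1) * size] = p       # Insert stores pattern j
    return a

A1 = [1, 2, 3, 4, 5, 6]
doubled = ([2 * x for x in p] for p in array_to_flow(A1, k=3, size=2))
result = flow_to_array(doubled, k=3, size=2)  # repeated task applied on a flow
print(result)  # [2, 4, 6, 8, 10, 12]
```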

slide-54
SLIDE 54

Introduction
Gaspard methodology
General scheme
Data-intensive processing
Array-OL language
Existing works
Simple synchronous modeling of Gaspard models
Parallel model
Serialized model
Validation issues
Conclusions

slide-55
SLIDE 55

Validation issues

◮ We have a synchronous model with parallel and serialized versions that can be combined (mixed model)
◮ We want to use synchronous analysis tools to address design correctness issues
  • e.g. N-synchronous Kahn networks [Cohen et al. 2006], clock calculus, model checking
◮ Example : a simple application with affine clocks, synchronizability analysis [Smarandache et al. 1999]

slide-56
SLIDE 56

General scheme

[Figure: the general scheme of slide 6, recalled: Gaspard2 models (Array-OL metamodel) are transformed into the intermediate synchronous-equation modeling language, then into Lustre, Lucid Synchrone and Signal (simulation, verification, clock calculus) and TLM/RTL flows (codesign analysis, simulation).]

slide-57
SLIDE 57

Synchronizability analysis

Camera functionality in a cell phone

[Figure: CMOS sensor → Downscaler → TFT display, with clocks c_p, c_a and c_i at the interfaces.]
slide-60
SLIDE 60

Synchronizability analysis

Clock constraints :

[Figure: the flows on clocks c_p, c_a, c_i.]

1. ca is an affine undersampling of cp : cp --(1,φ1,d1)--> ca ;
2. ci is an affine undersampling of ca : ca --(1,φ2,d2)--> ci.

Now, let us consider a given external constraint, which imposes a particular image production rate c′i from cp such that : cp --(1,φ3,d3)--> c′i. What about the synchronizability of c′i and ci ?

c′i and ci are synchronizable  ⇔  φ1 + d1·φ2 = φ3  and  d1·d2 = d3     (2)
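Equation (2) can be checked by brute force: enumerate the instants of cp, apply the two undersamplings, and compare with the direct (1, φ3, d3) undersampling. Reading (1, φ, d) as "keep instants φ, φ + d, φ + 2d, ..." is a simplification of the affine clock relation, used here only to illustrate the criterion:

```python
# Composing cp -(1,phi1,d1)-> ca -(1,phi2,d2)-> ci places the instants of ci
# at positions phi1 + d1*phi2 + k*(d1*d2) of cp, so ci is synchronizable
# with c'i = (1, phi3, d3) from cp iff phi1 + d1*phi2 == phi3 and
# d1*d2 == d3, which is equation (2).

def undersample(instants, phi, d):
    """Keep instants at positions phi, phi + d, phi + 2d, ... (n = 1)."""
    return instants[phi::d]

def synchronizable(phi1, d1, phi2, d2, phi3, d3):
    return phi1 + d1 * phi2 == phi3 and d1 * d2 == d3

cp = list(range(100))                 # instants of the fastest clock
phi1, d1 = 1, 2
phi2, d2 = 2, 3
phi3, d3 = 5, 6                       # satisfies (2): 1 + 2*2 = 5, 2*3 = 6

ca = undersample(cp, phi1, d1)
ci = undersample(ca, phi2, d2)
ci_prime = undersample(cp, phi3, d3)

print(ci == ci_prime)                                # True
print(synchronizable(phi1, d1, phi2, d2, phi3, d3))  # True
print(synchronizable(phi1, d1, phi2, d2, 4, 6))      # False
```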

slide-61
SLIDE 61

Conclusions and perspectives

Current results :

◮ Synchronous modeling of Gaspard specifications
◮ Analysis of Gaspard applications with the help of synchronous techniques
◮ Implementation of the modeling approach following MDE

In the future :

◮ Complete implementation and validation of current results
◮ Extension with control features : mode automata
◮ Using mixed models (parallel/serialized) combined with the task fusion technique for placements in time and space