Data-flow Analysis for Interrupt-driven Microcontroller Software
Nathan Cooprider
Advisor: John Regehr
Dissertation defense
School of Computing, University of Utah

Data-flow Analysis for Interrupt-driven Microcontroller Software
• A whole-program analysis
• Targeting embedded C programs
• Suitable for use in a compiler

Microcontrollers (MCUs)
• 10 billion units / year
• $12.5 billion market in 2006
• Cheap
• Resource constrained
• E.g., wireless sensor networks
  – Mica2 mote: ATmega128L (4 MHz 8-bit MCU), 128 kB code, 4 kB data SRAM

Problem
• Resources are constrained
• Software outlives hardware
  – Code reuse leads to bloat
• Low-level code confuses analysis
  – Interrupt-driven concurrency
  – Device register access

Solution (thesis statement)
• Traditional data-flow analysis
  – Not precise enough for MCU software
• New techniques to increase precision
  – Deal with concurrency
  – Track volatile data
• Use in code transformations
  – Optimizations

Contributions
• Analysis techniques
  – Interatomic concurrent data-flow (ICD)
  – Tracking data through volatile variables
• Tool
  – cXprop
• Applications
  – Practical memory safety (Safe TinyOS)
  – Offline RAM compression

TinyOS
• Open-source OS for WSNs
• Written in nesC
  – Dialect of C
• Concurrency
  – Tasks and interrupts
  – No threads
  – Atomic sections
[Figure: timeline showing main, a task, and interrupts]

[Roadmap diagram with the labels: cXprop, ICD, volatile tracking, abstract interpretation, conditional X propagation, pointer analysis, Safe TinyOS, RAM compression]

Abstract interpretation
• Abstract domain
  – Abstract values
  – Form poset
    • Subset relation (⊆)
  – Lattice
    • Undefined (⊤)
    • Unknown (⊥)

Example code, with x = {42, 7, -1} on entry to the three cases:

    switch (x) {
      ...
      break;
    case 42:
    case 7:
    case -1:
      if (x < 0)
        x *= -1;
      x++;
      if (x == 0)
        assert(0);
      break;
      ...
    }

[Figure: value-set lattice with {} or ⊤ at the top; then {42}, {7}, {-1}; then {42, 7}, {7, -1}, {42, -1}; and {42, 7, -1} or ⊥ at the bottom]

Abstract interpretation
[Code example repeated from the previous slide, with x = {42, 7, -1} on entry to the three cases]
• Abstract domain
  – Abstract values
  – Form poset
    • Subset relation (⊆)
  – Lattice
    • Undefined (⊤)
    • Unknown (⊥)
• Data-flow analysis
  – Transfer functions
  – Merging
  – Fixed point

Abstract interpretation: the example annotated with abstract values
  On entry to the cases:    x = {42, 7, -1}
  x < 0, true branch:       x = {-1}
  x < 0, false branch:      x = {42, 7}
  After x *= -1:            x = {1}
  After merging branches:   x = {42, 7, 1}
  After x++:                x = {43, 8, 2}
  x == 0, true branch:      ⊤ (unreachable, so assert(0) is dead code)
  x == 0, false branch:     x = {43, 8, 2}
Program points not yet reached by the analysis hold ⊤.
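The annotated example can be replayed with a small value-set domain. The following is a minimal sketch in C, not cXprop's implementation: the type ValueSet, the capacity VS_MAX, and all function names are hypothetical, and an empty set stands in for the unreachable (⊤) element. A real analysis would fall back to "unknown" (⊥) when a set outgrows its capacity; this sketch simply assumes the sets stay small.

```c
#include <stdbool.h>
#include <stddef.h>

#define VS_MAX 8  /* hypothetical capacity; a real domain would widen to "unknown" beyond it */

typedef struct {
    int vals[VS_MAX];
    size_t n;        /* n == 0 models the undefined/unreachable element */
} ValueSet;

bool vs_contains(const ValueSet *s, int v) {
    for (size_t i = 0; i < s->n; i++)
        if (s->vals[i] == v) return true;
    return false;
}

/* Insert v if absent; silently ignores overflow (assumption: sets stay small). */
void vs_add(ValueSet *s, int v) {
    if (!vs_contains(s, v) && s->n < VS_MAX)
        s->vals[s->n++] = v;
}

/* Merge at a control-flow join: set union. */
ValueSet vs_merge(ValueSet a, ValueSet b) {
    for (size_t i = 0; i < b.n; i++) vs_add(&a, b.vals[i]);
    return a;
}

/* Branch refinement for "x < 0": keep values whose sign test equals want_negative. */
ValueSet vs_filter_negative(ValueSet s, bool want_negative) {
    ValueSet r = { {0}, 0 };
    for (size_t i = 0; i < s.n; i++)
        if ((s.vals[i] < 0) == want_negative)
            vs_add(&r, s.vals[i]);
    return r;
}

/* Transfer function for x = x * k + c, applied to every member. */
ValueSet vs_affine(ValueSet s, int k, int c) {
    ValueSet r = { {0}, 0 };
    for (size_t i = 0; i < s.n; i++) vs_add(&r, s.vals[i] * k + c);
    return r;
}

/* Replays the slide: if (x < 0) x *= -1; x++; returns the set before "x == 0". */
ValueSet analyze_slide_example(void) {
    ValueSet x = { {42, 7, -1}, 3 };
    ValueSet neg = vs_affine(vs_filter_negative(x, true), -1, 0);  /* {1} */
    ValueSet nonneg = vs_filter_negative(x, false);                /* {42, 7} */
    x = vs_merge(nonneg, neg);                                     /* {42, 7, 1} */
    return vs_affine(x, 1, 1);                                     /* {43, 8, 2} */
}
```

Because 0 is not in the final set, the true branch of x == 0 is unreachable and the assert(0) can be removed as dead code.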

Interrupt-driven concurrency
• Problems
  – C statements not necessarily atomic

    x = 0x4242;

    compiles to:

    ldi r24, 0x42
      <- an interrupt can arrive here
    ldi r25, 0x42
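The torn update can be imitated on a desktop machine. This is a sketch under stated assumptions rather than real interrupt code: the Word16 type, the Isr callback, and every function name are hypothetical, and the "interrupt" is an ordinary function call placed between the two byte stores.

```c
#include <stdint.h>
#include <stddef.h>

/* A 16-bit variable held as two bytes, as on an 8-bit MCU. */
typedef struct { uint8_t lo, hi; } Word16;

uint16_t word16_read(const Word16 *w) {
    return (uint16_t)(w->lo | ((uint16_t)w->hi << 8));
}

/* Stand-in for an interrupt handler that observes the shared variable. */
typedef void (*Isr)(const Word16 *shared, uint16_t *observed);

void snapshot_isr(const Word16 *shared, uint16_t *observed) {
    *observed = word16_read(shared);
}

/* Non-atomic store: the simulated interrupt runs between the two byte writes. */
void word16_write_nonatomic(Word16 *w, uint16_t v, Isr isr, uint16_t *observed) {
    w->lo = (uint8_t)(v & 0xFFu);
    if (isr != NULL) isr(w, observed);
    w->hi = (uint8_t)(v >> 8);
}

/* Returns the value the simulated handler saw while x = 0x4242 was in progress. */
uint16_t torn_read_demo(void) {
    Word16 x = { 0, 0 };
    uint16_t seen = 0;
    word16_write_nonatomic(&x, 0x4242, snapshot_isr, &seen);
    return seen;
}
```

The handler sees 0x0042, a value the C program never assigned, which is why an analysis has to treat such unprotected accesses conservatively.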

Interrupt-driven concurrency
• Problems
  – C statements not necessarily atomic
  – Preempts sequential control flow
    • Complicated control flow
• Synchronization
  – One flow does not "break" another
  – Bad synchronization happens (a race)
    • Difficult or impossible to reason about
    • Must deal with conservatively (⊥)

Related work
• Thread-based concurrency
  – M. B. Dwyer, L. A. Clarke, J. M. Cobleigh, and G. Naumovich. Flow analysis for verifying properties of concurrent software systems. TOSEM 2004.
  – M. C. Rinard. Analysis of multithreaded programs. SAS 2001.
• Leveraging race detection
  – R. Chugh, J. W. Voung, R. Jhala, and S. Lerner. Dataflow analysis for concurrent programs using datarace detection. PLDI 2008.
• Formal semantics
  – X. Feng, Z. Shao, Y. Dong, and Y. Guo. Certifying low-level programs with hardware interrupts and preemptive threads. PLDI 2008.

Race detection
• Lockset analysis: standard technique
  – Lock status = interrupt enable bit status
  – Only one lock, so no lock aliasing
  – nesC uses lexical nesting
• Data classification
  – Unshared: accessed only from main
  – Shared: accessed from interrupts

Race detection
Conditions shown for a RACE:
  o Accessed without locking
  o Written in shared or unlocked unshared code
  o Accessed in shared code
• Data classification
  – Unshared: accessed only from main
  – Shared: accessed from interrupts
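The classification can be written down as a small predicate. The sketch below assumes that the three conditions on the slide combine conjunctively (a variable races only when all three hold), which the flattened slide text does not state outright. The AccessSummary type, its field names, and classify_data are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-variable summary; fields mirror the conditions on the slide. */
typedef struct {
    bool accessed_in_shared_code;       /* read or written from an interrupt handler */
    bool accessed_without_locking;      /* some access happens with interrupts enabled */
    bool written_in_shared_code;        /* written from an interrupt handler */
    bool written_in_unlocked_unshared;  /* written from main code outside an atomic section */
} AccessSummary;

typedef enum { DATA_UNSHARED, DATA_SHARED_NOT_RACING, DATA_RACING } DataClass;

/* Assumption: all three slide conditions must hold for a variable to be racing. */
DataClass classify_data(AccessSummary s) {
    if (!s.accessed_in_shared_code)
        return DATA_UNSHARED;           /* only main touches it: sequential analysis applies */
    bool written = s.written_in_shared_code || s.written_in_unlocked_unshared;
    if (s.accessed_without_locking && written)
        return DATA_RACING;             /* must be treated conservatively */
    return DATA_SHARED_NOT_RACING;      /* shared, but every conflicting access is protected */
}
```

Under this reading, a variable that interrupts touch but that is only ever accessed inside atomic sections stays analyzable, which is the case the concurrent analysis is meant to exploit.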

Race detection case analysis
[Figure: cases pairing a write, read, or use in an interrupt with a write, read, or access in an interrupt or task, inside or outside an atomic section; each case is marked racing or not racing]

Data classification
[Figure: data is divided into heap, static (global), and stack; static data is either shared or unshared (50%); shared data is either racing (6%) or not racing (44%). The figure also carries the labels "Concurrent", "Sequential", and ⊥.]

Interatomic Concurrent Data-flow
Published at LCTES 2006
[Figure: atomic interleaving of main and an interrupt, each containing an atomic section]

Volatile
• C type qualifier
  – volatile int
• Special case of C's memory model
  – Read value may change "randomly"
  – Write may affect system state
• E.g., racing data, device registers
• Behavior opaque at C level
• Prevents compiler optimizations

Tracking volatile RAM
• Locate variables backed by RAM
• Introduce concurrency information
  – Interatomic concurrent data-flow
• Have sound approximation of mutators
  – Behavior not opaque at system level
• Safely analyze volatile variables in RAM

Tracking volatile device registers
• Hardware registers
  – Memory-mapped I/O
  – Hardware not actually random (volatile)
• Can track using MCU-specific information
  – OK to track individual bits
    • Instead of whole register
    • Interrupt bit of status register
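Per-bit tracking can be pictured with a known/value bit-mask pair. This is an illustrative sketch rather than cXprop's bitwise domain: the BitVal type and the function names are hypothetical. The one hardware fact used is that the global interrupt enable flag is bit 7 of the AVR status register.

```c
#include <stdint.h>

#define SREG_I_BIT 7u  /* global interrupt enable flag in the AVR status register */

/* Abstract value for an 8-bit register: each bit is known-0, known-1, or unknown. */
typedef struct {
    uint8_t known;  /* bit i set: the value of bit i is known */
    uint8_t value;  /* bit i is meaningful only where known has bit i set */
} BitVal;

BitVal bits_unknown(void) { BitVal r = { 0, 0 }; return r; }

/* Transfer function for setting one bit (e.g. enabling interrupts). */
BitVal bits_set(BitVal r, unsigned bit) {
    uint8_t m = (uint8_t)(1u << bit);
    r.known |= m;
    r.value |= m;
    return r;
}

/* Transfer function for clearing one bit (e.g. disabling interrupts). */
BitVal bits_clear(BitVal r, unsigned bit) {
    uint8_t m = (uint8_t)(1u << bit);
    r.known |= m;
    r.value = (uint8_t)(r.value & (uint8_t)~m);
    return r;
}

/* Merge at a join: a bit stays known only if both sides know it and agree. */
BitVal bits_merge(BitVal a, BitVal b) {
    uint8_t agree = (uint8_t)(a.known & b.known & (uint8_t)~(a.value ^ b.value));
    BitVal r = { agree, (uint8_t)(a.value & agree) };
    return r;
}

/* Returns 1 if the bit is known set, 0 if known clear, -1 if unknown. */
int bits_query(BitVal r, unsigned bit) {
    uint8_t m = (uint8_t)(1u << bit);
    if (!(r.known & m)) return -1;
    return (r.value & m) ? 1 : 0;
}
```

Tracking only the interrupt bit lets an analysis know whether it is inside an atomic section even though the rest of the status register changes after almost every instruction.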

Pointer analysis
• Points-to sets
  – Must and may alias
  – Two pluggable domains
  – Subtleties from context-insensitivity
• Targets:
  – Device registers
  – Scalars
  – Structs
  – Arrays
  – not-NULL
  – Heap

Conditional X propagation
• Pluggable abstract domains
  – From conditional constant propagation
• Clean domain interface
  – Transfer functions
  – Utility functions
[Figure: blocks labeled "Abstract interpretation", "Abstract domain", and "Analysis"]

Domains
• Constant
• Bitwise
• Interval
• Value set

Implemented as a CIL extension
[Figure: processing stages; the order is partly lost in extraction]
• Struct splitter
• Inliner
• Cleaner
• Fixed point computation
  – Value-flow
  – Pointer-flow
  – ICD
  – Volatile tracking
• Transformations
  – Constant propagation
  – Dead code elimination
  – Dead data elimination
• Cleaner

Suppose we have a WSN…

Suppose we have a WSN…
• What happened?
  – State got corrupted: array out-of-bounds (a memory safety error)
  – Hard to debug
    • Limited visibility into executing systems
    • Difficult to replicate complex bugs
• Memory safety can
  – Catch all pointer and array bounds errors
    • Before they corrupt state
  – Provide a choice of recovery action
    • Display error message or reboot

Safe TinyOS
Published at SenSys 2007
Expand Deputy (an existing solution for making C safe) into system safety
• Modify TinyOS to work with Deputy
• Enforce Deputy's safety model under concurrency
• Reduce overhead (cXprop)

Safe TinyOS toolchain
[Figure: toolchain from TinyOS code to a Safe TinyOS app, with these stages]
• Annotate TinyOS code to get Safe TinyOS code, e.g. int post(val_t* buf, int n); becomes int post(val_t* COUNT(n) buf, int n);
• Run modified nesC compiler
• Enforce safety using Deputy
• Deal with concurrency (cXprop)
• Compress error messages
• Whole-program optimization (cXprop)
Goals addressed:
• Modify TinyOS to work with Deputy
• Enforce Deputy's safety model under concurrency
• Reduce overhead

Concurrency
• Deputy enforces safety in sequential code
• cXprop avoids extraneous protection
  – Only racing variables need protection
[Figure: an interrupt can arrive between a Deputy check and the potentially unsafe read it guards; the protected version reads the variable to a local inside an atomic block, then checks and uses the local]

Code size

Code size
[Figure: code size chart for Safe TinyOS, labeled 35%, 13%, and -11%]

A closer look at RAM usage
• On-chip RAM for MCUs expensive
  – Kilobytes, not megabytes or gigabytes
  – Data in SRAM: 6 transistors / bit
  – SRAM can dominate power consumption of a sleeping chip

A closer look at RAM usage
• On-chip RAM for MCUs expensive
  – Kilobytes, not megabytes or gigabytes
  – Data in SRAM: 6 transistors / bit
  – SRAM can dominate power consumption of a sleeping chip
On-chip RAM is persistently scarce in tiny MCU-based systems
• Is RAM used efficiently?
  – Performed value profiling for MCU apps
    • Apps already heavily tuned for RAM usage
  – Result: average byte stores four values!

Offline RAM compression
Published at PLDI 2007
• Automated sub-word packing for statically allocated scalars, pointers, structs, arrays
  – No heap on targeted MCUs
  – Trades ROM and CPU cycles for RAM

Method
x ≝ variable that occupies n bits
V_x ≝ conservative estimate of value set
log2|V_x| < n ⇒ RAM compression possible
C_x ≝ another set such that |C_x| = |V_x|
f_x ≝ bijection between V_x and C_x
n - log2|C_x| ⇒ bits saved through compression of x
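The arithmetic in the method can be checked with two small helpers. This is a sketch with hypothetical function names. It rounds log2 up to a whole number of bits, which the slide leaves implicit, and reports zero savings when the value set is too large to help.

```c
#include <limits.h>
#include <stddef.h>

/* Smallest number of bits that can index `cardinality` distinct values,
 * i.e. ceil(log2(cardinality)); 0 for sets of size 0 or 1. */
unsigned bits_needed(size_t cardinality) {
    unsigned bits = 0;
    while (bits < sizeof(size_t) * CHAR_BIT && ((size_t)1 << bits) < cardinality)
        bits++;
    return bits;
}

/* Bits saved by compressing an n-bit variable whose value set has the given size. */
unsigned bits_saved(unsigned n, size_t value_set_size) {
    unsigned compressed = bits_needed(value_set_size);
    return compressed < n ? n - compressed : 0;
}
```

For a 16-bit variable with four possible values, bits_saved(16, 4) is 14, so an array of eight such variables saves 112 bits.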

Example Compression

    void (*function_queue[8])(void);

Example Compression

    void (*function_queue[8])(void);

x = one element of function_queue
n = size of a function pointer = 16 bits

Example Compression
V_x for x = { &function_A, &function_B, &function_C, NULL }

Example Compression
n = 16 bits
|V_x| = 4
log2|V_x| < n, since 2 < 16

Example Compression
C_x = {0, 1, 2, 3}
f_x ≝ V_x to C_x ≝ compression
f_x^-1 ≝ C_x to V_x ≝ decompression

Example Compression
x holds a member of C_x = {0, 1, 2, 3} in RAM
V_x = { &function_A, &function_B, &function_C, NULL } is stored as a table in ROM
f_x ≝ compression: table scan
f_x^-1 ≝ decompression: table lookup

Example Compression
• 128 bits reduced to 16 bits
• 112 bits of RAM saved
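The whole example fits in a few lines of C. This is a hand-written sketch of the idea, not the code the compression tool generates: the names task_fn, value_table, packed_queue, fq_compress, fq_decompress, queue_store, and queue_load are hypothetical. One deliberate deviation from the slide is that NULL is placed at index 0, so that a zero-initialized queue decodes to all NULL entries.

```c
#include <stdint.h>
#include <stddef.h>

typedef void (*task_fn)(void);

void function_A(void) {}
void function_B(void) {}
void function_C(void) {}

/* V_x: a constant table, which would be placed in ROM (flash) on the MCU. */
static task_fn const value_table[4] = { NULL, function_A, function_B, function_C };

/* The compressed queue: eight 2-bit indices packed into 16 bits of RAM. */
static uint16_t packed_queue;

/* f_x: compression by scanning the table for the pointer. */
uint8_t fq_compress(task_fn f) {
    for (uint8_t i = 0; i < 4; i++)
        if (value_table[i] == f) return i;
    return 0;  /* not reached if V_x is a sound estimate of the value set */
}

/* f_x^-1: decompression by table lookup. */
task_fn fq_decompress(uint8_t index) {
    return value_table[index & 3u];
}

/* Replaces the store function_queue[slot] = f. */
void queue_store(unsigned slot, task_fn f) {
    unsigned shift = 2u * (slot & 7u);
    unsigned cleared = packed_queue & ~(3u << shift);
    packed_queue = (uint16_t)(cleared | ((unsigned)fq_compress(f) << shift));
}

/* Replaces the load function_queue[slot]. */
task_fn queue_load(unsigned slot) {
    unsigned shift = 2u * (slot & 7u);
    return fq_decompress((uint8_t)((packed_queue >> shift) & 3u));
}
```

Every store now pays for a table scan and every load for a shift, a mask, and a table lookup, which is the ROM and CPU cost traded for the 112 bits of RAM.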

RAM compression results

RAM compression results
• cXprop (no compression)
  – 10% RAM reduction
  – 20% ROM reduction
  – 5.9% duty cycle reduction
• Compression
  – 22% RAM reduction
  – 3.6% ROM reduction
  – 29% duty cycle increase
[Figure label: Tradeoffs]

Conclusion
• Interatomic concurrent data-flow
• Volatile data may be tracked
• Better analysis leads to more optimizations
  – Safe TinyOS: practical memory safety
  – RAM compression: 22% RAM reduction
http://www.cs.utah.edu/~coop/research/cxprop/
http://www.cs.utah.edu/~coop/safetinyos/
http://www.cs.utah.edu/~coop/research/ccomp/
Thank you

Cost/Benefit Ratio
[Equation image not preserved in the extracted text]
C ≝ access profile
A, B ≝ platform-specific costs
V ≝ cardinality of value set
S_u ≝ original size
S_c ≝ compressed size

Turning the RAM Knob
[Figure sequence: one chart per setting, at 0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, and 95%]

Future work
• Triggering and sequencing
  [Figure: a timer interrupt handler fires and triggers sensing; a data-ready interrupt handler fires once the data is available]
• Caching compressed values
  [Figure: read x, decompress x]

More related work
• Safe TinyOS
  – R. K. Rengaswamy, E. Kohler, and M. Srivastava. Software-based memory protection in sensor nodes. EmNets 2006.
  – B. L. Titzer. Virgil: Objects on the head of a pin. OOPSLA 2006.
  – S. Kowshik, D. Dhurjati, and V. Adve. Ensuring code safety without runtime checks for real-time control systems. CASES 2002.
• Offline RAM compression
  – Y. Zhang and R. Gupta. Compressing heap data for improved memory performance. Software: Practice and Experience 2006.
  – L. S. Bai, L. Yang, and R. P. Dick. Automated compile-time and run-time techniques to increase usable memory in MMU-less embedded systems. CASES 2006.

PAG
• Program Analysis Generator
  – Domain-specific language input describes
    • Domain lattice
    • Transfer functions
    • Language-describing grammar
    • Fixed point solution method
  – Data-flow analyzer as output
• Does not deal with concurrency
• Used to evaluate fixed point solutions

Feature comparison
[Figure: chart labeled 12% and 5.5%]

Domain comparison

Resource reduction
[Figure: chart labeled 12%, 8.3%, 2.5%, and 1.8%]

Context insensitivity
a is a global variable

    foo:          int x = 7; bar(&x);    a = {27}; x = {7}, then {7, 42}
    bar(int *y):  goo(y);                a = {27}; y = {&x}
    goo(int *z):  *z = 42; a = *z;       a = {27}, then {7, 27, 42}; z = {&x}

Benchmark descriptions
• AVR ATmega128 code
• TinyOS
• 3,000-26,000 lines of C code
• Analysis times: seconds to an hour
• Metrics
  – Duty cycle
    • % of time processor is on
    • Obtained from Avrora, a cycle-accurate simulator for WSNs
  – Code size and data size

Wireless sensor networks
• 10 billion units / year
• $12.5 billion market in 2006
• Cheap
• Resource constrained
• E.g., wireless sensor networks
  – Mica2 mote: ATmega128L (4 MHz 8-bit MCU), 128 KB code, 4 KB data SRAM