- Number of slides: 74
Data-flow Analysis for Interrupt-driven Microcontroller Software • Nathan Cooprider • Advisor: John Regehr • Dissertation defense • School of Computing, University of Utah
Data-flow Analysis for Interrupt-driven Microcontroller Software • A whole program analysis • Targeting embedded C programs • Suitable for use in a compiler 2
Microcontrollers (MCUs) • 10 billion units / year • $12.5 billion market in 2006 • Cheap • Resource constrained • e.g. Wireless sensor networks – Mica2 mote: ATmega128L (4 MHz 8-bit MCU), 128 kB code, 4 kB data SRAM 3
Problem • Resources are constrained • Software outlives hardware – Code reuse leads to bloat • Low-level code confuses analysis – Interrupt-driven concurrency – Device register access 4
Solution • Traditional data-flow analysis – Not adequate precision for MCU software • New techniques to increase precision – Deal with concurrency – Track volatile data • Use in code transformations Thesis statement – Optimizations 5
Contributions • Analysis techniques – Interatomic concurrent data-flow (ICD) – Tracking data through volatile variables • Tool – cXprop • Applications – Practical memory safety – Safe TinyOS – Offline RAM Compression 6
TinyOS • Open-source OS for WSNs • Written in nesC – Dialect of C • Concurrency – Tasks and interrupts – No threads – Atomic sections [Figure labels: main, task, interrupt] 7
[Roadmap diagram: cXprop. Labels: abstract interpretation, conditional X propagation, pointer analysis, ICD, volatile tracking, Safe TinyOS, RAM compression] 8
Abstract interpretation • Abstract domain – Abstract values – Form poset • Subset relation (⊆) – Lattice • Undefined (⊤) • Unknown (⊥) • Example code: switch (x) { ... break; case 42: case 7: case -1: if (x < 0) x *= -1; x++; if (x == 0) assert(0); break; ... } with x = {42, 7, -1} on entry to the cases • Lattice, top to bottom: {} or ⊤; {42}, {7}, {-1}; {42, 7}, {7, -1}, {42, -1}; {42, 7, -1} or ⊥ 9
Abstract interpretation • Abstract domain – Abstract values – Form poset • Subset relation (⊆) – Lattice • Undefined (⊤) • Unknown (⊥) • Data-flow analysis – Transfer functions – Merging – Fixed point • Example code: switch (x) { ... break; case 42: case 7: case -1: if (x < 0) x *= -1; x++; if (x == 0) assert(0); break; ... } with x = {42, 7, -1} on entry to the cases 10
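Abstract interpretation • Abstract domain – Abstract values – Form poset • Subset relation (⊆) – Lattice • Undefined (⊤) • Unknown (⊥) • Data-flow analysis – Transfer functions – Merging – Fixed point • Worked example: every program point starts at ⊤; on entry x = {42, 7, -1}; the test x < 0 splits this into {-1} (true) and {42, 7} (false); x *= -1 maps {-1} to {1}; merging gives {42, 7, 1}; x++ gives {43, 8, 2}; the test x == 0 is never true, so assert(0) stays at ⊤ and x = {43, 8, 2} afterwards 11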
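The walk-through above can be sketched in C as a tiny value-set domain. This is only an illustration under stated assumptions, not cXprop's implementation: the type `valset`, the capacity `VS_MAX`, and every function name below are hypothetical, and a real analysis would widen to "unknown" instead of asserting when a set overflows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VS_MAX 8 /* hypothetical capacity of one value set */

/* Abstract value of one variable: a small set of ints.
   n == 0 models "undefined" (no value reaches this point). */
typedef struct {
    int vals[VS_MAX];
    size_t n;
} valset;

bool vs_contains(const valset *s, int v) {
    for (size_t i = 0; i < s->n; i++)
        if (s->vals[i] == v) return true;
    return false;
}

void vs_add(valset *s, int v) {
    if (vs_contains(s, v)) return;
    assert(s->n < VS_MAX); /* a real analysis would fall back to "unknown" */
    s->vals[s->n++] = v;
}

/* Merge at a control-flow join: set union. */
valset vs_merge(const valset *a, const valset *b) {
    valset r = *a;
    for (size_t i = 0; i < b->n; i++) vs_add(&r, b->vals[i]);
    return r;
}

/* Branch refinement for the test (x < 0): keep values whose outcome is `want`. */
valset vs_filter_lt0(const valset *s, bool want) {
    valset r = { .n = 0 };
    for (size_t i = 0; i < s->n; i++)
        if ((s->vals[i] < 0) == want) vs_add(&r, s->vals[i]);
    return r;
}

/* Branch refinement for the test (x == 0). */
valset vs_filter_eq0(const valset *s, bool want) {
    valset r = { .n = 0 };
    for (size_t i = 0; i < s->n; i++)
        if ((s->vals[i] == 0) == want) vs_add(&r, s->vals[i]);
    return r;
}

/* Transfer function for x *= -1. */
valset vs_negate(const valset *s) {
    valset r = { .n = 0 };
    for (size_t i = 0; i < s->n; i++) vs_add(&r, -s->vals[i]);
    return r;
}

/* Transfer function for x++. */
valset vs_incr(const valset *s) {
    valset r = { .n = 0 };
    for (size_t i = 0; i < s->n; i++) vs_add(&r, s->vals[i] + 1);
    return r;
}

/* Abstractly executes: if (x < 0) x *= -1; x++; if (x == 0) assert(0);
   Returns x after the fragment; *at_assert receives x at the assert. */
valset analyze_example(valset x, valset *at_assert) {
    valset neg = vs_filter_lt0(&x, true);
    valset nonneg = vs_filter_lt0(&x, false);
    valset flipped = vs_negate(&neg);
    valset joined = vs_merge(&flipped, &nonneg);
    valset bumped = vs_incr(&joined);
    *at_assert = vs_filter_eq0(&bumped, true);
    return vs_filter_eq0(&bumped, false);
}
```

Calling `analyze_example` with the set {42, 7, -1} leaves the set at the assert empty, which is what lets a compiler treat the `assert(0)` call as dead code.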
[Roadmap diagram: cXprop. Labels: abstract interpretation, conditional X propagation, pointer analysis, ICD, volatile tracking, Safe TinyOS, RAM compression] 12
Interrupt-driven concurrency • Problems – C statements not necessarily atomic • Example: x = 0x4242; compiles to ldi r24, 0x42 and ldi r25, 0x42, and an interrupt can fire between the two instructions 13
Interrupt-driven concurrency • Problems – C statements not necessarily atomic – Preempts sequential control flow • Complicated control flow • Synchronization A race – One flow does not “break” another – Bad synchronization happens • Difficult or impossible to reason about • Must deal with conservatively (⊥ ) 14
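The non-atomic store from the previous slide can be simulated on a host machine. The sketch below is an assumption-laden model, not AVR code: the 16-bit variable is represented as two bytes, and the hypothetical callback `isr_fn` stands in for an interrupt handler that happens to fire between the two byte stores.

```c
#include <assert.h>
#include <stdint.h>

/* A 16-bit variable as an 8-bit MCU sees it: two separate bytes. */
typedef struct {
    uint8_t lo;
    uint8_t hi;
} word16;

uint16_t word16_read(const word16 *w) {
    assert(w);
    return (uint16_t)(w->lo | (w->hi << 8));
}

/* Hypothetical stand-in for an interrupt handler that reads the variable. */
typedef void (*isr_fn)(const word16 *shared, uint16_t *observed);

void snapshot_isr(const word16 *shared, uint16_t *observed) {
    *observed = word16_read(shared);
}

/* Non-atomic store: the simulated interrupt fires between the byte stores. */
void store_nonatomic(word16 *w, uint16_t v, isr_fn isr, uint16_t *observed) {
    assert(w);
    w->lo = (uint8_t)(v & 0xFF);
    if (isr) isr(w, observed); /* simulated preemption point */
    w->hi = (uint8_t)(v >> 8);
}

/* Atomic store: models interrupts being disabled around both byte stores,
   so the pending handler only runs once the store is complete. */
void store_atomic(word16 *w, uint16_t v, isr_fn isr, uint16_t *observed) {
    assert(w);
    w->lo = (uint8_t)(v & 0xFF);
    w->hi = (uint8_t)(v >> 8);
    if (isr) isr(w, observed);
}
```

Starting from zero and storing 0x4242, the handler in the non-atomic version observes 0x0042, a value the program never wrote. That torn value is why racing data has to be treated conservatively.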
Related work • Thread-based concurrency – M. B. Dwyer, L. A. Clarke, J. M. Cobleigh, and G. Naumovich. Flow analysis for verifying properties of concurrent software systems. TOSEM 2004. – M. C. Rinard. Analysis of multithreaded programs. SAS 2001. • Leveraging race detection – R. Chugh, J. W. Voung, R. Jhala, and S. Lerner. Dataflow analysis for concurrent programs using datarace detection. PLDI 2008. • Formal semantics – X. Feng, Z. Shao, Y. Dong, and Y. Guo. Certifying low-level programs with hardware interrupts and preemptive threads. PLDI 2008. 15
Race detection • Lockset analysis - standard technique – Lock status = interrupt enable bit status – Only one lock – no lock aliasing – nesC uses lexical nesting • Data classification – Unshared – accessed only from main – Shared – accessed from interrupts 16
Race detection • RACE conditions: – Accessed without locking – Written in shared or unlocked unshared code – Accessed in shared code • Data classification – Unshared – accessed only from main – Shared – accessed from interrupts 17
Race detection case analysis [Figure: write, read, and use accesses in an interrupt against write, read, and access in an interrupt or task, with and without an atomic section, each case marked racing or not racing] 18
Data classification [Figure: classification tree of program data (heap, static/global, stack; concurrent vs. sequential; shared vs. unshared). Unshared: 50%; shared but not racing: 44%; racing (treated as ⊥): 6%] 19
Interatomic Concurrent Data-flow • Published at LCTES 2006 [Figure: atomic interleaving between main and an interrupt, each containing an atomic section] 20
Volatile • C type qualifier – volatile int • Special case of C’s memory model – Read value may change “randomly” – Write may affect system state • E.g., racing data, device registers • Behavior opaque at C level • Prevents compiler optimizations 21
Tracking volatile RAM • Locate variables backed by RAM • Introduce concurrency information – Interatomic concurrent dataflow • Have sound approximation of mutators – Behavior not opaque at system level • Safely analyze volatile variables in RAM 22
Tracking volatile device registers • Hardware registers – Memory mapped I/O – Hardware not actually random (volatile) • Can track using MCU-specific information – OK to track individual bits • Instead of whole register • Interrupt bit of status register 23
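Tracking individual bits of a register can be sketched as a small bitwise abstract domain. Everything named here is hypothetical (`abits` and its functions are not cXprop's API). The one hardware fact assumed is that on AVR parts such as the ATmega128 the global interrupt enable flag is bit 7 of the status register.

```c
#include <assert.h>
#include <stdint.h>

/* Abstract 8-bit register: each bit is known-0, known-1, or unknown. */
typedef struct {
    uint8_t known; /* 1 = this bit's value is known */
    uint8_t value; /* bit value where known, 0 where unknown */
} abits;

abits abits_unknown(void) { return (abits){0x00, 0x00}; }

/* Transfer function for reg |= mask: bits set in mask become known 1. */
abits abits_or_const(abits a, uint8_t mask) {
    abits r = { (uint8_t)(a.known | mask), (uint8_t)(a.value | mask) };
    return r;
}

/* Transfer function for reg &= mask: bits cleared in mask become known 0. */
abits abits_and_const(abits a, uint8_t mask) {
    uint8_t cleared = (uint8_t)~mask;
    abits r = { (uint8_t)(a.known | cleared), (uint8_t)(a.value & mask) };
    return r;
}

/* Merge at a join: a bit stays known only if both sides know it and agree. */
abits abits_merge(abits a, abits b) {
    uint8_t agree = (uint8_t)(a.known & b.known & ~(a.value ^ b.value));
    abits r = { agree, (uint8_t)(a.value & agree) };
    return r;
}

/* Query one bit: 1 = known set, 0 = known clear, -1 = unknown. */
int abits_test(abits a, unsigned bit) {
    assert(bit < 8);
    uint8_t m = (uint8_t)(1u << bit);
    if (!(a.known & m)) return -1;
    return (a.value & m) ? 1 : 0;
}
```

With this representation, disabling interrupts is modeled as `abits_and_const(sreg, 0x7F)` and enabling them as `abits_or_const(sreg, 0x80)`. The analysis then knows the interrupt bit even though every other bit of the register remains unknown.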
Pointer analysis • Points-to sets – must and may alias – Two pluggable domains – Subtleties from context-insensitivity • Targets: – Device registers – Scalars – Structs – Arrays – not-NULL – Heap 24
Conditional X propagation • Pluggable abstract domains – From conditional constant propagation • Clean domain interface – Transfer functions – Abstract interpretation [Figure labels: abstract domain, utility functions, analysis] 25
Domains • Constant • Bitwise • Interval • Value set 26
[Roadmap diagram: cXprop. Labels: abstract interpretation, conditional X propagation, pointer analysis, ICD, volatile tracking, Safe TinyOS, RAM compression] 27
[Figure: cXprop pipeline, implemented as a CIL extension. Stages shown: struct splitter, inliner, cleaner, fixed point computation (value-flow, pointer-flow, ICD, volatile tracking), transformations (constant propagation, dead code elimination, dead data elimination), cleaner] 28
Suppose we have a WSN… 29
Suppose we have a WSN… • What happened? – State got corrupted – array out-of-bounds Memory safety error – Hard to debug • Limited visibility into executing systems • Difficult to replicate complex bugs • Memory safety can – Catch all pointer and array bounds errors • Before they corrupt state – Provide a choice of recovery action • Display error message or reboot 30
Safe TinyOS • Deputy: existing solution for making C safe • Expand into system safety – Modify TinyOS to work with Deputy – Enforce Deputy’s safety model under concurrency – Reduce overhead (cXprop) • Published at SenSys 2007 31
Safe TinyOS toolchain • TinyOS code • Annotate, e.g. int post(val_t* buf, int n); becomes int post(val_t* COUNT(n) buf, int n); • Safe TinyOS code • Run modified nesC compiler • Enforce safety using Deputy • Deal with concurrency • Compress error messages • Whole-program optimization (cXprop) • Safe TinyOS app • Goals: modify TinyOS to work with Deputy; enforce Deputy’s safety model under concurrency; reduce overhead 32
• Deputy enforces safety in sequential code • cXprop avoids extraneous protection – Only racing variables need protection [Figure labels: concurrency, interrupt, Deputy check, potentially unsafe read, atomic block, read to local, if (local)] 33
Code size 35
Code size [Chart: code size change for Safe TinyOS; values shown: 35%, 13%, -11%] 36
A closer look at RAM usage • On-chip RAM for MCUs expensive – Kilobytes, not megabytes or gigabytes – Data in SRAM – 6 transistors / bit – SRAM can dominate power consumption of a sleeping chip 37
A closer look at RAM usage • On-chip RAM for MCUs expensive – Kilobytes, not megabytes or gigabytes – Data in SRAM – 6 transistors / bit – SRAM can dominate power consumption of a sleeping chip • On-chip RAM is persistently scarce in tiny MCU-based systems • Is RAM used efficiently? – Performed value profiling for MCU apps • Apps already heavily tuned for RAM usage – Result: Average byte stores four values! 38
Offline RAM compression • Automated sub-word packing for statically allocated scalars, pointers, structs, arrays – No heap on targeted MCUs – Trades ROM and CPU cycles for RAM Published at PLDI 2007 39
Method • $x$ ≝ variable that occupies $n$ bits • $V_x$ ≝ conservative estimate of value set • $\log_2 |V_x| < n \Rightarrow$ RAM compression possible • $C_x$ ≝ another set such that $|C_x| = |V_x|$ • $f_x$ ≝ bijection between $V_x$ and $C_x$ • $n - \log_2 |C_x|$ ⇒ bits saved through compression of $x$ 40
Example Compression void (*function_queue[8])(void); 41
Example Compression • void (*function_queue[8])(void); • $x$ = an element of function_queue • $n$ = size of a function pointer = 16 bits 42
Example Compression • $V_x$ = {&function_A, &function_B, &function_C, NULL} 43
Example Compression • $n = 16$ bits • $|V_x| = 4$ • $\log_2 |V_x| < n$, i.e. $2 < 16$ 44
Example Compression • $C_x$ = {0, 1, 2, 3} • $f_x$: $V_x \to C_x$ ≝ compression • $f_x^{-1}$: $C_x \to V_x$ ≝ decompression 45
Example Compression • Table of $V_x$ values stored in ROM, indexed by $C_x$ = {0, 1, 2, 3} • $f_x$ ≝ compression table scan • $f_x^{-1}$ ≝ decompression table lookup 46
Example Compression • 128 bits reduced to 16 bits • 112 bits of RAM saved 47
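The function-queue example can be sketched in C as follows. This is a hand-written illustration of the idea, not code generated by the RAM compression tool. The stub functions, the order of entries in `value_table`, and the accessor names are all assumptions. On an AVR target the table would be placed in flash (ROM); here it is an ordinary const array so the sketch runs on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*fn_ptr)(void);

/* Hypothetical queue targets standing in for function_A..function_C. */
void function_A(void) {}
void function_B(void) {}
void function_C(void) {}

/* Decompression table: the value set, indexed by its 2-bit code.
   NULL is given code 0 (an assumption) so zeroed RAM decodes to NULL. */
static const fn_ptr value_table[4] = { NULL, function_A, function_B, function_C };

/* Compressed queue: 8 entries x 2 bits = 16 bits of RAM. */
static uint16_t packed_queue;

/* Compression: scan the table for the value. */
uint8_t compress(fn_ptr p) {
    for (uint8_t i = 0; i < 4; i++)
        if (value_table[i] == p) return i;
    assert(0 && "value outside the analyzed value set");
    return 0;
}

/* Decompression: a single table lookup. */
fn_ptr decompress(uint8_t code) {
    assert(code < 4);
    return value_table[code];
}

/* Replaces function_queue[idx] = p. */
void queue_write(unsigned idx, fn_ptr p) {
    assert(idx < 8);
    unsigned shift = idx * 2u;
    unsigned code = compress(p);
    packed_queue = (uint16_t)((packed_queue & ~(3u << shift)) | (code << shift));
}

/* Replaces a read of function_queue[idx]. */
fn_ptr queue_read(unsigned idx) {
    assert(idx < 8);
    return decompress((uint8_t)((packed_queue >> (idx * 2u)) & 3u));
}
```

Every read and write of the original array becomes a call to `queue_read` or `queue_write`, which is where the ROM and CPU cost of the technique comes from. The RAM footprint drops from 8 × 16 = 128 bits to 16 bits.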
RAM compression results 49
RAM compression results • cXprop (no compression): 10% RAM reduction, 20% ROM reduction, 5.9% duty cycle reduction • Compression (tradeoffs): 22% RAM reduction, 3.6% ROM reduction, 29% duty cycle increase 50
[Roadmap diagram: cXprop. Labels: abstract interpretation, conditional X propagation, pointer analysis, ICD, volatile tracking, Safe TinyOS, RAM compression] 51
Conclusion • Interatomic concurrent data-flow • Volatile data may be tracked • Better analysis → more optimizations – Safe TinyOS – practical memory safety – RAM compression – 22% RAM reduction • http://www.cs.utah.edu/~coop/research/cxprop/ • http://www.cs.utah.edu/~coop/safetinyos/ • http://www.cs.utah.edu/~coop/research/ccomp/ • Thank you 52
Cost/Benefit Ratio • $C$ ≝ access profile • $A$, $B$ ≝ platform-specific costs • $V$ ≝ cardinality of value set • $S_u$ ≝ original size • $S_c$ ≝ compressed size 54
Turning the RAM Knob 0% 55
Turning the RAM Knob 10% 56
Turning the RAM Knob 20% 57
Turning the RAM Knob 30% 58
Turning the RAM Knob 40% 59
Turning the RAM Knob 50% 60
Turning the RAM Knob 60% 61
Turning the RAM Knob 70% 62
Turning the RAM Knob 80% 63
Turning the RAM Knob 90% 64
Turning the RAM Knob 100% 65
Turning the RAM Knob 95% 66
Future work • Triggering and sequencing [Figure labels: timer interrupt handler, fire, trigger, sense, data ready interrupt handler, fire, data] • Caching compressed values [Figure labels: read x, decompress x] 67
More related work • Safe TinyOS – R. K. Rengaswamy, E. Kohler, and M. Srivastava. Software-based memory protection in sensor nodes. EmNets 2006. – B. L. Titzer. Virgil: Objects on the head of a pin. OOPSLA 2006. – S. Kowshik, D. Dhurjati, and V. Adve. Ensuring code safety without runtime checks for real-time control systems. CASES 2002. • Offline RAM compression – Y. Zhang and R. Gupta. Compressing heap data for improved memory performance. Software: Practice and Experience 2006. – L. S. Bai, L. Yang, and R. P. Dick. Automated compile-time and run-time techniques to increase usable memory in MMU-less embedded systems. CASES 2006. 68
PAG • Program Analysis Generator – Domain specific language input describes • Domain lattice • Transfer functions • Language-describing grammar • Fixed point solution method – Data-flow analyzer as output • Does not deal with concurrency • Used to evaluate fixed point solutions 69
Feature comparison [Chart; values shown: 12%, 5.5%] 70
Domain comparison 71
Resource reduction [Chart; values shown: 12%, 8.3%, 2.5%, 1.8%] 72
Interatomic Concurrent Data-flow • Published at LCTES 2006 [Figure: atomic interleaving between main and an interrupt, each containing an atomic section] 73
Context insensitivity • a is a global variable • foo: int x = 7; bar(&x); with a = {27} and x = {7}, then x = {7, 42} after the call • bar(int *y): goo(y); with a = {27} and y = {&x} • goo(int *z): *z = 42; a = *z; with z = {&x} and a = {27} before, a = {7, 27, 42} after 74
Benchmark descriptions • AVR ATmega128 code • TinyOS • 3,000-26,000 lines of C code • Analysis times: seconds to an hour • Metrics – Duty cycle • % of time processor is on • Obtained from Avrora – Cycle-accurate simulator for WSNs – Code size and data size 75
Wireless sensor networks • 10 billion units / year • $12.5 billion market in 2006 • Cheap • Resource constrained • e.g. Wireless sensor networks – Mica2 mote: ATmega128L (4 MHz 8-bit MCU), 128 KB code, 4 KB data SRAM 76


