
SOC & Embedded Systems Group
• Embedded Systems
  • Embedded OS: 曾建超
  • Multimedia: 蔡淳仁, 蔡文錦
  • Low-power mobile: 曹孝櫟
  • Storage: 張立平
• SOC Design & CAD
  • Network: 林盈達
  • Architecture and Systems: 鍾崇斌, 單智君
  • Wireless Baseband Processor: 許騰尹
  • Multimedia SOC Design: 蔡淳仁, 彭文孝
  • Electronic Design Automation: 李毅郎

Research Interests
Chien-Chao Tseng 曾建超
Institute of Network Engineering; Institute of System Design
College of Computer Science, National Chiao Tung University
cctseng@csie.nctu.edu.tw

Wireless Access to Internet :-)
My Interests: Roaming and Handoffs
• 3G/GPRS/PHS?
• WiMAX/WLAN/Bluetooth/PAN
• Heterogeneous Wireless Overlay Networks: Multi-interface Handheld Devices

• Embedded OS for Multi-interface Handheld Devices
• Cross-layer design for Real-time Applications
  • Linux/Windows XP/CE
  • Driver, Network, and Application Layers (VoIP)
• Heterogeneous Wireless Networks
• WPAN, WLAN and Mobile Router
• Roaming and Routing
• Embedded Wireless Mesh and Sensor Networks
• Multi-tier Wireless Network
• WLAN/WiMAX/3G/GPRS/PHS
• Roaming and Handovers
• Address Assignment and Routing
• Secured and Fast Accesses to Wireless Network
(Diagram labels: PHS, 3G/GPRS, WLAN)

Embedded Systems (Asst. Prof. 曹孝櫟): Research Directions
• Embedded Software for B3G/4G Mobile Devices
  • Protocol Stacks for 4G access
  • Embedded Operating System and Device Drivers, and their Optimization for Mobile Devices
• Cooperate with international and local vendors and institutes to develop 4G/multimode radio SoCs
• Establish the reference embedded software for next-generation mobile devices/radio SoCs

Embedded Systems (Asst. Prof. 曹孝櫟): Low Power and Fast Handover
R&D Results - Cellular/WLAN Dual Mode Mobiles
• Power Consumption Evaluation
• System Architecture and Prototype of Cellular/WLAN Dual Mode Mobile
• Handover Latencies Evaluation
Awarded by
• 2005 Mobile Communications Contest of Industrial Development Bureau, MOEA
• 2005 Software Contest of National Center of High-Performance Computing
• 2006 Embedded Software Contest of MOE

Prof. Li-Pin Chang 張立平
• Recent research directions
  • Embedded storage systems
  • Real-time systems and scheduling algorithms
  • Hardware-software co-design

Embedded Storage: Efficient wear-leveling algorithm for flash memory
• To capture uneven usage across millions of blocks and to level it
• Result: the fastest, most effective, and most economical approach available!
(Figure: access pattern, LBA versus time; block usage, erase cycle count versus block number, with some blocks marked "Worn-out quickly!")
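
The slide does not describe the algorithm itself, so the following is only a minimal sketch of the general idea of wear leveling, not the group's method. It assumes a toy model in which every write to a logical block costs one erase of the physical block behind it. The class name `WearLeveler`, the `threshold` parameter, and the rule of swapping the hot data with the least-erased block are all illustrative assumptions.

```python
class WearLeveler:
    """Toy wear leveler: remaps hot data away from heavily erased blocks."""

    def __init__(self, num_blocks, threshold):
        self.erase_counts = [0] * num_blocks      # erases per physical block
        self.mapping = list(range(num_blocks))    # logical block -> physical block
        self.threshold = threshold                # tolerated gap in erase counts

    def write(self, logical):
        phys = self.mapping[logical]
        self.erase_counts[phys] += 1              # simplification: one erase per write
        counts = self.erase_counts
        coldest = min(range(len(counts)), key=counts.__getitem__)
        if counts[phys] - counts[coldest] > self.threshold:
            self._swap(logical, phys, coldest)

    def _swap(self, hot_logical, worn, coldest):
        # Exchange the hot data with whatever logical block sits on the coldest block.
        cold_logical = self.mapping.index(coldest)
        self.mapping[hot_logical] = coldest
        self.mapping[cold_logical] = worn
        # Migrating data costs one extra erase on each of the two blocks.
        self.erase_counts[worn] += 1
        self.erase_counts[coldest] += 1


leveler = WearLeveler(num_blocks=8, threshold=4)
for _ in range(1000):
    leveler.write(0)                              # a single very hot logical block
print(leveler.erase_counts)
```

Without the swap, physical block 0 would absorb all 1000 erases while the other seven stayed at zero. In this toy model the gap between the most and least erased block stays small, at the price of a few extra erases for data migration.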

Real-Time Systems: Overload Management for Real-Time Object Tracking
• Inter-arrival time of frames: 4 ms
• Workload-scaling factor: 4/7 (57%)
• Firm-real-time: (c, 4) becomes ((4, 7), c, 4)
• Proportional Adjustment: (c, 4) becomes (c, 7)
(Figure: average RMS error over time for the two approaches, labeled FE and PP; under FE, frames between i and j are dropped. Example parameters shown: (1, 4), ((4, 7), 2, 4), and (2, 7).)
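
The two adjustments can be compared with a little arithmetic. The sketch below rests on an assumed reading of the slide's notation: (c, p) is a task with execution time c and period p in milliseconds, and ((m, k), c, p) means that only m of every k frames are processed. The function names are hypothetical.

```python
from fractions import Fraction


def firm_utilization(c, period, m, k):
    """Processor share when only m of every k frames are processed."""
    return Fraction(c, period) * Fraction(m, k)


def stretched_utilization(c, new_period):
    """Processor share when every frame is processed at a longer period."""
    return Fraction(c, new_period)


# With c = 1 ms: dropping 3 of every 7 frames at period 4,
# or stretching the period from 4 to 7, gives the same load.
print(firm_utilization(1, 4, 4, 7))
print(stretched_utilization(1, 7))
```

Under this reading both approaches scale the workload by the same factor of 4/7. They differ in which frames are served and when, which is what the RMS-error comparison on the slide evaluates.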

Hardware-Software Co-design: Reconfigurable computing for overload management
• Past achievement
  • Overload management for event-driven real-time embedded systems
• Work in progress
  • To deal with transient workload bursts with hardware acceleration
  • Move critical tasks onto FPGA
    • Computing resource reclamation
    • On-line floor planning
    • On-line topology reconfiguration for network-on-chip (NoC)

Embedded Systems (蔡文錦): Research Directions
• Low-power embedded systems
• Video compression/decompression

Plan in the near future
• Low-power AVC/H.264 video CODEC algorithm and system design

Multimedia Embedded Systems Lab (蔡淳仁): Research Directions
• SoC Design for Advanced Video Codecs
• DVB/MHP middleware & Java Runtime
• Java Processor for DVB/MHP
• Flexible Multimedia Codec SoC Platforms
• OS Kernel Scheduler for Tightly-coupled Heterogeneous Multi-core Platforms

Multimedia Embedded Systems Lab R&D Results
• H.264 Codec Accelerators on ARM Integrator
• Java Processor Accelerating Technologies on Spartan 3 and ML-310 Platforms (based on the open source JOP project)
• Video Rate Control for HW/SW Codesigned SoCs (patent application)
• Tightly-coupled H.264 encoder on TI OMAP5912
• Tightly-coupled kernel scheduler module for ARM-Linux on TI OMAP5912

Future Plans
• Implementation of a flexible multimedia codec SoC platform
• Design of a new Java Processor for DVB/MHP
• Design of Hardware-Friendly Psychovisual Models for Video Codecs
• Clean Design of a Multi-core OS kernel suitable for Tightly-Coupled Task Scheduling

Architecture and Systems Research Directions (單智君, 鍾崇斌)
• Embedded processor and SoC
• Java processor, JIT compilation & VM
• DSP designs and compilation
• Low-power systems
• Graphic processor
• Superscalar ARM processor
• Reconfigurable computing

Architecture and Systems R&D Results
• ARM9-compatible processor with video/audio capabilities
• Java stack operations folding
• Memory Constrained Java Just-in-time Compiler
• DSP instruction set extensions
• Low-power Branch-Target-Buffer
• Low-power bus encodings
• Low-power cache memory
• Graphic processor design techniques
• Superscalar ARM
• Reconfigurable computing

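ARM9-compatible Processor with Audio/Video Capabilities
• ARMAVP (ARM Audio Video Processor) is a 32-bit microprocessor with a well load-balanced five-stage pipeline: Fetch Unit, Decoder Unit, Execution Unit, Memory access Unit, and Write Back Unit. Each stage is optimized for performance to raise the clock frequency, and efficient mechanisms reduce the impact of slow memory on processor performance.
• Features
  • Supports Conditional Execution
  • ABP buffer design
  • Improved instruction fetch time
  • Precise interrupt control structure
  • Asynchronous memory access
  • Dynamic register bank mapping
  • Fast handling of branch instructions
  • Versatile and efficient execution path
  • Distributed instruction control encoding
• Functional verification and evaluation
  • All functions have been verified on an Altera EP20K600EBC652-1. According to simulation results for the Decode Stage, the design runs at 45 MHz on the FPGA and is expected to reach 210 MHz when implemented as a chip.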

DSP: Instruction Set Extensions
• Current research topics
  • Multiple-issue architecture
    • Exploring ISE in a multiple-issue architecture, such as superscalar or Very Long Instruction Word (VLIW)
  • Hardware reusability
    • Reuse the same or similar hardware resources in different ASFUs while keeping the same performance
  • Overcome register file read/write port constraint
    • Try to schedule the input and output of an ASFU at different time slots

Low-power Bus Encodings
• We propose different low-power bus encoding schemes tailored to the characteristics of different bus architectures. The schemes use a variety of encoding methods to transmit data over the bus in the most power-efficient way, thereby saving power.
• Bus encoding architecture: at the sender, an encoder turns the original data into encoded data plus extra control lines; at the receiver, a decoder restores the original data.
• Low-power bus encoding schemes
  • Data address bus: T0_BI_1, Variable-Stride, SRWEC
  • Data bus: Leading-bytes encoding
  • Instruction address bus: T0 + Discontinuous Address Table
  • Instruction bus: BIBITS with Register Relabeling
  • Mixed instruction/data address bus: I/D Selector, T0 DAT + Stride-Table
  • Mixed instruction/data bus: I/D Selector, BIBITS_RR + Leading-bytes
(Diagram labels: processor, memory, data memory, instruction memory)
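
As an illustration of the encoder/decoder arrangement with an extra control line, here is a minimal sketch of classic bus-invert coding. This is the generic textbook technique, not the group's T0_BI_1 or other schemes named on the slide, whose details the slide does not give. The function names and the 8-bit default width are assumptions.

```python
def bus_invert_encode(words, width=8):
    """Encode words so that at most width // 2 data lines toggle per transfer.

    Returns (value_on_bus, invert_line) pairs; the invert line is the
    extra control line that tells the receiver to flip the bits back.
    """
    mask = (1 << width) - 1
    previous = 0                                  # assume the bus starts at all zeros
    encoded = []
    for word in words:
        word &= mask
        toggles = bin(word ^ previous).count("1")
        if toggles > width // 2:
            sent, invert = (~word) & mask, 1      # sending the complement toggles fewer lines
        else:
            sent, invert = word, 0
        encoded.append((sent, invert))
        previous = sent
    return encoded


def bus_invert_decode(encoded, width=8):
    """Recover the original words from (value_on_bus, invert_line) pairs."""
    mask = (1 << width) - 1
    return [((~sent) & mask) if invert else sent for sent, invert in encoded]


data = [0x00, 0xFF, 0x0F, 0xF0]
on_bus = bus_invert_encode(data)
print(on_bus)
print(bus_invert_decode(on_bus) == data)
```

For this input the raw sequence would toggle 20 data lines in total, while the encoded sequence toggles 4 data lines plus the invert line. The savings depend entirely on the data, which is presumably why the slide pairs different schemes with different bus types.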

Low-power Cache Memory
• Cache memory accounts for more than 50% of total processor power consumption
• Low-power cache memory design
  • Loop Buffer: place loop code in a low-power loop buffer to save instruction-fetch power
  • Power Manager: put infrequently used cache blocks into a low-power mode to save cache static power

Graphic Processor
Research goal: study next-generation graphics processor architectures, proposing hardware architectures and a software verification environment in three main directions: Pixel Shader, Texture, and Depth Processing.
Current results are itemized as follows:
1. A dynamically reconfigurable graphics hardware for resource reallocatable rendering pipeline
2. A Reconfigurable Texture Mapping Architecture
3. Implementation of texture Compression by GPU Driver
4. Register Renaming for Pixel Shaders data/value management
5. Instruction scheduling mechanism for 3D GPU pixel shader
6. An Efficient Texture Memory System Design
7. Alpha Blending without Z Sort

Superscalar ARM
Goal: a superscalar embedded processor featuring
• 800 MHz clock rate @ 0.13 um
• 1.8 DMIPS/MHz: superscalar performance under tough pipeline latency
• 800K gate count: cost-effective design
Directions and achievements
• Micro-architecture
  • A 12-stage dual-issue superscalar processor with good instruction fetch rate, issue rate, and efficient forwarding
• Simulator
  • A cycle-accurate simulator modeling more details than the well-known SimpleScalar simulator
• Compiler
  • Working on GCC machine description to optimize performance

Reconfigurable Computing
Motivations:
• Improving the Design Methodology of Embedded System Hardware
• Providing Better Performance with Low Development Cost
• Shortening the Time-to-Market of SoC Products
Research Issues:
• Hardware/Software Partition
• Synthesis Technology
• Reconfigurable Processing Element Design
(Figure: Reconfigurable Architecture (1/2))

Research overview in SOC and Embedded Systems (林盈達)
• Research theme
  • Content networking with deep packet inspection by software and hardware solutions, with applications in Internet security (intrusion detection, anti-virus, anti-spam, content filtering, MSN/P2P management)
• Embedded software
  • Embedded Linux solutions: 7-in-1, 10-in-1
  • A startup company, L7 Networks (L7-Networks.com), 2002, for all-in-one security gateways
• SoC
  • Key component in content networking: string matching; hardware acceleration needed!
  • FPGA-based development to accelerate Aho-Corasick and Bloom filtering algorithms

Embedded and SoC Group Selected R&D Results (2/2)
• 7-in-1 integrated security gateway
• String Matching Engine to Accelerate Aho-Corasick Machine
• Unified Content Filtering Hardware Platform
• String Matching Hardware with Bloom Filters

7-in-1 Integrated Security Gateway
• 7-in-1: VPN, Firewall, NAT, Routing, Content Filtering, Intrusion Detection, Bandwidth Management
• Launched a startup in 2002: L7 Networks Inc.
(Figure: packet flow for outbound traffic (LAN/DMZ to WAN) and inbound traffic (WAN to DMZ/LAN) through these blocks: MAC Filter, In-LAN Filter, Out-LAN Filter, In-WAN Filter, Out-WAN Filter, Policy Route, Route, Redirect, NAT, deNAT, IPsec VPN, IPsec deVPN, Bandwidth Mgt., FTP/POP3/SMTP/Web/URL Filter with Many-to-One NAT, sniff, Intrusion Detection, Alerting System.)

String Matching Engine to Accelerate Aho-Corasick Machine
• New Parallel Architecture with Pre-Hashing and Root-Indexing
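
For reference, the algorithm being accelerated can be sketched in software. The code below is a plain textbook Aho-Corasick automaton. It does not model the parallel architecture, pre-hashing, or root-indexing mentioned on the slide, and the function names are illustrative.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure, and output tables for a set of patterns."""
    goto, fail, out = [{}], [0], [[]]
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)
    # Breadth-first pass: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]  # inherit matches ending here
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, pattern) for every match, in one pass over text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            hits.append((i - len(pattern) + 1, pattern))
    return hits


automaton = build_automaton(["he", "she", "his", "hers"])
print(search("ushers", automaton))
```

The scan touches each input character once regardless of how many patterns are loaded, which is why the algorithm suits deep packet inspection. The per-character table lookups are the part a hardware engine would aim to speed up.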

Unified Content Filtering Hardware Platform
• Resolve content filtering issues
  • Match without interrupting the CPU
  • Multiple connections management
  • On-the-fly matching of non-fixed payloads
  • Multiple patterns and multiple matched outputs
(Figure: Content Filtering Hardware reading text descriptors in DRAM; descriptor fields include Text Pointer, Text Length, FA State, Status, and First Match and Last Match with ID and Offset.)

String Matching Hardware with Bloom Filters
Platform:
• Xilinx ML310 Embedded Development Platform with embedded PowerPC 405 processor
• Xilinx Virtex-II Pro XC2VP30 FPGA
• MontaVista Linux Professional Edition 3.0
Feature Set:
1. Allow maximum shift distance if possible.
2. Reconfigure rules easily.
3. Keep constant hardware complexity.
(Figure: a shift controller with entering and leaving bytes feeding Bloom filter(1), Bloom filter(2), and Bloom filter(3), which detect prefix(p, 1), detect prefix(p, 2), and detect a factor in p.)
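
To show why Bloom filters suit this job, here is a minimal software sketch. It slides a fixed-length window over the payload, uses a Bloom filter as a cheap first check, and confirms hits with an exact lookup. It does not model the slide's shift controller or its prefix and factor filters. The class, the use of SHA-256 as the hash, and the equal-length pattern restriction are all simplifying assumptions.

```python
import hashlib


class BloomFilter:
    """Bit-array set with no false negatives and a small false-positive rate."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several hash positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def scan(payload, patterns):
    """Return start offsets in payload where any equal-length pattern occurs."""
    length = len(patterns[0])                     # assumption: all patterns share a length
    bloom = BloomFilter()
    for pattern in patterns:
        bloom.add(pattern)
    exact = set(patterns)                         # rules out Bloom false positives
    hits = []
    for i in range(len(payload) - length + 1):
        window = payload[i:i + length]
        if window in bloom and window in exact:
            hits.append(i)
    return hits


print(scan(b"xxevilxxworm", [b"evil", b"worm"]))
```

The filter's size does not depend on pattern content, and loading new rules only means setting bits. This is consistent with the slide's goals of easy rule reconfiguration and constant hardware complexity.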

Embedded and SoC Group Major Projects
• Excellence Project: Next Generation Information Communication Networks (Excellence follow-up program, National Science Council, 2004~2008)
  • 林盈達, 曾文貴 (with 24 faculty members)
• Network Benchmarking Lab (ITRI-NCTU Network Benchmarking Lab, www.nbl.org.tw; Industrial Development Bureau, Ministry of Economic Affairs, 2003~2007)
  • 林盈達
• Attack Session Extraction and Comparison with Nessus (Cisco San Jose, 2005~2006)
  • 林盈達
• Content-based Network Security - Content Classification: Design, Implementation, and Evaluation (integrated project, National Science Council, 2004~2006)
  • 林盈達 (with 李程輝, 孫雅麗)
• Open Source Product Testing Tools: In-Lab Live Testing (National Science Council, 2005~2006)
  • 林盈達

Biography of Ying-Dar Lin 林盈達
• B.S., NTU-CSIE, 1988
• Ph.D., UCLA-CS, 1993
• Professor, NCTU-CS, 1999~
• Founder and Director, ITRI-NCTU Network Benchmarking Lab (NBL; www.nbl.org.tw), 2002~
• Co-Founder, L7 Networks Inc. (www.L7.com.tw), co-invested by D-Link, ZyXEL, and Advantech, 2002
• Consultant, CCL/ITRI, 2002~
• Well-cited paper: Multihop Cellular: A New Architecture for Wireless Communications, INFOCOM 2000, YD Lin and YC Hsu; # of citations: 150
• Areas of research interests
  • Design, implementation, analysis, benchmarking of Internet gateway devices (10-in-1: routing, NAT, firewall, VPN, IDP, CF, anti-virus, anti-spam, IM, P2P, bandwidth management, link load balance, etc.)
  • Internet security and QoS
  • Content networking
  • Test technologies of switch, router, WLAN, security, and VoIP
• Publications
  • International journal: 39
  • International conference: 33
  • IETF Internet Draft: 1
  • Industrial articles: 124
  • Books: 2
  • Patents: 16
  • Tech transfers: 8

Wireless Baseband Processor (許騰尹)
• MIMO OFDM PHY
• Ultra Low-power PHY
• Generic PHY architecture
• Chip Implementations

Wireless Baseband Processor
• Spreading: gate count 500, max. frequency 80 MHz
• PAM Match Filter: gate count 4800, max. frequency 80 MHz
• Clock Recovery: gate count 1500, max. frequency 178 MHz
• CTRL: gate count 1500, max. frequency 80 MHz
• Digital Divider: gate count 900, max. frequency 60 MHz
• Clock Generator: gate count 2600, max. frequency 165 MHz
(Block diagram labels: Spreading, PAM Match Filter, Clock Generator, Clock Recovery, Divider)

Prototype 802.11b Baseband+MAC chip
Item: Specification
• Technology: 0.25 um CMOS 1P5M
• VLSI Type: Cell-Based Design
• Function: 802.11b Baseband+MAC
• System Frequency: 44 MHz
• Package: 208 QFP
• Gate Count: Not available
• Chip Size: Not available
• Power supply: 2.5 V (digital), 3.3 V (analog)
• Power Dissipation: 650 mW
(Chip photo labels: A/D (Q), A/D (I), PLL, D/A)

Architecture and Systems R&D Results
• ARM9-compatible processor with video/audio capabilities (technology transfer)
• Java stack operations folding (patents)
• Asynchronous 8051 on FPGA
• Low-power Branch-Target-Buffer (patent application)
• Low-power bus encodings (patent applications)
• Graphic processor design techniques

SOC Electronic Design Automation (李毅郎): Research Directions
• Reliable Interconnect Design
  • Crosstalk-driven Interconnect Design
  • Design-for-Manufacture (DFM) Interconnect Design
• Layout Migration
  • VLSI Cell Migration with Topology Preservation
• Post-Layout Platform for Verification and Optimization

SOC Electronic Design Automation: R&D Results
• Tile-based Gridless ECO Router with Graph Reduction
  • Two times faster than existing tile-based routers
• NEMO: A New Full-Chip Gridless Router
  • Faster than all academic gridless routers
• Crosstalk-driven Track Assignment
• Pre-Detailed Routing Design Flow Considering Capacitive- and Inductive-Noise Constraints

SOC EDA Group R&D Results - New ECO Routing Design Flow

SOC EDA Group R&D Results - Full-Chip Gridless Router

Electronic System Level Design (彭文孝) http://mapl.nctu.edu.tw
• Traditional Design Flow with ESL
• System Level Verification and Integration
• First Time Silicon Success

Design Practice: Transaction Level Modeling for H.264 Decoder (彭文孝) http://mapl.nctu.edu.tw
(Block diagram labels: CPU, Cache, SDRAM Controller, Data Transaction Bus, Bus Arbitration, Video Pipe, Control Bus, Output Interface)

SoC for Multi-Standard Video Codec (彭文孝) http://mapl.nctu.edu.tw
(Block diagram labels: Video Codec, HD Capturing, System on Chip, ARM9 CPU, 3-A Functionalities, Color Transform, Embedded SRAM and On-Chip Bus, Networking, Bus Arbitration, Architecture C Model)

VLSI/SOC Research for Graphics System (Prof. 范倫達)
VLSI Information Processing LAB
Advisor: Lan-Da Van
3-D Graphics Demo Here!

VLSI/SOC Research for Adaptive Communications (Prof. 范倫達)
• Building a Virtual SOC Platform using CoWare Platform Architect
  • Provide a virtual system platform on which software engineers can develop programs
  • Raise the abstraction level of system simulation to improve system verification efficiency
  • Develop performance evaluation metrics: the simulation results for these metrics yield the best configuration of the system architecture, giving system development a basis to work from
    • Under different hardware/software configurations, simulate the time spent in each function
    • Under different hardware/software configurations, count the number of bus accesses made by each module
(Figure: block diagram of the platform, showing an ARM926 with iTCM and dTCM, instruction and data ports, clock and reset, AHB and APB buses, ROM, RAM, SW stub, FFT HW, and Display, each annotated with memory location (size) and address/data bit widths. Flow labels: Virtual SOC Verification Platform, IP Implementation, FFT/IFFT Chip Design.)