
  • Number of slides: 27

Test Methodology for Characterizing the SEE Sensitivity of a Commercial IEEE 1394 Serial Bus (FireWire)

Christina Seidleck, Raytheon ITSS, Lanham, MD
Stephen Buchner, QSS, Landover, MD
Hak Kim, Jackson & Tull, Washington, DC
P. W. Marshall, Consultant, Brookneal, VA
Kenneth LaBel, NASA GSFC, Greenbelt, MD

Outline
• Abstract
• Introduction
• Typical PC-based Implementation
• The Protocol Layers
• What Was Tested?
• Modes of Operation
• Packet-Based Transactions
• Test Part Function and Accessibility
• Radiation Characterization
• Radiation Test Hardware Setup
• Radiation Test Hardware Diagram
• Radiation Test Software
• Software Flow for Asynchronous Mode
• Software Flow for Isochronous Mode
• Two Main Types of Error Observed
• Example of a Soft Error
• Example of a Hard Error
• SEFIs Categorized by Steps Required to Start Communications
• Results LLC Asynchronous Mode
• Results PHY Asynchronous Mode
• Results LLC Irradiated Hard Errors
• Results LLC Irradiated Soft Errors
• Results PHY Irradiated Hard Errors
• Conclusions
• References

Abstract
The Single Event Effect (SEE) responses of two FireWire serial buses based on the IEEE 1394 standard were tested with heavy ions and protons. A unique approach to testing and categorizing the SEEs is presented.

Introduction and Background
IEEE 1394 is a formal description of the architecture called FireWire, originally developed by Apple Computer. FireWire is an advanced serial bus used for connecting numerous high-performance devices together.

Why FireWire?
• Less expensive alternative to parallel buses: a variety of devices can connect directly to a single serial bus, with ~4.5 meters allowed between devices (cable implementation)
• Backplane and cable implementations supported: only the cable implementation is presented here
• Plug and play support: supports automatic configuration of devices without intervention from the host system
• Scalable performance: supports transfer rates of 400 Mb/s, 200 Mb/s, and 100 Mb/s
• Attachment of up to 63 nodes on a single serial bus
• Supports two transmission modes: isochronous and asynchronous
• Peer-to-peer transfers: data can be transferred between individual nodes without intervention from the host system

Typical PC-Based 1394 Implementation
The serial bus allows a variety of high-speed peripheral devices to be attached and supported.
[Figure: a PC connected by 1394 cable to a laser printer, a digital VCR, a digital camera, and a CD-ROM]

The Protocol Layers of the 1394 Bus
[Figure: protocol stack diagram. Labels: Software Driver; Asynchronous, Isochronous, and Management Transfer Interface; Transaction Layer; Link Layer; Physical Layer; Serial Bus Management Layer; Bus Manager; Isochronous Resource Manager; Cycle Master; Node Controller; Serial Bus]

What Was Tested?
Commercial 1394 development boards

Vendor   LLC Part Number   PHY Part Number   Development Board
TI       TSB12LV26PZT      TSB41AB3PFP       TSBKOHCI403
NSC      CS4210VJG         CS4103VHG         CS4210A-DK

Lot date codes listed on the slide: OCC4RTT, CA-OAAO45T, VS052ABC4, VS052ABC4

Physical Layer (PHY)
• 16 internal registers

Link Layer (LLC)
• FIFOs
• PCI registers
• OHCI registers

Modes of Operation

Asynchronous
• Data transfers target a particular node based on a unique address (one-to-one transfers)
• Data transfers do not require a constant data rate
• All data transfers of this type are guaranteed 20% (min.) of overall bus bandwidth
• Verifies data delivery with acknowledge, CRC checks and response codes
• Supports data retransmits
• Used when data integrity is required/critical

Isochronous
• Data transfers target nodes based on the channel number of the transfer (like a broadcast, one-to-many)
• Receiving nodes "listen" to channel numbers to receive data packets
• No error detection or retransmits
• Uses constant bandwidth, which is requested from the isochronous resource manager
• 80% of bus bandwidth used for isochronous data transfers
• Used for time-critical, error-tolerant data transfers
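The 80%/20% bandwidth split between the two modes can be made concrete with a little arithmetic. The sketch below assumes the nominal 125 µs bus cycle of IEEE 1394, a figure that is not stated on the slide; the function name `cycle_budget` is hypothetical.

```python
CYCLE_US = 125.0  # assumption: nominal IEEE 1394 cycle period, not given on the slide


def cycle_budget(iso_fraction=0.80, cycle_us=CYCLE_US):
    """Split one bus cycle into isochronous and asynchronous time budgets (in us)."""
    if not 0.0 <= iso_fraction <= 0.80:
        # The slide caps isochronous traffic at 80%, leaving at least 20% for asynchronous.
        raise ValueError("isochronous traffic may use at most 80% of the bus")
    iso_us = iso_fraction * cycle_us
    return iso_us, cycle_us - iso_us


iso_us, async_us = cycle_budget()
print(f"isochronous: {iso_us:.0f} us, asynchronous: {async_us:.0f} us")
```

Under that assumption, a fully loaded isochronous schedule still leaves a quarter of the isochronous budget's duration in every cycle for asynchronous traffic such as the register polling used in this test.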

Packet-Based Transactions
All transactions are transmitted over the bus in a packetized form. Different types of packets are defined for asynchronous and isochronous modes.

Asynchronous packets
• Reads
• Writes
• Locks
Sample write request packet fields: Destination Address, Source ID, Transaction Type, Data, Label, CRC
Sample acknowledge packet fields: Acknowledge Code, Parity

Isochronous packets
• Stream data
Sample stream packet fields: Channel Number, Transaction Type, Data, CRC
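As an illustration of the acknowledge packet, the sketch below builds and checks a one-byte acknowledge made of a 4-bit code and a 4-bit parity field. It assumes the IEEE 1394 convention that the parity nibble is the bitwise complement of the code nibble; the slide names only the two fields, and the helper names `make_ack` and `check_ack` are hypothetical.

```python
def make_ack(ack_code):
    """Pack a 4-bit acknowledge code and its complement into one byte."""
    if not 0 <= ack_code <= 0xF:
        raise ValueError("acknowledge code must fit in 4 bits")
    parity = ~ack_code & 0xF  # assumption: parity is the one's complement of the code
    return (ack_code << 4) | parity


def check_ack(byte):
    """Return the acknowledge code if the parity nibble matches, else None."""
    code = (byte >> 4) & 0xF
    parity = byte & 0xF
    return code if (code ^ parity) == 0xF else None


ack = make_ack(0x1)
print(hex(ack), check_ack(ack), check_ack(ack ^ 0x01))
```

A single bit flip in either nibble breaks the complement relationship, so the receiver can reject a corrupted acknowledge instead of misreading it.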

Test Part Function and Accessibility

Link Layer (LLC)
Functions
• Forms packets for transmission
• Provides address decoding for incoming asynchronous packets
• Provides channel number decoding for incoming isochronous packets
• Performs CRC error checking
Registers monitored
• FIFOs
• 42 out of a possible 102 Open Host Controller Interface (OHCI) registers
• 21 out of a possible 22 PCI registers

Physical Layer (PHY)
Functions
• Electrical and mechanical interface for transmission and reception of packets transferred across the bus
• Arbitration: ensures only one node at a time transmits on the bus
Registers monitored
• Due to the volatility of the 16 registers on the PHY, none was monitored

Radiation Characterization
• Protons (TRIUMF) and heavy ions (BNL and TAMU) were used to test parts from Texas Instruments and National Semiconductor.
• The PHY and LINK chips on the DUT board were irradiated separately.
• The National Semiconductor part underwent destructive latchup when irradiated with ions having an LET of 27 MeV·cm²/mg. Therefore, a full characterization was done on the TI parts only.

Radiation Test Hardware Setup
• Two personal computers (PCs) with PCI slots were used in the test
• Each had an IEEE 1394 board
• The PC with the devices under test (DUTs) was placed in the beam line, while the other was placed in a remote area
• The two PCs were connected through their 1394 interfaces by a 10 ft 1394 cable for data communication
• A PCI bus isolation card was placed between the DUT board and its host PC
• This card enables current consumption readings from the +5 V supply to the DUT board from the host PC via the PCI interface
• An HP 34401A digital multimeter (DMM) was used to read and record this supply current

Radiation Test Hardware Diagram
[Figure: test setup. Target area: beam, 1394 DUT board (PHY and LLC), PCI bus isolation card, host PC, HP 34401A DMM. A 10 ft 1394 cable runs to the 1394 board in the remote PC (CTRL) with monitor, keyboard, and mouse. A laptop is also shown.]

Radiation Test Software
• Custom device driver software was developed using C++ and Jungo's WinDriver, targeted at a PC running Windows NT 4.0
• The software was an interrupt-driven program that established continuous communications between DUT and CTRL at 100 Mbps
• For SEL testing at BNL, no registers were monitored
• For proton testing at TRIUMF, only asynchronous mode was implemented
• For heavy ion testing at TAMU, both asynchronous and isochronous modes were implemented

Software Flow for Asynchronous Mode

CTRLR
• Setup: lock down memory, enable receive buffers, set node ID, turn on interrupts, set delay
• Wait for interrupt
• Determine test type (LLC or PHY)
• Form data request packet and send to DUT
• Response buffer: compare data, log errors, continue test loop

Packets exchanged
• Register data request packet (CTRLR to DUT)
• Register data response packet (DUT to CTRLR)

DUT
• Request buffer: determine test type requested, poll LLC or PHY registers, form data response packet
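The compare-and-log part of the asynchronous test loop can be sketched as follows. This is a minimal illustration, not the authors' C++/WinDriver code: the packet exchange is replaced by a plain function call, and the register names, values, and the simulated upset are all hypothetical.

```python
def compare_registers(golden, response):
    """Return (name, expected, observed) for every register that differs."""
    return [(name, want, response.get(name))
            for name, want in golden.items()
            if response.get(name) != want]


def run_test_loop(poll_dut, golden, iterations):
    """Poll the DUT repeatedly and log every mismatch with its loop index."""
    log = []
    for i in range(iterations):
        response = poll_dut(i)  # stands in for the request/response packet exchange
        for name, want, got in compare_registers(golden, response):
            log.append((i, name, want, got))
    return log


# Hypothetical register names and values, for illustration only.
GOLDEN = {"reg_a": 0x00000003, "reg_b": 0x00020000}


def simulated_dut(i):
    """Fake DUT: returns the golden values except for one upset on pass 2."""
    regs = dict(GOLDEN)
    if i == 2:
        regs["reg_a"] ^= 1 << 26
    return regs


for entry in run_test_loop(simulated_dut, GOLDEN, iterations=4):
    print("pass %d: %s expected 0x%08X, read 0x%08X" % entry)
```

Logging the loop index with each mismatch makes it possible to tell a transient upset, which disappears on the next pass, from one that persists.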

Software Flow for Isochronous Mode

CTRLR
• Setup: lock down memory, enable ARRS for bus reset packets, turn on isochronous receive buffer, turn on interrupts
• Form register data solicit packet
• Receive buffer: compare register values, log errors, continue loop

Packets exchanged
• Register data solicit stream packet (CTRLR to DUT)
• Register data stream packet (DUT to CTRLR)

DUT
• Setup: lock down memory, enable ARRS for bus reset packets, turn on isochronous receive buffer, turn on interrupts
• Wait for interrupt
• Receive buffer: poll LLC registers, build data packet

Two Main Types of Errors Observed

Soft errors
• Bit flips logged by software, occurring in registers, FIFOs, or data, that did not disrupt communications between the DUT and CTRLR during the test run

Hard errors, or SEFIs
• Errors occurring in registers that halted communications between the DUT and CTRLR during the test run
• Errors of this type required a series of software and/or operator steps to recover communications
• SEFIs were further classified by the steps taken to re-establish communications

Example of a Soft Error
Asynchronous Request Filter Low Register on the LLC

This register enables reception of asynchronous request packets on a per-node basis (it handles the lower node IDs). When an asynchronous request packet is received, the source node ID is examined. If the bit corresponding to the node ID is not set in this register, the packet is not acknowledged and the request is not queued.

In this example, the register is set up so that only asynchronous request packets from nodes 0 and 1 are accepted: bits 1 and 0 are set to 1, and bits 31 through 2 are 0.

If bit 26 transitions to a 1, asynchronous request packets from node 26 would incorrectly be accepted.
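The soft-error example can be reproduced with a few lines of bit arithmetic. The sketch below XORs the expected and observed register values to find the flipped bit, then lists which node IDs the upset register would accept. The function names are hypothetical; the register values follow the example (nodes 0 and 1 enabled, bit 26 upset).

```python
def flipped_bits(expected, observed, width=32):
    """Return the bit positions at which two register values differ."""
    diff = (expected ^ observed) & ((1 << width) - 1)
    return [bit for bit in range(width) if (diff >> bit) & 1]


def accepted_nodes(filter_lo):
    """Return the node IDs whose request packets the filter register accepts."""
    return [node for node in range(32) if (filter_lo >> node) & 1]


expected = 0x00000003             # nodes 0 and 1 enabled
observed = expected | (1 << 26)   # single-event upset sets bit 26

print(flipped_bits(expected, observed))
print(accepted_nodes(observed))
```

Because the link keeps running with the wrong filter value, an upset like this shows up only in the register comparison, which is why it is counted as a soft error.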

Example of a Hard Error (SEFI)
Host Controller Control Register on the LLC

This register provides flags for controlling the TSB12LV26.

[Figure: bit map of the 32-bit register (bits 31 to 0) with reserved fields marked]

Bit 17 is the Link Enable bit. This bit is set to 1 when the system is ready to begin operation. If an upset cleared it to 0, the TSB12LV26 would be logically and immediately disconnected from the 1394 bus. No packets would be received or transmitted, and communications between the CTRLR and DUT would be halted.
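A check for this failure mode reduces to masking bit 17 of the register value. The sketch below is illustrative only: the names `LINK_ENABLE`, `link_enabled`, and `link_lost` are hypothetical, and the sample register values are made up apart from the position of the Link Enable bit given in the example.

```python
LINK_ENABLE = 1 << 17  # Link Enable bit position, as stated in the example


def link_enabled(hc_control):
    """True if the Link Enable bit is set in the register value."""
    return bool(hc_control & LINK_ENABLE)


def link_lost(before, after):
    """True if an upset cleared Link Enable between two register reads."""
    return link_enabled(before) and not link_enabled(after)


before = 0x00020000            # made-up value with only Link Enable set
after = before & ~LINK_ENABLE  # upset clears bit 17

print(link_enabled(before), link_enabled(after), link_lost(before, after))
```

In practice the CTRLR could not read this register over the bus after such an upset, since the DUT is disconnected; the loss of communications itself is the observable symptom.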

SEFIs Categorized By Steps Required to Start Communications

Step  Action
1     SEU test loop is restarted on the CTRLR, i.e., a packet is sent to the DUT requesting register information.
2     Software bus reset. Force CTRLR to be root, initiate bus reset in the PHY, reset node on LLC. Restore registers and flush FIFOs. Set bus Ops, IRMC, CMC, ISC, configuration ROM, enable transmit and receive. Implies step 1.
3     Reload software application. This refreshes the lockdown memory region shared by hardware and software. Implies steps 2, 1.
4     Able to verify that the CTRLR is sending register data solicit packets to the DUT, and that the DUT receives the packets and sends a data response packet, but the CTRLR cannot see the response packet from the DUT. Power cycle the CTRLR. Implies steps 3, 2, 1.
5     Disconnect/reconnect the 1394 cable. This causes a hard bus reset and the tree ID process.
6     Step 5, followed by steps 3, 2, 1.
7     Step 6, followed by cold rebooting the DUT, followed by steps 3, 2, 1.
8     Cold reboot DUT, followed by steps 3, 2, 1.
9     Step 5, followed by step 8.
10    Reboot CTRLR, followed by steps 3, 2, 1.
11    Reboot both CTRLR and DUT PCs, followed by steps 3, 2, 1.
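Because most recovery steps are defined in terms of earlier ones, it can help to expand a step number into the flat sequence of actions it implies. The sketch below encodes the step table as a dictionary and expands it recursively. The encoding and the shortened action names are an illustrative reading of the table, not part of the original test software.

```python
# Each step is a list of primitive actions (strings) and references to other steps (ints).
STEPS = {
    1: ["restart test loop on CTRLR"],
    2: ["software bus reset", 1],
    3: ["reload software application", 2],
    4: ["power cycle CTRLR", 3],
    5: ["disconnect/reconnect 1394 cable"],
    6: [5, 3],
    7: [6, "cold reboot DUT", 3],
    8: ["cold reboot DUT", 3],
    9: [5, 8],
    10: ["reboot CTRLR", 3],
    11: ["reboot CTRLR and DUT PCs", 3],
}


def expand(step):
    """Flatten a recovery step into its ordered list of primitive actions."""
    actions = []
    for item in STEPS[step]:
        if isinstance(item, int):
            actions.extend(expand(item))  # nested step reference
        else:
            actions.append(item)
    return actions


for number in (3, 9):
    print(number, expand(number))
```

Expanding the steps this way makes the relative cost of each SEFI category visible: step 1 is a single action, while step 7 repeats the reload, bus reset, and restart sequence twice.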

Results - LLC Running Asynchronous Mode
ERRORS IN LLC RUNNING ASYNCHRONOUS MODE

Effective LET columns (MeV·cm²/mg): 3, 4.2, 8.39, 11.9, 27.7, 39.2, 51.6, 59.6, 73

"Soft" errors
1   No errors observed; current jumped from 18 mA to 44 mA
2   Register error, self corrected, no change in current
3   Register error, self corrected; current jumped from 18 mA to 44 mA

"Hard" errors
4   Restart communications from CTRLR
5   Software bus reset; current jumped from 18 mA to 44 mA
6   Reset CTRLR and/or DUT software
7   Software bus reset and reset software on DUT and CTRLR
8   CTRLR sends packet, does not listen; cold reboot CTRLR
9   Disconnect/reconnect cable (hard bus reset)
10  Disconnect/reconnect cable, reload bus DUT software
11  Reset cable and cold reboot DUT
12  Cold reboot DUT after lockup, but no change in current
13  Cold reboot DUT after lockup; current jumped from 18 mA to 44 mA
14  Reset cable, reboot DUT and software, delta I = 0
15  Reset cable, reboot DUT and software; current jumped from 18 mA to 44 mA
16  Reboot CTRLR, reload software on bus, DUT and CTRLR
17  Reboot both computers, reset all software

Table entries in extraction order:
Header row: 0 0 0 0 0 x x x
Soft errors: 1.3E-4 0 0 0 1.0E-5 4.6E-5 2.5E-5 8.8E-5 3.1E-4 0 0 0 2.4E-4 1.3E-4 0 0
Hard errors: 0 0 0 0 0 0 0 0 8.3E-7 0 0 6.8E-6 0 0 0 2.6E-5 0 0 4.3E-6 8.3E-7 2.3E-6 0 4.3E-6 4.2E-7 0 1.3E-5 0 0 0 2.3E-6 0 0 0 0 6.8E-6 0 0 0 0 8.3E-7 2.2E-6 1.7E-6 0 0 0 4.3E-6 0 2.3E-6 0 0 4.5E-6 2.6E-5 1.4E-5 4.5E-6 0 0 0 0 0 1.4E-5 0 0 0 6.8E-6 0 0 0 0 5.7E-5 0 0 0 x x x x

Results - PHY Running Asynchronous Mode
ERRORS IN PHY RUNNING ASYNCHRONOUS MODE

Effective LET columns (MeV·cm²/mg): 3, 4.2, 8.39, 11.9, 27.7, 39.2, 51.6, 59.6, 73

"Soft" errors
1   No errors observed; current jumped from 18 mA to 44 mA
2   Register error, self corrected, no change in current
3   Register error, self corrected; current jumped from 18 mA to 44 mA

"Hard" errors
4   Restart communications from CTRLR
5   Software bus reset; current jumped from 18 mA to 44 mA
6   Reset CTRLR and/or DUT software
7   Software bus reset and reset software on DUT and CTRLR
8   CTRLR sends packet, does not listen; cold reboot CTRLR
9   Disconnect/reconnect cable (hard bus reset)
10  Disconnect/reconnect cable, reload bus DUT software
11  Reset cable and cold reboot DUT
12  Cold reboot DUT after lockup, but no change in current
13  Cold reboot DUT after lockup; current jumped from 18 mA to 44 mA
14  Reset cable, reboot DUT and software, delta I = 0
15  Reset cable, reboot DUT and software; current jumped from 18 mA to 44 mA
16  Reboot CTRLR, reload software on bus, DUT and CTRLR
17  Reboot both computers, reset all software

Table entries in extraction order:
Header row: 0 0 0 x x x x x 0 0 0 0 x x x x x x x x x
Soft errors: 0 0 0 0 0 0 9.1E-8 0 0
Hard errors: 0 0 0 0 0 8.3E-7 3.3E-6 0 0 1.0E-4 9.1E-6 0 0 0 0 2.5E-6 3.6E-5 6.4E-5 0 0 0 2.0E-4 0 0 0 0 0 2.0E-4 2.6E-4

Results - LLC Irradiated Hard Errors

Results - LLC Irradiated Soft Errors

Results - PHY Irradiated Hard Errors

Conclusions
• NSC part exhibited destructive latchup at LET = 27 MeV·cm²/mg
• TI part exhibited both SEUs (soft errors) and SEFIs (hard errors)
• At low LETs, the errors are mostly soft errors
• The presence of SEFIs that require rebooting the system makes this part problematic for space usage
  – power cycling may be required
• An improved test would involve:
  – automatic reboot
  – another device

References
• Anderson, Don and MindShare, Inc. FireWire System Architecture. Addison-Wesley: Reading, Massachusetts, 1999.
• 1394 Open Host Controller Interface Specification. Release 1.1, January 2000.
• S. Buchner, et al. Radiation Testing of the 1394 FireWire. Presentation, SEU Symposium, Los Angeles, April 2002.

Sponsors
• NEPP
• NRL/NPOES

Special thanks to Kent Larson and Mike Worcester of Boeing