  • Number of slides: 51

EDA Court: Hierarchical Construction and Timing Sign-off of SoCs
TAU 2013 Panel

The good side of hierarchy
[Figure: hierarchy tree with fan-out k at each level: Chip (h=0), Chiplet (h=1), Core (h=2), Unit (h=3), Macro (h=4)]

Impact of pruning
[Figure: fraction of chip at top versus unpruned fraction, one curve per hierarchy depth h=0 to h=5]
Sweet spot:
• 50 B objects
• 2 M per macro
• Fraction at top = 4e-5
• 4 levels of hierarchy
• Pruning = 93%
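
The sweet-spot figures can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it reads "fraction at top" as objects per macro divided by total objects (which reproduces the quoted 4e-5), and it assumes a uniform fan-out k with macros sitting four levels below the chip. The function names are hypothetical, and the slide's pruning model is not reconstructed here.

```python
def macro_count(total_objects: int, objects_per_macro: int) -> int:
    """Number of macro-sized partitions needed to cover the whole chip."""
    return total_objects // objects_per_macro


def fraction_at_top(total_objects: int, objects_per_macro: int) -> float:
    """Share of the chip represented by one macro-sized view (assumed reading)."""
    return objects_per_macro / total_objects


def uniform_fanout(leaf_count: int, levels: int) -> float:
    """Fan-out k such that k ** levels == leaf_count (assumes a balanced tree)."""
    return leaf_count ** (1.0 / levels)


TOTAL_OBJECTS = 50_000_000_000   # 50 B objects, from the slide
PER_MACRO = 2_000_000            # 2 M objects per macro, from the slide
LEVELS = 4                       # 4 levels of hierarchy, from the slide

macros = macro_count(TOTAL_OBJECTS, PER_MACRO)
top = fraction_at_top(TOTAL_OBJECTS, PER_MACRO)
k = uniform_fanout(macros, LEVELS)

print(f"macros on the chip : {macros}")
print(f"fraction at top    : {top:.0e}")
print(f"uniform fan-out k  : {k:.1f}")
```

Under these assumptions the chip holds 25,000 macros and each level fans out to roughly a dozen children.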

The bad side of hierarchy
• Accuracy? Pessimism?
  – Coupling noise? Functional noise?
  – Multiple interacting clocks?
  – Parasitics on boundary nets?
• Is “context” required? If so, we cannot “shelve and re-use” macros
• Construction flow?
• Draconian methodology restrictions?

Chandu Visweswariah, Distinguished Engineer, IBM, East Fishkill, NY, chandu@us.ibm.com
Oleg Levitsky, Solutions Architect, Cadence, San Jose, CA, oleg@cadence.com
Qiuyang Wu, Senior Staff Engineer, Synopsys, Hillsboro, OR, qwu@synopsys.com
Amit Shaligram, Principal Engineer, STMicroelectronics, Scottsdale, AZ, amit.shaligram@st.com
Guntram Wolski, Principal Engineer, Cisco, San Jose, CA, gwolski@cisco.com
Larry Brown, Design Center Engineer, IBM, San Jose, CA, lmbrown@us.ibm.com
Alex Rubin, Senior Engineer, IBM, San Jose, CA, rubin1@us.ibm.com
Alexander Skourikhin, EDA Engineer, Intel, Haifa, Israel, alexander.skourikhin@intel.com
Igor Keller, Senior Architect, Cadence, San Jose, CA, ikeller@cadence.com

Panel plan
• Charge 1 (10 min): Hierarchical implementation and hence hierarchical timing sign-off don’t have a future
  Plaintiff: Oleg Levitsky, Cadence
  Defendant: Qiuyang Wu, Synopsys
• Charge 2 (10 min): EDA tools and flows are inadequate for a construction flow: budgeting, IP models and hierarchical constraint development are lacking
  Plaintiff: Amit Shaligram, STMicro
  Defendant: Alex Rubin, IBM
• Charge 3 (10 min): You can never really close out-of-context + Misdemeanor charge: too much additional complexity and software
  Plaintiff: Guntram Wolski, Cisco
  Defendant: Alexander Skourikhin, Intel
• Charge 4 (10 min): Hierarchical timing cannot handle multiple interacting synchronous clocks
  Plaintiff: Larry Brown, IBM
  Defendant: Igor Keller, Cadence
• Discussion and audience questions (30 min)
• Verdicts and “damages” (5 min)

Charge 1: Hierarchical implementation and hence hierarchical timing sign-off don’t have a future
Plaintiff: Oleg Levitsky, Cadence
Defendant: Qiuyang Wu, Synopsys

Evolution of design flow
[Figure: flow stages Prototype, Implement, Sign Off]

Evolution of design flow
[Figure: flow stages Prototype, Implement, Sign Off; block set Blk 1, Blk 2, …, Blk n]

Evolution of design flow
[Figure: flow stages Prototype, Implement, Sign Off; block sets Blk 1, Blk 2, …, Blk n]
Quiz: Why hierarchical flow?
• Create more work for managers
• Contribute to real estate bubble
• Control time to market schedule?

Hierarchical design flow
[Figure: flow stages Prototype, Implement, Sign Off; block sets Blk 1, Blk 2, …, Blk n; labels: Complexity, Hierarchical scalability]

Hierarchical design flow
[Figure: flow stages Prototype, Implement, Sign Off; block sets Blk 1, Blk 2, …, Blk n; iterations Step 1, Step 2, …, Step n, tapeout]
Flow convergence is key

Hierarchical design flow
[Figure: flow stages Prototype, Implement, Sign Off; block sets Blk 1, Blk 2, …, Blk n; label: Convergence]
• Technical challenges:
  – SI
  – Over-the-block routing
  – Useful skew distribution
  – CPPR modeling
  – Power budgeting
  – Channelless designs
  – …
• Human factor:
  – Level of expertise
  – Human error
  – Lack of sleep

Hierarchical design flow
[Figure: flow stages Prototype, Implement, Sign Off; block sets Blk 1, Blk 2, …, Blk n; labels: Complexity, Convergence, Hierarchical scalability]
Failed to control TTM

What is the alternative?

Charge 1: Hierarchical implementation and hence hierarchical timing sign-off don’t have a future
Plaintiff: Oleg Levitsky, Cadence
Defendant: Qiuyang Wu, Synopsys

Hierarchical Design and Timing Closure is the Only Way to Have a Future
Qiuyang Wu, Sr. Staff Engineer, Synopsys Inc.
March 2013
© Synopsys 2013

Hierarchical Implementation is Proven
• Way back when in the last century
  – Designs grew beyond the reach of flat implementation
  – Established hierarchical methodologies, tried and true
• The success will continue because
  – naturally an iterative and gradual refinement process
  – relatively larger error margins and tolerances for tradeoff
  – more about reuse and integration, less about from scratch
  – …
[Figure labels: +1 M Gates, +100 M Gates]

But, “Classic” Hierarchical Timing is Inadequate for Signoff
• Gap #1 - Burden is on the users: “Garbage in, garbage out”
  – Block designers do not have quality constraints
    Can’t close block timing with confidence: pessimism, optimism
    Can’t create quality models: pessimism, optimism
• Gap #2 - Language limitations: critical details can’t be elaborated
  – Chip level designers do not have means to express design intention
    Can’t describe I/O timing context accurately and completely
    Can’t cover different reuse scenarios
[Figure labels: chip netlist, chip parasitics, TOP, Inst, full chip golden constraints, Hier STA, Flat STA (golden), block netlist, parasitics, block constraints (ad-hoc), block ILM, ETM, glass-box, black-box, …]
The rescue: flat signoff.
However, hierarchical signoff is the only way to stay on top of the technology curve.

And Here is How to do Hierarchical Signoff
• The Recipe on Top of Signoff Quality Engine
  • Provide hierarchical constraint management
    – Check and highlight inconsistencies
  • Provide context feedback and allow refinement
    – Produce accurate and elaborate timing environment
  • Provide Ease-of-Use through data / flow automation
    – Minimize/prevent user errors by construction
• The Benefits Go Beyond Signoff
  – Design faster: throughput and interoperation with implementation
  – Design better: accuracy enables further optimization for power, leakage, robustness, area, etc.

Charge 2: EDA tools/flows are inadequate for a construction flow: budgeting, IP models, hierarchical constraint development are lacking
Plaintiff: Amit Shaligram, STMicroelectronics
Defendant: Alex Rubin, IBM

Hierarchical Constraints & Budgeting
Amit Shaligram, Principal Engineer, STMicroelectronics

Models – Accuracy, speed and compatibility
• Which model to use?
  • ETM or .lib – Reasonable for use before clock tree.
  • ILM – Required after clock tree insertion
• Model accuracy
  • Different modes at block and top level, block/top constraint mismatches
  • Handling of high fanout and static nets
• Model compatibility
  • Models between different vendors/tools are not compatible.
  • Some tools create “physical ILMs”, others only “timing ILMs”
• It takes time..
  • For a ~2 M instance block: 1 scenario (1 mode/1 corner), it takes ~6-8 hours
  • Quickly becomes impractical with 25 blocks, ~5 modes and ~16 corners
  • Can someone create models on the fly? Just use the DEF!
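
To put a number on "quickly becomes impractical", the sketch below multiplies out the figures quoted on the slide. It assumes, purely for illustration, that every block costs as much as the ~2 M instance example and that each (mode, corner) pair is a separate scenario; the 100-machine farm and the function name are hypothetical and not from the slide.

```python
def model_build_hours(blocks: int, modes: int, corners: int,
                      hours_per_scenario: float) -> float:
    """Total machine-hours if every block needs one model per (mode, corner)."""
    scenarios = blocks * modes * corners
    return scenarios * hours_per_scenario


BLOCKS, MODES, CORNERS = 25, 5, 16      # figures from the slide
FARM_MACHINES = 100                     # hypothetical farm size

low = model_build_hours(BLOCKS, MODES, CORNERS, 6)    # optimistic 6 h/scenario
high = model_build_hours(BLOCKS, MODES, CORNERS, 8)   # pessimistic 8 h/scenario

print(f"scenarios          : {BLOCKS * MODES * CORNERS}")
print(f"machine-hours      : {low:.0f} to {high:.0f}")
print(f"days on {FARM_MACHINES} machines: "
      f"{low / FARM_MACHINES / 24:.1f} to {high / FARM_MACHINES / 24:.1f}")
```

That is 2,000 model builds and 12,000 to 16,000 machine-hours per full refresh, which is several days of wall-clock time even on a sizeable farm.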

Budgeting
• Floorplan and constraints – a chicken and egg problem!
  • Estimation of feedthru delays can be challenging.
  • Consider crosstalk effect!
• Best practices not easy to follow all the time (FF at the boundary)
  • Critical path from a macro, legacy design, cannot tolerate extra latency
  • Managing hold violations with FF at the boundary
  • Uncommon clock path creates hold violations due to OCV impact.
• SDC format limitations after clock tree insertion
  • Input/Output delay is specified with respect to virtual clock
  • Latency of virtual clock changes with every step of the flow (postCTS, postRouteSI)

Hierarchical Constraints
• Top-down or bottom-up constraints development flow?
  • How to ensure that block and top constraints are aligned?
• Constraint modifications required when using .lib or ILMs in top level
  • Generated clock definitions inside blocks create “new internal” clocks/pins
  • Handling large constraint files created within ILM generation flow(s)
• Boundary conditions for hold?
  • How to estimate set_min_delay accurately?
  • Crosstalk effects of top level clock tree
• How much margin is too much margin inside the blocks?
  • Using infinite timing windows inside the blocks is overkill

Charge 2: EDA tools/flows are inadequate for a construction flow: budgeting, IP models, hierarchical constraint development are lacking
Plaintiff: Amit Shaligram, STMicroelectronics
Defendant: Alex Rubin, IBM

Living in a flat world?
March 27, 2013

Long list of charges that simply don’t stick…
• Many teams have used hierarchy successfully to tape out designs!
  – Large problems require the use of “divide and conquer”.
• Vast amount of design experience, understanding and overcoming practical challenges.
• Tools help establish hand-shake across hierarchical levels.
  – Verification of boundary conditions and assumptions.
  – Automatic constraint generation and management.
  – Enforcement of best design practices.
• Significant body of “do’s and don’ts” to help provide guidance, improve efficiency and reduce pessimism.

Follow best hierarchical design practices
Simple rules can make hierarchy easy(er)!
• Isolate output loading from internal paths!
• Flop bound the design!
• Avoid critical paths crossing boundaries!
• Use single macro clock input!
[Figure: Macro A with Flop 1 and Macro B with Flop 2, each flop showing D, Q and CLK pins]

Hierarchy is a “must have”!
[Figure: run time (hours) versus object count per unit, for deterministic timing and statistical timing; callouts: 44 M Objects!, 5X Speedup, 10+ days]
• Parallelizes timing and optimization of independent paths to improve overall efficiency.
• Better supports timing closure when different macros / top level are at different “stages” of completeness.
• Fosters un-interrupted design fix-up loop.
• More resilient to failure.

Charge 3: You can never really close out-of-context + Misdemeanor charge: too much additional complexity and software
Plaintiff: Guntram Wolski, Cisco
Defendant: Alexander Skourikhin, Intel

Hierarchical Timing Felonies or Misdemeanors?
Guntram Wolski – Cisco Systems
Principal Engineer
Enterprise Networking Group

• You can come close, but that only counts in ….. Or if you start worst casing things, you’ve overdesigned…
• You can set goals/targets for blocks, but then reality sets in. You end up opening the block as it is the “right thing to do” in order to close.
• Multiple instances of same core
  – How do you wire over/through the cores?
  – Wiring bays – what if you don’t have enough in some areas?
  – Wire over the top == create new extraction/unique timing problems.
  – Noise issues
  – Every instance doesn’t have same IR drop/noise profile

• Requires strict PD requirements to be effective
  – Very strict methodology to be effective
  – Need flopped boundaries
  – Long distance routes/fly overs need extra handling or pushed down
  – Legacy designs/IP integration cause immediate loss of benefit
  – Integration/Adopt complexity seems more so than with other tools
  – Logic designers have very little interest in helping PD
    It’s good enough, live with it.
    I’m not paid to improve your problems, I just meet timing.
    I have to work on something else, you have to fix it.
• Are we leaving performance on table?
  – Subchips need to be designed to guardbanded conditions on I/Os and IR drop

• Why are we not looking at taking advantage of parallelism?
  – Are these not many individual paths?
  – If DRC can run on 120 cpus and benefit, why can’t timing?
  – Break up the problem and distribute to my farm….

Charge 3: You can never really close out-of-context + Misdemeanor charge: too much additional complexity and software
Plaintiff: Guntram Wolski, Cisco
Defendant: Alexander Skourikhin, Intel

Defense
• Timing closure is an iterative process
  • Controllability is the key for success
  • Start from initial spec
  • Once design is getting mature, gradually refine environmental requirements and increase model accuracy
  • Finally, you see the “real” timing requirements, avoiding overdesign
• Non-overdesigned multi-instantiated blocks are reality
  • Must see all the requirements (timing, parasitics) w/o worst casing
  • Clocks handling is the real challenge
  • Noise is never an issue (at most – make worst case between instances)
• Reusable IPs are feasible
  • Have to use accurate block models (adjustable to a new env.)
  • Have to apply design restrictions on interfaces

Defense (cont.)
• Have to apply methodological restrictions to block interfaces
  • Driver size, wire length, ports, etc.
  • All of them are manageable and ease integration on top level
  • Doesn’t necessarily lead to overdesign, due to accurate block models
  • Applicable to both flop and latch based designs
• Timing analysis is highly parallelizable
  • Individual block analysis is naturally done in parallel
  • Top level analysis might
    • leverage multi-threading technologies in STA algorithms
    • be divided in clusters and every cluster is analyzed in parallel
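
The claim that independent blocks can be analyzed in parallel can be pictured with a toy dispatcher. This is only a sketch: `analyze_block` is a hypothetical stand-in for a real per-block STA run, the block names and slack values are made up, and a thread pool stands in for a compute farm.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def analyze_block(item: Tuple[str, List[float]]) -> Tuple[str, float]:
    """Placeholder for a per-block timing run: report the worst path slack."""
    name, path_slacks = item
    return name, min(path_slacks)


def analyze_blocks_in_parallel(blocks: Dict[str, List[float]],
                               workers: int = 4) -> Dict[str, float]:
    """Blocks share no state here, so each one can go to its own worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(analyze_block, blocks.items()))


# Made-up path slacks in ns, one list per block.
example_blocks = {
    "blk1": [0.12, 0.03, -0.02],
    "blk2": [0.40, 0.25],
    "blk3": [0.07, 0.09, 0.01],
}

worst = analyze_blocks_in_parallel(example_blocks)
for name, slack in worst.items():
    print(f"{name}: worst slack {slack:+.2f} ns")
```

The same pattern would apply to top-level clusters, provided the clusters are cut so that no path crosses from one into another.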

Summary
• Efficient and Reliable Hierarchical Flow requires two essential factors:
  • A robust project methodology, which
    • Enforces design restrictions
    • Takes advantage of IP Reuse
    • Provides continuous timing picture throughout all project phases
    • Allows productive ECO work
  • Advanced EDA tools, which
    • Are flexible and allow controllability between accuracy and simplicity
    • Can efficiently handle Multi-X environments (X=system, corner, clocks, etc.)
    • Utilize parallel computing techniques
    • Support batch and ECO modes

Charge 4: Hierarchical timing cannot handle multiple interacting synchronous clocks
Plaintiff: Larry Brown, IBM
Defendant: Igor Keller, Cadence

Hierarchical timing cannot handle multiple interacting synchronous clocks
• Define the problem:

Definition continued
• If clk1X is later than clk2X, we reduce our setup margin.
• If clk1X is earlier than clk2X, we reduce our hold margin.
  – We don’t know the real relationship between the two clocks until we have our top level established.
• This makes it difficult to close timing on the logic macro and “put it on the shelf.”
• The problem is magnified if the logic macro is re-used.
  – In that case, the setup and hold margins of the logic macro must span all existing clk1X-clk2X relationships.
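
The setup and hold statements above can be written out as slack formulas. The sketch below assumes, since the defining figure is not preserved, that the path is launched by clk1X and captured by clk2X, and that skew means clk1X arrival minus clk2X arrival. All names and numbers are illustrative.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class MacroPath:
    d_max: float    # slowest data delay through the path, ns
    d_min: float    # fastest data delay through the path, ns
    t_setup: float  # setup time of the capturing flop, ns
    t_hold: float   # hold time of the capturing flop, ns


def setup_slack(path: MacroPath, period: float, skew: float) -> float:
    """A later clk1X (positive skew) launches data later and eats setup margin."""
    return period - skew - path.d_max - path.t_setup


def hold_slack(path: MacroPath, skew: float) -> float:
    """An earlier clk1X (negative skew) launches data earlier and eats hold margin."""
    return skew + path.d_min - path.t_hold


def worst_slacks(path: MacroPath, period: float,
                 skews: Iterable[float]) -> Tuple[float, float]:
    """A re-used macro must meet timing for every clk1X-clk2X skew it will see."""
    skews = list(skews)
    return (min(setup_slack(path, period, s) for s in skews),
            min(hold_slack(path, s) for s in skews))


path = MacroPath(d_max=0.80, d_min=0.10, t_setup=0.05, t_hold=0.03)
instance_skews = [-0.05, 0.0, 0.08]   # hypothetical skew at three instances

worst_setup, worst_hold = worst_slacks(path, period=1.0, skews=instance_skews)
print(f"worst setup slack: {worst_setup:.2f} ns")
print(f"worst hold slack : {worst_hold:.2f} ns")
```

The most positive skew sets the setup margin and the most negative skew sets the hold margin, so a wider spread of instance contexts squeezes the macro from both sides.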

Fixes from timing methodology
• Option 1: Assert an uncertainty between clk1X and clk2X in macro timing, and validate this uncertainty when running top level timing.
  – Problem with this:
    • Leave performance/area on the table by lowering cycle time and/or over-padding hold fails.
    • If top level can’t meet this requirement, we must open up logic macro for further work.
• Option 2: ???
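
Option 1 amounts to a contract between macro and top level: the macro is closed against an asserted skew bound, and the top level later checks each instance against it. A minimal sketch of that check follows, assuming the uncertainty is a symmetric bound on the clk1X-clk2X skew; the instance names and values are hypothetical.

```python
from typing import Dict


def violating_instances(asserted: float,
                        observed_skews: Dict[str, float]) -> Dict[str, float]:
    """Instances whose top-level skew exceeds the bound the macro was closed with."""
    return {inst: s for inst, s in observed_skews.items() if abs(s) > asserted}


def unused_margin(asserted: float, observed_skews: Dict[str, float]) -> float:
    """How much of the asserted bound no instance needs (negative if exceeded)."""
    return asserted - max(abs(s) for s in observed_skews.values())


ASSERTED_UNCERTAINTY = 0.06            # ns, bound used when timing the macro
top_level_skews = {                    # ns, measured once the top level exists
    "u_core0": 0.02,
    "u_core1": -0.05,
    "u_core2": 0.09,
}

bad = violating_instances(ASSERTED_UNCERTAINTY, top_level_skews)
print("instances needing macro rework:", sorted(bad))
print(f"unused margin: {unused_margin(ASSERTED_UNCERTAINTY, top_level_skews):+.2f} ns")
```

Both drawbacks listed on the slide show up here: a positive unused margin is performance or area left on the table, and any violating instance means the macro has to be reopened.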

The best solution: Fix the design
Update the design so we do not have multiple synchronous clock inputs in the first place.

Conclusion
Perhaps it’s more accurate to say that hierarchical timing can handle multiple synchronous clock inputs, but cannot do this without leaving performance and/or area on the table. In other words, it does not lead to the most efficient design.

Charge 4: Hierarchical timing cannot handle multiple interacting synchronous clocks
Plaintiff: Larry Brown, IBM
Defendant: Igor Keller, Cadence

Defense:
• First and foremost, defendant pleads not guilty
• The charge from plaintiff only means that there is no free lunch
• For Hierarchical Timing to work, designers must follow certain rules
  – They are well described in Alex Rubin’s defense
  – Specifically, one should have a single clock pin in a block to avoid extra pessimism in hold/setup timing
• In the case of multiple clock pins, plaintiff himself exonerated the defendant by proposing a solution: it is possible to remove some of the pessimism by describing the relationship between the two clocks

Defense (cont.)
• Advanced SI analysis today reduces pessimism if victim and aggressor share the same clock
• SI analysis also becomes more problematic with multiple clock pins
• With multiple clock pins one assumes the clocks are different, leading to
  – Pessimism if uncertainty is assigned to both pins
  – Optimism if no uncertainty is assigned
• As often is true, the best way to resolve a problem is to avoid creating it: stick to rules of hierarchy-friendly design methodology

Ways to Remove the Limitation
[Figure label: CLK]
• There are ways to define the relationship between two internal clocks:
  – Through parent external clock
  – Explicitly define ranges of skews
• Parameterization of timing models with skew on two clocks is possible
• These enhancements are feasible but need to be driven by real commercial interest

Q & A
Verdicts
Damages!!!