Designing a Chip: Challenges, Trends, and Latin America Opportunity. Victor Grimblatt, R&D Group Director. SASE 2012

Designing a Chip SASE 2012

DESCRIPTION

This book covers chip design and SoC design methodology.

Synopsys 2012

Designing a Chip: Challenges, Trends, and Latin America Opportunity
Victor Grimblatt, R&D Group Director
SASE 2012

Agenda
  • Introduction
  • The Evolution of Synthesis
  • SoC
  • IC Design Methodology
  • New Techniques and Challenges
  • IP Market, an opportunity for Latin America

Introduction

Interesting Facts from Cisco
  • Last year's mobile data traffic was eight times the size of the entire global Internet in 2000
  • Global mobile data traffic grew 2.3-fold in 2011, more than doubling for the 4th year in a row
  • Mobile video traffic exceeded 50% for the first time in 2011
  • Average smartphone usage nearly tripled in 2011
  • In 2011, a 4th generation (4G) connection generated 28x more traffic on average than a non-4G connection
Source: Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2011-2016, Feb 14, 2012

A Decade of Digital Universe Growth
[Chart: digital universe size for 2005, 2010, and 2015; labeled values: 130 exabytes, 1.2 zettabytes, 7.910 zettabytes]
Bandwidth Increase Drives Exploding Need for Bandwidth and Storage

  • One zettabyte = stacks of books from Earth to Pluto 20 times (72 billion miles)
  • If an 11 oz. cup of coffee equals 1 gigabyte, then 1 zettabyte would have the same volume as the Great Wall of China
Source: IBS and Cisco

Tomorrow's World
  • Reality, Augmented Reality, Blended Reality
  • Search, Agents, Info That Finds You (and networks that know you)
  • 2D, 3D, Immersive Video, Holographics
  • Medical, Mobile Medical, Personal Medical
  • Person to Person, Machine to Machine, Human Machines

What the Future Has in Store

How Does This Affect Design?

Megatrends Change Design Requirements
    Used to Be          Today It's
    Computing           Connectivity
    Creating Info       Consuming Info
    Compute Power       Battery Power
    Business            Consumer
    At your desk        Anywhere, anytime
    Work                Entertainment

[Chart: percentages by process node (250nm, 180nm, 130nm, 90nm, 65/55nm, 45/40nm, 32/28nm, 22/20nm); surviving values: 3%, 5%, 6%, 5%, 13%, 20%, 31%, 13%, 4%]

[Chart: 2010 vs. 2011; surviving labels: "100M, 3%" and ">100M, 13%". Source: Synopsys Global User Survey, Feb 2012]

... and Faster Designs
[Chart: share of designs by clock frequency bin (50MHz, 51-100MHz, 101-200MHz, 201-300MHz, 301-400MHz, 401-500MHz, 501-750MHz, 751MHz-1GHz, 1-2GHz, >2GHz) for 2004 to 2011; callout: 42%. Source: Synopsys Global User Survey, Feb 2012, N = 962]

... while requiring aggressive Power Management
[Chart: power management techniques in use, 2010 vs. 2011. Source: Synopsys Global User Survey, Feb 2012, N = 282]
  • Clock gating
  • Multi-Vt leakage optimization
  • Multi-voltage domains
  • Multi-Corner, Multi-Mode (MCMM) optimization
  • Dynamic Voltage/Frequency Scaling (DVFS)
  • Lower Vdd operation
  • MTCMOS/Power gating
  • State retention
  • Low Vdd Standby
  • Library Variables (e.g., multi-channel length libraries)
  • Back-biasing/Well-biasing
  • Other

Design Challenges are Multiplying
Example of 28-nm challenges
  • Unidirectional Poly (and other RDRs)
      Requires separate layouts, verification & test effort. GF and TSMC have different preferred orientations (N/S v. E/W)
      No poly for local routing
  • Device segmentation
      Limited device sizes, large analog devices broken up into smaller pieces; increases analog area
  • Complexity
      Approximately 1700 design rule checks at 28nm vs. 700 at 65nm
      8x the # of corners at 65 v. 28nm
      Lower Vddmin resulting in less design headroom
      Metal resistance doubles from 40 nm to 28 nm
      Global versus local Vth variations due to random doping effects
  • Device Aging
      Must take into account device degradation over time due to threshold voltage instability (NBTI/PBTI) and mobility degradation (HCI)
[Figure: 40 nm layout vs. 28 nm analog layout; the 28 nm layout is 9% larger than 40 nm due to limitations on poly area]
28 nm is 2X harder than 40 nm
28 nm IP area increases without circuit innovation

Software
SoC = System on a chip
[Chart: HW & SW Development Costs, $M (0 to 2.50) by month (1 to 27). Categories: App-Specific SW, Low-Level SW, OS Support, Design Management, Post-silicon Validation, Masks, Physical Design, RTL Verification, RTL Development, Spec Development, IP Qualification. Source: IBS, Synopsys]
Software is Half the Time to Market For a Typical SoC!

And Half the Cost
[Chart: Hardware and Software cost ($M, 0 to 175) by Feature Dimension (Transistor Count): 90nm (60M), 65nm (90M), 45/40nm (130M), 32/28nm (180M), 22/20nm (240M). Source: IBS and Synopsys, 2011]

Unlike Moore, Software Guys are Pessimists
  • Page's Law (2009): Software gets twice as slow every 18 months.
  • Wirth's Law (1995): Software is getting slower more rapidly than hardware becomes faster.

What Can We Do About It?

The Evolution of Synthesis

Placement & Routing
Ronald L. Rivest, Charles M. Fiduccia, Robert M. Mattheyses, GE & MIT, 1982
Source: GE, 1986

Logic Synthesis
David Gregory, Karen Bartlett, Aart J. de Geus, Gary D. Hachtel, GE & University of Colorado at Boulder, 1986

Until Late 80s: The Implementation Flow Was Quite Straightforward
There Was Already a Wall
  • Front-End: Schematic Capture, Timing Simulation
  • Back-End: Place & Route, DRC/LVS

Early 90s: The Relationship Needs Improvements Badly
Walls Now Lead to Iterations, Often Out of Control
[Flow diagram. Stage labels: Front-End, Back-End, Sign-Off. Steps: RTL Simulation, Logic Synthesis, Place & Route, DRC/LVS, Delay Calculation, Timing Simulation]

Early 00s, 130nm, 7+ Metals, PC and Astro+Blast+SilEnsemble: The Relationship Matures
Still, Too Many Walls, and # of Iterations Too High
[Flow diagram. Stage labels: Front-End, Back-End, Sign-Off. Steps: RTL Simulation; Logic, Power & Test Synthesis; Floorplan; Physical Synthesis; Floorplan; P&R; Extraction & STA; DRC/LVS]

The Evolution Of The Relationship: Convergence!
    2003    90 Nanometers       Interoperability
    2005    65 Nanometers       Correlation
    2007    45/40 Nanometers    Look Ahead
    2009    32/28 Nanometers    In-Design

The Evolution Of The Relationship: Quick Summary
  • Late 80s - Early 90s. Attempt #1: Predict the future based on the past
      Wire load models, broken by nanometer wires
  • Mid 90s. Attempt #2: Predict the future based on the present
      Front-end floorplanning, broken by Frankenstein flows
  • Late 90s - Today. Attempt #3: Partner to create the future, rather than attempt to predict it
      Convergence of synthesis and place & route
      But underlying mathematics is different

Logic Synthesis And Place & Route
A Revolutionary Evolution: Convergence!
  • Logic Compiler, ca. 1986
  • Design Compiler, 2010.03
From Equations to Gates, to Placed and Routable Gates

SoC

What is High-Level Synthesis?
Automation using High-Level Synthesis: Designer Intent to HLS Results
  • User inputs:
      High-level algorithm
      Constraints
  • HLS outputs:
      Synthesizable RTL
      C-model
      RTL testbench
      Scripts for synthesis, verification and downstream tools
Design technology and methodology
  • Develop and verify hardware at a higher level of abstraction
      Much smaller code with fewer bugs introduced
      Rapid architecture exploration
  • Automate implementation and verification
      Automatic optimizations that equal hand-coded QoR
      Eliminate manual RTL coding & verification
Example benefits
  • 2-5X productivity for initial designs
  • 5-10X productivity for design re-use
  • Increased exploration leading to better results
  • Multi-million gate designs in weeks vs. months

High-Level Synthesis Advantage
  • Traditional Block Design: Algorithm Design, RTL Coding, Architecture Exploration, RTL Verification, Implementation
      Cycle by cycle functional debug
      For single architecture only
      Spreadsheets
  • HLS-based Block Design: Algorithm Design, High-Level Design, RTL Verification, Implementation
      Faster, more automatic model-to-RTL validation, reduced RTL-level debug
      Quickly evaluate multiple architectures
      RTL automatically generated
      Faster design at higher abstraction
Better Designs, Faster

Changing FPGA Design Methodology
  • Classic FPGA Methodology: Top Down Implementation
      Best Quality of Results
      May not be suitable for largest FPGA designs (long runtimes and large memory requirements)
  • Divide and Conquer: Top Down Incremental Implementation
      Reduced Quality of Results
      Shorter runtime: preserve unchanged parts
      Design Preservation, block based flows, and Incremental P&R with SmartGuide
  • Emerging Mix and Match: Bottom Up and Top Down Flow
      Distributed development
      Better design preservation and isolation
      Design style adjustments needed to achieve optimal timing Quality of Results (e.g. registering module boundaries)

Unified RTL Flow for FPGA and SOC
  • FPGA Synthesis: DW Implementation, Synplify Premier/Certify
  • ASIC Implementation: DW Implementation, Galaxy
  • Inputs: Your IP, DesignWare IP, DesignWare Building Blocks
Common RTL from prototype to production: a combination of IP and tools
All DW Building blocks, minPower and Macrocell Blocks are supported in Synplify Premier and Certify for FPGA-based prototyping

Today's SOC Designs
  • Designs are getting larger and larger.
  • Schedule stays the same or shorter despite the increases in design complexity.
  • Engineering resources are not increasing to handle this complexity.
How can EDA help manage this complexity?

Many Methods of Designing SOC Design
Similar Approach But End Results Vary
Instructions
  1. Preheat the oven to 450.
  2. Melt butter and chocolate together in the top of a double boiler or in the microwave. Add sea salt.
  3. Meanwhile, beat together the egg, egg yolks, and sugar with a whisk or an electric beater until light and slightly foamy.
  4. Add the egg mixture to the warm chocolate; whisk quickly to combine. Add flour and stir just to combine. The batter will be quite thick.
  5. Butter small ramekins, or use Reynolds foil cupcake liners.
  6. Divide the batter evenly among the ramekins. (You can make the cakes in advance to this point and chill them until you're ready to bake. Be sure to bring the batter back to room temperature before baking.)
  7. Baking time will depend on your oven; start with 7 minutes for a thin outer shell with a completely molten interior.
  8. Melt a little more chocolate to drizzle on top. Sprinkle a little more salt, and serve with berries or ice cream.
[Figure labels: Building Blocks, Instructions, Final Product Varies]

Ever Increasing Chip Size Leads to Hierarchical Design
[Figure: Flat versus Hierarchical by number of instances (3M, 5M, 15M, 100M+), with a typical threshold between flat and hierarchical]

Ten Best Practices for Hierarchical Design
Understanding These Practices Can Help
  #1  Floorplan: Affects design closure
  #2  Top-Level Style: Requires different discipline
  #3  Block Size: Tradeoff size versus TAT
  #4  Modeling: Modeling for top-level closure
  #5  Top-Level Closure: Meeting the inter-block signals
  #6  Block-Level I/O Paths: Affects block design closure
  #7  Block-Level Drivers/Loads: Affects block boundary closure
  #8  Inter-Block Critical Paths: Absence helps chip closure
  #9  Constraints Management: Affects design closure & TAT
  #10 Signoff: STA correlates to close timing

#1 Floorplan: Affects Design Closure
  • Partitioning Guidelines
      Logical connectivity
      Clock
      Voltage areas
      Physical size
      Multiple Instantiated Modules (MIM)
  • Macro Placement
  • Power Planning
  • IO Planning
[Figure: Example 1 and Example 2, each comparing a Challenge with a Better Approach]

#2 Top-Level Style: Requires Different Design Discipline
[Figure: Abutted, Narrow Channel, and Channel styles; clock and data; implementation complexity]

#3 Block Size: Tradeoff Size versus TAT (turn around time)
  • Many small blocks (1.5M each): faster TAT per block but more blocks to integrate
  • Fewer large blocks (2M, 2M, 3M, 5M, 5M): longer TAT per block but fewer blocks to integrate
What Is Reasonable Size Depends A Lot On Design Team Preference
Note: Block Size in instances

#4 Modeling: ETM vs. Abstract Model
  • Extracted Timing Model (ETM)
      Blocks modeled by timing arcs only
      Used for customized IP
  • Abstract Model
      Interface cells of each block retained
      Recommended for P&R blocks

#5 Top-Level Closure: Meeting Timing on Inter-Block Signals
  • Closing top-level inter-block signals can be challenging
  • Can be minimized with
      Proper estimation of interface constraints
      Proper floorplanning for signal connectivity between blocks
  • Simultaneous optimization of top-level and inter-block signals needed

#6 Block Level I/O Paths: I/O Paths Are Typically Not Finalized Early
Typical Hierarchical Structure
  • I/O paths are not finalized during early stage block design
  • Overconstraining these paths directs the tool to focus on I/O paths instead of the intra-block paths
  • Accuracy of proportional time budgets is affected if interfaces are still changing
[Figure: Block Under Design between two Adjacent Blocks, with logic and registers on both sides of each boundary]

#6 Block Level I/O Paths: Registering Block Outputs Makes Budgeting Easier
A Better Approach
  • Registering block outputs makes budgeting easier and less dependent on completeness of the netlist
  • Re-partitioning logic hierarchy helps manage constraints complexity
  • Partitioning according to power domains / logic hierarchy makes flow easier
[Figure: Block Under Design between two Adjacent Blocks, with registered block outputs]

#7: Block Level Drivers and Loads: Modeling I/O with Realistic Values Drives Convergence
  • When designing Block A, need to consider load at output port A
      set_load
  • When designing Block B, need to consider driving cell at input port B
      set_driving_cell
  • Block interface timing is one of the toughest issues in hierarchical flow
  • Realistic model of your input and output ports helps design convergence
[Figure: Block A output port A driving Block B input port B]

#7: Block Level Drivers and Loads: Inter-block Paths Are One Of The Toughest SOC Challenges
  • Without good estimation of loads and driving cells, integrating these blocks forces unnecessary iterations to meet timing
  • Budgeting can automatically generate driver and load information
  • Generate a quick netlist to run through budgeting for more accurate results
[Figure: If no load is specified, the cell cannot be sized correctly]

#8: Inter-Block Critical Paths: Absence Helps Chip Closure
  • If the tool cannot see the complete path, it may be a challenge to stitch the pieces at top-level
  • Avoid critical paths crossing multiple blocks
      Makes timing closure difficult
      Contain them within the same block or, if you must cross multiple blocks, minimize the number of blocks
  • Budgeting, sizing, and load estimations are needed to solve inter-block critical path violations
[Figure: Block to Block path, crossing Top; Top to Block path]

#8: Inter-Block Critical Paths: Shielding Helps Chip Closure
  • Use shielding to reduce crosstalk effects between the block- and top-level to significantly improve timing closure in inter-block critical paths
  • Use new Transparent Interface Optimization (TIO) in IC Compiler
[Figure: Without Shielding vs. With Shielding]

#9: Constraints Management: Pay Attention to Constraints
  • Infeasible paths are paths that are impossible to meet timing
      Missing false path/multi-cycle path constraints
      Unreasonable input/output delay constraints
  • Other things to watch out for
      size_only attributes
      dont_touch attributes
      Multi-cycle paths
      False paths
      Etc.
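The two infeasible-path cases named on this slide (logic that cannot fit in one clock cycle, and an input delay that is too large) can be sketched as a simple budget check. This is a minimal illustration, not how any Synopsys tool performs the check: the `PathBudget` class, the `diagnose` function, and all delay values are hypothetical, and the model ignores setup time, clock uncertainty, and clock skew.

```python
from dataclasses import dataclass


@dataclass
class PathBudget:
    """Hypothetical I/O path budget; all times in nanoseconds."""
    name: str
    clock_period: float
    input_delay: float   # external delay budgeted at the input port
    output_delay: float  # external delay budgeted at the output port
    logic_delay: float   # best achievable delay of the logic on the path


def time_available(p: PathBudget) -> float:
    # Simplified model: what is left of the period once I/O delays are taken out.
    return p.clock_period - p.input_delay - p.output_delay


def diagnose(p: PathBudget) -> str:
    if p.input_delay + p.output_delay >= p.clock_period:
        return "infeasible: I/O delays leave no time for logic"
    if p.logic_delay > time_available(p):
        return "infeasible: logic does not fit in one clock cycle"
    return "feasible"


if __name__ == "__main__":
    paths = [
        PathBudget("ok_path", 2.0, 0.5, 0.5, 0.8),
        PathBudget("big_input_delay", 2.0, 1.8, 0.5, 0.1),
        PathBudget("needs_multicycle", 2.0, 0.5, 0.5, 1.4),
    ]
    for p in paths:
        print(f"{p.name}: {diagnose(p)}")
```

A path flagged by the second case is a candidate for a multi-cycle or false-path constraint, while the first case usually points to an unreasonable input or output delay value.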
[Figure: E.g., infeasible path, insufficient for 1 clock cycle. E.g., infeasible path, i/p delay too large]

#10 Signoff Correlation: Tighter Correlation Helps Close Timing
  • Use IC Compiler signoff correlation checker system
  • Performs both consistency and correlation check with user controllable accuracy level
  • Supports both pre-route and post-route checks

#10 Signoff Correlation Flows: Flows for Pre-route and Post-route Correlation Checks
  • Focus on environment and library setup for pre-route correlation
  • Certain variables for correlation may have runtime and/or QoR impact on optimization
  • Correlation setup may change and re-check may be needed for post-route
[Figure: Pre-Route Flow]

Today's Designs Are Big & Hierarchical
Source: L. Besson, STMicroelectronics
Timing Signoff Challenges
  • More effects, more variation
  • Impacts accuracy vs. runtime
  • Hierarchical P&R vs. flat signoff
  • Large machines and runtime
  • Interactions between top & block
  • 30-40% blocks are tough to close
  • 10 to 20 ECO iterations
  • Lots of scenarios to analyze: more machines, more reports

The Nanometer Challenges: Top Issues to Look at
Source: ITRS 2009; C.A. Malachowsky, NVIDIA, EDPS 2009; P. Saxena, Intel, ISPD 2003
(1) SiON Dielectric/Polysilicon Gate; (2) High-k Dielectric/Metal Gate

IC Design Methodology

But, Synthesis has Evolved
  • Synthesis has evolved beyond logic mapping
  • It's now predicting and resolving congestion for physical design
Synthesis prediction of physical effects: evolution is key to progress

And, Physical Design Under Heavy Load
  • Increasingly, Physical Design is the driver for implementation schedule
  • It's where the rubber meets the road: speed, die-size, power, yield ...
P&R evolution key to progress

What's on Designers' Mind? Design & Project Management!
  • Is everyone using the same tool version and the standard scripts?
  • How close are we to our design goals?
  • What's the status of the blocks right now?
  • How can I use the experience from this project to plan the next one better?
  • How much compute and license resources are we using?
  • What's taking up the most time? Which step? Which block?

Many Flavors Of Methodology: Imagination Is the Only Limit
Source: www.bk.com 2010

Past Guidance Doesn't Always Apply to the Present
  • create_clock -period [0.7 * target]: high performance
  • set_max_area to 0: small area
  • Use small blocks for fast turnaround time
Things have changed but users are still using the above techniques!
[Figure: flow evolution by era, each built from Synthesis, Place & Route, DRC / LVS, and Signoff steps]
    2000-2005    Correlation
    2005-2008    Look-ahead (adds Design Planning)
    2009-2010    In-Design
    2011-        Exploration (Exploration followed by Implementation)

The Past vs. The Present
  • Wireload Model (WLM) results in higher frequency during Synthesis than using Design Compiler Topographical (DCT) technology
  • With WLM, these two circuits (Figure 1, Figure 2) have the same delay
  • With DCT, the delay is a reflection of the x-y location of the cells
Which is more realistic?

Ten Best Practices for Design Methodology
  #1  Libraries: Know Your Attributes
  #2  Setup: Correlation and Runtime
  #3  Scripts: Impacts Your Design
  #4  Constraints: Watch Your Constraints
  #5  Analyze: Analyze-Fix-Proceed
  #6  Methodology: One or Two Flows
  #7  Optimization: Adjust Accordingly
  #8  Signoff: Review Your Environment
  #9  Performance: Leverage Your EDA Partner
  #10 Low Power: Architecture Drives Power

#1 Libraries: Know Your Attributes
Why is my design larger in area? Why is it taking so long to run?
  • Watch for dont_use, dont_touch, and size_only usage in your libraries and scripts
  • Attributes are user-controlled to guide optimization
  • Restricting optimization may lead to problems
[Figure: Original Area vs. New Area After Optimization]

Technology and IP: Make Sure to Have a Good Quality Library
  • A properly designed set of library cells gives optimization engines more choice
  • Avoid cells sensitive to minor changes in load; they impede convergence
  • Footprint-equivalent cells are useful for final-stage optimization w/ minimal perturbation to other design metrics
  • Std. cell pins should be on grid (especially complex cells with small drive strength: higher pin density)
  • Multiple variants for each flop (drive strengths, delays, setup times, ...)
Library quality: enabler for targeted performance
[Figure: Example: Cell Sensitivity To Load Uncertainty. Delay vs. Cload for Cell A and Cell B, with points C* and D*]

#2 Setup: Correlation and Runtime
  • Netlist v1.0, SDC v1.0: Compile gives 3.2M instances
  • Netlist v1.1, SDC v1.1: Compile gives 6.8M instances?? What happened???
  • Found issues after days of engineering work
      size_only on 3.7M cells
      SDC with all cells set with set_disable_clock_gating on
What do designers do when they run into these?

Review Your Settings and Input: Understand the Different Objectives
  • DC Utility Checker: detects design issues and dirty constraint styles that can lead to bad runtime/memory and QoR
  • ICC Utility Checker: detects readiness of physical design before going into various implementation stages
  • PT Utility Checker: detects application variables, settings and design issues causing runtime or memory increase

  • Need to put things in perspective
  • First Step: review your script
      How was the script migrated to Tool A?
      Did you also update the script to leverage the latest technologies?
      Early stage of your design, think fast mode
      Final stage of your design, think QoR
#3 Scripts: Impacts Your Design
When someone tells you Tool A is X times faster than Tool B
[Figure labels: Incomplete, Complete]

Tool Input can Impact Results: Understand How the Tool Can Help Meet Design Goals
  • Today's design requires completeness
  • Synopsys tools are tailored for performance, but they also have a mode to run fast
  • Recommendations
      The typical complaint is long runtime; choose your goal setting accordingly
      Make sure your script is up to date for your end goal and to take advantage of the latest features

#4 Constraints: Watch Your Constraints
  • Symptoms of over-constraining: long runtime, excessive buffering and huge violations
  • Over-constraining could guide the tool to focus on artificial critical paths
  • Over-constraining happens with
      Unrealistic input and/or output delays
      Tightening the clock period
      Specifying large clock uncertainty
[Figure: Original clock period split into Input Delay, Time Available for logic, and Output Delay]
Synopsys tools are designed to work towards meeting design goals but don't expect miracles!

Understanding EDA Tool will help
Simple Illustration: Circuit A, Circuit B
Will DC do this transformation?
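The clock-weight illustration on this slide compares a transformation's total weighted negative-slack cost before and after. The sketch below is a simplified model of that comparison, not Design Compiler's actual cost function: it assumes the cost is the sum over clocks of weight times the magnitude of the worst negative slack, and the function names `total_wns_cost` and `accepts` are hypothetical. The slack values and weights are the ones shown on the slide.

```python
def total_wns_cost(wns_by_clock, weights):
    """Sum of weight * violation, where violation is the magnitude of negative WNS."""
    return sum(weights[clk] * max(0.0, -wns) for clk, wns in wns_by_clock.items())


def accepts(before, after, weights):
    """Assumed acceptance rule: keep the transformation only if total cost drops."""
    return total_wns_cost(after, weights) < total_wns_cost(before, weights)


# Worst negative slack per clock, before and after the candidate transformation.
before = {"CLKA": -0.300, "CLKB": -0.100}
after = {"CLKA": -0.280, "CLKB": -0.150}

default_weights = {"CLKA": 1, "CLKB": 1}
adjusted_weights = {"CLKA": 10, "CLKB": 1}

if __name__ == "__main__":
    for label, w in (("default", default_weights), ("adjusted", adjusted_weights)):
        verdict = "accepted" if accepts(before, after, w) else "rejected"
        print(f"{label} weights: "
              f"{total_wns_cost(before, w):.2f} -> {total_wns_cost(after, w):.2f}, {verdict}")
```

With equal weights, the gain on CLKA is outweighed by the loss on CLKB, so the change is rejected; raising the CLKA weight makes the same change worthwhile.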
    Before the transformation:  CLKA wns = -0.300    CLKB wns = -0.100
    After the transformation:   CLKA wns = -0.280    CLKB wns = -0.150

                                   Delay Cost Before    Delay Cost After
    Default Weights
      CLKA weight = 1              0.30                 0.28
      CLKB weight = 1              0.10                 0.15
      Total WNS Cost               0.40         <       0.43
      Total cost increased: transformation rejected, worst WNS = -0.300
    Adjusted Weights
      CLKA weight = 10             3.00                 2.80
      CLKB weight = 1              0.10                 0.15
      Total WNS Cost               3.10         >       2.95
      Total cost reduced: transformation accepted, worst WNS = -0.280

    Cost = Σ (p_i * w_i)

#5 Analyze: Analyze-Fix-Proceed
  • Push Button Flow does not exist
  • Know your circuit to guide the tool

Synopsys Galaxy Implementation Flow
  • DC Graphical: compile_ultra -spg, insert_dft, compile_ultra -spg -incr
  • IC Compiler: place_opt -spg, clock_opt, route_opt, signoff_opt
  • StarRC: Signoff extraction
  • PrimeTime SI: Signoff STA
Analyze results between design stages

#6 Methodology: One or Two Flows
Design specifications and constraints change constantly during the design cycle
  • 180 nanometers (2000): 225K gates, 11 RAMs, 150 MHz
  • 45 nanometers (2010): 96 mm2, ~300M transistors, 7-9W
  • One flow for both exploration & implementation
  • Exploration flow targets early specs & constraints
  • Implementation flow for final design realization

Exploration Throughout Galaxy
  • DC Explorer: Early RTL Exploration. Accelerates design schedules
  • Design Compiler: Look-ahead & Physical Guidance. Creates a better starting point
  • IC Compiler: Design Exploration. Creates initial floorplan
  • Block Feasibility: Determines physical feasibility
  • Galaxy Constraint Analyzer: Continuous improvement
[Figure: RTL Exploration, RTL Synthesis, Design Exploration, Design Planning, Block Feasibility, Block Implementation, arranged by Exploration vs. Implementation and RTL vs. Physical]

#7 Optimization: Adjust Accordingly
Adjust your constraints to model effects of downstream design steps
  • Design Compiler
      Account for clock trees
      No hold-timing fixing
      Be careful with critical range
      Do not over-constrain
[Figure: An Illustration]

Manage Design Constraints Throughout: Guidelines For Convergent Timing Closure
  • Synthesis and placement
      Do not over-constrain during synthesis
      Use DC SPG flow
      Account for max_transition and clock uncertainty
      Specify pre-CTS estimated constraints
  • CTS
      Remove pre-CTS estimated constraints
  • Route
      Remove/adjust pre-route constraints
      Adjust crosstalk thresholds
[Chart: Timing Closure Profile, MHz (800 to 1,100) across Synthesis, Place, Clock, Route. Series: RM (Baseline), Tuned For Hi-Performance/Low Power, Addnl. Customization For High-Performance. Labeled values: 1029, 971, 913]
Do not over-complicate your flow

#8 Signoff: Review Your Environment
[Charts: Runtime (CPU Hrs) and Memory Usage (GB) vs. Instances (Million) at 1.1, 1.2, 5.5, 37.0, and 50+; callout: 172 GB]
Designs run at customer site using revised PrimeTime scripts and latest release version
Unlike wine, scripts grow stale with age

PrimeTime Scripts: Key Areas to Review
  • Environment and setup
      Use latest release and ensure adequate hardware resources
  • Reading parasitics
      Use binary parasitics when possible
  • Multiple timing updates
      Eliminate redundant/legacy update_timing steps
  • Inefficient TCL scripting and reporting
PrimeTime Design Utility Checker can help with some of these tasks

#9 Performance: Leverage Your EDA Partner
  • Starting Point
      Built on Synopsys RM
      Understand the new technologies and features
      Easy to use
  • Reduce time-to-results
      Automated methodology to achieve 90% of target quickly
      Additional advanced techniques to reach final goal
  • Minimize number of iterations or trial and errors
  • Reduce ECO efforts
[Figure: Design Schedule for Typical Flow vs. HSLP Flow: Synthesis, P&R, Signoff + ECO Iterations]

HSLP Implementation Best Practices Reduces Time-to-Results
[Chart: Targets (75%, 90%, 100%) reached over Time. Curves: Typical Flow on Regular designs, Typical Flow on High Performance designs, HSLP Flow (Typical Flow With HSLP Implementation Best Practices, then design-specific customization)]
High Performance, Low Power (HSLP) Flow Requires Customization

#10 Low Power: Architecture Drives Power
  • Multiple Voltage (MV) Domains (0.9V, 0.7V, 0.9V)
  • Multi-Supply with shutdown, No State Retention (0.9V, 0.9V, 0.9V, OFF)
  • Multi-Voltage with shutdown (0.9V, 0.7V, 0.9V, OFF)
  • Multi-Voltage with shutdown & State Retention (0.9V, 0.7V, 0.9V, OFF, SR)
DESIGN TECHNIQUES
  • Retention Registers
  • Power Switches (MTCMOS)
  • Level Shifters
  • Isolation Cells
  • Always-on Logic

New Techniques and Challenges

Leading The Way In 20nm Design: The Race to 20nm Is On!

The 20 nm Challenge: Single Exposure
Last Pitch With Single Exposure ~ 80 Nanometers
We Can Print This, But We Cannot Print This
Source: M. van den Brink, ASML, ITF 2009; P. Magarshack, STMicroelectronics, 2010

The Solution: Double Patterning, A Significant Change
We Can Print This, and This, And Then This!

Synopsys Solution: DPT Ready
IC Compiler P&R, and IC Validator DRC
[Figure: Wide Spacing Enforced; Two-Color Decomposed Design]
Source: Synopsys Research 2011

Synopsys Solution: DPT Ready
IC Compiler P&R, and IC Validator DRC
Source: Synopsys Research 2011

The Challenge: Planar CMOS
Insufficient Performance, Excessive Power
[Figure: 32 Nanometer Planar, Performance vs. Power]
Source: K. Kuhn, Intel, IDF 2011

The Solution: Non-Planar CMOS
FinFET or Tri-Gate CMOS
[Figure: 22 Nanometer Tri-Gate, Performance vs. Power]
Source: K. Kuhn, Intel, IDF 2011

The Solution: Non-Planar CMOS
The First Revolution
Source: M. Bohr, Intel, YouTube 2011

There Are Many Flavors, But Reality and Fantasy Are not the Same Thing!

FinFET Advantages: FinFET vs Planar Transistor
  • Superior drive current
      Active region spans the fin height and thickness (3 sides)
      Ids ∝ (2*Hfin + Tfin), as opposed to just thickness for planar
  • Reduced leakage
  • Depleted substrate
  • Enhanced electron mobility
  • High-K gate oxide
  • Metal gates in place of PolySilicon
  • Strained silicon
  • Multiple fins possible to increase total drive strength for higher performance
[Figure: FinFET vs. Planar cross-sections showing the Fin and the Inversion Layer]
Source: Intel

This Is Not The End of Moore's Law!
But the Gap Between Intel and the Crowd Is Widening
Source: M. Bohr, Intel, IDF 2011

3D ICs: Technology Trends
Four Main Categories of > 2D-IC Ahead
  1. Memory Cube (Wide I/O)
  2. Memory Cube on Logic
  3. Silicon Interposer
  4. 3D Stack
[Figure labels: C4, TSV, Bump]

3D-IC: Two Basic Configurations Emerging
Addressing Gigascale Design Challenges
  • Silicon Interposer (2.5D): horizontally connected dies
      Drivers: Consumer, Storage, Networking
      Benefits: Yield, Cost, TTM & Flexibility
  • 3D-IC: vertically stacked dies with TSVs
      Drivers: Wireless handset, Processors
      Benefits: Performance, form factor

The Memory Cube Now
[Figure: 8 die stack; dimensions labeled 50 microns and 560 microns]
Source: C.-G. Hwang, Samsung, IEDM 2006

IP Market, an opportunity for Latin America

IP
  • An intellectual property core, IP core, or IP block is a reusable unit of logic, cell, or chip layout design that is the intellectual property of one party
  • IP cores may be licensed to another party or can be owned and used by a single party alone
  • IP cores can be used as building blocks within ASIC chip designs or FPGA logic designs

IP
  • IP cores in the electronic design industry have had a profound impact on the design of systems on a chip
  • IP core licensors spread the cost of development among multiple chip makers
  • IP cores for standard processors, interfaces, and internal functions have enabled chip makers to put more of their resources into developing the differentiating features of their chips: new innovations, faster
  • Licensing and use of IP cores in chip design came into common practice in the 1990s

Semiconductor IP Market Segments
2011 Design IP Revenue: $1.9B
    Segment                               Share
    Microprocessors                       39%
    DSP                                   5%
    Fixed Function (GPUs, Security)       15%
    Wired Interfaces                      19%
    Memory Cells/Blocks                   10%
    GP Analog/MS                          4%
    Block Libraries                       1%
    Physical libraries                    3%
    Other IP                              4%
[Chart grouping label: Processors (CPUs, GPUs, DSPs)]
Source: Gartner, March 2012

Semiconductor IP Market Size and Synopsys Share
    Year    Semiconductor IP Market Size ($M)    Synopsys Share
    CY04      964.0                               7.9%
    CY05    1,068.3                               7.6%
    CY06    1,267.3                               7.3%
    CY07    1,378.2                               7.2%
    CY08    1,464.1                               7.2%
    CY09    1,351.0                               9.1%
    CY10    1,695.0                              11.3%
    CY11    1,910.9                              12.4%
Source: Gartner, March 2012

Top Semiconductor IP Vendors
    Rank  Company                     2010     2011     Growth    2011 Share
    1     ARM Holdings                575.8    732.5     27.2%    38.3%
    2     Synopsys                    191.8    236.2     23.2%    12.4%
    3     Imagination Technologies     91.5    126.4     38.1%     6.6%
    4     MIPS Technologies            85.3     72.1    -15.5%     3.8%
    5     Ceva                         44.9     60.2     34.1%     3.2%
    6     Silicon Image                38.5     42.8     11.2%     2.2%
    7     Rambus                       41.4     38.9     -6.0%     2.0%
    8     Tensilica                    31.5     36.3     15.2%     1.9%
    9     Mentor Graphics              27.3     23.6    -13.8%     1.2%
    10    AuthenTec                    19.6     22.8     16.3%     1.2%
Source: Gartner, March 2012

IP Vendors Also Need to Provide More Functions and Functionality
[Chart: Total Number of IP Blocks per SoC, 2005 to 2014: average number of IP blocks per SoC and % design reuse; progression from IP Blocks to IP Subsystems]
Source: Semico, October 2010

Subsystems: The Next Evolution in The IP Market
What is a Subsystem?
  • Complete Solution: HW, SW, Prototype
  • Pre-integrated and Verified
  • SoC Ready: Seamlessly Drop-in and Go

Thank You