
Tech Design Forum Journal


DECEMBER 2011 // VOLUME 8, ISSUE 5 // TECHDESIGNFORUM.COM

ARM PREVIEWS 64BIT ARCHITECTURE
SEVEN PILLARS OF VIRTUAL PROTOTYPING
TRENDS FOR THE INTERNATIONAL CES
DEFINING FUNCTIONAL QUALIFICATION
3D INNOVATIONS FROM ITC 2011

PLUS

VERIFICATION & TEST

IN FOCUS

TECH DESIGN FORUM IN 2012

SEE PAGE 8


Tech Design Forum is a trademark of Mentor Graphics Corporation, and is owned and published by Mentor Graphics. Rights in contributed works remain the copyright of the respective authors. Rights in the compilation are the copyright of Mentor Graphics Corporation. Publication of information about third party products and services does not constitute Mentor Graphics’ approval, opinion, warranty, or endorsement thereof. Authors’ opinions are their own and may not reflect the opinion of Mentor Graphics Corporation.

Contents

Published quarterly for the electronic products design community, the Tech Design Forum Journal is a premier source for the latest advancements and new technologies used by hardware and software engineers to design and develop electronic products for the aerospace, automotive, consumer, medical, industrial, military, and semiconductor industries. The journal provides an ongoing forum in which to discuss, debate and communicate these industries’ most pressing issues, challenges, methodologies, problem-solving techniques and trends.

6 Start Here
Steve Jobs – a Man in Full
Beyond the gossip, the new biography has lessons for us all.

8 The Future
Tame the system
Find out how Tech Design Forum is changing and expanding in 2012.

10 Architecture
When ARM’s 64
First details of the processing giant’s latest architecture and market ambitions have emerged.

20 Analysis
CES mourns the computer, hails computing
We preview key trends at next month’s Consumer Electronics Show in Las Vegas.

Commentary

Tech Forum

24 ESL/SystemC
The seven habits of highly effective virtual prototypes
Shabtay Matalon and Yossi Veller, Mentor Graphics

30 Verified RTL to Gates
How to achieve power estimation, reduction and verification in low power design
Kiran Vittal, Atrenta

38 Verified RTL to Gates
The principles of functional qualification
George Bakewell, SpringSoft

42 Tested Component to System
Design for test: a chip-level problem
Sandeep Bhatia, Oasys Design Systems

46 Tested Component to System
Pre-bond test for 3D ICs at ITC 2011
Special report, Tech Design Forum

techdesignforums.com

December 2011, Volume 8, Issue 5


Team

EDITORIAL TEAM
Editor-in-Chief: Paul Dempsey, +1 703 536 1609, [email protected]
Managing Editor: Sandra Sillion, +1 949 226 2011, [email protected]
Copy Editor: Rochelle Cohn

CREATIVE TEAM
Art Director: Kirsten Wyatt, [email protected]
Graphic Designer: Maream Milik, [email protected]

EXECUTIVE MANAGEMENT TEAM
President: John Reardon, [email protected]
Vice President: Cindy Hickson, [email protected]
Vice President of Finance: Cindy Muir, [email protected]
Vice President of Corporate Marketing: Aaron Foellmi, [email protected]

SALES TEAM
Advertising & Event Sales Manager: Mark Dunaway, +1 949 226 2023, [email protected]
Advertising & Event Sales Manager: John Koon, +1 949 226 2010, [email protected]
Account Management: Sandra Sillion, +1 949 226 2011, [email protected]


Start Here

Steve Jobs – a Man in Full

Walter Isaacson’s excellent biography of Steve Jobs has attracted headlines for its more controversial anecdotes, specifically those that detail the public and private behavior of the late Apple founder. The book itself, though, is far from an exercise in gratuitous muckraking.

If you happen to work in high technology, many of the tales about Jobs and his often fractious relationships with staff, partners, suppliers and others in our industry should not really come as a surprise. The corridor gossip was always that Steve had an “edge.” And sometimes it was more than gossip. Even if you were lucky enough to avoid his temper personally, you probably know someone who was once less fortunate.

As for the details of his personal life, his character was pretty much laid bare by his own sister, the novelist Mona Simpson, as Tom Owens in A Regular Guy. And that was published in 1997. Do you remember the opening line? “He was a man too busy to flush toilets.”

The fact is that many corporate leaders are driven, ruthless and—yes—even profanely impolite men and women. Well, fancy that. They typically sacrifice much in their private lives to push forward in their public ones, and whatever their motivations, those of us who decide not to do that often owe our livelihoods and more to them. Taking a company from start-up to empire is not for the faint of heart.

So what I liked enormously about Isaacson’s book was the feeling that he understood this, and while his role as biographer has required him to chronicle the missteps and confrontations, he avoids simple judgments and instead explores what made Steve Jobs, beyond the actually-more-public-than-realized aspects of his background.

In many respects, it was his upbringing as an indulged adopted son. But as deeply as he loved that father and mother, he spent much of his life on quests. As Isaacson shows his subject following them, he makes a couple of things more explicit than any Jobs-watcher before him.

First, he sets out not merely how much importance Jobs attributed to the intersection of aesthetics and technology, but also how he reached that conclusion. The influences that drove Jobs’ design perfectionism are set out in full, as is his philosophy—and the latter was something that, while he was alive, he actually tended to guard from his competitors.

Second, Isaacson makes it clear, again once and for all, that Jobs was far more a systems guy than either a high-level technologist or an electronics engineer specifically. For him, it was always about the interplay of hardware and software, and then how you had to package everything in an attractive and usable form.

The book is not a “How To” for anyone who wants to build the next generation’s Apple. You still need a Steve Jobs to do that. But it does highlight the main design challenges facing high technology today. It is a world that has become still more inherently systemic and integrated, a process that Jobs did live to see. In that respect, and as a slice of all-too-contemporary history, Isaacson’s book could prove as useful to you as any of the articles in this very magazine.



COMMENTARY: [THE FUTURE]

Tame the system

This January, Tech Design Forum moves online and will roll out a new approach to helping you manage technical content.

Tech Design Forum is evolving into a predominantly online journal. Why? So that an expanded editorial team can focus on bringing you the key information you need to get your job done, whether it’s designing a chip or building an entire electronics product.

Our premise is that there’s plenty of information out there for design engineers and managers. So much that working out what’s important and relevant has become a burden. Researching the impact of a new technology or evolving standard involves following a breadcrumb trail of web links and pointers. We want to lift that burden, organizing the material into a flow that matches each of the major phases of electronics system design, and curating it so that we bring you the most relevant information first.

Result? We make it easier for you to get your job done and get your product to market.

We’ll be doing two other important things. The first will be to look at the issues that affect the whole design process, such as silicon intellectual property and design management. The second will be to ensure that, although we’re covering the leading edge, we also bring insight from the leading edge to designers working with established tools and mature processes.

The new Tech Design Forum will launch as a website on January 5, 2012. With The RTC Group, we still expect to publish special print issues to cover some major events and big themes.

Who’s behind the new Tech Design Forum? Paul Dempsey, current Editor-in-Chief, is being joined by highly experienced technology journalists Luke Collins and Chris Edwards, who will jointly own the new site through a wholly independent group, The Curation Company.

Together they bring more than 60 years of experience in technology journalism and more than 30 years of experience in specific coverage of electronics system design.

“The RTC Group has successfully managed and produced the Tech Design Forum journal and events for four years and we look forward to an ongoing partnership with The Curation Company,” said John Reardon, CEO of The RTC Group.

The new journal will be based at http://www.techdesignforum.com. Please drop by, have a look around and sign up as a registered user for extra content. But right now, stick with us over the next couple of pages and we’ll explain what we are changing.

Finding what matters

We’re not claiming to be the only source of technical information for system design. Quite the opposite. The challenge is that digging out what’s relevant is harder than it needs to be.

We have all tried to get a quick answer to a tough question online and been frustrated by having to sift through the host of links served up. The existing media does a good job of researching and writing news, but that does not necessarily mesh well with the needs of a busy engineering team where people need answers and need them quickly.

Tech Design Forum already structures content under headings that describe each article’s main themes relative to the design flow, and our research says you like this approach. We are going to build on this by producing a series of overview articles that draw together the most important elements, and discuss how they increasingly interact. Each overview will also direct you toward the most relevant supporting material we can find, using embedded links—it’s an approach that has been surprisingly underplayed by the technical media.

Creative curation

The key difference in our role versus those of other information providers is that of curation.



Consider a museum. There may be a vast amount in the collection, but it is the curator’s job to make sense of it. What should be put in the main galleries? What special exhibitions are most timely? And how, if someone really needs to get into the archives, do you make that process as manageable as possible and guarantee that those archives are in good order?

We are adopting the same model. We will go beyond simply adding to the site to actively manage all of its aspects from the day material arrives to the day on which it can actually be retired.

Our commitment to you is that we will fill the roles of both editors and curators to provide the most timely help in getting your job done as efficiently and effectively as possible.

All in one place

Recognizing that design doesn’t happen in a vacuum, we will focus on three key challenges: the issues that affect the whole flow, such as version management; the discrete issues that have to be overcome in each phase of the design; and the increasing complexity of the interactions between different phases of the design. Labels may be necessary and useful, but silos are dangerously counter-productive. As new technologies and approaches emerge and existing ones are updated, the material on techdesignforum.com will evolve, so that it will always be relevant to today’s challenge.

All the time

By concentrating the efforts of three highly experienced design journalists, we can provide updates when and where they make sense, keeping the digital noise to a minimum while boosting the signal.

Responding to needs

We’d like your help making the site as relevant as possible.

Each month, we will invite your input on what you think are the most pressing issues facing your design teams and to which you would like your suppliers to respond.

We will also propose some coverage areas that you can vote on, and set a core editorial calendar so you can see what topics are coming up.

Beyond that, we plan to work with the broad system design community to gather experts from across the market for our traditional articles and as white paper contributors. We will be setting up regular panels and ‘ask the expert’ postings. We’ll have questions of our own but we also want them from you, the trickier the better. And we’ll be assembling a roster of bloggers as well as offering our own thoughts on the latest news and events.

Since its launch, Tech Design Forum has been about tailoring its content and execution to the practical challenges facing designers. We believe that our move online will enable us to build on that core principle and so serve you, the readership, more usefully.

So, what happens now? From January 5, 2012, we will be gradually adding functionality, framing content and other new features over the course of the first quarter of 2012. Continuous editorial updates and blogs will start immediately.

The first thing we would encourage you to do is to register as a user at techdesignforum.com as soon as possible. Then you can start giving us feedback and guidance on what your priorities for this project are. It will also allow us to keep you up to date with the new look, articles that are appearing and more.

You can also follow us on Facebook ‘Tech Design Forum’, Twitter ‘Tech Design Forum’ and LinkedIn ‘Tech Design Forum’, or you can email us directly at [email protected] with any suggestions.

We all know the shared challenge ahead and we all know how journals can help you overcome it. Together, it is time to tame the system.

Meet the team

Paul Dempsey is the current editor-in-chief and a founder of the Tech Design Forum journal. He has more than 20 years’ experience covering various branches of technology and engineering in both the UK and the USA. Paul has held senior editorial positions with specialist newsletters published by The Financial Times, and Electronic Engineering Times UK, and is also the current Washington Correspondent for the Institution of Engineering & Technology’s flagship title, E&T.

Chris Edwards is a freelance technology journalist with 20 years’ experience of covering the electronics, embedded systems and electronic design automation (EDA) industries. He is a former Editor-in-Chief of Electronics Times and Electronic Engineering Times UK and was launch editor for two magazines for the Institution of Engineering & Technology: Electronics Systems and Software and Information Professional.

Luke Collins is a freelance technology journalist with 22 years’ experience of covering the electronics and electronic design automation industries. He is a former Editor-in-Chief of Electronics Times in the UK, and co-founded the IP9x series of conferences on semiconductor intellectual property in Silicon Valley and Europe. Since 2001 Luke has edited the Features and Communications Engineering sections of the Institution of Engineering & Technology’s flagship title E&T, and written extensively on research, development and innovation management.


COMMENTARY: [ARCHITECTURE]

When ARM’s 64

There’s already some love out there for ARM’s v8 64bit architecture as the processor giant builds out the ecosystem.

It was an architectural announcement. Even its positioning on the last day of the ARM TechCon event was intended to emphasize that the company’s confirmation of its move into 64bit processing is one for the future, albeit the relatively near future.

The ARMv8 was unveiled in its applications form only and then with just the basic details. According to Mike Muller, chief technology officer, the main reasons for making the announcement now are to clarify the roadmap and to allow time for the construction of an appropriate support ecosystem around the new core.

That second point is important. Once upon a time, the ARM “Connected Community” was relatively small even though the technology was becoming increasingly influential. Today, it has more than 770 members and continues to grow.

The idea that ARM could quietly nurture its 64bit architecture toward a commercial release without any details leaking across this size of ecosystem simply doesn’t hold water. More to the point, winning what is likely to be a fairly bloody commercial battle will involve getting the best tools and other support built around the v8 quickly.

A tough fight

And when we say “bloody” we mean it. ARM has again bearded Intel with a direct challenge to the x86 architecture. However, its intentions with the v8 are not directly or even largely focused on the desktop.

Yes, the move to 64bit does feed into Microsoft’s decision to tailor its still dominant Windows PC operating system (OS) for ARM-based chips as well as traditional x86 ones. At the announcement, K.D. Hallman, a general manager with the software giant, was on hand to provide one of the potted quotations.

“ARM is an important partner for Microsoft. The evolution of ARM to support a 64bit architecture is a significant development for ARM and for the ARM ecosystem. We look forward to witnessing this technology’s potential to enhance future ARM-based solutions,” he said.

Also present was Nvidia, with its declared intentions in low-power processors to take it beyond its historical strength in graphics.

“The combination of Nvidia’s leadership in energy-efficient, high-performance processing and the new ARMv8 architecture will enable game-shifting breakthroughs in devices across the full range of computing, from smartphones through to supercomputers,” said Dan Vivoli, senior vice president.

Muller’s comments at ARM TechCon, however, suggested that it is the market facing both very high performance demands and, more important perhaps, low power pressures that his company sees as the sweet spot: servers and other enterprise-class hardware.

Up and away

According to the Environmental Protection Agency, U.S. spending on the energy that powers servers will exceed $7B this year and has risen by more than 40% in the last five years.




With the move of an increasing amount of functionality and storage to the cloud—key features in the recent high profile launches of the Amazon Fire tablet and Mac OS X Lion—it seems fair to assume that server activity will increase, but the degree to which such increases in power consumption are acceptable, economically or environmentally, must be open to question.

That has opened a window of opportunity for ARM. While the enterprise market is demanding in terms of performance, it is also relatively conservative at its heart. IT managers of huge, complex systems are wary of major architectural changes unless they are forced upon them. Change is risk, and in a world where the term “mission critical” is more than a cliché, risk must be mitigated to the greatest degree.

Now, however, the burgeoning demands already being placed on server farms are being joined by those likely to spring from consumer devices and productivity-focused tablets pulling content from them. And that is a scenario that could apply within an individual company and its staff network. Throw consumers into the mix, and the likelihood of a further hugely costly ramp in power consumption becomes clear. All that plays to ARM’s strengths.

One other important point here is commoditization. Server chips attract far chunkier margins—some reports have put these as high as 67%—than either PC processors or ARM-based smartphone chips.

By contrast, shortly before putting the v8 on its roadmap, ARM also announced the Cortex A7 MPCore processor and introduced its concept of big.LITTLE processing. By combining the low-power focus of the existing Cortex A8 with high-performance features from the Cortex A15, ARM has assembled a clever combination.

But the product’s focus is largely on “sub-$100 entry-level smartphones.” There’s demand for this stuff alright, and not just in emerging markets (see page 20). But the inherent price sensitivities are obvious.

Certainly, ARM’s recent relationship with analysts has been marked by plaudits for its success in mobile communications but also warnings about the perennially aggressive shrinkage in margins for that market. And there is also a long-standing requirement set upon ARM to prove itself beyond that space—something it has already addressed with a successful foray into microcontrollers and which now will also play out in servers and elsewhere with v8.

Having addressed a large part of the commercial background to the move to 64bit, let’s now go under the hood.

First look

Last year, ARM introduced the Large Physical Address Extension (LPAE) to translate the 32bit virtual addresses within its v7 architecture into 40bit physical addresses. However, the memory limit for that architecture remained 4GByte, insufficient for the more computationally complex software that runs on servers, particularly for database management. The v8 now fills that gap.

Figure 1: The application profile for the ARMv8. The diagram shows the feature evolution from ARMv5 and ARMv6 through ARMv7-A/R to ARMv8-A: the AArch32 state carries the A32 + T32 ISAs with Thumb-2, TrustZone, Jazelle, VFP and NEON advanced SIMD (plus new crypto extensions) for ARMv7-A compatibility, while the AArch64 state adds the A64 ISA with scalar FP (single and double precision), advanced SIMD and crypto. ARMv8 is A-profile only at this time, with 64-bit architecture support.


For the launch, the detail provided is mainly intended for OS and compiler companies as well as those providing tool support to hardware designers. As such, it focuses on the two execution states, AArch64 and an enhanced AArch32. The AArch64 execution state introduces a new instruction set, A64.

Meanwhile, key features of the v7 architecture are maintained or extended in the v8 architecture.

In a separate ARM TechCon presentation from Mike Muller’s launch keynote, ARM fellow Richard Grisenthwaite went into some more detail as to how this will play out.

In addition to A64, headline features for the AArch64 state include: revised exception handling for exceptions in the AArch64 state, with fewer banked registers and modes; support for the same architectural capabilities as in ARMv7, including TrustZone, virtualization and NEON advanced SIMD; and a memory translation system based on the existing LPAE table format.

Noting that work on the 64bit version has been under way since 2007, Grisenthwaite said that last year’s LPAE format “was designed to be easily extendable to AArch64-bit” and that the new technology features up to 48bit of virtual address space from a translation table base register.

Instructions in A64 are 32bit with a clean decode table based on 5bit register specifiers. The semantics are broadly the same as in AArch32 and changes have been made “only where there is a compelling reason.”

Some 31 general purpose registers are accessible at all times, with a view to a balance between performance and energy. The general purpose registers are 64bits wide, with no banking and neither the stack pointer nor the PC is one of them. An additional dedicated zero register is available for most instructions.

There are obviously differences between AArch64 and AArch32, although much has been done to preserve compatibility and scalability. Here are some of the key points.

There are necessarily new instructions to support 64bit operands, but most instructions can have 32bit or 64bit arguments. Addresses are assumed to be 64bits in size.

The primary target data models are LP64 and LLP64, respectively the models used in Unix/Unix-based systems and in Windows. Meanwhile, there are far fewer conditional instructions than in AArch32, and there are no arbitrary length load/store multiple instructions.
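The practical difference between those two data models shows up in the widths of the C and C++ fundamental types. The short, purely illustrative check below (our example, not ARM documentation) prints the sizes that separate LP64 from LLP64 on a 64bit target; only the width of long differs between them.

```cpp
// Illustrative only: fundamental type widths under the LP64 (Unix-like)
// and LLP64 (Windows) data models on a 64bit target.
#include <cstdio>

int main() {
    std::printf("int       : %zu bytes\n", sizeof(int));        // 4 under both models
    std::printf("long      : %zu bytes\n", sizeof(long));       // 8 under LP64, 4 under LLP64
    std::printf("long long : %zu bytes\n", sizeof(long long));  // 8 under both models
    std::printf("void*     : %zu bytes\n", sizeof(void*));      // 8 under both 64bit models
    return 0;
}
```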

Finally, here, Grisenthwaite’s paper set out some details of the A64 Advanced SIMD and floating point (FP) instruction set.

It is semantically similar to A32: advanced SIMD shares the floating-point register file as in AArch32. A64 then provides three major functional enhancements:

1. There are more 128bit registers—32 x 128bit wide registers, and registers can be viewed as 64bit wide.

2. Advanced SIMD supports double-precision floating-point execution.

3. Advanced SIMD supports full IEEE754 execution, including rounding modes, denorms, and NaN handling.

The register packing model in A64 is different from that in A32, so the 64bit register view fits in the bottom of the 128bit registers. In line with support for the current IEEE754-2008 standard for floating point arithmetic, there are some new FP instructions (e.g., MaxNum/MinNum instructions, and float-to-integer conversions with RoundTiesAway).
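For readers unfamiliar with those operations, the same IEEE754-2008 semantics are exposed by portable C library functions, which makes them easy to try without ARM hardware (this is an illustration of the arithmetic rules, not of the A64 instructions themselves): fmax and fmin return the numeric operand when the other is a quiet NaN, and round implements round-to-nearest with ties away from zero.

```cpp
// Illustrative only: library functions with the IEEE754-2008 behaviors
// referenced above (maxNum/minNum and round-ties-away).
#include <cmath>
#include <cstdio>

int main() {
    double qnan = std::nan("");
    std::printf("fmax(NaN, 2.0) = %g\n", std::fmax(qnan, 2.0));  // 2: the NaN operand is ignored
    std::printf("fmin(NaN, 2.0) = %g\n", std::fmin(qnan, 2.0));  // 2: the NaN operand is ignored
    std::printf("round(2.5)     = %g\n", std::round(2.5));       // 3: ties round away from zero
    std::printf("round(-2.5)    = %g\n", std::round(-2.5));      // -3: ties round away from zero
    return 0;
}
```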

Changes between AArch32 and AArch64 occur on exception/exception return only. The increasing exception level cannot decrease register width (or vice versa) and there is no branch and link between AArch32 and AArch64.

AArch32 applications are allowed under the AArch64 OS Kernel and also alongside AArch64 applications. An AArch32 guest OS will run under AArch64 Hypervisor and alongside an AArch64 guest OS.

Grisenthwaite’s entire introductory description of features underpinning the roll-out of v8 to the ARM ecosystem can be downloaded at www.arm.com/files/downloads/ARMv8_Architecture.pdf.

ARM and servers today

Some preparatory work has already taken place with selected partners. The ARM compiler and Fast Models with ARMv8 support have been distributed and, as noted, work has begun on support for a range of open source operating systems, as well as—it is reasonable to assume—for Windows. A number of applications and third-party tools are also in development.



According to Muller, we can expect the full framework for v8 implementations next year and products should begin to appear over the 2013-2014 timeframe. Nevertheless, the first implementation has already been announced.

Applied Micro has unveiled a demonstration based on an Xilinx Virtex6 FPGA running its Server SoC consisting of an ARM-64 CPU complex, coherent CPU fabric, high-performance I/O network, memory subsystem and a fully functional SoC subsystem.

The work is paving the way for Applied’s X-Gene server-on-a-chip family, which, the company says, will be scalable from 2 to 128 cores running at 3.0GHz with power consumption of just 2W per core.

“The current growth trajectory of data centers, driven by the viral explosion of social media and cloud computing applications, will continue to accelerate,” said Dr. Paramesh Gopi, Applied’s president and CEO. “In offering the world’s first 64bit ARM architecture processor, we harmonize the network with cloud computing and environmental responsibility. Our next generation of multicore SoCs will bring in a new era of energy-efficient performance that doesn’t break the bank on a limited power supply.”

Applied’s plan, running slightly ahead of Muller’s timetable, is to start offering customer sampling on a TSMC-produced v8 device in the second half of next year.

Meanwhile, the Barcelona Supercomputing Center is also part of the push to take ARM into the high-performance market, in conjunction with Nvidia. It showed a hybrid system, Mont-Blanc, which combines ARM-based Nvidia Tegra CPU chips with GPUs based on Nvidia’s Cuda technology, at a supercomputing conference in Seattle this November.

This work, much like Applied’s, places great emphasis on energy savings.

“In most current systems, CPUs alone consume the lion’s share of the energy, often 40 percent or more,” said Alex Ramirez, leader of the Mont-Blanc Project. “By comparison, the Mont-Blanc architecture will rely on energy-efficient compute accelerators and ARM processors used in embedded and mobile devices to achieve a four-to-10-times increase in energy-efficiency by 2014.”

Mont-Blanc’s use of the existing v7 architecture makes it more of a pathfinder than a product, but it remains very much a declaration of intent.

The battle ahead

However, one must note that ARM is not alone in pursuing low-power options for the high-performance market. It is, excuse the pun, a hot button issue throughout the server and enterprise business.

“Collectively, data centers around the world consume nearly 1.5 percent of total electricity production and almost $44.5B a year is spent on powering the servers in these data centers,” said Linley Gwennap, principal analyst from Linley Group.

“Looking at the growth projections for data center usage and the future of power generation growth, this trajectory is unsustainable. A new paradigm for developing data centers based on energy efficiency will certainly help make data centers scale realistically with future demand growth.”

For example, in its Redstone low-energy server design project, HP plans to use and evaluate both ARM-based and Intel x86-based chips. AMD, which has had a tough time in servers of late, has put low power on the agenda as part of its recent restructuring, unveiling its Opteron 3000 for the micro-server market as well as promising enhancements to its mainstream devices.

ARM’s other challenge will be getting silicon vendors to adopt the new architecture. It has had success in wooing players in the microcontroller market already. And it is the go-to brand for low power. But x86 is powerful here and the challenge in general computing is arguably greater than “evolving up” from mobile phones, particularly since, while ARM-based 64bit PCs can be expected, the company essentially wants to leapfrog that heavily commoditized market.

Hence the focus on making the unveiling an architectural rather than product announcement, notwithstanding the innovative play from Applied Micro. Server players will want everything in place before they jump—ARM, though, has an existing infrastructure that can deliver the necessary components.




COMMENTARY: [ANALYSIS]

CES mourns the computer, hails computing

We track some of the major trends to look out for at January’s International CES in Las Vegas.

The 2012 International CES takes place in Las Vegas from January 10-13, slightly later than usual, which will come as a relief to everyone recovering from the holidays before assailing the show’s crazy crowds.

In November, the Consumer Electronics Association presented its usual preview in New York City, where it released some headline data and also highlighted a few trends to follow for those going to Sin City or simply following the gushing spigot of announcements that will flood across all media.

One addition this year was a pre-show survey of buyers. There is always some preview data on what consumers are thinking—and we’ll come to that shortly—but the new exercise threw up one important point: the guys and gals stocking the warehouses also think that we now unquestionably live in a system world.

Consider first the five main product groups that they said were on their minds before CES: tablets, smartphones, TVs, ultrabooks and accessories.

Then there are the “activities” around which they see sales being generated. The top five here: computing, streaming, mobility, control and connecting.

Finally, the tech trends. The five buzz concepts here follow on logically from their other priorities: smart, cloud, touch, voice and apps.

As the CEA’s leading analysts, Shawn DuBravac and Steve Koenig, noted in a typically zippy presentation, consumer products used to be quite closely defined, particularly in terms of their functionality. Today, though, we want devices that are customizable or that might consolidate the role several boxes played in the past. And, of course, we want products that we can buy and suddenly do things with them that we had not originally imagined.

The tablet and the smartphone have undoubtedly done this, and so the next list of the five “hottest” specific areas should come as little surprise:

1. Apps (for mobile devices)
2. Tablets
3. Devices for streamed content
4. Internet-enabled TVs
5. Devices to enable sharing content

Indeed, tablets remain hot even though they have already reached about 10% of U.S. households according to the latest CEA data. But beyond that, the preview also showed the same trend moving into car radios that can access the Android apps marketplace. And this is not an isolated process.

“People used to go to CES to see devices,” said DuBravac. “Now they go to see entire ecosystems. They’re interested in how all these products interact.”

With that in mind, here are the three CES trends that DuBravac and Koenig picked out.

1. The end of the computer... all hail computing

There has been an ongoing tension between the drive to pack more capability into end products and the need to reduce power consumption and extend battery life.

But, according to the CEA duo, an extra nuance is now being added where the next big trend is going to be in adding wireless interconnectivity to all your consumer electronics. It’s not about just doing more, but broadening the experience for the user across a range of devices.


Figure 1: Allocation of gift spending on CE (all)
Year: 2007, 2008, 2009, 2010, 2011
% of gift spending on CE: 22, 28, 29, 31, 32
Amount ($): 194, 206, 222, 232, 246


Beyond that, as noted above, the degree to which products are “substitutive” will also be a key driver in determining how much market traction they will get.

Given all these shifts, DuBravac noted that last year’s CES saw more than 100 tablets offered. “But there was a lot of experimentation. Now those devices have more concrete use-cases,” he added.

2. CE now stands for “customizable experiences”

This is an obvious play on consumer electronics’ traditional abbreviation. It extends the observation that vendors now have a clearer view of the tablet use-case to make the broader point that buyers want to be able to tailor devices to their tastes.

As Koenig noted, this does not mean that tablets are simply “empty vessels”—there is a range of core functionality and processing power that they must provide.

“However, it is not just about the platform but also the ecosystem,” he continued. “‘What can I get?’ Not just devices but also services and accessories.”

3. The year of the interface

There is a natural progression in innovation, which DuBravac illustrated well in terms of TV remote controls. What was new gets pushed out (or more precisely into the background) to make way for the next wave of ideas.

So remotes began as clunky push button devices linked to just one thing, the TV itself. More recently, we got multi-button, multi-device remotes. Then the interface was simplified and we are now seeing touchscreen products that can run not merely your home theater but also appliances and central heating. But LG’s wand-like remote now takes that even further—it has barely any controls on it at all, but rather you control products with movements of your hand.

The UI has long been the area where consumer electronics was felt to fall down—hence, so much of Apple’s success—but the move toward a more ecosystem-based set of products means that ease of use is becoming increasingly important, and interfaces must therefore be simpler and more intuitive.

Much of this may seem relatively straightforward, and it is. But the main idea is that it points to this year’s CES as the venue for a maturation in the product cycle that, from a silicon design perspective, appears to presage two things: a still greater importance for software and still further wireless integration within devices, while the power-performance stand-off continues.

The shape of the market

The other important part of the CEA’s preview event is the pre-show market data. Here, the headline numbers are encouraging but there was something of a devil in the detail.

The U.S. consumer electronics industry is headed for a shipment value of $190B in 2011, 5.6% up on the year before. The current CEA forecast for 2012 is $197B. Consumer confidence, meanwhile, is at its highest level since December 2010, and the typical holiday spend on electronics is set to be around $246, up 6% on 2010 gift spending.

Figure 2: Allocation of gift spending on CE, 2011 (average spend of $478 for those planning to spend more, against $101 for those planning to spend less)


Find innovation at the Venetian

Eureka Park is a new addition to the International CES that will showcase more than 70 innovative companies in their own area at the Venetian Hotel.

This latest TechZone (the show now has 25) is specifically targeting media, venture capitalists, analysts and others looking to catch emerging companies and ideas.

Its siting at the Venetian will place it alongside many of CES’ conference sessions and keynote addresses, hopefully encouraging those audiences to visit Eureka Park and see, in the flesh, some of the new ideas described by speakers and on panels.

“Innovation and entrepreneurship drive our economy forward, and the Eureka Park TechZone proves that CES is the global platform for growing companies to unveil their game-changing technologies to the marketplace,” said Gary Shapiro, president and CEO.

“While leaders strive for policies that will create jobs, the companies within Eureka Park are creating products and services that will bring economic prosperity. We are excited to welcome these companies to CES and look forward to witnessing their cutting-edge innovations.”


So what’s the problem? Well, perhaps it is not a problem as such, more a set of possible warning signs about the general state of the still troubled U.S. economy.

The public is pushing its purchases out further and further, and many families are waiting on Black Friday deals. Retailers such as Best Buy are following the lead set over Thanksgiving 2010 by Toys“R”Us by opening at midnight, immediately after the holiday has finished, rather than the already brutally early 5am. Black Friday itself is turning into Black November with shops extending offers.

At the same time, retailer inventories are tight, as low as they have been in four years. “They are not at all-time lows, but they are close,” added DuBravac.

However, the most striking statistic comes in the breakdown of that average holiday spend. About one-third of consumers surveyed by the CEA said that they intend to cut back on holiday spending this year, and the difference in the budgets between those who will spend more and those who will spend less is stark.

“There’s about a 5X difference,” said Koenig. “Clearly these two different groups have very different products in mind, and if we see movement between them it will significantly impact the results.”

The specific average gift buying numbers are $478 per household for those who plan to push the boat out, and just $101 for those who are reining in.

Figure 3: Buyers’ ranking of CES trends by category
Video: 80%; Electronic gaming: 35%; Emerging technologies: 31%; Entertainment/content: 29%; Computer hardware and software: 28%; Connected home: 27%; Internet-based multimedia services: 26%; Lifestyle electronics: 25%; Wireless and wireless devices: 25%

A simplistic view would be that this reflects headline-grabbing U.S. concerns about the polarization in society between haves and have-nots. There may well be some of that, and the Occupy movement is a factor here. But the other question that stands is how it reflects sentiment. The $101 group will undoubtedly include people who have lost jobs in the family, but another important slice will be those who are concerned about the economy, their own prospects and paying down existing debt.

Again, there do seem to be implications here that will spread to electronics design. The last few months have seen a number of vendors roll out products that target low-cost versions of existing products such as tablets and smartphones. One of the latest was ARM (see page 14).

The assumption in the past was that such offerings were primarily aimed at emerging markets. However, given the distinction drawn by the CEA itself between the different types of product that two distinct groups of consumers will seek out, it would appear that the economy is now making demand for low-cost, entry-level products more universal.

Registration, exhibitors and conference programs for the 2012 International CES are available online at www.cesweb.org.


TECH FORUM: [ESL/SYSTEMC]

The seven habits of highly effective virtual prototypes

Shabtay Matalon and Yossi Veller, Mentor Graphics

Virtual prototyping is not a new technique, but the advent of transaction level modeling and an increased focus on seven key requirements for their effective use means that today’s versions are much more broadly applicable and comparatively future proof. Those seven qualities are: industry standards; platform modeling; processor modeling; virtual prototype creation; integrated hardware/software visualization and debug; performance/power analysis under software control; and optimization of software on multiple cores. The article provides a brief review of the importance of each and briefly describes a modeling strategy in terms of the Mentor Graphics Vista tool suite.

Yossi Veller is the chief scientist in the Mentor Graphics ESL Division. During his long software career, Yossi has led Ada compiler, VHDL, and C simulation development groups. He was also the CTO of Summit Design. He holds degrees in computer science, mathematics, and electrical engineering.

Shabtay Matalon is ESL market development manager for Mentor Graphics Design Creation and Synthesis Division. He received a BS in Electrical Engineering from the Technion, Israel Institute of Technology, Haifa, Israel. He has been active in system-level design and verification tools and methodologies for over 20 years and has published several articles in these areas. At Mentor Graphics, Shabtay focuses on architectural design and analysis at the transaction level. Prior to joining Mentor Graphics, Shabtay held senior marketing and engineering positions at Cadence Design, Quickturn, Zycad and Daisy Systems.

The share of key functionality implemented in software running on processors continues to grow in new designs. No longer dominating just laptops and PCs, software reigns in communication, networking, and automotive devices, and embedded software is found in many consumer devices. With off-the-shelf platforms providing the foundation for modern designs, it is software combined with selected hardware accelerators that differentiates one product from another. The growing importance of low power consumer and green devices is one reason for the increase in software processing units. Modern low power design techniques have built-in facilities to control power, but embedded and application software have “smarts” that add the use case context and determine how and when appropriate power control techniques can be applied. In addition, optimizing software for the processors it runs on can also help reduce power consumption.

How well software and hardware interact defines a device’s key performance, power consumption, and cost attributes. Integrating and optimizing software after the hardware has already been built is no longer an option; nor is the common practice of validating hardware and software in isolation. Software and hardware interactions must be validated before either set of architectural decisions is finalized.

Virtual prototyping gives software engineers the ability to influence the hardware specification before the RTL is implemented and reduces the final HW/SW integration and verification effort. It also provides significant benefits over hardware prototyping by using high-speed abstracted simulation models of the hardware. Virtual prototyping enables software engineers to use their software debugger of choice. It facilitates the debugging of complex HW/SW interactions by providing simulation control and visibility into the hardware states, memories, and registers. And it provides a comprehensive set of analysis capabilities that allows engineers to optimize the software and improve how it controls the hardware to meet performance and power goals.

Early virtual prototyping modeling techniques did not address today’s challenges. Most ran software against loose, proprietary mockup models of the hardware, providing a programmer’s view of the hardware to the software routines.



They allowed partial validation of the software functionality against the hardware register address space, but had limited capabilities when it came to validating the functionality of an entire device. To overcome that limitation, these virtual prototypes attempted to provide additional cycle accurate models that represented the functional behavior during each clock cycle, but at a faster speed compared to RTL.

Where the programmer’s view did not contain sufficient granularity of the underlying hardware to completely validate the software, cycle accurate models required a modeling effort close to that of writing the RTL, and frequently suffered from insufficient simulation performance to run software application code. Evaluating either performance or power under software control using either technique was generally impractical. Reuse of these models to produce virtual prototypes outside the framework of a single vendor environment was impossible due to the proprietary, closed nature of the model interface. Reuse in downstream flows, such as RTL verification, was non-existent.

To overcome these issues, today’s more advanced virtual prototyping technologies, such as the Vista virtual prototyping technology from Mentor Graphics, should have seven key attributes that enable them to address current and future design challenges.

1. Industry standards

Advanced virtual prototypes are composed of transaction level models (TLMs) that abstract functionality, timing, and communication. The SystemC TLM2.0 standard allows these models to be reused from project to project and makes them interoperable both among internal design teams and across the entire industry. Industry-compliant TLMs can be run on any industry-compliant SystemC simulator without requiring proprietary extensions. In addition, TLM2.0 contains specific enhancements that enable very efficient communication for optimal simulation speeds.
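As a concrete picture of what such an interoperable model looks like, here is a minimal SystemC/TLM2.0 sketch (our own example, not code from the Vista library): an initiator issues a blocking write to a simple memory-like target through the standard generic payload and sockets, so the pair compiles against any compliant SystemC implementation.

```cpp
// Minimal TLM2.0 initiator/target pair using only standard SystemC classes.
#include <cstring>
#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
#include <tlm_utils/simple_target_socket.h>

struct Target : sc_core::sc_module {
    tlm_utils::simple_target_socket<Target> socket;
    unsigned char mem[64] = {};                       // tiny register/memory space

    SC_CTOR(Target) : socket("socket") {
        socket.register_b_transport(this, &Target::b_transport);
    }

    // Blocking transport: services reads and writes, annotating a nominal delay.
    void b_transport(tlm::tlm_generic_payload& trans, sc_core::sc_time& delay) {
        unsigned char* data = trans.get_data_ptr();
        sc_dt::uint64 addr  = trans.get_address();    // no bounds checking, for brevity
        if (trans.is_write())
            std::memcpy(&mem[addr], data, trans.get_data_length());
        else
            std::memcpy(data, &mem[addr], trans.get_data_length());
        delay += sc_core::sc_time(10, sc_core::SC_NS);
        trans.set_response_status(tlm::TLM_OK_RESPONSE);
    }
};

struct Initiator : sc_core::sc_module {
    tlm_utils::simple_initiator_socket<Initiator> socket;

    SC_CTOR(Initiator) : socket("socket") { SC_THREAD(run); }

    void run() {
        unsigned char value = 0x42;
        tlm::tlm_generic_payload trans;
        sc_core::sc_time delay = sc_core::SC_ZERO_TIME;
        trans.set_command(tlm::TLM_WRITE_COMMAND);
        trans.set_address(0x10);
        trans.set_data_ptr(&value);
        trans.set_data_length(1);
        trans.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);
        socket->b_transport(trans, delay);            // standard blocking call
        sc_core::wait(delay);                         // consume the annotated time
    }
};

int sc_main(int, char*[]) {
    Initiator initiator("initiator");
    Target    target("target");
    initiator.socket.bind(target.socket);             // plain TLM2.0 socket binding
    sc_core::sc_start();
    return 0;
}
```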

2. Platform modeling (LT/AT)

A platform modeling strategy not only defines the level of investment to create the platform but also the capabilities provided to the end user. A scalable TLM-based methodology separates communication, functionality, and the architectural aspects of timing and power into distinct models. Such a model can run in a loosely timed (LT) mode at a very high speed—or it can switch to an approximately timed (AT) mode for more detailed performance and power evaluations under software control. When modeling power, AT mode allows engineers to associate power values with transaction-level computation and communication time and consider the power state of each model.

Figure 1: Scalable TLM power model and power modeling policies (a model with ports and a communication layer wrapped around separate function, timing and power aspects). Source: Mentor Graphics

Figure 2: Vista communication, computation, and state-based power policies. Source: Mentor Graphics


LT/AT switching can be facilitated during run time to adapt to the software mode of operation, such as boot versus running application code.
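To make the loosely timed style more tangible, the sketch below uses the standard tlm_utils quantum keeper rather than any Vista-specific interface (an illustration under stated assumptions, not the mechanism Vista itself uses): the model runs ahead of simulation time and only yields to the SystemC kernel once its accumulated local delay exceeds the global quantum, so enlarging the quantum buys simulation speed at the cost of timing detail.

```cpp
// Illustrative loosely timed (LT) model with temporal decoupling.
#include <systemc>
#include <tlm>
#include <tlm_utils/tlm_quantumkeeper.h>

struct LtCpu : sc_core::sc_module {
    tlm_utils::tlm_quantumkeeper qk;   // tracks how far this model runs ahead of sc_time_stamp()

    SC_CTOR(LtCpu) { SC_THREAD(run); }

    void run() {
        qk.reset();
        for (int i = 0; i < 1000; ++i) {
            // ...issue b_transport calls here and add their annotated delays...
            qk.inc(sc_core::sc_time(10, sc_core::SC_NS));  // local cost of one modeled operation
            if (qk.need_sync())
                qk.sync();                                  // yield to the kernel, reset local offset
        }
    }
};

int sc_main(int, char*[]) {
    // One global quantum for the platform: larger values mean faster simulation
    // but coarser interleaving between models (the LT/AT trade-off in miniature).
    tlm::tlm_global_quantum::instance().set(sc_core::sc_time(1, sc_core::SC_US));
    LtCpu cpu("cpu");
    sc_core::sc_start();
    return 0;
}
```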

3. Processor modeling (JIT)

Processor models that run the embedded software are at the heart of any effective virtual prototype. These models usually determine the overall simulation performance of the platform, depending on their modeling and communication efficiency. Just-in-time (JIT) modeling allows the embedded code to run most efficiently on the host by allowing the target processor instruction code to be natively compiled into the host processor instruction code structure when needed. This is done while preserving thread safety and correctly supporting multiple instances of the same processor type or different processor types on the host.

4. Virtual prototype creation

This process assembles the individual industry-standard-based processors, peripherals, buses, and memory TLMs into a virtual platform capable of executing software natively. The platform producer can use the TLM block diagram capability to define the design topology by connecting graphical symbols representing each TLM. Similarly, topology changes can be implemented quickly by interactively changing the connections. Saving the topology can automatically generate the complete virtual TLM platform model used to run the software on the embedded processors. A virtual prototyping compiler can produce executables in sufficient quantities to serve large software design teams. Such an executable provides the level of hardware visibility and control needed to integrate, validate, and optimize OS and application code against hardware.

5. Integrated HW/SW debug
As software is running on the virtual prototype, it is important to provide the right level of visibility into the hardware states. Software engineers are used to running their favorite software debug tools, such as off-the-shelf GDB, ARM RVDE, and Mentor Sourcery CodeBench. An integrated debug environment on a virtual platform allows them to use their preferred debuggers to validate, debug, and optimize the software using standard hardware visualization techniques, such as state, memory, and register views of the hardware. Simulation control of the software and hardware, such as single stepping through software while advancing simulation time, is needed. So are stop, checkpoint, and restart capabilities.

6. Performance/power analysis
As the software controls the hardware's modes of operation and use models, it is important to perform software optimization to meet device performance and low power goals. This can only be accomplished by using performance analysis graphs, such as data throughput, latency, and power, that display hardware dynamic power for each software routine executing on the platform. The software designer can see the direct impact of software changes on the virtual platform's performance and power attributes.
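A toy sketch of the bookkeeping such per-routine analysis implies is shown below: the hardware models report energy per transaction and the analysis layer charges it to whichever software routine is currently executing. The class, the routine names and the energy figures are all invented for illustration.

#include <cstdio>
#include <map>
#include <string>

class PowerLedger {
  std::map<std::string, double> energy_by_routine_;  // joules attributed so far
  std::string current_routine_ = "<idle>";
public:
  void enter_routine(const std::string& name) { current_routine_ = name; }
  void add_transaction_energy(double joules)  { energy_by_routine_[current_routine_] += joules; }
  void report(double elapsed_seconds) const {
    for (const auto& r : energy_by_routine_)
      std::printf("%-14s average power %.3f mW\n",
                  r.first.c_str(), 1e3 * r.second / elapsed_seconds);
  }
};

int main() {
  PowerLedger ledger;
  ledger.enter_routine("boot");
  ledger.add_transaction_energy(2.0e-3);   // assumed energy of boot-time bus traffic
  ledger.enter_routine("video_decode");
  ledger.add_transaction_energy(7.5e-3);   // assumed energy of decode DMA bursts
  ledger.report(0.5);                      // assumed 0.5 s of simulated time
  return 0;
}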

7. Optimization across multiple cores
To maximize performance and lower power consumption, software is partitioned across multiple cores to provide the best throughput for the desired functionality. However, over-partitioning may reduce performance due to increased inter-processor communication or competition when sharing common hardware resources. Conversely, under-utilizing the processor/core resources will result in sub-optimal performance.

Figure 3: Vista hardware-aware virtual prototypes. Source: Mentor Graphics


Thus the virtual prototyping solution must have the capabilities to conduct “what if” analyses to determine the optimal hardware configuration and software partitioning for differentiated designs.

Mentor Graphics' Vista hardware-aware virtual prototyping solution has all of these key attributes, enabling early validation of software against the target hardware, reducing the HW/SW verification effort, and easing the creation of post-silicon reference platforms. Virtual prototyping can be conducted at a much earlier design stage than physical prototypes, even before any RTL is designed, increasing productivity and shrinking time-to-market.

Because virtual prototypes are highly abstract, the code representing the hardware is much smaller and simpler. Thus, virtual prototypes simulate orders of magnitude faster than RTL code, capturing bugs manifested in complex scenarios that are impossible to simulate at the RTL stage and making debug much easier. Further, TLM platform models can be used as golden reference models, reducing the time to construct RTL self-checking verification environments.

Even after the device and chip are fabricated, virtual prototypes provide post-silicon reference platforms for simulating scenarios that are difficult to replicate and control on the final product. They also give visibility into internal performance, power, and design variables not reachable within the physical chip. Virtual prototypes can be used for isolating field-reported problems and for exploring and fixing a problem through software patches or design revisions.

Advanced virtual prototyping allows validation of the entire functionality implemented in both hardware and software. When in AT mode, virtual prototypes run several orders of magnitude faster than cycle accurate platform models, yet they still achieve a sufficient level of accuracy to support comprehensive performance and power optimization. When running in LT mode, virtual prototyping allows software engineers to quickly run the application, middleware, and OS code at close to real-time speeds against a complete functional model of the hardware. Switching between AT and LT modes during run time allows the virtual prototype to be used effectively at any time during simulation.

Vista platform models give software engineers the ability to influence the hardware design before the RTL is implemented, reducing the final integration and verification efforts. Vista can also create an executable specification that can be provided to a large number of software engineers so that they can validate their application software against the hardware during the pre-silicon design stage. This executable can even be given to field engineers as a reference debugging platform during the post-silicon stage, after the product has been sold to customers.

As the amount of functionality implemented in software running on multicore processors continues to grow, how well software and hardware interact defines device performance, power consumption, and cost attributes. Advanced, hardware-aware virtual prototyping is the best way to optimize these important attributes and enable concurrent hardware/software development throughout the design flow.

Mentor Graphics
Corporate Office
8005 SW Boeckman Rd
Wilsonville, OR 97070
USA

T: +1 800 547 3000
W: www.mentor.com

Figure 4: Creating a virtual prototype. Source: Mentor Graphics


For wireless electronic appliances, battery life has a major influence on purchasing decisions. Mobile phones, PDAs, digital cameras and personal MP3 players are increasingly marketed according to the battery life they offer.

In wired applications, power consumption determines heat generation, which in turn drives packaging cost. If not managed properly, this may have a significant impact on the price of the finished product.

The increasing density of ICs leads to progressively increasing power density and this further complicates the challenge inherent in packing more into systems while consuming less power. Industry projections suggest that today’s designs face further increases in leakage power in the range of 4-6X with each new process generation, so all available techniques must be used.

The goal for the designer is simple: to control power consumption to the greatest extent possible. However, successfully doing so entails attention to fundamental issues such as functionality, testability, manufacturability, area, timing and constraints.

Moreover, modern day designs typically have many millions of components that require projects to be apportioned out to multiple design teams. Each team will be tasked to reach various project milestones that determine whether the finished chip ships on time. This complex web of interdependencies means it is extremely expensive to add additional steps to a flow, wherein a design is modified to deliver power efficiency. Time-to-market pressures leave little room for maneuver in the creation of a power-stingy design.

Design perspective
Designers already have a number of techniques that they can use to reduce power consumption. They can adopt lower supply voltages, draw upon off-the-shelf power management features, and exploit several programming strategies.

Reducing power consumption requires design strategies that address potential savings as early in the flow as possible. Late stage design changes bring with them enormous costs in terms of time and money, and can even cause an entire project to fail. The article describes an approach, based on Atrenta’s SpyGlass-Power tool, that builds power consciousness into the RTL early enough in the flow to maximize savings and minimize disruption.

The strategy uses structural and formal analysis, a broad range of rules and checks, and a set of capabilities to check for power inefficiencies to pinpoint weak areas and make automatic changes while preserving functional integrity.

Kiran Vittal, Atrenta

How to achieve power estimation, reduction and verification in low-power design

TECH FORUM: [VERIFIED RTL TO GATES]

Kiran Vittal is senior director of product marketing at Atrenta.


The challenge in designing for low power is that most tools offer the designer no visibility into the ramifications of these techniques at the RTL stage. The focus is on functional aspects of the design and power consumption gets pushed to the margins. The result is that easy-to-plug power loopholes creep into the design, and by the time the project has reached the stage where traditional tools will spot these problems, it is too late to make fundamental changes.

Power optimization needs to become an integral part of the design process at the RTL. A tool is needed that identifies power inefficiencies in the RTL code, and then suggests ways in which power can be reduced based on tried-and-tested mechanisms. Designers want a tool that allows them to make changes while coding the RTL itself, or that implements such changes automatically.

Power saving approach
The techniques cited above need to be applied in the context of a flow from the initial architectural definition through to the final design representation. The main steps are shown in Figure 1.

During the architectural stage, the design team plans whether to use voltage domains, power domains and/or clock gating, typically using its own members' expertise and a spreadsheet from previous designs. SpyGlass-Power from Atrenta provides additional high-speed, highly accurate power estimation at RTL. The key is to estimate power early, while the design can still be transformed, rather than waiting until after a gate-level implementation.

After power consumption information is calculated, the design team changes the design to reduce power. The Atrenta tool provides an activity-based power calculation of the impact of each gated enable, giving more intelligence and driving downstream clock-gate insertion. Moreover, it further reduces power by making sure existing clock enables are effective and by identifying new clock enables.

While working with the RTL, level shifters and isolation logic are auto-inserted or inserted with an in-house script. These changes must be verified against the design intent: the definition of the voltage domains, power domains and their proper power-up/power-down sequencing. Managing all this at RTL means problems are caught early.

Implementation tools synthesize and then place and route the design. They insert clock gates with guidance from earlier analysis. Placement and timing optimization transforms the design, inserts buffers, or swaps in low voltage threshold cells to help meet timing at the expense of leakage power.

Once all this is complete, the final step is an impartial verification that the design is implemented true to the original power intent. Level shifters and isolation logic are verified, and power and ground pins on cells are checked to ensure they are connected to the correct power and ground nets.


Figure 1: Power saving flow. Steps: power architecture (decide voltage and power domains and global clock gates; frozen very early in the design flow); power estimation (are power targets met?); power reduction (was voltage/power/clock planning effective? forecast possible results from gate selection); RTL power verification (domain sequencing; check level shifter and isolation logic; power-aware simulation); implementation with synthesis and place and route (clock gating, multi-Vt, MTCMOS, power recovery); post-layout power verification (level shifter, isolation logic, power routing verification); sign-off power estimation. Source: Atrenta

Figure 2: RTL power estimation: leakage power, internal power and switching power over time. Source: Atrenta



Power saving techniques
Power saving encompasses a breadth of techniques and addresses power at different instances in the flow from RTL to gate level to post-layout. Performing power estimation as early as possible, at RTL, provides valuable information about the power consumption while giving designers the time to make any necessary or beneficial changes.

Power consumption can be classified into two broad categories, dynamic power and static power, and both require attention in a complete power management solution. Voltage domains and clock gating address dynamic power. Power domains that isolate (or "sleep") parts of the design and multiple threshold voltage techniques address static (leakage) power.
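The two categories can be illustrated with back-of-the-envelope arithmetic. The sketch below is not SpyGlass-Power's algorithm; the activity factor, switched capacitance, cell count and leakage figures are invented purely to show which quantities dynamic and static estimates depend on.

#include <cstdio>

int main() {
  // Dynamic power: P = alpha * C * V^2 * f (switching activity times
  // switched capacitance times supply voltage squared times clock frequency).
  double alpha      = 0.15;    // assumed average switching activity per cycle
  double c_switched = 2.0e-9;  // assumed total switched capacitance, farads
  double vdd        = 1.0;     // supply voltage, volts
  double freq       = 500e6;   // clock frequency, hertz
  double p_dynamic  = alpha * c_switched * vdd * vdd * freq;

  // Static (leakage) power: summed per cell from library characterization.
  double leak_per_cell = 5e-9; // assumed average leakage per cell, watts
  double cells         = 2e6;  // assumed instance count
  double p_leakage     = leak_per_cell * cells;

  std::printf("dynamic %.3f W, leakage %.3f W, total %.3f W\n",
              p_dynamic, p_leakage, p_dynamic + p_leakage);
  return 0;
}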

RTL power estimation
An important precursor to power reduction work involves understanding how much power is being consumed. As early as possible in the flow, as soon as RTL is ready, power estimation will help designers understand where the greatest power is being consumed.

Spreadsheet-based estimates may work for derivative designs, but new RTL is uncharted territory. The SpyGlass platform has been made timing-aware by loading timing constraints, design activity files and libraries to calculate a more accurate power number. This is especially important for timing-critical designs.

The tool quickly builds a design representation to calculate the cycle-by-cycle and average power for the design's dynamic, leakage and internal power. Graphical displays and generated reports of clock power, control power and memory power guide the design effort (Figure 2).

Multiple threshold voltages
In an effort to save leakage power, back-end implementation tools may use multiple libraries where cells have the same function but different threshold voltages. Overall, cells with a higher threshold voltage are used since they exhibit lower leakage power. After place and route, a timing optimization step sweeps through the design and swaps in lower threshold voltage cells along timing-critical paths. Though these cells are faster, they exhibit higher leakage.

Figure 3: Clock gating. Non-gated versus clock-gated output using a typical ICGC (integrated clock gating cell). Source: Atrenta

Figure 4: Identify new enables. Original versus modified implementations. Source: Atrenta



Design teams who utilize multiple threshold voltage techniques have a sense of the typical "mix percentage" of high-Vt to low-Vt cells. The power estimation in SpyGlass-Power can use this percentage to compute power consumption at RTL.

Power reduction
Clock gating
The set of practices focused on controlling the activity of nets in a digital circuit with a view to reducing power consumption is called "activity management."

Clocks are the most active nets in a design and contribute significantly to overall activity. Therefore, they are the main targets of activity control techniques that seek to reduce power consumption.

Clock gating is a prominent technique here. An explicit clock enable in the RTL code allows synthesis tools to choose between two implementations, as shown in Figure 3.

The Atrenta tool takes the RTL description of the design, performs a fast synthesis, and analyzes it to suggest clocks that could be gated to achieve power efficiencies. Rather than allow a synthesis tool to insert gated clocks based only on the width of the data bus, the tool shows the designer which enables will save the most power. It then creates a constraint file for downstream synthesis.

Beyond that, it will report new opportunities for clock enables that may not have occurred to the RTL designer. Ungated downstream registers present an opportunity for power savings if the enable is delayed by one clock cycle. Alternatively, data can be gated to disable activity in parts of the design where the data is not being listened to. Figure 4 illustrates some of these techniques.

The focus is to leverage extremely effective power saving techniques early and ingrain them in the RTL creation phase.

Activity management
While clock gating is an important technique for reducing power consumption through controlling active nets, SpyGlass-Power also helps in analysis of datapaths, control structures and buses for activity reduction.

Clock nets account for a large proportion of dynamic power consumption for two reasons:

a) clocks are the most active nets in the design; and
b) clocks account for a large portion of capacitive load.


Figure 5: SpyGlass-Power recommends a candidate for clock gating (flops sharing a clock and enable). Source: Atrenta

Figure 6: SpyGlass-Power detects missing level shifters between voltage domains. Source: Atrenta



The tool analyzes simulation data to compute activities and probabilities for each net of a given design. It incorporates both simulation-based and statistical approaches to analyze activity. Nets and regions with higher activity are reported and this data is used in guiding clock gating opportunities.
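At its simplest, the activity figure for a net is just toggles divided by simulated cycles. The toy sketch below shows that calculation with invented net names and counts; production tools combine such simulation-derived numbers with statistical propagation, which is not shown here.

#include <cstdio>
#include <map>
#include <string>

int main() {
  const double cycles = 10000.0;                     // assumed simulated clock cycles
  std::map<std::string, double> toggles = {          // assumed per-net toggle counts
      {"u_core/data_bus0", 4200.0},
      {"u_core/fifo_full",  120.0},
      {"u_core/dbg_mode",     2.0}};

  for (const auto& net : toggles) {
    double activity = net.second / cycles;           // average toggles per cycle
    std::printf("%-20s activity = %.4f%s\n", net.first.c_str(), activity,
                activity < 0.05 ? "  (low: candidate for gating analysis)" : "");
  }
  return 0;
}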

The tool analyzes each flop in the design to propose a set of candidates for clock gating. In addition, designers choose heuristics that help pinpoint those candidates that will have the maximum impact on power consumption. Visualization tools help highlight the areas in the design that will be impacted by gating, thus helping the designer make the required changes.

Figure 5 shows a set of enabled flops that share clocks and enables and thus are good candidates for clock gating. Such flops can be spread across design unit boundaries. Through fast synthesis and analysis capabilities deployed early in the design cycle, SpyGlass helps identify such candidates for clock gating. In addition, flops with a combinational feedback loop around them (indicating that data is being held by that flop) also become good candidates for clock gating.

Similar guidance helps with other power reduction activities, such as using guarded evaluation to latch inputs in power hungry units, coding finite state machines for low power operation, and reducing glitch activity in the design.

Power verification
Along with the techniques for power reduction, the verification of voltage domains (level shifters) and power domains (isolation logic) can also be performed as the design transforms from RTL, to gate level, to post-layout.

Leakage power management
A significant source of power consumption in modern day designs is leakage power. Also, since the leakage current increases by a factor of approximately five for each successive process generation, it is poised to be a dominant issue in power consumption in the near future.

The main way to handle leakage dissipation is based on the fact that not all portions of a design will be active all the time. Hence, during certain operational modes, some portions can be turned off. This gives us the concept of multiple power domains. Creating these multiple domains, managing the interfaces between them, and ensuring uncompromised functionality is essentially what we call power management.

A principal concern while using multiple power domains is to manage the interfaces; that is, to mutually isolate the outputs of disparate power domains in order to rule out floating nodes in the design that would lead to circuit malfunction. The idea is to insert isolation logic to ensure that signals do not end up taking unknown values during the power-down modes. The isolation cells need to be connected to a common enable signal that controls the values under the shutdown condition. The enable signal, in turn, needs to be generated outside the powered-down domain.

The Atrenta tool reads the design and analyzes every power domain for missing or incorrect isolation logic. The issues detected are displayed in a text window and schematic viewer where they can be easily accessed, cross-referenced to the RTL, and then modified and recreated.

Voltage management
Reducing the supply voltage to consequently reduce power consumption is at the core of voltage management. It refers to the practice of using higher voltages only where they are absolutely necessary to meet the performance standards. This technique can lead to tremendous power savings, sometimes in the region of 50%.


Figure 7: Physical power connection scenarios, including a power switch, always-on 1.2V and 1.0V supplies, a 1.2V power domain, a retention register, an always-on buffer and a level shifter. Source: Atrenta




Using different voltage supplies obviously creates two or more voltage domains. This raises the challenge of having proper interfacing circuits between any two voltage levels so that the design as a whole functions as intended. Consequently, level shifter circuits are required on all signals at all voltage level crossings.

Whether the level shifters are inserted at RTL or the netlist level depends upon specific design practices; but in both cases, SpyGlass-Power can use the RTL code and a description of power intent to guide level shifter insertion and detect missing level shifters (Figure 6).

Post-layout verification
After verification at the RTL and post-synthesis stages, there is then post-layout verification. This is a last independent check of the design to prevent chip failure due to a power bug. Of course, voltage domains and power domains can be checked again to be sure physical optimization engines have not introduced a power bug. Also at this stage, the supply and ground nets are represented in the logical and physical connectivity of the design. Adding one voltage domain and one power domain to a design will result in six different scenarios for power connections. Post-layout verification will ensure that supply nets and ground nets are correctly connected to prevent chip failure. Figure 7 illustrates some power connection scenarios.

Conclusion
Power consumption is a major concern for designers today, and increasing design complexity, coupled with mounting market pressures, makes controlling it critical to success. Any late stage design changes bring with them enormous costs, both in terms of time and money, and on occasion they can actually doom an entire project. What is required is a full-fledged design management strategy that helps build power consciousness into the RTL itself right from the very infancy of the design.

SpyGlass-Power leverages a breadth of technologies to ensure designers achieve their power goals. It employs structural and formal analysis, a broad range of rules and checks, and a set of capabilities to check for power inefficiencies to pinpoint weak areas and make automatic changes while preserving functional integrity.

Atrenta, Inc.
2077 Gateway Place, Suite 300
San Jose, CA 95110
USA

T: 1-866-287-3682
W: www.atrenta.com


Functional verification consumes a significant portion of the time and resources devoted to a typical design project. As chips continue to grow in size and complexity, designers must increasingly rely on dedicated verification teams to ensure that systems fully meet their specifications.

Verification engineers have at their disposal a set of dedicated tools and methodologies for automation and quality improvement. In spite of this, functional logic errors remain a significant cause of project delays and re-spins. One of the main reasons is that two important aspects of verification environment quality (the ability to propagate the effect of a bug to an observable point, and the ability to observe the faulty effect and thus detect the bug) cannot be analyzed or measured. Existing techniques, such as functional coverage and code coverage, largely ignore these two issues, allowing functional errors to slip through verification even where there are excellent coverage scores. Existing tools simply cannot assess the overall quality of simulation-based functional verification environments.

This paper describes the fundamental aspects of functional verification that remain invisible to existing verification tools. It introduces the origins and main concepts of a technology that allows this gap to be closed: mutation-based testing. It describes how SpringSoft uses this technology to deliver the Certitude Functional Qualification System, how it seeks to fill the "quality gap" in functional verification, and how it interacts with other verification tools.

Functional verification quality
Dynamic functional verification is a specific field with specialized tools, methodologies, and measurement metrics to manage the verification of increasingly complex sets of features and their interactions. From a project perspective, the main goal of functional verification is to get to market with acceptable quality within given time and resource constraints, while avoiding costly silicon re-spins.

At the start of a design, once the system specification is available, a functional testplan is written. From this testplan, a verification environment is developed. This environment has to provide the design with the appropriate stimuli and check if the design's behavior matches expectations. The verification environment is thus responsible for confirming that a design behaves as specified.

Functional logic errors remain a significant cause of project delays and re-spins. One of the main reasons is that two important aspects of verification environment quality—the ability to propagate the effect of a bug to an observable point and the ability to observe the faulty effect and thus detect the bug—cannot be analyzed or measured. The article describes tools that use a technique called mutation-based testing to achieve functional qualification and close this gap.

George Bakewell, SpringSoft

The principles of functional qualification

TECH FORUM: [VERIFIED RTL TO GATES]

George Bakewell is director of product marketing at SpringSoft, responsible for product management and technical direction of the company's verification enhancement systems. During more than 20 years in the EDA industry, George has actively participated in industry standards organizations, presented at numerous technical conferences, and conducted in-depth tutorials and application workshops. He holds a Bachelor of Science degree in Electronic Engineering and Computer Science from the University of Colorado.


The current state of play
A typical functional verification environment can be decomposed into the following components (Figure 1):

• A testplan that defines the functionality to verify
• Stimuli that exercise the design to enable and test the defined functionality
• Some representation of expected operation, such as a reference model
• Comparison facilities that check the observed operation versus the expected operation

If we define a bug as some unexpected behavior of the design, the verification environment must plan which behavior must be verified (testplan), activate this behavior (stimuli), propagate this behavior to an observation point (stimuli), and detect this behavior if something is "not expected" (comparison and reference model). The quality of a functional verification environment is measured by its ability to satisfy these requirements. Having perfect activation does not help much if the detection is highly defective. Similarly, a potentially perfect detection scheme will have nothing to detect if the effects of buggy behavior are not propagated to observation points.

Code coverage determines if the verification environment activates the design code. However, it provides no information about propagation and detection abilities. Therefore, a 100% code coverage score does not measure if design functionality is correctly or completely verified. Consider the simple case in which the detection part of the verification environment is replaced with the equivalent of "test=pass." The code coverage score stays the same, but no verification is performed.

Functional coverage encompasses a range of techniques that can be generalized as determining whether all important areas of functionality have been exercised by the stimuli. In this case, “important areas of functionality” are typically represented by points in the functional state space of the design or critical operational sequences.

Although functional coverage is an important measure, providing a means of determining whether the stimuli exercise all areas of functionality defined in the functional specification, it is inherently subjective and incomplete. The functional coverage points are defined by humans, based on details in the specification or experience and judgment applied during the verification process.

Focus is placed on the good operation of the design (ensuring that states have been reached or sequences traversed as expected) and not on checking for unexpected or inappropriate operations. The result is a metric that provides good feedback on how well the stimuli cover the operational universe described in the functional specification, but is a poor measure of the quality and completeness of the verification environment.

Clearly, there is a lack of adequate tools and metrics to track the progress of verification. Current techniques provide useful but incomplete data to help engineers decide if the performed verification is sufficient. Indeed, determining when to stop verification remains a key challenge. How can a verification team know when to stop when there is no comprehensive, objective measure that considers all three portions of the process: activation, propagation, and detection?

A new verification technique known as functional qualification addresses this problem. It is built on mutation-based principles. Mutation-based testing allows both improvement and debugging of the "compare" or checking part of a verification environment and measurement of overall verification progress. It goes well beyond traditional coverage techniques by analyzing the propagation and detection abilities of the verification environment.

Mutation-based testing
Functional qualification exhaustively analyzes the propagation and detection capacities of verification environments, without which functional verification quality cannot be accurately assessed.

Mutation-based testing originated in the early 1970s in software research. The technique aims to guide software testing toward the most effective test sets possible. A "mutation" is an artificial modification in the tested program, induced by a fault operator. The Certitude system uses the term "fault" to describe mutations in RTL designs.

A mutation is a behavioral modification; it changes the behavior of the tested program. The test set is then modified in order to detect this behavior change.
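The core idea is easy to illustrate. The toy below applies a single, invented fault operator (swapping a logical AND for an OR) to a line of RTL-like text and prints the resulting mutants; it is nothing like Certitude's engine, but each printed variant is exactly the kind of behavioral change a good test set should catch.

#include <cstdio>
#include <string>
#include <vector>

struct Mutant { std::size_t pos; std::string original, replacement; };

// One example fault operator: replace every "&&" with "||".
std::vector<Mutant> and_to_or(const std::string& src) {
  std::vector<Mutant> out;
  for (std::size_t p = src.find("&&"); p != std::string::npos; p = src.find("&&", p + 2))
    out.push_back({p, "&&", "||"});
  return out;
}

int main() {
  std::string rtl_like = "assign grant = req && !busy && enable;";
  for (const Mutant& m : and_to_or(rtl_like)) {
    std::string mutated = rtl_like;
    mutated.replace(m.pos, m.original.size(), m.replacement);
    // Each mutant would be simulated against the test set; an undetected
    // ("live") mutant exposes a propagation or checking weakness.
    std::printf("mutant at offset %zu: %s\n", m.pos, mutated.c_str());
  }
  return 0;
}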


Figure 1: The four aspects of functional verification. Source: SpringSoft


When the test set detects all the induced mutations (or "kills the mutants" in mutation-based nomenclature), the test set is said to be "mutation-adequate." Several theoretical constructs and hypotheses have been defined to support mutation-based testing.

"If the [program] contains an error, it is likely that there is a mutant that can only be killed by a testcase that also detects this error" [1] is one of the basic assumptions of mutation-based testing. A test set that is mutation-adequate is better at finding bugs than one that is not [2]. So, mutation-based testing has two uses. It can:

1. assess/measure the effectiveness of a test set to determine how good it is at finding bugs; or

2. help in the construction of an effective test set by providing guidance on what has to be modified and/or augmented to find more bugs

Significant research continues to concentrate on the identification of the most effective group of fault types [3]. Research also focuses on techniques aimed at optimizing the performance of this testing methodology. Some of the optimization techniques that have been developed include selective mutation [4], randomly selected mutation [5] and constrained mutation [6].

Certitude
Using the principles of mutation-based testing and the knowledge acquired through years of experimentation in this field and in digital logic verification, the Certitude functional qualification technology was created. It has been production-proven on numerous functional verification projects with large semiconductor and systems manufacturers. The generic fault model, adapted to digital logic, has been refined and tested in extreme situations, resulting in the Certitude fault model. Specific performance improvement algorithms have been developed and implemented to increase performance when using mutation-based methodologies for functional verification improvement and measurement.

The basic principle of injecting faults into a design in order to check the quality of certain parts of the verification environment is known to verification engineers. Verifiers occasionally resort to this technique when they have a doubt about their test bench and there is no other way to obtain feedback. In this case of hand-crafted, mutation-based testing, the checking is limited to a very specific area of the verification environment that concerns the verification engineer.

Expanding this manual approach beyond a small piece of code would be impractical. By automating this operation, Certitude enables the use of mutation-based analysis as an objective and exhaustive way to analyze, measure, and improve the quality of functional verification environments for complex designs.

Certitude provides detailed information on the activation, propagation, and detection capabilities of verification environments, identifying significant weaknesses and holes that have gone unnoticed by classical coverage techniques. The analysis of the faults that do not propagate or are not detected by the verification environment points to weaknesses in the stimuli, the observability and the checkers.

Certitude enables users to efficiently locate weaknesses and bugs in the verification environment and provides detailed feedback to help correct them. An intuitive, easy-to-use HTML report gives complete and flexible access to all results of the analysis (Figure 2). The report shows where faults have been injected in the HDL code, the status of these faults, and provides easy access to details about any given fault. The original HDL code is presented with colorized links indicating where faults have been qualified by Certitude. Usability is enhanced by a TCL shell interface.

Figure 2: Example of Certitude HTML report. Source: SpringSoft


Certitude is tightly integrated with the most common industry simulators. It does not require modification to the organization or execution of the user's existing verification environment. It is fully compatible with current verification methodologies such as constrained random stimulus generation and assertions.

Bibliography
[1] A.J. Offutt. Investigations of the Software Testing Coupling Effect. ACM Transactions on Software Engineering and Methodology, Vol. 1, No. 1, p. 5-20, 1992.
[2] A.J. Offutt and R.H. Untch. Mutation 2000: Uniting the Orthogonal. Mutation 2000: Mutation Testing in the Twentieth and the Twenty First Centuries, p. 45-55, San Jose, CA, 2000.
[3] M. Mortensen and R.T. Alexander. An Approach for Adequate Testing of Aspect Programs. 2005 Workshop on Testing Aspect Oriented Programs (held in conjunction with AOSD 2005), 2005.
[4] A.J. Offutt, G. Rothermel and C. Zapf. An Experimental Evaluation of Selective Mutation. In 15th International Conference on Software Engineering, p. 100-107, Baltimore, MD, 1993.
[5] A.T. Acree, T.A. Budd, R.A. DeMillo, R.J. Lipton, and F.G. Sayward. Mutation Analysis. Technical Report GIT-ICS-79/08, Georgia Institute of Technology, Atlanta, GA, 1979.
[6] A.P. Mathur. Performance, Effectiveness and Reliability Issues in Software Testing. In 15th Annual International Computer Software and Applications Conference, p. 604-605, Tokyo, Japan, 1991.
[7] R.A. DeMillo, R.J. Lipton and F.G. Sayward. Hints on Test Data Selection: Help for the Practicing Programmer. IEEE Computer, 11(4): p. 34-43, 1978.

SpringSoft
2025 Gateway Place, Suite 400
San Jose, CA 95110
USA

T: 1-888-NOVAS-38 or (408) 467-7888
W: www.springsoft.com


Mark Twain said, "Everyone talks about the weather but nobody does anything about it." Design for test (DFT) is a bit like that. We pay lip service to the fact that every chip needs to be tested as well as manufactured, but somehow all the glamour goes into simulation, synthesis, place and route, and other aspects of design creation. But ignoring a problem does not make it go away. It really is true that every chip needs to be tested. With testers getting more and more expensive, and test times increasing as chips get larger, the cost of test is not a negligible component of the overall production cost.

Historically, the way designers have handled test has been largely to ignore it. It was assumed that test was a process that could be grafted on after the design was complete. The increasing prevalence of memory built-in self test (BIST) and scan chains with automatic test-pattern generation (ATPG) for logic has meant that most aspects of test would be left to a specialist test expert when the design was largely complete.

That approach worked well enough in the world of smaller chips with single clock domains, single voltage domains, low clock speeds, relatively generous power budgets, and not too many worries about congestion or signal integrity. Systems-on-chip (SoCs) today are not like that. Yes, it is true even today that not every project has to deal with all of these complications. But most SoCs are large, have large numbers of clocks, multiple voltage domains and so on. In our world, leaving test until the end is a recipe for surprise schedule slips just before tapeout.

It is also important to note that it is a chip that gets tested. We can use various techniques to get vectors to blocks, but ultimately it is a chip that sits on the tester and not a block, and so test is a chip-level problem. And, not surprisingly, chip-level problems are best handled at the chip level.

The inherent complexity of today's systems-on-chip, with their multiple clock and voltage domains, requires test considerations to be moved further up design flows. The article describes strategies for, and benefits from, applying test before RTL goes through synthesis, augmenting what is already achieved through memory built-in self test and automatic test pattern generation.

Sandeep Bhatia, Oasys Design Systems

Design for test: a chip-level problem

TECH FORUM: [TESTED COMPONENT TO SYSTEM]

Sandeep Bhatia is senior R&D director at Oasys Design Systems, where he leads design-for-test (DFT) and low-power synthesis. He received his PhD degree in Electrical Engineering from Princeton University, and a Master's degree in Computer Engineering from the University of Rochester. Before joining Oasys, he was a product director for DFT at Atrenta, and senior architect for DFT synthesis at Cadence Design Systems.

Figure 1: Small design with scan chains that do not account for physical placement. Source: Oasys Design Systems


The solution to these conundrums is to handle synthesis at the chip level and make your DFT strategy an integral part of that. It means that we address the problem earlier in the design cycle and at a higher level.

Moving test up the flow
The first part of handling DFT in this way is to check the RTL before synthesis.

There are some RTL constructs that lead to gate-level structures that are inherently untestable with a standard DFT methodology. One good example is asynchronous set/reset or clocks that lack controllability. In addition, the commonly used power reduction technique of clock gating changes a DFT-friendly structure into a problem that needs to be solved by using clock-gating cells with an additional test pin.

When it comes to actually linking up the scan chains, there are a number of complications that need to be addressed or optimized, since different flops may have different power supplies or clocks and so cannot just be naïvely hooked together.

Scan chains can cross power domains, such as areas of the chip with different power supply voltages or areas that can be powered down. For such domains, level shifters and isolation cells need to be inserted automatically at the boundaries. This is driven, of course, by the file that specifies the power policy and defines the separate power domains, be it expressed in the CPF or the UPF standard.

Clock domains also need to be taken into account: that is, areas that are controlled by different clocks during normal (i.e., "non-test") operation of the chip. Sometimes, one solution is simply to restrict scan chains to individual clock domains. But that is not always desirable. Specifically, there are two cases to consider.

If the two clock domains do not interact during normal operation of the chip, then different clock trees may end up with different timing, creating hold violations. To avoid these violations, lockup latches need to be inserted. These latches hold the value on the opposite phase of the clock and so ensure that the value is available downstream without any race condition.

The second case is when clock domains do interact during normal operation. In this case, they should already be synchronized correctly and can then be treated as identical during scan chain generation without causing any problems.

To make better use of tester resources, scan test programs are almost always compressed. This requires placing a test compression block on the chip. These designs are proprietary to each ATPG vendor, such as Mentor Graphics with its Tessent TestKompress tool suite.

Test compression blocks allow a comparatively small number of test pins coming onto the chip to be used to generate perhaps hundreds of times more scan chains, shortening test times as well as minimizing test pin overhead. In practice, the test compression structure is a block of RTL created by the test compression software that is then added to the RTL for the whole chip and hooked up to the chains.

The flop factor
But the biggest challenge that needs to be taken into account when creating scan chains is the physical location of the flops. It is here that working at the chip level really offers a big advantage over working at the block level and then manually hooking up the sub-chains. The scan chains are not limited by the logical hierarchy of the design. During physical design, a particular logical block may end up being placed in a compact region that is good for scan insertion, but when it is not, it may end up spread out across the whole chip with the scan chain stretched out everywhere.

Another advantage of doing scan insertion during synthesis is that potential test problems can be debugged early in the design cycle. Since test, and especially scan chain reordering using block-based methodologies, occurs late in the design cycle, unexpected problems almost always have an impact on the final tapeout schedule.

Figure 1 shows a design where the scan chains have not been ordered in a way that takes into account their physical placement after synthesis.

Figure 2 is the same design re-implemented making use of the physical placement information. Each scan chain is a different color so the advantage in terms of routing is clear.


Figure 2: Design from Figure 1 taking advantage of placement information during synthesis. Source: Oasys Design Systems

Figure 3: Large design that does not use placement information during scan insertion. Source: Oasys Design Systems


Figure 3 is not a piece of abstract art but is a much larger design where the scan chains were hooked up using only logical information.

Figure 4 is the same design using physical placement information during synthesis. Most chains are compact enough that they look like separate areas on the die.

The output of generating the scan chains is a standard "scandef" file that can be used by both downstream physical design tools and ATPG tools. The user may choose to do another round of scan-chain ordering after physical placement.

Increasingly, large parts of chips are not synthesized directly but are blocks of IP from third-party suppliers. The standard way to handle test for such blocks is to provide test information using the Core Test Language (CTL) IEEE 1450.6 standard. It communicates the existing scan chains and how they are hooked up, and then allows for them to be merged into the top level scan chains.

RealTime Designer
Chip synthesis needs to be a high-capacity and very fast turnaround process. Oasys RealTime Designer can handle 100,000 flops per minute for analysis and runs at about half that rate for insertion. So, a 10 million instance design that might contain one million flops can be processed in around 10 minutes for analysis and 20 minutes for scan insertion.

Figure 5 shows the DFT flow and the various files that are used to create the final DFT placed netlist and test program.

By operating at a high level, test insertion can be treated as a global problem and a more suitable DFT architecture can be chosen. Performing scan insertion during synthesis means that it is not necessary to leave the tool, and the full-chip view makes it easy to do full-chip analysis and optimize the overall architecture. This in turn leads to shorter test times, smaller die, and fewer secondary problems. The apt comparison here is with the traditional approach of carrying out test at the block level, where decisions need to be locked down early on as to how many scan chains are in each block; with the full-chip view, this is completely automated.

Figure 4: Design from Figure 3 taking placement information into account during synthesis. Source: Oasys Design Systems

Figure 5: DFT flow and files (RTL, SDC, DFT IP, CTL, LIB and LEF feeding RealTime DFT/physical synthesis, with netlist, DEF, Scandef, CTL and STIL outputs for P&R and ATPG). Source: Oasys Design Systems

Oasys Design Systems
3250 Olcott Street, Suite 120
Santa Clara, CA 95054
USA

T: 408-855-8531
W: www.oasys-ds.com




Big probes, small features
A team from Duke University in North Carolina addressed the challenges of pre-bond testing of thru-silicon vias (TSVs) at the 2011 International Test Conference.

Where post-bond test checks for faults caused by the thinning, alignment or bonding of the die that compose a 3D IC, pre-bond test addresses problems that may arise in the TSVs themselves. If undiscovered before full assembly, these can still obviously lead to the outright failure of the finished device.

“Pre-bond testing of TSVs has been highlighted as a major challenge for yield assurance in 3D ICs,” Duke’s paper notes. “Successful pre-bond defect screening can allow defective dies to be discarded before stacking. Moreover, pre-bond testing and diagnosis can facilitate defect localization and repair prior to bonding.”

The types of defects being sought are, not surprisingly, very much akin to those that occur in more traditional interconnects. TSVs play the role of interconnects.

"Incomplete metal filling or microvoids in the TSV increase resistance and path delay. Partial or complete breaks in the TSV result in a resistive or open path, respectively. Impurities in the TSV may also increase resistance and interconnect delay. Pinhole defects can lead to a leakage path to the substrate, with a corresponding increase in the capacitance between the TSV and the substrate," the paper notes.

So, what to look for is in many ways straightforward. But the current state of the art in probe technology makes test challenging. Today's cantilever and vertical probes have a typical minimum pitch of 35um. However, to meet the needs of current process technologies, TSV pitch is more typically 4-5um on 0.5um spacing. The single-ended nature of TSVs also limits what is possible through built-in self-test.

Duke's proposal combines two existing test technologies: those apparently oversized probes and a variant of the on-die scan architecture used in post-bond testing. The tester surface itself then comprises many individual probe needles that contact multiple TSVs.

"In the proposed test method, a number of TSVs are shorted together through contact with a probe needle to form a network of TSVs," the paper explains. "The capacitance can be tested through an active driver in the probe needle itself, and then the resistance of each TSV can be determined by asserting each TSV on to the shorted net."

The move to thru-silicon-vias for stacked 3D systems and so-called 2.5D silicon interposer technology represents a major challenge to maintaining profitable yield. As well as the issues associated with retaining the integrity of multiple die in a single bonded product, there are also major constraints related to the space available for test entry pins and the viable geometries at which existing testers can provide acceptable results.

The two papers reviewed in this article were presented in a dedicated 3D test session at the 2011 International Test Conference in Anaheim by researchers from Duke University and a team combining talent from Cascade Microtech and the IMEC research institute. They look at opportunities to use variants or extensions of existing technologies to control test time and cost while meeting those demands on yield.

The third paper from this session, “Post-bond Testing of 2.5D-SICs and 3D-SICs containing a passive silicon interposer base” from IMEC, National Tsing-Hua University and TSMC, is described in the extended online version of this article at www.techdesignforum.com.

Special report, Tech Design Forum

Pre-bond test for 3D ICs at ITC 2011

TECH FORUM: [TESTED COMPONENT TO SYSTEM]

The papers featured in this article and the extended version online are available in their complete and original form as part of the full ITC Test Week conference proceedings by downloading the order form at http://www.itctestweek.org/papers/publication-sales



Post-bond foundation
The methodology begins by, as noted, building on techniques used in post-bond test, albeit making the "advanced" assumption that a currently proposed 1500-style die wrapper for scan-based TSV test is commercially available.

Here, in place of a standard scan flop, Duke uses a gated one (Figures 1-3).

"As seen at the block level in Figure 1, the gated scan flop accepts either a functional input or a test input from the scan chain; the selection is made depending on operational mode. A new signal, namely the 'open signal,' is added; it determines whether the output Q floats or takes the value stored in the flip-flop," the paper notes.

"In our design, shown at gate level in Figure 2 and at the transistor level in Figure 3, two cross-coupled inverters are used to store data. Transmission gates are inserted between the cross-coupled inverters and at the input (D) and output (Q) of the flop itself.

"The widths of the transistor in the first cross-coupled inverter stage are greater than the widths of the second cross-coupled inverter stage such that the second stage takes the value of the first stage when the buffer between them is open and they are in contention. An internal inverter buffer is added before the output transmission gate such that the gated scan flop can drive a large capacitance on its output net without altering the value held in the flop. The 'open' signal controls the final transmission gate."
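A purely behavioral reading of that description can be captured in a few lines of C++. The polarity of the 'open' signal, the names and the use of std::nullopt to stand in for a floating output are all assumptions for illustration; the real design is the transistor-level circuit of Figure 3.

#include <cstdio>
#include <optional>

struct GatedScanFlop {
  bool stored = false;

  // Clock edge: capture either the functional input or the scan-chain test input.
  void clock(bool functional_d, bool test_d, bool scan_mode) {
    stored = scan_mode ? test_d : functional_d;
  }

  // Output stage: when the flop is not selected, Q floats (modeled as nullopt)
  // instead of driving the shared TSV net.
  std::optional<bool> q(bool open) const {
    return open ? std::optional<bool>(stored) : std::nullopt;
  }
};

int main() {
  GatedScanFlop f;
  f.clock(/*functional_d=*/true, /*test_d=*/false, /*scan_mode=*/false);
  std::printf("selected: Q=%d, deselected: Q %s\n",
              *f.q(true), f.q(false) ? "driven" : "floats");
  return 0;
}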

A centralized gate controller identifies the open gates in a TSV network and is routed through a decoder to control the various networks simultaneously. Each network has its own probe needle, so TSVs in one network can be tested in parallel with TSVs in another. Each TSV is driven by a dedicated gated scan flop.

"A limitation of a central controller is that outputs from the decoder must be routed to each TSV network," the paper acknowledges. "However, since we only need as many wires leaving the decoder as there are TSVs in the largest network, routing can be greatly simplified, especially when compared to BIST techniques."

Capacitance and resistance
Each probe requires both an active driver and a detection method to assess capacitance for each TSV network and the resistance for each via.

Duke aimed to make the circuitry here (Figure 4) as straightforward as possible. It comprises:

• a DC source with a voltage on the order of the circuit under test;

• a switch, S2, to connect or disconnect the source from a capacitor (Ccharge) of known value;

• a voltmeter that continuously monitors the voltage across the capacitor; and

• a second switch, S1, which effectively connects or disconnects the capacitor from the probe needle.

This charge sharing circuit allows for design and analysis in HSPICE.
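The arithmetic such a charge-sharing measurement relies on is conservation of charge: a capacitor of known value is pre-charged and then connected to the (initially discharged) TSV network, and the settled voltage reveals the unknown capacitance. The sketch below is a simplified reading of the circuit with invented values, not the procedure from the paper.

#include <cstdio>

int main() {
  double c_charge = 100e-15;  // known on-probe capacitor Ccharge, farads (assumed)
  double v1       = 1.0;      // pre-charge voltage from the DC source, volts
  double v_final  = 0.62;     // voltage read by the voltmeter after closing S1 (assumed)

  // Charge is conserved: Ccharge * V1 = (Ccharge + Cnet) * Vfinal
  double c_net = c_charge * (v1 - v_final) / v_final;
  std::printf("estimated TSV network capacitance: %.1f fF\n", c_net * 1e15);
  return 0;
}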

There is a risk of errors attributable to leakage in this circuit design, but these can be reduced by the use of an AC capacitance scheme.


Figure 1: Block-level design of a gated scan flop. Source: Duke University/ITC 2011

Figure 2: Gate-level design of a gated scan flop. Source: Duke University/ITC 2011


Another factor here is that digital tes-ters typically do not offer off-the-shelf functionality to measure capacitance. Users of this methodology, therefore, have two options.

They can use analog/mixed-signal testers that do have such options. Or they need to have capacitance sensors added to digital ones.

One hope is that commercial test equipment suppliers who note this work may make the necessary additions to digital products anyway.

Probe card configuration

TSVs are inherently delicate, so the probe card needs to be configured in such a way that contacts are minimized. Duke proposes that the offset configuration seen in Figure 5 is such that the card only needs to be shifted up or down once for each matrix to be probed.

However, this is close to an ideal situation. There may be instances that require the addition of dedicated needles to supply “critical” signals (e.g., power supply, clocks). Also, it may be the case that a TSV or network could be contacted more than once, but unnecessarily on the second pass. In this case, additional control signals “can be included in the controller to close the gates for all TSVs tested in the first test period during the second test period, and vice versa.”
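
That two-pass gating policy amounts to a simple bookkeeping rule, sketched here purely for illustration (the sets and names are invented, not taken from the paper): any TSV contacted in both probe-card positions is enabled in only one of the two test periods.

```python
# Illustrative sketch of the two-pass gating rule described above.

def gates_for_period(tsvs_pass1, tsvs_pass2, period):
    """Return the set of TSVs whose gates are open in the given test period.

    TSVs contacted in both passes are assigned to pass 1 only, so they are
    never driven or re-tested on the second touchdown.
    """
    overlap = tsvs_pass1 & tsvs_pass2
    if period == 1:
        return tsvs_pass1
    return tsvs_pass2 - overlap

if __name__ == "__main__":
    pass1 = {"tsv0", "tsv1", "tsv2"}
    pass2 = {"tsv2", "tsv3", "tsv4"}   # tsv2 is contacted again on the second pass
    print("period 1 gates open:", sorted(gates_for_period(pass1, pass2, 1)))
    print("period 2 gates open:", sorted(gates_for_period(pass1, pass2, 2)))
```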

Duke presented HSPICE simulations, viewable in the final paper, to demonstrate the potential of its proposal not just for core resistance and capacitance measurements but also for stuck-at and leakage test. The simulations also indicated that the method is both reliable and accurate in the presence of process variations and multiple defective TSVs.

The hope now is that, by pointing to a way in which existing technology can be used for pre-bond test despite apparent physical limitations, the methodology will help to control escalating test cost and attract more interest from commercial vendors.

“Pre-bond probing of TSVs in 3D stacked ICs”, Brandon Noia and Krishnendu Chakrabarty, Duke University, Proc. ITC 2011, Anaheim, Paper 17.1.

MEMS-based probing

Another strategy targeting the challenges of pre-bond test was outlined at ITC 2011 by a team from Cascade Microtech and Belgian research institute IMEC.

Its paper put forward a lithographically fabricated MEMS probe card that can be manufactured on current technology and can work on 40µm pitch arrays, with the likelihood of scaling to still smaller dimensions. The card is also claimed to offer low probing force and a lower cost per pin than conventional probes.

Initial mechanical and electrical results “demonstrate the feasibility of probing large arrays at 1g-force per tip with very low pad damage, so as not to impair downstream bonding or other processing steps.”

Figure 3: Transistor-level design of a gated scan flop. Source: Duke University/ITC 2011

Figure 4: A charge sharing circuit. Source: Duke University/ITC 2011

Eliminating pre-bond probes


The microbumps that sit on the non-bottom dice in a 3D IC stack are generally considered too small for conventional probes, so designers have to add dedicated probe pads for pre-bond test. As well as taking up chip real estate and implementation time, these extra pads can leave a design more prone to parasitics. Also, the limited space available for the dedicated features often makes them so small that communication off chip is slow, significantly extending test time and cost.

So, how might those probe pads be eliminated so that pre-bond test can be conducted directly through the microbumps?

Traditional cantilever and vertical probe cards contain an array of individual beams or needles that provide an electrical path and a compliant element. The compliant element ensures that the contact forces between each tip and each associated pad on a device-under-test (DUT) are in an appropriate range.

“The useable elastic strain of metals is on the order of 0.1%, so these beams/needles need to be long compared to the amount of tip deflection,” notes the paper. “In contrast, the probing technology explored here has two compliant elements: the tip compliance and plunger compliance.”
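
To see why that strain limit forces long needles, a back-of-the-envelope estimate using the textbook end-loaded rectangular cantilever relation (deflection ≈ 2/3 × strain × L²/thickness) is sketched below. The formula and the dimensions are illustrative assumptions, not figures from the paper, but they show how a roughly hundredfold increase in usable strain shrinks the required compliant element.

```python
# Illustrative estimate of minimum cantilever length for a given tip
# deflection and allowable strain (textbook beam theory; dimensions invented).

import math

def min_beam_length(deflection, thickness, max_strain):
    """Shortest end-loaded cantilever that achieves 'deflection' without exceeding 'max_strain'."""
    return math.sqrt(3.0 * thickness * deflection / (2.0 * max_strain))

if __name__ == "__main__":
    deflection = 50e-6   # 50 um of tip travel (illustrative)
    thickness = 50e-6    # 50 um beam thickness (illustrative)
    for label, strain in [("metal, ~0.1% strain", 0.001),
                          ("elastomer, ~10% strain", 0.1)]:
        length = min_beam_length(deflection, thickness, strain)
        print("%-22s -> minimum beam length ~ %.2f mm" % (label, length * 1e3))
```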

The concept here extends the Cascade Microtech Pyramid Probe, which is used for high volume production testing of gigahertz frequency components.

“The new technology builds on that by greatly enhancing the tip compliance, and enabling finer pitch probing,” the paper continues. “This improved tip compliance is achieved by embedding the tips in an elastomer, which can handle roughly two orders of magnitude greater elastic strain than metals. The full array of tips is mechanically coupled to a semi-rigid plunger, which is also designed to deflect relative to the probe card frame (thus providing the plunger compliance).”

Figure 6 (p. 50) shows schematics for the proposed card architectures. The enlarged view of the Pyramid probe tips shows the elastomeric springs schematically in red. Above the probe tips, the electrical path passes into a membrane, shown (lower left) in yellow, and out to the circuit board.

The proposal makes for a very small tip, but one still with sufficient compliance to handle non-uniform areas on the DUT without requiring large, potentially damaging increases in the probing force.

“The plunger and plunger spring use a combination of elastomeric and metal springs to accommodate imperfect planarization and warpage or other distortions over larger dimensions (i.e., greater than 5-10X the probing pitch). Together, these compliant elements assure good contact force uniformity across the probing area,” the paper adds.

Both plunger and tip compliance can also be tuned for each probe card.

In developing the system, Cascade and IMEC have run probe card tests of up to one million touchdowns.

Figure 5: Two configurations of a probe card for TSV testing. Source: Duke University/ITC 2011



They have also built prototypes that meet the JEDEC Wide I/O standard.

The technology is not immediately ready for deployment. Further work is needed to characterize the effects of probe tip forces on thin silicon layers when bonded to temporary carriers; the allowable pad damage that is compatible with stack assembly processes; and some production requirements such as probe life-testing.

Nevertheless, there is promise here.

“The assumption that contacting TSVs is impossible, highly risky, or prohibitively expensive is not valid. Contacting at TSV pitches is practical with evolutions of existing probe technology, and enables test strategies which probe some or all of the TSV pads, whether on the face or back of the wafer,” the researchers say.

“Evaluation of TSV and microbump probing for wide I/O testing”, Ken Smith, Peter Hanaway, Mike Jolley, Reed Gleason and Eric Strid, Cascade Microtech; Tom Daenen, Luc Dupas, Bruno Knuts, Erik Jan Marinissen and Marc Van Dievel, IMEC, Proc. ITC 2011, Paper 17.2.

Figure 6: Schematic diagram of probe card architectures. Source: Cascade Microtech/IMEC/ITC 2011
