Modeling Microprocessor Performance

The book contains a detailed discussion of the various models and the underlying assumptions based on actual design practices.

Author: Bibiche Geuskens

Publisher: Springer Science & Business Media

ISBN: 9781461555612

Category: Technology & Engineering

Page: 195

Modeling Microprocessor Performance focuses on the development of a design and evaluation tool, named RIPE (Rensselaer Interconnect Performance Estimator). This tool analyzes the impact on wireability, clock frequency, power dissipation, and the reliability of single chip CMOS microprocessors as a function of interconnect, device, circuit, design and architectural parameters. It can accurately predict the overall performance of existing microprocessor systems. For the three major microprocessor architectures, DEC, PowerPC and Intel, the results have shown agreement within 10% on key parameters. The models cover a broad range of issues that relate to the implementation and performance of single chip CMOS microprocessors. The book contains a detailed discussion of the various models and the underlying assumptions based on actual design practices. As such, RIPE and its models provide an insightful tool into single chip microprocessor design and its performance aspects. At the same time, it provides design and process engineers with the capability to model, evaluate, compare and optimize single chip microprocessor systems using advanced technology and design techniques at an early design stage without costly and time consuming implementation. RIPE and its models demonstrate the factors which must be considered when estimating tradeoffs in device and interconnect technology and architecture design on microprocessor performance.

Modeling Microprocessor Performance

The book contains a detailed discussion of the various models and the underlying assumptions based on actual design practices.

Author: Bibiche Geuskens

Publisher:

ISBN: 1461555620

Category:

Page: 216

Analytical Modeling of Modern Microprocessor Performance

As the number of transistors integrated on a chip continues to increase, a growing challenge is accurately modeling performance in the early stages of processor design.

Author:

Publisher:

ISBN: OCLC:680293029

Category:

Page:

As the number of transistors integrated on a chip continues to increase, a growing challenge is accurately modeling performance in the early stages of processor design. Analytical modeling is an alternative to detailed simulation with the potential to shorten the development cycle and provide additional insight. This thesis proposes hybrid analytical models to predict the impact of pending cache hits, hardware prefetching, and realistic miss status holding register (MSHR) resources on superscalar performance. We propose techniques to model the non-negligible influences of pending hits and the fine-grained selection of instruction profile window blocks on the accuracy of hybrid analytical models. We also present techniques to estimate the performance impact of data prefetching by modeling the timeliness of prefetches and to account for a limited number of MSHRs by restricting the size of profile window blocks. As with earlier hybrid analytical models, our approach is roughly two orders of magnitude faster than detailed simulations. Overall, our techniques reduce the error of our baseline from 39.7% to 10.3% when the number of MSHRs is unlimited. When modeling a processor with data prefetching, a limited number of MSHRs, or both, our techniques result in an average error of 13.8%, 9.5% and 17.8%, respectively. Moreover, this thesis proposes analytical models for predicting the cache contention and throughput of heavily fine-grained multithreaded architectures such as Sun Microsystems' Niagara. We first propose a novel probabilistic model using statistics characterizing individual threads run in isolation as inputs to accurately predict the number of extra cache misses due to cache contention among a large number of threads. We then present a Markov chain model for analytically estimating the throughput of multicore, fine-grained multithreaded architectures. 
Combined, the two models accurately predict system throughput obtained from a detailed simulator with an average.

Single and Multi CPU Performance Modeling for Embedded Systems

This thesis has attacked the problem of modeling microprocessor performance in embedded systems at different levels of abstraction, and through annotating timing information from cycle-level models back to the original application ...

Author: Trevor Conrad Meyerowitz

Publisher:

ISBN: UCAL:C3544556

Category:

Page: 362

Embedded Computer Systems Architectures Modeling and Simulation

An analytical performance model for out of order issue superscalar micro-processors is presented. This model quantifies the performance impacts of micro-architecture design options including memory hierarchy, branch prediction, ...

Author: Timo D. Hämäläinen

Publisher: Springer Science & Business Media

ISBN: 9783540269694

Category: Computers

Page: 476

This book constitutes the refereed proceedings of the 5th International Workshop on Systems, Architectures, Modeling, and Simulation, SAMOS 2005, held in Samos, Greece in July 2005. The 49 revised full papers presented were thoroughly reviewed and selected from 114 submissions. The papers are organized in topical sections on reconfigurable system design and implementations, processor architectures, design and simulation, architectures and implementations, system level design, and modeling and simulation.

Digital Systems and Applications

4.2.3.2 Analytical Modeling: Analytical performance models, while not popular for microprocessors, are suitable for evaluation of large computer systems. In large systems, where details cannot be modeled accurately for cycle accurate ...

Author: Vojin G. Oklobdzija

Publisher: CRC Press

ISBN: 9781351838108

Category: Computers

Page: 992

New design architectures in computer systems have surpassed industry expectations. Limits, which were once thought of as fundamental, have now been broken. Digital Systems and Applications details these innovations in systems design as well as cutting-edge applications that are emerging to take advantage of the field's increasingly sophisticated capabilities. This book features new chapters on parallelizing iterative heuristics, stream and wireless processors, and lightweight embedded systems. This fundamental text: provides a clear focus on computer systems, architecture, and applications; takes a top-level view of system organization before moving on to architectural and organizational concepts such as superscalar and vector processors, VLIW architecture, and new trends in multithreading and multiprocessing; includes an entire section dedicated to embedded systems and their applications; discusses topics such as digital signal processing applications, circuit implementation aspects, parallel I/O algorithms, and operating systems; concludes with a look at new and future directions in computing; features articles that describe diverse aspects of computer usage and potentials for use; details implementation and performance-enhancing techniques such as branch prediction, register renaming, and virtual memory; and includes a section on new directions in computing and their penetration into many new fields and aspects of our daily lives.

Computer Engineering and Technology

Noonburg, D.B., Shen, J.P.: A framework for statistical modeling of superscalar processor performance. In: Third International Symposium on High-Performance Computer Architecture, pp. 298–309. IEEE (1997)

Author: Weixia Xu

Publisher: Springer

ISBN: 9789811031595

Category: Computers

Page: 232

This book constitutes the refereed proceedings of the 20th CCF Conference on Computer Engineering and Technology, NCCET 2016, held in Xi'an, China, in August 2016. The 21 full papers presented were carefully reviewed and selected from 120 submissions. They are organized in topical sections on processor architecture; application specific processors; computer application and software optimization; technology on the horizon.

Advances in Computers

SimpleScalar: An infrastructure for computer system modeling. ... In Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 25–36. ... Calibration of microprocessor performance models.

Author: Marvin Zelkowitz

Publisher: Academic Press

ISBN: 9780080880303

Category: Computers

Page: 368

This is volume 72 of Advances in Computers, a series that began back in 1960 and is the oldest continuing series chronicling the ever-changing landscape of information technology. Each year three volumes are produced, which present approximately 20 chapters that describe the latest technology in the use of computers today. In this volume 72, we present the current status in the development of a new generation of high-performance computers. The computer today has become ubiquitous with millions of machines being sold (and discarded) annually. Powerful machines are produced for only a few hundred U.S. dollars, and one of the problems faced by vendors of these machines is that, due to the continuing adherence to Moore’s law, where the speed of such machines doubles about every 18 months, we typically have more than enough computer power for our needs for word processing, surfing the web, or playing video games. However, the same cannot be said for applications that require large powerful machines. Applications such as weather and climate prediction, fluid flow for designing new airplanes or automobiles, or nuclear plasma flow require as much computer power as we can provide, and even that is not enough. Today’s machines operate at the teraflop level (trillions of floating point operations per second) and this book describes research into the petaflop region (10^15 FLOPS). The six chapters provide an overview of current activities that will provide for the introduction of these machines in the years 2011 through 2015.

Multi Microprocessor Systems for Real Time Applications

[AJMO81] used an asynchronous model to analyze the performance of a single bus multiprocessor system with a single common memory module. This analysis was extended to multiple bus and multiple common memory systems by Ajmone Marsan and ...

Author: Gianni Conte

Publisher: Springer Science & Business Media

ISBN: 9789400954083

Category: Technology & Engineering

Page: 299

The continuous development of computer technology supported by the VLSI revolution stimulated the research in the field of multiprocessor systems. The main motivation for the migration of design efforts from conventional architectures towards multiprocessor ones is the possibility to obtain a significant processing power together with the improvement of price/performance, reliability and flexibility figures. Currently, such systems are moving from research laboratories to real field applications. Future technological advances and new generations of components are likely to further enhance this trend. This book is intended to provide basic concepts and design methodologies for engineers and researchers involved in the development of multiprocessor systems and/or of applications based on multiprocessor architectures. In addition the book can be a source of material for computer architecture courses at graduate level. A preliminary knowledge of computer architecture and logical design has been assumed in writing this book. Not all the problems related with the development of multiprocessor systems are addressed in this book. The covered range spans from the electrical and logical design problems, to architectural issues, to design methodologies for system software. Subjects such as software development in a multiprocessor environment or loosely coupled multiprocessor systems are out of the scope of the book. Since the basic elements, processors and memories, are now available as standard integrated circuits, the key design problem is how to put them together in an efficient and reliable way.

Improving the Capacity of U S Climate Modeling for Decision makers and End users

Until the mid-1980s, high performance computing was defined by custom designed vector processors, those designed by the legendary Seymour Cray. The ubiquitous PC changed that, creating a new high performance computing model ...

Author: United States. Congress. Senate. Committee on Commerce, Science, and Transportation

Publisher:

ISBN: MINN:31951D03528067K

Category: Climatic changes

Page: 73

Interlayer Thermal Management of High Performance Microprocessor Chip Stacks

4 Experimental Results and Validation of Modeling Framework. 4.1 Uniform Single Cavity Experiment: Unit-Cell Shape Efficiency ... 1 Vertical Integration of High-Performance Processor-Memory Stacks: Motivation & Conception.

Author: Thomas Brunschwiler

Publisher: Cuvillier Verlag

ISBN: 9783736940345

Category: Technology & Engineering

Page: 172

Vertical integration of integrated circuit dies offers tremendous opportunities from an architectural as well as from an economical standpoint. Memory proximity supports performance scaling, and might enable significant energy savings. Partitioning of the corresponding functionalities and technologies into individual tiers can improve yield and modularity substantially. The paradigm change of stacking active components has a direct impact on heat-removal concepts and is therefore the motivation of this thesis. A stack comprised of a single logic layer in combination with multiple memory dies was identified as the limit for traditional back-side heat removal. To minimize junction temperatures, a stacking sequence with the high heat-flux component in close proximity to the cold plate is proposed. Interlayer cooling is the only volumetric heat-removal solution that scales with the number of dies in the stack. Hence, the focus of this thesis has been to identify the potential of interlayer cooling and to provide a modeling framework. Fundamental heat-transfer building blocks, such as unit-cell geometries, fluid structure modulation, fluid focusing, as well as four-port fluid delivery supporting power-map-aware heat removal, are discussed. Moreover, the theoretical foundation was experimentally validated on resistively heated convective test cavities. Therefore, specific bonding and insulation schemes were developed. Finally, the interlayer cooling performance was demonstrated on a pyramid chip stack. A multi-scale modeling approach for the efficient design of non-uniform heat-removal cavities was proposed. Periodic arrangements of heat-removal unit-cells in the cavities are described by the porous-media approximation. Their characteristics are represented by the directional and velocity-dependent modified permeability and convective thermal resistance. An extended tensor description was developed to map the pressure gradient to the Darcy velocity.
These parameters were derived from detailed numerical heat and mass transport modeling for arbitrary angle-of-attack of the fluid, using a set of novel routines that support periodic hydrodynamic and thermal boundary conditions. For pin-fin arrays, a biased fluid flow towards directions with maximal permeability could be observed. Field coupling between the two-dimensional porous and adjacent three-dimensional solid domains was performed to derive the temperature field in the chip stack, including heat spreading in the silicon die. The modeling results are conservative and deviate by less than 20% from the measured junction temperatures, when considering the temperature dependency of the coolant viscosity. This is a very good value considering the immense complexity reduction, resulting in a low computational time of less than 20 min on a desktop computer, to derive the mass transport and junction temperatures within a chip stack. Sputtered AuSn 80/20 was investigated as eutectic thin-film bond to form leak-tight interfaces with mechanical, electrical, and thermal functionality, as part of the technology development, to enable the use of water as coolant. The resulting bond quality was characterized for various underbump metallizations, atmospheres, and reflow/force profiles. The implementation of a differentially pumped chamber allowed the use of formic acid in the flip chip bonder to reduce the tin oxide on the solder surface. The transient liquid-solid nature of the thin-film solder process explains the sensitivity to the underbump metallization and the heat ramp. Finally, processing guidelines supporting the design of leak-tight bond interfaces were summarized. Acceptable intermetallic compound formation was achieved at heat ramps of 100 K/min and with chromium as wetting layer. A bondline thickness of 4 μm and a Teflon support provided sufficient compliance to form successful bonds considering the wedge errors of the flip chip bonder.
Waterproof, two-level metallizations to mimic processor-like, non-uniform power maps with background and hot-spot heaters were developed for the implementation of single- and multi-cavity test sections. Pin-hole-free dielectric layers (1 μm PECVD Si3N4 / 100 nm ALD Al2O3) were achieved by conformal thin-film deposition. Numerous heat transfer assessments yielded the following insights: The limited heat capacity and flow rate of the coolant were identified as the major contributors to the thermal gradient in convective interlayer heat removal, even when using water as coolant. This is due to the small hydraulic diameter defined by the interconnect density (pitches 200 μm) and the length of the cross-flow heat exchange cavity ( 10 mm). The circular pin-fin in-line unit-cell was identified as the optimal heat transfer geometry for heat capacity limited cross-flow heat transfer. It results in the highest porosity, beneficial for efficient mass transport, compared with microchannels and other pin shapes at a given minimal radius constraint. Improved convective heat transfer towards the outlet of the cavities caused by transient vortex shedding was observed at increased Reynolds numbers ( 100) in the pin-fin in-line case. Fluid cavities with four-port fluid delivery and heat removal geometry modulation need to be considered for chip stacks larger than 2 cm² and an interconnect pitch of  50 μm. Their effectiveness was demonstrated with cavities that were either partially fully or half populated with pin-fin arrays. These arrangements result in a significant increase in local fluid flow compared with uniform heat transfer cavities. Microchannels have proved to dissipate heat efficiently to multiple fluid cavities in the chip stack because of the improved die-to-die coupling, caused by the 50% fin fill factor. This is advantageous for disparate tier stacking. The high-power die can benefit from heat dissipation into cavities adjacent to low-power tiers.
Additional recommendations, critical for electro-thermal co-design, are also discussed: i) Heat spreading in the silicon helps to mitigate hot-spots below a critical spatial dimension of 1 mm. ii) High heat flux macros should be placed towards the fluid inlet and die corners if the two- or four-port configuration is implemented, respectively. iii) A manifold width of 1 mm should be considered to achieve a fluid maldistribution below 1% between the fluid cavities. iv) A 1.6 ms thermal time constant was derived for an interlayer cooled chip stack. Hence, predictive cooling-loop control schemes need to be implemented to account for the comparably high pump time constant. Finally, for the first time, the superiority of interlayer cooling as a volumetric heat-removal method could be experimentally demonstrated on the pyramid chip stack test vehicle with four fluid cavities and three power dissipating tiers. Aligned hot-spots were included with 250 W/cm² heat flux each. A total power of 390 W, corresponding to a 3.9 kW/cm³ volumetric heat flow, could be dissipated on the 1 cm² device at a 54.7 K junction temperature increase. In comparison, back-side cooling would result in a junction temperature increase of 223 K with respect to the fluid inlet temperature of the microchannel cold plate. Using the results of the present work, it is now possible to design and predict mass and heat transport in an interlayer cooled chip stack, with the support of the proposed best-practice design rules in combination with the validated multi-scale modeling framework. The scalable nature of interlayer cooling will enable “Extreme-3D-Integration” with computation in sugar cube form factor chip stacks, extending integration density and efficiency scaling beyond the “End-of-2D-Scaling”.
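The headline figures in the abstract above can be cross-checked with simple arithmetic. The sketch below is not taken from the thesis; it only reuses the quoted numbers (390 W, 1 cm² footprint, 3.9 kW/cm³, 54.7 K, 223 K). The function and variable names are illustrative, and it assumes the volumetric figure is total power divided by footprint times stack height.

```python
# Figures quoted in the abstract; all names below are illustrative.
TOTAL_POWER_W = 390.0          # total dissipated power
FOOTPRINT_CM2 = 1.0            # device footprint
VOLUMETRIC_W_PER_CM3 = 3.9e3   # 3.9 kW/cm^3 volumetric heat flow
DT_INTERLAYER_K = 54.7         # junction temperature rise, interlayer cooling
DT_BACKSIDE_K = 223.0          # junction temperature rise, back-side cooling


def implied_stack_height_cm(power_w, area_cm2, density_w_per_cm3):
    """Height h for which power / (area * h) equals the volumetric density.

    Assumes the volumetric figure is defined as power over (area * height).
    """
    return power_w / (area_cm2 * density_w_per_cm3)


height_cm = implied_stack_height_cm(
    TOTAL_POWER_W, FOOTPRINT_CM2, VOLUMETRIC_W_PER_CM3
)
areal_flux_w_per_cm2 = TOTAL_POWER_W / FOOTPRINT_CM2
temperature_rise_ratio = DT_BACKSIDE_K / DT_INTERLAYER_K

print(f"implied stack height: {height_cm * 10:.2f} mm")
print(f"areal heat flux: {areal_flux_w_per_cm2:.0f} W/cm^2")
print(f"back-side vs interlayer temperature rise: {temperature_rise_ratio:.2f}x")
```

Under that assumption the quoted numbers imply a stack height of about 1 mm, and the back-side temperature rise is roughly four times the interlayer one.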

The Computer Engineering Handbook

Performance modeling can be done using simulation models or analytical models. 4.2.3.1 Simulation: Simulation has become the de facto performance modeling method in the evaluation of microprocessor architectures for several reasons.

Author: Vojin G. Oklobdzija

Publisher: CRC Press

ISBN: 9781439833162

Category: Computers

Page: 1648

After nearly six years as the field's leading reference, the second edition of this award-winning handbook reemerges with completely updated content and a brand new format. The Computer Engineering Handbook, Second Edition is now offered as a set of two carefully focused books that together encompass all aspects of the field. In addition to complete updates throughout the book to reflect the latest issues in low-power design, embedded processors, and new standards, this edition includes a new section on computer memory and storage as well as several new chapters on such topics as semiconductor memory circuits, stream and wireless processors, and nonvolatile memory technologies and applications.

Computer Architecture Performance Evaluation Methods

A mechanistic performance model for superscalar out-of-order processors. ACM Transactions on Computer Systems (TOCS), 27(2), May 2009. 38, 44, 45 [65] S. Eyerman, James E. Smith, and L. Eeckhout. Characterizing the branch misprediction ...

Author: Lieven Eeckhout

Publisher: Morgan & Claypool Publishers

ISBN: 9781608454686

Category: Computers

Page: 145

Performance evaluation is at the foundation of computer architecture research and development. Contemporary microprocessors are so complex that architects cannot design systems based on intuition and simple models only. Adequate performance evaluation methods are absolutely crucial to steer the research and development process in the right direction. However, rigorous performance evaluation is non-trivial as there are multiple aspects to performance evaluation, such as picking workloads, selecting an appropriate modeling or simulation approach, running the model and interpreting the results using meaningful metrics. Each of these aspects is equally important and a performance evaluation method that lacks rigor in any of these crucial aspects may lead to inaccurate performance data and may drive research and development in a wrong direction. The goal of this book is to present an overview of the current state-of-the-art in computer architecture performance evaluation, with a special emphasis on methods for exploring processor architectures. The book focuses on fundamental concepts and ideas for obtaining accurate performance data. The book covers various topics in performance evaluation, ranging from performance metrics, to workload selection, to various modeling approaches including mechanistic and empirical modeling. And because simulation is by far the most prevalent modeling technique, more than half the book's content is devoted to simulation. The book provides an overview of the simulation techniques in the computer designer's toolbox, followed by various simulation acceleration techniques including sampled simulation, statistical simulation, parallel simulation and hardware-accelerated simulation. Table of Contents: Introduction / Performance Metrics / Workload Design / Analytical Performance Modeling / Simulation / Sampled Simulation / Statistical Simulation / Parallel Simulation and Hardware Acceleration / Concluding Remarks

Computer Systems Architectures Modeling and Simulation

Media processing has motivated strong changes in the focus and design of processors. The inclusion of μSIMD multimedia extensions such as MMX is a cost effective option to improve the performance of those regions of the program with ...

Author: Andy Pimentel

Publisher: Springer

ISBN: 9783540277767

Category: Computers

Page: 566

This book constitutes the refereed proceedings of the 4th International Workshop on Systems, Architectures, Modeling, and Simulation, SAMOS 2004, held in Samos, Greece in July 2004. Besides the SAMOS 2004 proceedings, the book also presents 19 revised papers from the predecessor workshop SAMOS 2003. The 55 revised full papers presented were carefully reviewed and selected for inclusion in the book. The papers are organized in topical sections on reconfigurable computing, architectures and implementation, and systems modeling and simulation.

Power Aware Computer Systems

We describe a new power-performance modeling toolkit, developed to aid in the evaluation and definition of future power-efficient, PowerPC™ processors. The base performance models in use in this project are: (a) a fast but ...

Author: B. Falsafi

Publisher: Springer Science & Business Media

ISBN: 9783540423294

Category: Computers

Page: 151

This book constitutes the thoroughly refereed post-proceedings of the First International Workshop on Power-Aware Computer Systems, PACS 2000, held in Cambridge, MA, USA, in November 2000. The 11 revised full papers presented were carefully reviewed, selected, and revised for inclusion in the book. This book addresses power/energy-awareness at all levels of computer systems. The papers are organized in sections on power-aware microarchitectural/circuit techniques, application/compiler optimization, exploiting IPC/memory slack, and power/performance models and tools.

High Performance Computing HiPC 2002

... a unique tool for power-aware design space exploration of superscalar processors. HLSpower is based upon HLS [OCF00], a tool which used a novel blend of statistical modeling and symbolic execution to accelerate performance modeling ...

Author: International Conference on High Performance Computing (9 : 2002 : Bangalore)

Publisher: Springer Science & Business Media

ISBN: 9783540003038

Category: Computers

Page: 732

This book constitutes the refereed proceedings of the 9th International Conference on High Performance Computing, HiPC 2002, held in Bangalore, India in December 2002. The 57 revised full contributed papers and 9 invited papers presented together with various keynote abstracts were carefully reviewed and selected from 145 submissions. The papers are organized in topical sections on algorithms, architecture, systems software, networks, mobile computing and databases, applications, scientific computation, embedded systems, and biocomputing.

Microprocessor Based Parallel Architecture for Reliable Digital Signal Processing Systems

With reliable shared buses, a single bus is shared by all the processors via time-division multiplexing. However, this model is unsatisfactory in meeting the desired specifications for this system, in terms of both performance and fault ...

Author: Alan D. George

Publisher: CRC Press

ISBN: 9781351091510

Category: Computers

Page: 287

This book presents a distributed multiprocessor architecture that is faster, more versatile, and more reliable than traditional single-processor architectures. It also describes a simulation technique that provides a highly accurate means for building a prototype system in software. The system prototype is studied and analyzed using such DSP applications as digital filtering and fast Fourier transforms. The code is included as well, which allows others to build software prototypes for their own research systems. The design presented in Microprocessor-Based Parallel Architecture for Reliable Digital Signal Processing Systems introduces the concept of a dual-mode architecture that allows users a dynamic choice between either a conventional or fault-tolerant system as application requirements dictate. This volume is a "must have" for all professionals in digital signal processing, parallel and distributed computer architecture, and fault-tolerant computing.

Numerical Techniques for Global Atmospheric Models

In this regime there was little incentive to improve application performance by increasing parallelism. In the middle of the 2000s these circumstances began to change, as several fundamental factors began to limit microprocessor frequency.

Author: Peter H. Lauritzen

Publisher: Springer Science & Business Media

ISBN: 9783642116407

Category: Mathematics

Page: 564

This book surveys recent developments in numerical techniques for global atmospheric models. It is based upon a collection of lectures prepared by leading experts in the field. The chapters reveal the multitude of steps that determine the global atmospheric model design. They encompass the choice of the equation set, computational grids on the sphere, horizontal and vertical discretizations, time integration methods, filtering and diffusion mechanisms, conservation properties, tracer transport, and considerations for designing models for massively parallel computers. A reader interested in applied numerical methods but also the many facets of atmospheric modeling should find this book of particular relevance.

Transactions on High Performance Embedded Architectures and Compilers V

Moreover, the degree of abstraction influences simulation times for simulation-based approaches and hardware-effort for ... Microprocessor performance counters are utilized for system-wide power estimations by Bircher et al. in [13].

Author: Cristina Silvano

Publisher: Springer

ISBN: 9783662588345

Category: Computers

Page: 141

Transactions on HiPEAC aims at the timely dissemination of research contributions in computer architecture and compilation methods for high-performance embedded computer systems. Recognizing the convergence of embedded and general-purpose computer systems, this journal publishes original research on systems targeted at specific computing tasks as well as systems with broad application bases. The scope of the journal therefore covers all aspects of computer architecture, code generation and compiler optimization methods of interest to researchers and practitioners designing future embedded systems. This 5th issue contains extended versions of papers by the best paper award candidates of IC-SAMOS 2009 and the SAMOS 2009 Workshop, colocated events of the 9th International Symposium on Systems, Architectures, Modeling and Simulation, SAMOS 2009, held in Samos, Greece, in 2009. The 7 papers included in this volume were carefully reviewed and selected. The papers cover research on embedded processor hardware/software design and integration and present challenging research trends.