Techniques for optimizing applications : high performance computing

Authors

    • Garg, Rajat P
    • Sharapov, Ilya

Bibliographic information

Techniques for optimizing applications : high performance computing

Rajat P. Garg, Ilya Sharapov

(Sun blueprints)

Sun Microsystems Press, c2002

University library holdings: 2

Description and table of contents

Description

This book is a practical guide to performance optimization of computationally intensive programs on Sun UltraSPARC platforms. It is primarily intended for developers of technical or high performance computing (HPC) applications for the Solaris(tm) operating environment. This audience includes both independent software vendor (ISV) developers and noncommercial developers. It can also be used by end-users of HPC applications to help them better understand how applications utilize system resources. The book presents information so that it follows logical stages of the process for application development and optimization. We pay special attention to issues related to parallel applications and to using appropriate performance measurement tools. Wherever applicable, sections are illustrated with code examples that show benefits of methods described. Unless otherwise noted, topics in this book are not limited to a particular programming language, parallelization method, software version, or hardware product. However, emphasis is on techniques relevant to applications written in Fortran 77, Fortran 90, and C, because these languages are most commonly used in HPC and technical applications. Most topics can be applied to C++ programs; however, we do not address performance optimization issues specific to object-oriented programming.

Table of contents

Acknowledgments. Preface. Who Should Read This Book. How This Book Is Organized. Additional Resources. Code Examples. Typographical Conventions.

I. GETTING STARTED.
1. Introduction. Performance Components. Hardware. Software. Optimization Process Overview. Serial Optimization. Parallel Optimization.
2. Overview of Sun UltraSPARC Solaris Platforms. UltraSPARC-Based Desktop and Server Product Line. UltraSPARC-Based Workstations. UltraSPARC-Based Servers. Sun Technical Compute Farm. Solaris Operating Environment. Sun WorkShop and Forte Developer Tools. HPC ClusterTools Software. Summary.
3. Application Development on Solaris. Development Basics. Standards Conformance. Binary Compatibility. Source Code Verification Tools. Checking C Programs. Checking Fortran Programs. Additional Source Code Analysis Tools. 64-bit Development and Porting. Fortran Porting. Language Interoperability. Fortran 95 and Fortran 77. C and Fortran. Linking Mixed Languages. Summary.

II. OPTIMIZING SERIAL APPLICATIONS.
4. Measuring Program Performance. Measurement Methodology. Benchmarking Guidelines. Measurement Tools. Program Timing Tools. Timing Entire Program. Timing Program Portions. Fine-Grained Timing Measurement. Program Profiling Tools. Profiling With prof and gprof. Profiling With tcov. Profiling Tools in Forte Developer 6. Process and System Monitoring Tools. /proc Tools. Process Tracing Tools. System Monitoring Tools. Hardware Counter Measurements. Monitoring Tools. Hardware Counter Overflow Profiling. Code Instrumentation With libcpc Calls. Summary.
5. Basic Compiler Optimizations. Compilation Overview. Structure of Sun Compilers. Using Sun Compilers. -fast and -xtarget Options. Basic Guidelines. -xarch. Specifying Target Architecture. Generation of Conditional Move Instructions. Creating 64-bit Binaries. -xchip. -xO Optimization Level. -xinline, -xcrossfile. -xdepend. -xvector. -xsfpconst. -xprofile=collect, use. -xprefetch. Summary.
6. Advanced Compiler Optimizations. IEEE Floating-Point Arithmetic. Binary Storage Format. Trap Handling and -ftrap. Gradual Underflow and -fns. -fsimple. -dalign. -xsafe=mem. Pointer Alias Analysis Options. -xrestrict. -xalias_level. -stackvar. Compiler Directives and Pragmas. pragma pipeloop. pragma opt. pragma prefetch. pragma pack. pragma align. Pointer Alias Analysis Pragmas. Summary.
7. Linker and Libraries in Performance Optimization. Linking Overview. Static and Dynamic Linking. Structure of an ELF Binary. Solaris Linker Usage. Linking Static and Dynamic Libraries. Weak Symbol Binding. Linker Mapfiles. Linking Optimized Math Libraries. Creating Architecture-Specific Libraries. $PLATFORM and $ISALIST Linker Tokens. $ORIGIN Token. Runtime Linker in Profiling and Debugging. Interposing Libraries. Using LD_PROFILE and LD_DEBUG. Summary.
8. Source Code Optimization. Overview of Memory Hierarchy. Memory Levels. Memory Organization of UltraSPARC-Based Systems. Memory Hierarchy Optimizations. Cache Blocking. Reducing Cache Conflicts. Reducing TLB Misses. Page-Coloring Effects. Memory Bank Interleaving. Inlining Assembly Templates. Optimal Data Alignment. Restructuring for Better Data Alignment. Double-Word Load and Store Generation. Cache Line Alignment. Preventing Register Window Overflow. Aliasing Optimizations. Aliasing in Fortran Programs. Pointer Aliasing in C Programs. Summary.
9. Loop Optimization. Loop Unrolling and Tiling. Loop Interchange. Loop Fusion. Loop Fission. Loop Peeling. Loops With Conditionals. Strength Reduction in Loops. Division Replacement. Operations on Complex and Real Operands. Summary.

III. OPTIMIZING PARALLEL APPLICATIONS.
10. Parallel Processing Models on Solaris. Parallelization Overview. Parallel Scalability Concepts. Parallel Architectural Models. Parallel Programming Models. Multithreading Models. Compiler Auto-Parallelization. OpenMP Compiler Directives. Explicit Multithreading Using P-threads. Multiprocessing Models. UNIX fork/exec Model. MPI Message-Passing Model. Hybrid Models. Summary.
11. Parallel Performance Measurement Tools. Measurement Methodology. Timing a Parallel Program and Its Portions. Parallel Performance Monitoring With Forte Developer 6 Tools. Trace Normal Form Utilities. Analyzing and Profiling MPI Programs With the Prism Environment. Parallel System Monitoring Tools. Binding a Program to a Set of Processors. Measuring Performance on a Per-CPU Basis. Monitoring Kernel Lock Statistics. Hardware Counter Tools for Parallel Performance Monitoring. cpustat and cputrack Tools. busstat Tool. Summary.
12. Optimization of Explicitly Threaded Programs. Programming Models for Multithreading. Master-Slave Model. Worker-Crew Model. Pipeline Model. Multithreading in the Solaris Operating Environment. Thread Models. Compiling Threaded Applications. True and False Data Sharing. Synchronization and Locking. Thread Stack Size. Thread Creation Issues. Pool of Threads. Pool of Threads With Spin Locks. Summary.
13. Optimization of Programs Using Compiler Parallelization. Parallelization Support in Sun Compilers. Parallelization Model. Runtime Settings. Automatic Parallelization. Explicit Parallelization. OpenMP Support in Fortran 95 Compiler. OpenMP Programming Styles. Section Parallel Style. Single Program Multiple Data (SPMD) Style. OpenMP Performance Considerations. Synchronization Issues. Data Scoping. Memory Bandwidth Requirement. OpenMP and P-threads. Parallel Sun Performance Library. Linking the Library. Runtime Issues. 64-bit Integer Arguments. Fortran SUNPERF Module. Summary.
14. Optimization of Message-Passing Programs. Programming Models and Performance Considerations. Workload Distribution. Pipeline Method. Loop Parallelization Methods. Communication Metrics. Sun MPI Implementation. Building and Running MPI Programs. Dynamic Process Management. MPI I/O. Sun MPI Environment Variables. Diagnostic Information. Dedicated and Timeshared System Execution. Optimized Collectives. Point-to-Point Communication. General Performance. Sun Scalable Scientific Subroutine Library. MPI, OpenMP, and Hybrid Approaches. MPI and OpenMP Approaches. Hybrid Approach. Summary.

IV. APPENDICES.
A. Commands That Identify System Configuration Parameters. Hardware Parameters. System Configuration. Parameters of Installed Software and Hardware. Summary of Commands.
B. Architecture of UltraSPARC Microprocessor Family. UltraSPARC I and II Processors. UltraSPARC III Processor. UltraSPARC IIi Processor. UltraSPARC IIe Processor.
C. Architecture of UltraSPARC Interconnect Family. Ultra Port Architecture Interconnect. Gigaplane Interconnect. Gigaplane XB Crossbar Interconnect. Fireplane Interconnect.
D. Hardware Counter Performance Metrics. CPU Counters. System ASIC Counters.
E. Interval Arithmetic Support in Forte Developer 6 Fortran 95 Compiler. Interval Arithmetic Basics. Solution of Nonlinear Problems.
F. Differences in I/O Performance. Reading a File with read/lseek. Reading a File with fread/fseek. Mapping a File to Memory.

References. Index.

Source: Nielsen BookData

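Details

  • NII Bibliographic ID (NCID)
    BA58860438
  • ISBN
    • 0130934763
  • Country of publication code
    us
  • Title language code
    eng
  • Text language code
    eng
  • Place of publication
    Palo Alto, Calif.
  • Pages/volumes
    xliii, 616 p
  • Size
    24 cm
  • Classification
  • Subject headings
  • Parent bibliographic ID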