Solaris application programming

Author(s)

    • Gove, Darryl

Bibliographic Information

Solaris application programming

Darryl Gove

Sun Microsystems Press, Prentice Hall, c2008

Note

Includes index

Description and Table of Contents

Description

Solaris (TM) Application Programming is a comprehensive guide to optimizing the performance of applications running in your Solaris environment. From the fundamentals of system performance to using analysis and optimization tools to their fullest, this wide-ranging resource shows developers and software architects how to get the most from Solaris systems and applications. Whether you're new to performance analysis and optimization or an experienced developer searching for the most efficient ways to solve performance issues, this practical guide gives you the background information, tips, and techniques for developing, optimizing, and debugging applications on Solaris. The text begins with a detailed overview of the components that affect system performance. This is followed by explanations of the many developer tools included with Solaris OS and the Sun Studio compiler, and then it takes you beyond the basics with practical, real-world examples. In addition, you will learn how to use the rich set of developer tools to identify performance problems, accurately interpret output from the tools, and choose the smartest, most efficient approach to correcting specific problems and achieving maximum system performance. 
Coverage includes:

    • A discussion of the chip multithreading (CMT) processors from Sun and how they change the way that developers need to think about performance
    • A detailed introduction to the performance analysis and optimization tools included with the Solaris OS and Sun Studio compiler
    • Practical examples for using the developer tools to their fullest, including informational tools, compilers, floating point optimizations, libraries and linking, performance profilers, and debuggers
    • Guidelines for interpreting tool analysis output
    • Optimization, including hardware performance counter metrics and source code optimizations
    • Techniques for improving application performance using multiple processes, or multiple threads
    • An overview of hardware and software components that affect system performance, including coverage of SPARC and x64 processors

Table of Contents

Preface

Part I: Overview of the Processor

Chapter 1: The Generic Processor
    1.1 Chapter Objectives
    1.2 The Components of a Processor
    1.3 Clock Speed
    1.4 Out-of-Order Processors
    1.5 Chip Multithreading
    1.6 Execution Pipes
    1.7 Caches
    1.8 Interacting with the System
    1.9 Virtual Memory
    1.10 Indexing and Tagging of Memory
    1.11 Instruction Set Architecture

Chapter 2: The SPARC Family
    2.1 Chapter Objectives
    2.2 The UltraSPARC Family
    2.3 The SPARC Instruction Set
    2.4 32-bit and 64-bit Code
    2.5 The UltraSPARC III Family of Processors
    2.6 UltraSPARC T1
    2.7 UltraSPARC T2
    2.8 SPARC64 VI

Chapter 3: The x64 Family of Processors
    3.1 Chapter Objectives
    3.2 The x64 Family of Processors
    3.3 The x86 Processor: CISC and RISC
    3.4 Byte Ordering
    3.5 Instruction Template
    3.6 Registers
    3.7 Instruction Set Extensions and Floating Point
    3.8 Memory Ordering

Part II: Developer Tools

Chapter 4: Informational Tools
    4.1 Chapter Objectives
    4.2 Tools That Report System Configuration
    4.3 Tools That Report Current System Status
    4.4 Process- and Processor-Specific Tools
    4.5 Information about Applications

Chapter 5: Using the Compiler
    5.1 Chapter Objectives
    5.2 Three Sets of Compiler Options
    5.3 Using -xtarget=generic on x86
    5.4 Optimization
    5.5 Generating Debug Information
    5.6 Selecting the Target Machine Type for an Application
    5.7 Code Layout Optimizations
    5.8 General Compiler Optimizations
    5.9 Pointer Aliasing in C and C++
    5.10 Other C- and C++-Specific Compiler Optimizations
    5.11 Fortran-Specific Compiler Optimizations
    5.12 Compiler Pragmas
    5.13 Using Pragmas in C for Finer Aliasing Control
    5.14 Compatibility with GCC

Chapter 6: Floating-Point Optimization
    6.1 Chapter Objectives
    6.2 Floating-Point Optimization Flags
    6.3 Floating-Point Multiply Accumulate Instructions
    6.4 Integer Math
    6.5 Floating-Point Parameter Passing with SPARC V8 Code

Chapter 7: Libraries and Linking
    7.1 Introduction
    7.2 Linking
    7.3 Libraries of Interest
    7.4 Library Calls

Chapter 8: Performance Profiling Tools
    8.1 Introduction
    8.2 The Sun Studio Performance Analyzer
    8.3 Collecting Profiles
    8.4 Compiling for the Performance Analyzer
    8.5 Viewing Profiles Using the GUI
    8.6 Caller-Callee Information
    8.7 Using the Command-Line Tool for Performance Analysis
    8.8 Interpreting Profiles
    8.9 Interpreting Profiles from UltraSPARC III/IV Processors
    8.10 Profiling Using Performance Counters
    8.11 Interpreting Call Stacks
    8.12 Generating Mapfiles
    8.13 Generating Reports on Performance Using spot
    8.14 Profiling Memory Access Patterns
    8.15 er_kernel
    8.16 Tail-Call Optimization and Debug
    8.17 Gathering Profile Information Using gprof
    8.18 Using tcov to Get Code Coverage Information
    8.19 Using dtrace to Gather Profile and Coverage Information
    8.20 Compiler Commentary

Chapter 9: Correctness and Debug
    9.1 Introduction
    9.2 Compile-Time Checking
    9.3 Runtime Checking
    9.4 Debugging Using dbx
    9.5 Locating Optimization Bugs Using ATS
    9.6 Debugging Using mdb

Part III: Optimization

Chapter 10: Performance Counter Metrics
    10.1 Chapter Objectives
    10.2 Reading the Performance Counters
    10.3 UltraSPARC III and UltraSPARC IV Performance Counters
    10.4 Performance Counters on the UltraSPARC IV and UltraSPARC IV+
    10.5 Performance Counters on the UltraSPARC T1
    10.6 UltraSPARC T2 Performance Counters
    10.7 SPARC64 VI Performance Counters
    10.8 Opteron Performance Counters

Chapter 11: Source Code Optimizations
    11.1 Overview
    11.2 Traditional Optimizations
    11.3 Data Locality, Bandwidth, and Latency
    11.4 Data Structures
    11.5 Thrashing
    11.6 Reads after Writes
    11.7 Store Queue
    11.8 If Statements
    11.9 File-Handling in 32-bit Applications

Part IV: Threading and Throughput

Chapter 12: Multicore, Multiprocess, Multithread
    12.1 Introduction
    12.2 Processes, Threads, Processors, Cores, and CMT
    12.3 Virtualization
    12.4 Horizontal and Vertical Scaling
    12.5 Parallelization
    12.6 Scaling Using Multiple Processes
    12.7 Multithreaded Applications
    12.8 Parallelizing Applications Using OpenMP
    12.9 Using OpenMP Directives to Parallelize Loops
    12.10 Using the OpenMP API
    12.11 Parallel Sections
    12.12 Automatic Parallelization of Applications
    12.13 Profiling Multithreaded Applications
    12.14 Detecting Data Races in Multithreaded Applications
    12.15 Debugging Multithreaded Code
    12.16 Parallelizing a Serial Application

Part V: Concluding Remarks

Chapter 13: Performance Analysis
    13.1 Introduction
    13.2 Algorithms and Complexity
    13.3 Tuning Serial Code
    13.4 Exploring Parallelism
    13.5 Optimizing for CMT Processors

Index

by "Nielsen BookData"
