Automatic Logging of Operating System Effects to Guide Application-Level Architecture Simulation

Satish Narayanasamy †, Cristiano Pereira †, Harish Patil ‡, Robert Cohn ‡, and Brad Calder †
† Computer Science and Engineering, University of California, San Diego
‡ Intel Corporation

Abstract

Modern architecture research relies heavily on application-level detailed pipeline simulation. A time consuming part of building a simulator is correctly emulating the operating system effects, which is required even if the goal is to simulate just the application code, in order to achieve functional correctness of the application’s execution. Existing application-level simulators require hand-coding the emulation of each and every possible system effect (e.g., system call, interrupt, DMA transfer) that can impact the application’s execution. Developing such an emulator for a given operating system is a tedious exercise, and it can also be costly to maintain it to support newer versions of that operating system. Furthermore, porting the emulator to a completely different operating system might involve building it from scratch.

In this paper, we describe a tool that can automatically log operating system effects to guide architecture simulation of application code. The benefits of our approach are: (a) we do not have to build or maintain any infrastructure for emulating the operating system effects, (b) we can support simulation of more complex applications on our application-level simulator, including those applications that use asynchronous interrupts, DMA transfers, etc., and (c) using the system effects logs collected by our tool, we can deterministically re-execute the application to guide architecture simulation that has reproducible results.

Categories and Subject Descriptors: I.6.7 [Simulation and Modeling]: Simulation Support Systems

General Terms: Experimentation, Measurement and Performance

Keywords: Architecture Simulation, Emulating System Calls, and Checkpoints

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMetrics/Performance’06, June 26–30, 2006, Saint Malo, France. Copyright 2006 ACM 1-59593-320-4/06/0006 ... $5.00.

1. INTRODUCTION

Modern computer architecture research relies heavily on cycle-accurate simulation to help evaluate new architectural features. Our focus is on building and maintaining application-level simulators. These are simulators that perform cycle-level simulation of the application code and system libraries, but do not simulate what happens while handling an operating system call or interrupt. A time consuming part of building such a simulator is correctly emulating the system effects executed as part of the workload under study. For example, the traditional solution [4, 19, 17] to emulate system calls for these simulators is by gathering the required input values from simulated registers and memory state and using them to invoke the call natively. In addition, most of these simulators do not support system effects such as DMA transfers or asynchronous interrupts because of their emulation complexity.

Emulating operating system effects, even just the system calls, can be a tedious exercise. For system calls, the programmer has to be aware of the input and output semantics of every call that needs to be emulated.
Apart from having to handle the complexity of an emulator, porting the simulator to run on a different operating system is labor intensive. Even maintaining a simulator with system emulation can be quite expensive, since the emulator can break when the simulator is run on newer versions of the same operating system. Problems arise when there are changes to the operating system interface used by the application being simulated, since this can also require changes to the emulation system. In addition to all these problems, a good number of system effects are non-deterministic in nature, and as a result emulating them using native system calls during simulation can cause small variations across different simulations of the same program with the same input. Hence, simulation results may not be completely reproducible.

In this paper, we present a tool that can automatically capture the side effects of all the operating system interactions to support application-level simulation. We call our tool pinSEL (Pin-System Effect Logger), which is built using the Pin [9] instrumentation tool. We capture system effects by executing an instrumented version of the binary natively on the operating system for which the workload binary was compiled. The instrumented code creates the System Effect Log (SEL) when executed. For each system call executed, the log contains the changes to the register state effected by the system call. The log also contains the values of memory locations accessed by load instructions executed after the system call, if those memory locations were modified by the system call. Our algorithm to identify the registers and memory locations modified by a system call is independent of the semantics of the system call and hence it is easy to implement and is portable across operating systems.
The SEL also contains memory values modified by other system interactions such as asynchronous interrupts or DMA, if those modified memory values are accessed by the program being simulated. Thus the SEL enables deterministic simulation of a program-input pair across system calls, interrupts and DMA transfers. Deterministic simulation is important to accurately compare different alternatives during design space exploration. In addition, pinSEL can also support simulation of multi-threaded applications on uniprocessor systems, which we discuss in Section 5.6.

Using pinSEL, an application-level simulator can avoid the emulation of system effects and the associated complexity. As a result, we can easily simulate real applications from standard operating systems. For example, SimpleScalar [4], which has been widely used for over a decade, emulates just enough system calls to support the simulation of SPEC and similar applications, and cannot support simulation of many real world programs. Using our approach, we can now simulate real world Linux applications on our x86 version of SimpleScalar [4] without having to emulate any system calls or complex interactions with asynchronous interrupts or DMA transfers. At Intel, engineers were successful in using pinSEL to quickly and easily support application-level architecture simulation of Mac OS and Windows applications. Without pinSEL it would not have been practical to port pinLIT to these operating systems for architecture simulation. PinLIT (Pin-Long Instruction Trace) is a tool used at Intel to gather checkpoints to support architecture simulation.

Using our approach, we can simulate only the execution of application code and the user level libraries. This is useful for studying applications like desktop and scientific programs, which spend a significant amount of execution time in the user level code.
Even interactive applications like acroread and powerpoint spend 80% and 76% of execution time, respectively, in application code and user level libraries [1], which will be captured by our approach. However, our approach has limitations in that it cannot be used to study applications that are heavily dependent on system interaction (e.g., I/O bound applications like TPC-C [6], web servers like DSS that spend a significant amount of execution time in the kernel code).

This paper makes the following contributions:

• We examine how current popular simulators handle system effects through emulation. The discussion includes Intel’s pinLIT and our x86 version of SimpleScalar [4].

• We present a technique to automatically log system effects for user-level architecture simulation. The benefits of our automated logging are (a) we do not have to build or support any infrastructure for emulating system effects, (b) our approach allows us to log and simulate more complex programs, including those that use asynchronous interrupts and DMA transfers, and (c) the system effect logs provide deterministic simulation across all kinds of system effects.

• We describe our tool called pinSEL (System Effect Logger) that is built using the Pin [9] instrumentation infrastructure. A user level architectural simulator built with pinSEL support can simulate programs that have complex interactions with the operating system and is also portable across different operating systems.

The rest of this paper is organized as follows. Section 2 discusses prior work. Section 3 explains the application-level simulation technique used in a couple of widely used simulators. The complexity of manually emulating system calls in those simulators is discussed in Section 4. Section 5 presents our solution to automatically log system effects. Section 6 analyzes the time and space logging overhead. Section 7 summarizes.

2. PRIOR WORK

This section discusses existing solutions to handle system effects.

2.1 Handling System Effects for Application-Level Simulation

Many popularly used cycle accurate simulators [4, 19, 17] simulate just the user code and this is sufficient for studying many micro-architecture level optimizations and design choices using workloads like SPEC. However, even though their goal is to simulate only the user code, they still have to emulate the system calls to obtain correct execution of the program. The conventional solution to emulate system calls is to decode the system call and obtain the arguments. Then using those arguments the simulator invokes an equivalent system call that can be executed natively on the host machine on which the simulator is executing. The result values obtained from this native execution are then used to modify appropriate simulated registers and memory locations. The output of the system call can be stored in a trace (e.g., EIO trace in SimpleScalar) so that future simulations can use those traces instead of emulating the system call again. Using system call traces like EIO traces ensures deterministic simulation, and we describe this approach in more detail in Section 4.

The above approach is not desirable for a number of reasons. First, the programmer writing the emulator needs to explicitly handle each system call to find the registers and memory locations that contain the input/output operands. This code is then only valid for a given operating system. To use the simulator on multiple operating systems would require the emulation of the simulated system calls for each of these systems. Even maintaining the simulator to run on the same operating system requires changes over time to support newer versions of the operating system. Similarly, if the user desires to run a workload compiled for different versions of an operating system, the emulation may need to change if the operating system interface has changed.
To top it all, complex system interactions due to asynchronous interrupts and DMA transfers cannot be handled easily with this form of emulation, which is required to correctly execute real world desktop applications like acroread and powerpoint.

In this paper we describe a simple binary instrumentation solution to capture the effects of all types of system interactions without having to explicitly emulate each system call or interrupt. Since our solution is independent of the operating system, it is very easy to provide simulator support to execute binaries compiled for various operating systems, as well as to allow the simulator to be compiled and executed in any operating system.

2.2 Full System Simulation

There exist full system functional simulators like Simics [10], SimOS [15] and SoftSDV [20] that can emulate the full system including the operating system and all interaction with the external devices. Therefore, one option for building performance simulators would be to execute the binary inside a functional full system simulator and use that as a front end to feed traces of instructions executed to the cycle accurate performance simulator [7, 11, 5].

However, building and maintaining full system simulators is very expensive. It requires multiple person-years of effort to develop them. Also, they need to be modified constantly to support newer systems. In addition, the execution environment required for running real applications on full system simulators can be hard to reproduce, because of dependencies on specific kernel or device driver versions, run-time license checking, elaborate installation procedures and high storage requirements. Therefore having a full system simulator in the front-end incurs higher runtime overhead during simulation.
Moreover, if the goal is to analyze the performance of just the user code then it is an unwarranted complexity to have a full system simulator as a front end.

It is highly desirable to have a way of handling all forms of system effects to correctly execute the application during simulation, but still preserve the simplicity of application-level simulators. Our solution in the paper is targeted toward achieving this goal.

2.3 Checkpoint Mechanisms

Detailed cycle accurate simulation of the full program execution is very time consuming. Sampling techniques like SimPoint [16] and SMARTS [21] are used to find representative samples of program execution. Simulating only these samples has been shown to provide accurate simulation results. The Sample Starting Image (SSI) is the state needed to accurately emulate and simulate the sample’s execution to achieve the correct output for that sample. Various checkpoint mechanisms have been proposed to capture the SSI [14, 2, 18] with minimal checkpoint size. In this section we describe those checkpoint mechanisms as they are related to the technique we use to collect our logs to capture system effects.

Szwed et al. [18] proposed SimSnap, which instruments the application’s binary with the SSI corresponding to a sample and the code necessary to restore it. Thus, during simulation, the simulated application’s binary can itself restore the SSI for the sample to be simulated. To create such a binary, they first obtain the SSI of the application’s state at the beginning of a sample by natively executing the instrumented binary of the application.

Ringenberg et al. [14] proposed an Intrinsic Checkpointing mechanism which also embeds the SSI into the binaries and lets the application restore itself during simulation. Their focus is to create one binary that restores the SSI for all of the simulation points needed to simulate the execution of that binary, for a specific input.
In doing this, they make an observation that to create the SSI for a simulation point they can take the ending memory image of the last simulation point, and just update it with all of the stores that occurred between the end of the last simulation point and the start of the current simulation point. In addition, they optimize the restoration process by choosing to restore only those locations that are read at least once inside the simulation interval. The intrinsic checkpointing approach achieves the purpose of checkpointing the SSI at the beginning of a simulation interval by having a list of memory stores that need to be executed to get the memory image up to date for the start of the new simulation interval. This saves a significant amount of space over storing the full memory image state for each simulation point. Note that the simulator using this binary with intrinsic checkpoints still needs to have support for emulating the system call and other system interactions. This is because the only thing that the intrinsic checkpoint scheme ensures is that the simulation point has the correct SSI. Thus, intrinsic checkpointing does not address the problems of handling system effects, which is the focus of our paper.

Van Biesbrouck et al. [2] also proposed an algorithm to reduce the size of the SSI. Their technique assumes the EIO trace generation mechanism used in SimpleScalar to handle system calls. In the EIO traces generated by the default SimpleScalar, the SSI is the full memory image of the application at the beginning of the simulation interval along with a trace of result values of all the system calls executed (EIO trace) within the simulation interval. Instead of having the full memory image for the SSI, they log initial memory values only for the locations that are accessed within the simulation interval. They also consider representing the same information in a different format in the form of a Load Value Sequence (LVS), which is essentially a trace of all the load instructions.
Their approach focuses on reducing the size of the SSI, and not upon providing system call logging. They still rely upon the EIO traces and system call emulation in SimpleScalar for that. Our focus is to not have to provide any system emulation for SimpleScalar, while at the same time enabling the simulation of real (non SPEC) programs on SimpleScalar.

3. BASELINE APPLICATION-LEVEL SIMULATION APPROACHES

In this section, we describe two system call logging infrastructures: pinLIT, which is used at Intel, and SimpleScalar, which is commonly used in academia.

3.1 pinLIT

An approach used at Intel for simulation is to first use SimPoint [16] to determine representative samples in a program’s execution. Then a tool called pinLIT is used to create a checkpoint for each sample. A sample’s checkpoint contains everything needed by their simulator to simulate the sample. In this section, we summarize this baseline technique used to create a sample’s checkpoint.

3.1.1 SimPoint

The first step is to choose, for a program-input pair, the execution intervals to simulate in detail. SimPoint is used to choose the samples to be simulated. Note that other methods can be used to choose the simulation samples; the selection algorithm is not the focus of this study.

The SimPoint [16] sampling approach picks a small number of samples that accurately represent the complete execution of the program. It breaks a program’s execution into intervals, and for each interval creates a code signature. It then performs clustering on the code signatures, grouping intervals with similar code signatures into phases. The notion is that intervals of execution with similar code signatures have similar architectural behavior, and this has been shown to be the case in [16, 8, 13, 22]. Therefore, only one interval from each phase needs to be simulated in order to recreate a complete picture of the program’s execution.
SimPoint then chooses a representative from each phase and performs detailed simulation on that interval. Taken together, these samples can represent the complete execution of a program. The set of chosen samples is called the simulation points, and each simulation point is an interval on the order of millions of instructions.

3.1.2 Creating Checkpoint Image

Once the simulation points are chosen, the next step is to create checkpoints for each simulation point using the pinLIT (Pin-Long Instruction Trace) tool, which is built using the Pin [9] dynamic binary instrumentation tool. The checkpoint and system call tracing mechanism used in pinLIT provides the logs used to guide simulation as described in Intel’s UserLIT [17] simulation infrastructure.

A checkpoint image for a simulation interval contains all the necessary code and data information that is required for simulating the interval that it represents. This includes a trace of all the input and output values for the system calls executed within the simulation interval.

A checkpoint image for a simulation point is created as follows. The instrumented binary is executed natively and once the execution reaches the simulation point, the processor’s architectural register state is copied to the checkpoint. In addition, pinLIT copies all the pages that contain application code and shared libraries to the checkpoint.

For the code and data pages, pinLIT tries to avoid checkpointing the entire data image of the process that exists at the beginning of the simulation point. Instead, pinLIT copies the pages lazily to the checkpoint when they are first used during the simulation interval. This approach avoids logging those data and code pages that are never accessed inside the simulation point and thus reduces the size of the checkpoint.
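
The lazy page copying described above can be illustrated with a minimal Python sketch. It is not pinLIT's implementation: the class name `LazyCheckpoint`, the `on_access` hook, the `page_table` dictionary and the 4 KiB page size are all assumptions made for illustration. The sketch only shows the idea that a page enters the checkpoint on its first access and that its location inside the image is recorded.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class LazyCheckpoint:
    """Hypothetical helper: copies a page into the image on first access."""

    def __init__(self, memory: bytes):
        self.memory = memory      # stand-in for the process memory image
        self.image = bytearray()  # checkpoint image being built
        self.page_table = {}      # page number -> offset inside the image

    def on_access(self, address: int) -> None:
        page = address // PAGE_SIZE
        if page in self.page_table:
            return  # page was already copied on an earlier access
        start = page * PAGE_SIZE
        self.page_table[page] = len(self.image)
        self.image += self.memory[start:start + PAGE_SIZE]


# Four pages of memory, but only pages 0 and 2 are ever touched.
memory = bytes(range(256)) * 64
ckpt = LazyCheckpoint(memory)
for addr in (10, 20, 8200, 4095):
    ckpt.on_access(addr)
```

In this run only two of the four pages end up in the image, which is the size reduction the text attributes to lazy copying.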
The address locations inside the checkpoint image where the code and data pages are copied to are stored in a table at a particular location in the checkpoint image. This table, which we call CheckpointPageTable, is required during simulation to restore the code and data pages.

In addition to copying pages accessed by the program to the checkpoint, pinLIT also logs enough information about the execution of system calls so that they can be handled during simulation. pinLIT has code specific to each system call that determines the inputs and outputs for every one of them. Before executing a system call, the analysis code in pinLIT logs information about the input values to the system call along with their address location (for memory operands) or the register name. After the return from the system call, the return value and any memory location and values modified by the system call are logged. When the system calls are encountered during simulation, the control is transferred to a special system call handler that verifies the arguments and writes the output in the proper memory and register locations. If the input arguments are different, then simulation is halted, since the simulation environment requires and only supports deterministic simulation.

3.1.3 Simulation Using pinLIT’s Checkpoint Image

We now describe how the simulator uses the checkpoints. The simulator first loads the checkpoint image into its address space and starts the program’s execution from address 0, which contains a specially inserted (by pinLIT) minimal operating system code or mini-OS. The mini-OS initializes the real page table using CheckpointPageTable to map the virtual addresses of the application to the physical addresses where the code and data pages from the checkpoint image are loaded. The mini-OS also registers a system call handler which is invoked whenever a system call is encountered during the program’s execution inside the simulator.
Finally, the architectural register contents are read from the checkpoint image and written to the registers. Note that this sets the PC to the first instruction executed at the beginning of the simulation interval.

When a system call is encountered the system call handler verifies if the system call input values match the checkpoint image values and writes the outputs to the simulated registers and memory. The system call itself is ignored.

3.2 SimpleScalar

SimpleScalar supports a system call checkpoint environment called EIO (External I/O) logging, which is a trace of the output values of system calls. Playing back the system call effects from the log ensures deterministic behavior, even if the system call has non-reproducible behavior (e.g., gettimeofday).

An EIO file contains a checkpoint of the initial program state that includes memory and architectural state that represents the state of the system at the beginning of the simulation interval. The rest of the EIO file contains information about every system call, including all input and output values and the name of the registers and memory address locations where those values should reside.

When the simulator encounters a system call, it restores the necessary register and memory values by reading them from the EIO trace. This method enables deterministic program execution across all the simulation runs.

4. COMPLEXITY AND EXAMPLE OF LOGGING SYSTEM EFFECTS

User level simulators need to emulate system calls for correct execution of applications. In this section, we discuss in more detail the solutions for emulating system calls and provide some concrete examples to illustrate the complexity involved in emulating them.

4.1 Emulating System Calls

We describe in more detail how system calls are emulated in SimpleScalar [4]. SimpleScalar’s instruction decoder can interpret Alpha, ARM and PISA instruction set architectures.
Recently, support for the x86 ISA has been provided. For clarity, here we assume an Alpha OSF binary emulated on a Linux x86 architecture.

4.1.1 Approach

When a system call is invoked by the simulated application, a special system call handler in the simulator is called to emulate it. The system call handler’s operation can be summarized in three parts.

First, the system call handler has to decode the system call invoked by the application and obtain the necessary input arguments from the simulated register and memory locations. Decoding a system call is dependent on the system call numbers, which are specific to an operating system. For Linux, these numbers are specified in the header file unistd.h. This decoding part of emulation should support the operating system for which the application has been compiled.

Second, the system call arguments are used to invoke an equivalent system call that can be executed natively on the host machine. This part of the emulation should support the operating system on which we want to execute the simulator, because the arguments to the system call are specific to the system. For example, SimpleScalar can support execution of Alpha binaries compiled for DEC Alpha Unix systems (determined by the decoding part of the emulator) on x86 Linux systems (on which the emulator natively executes the system calls).

Third, result values obtained from the native execution of the system call are used to modify appropriate registers and memory locations in the simulator.

4.1.2 Examples

Let us consider how the open system call is emulated. For the open system call, register ECX contains the flag input and EBX contains the address of the location containing the filename.
The flag input format can change between operating systems and we personally have experienced problems while trying to run SimpleScalar on some newer versions of RedHat Linux, which required changes to the emulation system.

For open, the filename is copied into a temporary buffer. The temporary buffer and an integer containing the flag value are used as arguments to invoke the open system call natively. The file handle returned from the native system call is then copied into the EAX register. Note that the emulation of the open system call would be affected if either the binary is compiled for a different operating system or the host on which the simulator is executed changes.

Let us consider another example. The read system call is used to read a specified number of bytes from a file and copy the values read to a buffer. To emulate this system call, SimpleScalar invokes the read system call natively using the contents of registers EBX and EDX as arguments, where EBX contains the file handle and EDX contains the size of the buffer. The read system call also requires a pointer to the location where the read contents need to be stored. To accomplish this, SimpleScalar allocates a buffer of the size specified by the EDX register and passes the pointer to the read system call. Once the system call returns, the contents of the buffer are copied to the location whose address can be found in the ECX register. Finally, the EAX register is written with the error code returned from the native execution of the read system call. Note that the read call can modify the memory location pointed to by ECX and the number of locations modified is dependent on the size specified in the register EDX. Thus, it is necessary to capture the system effects on the memory locations.

For this example, EAX, ECX and EDX determine the memory locations modified by the system call. Other system calls have different interfaces (e.g., pointers to structures), and each case must be handled individually.
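
The read example above can be sketched in Python to show the three steps of hand-written emulation: gather inputs from simulated state, invoke the call natively, and write the results back. This is a simplified illustration, not SimpleScalar's code: the function name `emulate_read`, the register dictionary and the `bytearray` standing in for simulated memory are assumptions. The register roles follow the text (EBX file handle, ECX destination address, EDX size, EAX result).

```python
import os


def emulate_read(regs: dict, mem: bytearray) -> None:
    """Hypothetical handler: emulate read using simulated registers/memory."""
    # Step 1: gather the inputs from simulated state.
    fd, dest, size = regs["EBX"], regs["ECX"], regs["EDX"]
    # Step 2: invoke the equivalent call natively on the host.
    try:
        data = os.read(fd, size)
    except OSError as err:
        regs["EAX"] = -err.errno  # assumed convention: negative errno
        return
    # Step 3: copy the outputs back into simulated memory and registers.
    mem[dest:dest + len(data)] = data
    regs["EAX"] = len(data)


# Usage: a pipe stands in for the file the simulated program reads from.
r, w = os.pipe()
os.write(w, b"hello")
regs = {"EAX": 0, "EBX": r, "ECX": 16, "EDX": 8}
mem = bytearray(64)
emulate_read(regs, mem)
os.close(r)
os.close(w)
```

Every system call needs a handler of this kind with its own knowledge of which registers hold pointers and sizes, which is the per-call effort the section describes.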
These memory inputs and outputs are system call specific and this is why creating these emulation systems is tedious, error prone, and hard to maintain.

4.1.3 Handling Asynchronous Interrupts and DMA

Emulating more complex interactions with the system through asynchronous interrupts and DMA is even tougher to handle in an execution driven simulator. It would require modeling the full system including the external peripheral devices, like in Simics [10]. Hence, applications affected by interrupts and DMA are not supported in the user-level architectural simulators [4, 19, 17], but our logging approach captures the memory effects seen during application level execution.

4.2 Providing Automated System Effects Logging

The above implementation for logging system effects is not desirable for a number of reasons. Note that handling system calls involves identifying the input and output values of each system call. This requires decoding and writing code to handle each system call. This method is not portable to simulate applications compiled for a different operating system or even for a different version of the same operating system. In addition, pinLIT and SimpleScalar do not support applications that use asynchronous interrupts and DMA transfers. We solve these issues with our automated system effect logging to capture all forms of system effects, which we describe next.

5. AUTOMATIC LOGGING OF SYSTEM EFFECTS

In the previous sections, we described how popular cycle accurate simulators [4, 19, 17] need to emulate system calls to achieve correct program execution. For example, SimpleScalar emulates 81 unique system calls to support simulation of SPEC and similar programs. In comparison, the pinLIT simulation tool used at Intel emulates 258 system calls to support a much wider range of applications compiled for the most popular Linux kernels.
Emulating these system effects is tedious to implement, hard to maintain, and error prone.

In this section, we discuss an instrumentation tool that can automatically capture system effects in a log, which can then be used to guide architecture simulation. The tool that we describe here can also support simulation of multi-threaded programs on a time-shared uniprocessor system, which is discussed in detail in Section 5.6. It can also be extended to support deterministic simulation of multi-threaded programs on multi-processor systems, but we leave that for future work.

5.1 Overview

Our goal is to automatically capture all the system effects on a program’s execution in a System Effect Log (SEL) which can be used to replay the program’s execution and simulate it without having to emulate any system effects. The SEL replaces the system effect logging approach used for pinLIT and the SimpleScalar EIO checkpoint trace described in Section 3. Our logging approach is much easier to implement and maintain, and it provides support for asynchronous interrupts and DMA transfers, which are supported neither in the pinLIT nor the SimpleScalar EIO tracing mechanism.

We built our system effect logger called pinSEL using the Pin [9] dynamic instrumentation tool. We briefly describe the key concept that allows us to automatically capture system effects. Our algorithm is inspired by the checkpoint scheme used in BugNet [12]. A straightforward way to capture the system effects on a program execution is to log the value of every single load instruction executed by the program, and to log the register states and the PC value after handling a system call or an interrupt. However, this method is clearly too expensive in terms of runtime and log size overhead. Instead, we need to log a load value only if (a) the load is the first memory operation to access the memory location or (b) the memory location accessed by the load has been modified due to a system effect.
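
The two logging conditions can be sketched with a shadow dictionary that the logger updates only on the application's own loads and stores. This is an illustrative Python model, not pinSEL itself: the class `LoadLogger`, its method names and the dictionary-based memory are assumptions, and `system_effect` simply stands in for any write made by a system call, interrupt or DMA transfer.

```python
class LoadLogger:
    """Hypothetical model of logging only the loads that need a log entry."""

    def __init__(self, memory: dict):
        self.memory = memory  # application memory: address -> value
        self.shadow = {}      # logger's own copy, updated on loads/stores only
        self.log = []         # (address, value) entries written to the log

    def on_store(self, addr: int, value: int) -> None:
        self.memory[addr] = value
        self.shadow[addr] = value

    def on_load(self, addr: int) -> int:
        value = self.memory[addr]
        # (a) first access to this location, or
        # (b) the value differs from what the application last saw or wrote.
        if addr not in self.shadow or self.shadow[addr] != value:
            self.log.append((addr, value))
            self.shadow[addr] = value
        return value

    def system_effect(self, addr: int, value: int) -> None:
        # The shadow copy is deliberately left untouched here.
        self.memory[addr] = value


logger = LoadLogger({0x100: 7})
logger.on_load(0x100)           # first access: logged
logger.on_load(0x100)           # unchanged: not logged
logger.on_store(0x104, 1)
logger.on_load(0x104)           # produced by the program itself: not logged
logger.system_effect(0x100, 9)
logger.on_load(0x100)           # mismatch with shadow copy: logged
```

Only two of the four loads produce log entries here. A system write that happens to leave the value unchanged goes undetected in this model, which is harmless for replay because the replayed load would see the same value anyway.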
We determine the second condition by keeping track of a user-level copy of the memory space that is read and written by the application during execution. The redundant copy is called the user-level copy, because it is maintained in pinSEL’s address space, and is updated by pinSEL for load and store operations executed by the application. The user-level copy is not updated when the system modifies the corresponding application’s memory state while it is handling system calls, interrupts or DMA transfers. Hence, if an application’s memory location is modified due to a system effect, and later if a load accesses the same location, pinSEL detects a mismatch between its user-level copy and the corresponding value in the application’s address space. When pinSEL detects such a mismatch for a load, it can determine that the program’s memory value has been changed by some system event external to the program being profiled, and hence it knows that the load value needs to be logged. We use a similar mechanism to capture the system effects to the register