Context Switch Overheads for Linux on ARM Platforms


Francis M. [email protected] C. [email protected] H. [email protected] of Computer Science
University of Illinois at Urbana-Champaign
201 N Goodwin Ave
Urbana, IL 61801-2302

ABSTRACT

Context switching imposes a performance penalty on threads in a multitasking environment. The source of this penalty is both direct overhead due to running the context switch code and indirect overhead due to perturbation of caches. We calculate indirect overhead by measuring the running time of tasks that use context switching and subtracting the direct overhead. We also measure the indirect overhead impact on the running time of tasks due to processor interrupt servicing. Experiment results are presented for the Linux kernel running on an ARM processor based mobile device platform.

Categories and Subject Descriptors

D.4.8 [Operating Systems]: Performance - Measurements

General Terms

Experimentation, Measurement, Performance

Keywords

operating system, context switch overhead

1. INTRODUCTION

Context switching is the fundamental mechanism that is used to share a processor across multiple threads of execution. Each thread is associated with a processor state such as the program counter, general purpose registers, status registers and so on. A context switch is the act of saving the processor state of a thread and loading the saved state of another thread. If the threads are associated with different virtual address spaces, a context switch also involves switching the address translation maps used by the processor. In Linux, this happens when the threads belong to different user processes. Switching address spaces requires that relevant entries in the processor’s address translation cache (TLB) are invalidated. If the instruction or data caches are tagged using virtual addresses, they would have to be emptied as well.

Context switching imposes a small performance penalty on threads in a multitasking environment.
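As a rough illustration of the subtraction described in the abstract, the following Python sketch splits a measured slowdown into direct and indirect parts. The function name, its parameters, the use of a baseline run with fewer switches, and all numbers are hypothetical; they are assumptions made for illustration and are not taken from the paper's measurements.

```python
def indirect_overhead_per_switch(run_with_switches, run_baseline,
                                 direct_cost, extra_switches):
    """Estimate the indirect overhead attributable to each extra context switch.

    All arguments are hypothetical inputs in the same time unit:
      run_with_switches - total running time of the tasks with extra switches
      run_baseline      - total running time of the same tasks without them
      direct_cost       - time spent in the context switch code per switch
      extra_switches    - number of additional context switches performed
    """
    if extra_switches <= 0:
        raise ValueError("extra_switches must be positive")
    # Everything the extra switches added to the running time.
    total_penalty = run_with_switches - run_baseline
    # Remove the part explained by running the switch code itself.
    indirect_total = total_penalty - extra_switches * direct_cost
    return indirect_total / extra_switches


# Made-up numbers in microseconds: 250 us slower overall, 2 extra switches,
# 50 us of direct cost per switch.
example = indirect_overhead_per_switch(1250.0, 1000.0, 50.0, 2)
print(example)  # 75.0
```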
In addition to the direct overhead associated with the actual context switching code, there are several other factors that contribute to this penalty. The perturbation of processor caches like the instruction, data, address translation and branch-target buffers results in an additional indirect overhead. Yet another possible source of indirect overhead is operating system memory paging. A context switch can result in an in-use memory page being moved to disk if there is no free memory, thus hurting performance. In this paper, we do not consider overheads due to paging and assume that sufficient main memory is present to avoid thrashing.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ExpCS, 13-14 June 2007, San Diego, CA.
Copyright 2007 ACM 978-1-59593-751-3/07/06...$5.00

We have described a context switch as a mechanism used to switch between two threads of execution. We do not consider a system call a context switch. This is like a simple function call and only involves switching the processor from unprivileged user mode to a privileged kernel mode. Memory maps are not switched. The transition back to userspace from the kernel during the return of the system call is similar to a function call return.

A processor interrupt causes the state of the currently executing task to be saved while an interrupt service routine is executed. When the interrupt service routine completes, the saved state is restored.
While memory maps are not switched during interrupt servicing, the service routine does perturb cache state and might also contribute some indirect overhead.

In this paper, we measure the indirect overhead of context switches inside the Linux kernel using pairs of tasks that perform cooperative multitasking. In a separate set of experiments, we also measure the indirect overhead introduced due to processor interrupt servicing.

We do not explore userspace implementations of threads and userspace context switching. The latest versions of the Linux kernel support the Native POSIX Thread Library (NPTL), which implements user threads as kernel threads, so context switching happens inside the kernel.

This study targets mobile device architectures, and the hardware platform we use in our experiments is the OMAP1610 H2 Software Development Platform [8] cellular phone reference design from Texas Instruments. The OMAP1610 is powered by an ARM processor core.

The rest of this paper is organized as follows. Section 2 presents a quick introduction to the hardware platform that we use in our experiments. We discuss the experiment setup and results for context switch overhead measurements in section 3. The experiment setup and results for interrupt servicing overhead measurements are presented in section 4. After exploring some related work in section 5, we conclude in section 6.

Figure 1: Context Switch Overhead Experiment Measurements (task timelines marking the begin and end of Task 1 and Task 2, the total running times Rtotal and R’total, and the cases Context Switches = 1 and Context Switches = 3)

2. EXPERIMENTATION PLATFORM

ARM is a 32-bit RISC architecture. ARM processors are widely used in mobile devices because of their low power consumption. In this section, we briefly describe some features of the ARM architecture that are relevant to this research. Our implementations and experiments have been carried out on a processor core which belongs to the ARMv5 architecture generation.
The ARM926EJ-S processor core that we use is part of the OMAP1610 chip from Texas Instruments.

Context switches require the saving of 16 general purpose registers (including the program counter) and one status register. A memory management unit (MMU) translates virtual addresses from the processor into physical addresses. A split (Harvard) memory cache is available in the processor, providing a 16 kilobyte, four-way set-associative instruction cache and an 8 kilobyte, four-way set-associative data cache. These caches are virtually tagged and therefore have to be emptied when switching contexts. There are two TLBs, one for data and one for instructions. Each TLB holds 64 entries. TLB entries can be locked down in software, but we do not use any lockdowns for the experiments in this paper. The
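
To make the cache and TLB figures above more concrete, the Python sketch below derives the number of cache lines, the number of sets, and the address range a full TLB can map. The cache sizes, associativity and TLB entry count come from the text; the 32-byte cache line size and the 4 KB page size are assumptions that the text does not state, and the class and variable names are illustrative only.

```python
from dataclasses import dataclass

LINE_BYTES = 32    # assumption: cache line size, not stated in the text
PAGE_BYTES = 4096  # assumption: small (4 KB) pages are in use
TLB_ENTRIES = 64   # from the text: each TLB holds 64 entries


@dataclass(frozen=True)
class Cache:
    size_bytes: int
    ways: int
    line_bytes: int

    @property
    def lines(self) -> int:
        # Total number of cache lines the cache can hold.
        return self.size_bytes // self.line_bytes

    @property
    def sets(self) -> int:
        # Lines are grouped into sets of `ways` lines each.
        return self.lines // self.ways


icache = Cache(size_bytes=16 * 1024, ways=4, line_bytes=LINE_BYTES)
dcache = Cache(size_bytes=8 * 1024, ways=4, line_bytes=LINE_BYTES)

# Memory covered by one fully populated TLB under the page-size assumption.
tlb_reach_bytes = TLB_ENTRIES * PAGE_BYTES

print(icache.lines, icache.sets)   # 512 128
print(dcache.lines, dcache.sets)   # 256 64
print(tlb_reach_bytes // 1024)     # 256
```

Under these assumptions, a task that resumes after both caches have been emptied has up to 512 instruction lines and 256 data lines to fetch again, which gives a sense of scale for the indirect cost of each switch.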

