UMass Amherst CMPSCI 710 - Incorporating Domain-Specific Information


Slide titles: Incorporating Domain-Specific Information into the Compilation Process; Motivation; Find the error – part 1; Find the error – part 2; Find the error – part 3; Problem; Solution; The Broadway Compiler; Benefits; Outline; Security vulnerabilities; Remote access vulnerability; Challenge 1: Pointers; Challenge 2: Scope; Challenge 3: Precision; Insufficient precision; Cost versus precision; Client-Driven Algorithm; Algorithm components; Sources of imprecision; In action...; Methodology; Programs; Error detection problems; Results; Overall results; Why does it work?; Slide 28; Central contribution; Specific contributions; Related work; Future work; Slide 33; Annotations (I); Annotations (II); Annotations (III); Annotations (IV); Slide 38; Time; Validation; Type Theory; Generators; Is it correct?; Annotation correctness; Error Checking vs Optimization; Complexity; Optimization; PLAPACK Optimizations; Results; Slide 50; End backup slides

Slide 1: Incorporating Domain-Specific Information into the Compilation Process
  Samuel Z. Guyer
  Supervisor: Calvin Lin
  April 14, 2003

Slide 2: Motivation
  Two different views of software:
    Compiler’s view
      Abstractions: numbers, pointers, loops
      Operators: +, -, *, ->, []
    Programmer’s view
      Abstractions: files, matrices, locks, graphics
      Operators: read, factor, lock, draw
  This discrepancy is a problem...

Slide 3: Find the error – part 1
  Example:
    switch (var_83) {
    case 0: func_24(); break;
    case 1: func_29(); break;
    }
    case 2: func_78();    /* ! */
  Error: case outside of switch statement
    Part of the language definition
    Error reported at compile time
    Compiler indicates the location and nature of error

Slide 4: Find the error – part 2
  Example:
    struct __sue_23 * var_72;
    char var_81[100];
    var_72 = libfunc_84(__str_14, __str_65);
    libfunc_44(var_72);
    libfunc_38(var_81, 100, 1, var_72);    /* ! */
  Improper call to libfunc_38
    Syntax is correct – no compiler message
    Fails at run-time
  Problem: what does libfunc_38 do?
  This is how compilers view reusables

Slide 5: Find the error – part 3
  Example:
    FILE * my_file;
    char buffer[100];
    my_file = fopen("my_data", "r");
    fclose(my_file);
    fread(buffer, 100, 1, my_file);    /* ! */
  Improper call to fread() after fclose()
    The names reveal the mistake
  No traditional compiler reports this error
    Run-time system: how does the code fail?
    Code review: rarely this easy to spot

Slide 6: Problem
  Compilers are unaware of library semantics
    Library calls have no special meaning
    The compiler cannot provide any assistance
  Burden is on the programmer:
    Use library routines correctly
    Use library routines efficiently and effectively
  These are difficult manual tasks
    Tedious and error-prone
    Can require considerable expertise

Slide 7: Solution
  A library-level compiler
    Compiler support for software libraries
    Treat library routines more like built-in operators
  Compile at the library interface level
    Check programs for library-level errors
    Improve performance with library-level optimizations
  Key: Libraries represent domains
    Capture domain-specific semantics and expertise
    Encode in a form that the compiler can use

Slide 8: The Broadway Compiler
  Broadway – source-to-source C compiler
    Domain-independent compiler mechanisms
  Annotations – lightweight specification language
    Domain-specific analyses and transformations
  Many libraries, one compiler
  [Diagram: the inputs are the Application (source code) and the Library (annotations, header files, source code). Broadway (analyzer, optimizer) produces error reports (library-specific messages) and Application+Library (integrated source code).]

Slide 9: Benefits
  Improves capabilities of the compiler
    Adds many new error checks and optimizations
    Qualitatively different
  Works with existing systems
    Domain-specific compilation without recoding
    For us: more thorough and convincing validation
  Improve productivity
    Less time spent on manual tasks
    All users benefit from one set of annotations

Slide 10: Outline
  Motivation
  The Broadway Compiler
  Recent work on scalable program analysis
    Problem: Error checking demands powerful analysis
    Solution: Client-driven analysis algorithm
    Example: Detecting security vulnerabilities
  Contributions
  Related work
  Conclusions and future work

Slide 11: Security vulnerabilities
  How does remote hacking work?
    Most are not direct attacks (e.g., cracking passwords)
    Idea: trick a program into unintended behavior
  Automated vulnerability detection:
    How do we define “intended”?
    Difficult to formalize and check application logic
  Libraries control all critical system services
    Communication, file access, process control
    Analyze routines to approximate vulnerability

Slide 12: Remote access vulnerability
  Example:
    int sock;
    char buffer[100];
    sock = socket(AF_INET, SOCK_STREAM, 0);
    read(sock, buffer, 100);
    execl(buffer);
  Vulnerability: executes any remote command
    What if this program runs as root?
    Clearly domain-specific: sockets, processes, etc.
  Requirement:
    Data from an Internet socket should not specify a program to execute!
  Why is detecting this vulnerability hard?

Slide 13: Challenge 1: Pointers
  Example:
    int sock;
    char buffer[100];
    char * ref = buffer;
    sock = socket(AF_INET, SOCK_STREAM, 0);
    read(sock, buffer, 100);
    execl(ref);    /* ! */
  Still contains a vulnerability
    Only one buffer
    Variables buffer and ref are aliases
  We need an accurate model of memory

Slide 14: Challenge 2: Scope
  Call graph:
    Objects flow throughout program
    No scoping constraints
    Objects referenced through pointers
  We need whole-program analysis
  [Diagram: call graph in which main calls socket, read, and execl: sock = socket(AF_INET, SOCK_STREAM, 0); read(sock, buffer, 100); execl(ref); the execl call is marked "!".]

Slide 15: Challenge 3: Precision
  Static analysis is always an approximation
  Precision: level of detail or sensitivity
    Multiple calls to a procedure
      Context-sensitive: analyze each call separately
      Context-insensitive: merge information from all calls
    Multiple assignments to a variable
      Flow-sensitive: record each value separately
      Flow-insensitive: merge values from all assignments
  Lower precision reduces the cost of analysis
    Exponential -> polynomial -> ~linear

Slide 16: Insufficient precision
  Example:
    Context-insensitivity
    Information merged at call
    Analyzer reports 2 possible errors
    Only 1 real error
  Imprecision leads to false positives!
  [Diagram: call graph with nodes main, socket, stdin, read, and two execl calls, each marked "?".]

Slide 17: Cost versus precision
  Problem: A tradeoff
    Precise analysis: prohibitively expensive
    Cheap analysis: too many false positives
  Idea: Mixed precision analysis
    Focus effort on the parts of the program that matter
    Don’t waste time over-analyzing the rest
  Key: Let error detection problem drive precision
  Client-driven program

