BGreP AND BDiff


BGrep and BDiff: UNIX Tools for High-Level Languages (Extended Abstract)¹
Dartmouth Computer Science Technical Report TR2011-705
Version of September 17, 2011
Gabriel A. Weaver and Sean W. Smith²
Dartmouth College

Abstract

The rise in high-level languages for system administrators requires us to rethink traditional UNIX tools, which were designed for older, line-oriented data formats. We propose new block-oriented tools, bgrep and bdiff, operating on syntactic blocks of code rather than the line, the traditional information container of UNIX. Transcending the line number allows us to introduce longitudinal diff, a mode of bdiff that lets us track changes across arbitrary blocks of code. We present a detailed implementation roadmap and evaluation framework for the full version of this paper. In addition, we demonstrate how the design of our tools already addresses several real-world problems that network administrators face in maintaining security policy.

Keywords: UNIX, configuration management, security policy

1 Introduction

High-level languages for system administrators are increasingly displacing flat-file or line-based formats. Traditional UNIX tools, however, are geared towards these older file paradigms. In light of the variety of modern programming languages available to system and network administrators, and given the success and utility of grep and diff, we adapt and extend grep and diff for this new programming ecosystem.

Our new tools operate on syntactic blocks of code and text as the default information container, in contrast to traditional tools that treat the line as the default. Syntactic blocks often correspond to meaningful higher-level constructs ranging from a network interface in Cisco IOS, to a VirtualHost in Apache, to a section of text within a normative reference document such as an IETF RFC.

Our block-based grep gives administrators a general mechanism to extract blocks of text or code that match simple context-free patterns.
Similarly, our block-based diff empowers administrators to compare two versions of the same set of blocks over time. Moreover, we provide a natural extension to diff, longitudinal diff, that enables an administrator to track changes made to a specific set of blocks (such as network interfaces) as a time-series dataset.

High-Level Languages for System Administration are on the Rise. System administrators are increasingly turning to high-level programming languages to configure and manage their systems. Consider system configuration in general. Last summer's USENIX Configuration Management Summit featured four tools available to system administrators to programmatically configure their systems: CFEngine3, Bcfg2, Chef, and Puppet [1]. CFEngine3 explicitly encodes promises among different system resources. Bcfg2 views configuration management "as an API" for programming a system configuration. Chef views infrastructure as code that can benefit from software engineering practices. Finally, the Puppet language models the desired state of datacenters.

We also notice this trend in network configuration management. At USENIX HotICE this year, one Cisco engineer remarked that Cisco IOS is old and that model-driven architectures are the new direction to configure networks. Consider Netconf, Yang, and Cisco NxOS. In addition, some consider network configuration a form of distributed programming.

Traditional UNIX Tools were Designed for Simpler File Formats. Two traditional UNIX workhorses are grep and diff. However, these tools were designed for simpler file formats than modern system modeling languages. During conversations with Doug McIlroy (inventor of diff and pipes, and arguably the first user of the first UNIX), we identified several design decisions behind grep and their limitations [10].

¹ This work was supported in part by the NSF (under grant CNS-0448499), and the TCIPG project sponsored by the DOE (under grant DE-OE0000097).
The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of any of the sponsors.
² {gweave01,sws}@cs.dartmouth.edu

Grep. First consider grep. Users can casually and quickly write regular expressions to extract structure from a file. Originally, grep was written because Doug McIlroy needed to extract terms from a dictionary, but the dictionary was too large to load in his text editor. Therefore, the regular expression parser was written in order to accommodate the limitations of the machine at the time.

Today, however, many of the languages and language constructs encountered by administrators are not regular. There is a wide variety of languages found within configuration management packages such as Chef. The unit of distribution within Chef, the cookbook, includes (among other things) configuration templates written in a configuration language specific to that utility, and recipes, which are Ruby scripts that specify how to manage node resources.

Given the variety of languages, we cannot assume they are all regular. Although Ruby supports blocks of code that are nested arbitrarily deep, one cannot write a regular expression that matches blocks of code at an arbitrary depth [13].

We want to be able to extract syntactic blocks because they often encode meaningful units of information. A practitioner may want to extract a particular method or set of methods from a Ruby recipe to see how PowerShell is installed depending upon the version of Windows on the hosting node. Alternatively, a practitioner may want to extract the set of virtual hosts to quickly see how she has partitioned her organizational domain into meaningful subdomains on a single server instance.

Diff. Diff lets system administrators compare files by line. Today diff is a backbone in many version-control systems.
More recently, diff is used by the Really Awesome New Cisco Config Differ (RANCID) to report changes to the configuration of a network device.

Many of the languages encountered by system administrators today, however, are no longer organized by line number but in more complex syntactic structures. For example, Cisco IOS uses blocks of code to denote interfaces. Today, line numbers are as much a consequence of data storage as they are of language syntax. We argue that using the line as the primary information unit in these high-level languages is like trying to compare two editions of a textbook by page number rather than by logical section.

Practitioners agree with the need to reconsider
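
The contrast between comparing by line and comparing by block, as argued in the Diff discussion above, can be illustrated with a minimal Python sketch. It is not the authors' bdiff. It assumes a simplified IOS-like layout in which an unindented line opens a block and the indented lines after it belong to that block. The function names split_blocks and block_diff, the sample configurations, and the report layout are all hypothetical and chosen only for illustration.

```python
import difflib


def split_blocks(config_text):
    """Group lines into blocks keyed by their header line.

    Assumption (not from the paper): an unindented line opens a block and
    the indented lines that follow belong to it. A repeated header would
    overwrite the earlier block in this simplified sketch.
    """
    blocks = {}
    header = None
    for line in config_text.splitlines():
        if not line.strip() or line.strip() == "!":
            continue  # skip blank lines and IOS-style "!" separator lines
        if not line[0].isspace():
            header = line.strip()
            blocks[header] = []
        elif header is not None:
            blocks[header].append(line.strip())
    return blocks


def block_diff(old_text, new_text):
    """Compare two configurations block by block instead of line by line."""
    old, new = split_blocks(old_text), split_blocks(new_text)
    report = {"added": [], "removed": [], "changed": {}}
    report["added"] = [h for h in new if h not in old]
    report["removed"] = [h for h in old if h not in new]
    for header in old:
        if header in new and old[header] != new[header]:
            matcher = difflib.SequenceMatcher(
                a=old[header], b=new[header], autojunk=False
            )
            changes = []
            for tag, i1, i2, j1, j2 in matcher.get_opcodes():
                if tag != "equal":
                    changes += ["- " + l for l in old[header][i1:i2]]
                    changes += ["+ " + l for l in new[header][j1:j2]]
            report["changed"][header] = changes
    return report


# Hypothetical sample data: NEW gains a line at the top, which shifts every
# line number, and flips one setting inside the second interface block.
OLD = """\
interface FastEthernet0/0
 ip address 10.0.0.1 255.255.255.0
 no shutdown
interface FastEthernet0/1
 ip address 10.0.1.1 255.255.255.0
 shutdown
"""

NEW = """\
hostname edge-router
interface FastEthernet0/0
 ip address 10.0.0.1 255.255.255.0
 no shutdown
interface FastEthernet0/1
 ip address 10.0.1.1 255.255.255.0
 no shutdown
"""

if __name__ == "__main__":
    print(block_diff(OLD, NEW))
```

Because the report is keyed by block header rather than by line number, the inserted hostname line shows up as one added block and the only change reported is inside interface FastEthernet0/1, even though every line in the file has moved down by one.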

