MIT 6 055J - The UNIX philosophy

2010-05-13 00:43:32 / rev b667c9e4c1f1+

1.5 Example 4: The UNIX philosophy

The preceding examples illustrate how divide and conquer enables accurate estimates. An example remote from estimation – the design principles of the UNIX operating system – illustrates the generality of this tool.

UNIX and its close cousins such as GNU/Linux operate devices as small as cellular telephones and as large as supercomputers cooled by liquid nitrogen. Together they constitute the world’s most portable operating system. Their success derives not from marketing – the most successful variant, GNU/Linux, is free software and owned by no corporation – but rather from outstanding design principles.

These principles are the subject of The UNIX Philosophy [9], a valuable book for anyone interested in how to design large systems. The author isolates nine tenets of the UNIX philosophy, of which four – those with comments in the following list – incorporate or enable divide-and-conquer reasoning:

1. Small is beautiful. In estimation problems, divide and conquer works by replacing quantities about which one knows little with quantities about which one knows more (Section 8.2). Similarly, hard computational problems – for example, building a searchable database of all emails or web pages – can often be solved by breaking them into small, well-understood tasks. Small programs, being easy to understand and use, therefore make good leaf nodes in a divide-and-conquer tree (Section 1.3).

2. Make each program do one thing well. A program doing one task – only spell-checking rather than all of word processing – is easier to understand, to debug, and to use. One-task programs therefore make good leaf nodes in a divide-and-conquer tree.

3. Build a prototype as soon as possible.

4. Choose portability over efficiency.

5. Store data in flat text files.

6. Use software leverage to your advantage.

7. Use shell scripts to increase leverage and portability.

8. Avoid captive user interfaces. Such interfaces are typical in programs for solving complex tasks, for example managing email or writing documents. These monolithic solutions, besides being large and hard to debug, hold the user captive in their pre-designed set of operations. In contrast, UNIX programmers typically solve complex tasks by dividing them into smaller tasks and conquering those tasks with simple programs. The user can adapt and remix these simple programs to solve problems unanticipated by the programmer.

9. Make every program a filter. A filter, in programming parlance, takes input data, processes it, and produces new data. A filter combines easily with another filter, with the output from one filter becoming the input for the next filter. Filters therefore make good leaves in a divide-and-conquer tree.

As examples of these principles, here are two UNIX programs, each a small filter doing one task well:

• head: prints the first lines of the input. For example, head invoked as head -15 prints the first 15 lines.

• tail: prints the last lines of the input. For example, tail invoked as tail -15 prints the last 15 lines.

How can you use these building blocks to print the 23rd line of a file? This problem subdivides into two parts: (1) print the first 23 lines, then (2) print the last line of those first 23 lines. The first subproblem is solved with the filter head -23. The second subproblem is solved with the filter tail -1.

The remaining problem is how to hand the second filter the output of the first filter – in other words, how to combine the leaves of the tree. In estimation problems, we usually multiply the leaf values, so the combinator is usually the multiplication operator. In UNIX, the combinator is the pipe. Just as a plumber’s pipe connects the output of one object, such as a sink, to the input of another object (often a larger pipe system), a UNIX pipe connects the output of one program to the input of another program.

The pipe syntax is the vertical bar. Therefore, the following pipeline prints the 23rd line from its input:

    head -23 | tail -1

But where does the system get the input? There are several ways to tell it where to look:

1. Use the pipeline unchanged. Then head reads its input from the keyboard. A UNIX convention – not a requirement, but a habit followed by most programs – is that, unless an input file is specified, programs read from the so-called standard input stream, usually the keyboard. The pipeline

    head -23 | tail -1

   therefore reads lines typed at the keyboard, prints the 23rd line, and exits (even if the user is still typing).

2. Tell head to read its input from a file – for example, from an English dictionary. On my GNU/Linux computer, the English dictionary is the file /usr/share/dict/words. It contains one word per line, so the following pipeline prints the 23rd word from the dictionary:

    head -23 /usr/share/dict/words | tail -1

3. Let head read from its standard input, but connect the standard input to a file:

    head -23 < /usr/share/dict/words | tail -1

   The < operator tells the UNIX command interpreter to connect the file /usr/share/dict/words to the input of head. The system tricks head into thinking it is reading from the keyboard, but the input comes from the file – without requiring any change in the program!

4. Use the cat program to achieve the same effect as the preceding method. The cat program copies its input file(s) to the output. This extended pipeline therefore has the same effect as the preceding method:

    cat /usr/share/dict/words | head -23 | tail -1

   This longer pipeline is slightly less efficient than using the redirection operator. The pipeline requires an extra program (cat) copying its input to its output, whereas the redirection operator lets the lower level of the UNIX system achieve the same effect (replumbing the input) without the gratuitous copy.

As practice, let’s use the UNIX approach to divide and conquer a search problem:

    Imagine a dictionary of English alphabetized from right to left instead of the usual left to right. In other words, the dictionary begins with words that end in ‘a’. In that dictionary, what word immediately follows trivia?

This whimsical problem is drawn from a scavenger hunt [24] created by the computer scientist Donald Knuth, whose many accomplishments include the TeX typesetting system used to produce this book.

The UNIX approach divides the problem into two parts:

1. Make a
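Returning to the pipeline that prints the 23rd line: the sketch below wraps the head and tail filters in a small shell function and checks that three of the input methods described earlier (file argument, the < redirection, and cat) give the same answer. The function name nth_line and the generated sample file are assumptions made for illustration; they are not part of the original text, and the sample file stands in for /usr/share/dict/words, which may not be installed everywhere. The sketch also writes the options as -n 23 and -n 1, the POSIX spelling of the older -23 and -1 forms used above.

```shell
#!/bin/sh
# nth_line is a hypothetical helper, not a standard UNIX command.
# It reads standard input and prints line N by composing two filters.
nth_line() {
    # head keeps the first N lines; tail keeps the last of those.
    head -n "$1" | tail -n 1
}

# Generate a 100-line sample file (word1 ... word100) so the example
# does not depend on a dictionary file being present.
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100; i++) print "word" i }' > "$sample"

# Three ways of supplying the input:
by_arg=$(head -n 23 "$sample" | tail -n 1)   # file named as an argument to head
by_redirect=$(nth_line 23 < "$sample")       # standard input redirected with <
by_cat=$(cat "$sample" | nth_line 23)        # cat feeding the pipeline

echo "$by_arg $by_redirect $by_cat"          # prints: word23 word23 word23

rm -f "$sample"
```

One caveat of this composition: if the input has fewer than N lines, head passes all of them through and tail prints the final line, so the result is the last line of the input rather than an empty output.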

