Unformatted text preview:

56Pervasive secure information pro-cessing over the public wired and wirelessInternet could benefit from rapid and conve-nient cryptographic transformations. But theperformance of software-implemented cryp-tographic functions is hampered by certainoperations that have not been optimized in aprocessor’s instruction set architecture becausethey occurred infrequently in earlier pro-gramming workloads. One such operation isthe permutation of bits within a block to beencrypted, which is particularly difficult inword-oriented processors. Permutations playa prominent role in the secure processing ofinformation and in multimedia processing.Secure information processingSecure information processing includesauthentication of users and host machines,confidentiality of messages sent over publicnetworks, and assurances that messages, pro-grams, and data have not changed in transit.It also includes access control and provisionsto ensure privacy, anonymity, and the avail-ability of essential services.Systems can provide some security functionsby using security protocols that employ differ-ent cryptographic algorithms, such as thosebased on public keys, symmetric keys, and hash-ing. To facilitate pervasive and secure informa-tion processing, such security functions mustbe fast and painless to implement. One way toachieve this goal is to perform the cryptographicfunctions rapidly in software, rather than requir-ing users to buy special-purpose hardware cryp-tography boards. Here, we assume that aprogrammable processor is available in an infor-mation appliance or server machine to performregular (unsecure) transactions over the net-work. The question becomes “What general-purpose operations should this programmableprocessor incorporate so that it can executecryptographic functions without significant per-formance degradation?”Our work focuses on one set of operations,permutations, which are useful in symmetric-key cryptographic algorithms for encryptinglarge amounts of data for confidentiality. Sym-Ruby B. LeeZhijie ShiXiao YangPrinceton UniversityPERFORMING PERMUTATIONS IN SOFTWARE CAN FACILITATE MOREWIDESPREAD USE OF SECURE INFORMATION PROCESSING AND FASTERMULTIMEDIA PROCESSING. BUT CURRENT INSTRUCTION SET ARCHITECTURES,EVEN WHEN AUGMENTED WITH SUBWORD-PARALLEL MULTIMEDIAINSTRUCTIONS, DO NOT PROVIDE EFFICIENT, BIT-LEVEL SOFTWAREPERMUTATIONS. FOUR NEW INSTRUCTIONS EACH OFFER A SOLUTION.0272-1732/01/$10.00  2001 IEEEEFFICIENTPERMUTATIONINSTRUCTIONS FORFASTSOFTWARECRYPTOGRAPHYmetric-key algorithms break a message intoblocks and use confusion and diffusion oper-ations synergistically to transform each blockof plaintext (original message) into ciphertext(encrypted message) bits.1Confusion obscuresthe relationship between the plaintext andciphertext by, for example, substituting arbi-trary bits for plaintext bits. Diffusion spreadsthe redundancy of the plaintext over theciphertext—for example, through permuta-tion of the bits of the plaintext block.DES (Data Encryption Standard) and tripleDES2use bit-level permutations extensivelyto encrypt a block of 64 plaintext bits intociphertext. But bit-level permutations are slowwhen implemented with the current instruc-tions available in microprocessors and otherprogrammable processors. Because permuta-tions are particularly difficult for existingprocessors, new cryptographic algorithms,such as the Advanced Encryption Standard,3tend to avoid permutations to ensure fast soft-ware implementations. In contrast, we show how programmableprocessors can achieve efficient bit permuta-tions, thus enabling future cryptographic algo-rithms to use the superior diffusioncapabilities of permutations.Quick multimedia processingPermutations are also an important newrequirement for fast processing of digital mul-timedia information with subword-parallelinstructions,4,5more commonly known as mul-timedia instructions. Every major micro-processor instruction set architecture (ISA) nowhas these subword-parallel instructions for fastmultimedia information processing.4-11Withsubwords packed into 64- or 128-bit words,it is often necessary to rearrange the subwordswithin the word. However, many of thesemultimedia ISA extensions do not providesuch subword permutation instructions,except for a few initial attempts in MAX-2,5IA-64,8and AltiVec.9New permutation instructionsTo serve the growing need for efficient soft-ware permutation, we developed four per-mutation instructions and an underlyingmethodology for efficiently performing arbi-trary n-bit permutations in programmableprocessors. Although our work targets themore difficult problem of permuting n 1-bitelements, it also addresses the issue of per-muting fewer multibit subwords packed intoan n-bit word, a feature needed to acceleratemultimedia processing in software.By providing the ability to do fast permu-tations in software, we open the field for newcryptography and multimedia algorithmsusing these powerful yet simple permutationprimitives. This capability results in muchfaster cryptography and multimedia process-ing, while retaining the flexibility of softwareimplementations, for secure multimediainformation appliances and servers.Past workBecause current microprocessors are wordoriented, performing bit-level permutations ispainful. Every bit must be extracted from thesource register, moved to its new location inthe destination register, and combined withthe bits that have already been moved. Thisprocess can take four instructions per bit (maskgeneration, AND, SHIFT, and OR). In cer-tain microprocessors such as PA-RISC,12more-powerful bit-manipulation instructions—suchas EXTRACT and DEPOSIT—can essen-tially move a bit with two instructions. As aresult, any arbitrary permutation of n bitsrequires 4n or 2n instructions. Processors canperform certain predefined permutations withregular patterns in fewer instructions. But, ingeneral, an arbitrary 64-bit permutation couldtake up to 128 or 256 instructions on currentmicroprocessors.Because this is unacceptably slow, pro-grammers have used table lookup methods toimplement fixed permutations. Achieving afixed permutation of n input bits is possiblewith one or m table lookups. Doing sorequires a table with 2nentries or m tables with2n/mentries; each entry contains n bits.For example, implementing a 64-bit per-mutation can be done with eight tables, eachwith 256, 64-bit entries. Each table representsthe permutation of eight consecutive bits ofthe source. Each entry would


View Full Document
Download EFFICIENT PERMUTATION
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view EFFICIENT PERMUTATION and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view EFFICIENT PERMUTATION 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?