Random Data Generator Language (RDGL)

COMS W4115: Programming Languages and Translators
Professor Stephen A. Edwards
Summer 2007 (CVN)

"Computer Science is as much about computers as astronomy is about telescopes."
Edsger Dijkstra

Navid Azimi (na2258)
nazimi@microsoft.com

Contents

Introduction
    Purpose
    Goals
    Portability
    Execution
Language Tutorial
    Before You Start
    Getting Started
        Single Token
        Predefined Length
        Varying Length
        Numbers
    Additional Features
Language Manual
    Overview
    Data Types
        Number
        Character
        Alphanumeric
        Symbol
        WildCard
        Enumerations
    Modifiers
        Length
        Range
    Loops
    Tabs, Whitespace and Newlines
Project Plan
    Process
    Programming Style
    Tools
Architectural Design
    Components
        rdgl shell
        rdgl compiler
        rdgl antlr
        rdgl test
        rdgl grammar
Test Plan
    Regressions
    Automation
    Code Coverage
Lessons Learned
Future Work Items
    No Illegal Tokens
    Configurable Distribution Randomness
    Performance
    Comments
    More Options Granular Control
    Formatting
Source
    rdgl grammar
        RDGL.g
        Walker.g
    rdgl compiler
        Engine.java
        RandomData.java
        DataType.java
        Range.java
    rdgl test
        AllTests.java
        AlphaNumericTests.java
        EngineTests.java
        EnumTests.java
        InvalidTests.java
        LetterTests.java
        LoopTests.java
        NumberTests.java
        RangeTests.java
        StringTests.java
        SymbolTests.java
    rdgl antlr

Introduction

Random Data Generator Language (RDGL) is a language which facilitates the generation of random data in a very flexible and powerful way. It is most analogous to regular expressions, except that instead of matching on input strings, RDGL generates output strings given an input expression.

Purpose

There are a number of different areas in which well-formed or explicitly malformed randomly generated data is useful. In software testing, for example, randomly generated data can help with security tests (fuzzing), stress tests, input validation, and even help increase
overall test coverage by supplying a different data set on each automated run.

Goals

The goal of this language is a simple, intuitive syntax that still supports complex expressions, so that the generated data sets are tailored yet random.

Portability

RDGL is translated to Java code and, as a result, can run on any operating system or environment supported by Sun's JVM.

Execution

To simplify the execution and use of RDGL, a command shell has been developed which allows RDGL expressions to be executed directly from the command prompt. The RDGL Shell is provided as a standalone application. However, it is still possible to write plain-text RDGL source files and have them compiled into executable Java code as well.

Language Tutorial

The following is a brief language tutorial for a typical novice programmer. It assumes that the user has some experience with programming concepts (e.g., compilation) and computer languages (e.g., ANTLR, Java).

Before You Start

Before you get started, please ensure that all the project prerequisites are installed and configured correctly. This includes JDK 5 or later for compilation, JRE 5 or later for execution, and ANTLR2 to generate the parser and walker. Also, please ensure that you have the latest copy of the RDGL project. It is currently hosted on Google Code.

Note: the grammar files provided have been written using ANTLR2 and are incompatible with ANTLR3.

Getting Started

The simplest way to get started with RDGL is to use the interactive command shell. Each data type is signified by a single-character token. The five most basic RDGL commands to learn are the tokens for the following data types:

    number
    character
    alphanumeric
    symbol
    wildcard

Single Token

To start, the command shell should greet you with an RDGL prompt. As an example, type the number token and hit return. The output should be similar to:

    RDGL 8 RDGL

If you repeat this action, it is likely that a different number between 0 and 9 (inclusive) will be output. The same logic applies to the other data types as well, where the character token outputs
a single character, the alphanumeric token outputs a single alphanumeric character, et al.

Predefined Length

To generate an alphanumeric string with a predefined length, type the alphanumeric token with a length of 4, where 4 indicates the length of the string to generate. Again, this syntax is consistent across all data types: the symbol token with a length of 40 would generate a random string of length 40 containing only symbols (no letters or numbers).

Varying Length

To generate a random word (letters only) of length 1, 3, or 5, give the character token the lengths 1, 3, and 5. For slightly more convoluted examples, a length specification can mix single values and ranges:

    lengths 1 to 7 and 10 to 15: generates a number with length between 1 and 7, or between 10 and 15, inclusive
    lengths 0, 4 to 6, and 8: generates a number with length 0, 4, 5, 6, or 8

There is no grammar-imposed limit on the number of parameters that can be passed.

Numbers

It is often necessary to generate a number between a minimum and a maximum value. This is achieved in RDGL by providing the range of numbers in curly braces. The following examples should be self-explanatory:

    range 1 to 12: generates a random number between 1 and 12
    range 2007 to 2039: generates a random number between 2007 and 2039

To pick from a set of unordered numbers, please see enumerations.

Additional Features

There are a number of additional features not covered in the language tutorial. It is recommended that serious users read the language manual for additional information.

Language Manual

Overview

RDGL comprises five basic data types which are used to generate data: number, character, alphanumeric, symbol, and wildcard. As the name implies, alphanumeric is a combination of characters and numbers, while wildcard is a combination of alphanumeric and symbols. These data types can then be constrained or modified accordingly (length, range, etc.).

In addition to these data types, it is also possible to generate data based on a given set of values. These are called enumerations. There are two types of enumerations: normal and nullable. The difference between them is discussed in the enumerations section.

Data Types

As discussed in the overview, there are seven
data types, two of which are enumerations:

    characters: a to z, A to Z
    numbers: 0 to 9
    alphanumeric: characters and numbers
    symbols: neither characters nor numbers
    wildcard: alphanumeric and symbols
    enumeration
    nullable enumeration

Each of the data types is described in more detail in its own section.

Number

This is any number between 0 and 9, inclusive. You can also specify a range. For example, an expression with the ranges 25 to 32 and 34 to 43 will pick a random number that lies between 25 and 32 or between 34 and 43. You can only specify the
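
The range selection described in the Number section can be sketched in Java, the language RDGL is translated to. This is only an illustration under stated assumptions, not code from the RDGL sources: the class name RangeSketch, the method pick, the requirement that ranges do not overlap, and the uniform choice over all candidate values are all made up for this example.

```java
import java.util.Random;

/** Hypothetical helper (not part of RDGL): picks a number from a union of inclusive ranges. */
final class RangeSketch {

    /**
     * Returns a value drawn uniformly from the union of the given inclusive
     * ranges, e.g. {{25, 32}, {34, 43}}. Ranges are assumed not to overlap.
     */
    static int pick(Random rng, int[][] ranges) {
        // Count how many candidate values exist across all ranges.
        long total = 0;
        for (int[] r : ranges) {
            if (r.length != 2 || r[0] > r[1]) {
                throw new IllegalArgumentException("each range must be {min, max} with min <= max");
            }
            total += (long) r[1] - r[0] + 1;
        }
        if (total == 0 || total > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("ranges must contain between 1 and Integer.MAX_VALUE values");
        }

        // Choose an index into the concatenated ranges, then map it back to a value.
        int index = rng.nextInt((int) total);
        for (int[] r : ranges) {
            int size = r[1] - r[0] + 1;
            if (index < size) {
                return r[0] + index;
            }
            index -= size;
        }
        throw new AssertionError("unreachable: index is always inside one of the ranges");
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int[][] ranges = {{25, 32}, {34, 43}};
        // Print a few samples; each lies in 25..32 or 34..43, never 33.
        for (int i = 0; i < 5; i++) {
            System.out.println(pick(rng, ranges));
        }
    }
}
```

Counting the candidates first and then indexing into them gives every value the same probability, so the wider range is chosen proportionally more often. Picking a range first and a value second would be simpler but would favor values in the narrower range.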