Characters and Strings
Jan 14, 2019

Characters
- In Java, a char is a primitive type that can hold a single character
- A character can be:
    - A letter or digit
    - A punctuation mark
    - A space, tab, newline, or other whitespace
    - A control character
- Control characters are holdovers from the days of teletypes: things like backspace, bell, end of transmission, etc.

char literals
- A char literal is written between single quotes (also known as apostrophes): 'a' 'A' '5' '?' ' '
- Some characters cannot be typed directly and must be written as an “escape sequence”:
    - Tab is '\t'
    - Newline is '\n'
- Some characters must be escaped to prevent ambiguity:
    - Single quote is '\'' (quote-backslash-quote-quote)
    - Backslash is '\\'

Additional character literals
    \n   newline
    \t   tab
    \b   backspace
    \r   return
    \f   form feed
    \\   backslash
    \'   single quote
    \"   double quote

Character encodings
- A character is represented as a pattern of bits
- The number of characters that can be represented depends on the number of bits used
- For a long time, ASCII (American Standard Code for Information Interchange) has been used
- ASCII is a seven-bit code (allows 128 characters)
- ASCII is barely enough for English
    - Omits many useful characters: ¢ ½ ç “ ”

Unicode
- Unicode is a standard for character encoding that is designed to replace ASCII
- “Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.”
- Java uses a subset of Unicode to represent characters
- A Java char is two bytes (16 bits), so it can hold any character from \u0000 to \uFFFF
- Java 1.5 expands this by supporting supplementary characters (code points above \uFFFF), each stored as a pair of char values
- Except for having these extra characters available, it seldom makes any difference to how you program

Unicode character literals
- Characters with codes 0 through 255 can also be written as octal escapes from \0 to \377
- Any Unicode character (in the Java subset) can be written as a hexadecimal escape between \u0000 and \uFFFF
- Since there are 65,536 possible values in this range, the list occupies an entire book
    - This makes it hard to look up characters
- Unicode “letters” in any alphabet can be used in identifiers

Glyphs and fonts
- A glyph is the printed representation of a character
- For example, the letter ‘A’ can be represented by any of the glyphs A A A A (each in a different typeface)
- A font is a collection of glyphs
- Unicode describes characters, not glyphs

Strings
- A String is a kind of object, and obeys all the rules for objects
- In addition, there is extra syntax for string literals and string concatenation
- A string is made up of zero or more characters
- The string containing zero characters is called the empty string

String literals
- A string literal consists of zero or more characters enclosed in double quotes: "" "Hello" "This is a String literal."
- To put a double quote character inside a string, it must be backslashed: "\"Wait,\" he said, \"Don't go!\""
- Inside a string, a single quote character does not need to be backslashed (but it can be)

String concatenation
- Strings can be concatenated (put together) with the + operator: "Hello, " + name + "!"
- Anything “added” to a String is converted to a string and concatenated
- Concatenation is done left to right:
    - "abc" + 3 + 5 gives "abc35"
    - 3 + 5 + "abc" gives "8abc"
    - 3 + (5 + "abc") gives "35abc"

Newlines
- The character '\n' represents a “newline” (actually, it’s an LF, the linefeed character)
- When “printing” to the screen, you can go to a new line by printing a newline character
- You can also go to a new line by using System.out.println with no argument or with one argument
- When writing to the internet, you should use "\r\n" instead of println, because many internet protocols require CR-LF and println is platform-specific:
    - On UNIX (including modern macOS), println uses LF for a newline
    - On classic Mac OS (before OS X), println used CR instead of LF for a newline
    - On Windows, println uses CR-LF for a newline
- When you use the character constants, you are in control of what is actually output

System.out.print and println
- System.out.println can be called with no arguments (parameters), or with one argument
- System.out.print is called with one argument
- The argument may be any of the 8 primitive types
- The argument may be any object
- Java can print any object, but it doesn’t always do a good job
    - Java does a good job printing Strings
    - Java typically does a poor job printing types you define

Printing your objects
- In any class, you can define the following instance method: public String toString() { ... }
- This method can return any string you choose
- If you have an instance x, you can get its string representation by calling x.toString()
- If you define your toString() method exactly as above, it will be used whenever your object is converted to a String
    - This happens during concatenation: "My object is " + myObject
    - toString() is also used by System.out.print and System.out.println

Constructing a String
- You can construct a string by writing it as a literal: "This is special syntax to construct a String."
- Since a string is an object, you could construct it with new: new String("This also constructs a String.")
- But using new for constructing a string is foolish, because you have to write the string as a literal to pass it in to the constructor
    - You’re doing the same work twice!

String methods
- This is only a sampling of string methods
- All are called as: myString.method(params)
- length() -- the number of characters in the String
- charAt(index) -- the character at (integer) position index, where index is between 0 and length()-1
- equals(anotherString) -- equality test (because == doesn’t do quite what you expect)
    - Hint: Use "expected".equals(actual) rather than actual.equals("expected") to avoid NullPointerExceptions
- Don’t learn every String method (there are dozens) unless you use them a lot; instead, learn to use the API!

Vocabulary
- escape sequence -- a code sequence for a character, beginning with a backslash
- ASCII -- a 7-bit standard for encoding characters
- Unicode -- a standard for encoding characters; a Java char holds 16 bits
- glyph -- the printed representation of a character
- font -- a collection of glyphs
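
Example: char literals and escapes. The sketch below tries out the escape sequences from the char literals and Unicode character literals slides. The class name CharLiteralsDemo and the helper codeOf are made up for illustration; they are not part of the slides.

```java
class CharLiteralsDemo {
    // Hypothetical helper: returns the numeric code of a character.
    // A char widens to an int automatically.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        char tab = '\t';          // escape sequence for tab
        char quote = '\'';        // single quote must be escaped
        char backslash = '\\';    // backslash must be escaped
        char cent = '\u00A2';     // cent sign, written in hexadecimal
        char octalA = '\101';     // octal 101 is decimal 65, the letter A

        System.out.println(codeOf('a'));        // 97
        System.out.println(codeOf(tab));        // 9
        System.out.println(codeOf(quote));      // 39
        System.out.println(codeOf(backslash));  // 92
        System.out.println(codeOf(cent));       // 162
        System.out.println(octalA);             // A
    }
}
```

Printing the codes rather than the characters avoids depending on how the console renders non-ASCII glyphs.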
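
Example: concatenation and toString(). This sketch repeats the three left-to-right cases from the String concatenation slide and shows toString() being picked up during concatenation. The Point class and the describe method are hypothetical, invented only for this illustration.

```java
class ConcatDemo {
    // Hypothetical class, used only to illustrate toString().
    static class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return "(" + x + ", " + y + ")";
        }
    }

    // Concatenating an object with a String calls its toString().
    static String describe(Object o) {
        return "My object is " + o;
    }

    public static void main(String[] args) {
        System.out.println("abc" + 3 + 5);      // abc35
        System.out.println(3 + 5 + "abc");      // 8abc
        System.out.println(3 + (5 + "abc"));    // 35abc
        System.out.println(describe(new Point(1, 2)));  // My object is (1, 2)
    }
}
```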
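
Example: String methods and equals. A minimal sketch of the methods sampled on the String methods slide, including why == is not the right equality test and how the "expected".equals(actual) hint avoids a NullPointerException. The names StringMethodsDemo and isExpected are assumptions for this example.

```java
class StringMethodsDemo {
    // Null-safe comparison: calling equals on the literal never throws,
    // even when actual is null.
    static boolean isExpected(String actual) {
        return "expected".equals(actual);
    }

    public static void main(String[] args) {
        String literal = "Hello";
        String built = new String("Hello");  // a distinct object, same characters

        System.out.println(literal.length());                       // 5
        System.out.println(literal.charAt(0));                      // H
        System.out.println(literal.charAt(literal.length() - 1));   // o
        System.out.println(literal == built);        // false: different objects
        System.out.println(literal.equals(built));   // true: same characters
        System.out.println(isExpected(null));        // false, no exception
    }
}
```

Here new String(...) is used only to force a second object into existence so that == and equals give different answers.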
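
Example: explicit versus platform newlines. The Newlines slide distinguishes character constants, which you control, from println, which is platform-specific. The sketch below shows both; NewlineDemo and crlfLine are hypothetical names, and the last line of output depends on the platform, so no expected value is given for it.

```java
class NewlineDemo {
    // Hypothetical helper: ends a line with an explicit CR-LF,
    // regardless of the platform the program runs on.
    static String crlfLine(String text) {
        return text + "\r\n";
    }

    public static void main(String[] args) {
        // Explicit line ending: the same two characters everywhere.
        System.out.print(crlfLine("first line"));

        // Platform line ending, chosen by println.
        System.out.println("second line");

        // Show the platform separator as character codes
        // (10 is LF, 13 is CR).
        for (char c : System.lineSeparator().toCharArray()) {
            System.out.print((int) c + " ");
        }
        System.out.println();
    }
}
```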