CS 33600 Exam 1 Review

The exam is on Thursday, March 26.

These problems review the material we covered about processes, command-lines, pipes and filters, byte streams, data formats, character encodings, Java I/O, and networking.

   http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_1_processes/
   http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_2_simple_ipc/
   http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_3_streams/
   http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_6_data_formats/
   http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_7_character_encodings/
   http://cs.pnw.edu/~rlkraft/cs33600/for-class/streams_and_processes.zip
   http://cs.pnw.edu/~rlkraft/cs33600/for-class/network_clients.zip

Problem 1.)
This question is about I/O redirection and pipes on the cmd.exe command-line. Explain what each of the following possible command-lines means. In each problem, you need to associate an appropriate meaning with the symbols a, b, and c. Each symbol can represent either a program, a file, or a command-line argument, for example "a is the name of a program, b and c are the names of files", or "a and b are the names of programs and c is the name of a file", or "a is the name of a program, b and c are arguments to the program".

   > a > b < c
   > a < b > c
   > a | b > c
   > a < b | c
   > a b c
   > a b > c
   > a b | c
   > a & b < c
   > a < b c
   > a < b & c
   > a & b | c
   > a &(b | c)
   >(a & b)| c
   >(a & b)> c
   > a & b c
   > a & b & c

See http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_3_streams/#command-line-syntax

Problem 2.)
Draw a picture illustrating the processes and streams in this command-line.

   > a < b | c 2> d | e > f 2> d

See http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_3_streams/#pipes

Problem 3.)
What problem is there with each of the following two command-lines? Hint: Try to draw a picture of all the associated processes and streams.

   > a | b < c
   > a > b | c

Problem 4.)
Give an explanation of how the ideas of "standard in", "standard out", "I/O redirection", "pipes", "processes", and "files" are being used in the following command-line. (If you want, you can draw a picture that clearly illustrates each of the mentioned ideas.)

   > b < a | c > d

Problem 5.)
Draw a picture that illustrates all the processes, pipes, files, and streams in the following situation. A process p1.exe gets its standard input from a file a.txt, and its standard output and standard error are connected to a pipe that feeds into the standard input of a process p2.exe. The standard output of p2.exe is connected to a pipe that feeds into a process p3.exe. The standard error of p2.exe is redirected to a file err2.txt. The standard output of p3.exe is redirected to a file b.txt and the standard error of p3.exe is redirected to a file err3.txt.

Also, write a command-line that would implement this configuration of processes, pipes, files, and streams.

Problem 6.)
Write a single Windows cmd.exe command-line that successively applies the filter programs Filter1.exe and Filter2.exe to the data in the file File1.dat, writes the results to the file File2.dat, and redirects the error outputs of Filter1.exe and Filter2.exe to the files Errors1.log and Errors2.log, respectively.

Problem 7.)
Explain in detail the output from this program. (You can copy and paste it directly into the Java Visualizer.)
import java.util.Arrays;
import java.io.UnsupportedEncodingException;

public class Problem5a
{
   public static void main(String[] args) throws UnsupportedEncodingException
   {
      String s = "abcdefghij";
      System.out.println(Arrays.toString( s.getBytes("US-ASCII") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-16LE") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-16BE") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-16") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-32") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-32LE") ));
   }
}

Problem 8.)
Explain in detail the output from this program. (You can copy and paste it directly into the Java Visualizer.) Here is a reference for code page 1252.

   https://en.wikipedia.org/wiki/Windows-1252#Code_page_layout

import java.util.Arrays;
import java.io.UnsupportedEncodingException;

public class Problem_5b
{
   public static void main(String[] args) throws UnsupportedEncodingException
   {
      String s = "abcd±fghiÆ€"; // ±, Æ, and € are in Cp1252 but not ASCII
      System.out.println(s);
      System.out.println(Arrays.toString( s.getBytes("US-ASCII") ));
      System.out.println(Arrays.toString( s.getBytes("Cp1252") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-8") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-16LE") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-16BE") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-16") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-32") ));
      System.out.println(Arrays.toString( s.getBytes("UTF-32LE") ));
   }
}

Problem 9.)
Explain in detail what each of lines 9 through 14 of the following program does. Explain why the program prints out the number that it does. (You can copy and paste this program directly into the Java Visualizer.)
/* 1*/ import java.nio.ByteBuffer;
/* 2*/ import java.nio.ByteOrder;
/* 3*/ import java.util.Arrays;
/* 4*/ public class Problem6
/* 5*/ {
/* 6*/    public static void main(String[] args)
/* 7*/    {
/* 8*/       final int n = 1;
/* 9*/       final byte[] b = ByteBuffer.allocate(Integer.BYTES)
/*10*/                                  .order(ByteOrder.LITTLE_ENDIAN)
/*11*/                                  .putInt(n)
/*12*/                                  .array();
/*13*/       final int n2 = ByteBuffer.wrap( b )
/*14*/                                .getInt();
/*15*/       System.out.println( n2 );
/*16*/    }
/*17*/ }

See http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_6_data_formats/#integers

Problem 10.)
Explain why the following program will throw an exception (and crash) if the program's input is exactly

   abc

with the end-of-file condition coming right after the c. (You can copy and paste this program directly into the Java Visualizer. Use the "stdin" button for the input.)

import java.io.DataInputStream;

public class Problem3
{
   public static void main(String[] args) throws Exception
   {
      final DataInputStream in = new DataInputStream(System.in);
      final int n = in.readInt();
      System.out.println(n);
   }
}

See http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_9_java_streams/#basic-io-streams

Problem 11.)
What are code pages and why do we need them? If you open a file (with an editor, or a console window, or a browser, etc.) while using the wrong code page, what happens?

Problem 12.)
What is a code point? What is a code unit? Give an example of each one. Why is Java's "char" data type not really a character, but a "code unit"?

Problem 13.)
This problem is about Unicode's UTF-8 encoding. Given each of the following byte sequences, determine whether the sequence is 1) legal and complete, 2) legal but incomplete, or 3) illegal. Briefly explain your answers. Use either of these two pictures of the UTF-8 encoding to help with your explanation.
   https://higheredbcs.wiley.com/legacy/college/horstmann/1119402972/app/appendices.pdf#page=7
or
   https://en.wikipedia.org/wiki/UTF-8#Description

See http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_7_character_encodings/#utf-8

   (a) 01000000 10000000
   (b) 11000000 01000000
   (c) 11000000 10000000
   (d) 11110000 10000000 10000000
   (e) 11100000 10000000 10000000
   (f) 11100000 10000000 01000000
   (g) 10000000 01000000 01000000
   (h) 10000000 10000000 10000000

Problem 14.)
This problem is about Unicode's UTF-8 encoding. How many Unicode characters does the following sequence of 16 bytes encode? Explain your answer. (The x's represent bits that do not change the answer.)

   01010101 0000xxxx 0111xxxx 1110xxxx
   1010111x 100011xx 1101xxxx 1000000x
   11110xxx 1011xxxx 10111xxx 101111xx
   0xxxxxxx 110xxxxx 101100xx 011xxxxx

Problem 15.)
Unicode has a 21-bit address space for all of its characters. UTF-8 is an encoding of Unicode's 21-bit address space. Every valid UTF-8 encoding represents a 21-bit address in Unicode's address space. For each valid UTF-8 character encoding below, what is the decoded 21-bit Unicode address? What is the name of the Unicode character? What is the character's code point (in proper Unicode notation)? Use this picture of the UTF-8 encoding to help with the decoding.

   https://higheredbcs.wiley.com/legacy/college/horstmann/1119402972/app/appendices.pdf#page=7

The prefix "0b" means that a number is written in binary (not decimal or hexadecimal).

a) 0b01100011

   0b __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __

b) 0b11100110 0b10001100 0b10111000

   0b __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __

c) 0b11110100 0b10101010 0b10010101 0b10100000

   0b __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __

Problem 16.)
Below are 16 binary bits. These 16 bits can be interpreted as two UTF-8 code units or as one UTF-16BE code unit. What character (or characters) do these 16 bits represent in UTF-8?
What character do they represent in UTF-16?

   1 1 0 0 1 0 0 1 1 0 1 0 0 0 0 1

See http://cs.pnw.edu/~rlkraft/cs33600/for-class/readmes/Readme_7_character_encodings/#utf-16

Problem 17.)
The UTF-16 BOM (byte-order mark), shown here in big-endian byte order, is the following two bytes.

   0xFE 0xFF

In this Stack Overflow answer,

   https://stackoverflow.com/questions/2223882/whats-the-difference-between-utf-8-and-utf-8-with-bom/2223976#2223976

it is mentioned that the "UTF-8 BOM" is the following three bytes.

   0xEF 0xBB 0xBF

Is this really a three-byte UTF-8 character? If so, what actual Unicode character is it? Why is this called the "UTF-8 BOM"? Why does UTF-8 NOT need a BOM, but UTF-16 does?

Problem 18.)
Use the code page tables for the 437 and 1252 code pages to carefully explain the results from these two lines of Java code.

   new String( "½¼".getBytes("Cp1252"), "Cp437" )
   new String( "½¼".getBytes("Cp437"), "Cp1252" )

   https://en.wikipedia.org/wiki/Code_page_437#Character_set
   https://en.wikipedia.org/wiki/Windows-1252#Code_page_layout

Problem 19.)
What does an IP address identify? What does a port number identify?

Problem 20.)
Identify all the parts of the following URL.

   http://www.bigcompany.com:42/consumer/products/lookup.php?sku=0&color=greenish#item=3
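
One way to check an answer to a question like Problem 20 is to let Java's standard java.net.URI class split a URL into its components. The program below is only a minimal sketch of that idea. The class name UrlParts, the method name parts, and the sample URL are made up for illustration and are not part of the course code.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not from the course materials): splits a URL
// into its named components using the standard java.net.URI class.
class UrlParts
{
   // Return the components of the given URL, in the order they appear in the URL.
   static Map<String, String> parts(final String url)
   {
      final URI uri = URI.create(url); // throws IllegalArgumentException if malformed
      final Map<String, String> m = new LinkedHashMap<>();
      m.put("scheme",   uri.getScheme());
      m.put("host",     uri.getHost());
      m.put("port",     String.valueOf(uri.getPort())); // -1 means no port was given
      m.put("path",     uri.getPath());
      m.put("query",    uri.getQuery());     // null if the URL has no query
      m.put("fragment", uri.getFragment());  // null if the URL has no fragment
      return m;
   }

   public static void main(String[] args)
   {
      // Sample URL chosen for illustration only; pass another URL as args[0].
      final String url = (args.length > 0)
                       ? args[0]
                       : "https://example.org:8080/docs/index.html?lang=en&page=2#section-3";
      parts(url).forEach((name, value) -> System.out.println(name + " = " + value));
   }
}
```

Running the program with the Problem 20 URL as its command-line argument prints one line per component, which can be compared against a hand-labeled answer. Put the URL in double quotes on the cmd.exe command-line, since the & character would otherwise be treated as a command separator.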