This is a review session schedule for 11am to 12:30 in W302, on Wednesday 5/2 (reading day). I'll answer questions, and also give solutions to the written hw5 problems. See images in this directory, for solutions as presented on the blackboard. Some remarks: I omit 3(b), since I have decided not to ask you to design a KMP NFA on the exam, not even as "extra credit". The point being, it was not taught well (our only proper description was the code). You still might be expected to know that a KMP NFA uses only O(M) space (for a pattern of length M), compared to a KMP DFA which needs space O(MR). They both need only O(N) time to search a text of length N. For 4(c), you only had to figure out the number of bits, but I went ahead and worked out the string as well. Of course the details of the string depend on trie you chose, but the length is always 59 bits. Here are a few solutions rendered in ASCII (view in fixed-width font): 1(a) no pa ai is pe al th of co ti th fo fo th go al th is go ai no pe ti of to al pa co co pe to fo th th go th ai no th of to ti th to to pa is to 1(b). Original: now is the time for all good people to come to the aid of their party After first pass (sort on char 0), we get these groups. Note groups of size one are "done", they need no more work. all aid come for good is now of people party the time to to the their Note we may name groups by the prefix we know they have in common. Our unfinished groups above are "a", "p", and "t". After one more recursive level of sorting (on the second char), groups "a" and "p" are finished (they split into groups of size one). Group "t" splits into "th", "ti", and "to", of which only "ti" is finished: aid all party people the the their time to to In the next step (the third char, at index 2), we realize the "to" group is finished, because they both terminate (we imagine a terminator char value of "-1", one past the end of the string). The "th" group does not split at all, but it becomes the "the" group (all the strings had an "e"): to [saw terminator, group is done] to the the their In one more step, we find the two strings that both terminate as "the", and the third string that differs with an "i". the [saw terminator, group is done] the their Remark: I presented the MSD operations here "level by level", but really MSD operates by a typical "depth first" recursion. For example, it would completely finish sorting the "a" and "p" groups, before taking its first look at the "t" group. 1(c). Here we indicate the chars that were not inspected by MSD in uppercase: aiD alL cOME fOR gOOD iS nOW oF paRTY peOPLE the the theiR tiME to to 3(a). Here is the DFA transition table in tabular form, this is really just another way to present the state diagram. A B 0 0 1 1 2 1 2 0 3 3 2 4 4 5 1 5 0 6 6 7 4 7 8 3 Recall 0 is the start state, and 8 is the goal state. If we ever reach 8, we know we have just found the pattern. 4(a,b). Note there were many other ways we could have built our Huffman code trie, but in part (b) we should always end up with a bit string of 34 bits. 4(c). Our code trie can be encoded in 59 bits. Note this depends only on the number of leaves in the trie, not its shape. Here is the string for our particular trie (see also in 4c_extra.jpg): 0 0 1 'A' 1 'B' 0 1 'O' 0 0 1 '!' 1 'Y' 1 'D' Each 'X' denotes an ASCII char in 8 bits. For example 'A' is 65, or 01000001. I will not expect you to know ASCII codes for the exam. 5. On the blackboard, I did not make it very clear when each code is discovered by the compressor, versus the decompressor (or expander). Here it is, presented a bit more clearly. 5(a). LZW compressor step transmits adds to table 1. 41 = A 81 = AB 2. 42 = B 82 = BA 3. 81 = AB 83 = ABA 4. 83 = ABA 84 = ABAA 5. 84 = ABAA 85 = ABAAB 6. 42 = B nothing 7. 80 = end nothing 5(b). LZW decompressor step receives adds to table 1. 41 = A nothing 2. 42 = B 81 = AB 3. 81 = AB 82 = BA 4. 83 = ? 83 = ABA (by tricky case, AB+A) 5. 84 = ? 84 = ABAA (by tricky case, ABA+A) 6. 42 = B 85 = ABAAB 7. 80 = end nothing In steps 4 and 5, the decompressor figures out the received string by the "tricky case": the code must stand for the previously received string, plus its first letter.