5CCS2PEP Scala Coursework
1. You are asked to implement a recursive function that calculates the number of steps needed until a series ends with 1. In case of starting with 6, it takes 8 steps and in case of starting with 9, it takes 19 (see above). We assume it takes 0 steps, if we start with 1. In order to try out this function with large numbers, you should use Long as argument type, instead of Int. You can assume this function will be called with numbers between 1 and 1 Million.
2. Write a second function that takes an upper bound as an argument and calculates the steps for all numbers in the range from 1 up to this bound(the bound including). It returns the maximum number of steps and the corresponding number that needs that many steps. More precisely it returns a pair where the first component is the number of steps and the second is the corresponding number.
3. Write a function that calculates hard numbers in the Collatz series-these are the last odd numbers just before a power of two is reached. For this, implement an is-power-of-two function which tests whether a number is a power of two. The easiest way to implement this is by using the bit-operator & of Scala. For a power of two, say n with n > 0, it holds that n & (n - 1) is equal to zero. I let you think why this is the case. The function is-hard calculates whether 3n + 1 is a power of two. Finally the last-odd function calculates the last odd number before a power of 2 in the Collatz series. For example for 113 we obtain 85, because of the series: 113,340,170,85,256,128,64,32,16,8,4,2,1
4. Implement a function that 'cleans'a string by finding all(proper) words in the string. For this use the regular expression \w+ for recognising words and the library function findAllIn. The function should return a document (a list of strings).
def clean(s: String) : List[String] = ???
5. In order to compute the overlap between two documents, we associate each document with a Map. This Map represents the strings in a document and how many times these strings occur in the document. A simple(though slightly inefficient) method for counting the number of string occurrences in a document is as follows: remove all duplicates from the document; for each of these (unique) strings, count how many times they occur in the original document. Return a Map associating strings with occurrences. For example
occurrences(List("a","b","b","c","d")) producesMap(a ->1, b-> 2, c -> 1, d -> 1) and
occurrences(List("d","b", "d", "b","d")) produces Map(d -> 3, b -> 2)
6. You can think of the Maps calculated under (2) as memory-efficient representations of sparse "vectors". In this subtask you need to implement the product of two such vectors, sometimes also called dot product of two vectors.
For this dot product, implement a function that takes two documents(List[string]) as arguments. The function first calculates the (unique strings in both. For each string, it multiplies the corresponding occurrences in each document. If a string does not occur in one of the documents, then the product for this string is zero. At the end you need to add up all products.
7. Implement first a function that calculates the overlap between two documents, say d1 and d2, according to the formula:
overlap(d1, d2) = d1 * d2 / max(d1^2, d2^2)
You can expect this function to return a Double between 0 and 1. Second, implement a function that calculates the similarity of two strings, by first extracting the substrings using the clean function from (1) and then calculating the overlap of the resulting documents.
咨询 Alpha 小助手,获取更多课业帮助