Lab03: Intermediate Python Programs

Brief Honor Code. Do the homework on your own. You may discuss ideas with your classmates, but DO NOT copy solutions from someone else or from the Internet. If you get stuck, talk to a TA.

1. (50 points)

Write separate toolz pipelines to generate the following variables:

  • words: a list of all the words in the files fortune?.txt in the data directory
  • reverse_index: a reverse index of words (key=position, value=word)
  • index: an index of words (key=word, value=position)
  • cat: a list containing the categorical encoding of words

Finally, use numpy to convert cat into a one-hot matrix with shape (#words, #unique words)
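
To make the target data structures concrete, here is a minimal sketch on a toy word list. It is not a toolz pipeline and does not read the fortune files: the list `words` below is made-up data, `unique_words` is a helper name introduced only for illustration, and numbering words in order of first appearance is an assumption (the exercise does not fix an ordering).

```python
import numpy as np

# Toy word list standing in for the contents of the fortune files (made-up data).
words = ["the", "cat", "saw", "the", "dog"]

# Assumption: each distinct word gets a position in order of first appearance.
unique_words = list(dict.fromkeys(words))
index = {word: pos for pos, word in enumerate(unique_words)}   # word -> position
reverse_index = {pos: word for word, pos in index.items()}     # position -> word

# Categorical encoding: replace each word by its position.
cat = [index[word] for word in words]

# One-hot matrix: row i of the identity matrix is the one-hot vector for code i,
# so indexing the identity with cat yields shape (#words, #unique words).
one_hot = np.eye(len(index), dtype=int)[cat]

print(cat)
print(one_hot.shape)
```

Indexing an identity matrix with the list of codes is one common way to build the one-hot matrix; every row contains exactly one 1, located in the column given by the corresponding entry of `cat`.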


2. (50 points)

Write a simulation of diffusion-limited aggregation. In this simulation, we have \(n\) random walkers. Each walker starts from row 0 at a random column, and in each step the walker increases its row number by 1 and randomly increments or decrements its column number by 1. If the walker's column number exceeds the maximum or becomes negative, the walker re-emerges on the other side (toroidal boundary conditions). At any time, if any of the walker's 8 neighbors is non-zero, the walker stops in that position, and the number of steps taken is recorded at that (row, column).
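
The two building blocks described above, a single step and the neighbor test, might be sketched as follows. This is only an illustration under stated assumptions: the helper names `step` and `has_occupied_neighbor` are hypothetical, the grid is assumed to be indexed as `grid[row, column]`, and the neighbor test is assumed to wrap around in the column direction just as the walker does (the text only specifies wrapping for the walker itself).

```python
import numpy as np

def has_occupied_neighbor(grid, row, col):
    """Return True if any of the 8 neighbors of (row, col) is non-zero."""
    nrows, ncols = grid.shape
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            r = row + dr
            if r < 0 or r >= nrows:
                continue  # rows do not wrap; ignore neighbors off the grid
            c = (col + dc) % ncols  # assumption: neighbor columns wrap around
            if grid[r, c] != 0:
                return True
    return False

def step(row, col, ncols, rng):
    """Move one row down and one column left or right, wrapping the column."""
    dc = int(rng.choice((-1, 1)))
    return row + 1, (col + dc) % ncols

# Small demonstration on a 4 x 5 grid whose last row is occupied.
grid = np.zeros((4, 5), dtype=int)
grid[-1, :] = 1
rng = np.random.default_rng(123)
row, col = step(0, 0, grid.shape[1], rng)
print(row, col, has_occupied_neighbor(grid, row, col))
```

The modulo operation handles both edges at once: a column of -1 maps to the last column, and a column equal to the number of columns maps to 0.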

Write a function dla(nwalkers, width, height, seed) that returns a matrix with shape (width, height) after running nwalkers random walks as described above. The argument seed is used to seed the random number generator. Internally, the function should create a (width, height+1) matrix and initialize the last row to 1, with all other entries 0.

Feel free to use loops. This function is not easily vectorized.

Plot the returned matrix for the arguments nwalkers=10000, width=300, height=150, seed=123. It should look like this:

[Figure: dla]
