Unit 3 - Files, Algorithms (Searching and Sorting), 2D Arrays and Recursion

Files

File I/O can easily be understood if we break it down into writing to and reading from. The following examples assume that there is a file called foo.txt in the same directory as your program.

Notice the case sensitivity of the file name. The contents of foo.txt are 4 lines. Please recreate this in a text editor. Make sure that you press enter after each line you type.

hello

dog

bye

cat

Chapter 7 of the Python docs outlines file I/O in Python. I'm using it as the main source for this lesson.

HEED THIS:

Basic Principle #1

CLOSE THE FILE AFTER YOU'RE DONE WITH IT - AS SOON AS POSSIBLE. CORRUPTION EXISTS.

Basic Principle #2

KEEP IN MIND THAT YOU ARE USING TEXT FILES IN THESE EXAMPLES. BINARY FILES EXIST, AND MUST BE HANDLED DIFFERENTLY. CORRUPTION EXISTS.

Reading FROM a file

Reading from a file is very simple in Python.

f = open('foo.txt', 'r')

open() returns a file object. The first argument is a string containing the filename.

The second argument describes the mode:

'r' when the file will only be read

'w' for only writing (an existing file with the same name will be erased)

'a' opens the file for appending; any data written to the file is automatically added to the end.

'r+' opens the file for both reading and writing.

The mode argument is optional; 'r' will be assumed if it’s omitted.

HEED THIS:

Normally, files are opened in text mode, that means, you read and write strings from and to the file, which are encoded in a specific encoding. If encoding is not specified, the default is platform dependent (see open()). 'b' appended to the mode opens the file in binary mode: now the data is read and written in the form of bytes objects. This mode should be used for all files that don’t contain text.

In text mode, the default when reading is to convert platform-specific line endings (\n on Unix, \r\n on Windows) to just \n. When writing in text mode, the default is to convert occurrences of \n back to platform-specific line endings. This behind-the-scenes modification to file data is fine for text files, but will corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files.

Methods of File Objects

f.read()    #reads the content of a file.  Returns it as a string.

            #When the end of the file is reached, an empty string '' is returned

f.readline()    #reads a single line. Returns an empty string '' at end of the file

For loops and File Objects

f = open('foo.txt', 'r')

for line in f:

    print(line, end='')

f.close()

Fast and simple ... and no newline characters to deal with ('\n').

Close the file - ALWAYS.

f.close() #free the system's resources.  MUST BE DONE AFTER READ AND WRITE. IMMEDIATELY.

Ideas To Explore

Test these two functions. What do they do? Compare them.

list(f)    #this might be very cool...

f.readlines()    #could this be even better?

The Safest Way to Read

with open('foo.txt', 'r') as f:

    for line in f:

        print(line, end='')    # or try this: print(line) OR print(line.rstrip('\n'))

Using with will automatically call close(). I strongly suggest with as your only choice.

Notice the print(). Try the commented line. Which version is easier for you to work with? There are several ways to strip the newline character ('\n'). Choose one that works for you.

Reading data from a file into a list

with open('foo.txt', 'r') as f:

    my_list = f.read().splitlines()

print(my_list)

Writing to a list is effective way to store what you've read from the file.

Writing TO a file

When you open a file in 'w', Python will erase the contents before it begins writing. Append, 'a', allows writing to happen after the last line of the file.

write() will only accept strings. You may have to cast your data first.

f.write(string) writes the contents of string to the file. It will return the amount of characters written.

f = open('workfile', 'w')    #But what TYPE of file is workfile?

f.write('This is a test\n')

f.close()

The file that was written will be located in the same directory as your source. Please ADD AN EXTENSION (.txt) to the file you are creating. This will help your OS with recognition, it doesn't effect the actual type.

f = open('workfile.txt', 'w')    #Much better.

f.write('This is a test\n')

f.close()

The Safest Way to Write

with open('baa.txt', 'w') as f:

    f.write('Hello')

with open('baa.txt', 'r') as f:

    print(f.read())

Looks like the writing worked.

Appending to a file

with open('Groceries.txt', 'w') as f:

    f.write('Bacon')

with open('Groceries.txt', 'a') as f:

    f.write('Eggs')

with open('Groceries.txt', 'r') as f:

    print(f.read())

That worked, but it's not much of a list. My items are all on the same line. I need to add a newline character '\n'...

Adding the newline character

with open('Groceries.txt', 'w') as f:   #NOTE THE MODE

    f.write('Bacon' + '\n') #concatenate a newline character

with open('Groceries.txt', 'a') as f:   #NOTE THE MODE

    f.write('Eggs') #write your string

    f.write('\n')   #write a newline character

with open('Groceries.txt', 'a') as f:   #NOTE THE MODE

    f.write('Coffee' + ' ') #concatenate a space

with open('Groceries.txt', 'a') as f:   #NOTE THE MODE

    f.write('and something else...')

with open('Groceries.txt', 'r') as f:

    print(f.read())

Adding the newline character isn't hard, and clearly there are two ways you can do it. Note that this example is broken down into MULTIPLE WRITE CYCLES. This is not efficient. It was done so that you could see each line, one at a time.

If I strip the inefficient multiple write cycles out, it would look like this:

with open('Groceries.txt', 'w') as f:   #NOTE THE MODE

    f.write('Bacon' + '\n') #concatenate a newline character

    f.write('Eggs') #write your string

    f.write('\n')   #write a newline character

    f.write('Coffee' + ' ') #concatenate a space

    f.write('and something else...')

with open('Groceries.txt', 'r') as f:

    print(f.read())

Could this be even better? Probably. What if I had a list holding my items? Wait, I should save idea that for an assignment...

A3.1

Dealing with Files

Dealing with files is a small part of either the CCC or ECOO contests.

These were originally written by Mr. Reid in Java, and I archived them in my ICS Files page. I re-wrote them in Python.

All these examples deal with numbers, and assume that the data file and the code are in the same directory. They all share similar code, your individual mileage may vary.

For CCC Junior input() ideas, please keep reading past these examples:

FileInputExample1.py deals with a vertical row of numbers.

# FileInputExample1.py

# NEEDS: External file called dataE1.txt to be in the same

#       directory as FileInputExample1.py. That file looks like:

# 1

# 2

# 3

# 4

#dump file to screen

with open('dataE1.txt', 'r') as f:

    print(f.read())

#read each line of file, split each line by the end of line character, add to a list

with open('dataE1.txt', 'r') as f:

    my_list = f.read().splitlines()

print(my_list)  #note this is a list of strings

#convert list of strings to ints, start loop at 0, go to its length

for counter in range(0, len(my_list)):

    my_list[counter]=int(my_list[counter])  #cast for the conversion

print(my_list)  #note this is a list of ints

FileInputExample2.py deals with a horizontal row of numbers delimited with a space character.

# FileInputExample2.py

# NEEDS: External file called dataE2.txt to be in the same

#       directory as FileInputExample2.py. That file looks like:

# 1 2 3 4

#dump file to screen

with open('dataE2.txt', 'r') as f:

    print(f.read())

#read each line of file, split by whitespace, add to a list

with open('dataE2.txt', 'r') as f:

    my_list = f.read().split()

print(my_list)  #note this is a list of strings

#convert list of strings to ints, start loop at 0, go to its length

for counter in range(0, len(my_list)):

    my_list[counter]=int(my_list[counter])  #cast for the conversion

print(my_list)  #note this is a list of ints

FileInputExample3.py deals with a horizontal row of numbers that has no delimiter.

# FileInputExample3.py

# NEEDS: External file called dataE3.txt to be in the same

#       directory as FileInputExample3.py. That file looks like:

# 1234

#dump file to screen

with open('dataE3.txt', 'r') as f:

    print(f.read())

#read a line of file, add to a list

with open('dataE3.txt', 'r') as f:

    my_list = f.readline().split()

print(my_list)  #note this is a list with 1 string

my_string = my_list[0] #move the string out of the array and into a string

print(my_string) #see the string

#start loop at 0, end at length of the string

for counter in range(0, len(my_string)):

    digits = int(my_string[counter:counter+1]) #slice from loop counter position to loop counter position + 1, then cast that slice to an int

    print(digits) #individual digits

FileInputExample4.py deals with multiple lines of space delimited data.

# FileInputExample4.py

# NEEDS: External file called dataE4.txt to be in the same

#       directory as FileInputExample4.py. That file looks like:

# 1 2 3 4

# 5 2 91 21

# 34 12 6 3

#dump file to screen

with open('dataE4.txt', 'r') as f:

    print(f.read())

#read each line of file, split by whitespace, add to a list

with open('dataE4.txt', 'r') as f:

    my_list = f.read().split()

print(my_list)  #note this is a list of strings

#convert list of strings to ints, start loop at 0, go to its length

for counter in range(0, len(my_list)):

    my_list[counter]=int(my_list[counter])  #cast for the conversion

print(my_list)  #note this is a list of ints

FileInputExample4.py will also handle a file with multiple lines of space delimited data, but the first line of the file tells you how big the file is. They tend to look like this:

# 3

# 1 2 3 4

# 5 6 7 8

# 99 88 77 66

This last example comes from the University of Waterloo. It's part of their CCC prep page.

#   Canadian Computing Competition

#    Example program to demonstrate input and output and time limit.

#    Programming Language:  Python 3.x

#    Specification:

#       Write a program that reads several positive integers, one per

#       line.  For each integer n, output the number of orderings

#       possible for a set of n distinct values.  n will not exceed 11.

#       The last line of input is indicated by 0.

#    Sample Input:

#       1

#       9

#       0

#    Output for Sample Input:

#       1

#       362880

#    Implementation:

#       The answer is n! (n factorial) which is easily computed in n steps.

#       But this program does it the hard way.  It uses a recursive function

#       to enumerate and count all the possible orderings.

#    How to run the program:

#       The program reads from "standard input" and writes to "standard output."

#       Specifically, you can  this program with input by typing at the

#       DOS command prompt (yes, you will need a Command Prompt window)

#       in the correct directory (where you created the test.py2 file):

#                 python test2.py < input.txt

#       where input.txt contains the test data.

#    Run-time:

#       Please time the execution time of this program on your computer,

#       using the sample input.  This time is the maximum that should be

#       allowed for any CCC program.  Note that this solution will work

#       only up to problems of size 9 in the required time limit

def countfact(s, n, total):

  if (n == 0):

    # uncomment to print out each ordering

    # for j in range(total):

    #   print(s[j]," "),

    # print()

    return 1;

  r = 0

  for i in range(n):

    t = s[i]

    s[i] = s[n-1]

    s[n-1] = t

    r += countfact(s,n-1,total)

    s[n-1] = s[i]

    s[i] = t

  return r

######################################

## start of main function

set = [0]*11;

# initialize the set of values

for i in range(11):

  set[i] = i+1;

n = 1

while (n > 0):    #  0 would do nothing, and -1 is nothing to read

  n = int(input())

  if (n>0):

    print(countfact(set,n,n))

######################################################################

## For comparison, here is a slightly more efficient solution

##

##import itertools  # allows importing of permutations

##

##n = 1

##while (n > 0):    #  0 would do nothing, and -1 is nothing to read

##  n = int(raw_input())

##  if (n>0):

##    all_permutations = list(itertools.permutations(range(n)))

##    # uncomment to see all the permutations

##    #

##    #for perm in all_permutations:

##    #  print(perm)

##

##    print(len(all_permutations))

##

CCC Junior input() vs File Input

When you look up Junior questions and solutions (J1, J2, etc) , the INPUT data files are named j1.1.in, j1.2.in, etc.

j1 is the name of the question

1 is which scenario your question is being tested on

in indicates this is the file with the INPUT

Junior questions don't always need file input. The CCC Grader uses the input() in that case.

Here are 3 solutions written by Tasho M. I mangled his code to prove a point about the CCC Grader:

CCC 2018 Junior (Post-Contest): Telemarketer or not? Solution 1

#my_list = [0,0,0,0]

my_list = []

x = input()

my_list.append(x)

x = input()

my_list.append(x)

x = input()

my_list.append(x)

x = input()

my_list.append(x)

#my_list[0] = input()

#my_list[1] = input()

#my_list[2] = input()

#my_list[3] = input()

if my_list[0] == '8' or my_list[0] == '9': #first

    first = True

else:

    first = False

if my_list[3] == '8' or my_list[3] == '9': #last

    last = True

else:

    last = False

if my_list[1] == my_list[2]: #middle

    mid = True

else:

    mid = False

if first and mid and last:

    print('ignore')

else:

    print('answer')

CCC 2018 Junior (Post-Contest): Telemarketer or not? Solution 2

my_list = [0,0,0,0]

my_list[0] = input()

my_list[1] = input()

my_list[2] = input()

my_list[3] = input()

if my_list[0] == '8' or my_list[0] == '9': #first

    first = True

else:

    first = False

if my_list[3] == '8' or my_list[3] == '9': #last

    last = True

else:

    last = False

if my_list[1] == my_list[2]: #middle

    mid = True

else:

    mid = False

if first and mid and last:

    print('ignore')

else:

    print('answer')

CCC 2018 Junior (Post-Contest): Telemarketer or not? Solution 3

#with open('j1.1.in', 'r') as f:

#    my_list = f.read().splitlines()

#for counter in range(0, len(my_list)):

#    my_list[counter]=int(my_list[counter])

my_list = [0,0,0,0]

for counter in range(4):

    my_list[counter]=input()

if my_list[0] == '8' or my_list[0] == '9': #first

    first = True

else:

    first = False

if my_list[3] == '8' or my_list[3] == '9': #last

    last = True

else:

    last = False

if my_list[1] == my_list[2]: #middle

    mid = True

else:

    mid = False

if first and mid and last:

    print('ignore')

else:

    print('answer')

In the third (last) example, you can see where I commented out the file I/O. It's because the CCC Grader was hanging on the file read.

So, even though the CCC gives you test files like these:

#j1.1.in

#j1.2.in

The CCC Grader doesn't want file input, it just relies on the input().

Good Luck!

Problem Set 4

Designing Algorithms using Searching and Sorting

When we speak about data and usable information, the terms search, sort and algorithm come up frequently. An algorithm is a series of steps used to solve a problem. Those steps can be logical, or computational. A search is the act of looking for data in a data set. Sorting refers to arranging a data set so that it can be searched efficiently. Python has very robust methods for searching and sorting. sort(), sorted() and bisect / bisect.bisect_left() are all there for a programmer to use. This unit is about how those algorithms are written.

Let's think about this for a bit ...

WHAT'S THE FASTEST WAY TO ALPHABETIZE YOUR BOOKSHELF?

Linear Search

Linear searches are systematic. They do not require the data to be sorted before they run. A linear (sequential) search is a method for finding a target value within a list. It sequentially checks each element of the list for the target value until a match is found or until all the elements have been searched. Imagine being in a classroom with 30 students, and you were looking for Bob. A linear approach would involve you asking every student in the room what their first name was. You could start at the end of the room, or the front of the room, but you would still be systematically asking everyone.

I've used a return statement to stop my linear search at the first occurrence.

dataset = [12,11,76,99,16,76]

def linearSearchA(data, lookfor):

    position = 0

    for element in data:

        if element == lookfor:

            return position

        position += 1

whereisit = linearSearchA(dataset, 76)

print('Location: ' + str(whereisit))

https://pythonschool.net/data-structures-algorithms/linear-search/

Use the code in the video as your template.

Binary Search

Binary search is a search algorithm that finds the position of a target value within a sorted array. It compares the target value to the middle element of the array; if they are unequal, the half in which the target cannot lie is eliminated and the search continues on the remaining half until it is successful.

Suppose you were looking for Bob again. If the teacher had put the class in alphabetical order of first names, your search would be faster. Binary searches efficiently look for data in lists that are ordered. Just like modular programming, binary searches divide and conquer the data.

Assume you are looking for 39 in the following data set.

8 13 17 18 20 23 26 27 28 29 39 41 40 45 46 50

A binary search would begin in the middle of the data. The middle item, 27, is not what we are looking for. But because we know the data is ordered, and that 27 is less than the value we are looking for, we can eliminate all items in the lower half of the list. Our binary search can focus on the remaining data.

Let's repeat that strategy with the remaining data:

28 29 39 41 40 45 46 50

Becomes:

28 29 39

Why? Because we started at the half way point of the data set, and compared that value to the value we are looking for, 39. Because 41 is greater than 39, we eliminate all the values in ordered data set above and including 41.

Let's repeat that strategy with the remaining data:

28 29 39

Becomes:

Why? Because we started at the halfway point of the data set, and compared that value to the value we are looking for, 39. Because 29 is less than 39, we eliminate all the values in the ordered data set below and including 29.

Let's repeat that strategy with the remaining data:

Becomes:

Why? Because it's a good idea to check. This leaves us with 39, the value we were looking for.

In four passes, we have found our data. This would have taken a linear search as many passes as there are elements.

A binary search requires knowing the starting point and ending point of the array. Using these two numbers, the program will be able to average out the middle of the array. Consider this, if the item in the middle is what we are looking for, we are done. If the item is smaller, we can drop the value in the middle and all the values below it. This can be expressed as starting point becomes middle + 1. Similarly, if the value located at the middle is larger than the item, we drop all the positions above it by setting the end point to middle -1.

Assume you are looking for 20:

8 13 17 18 20 23 26

0 1 2 3 4 5 6

In the first pass, the start point is index 0, and the end point is index 6.

Middle would be (0 + 6) / 2 = 3.

Because 18 is less than 20, we drop the left half of the array by setting end to middle +1.

20 23 26

4 5 6

The middle of the remaining data is (4 + 6) / 2 = 5

Because 23 is greater than 20, we drop the right half of the array by setting start to middle - 1.

Now the search ends with start, middle and end all the same value.

https://pythonschool.net/data-structures-algorithms/binary-search/

Use the code in the video as your template.

Sorting

Binary searches work efficiently when the data has been presorted. Let's consider a few ways of accomplishing this task.

Insertion Sort

The following scenario deals with playing cards. Assuming you have been dealt a hand of five cards, and the suits do not matter, you could be looking at this sequence of numbers in you hand:

8 5 7 10 4

One way to sort your hand is by starting on the left hand side, and rearranging the cards if necessary. The moved card will be placed to the left unsorted cards. In this example the original hand becomes,

5 8 7 10 4

The next card to be moved would be the 7. GIving us,

5 7 8 10 4

At this point we can see that the first 3 cards are in order. Looking at the 4th card, the 10 we see that it is greater than the other values to its left, so we pass it and look at the last card, 4.

4 5 7 8 10

Use this code from Problem Solving with Algorithms and Data Structures using Python.

Specifically, Chapter 5.9 The Insertion Sort

def insertionSort(alist):

    for index in range(1,len(alist)):

        currentvalue = alist[index]

        position = index

        while position>0 and alist[position-1]>currentvalue:

            alist[position]=alist[position-1]

            position = position-1

        alist[position]=currentvalue

alist = [54,26,93,17,77,31,44,55,20]

insertionSort(alist)

print(alist)

~~https://pythonschool.net/data-structures-algorithms/insertion-sort/~~

~~Use the code in the video as your template.~~

Selection Sort

In the insertion sort, the data is constantly being moved. This can pose performance issues. The selection sort reduces the amount of data movement. Like the insertion sort, data is tested and moved based on its value. Unlike an insertion sort, once data is move, it is never moved again. We look for the largest data, place it at the top of the list and exchange it with the item that was originally there. The number of passes needed to sort the items is one less than the number of items.

Given the following data:

9 3 11 5 7 6

We find the largest element 11, and swap it with 6 (the end),

9 3 6 5 7 11

On the subsequent passes, all items except the last are tested.

9 3 6 5 7 11

7 3 6 5 9 11

5 3 6 7 9 11

5 3 6 7 9 11

3 5 6 7 9 11

Bubble Sort

In a bubble sort, the data being sorted is compared to the adjacent values and then exchanged if they are not in order. The number of passes required to sort the data is one less than all the items.

Given the following data:

Pat Isabell Sam Joe Greg

Starting on the left, we compare name 1 with name 2, and swap them because P is larger than I.

Isabell Pat Sam Joe Greg

Next we compare name 2 and name 3, because they are in order, they are left:

Isabell Pat Sam Joe Greg

Next we compare name 3 and name 4, and swap them because S is larger than J.

Isabell Pat Joe Sam Greg

Next we compare name 4 and name 5, and swap them because S is larger than G.

Isabell Pat Joe Greg Sam

Notice that the largest value in our data, S (Sam) has bubbled to the top of our list. The algorithm must pass over the data three more times until it finishes sorting.

Clearly the bubble sort is slower than either the selection or the insertion sort.

https://pythonschool.net/data-structures-algorithms/bubble-sort/

Use the code in the video as your template.

Algorithm Conditions

When you are thinking about designing your algorithm, consider the following idea: writing your conditions down.

What are the conditions?

Algorithm's, no matter how simple or complicated, all have precondition (i.e., starting state) and a postcondition (i.e., ending state).

Prof. F. Pitt (U of T) describes preconditions and postconditions very simply in these lecture notes:

---------------------

Algorithm Correctness

---------------------

Preconditions/postconditions:

 -  "Precondition": statement specifying what conditions must hold _before_

    an algorithm is executed (i.e., describes valid inputs).

 -  "Postcondition": statement specifying what conditions hold _after_ an

    algorithm executes (i.e., describes expected output).

 -  In general, we want weakest reasonable precondition (i.e., put as few

    constraints as possible, only specify what is strictly necessary) and

    strongest reasonable postcondition (i.e., specify as much as possible).

A3.2, A3.4C2.1

Problem Set 5

Two Dimensional Arrays, aka Multidimensional Lists, aka 2D Arrays

Preamble

You know how I always say, "array", when we're talking about 'lists'?

I'm going to do that again.

Array

An array is series of elements that can be accessed with the name of the array and an index. In ICS3U, arrays were used to store data. Please take a look at the String or the Lists, aka Data Structures, aka Arrays notes for a refresher.

foo = [1,2,3]

2D Array

A two dimensional array (in Python, a multidimensional list) is called a '2D array' for short. It is an array, holding an array, or in Python, a list holding a list.

foo = [[ , , ],[ , , ]]    #Please don't try this...just look at the empty lists

Navigating a 2D array

You're going to have to think in rows and columns when you want to navigate a multidimensional list. Each element will be accessed by using a coordinate.

bookcase = [['Graphic Novels', 'Small Board Games', 'Large Board Games'],['Novels','Picture Frames', 'Textbooks']]

print(bookcase) #proof that the 2D array bookcase exists.  Data not really usable

print(bookcase[0])  #print the first row. Data not really usable

print(bookcase[1])  #print the second row. Data not really usable

print(bookcase[0][1])   #print Small Board Games. Notice the data OUTPUT.

print(bookcase[1][0])   #print Novels. Notice the data OUTPUT.

How did we get to Novels, or Small Board Games? We used a rows and columns based coordinate system. Here's the pseudocode:

arrayname[arrayrow][arraycol]

Building a 2D array from several 1D arrays

shelf1 = ['Graphic Novels', 'Small Board Games', 'Large Board Games']

shelf2 = ['Novels','Picture Frames', 'Textbooks']

bookcase = [shelf1, shelf2] #create a new 2D array from 2 1D ones

print(bookcase)

Building a 2D array, element by element

#not exactly elegant, but it illustrates the point:

bookcase = [['','',''],['','','']]

bookcase[0][0] = 'Graphic Novels'

bookcase[0][1] = 'Small Board Games'

bookcase[0][2] = 'Large Board Games'

bookcase[1][0] = 'Novels'

bookcase[1][1] = 'Picture Frames'

bookcase[1][2] = 'Textbooks'

print(bookcase)

An alternative way to declare a 2D array, for readability.

bookcase = [

    ['Graphic Novels', 'Small Board Games', 'Large Board Games'],    #each row, separated

    ['Novels','Picture Frames', 'Textbooks']

print(bookcase)

Empty Lists and List Comprehensions

#This first example shows how the 1D array can be filled with a for loop

#Eventually, I'll show you how to use List Comprehensions to do this

empty1Darray = []   #build a 1D array with no size

print(empty1Darray) #what's in the array? Not much - no elements.

for x in range(5):  #use a for loop to append a blank element to the array

    empty1Darray.append('')

print(empty1Darray) #looks like our array has entries

#Using List Comprehensions

anotherempty1Darray = [None for x in range(5)]  #'null' in Python, is None

#what happened? The for loop ran 5 times, and added the value None to the array

print(anotherempty1Darray)  #looks like our array has entries

yetanotherempty1Darray = ['' for x in range(5)]  #use '' character

#what happened? The for loop ran 5 times, and added the value '' to the array

print(yetanotherempty1Darray)  #looks like our array has entries

#When List Comprehensions really shine

squares = [x**2 for x in range(10)] #There's even more in the Python Docs...

print(squares)

#When List Comprehensions are your friend

empty2Darray = [[0 for i in range(5)] for i in range(5)]    #List Comprehensions magic

print(empty2Darray)

empty2Darray[4][4] = 2  #let's modify one position

print(empty2Darray) #see which element is modified?

#When multiplication isn't your friend

anotherempty2Darray = 5*[5*[0]] #looks good, right?....

print(anotherempty2Darray) #looks good....all empty, right size.....

anotherempty2Darray[4][4] = 2  #let's modify one position

print(anotherempty2Darray) #see which element is modified? Bad, right?

#What happened?  We made 5 copies of the same list, so when you change one...

#you change them all.

Printing 2D lists

This came from Simplest Solutions. Other interesting 2D solutions are provided.

for row in sampleList:

   for element in row:

      print(element, end=" ")

   print()

A3.5

Problem Set 6

Recursion

To understand recursion, you must first understand recursion.

Weird, huh?

If you look closely, you're going to begin to understand recursion.

Maybe you should google, 'recursion'.

A definition of recursion is, the repeated application of a recursive procedure or definition.

In computer science, recursion is a method where the solution to a problem depends on solutions to smaller instances of the same problem (as opposed to iteration (loops)).

A. Downey writes in Think Python:

"It is legal for one function to call another; it is also legal for a function to call itself. It may not be obvious why that is a good thing, but it turns out to be one of the most magical things a program can do."

This problem solving technique can otherwise be understood in this manner:

Recursion, in mathematics and computer science, is a method of defining functions in which the function being defined is applied within its own definition (http://en.wikipedia.org/wiki/Recursion).

I like to think of it as a function calling itself. That sets the programmer up for a problem called infinite recursion - recursion that will not end - like the mirror picture. In order for the recursive calls to stop, there needs to be a base case, but more about that later.

Or,

"In order to understand recursion, one must first understand recursion."

Let's see how this works in computer science.

The following information comes from http://www.cs.umd.edu/class/spring2002/cmsc214/Tutorial/recursion.html

Solving a "big" problem recursively means to solve one or more smaller versions of the problem, and using those solutions of the smaller problems to solve the "big" problem.
Most people go through three phases when learning recursion. First, they hate it, because they can't understand it. Then, they love it because they understand this mysterious process. Then, they hate it because they think it's inefficient.

The Four Steps to Understanding Recursion

Step 1: Write and define the prototype of the function.

Since functions are the basic unit of recursion, it's important to know what the function does. The prototype you use will dictate how the recursion behaves.

Let's look at an example. Here's a function which will sum the first n elements of an array.

#Sums the first n elements of the array, arr

def sum(arr, n):

Step 2: Write out a sample function call

Once you've determined what the function does, then we imagine a function call.

result = sum(arr, n)

So, the call is sum(arr, n). This will sum the first n elements of arr. Pick the most generic function call. For example, you don't want to have a call like:

result = sum( arr, 10 )

That's picking a constant. You want to use variables when possible, because that's the most general way to call the function.

Step 3: Think of the smallest version of the problem

The smallest version is called the base case. Most people mistakenly pick a base case that's too large. In this case, you will pick a specific value for n.

So, what's the smallest version of the problem? Here are three choices:

sum(arr, 2)    #Choice 1

sum(arr, 1)    #Choice 2

sum(arr, 0)    #Choice 3

Some people pick Choice 1, reasoning that if you are to sum elements of an array, then you must have at least two elements to sum. However, that's really not necessary. In math, there is something called a "summation". It's perfectly valid to have a summation of only one element. You just return that one element.

Some people pick Choice 2, because it doesn't make sense to sum an array of size 0, whereas an array of size 1 seems to make sense.

However, it turns out Choice 3 is the smallest choice possible. You can sum zero elements of an array. What value should it return? It should return 0. As it turns out, 0 is the additive identity. Anything added to 0 is that number. If we wanted to multiply all elements of an array, we would have picked the multiplicative identity, which is 1.

Admittedly, picking such a small value requires a more extensive knowledge of math, but that shouldn't scare you from picking the smallest value possible.

There are several mistakes people make with a base case.

The first one we've already mentioned: picking too large a base case.

Second, not realizing there may be more than one base case.

Finally, thinking that the base case only gets called when the input size is the smallest.

In fact, the recursion ALWAYS makes it to some base case.

Thus, the base case is where the recursion eventually stops. Don't think of it as merely called when the input is, say, 0. It gets called for all cases (eventually).

Step 4: Think of smaller versions of the function call

Here's the function call

sum(arr, n)    #sums first n elements of arr

It tries to solves a problem of size "n". We want to think of a smaller problem which we will assume can be solved correctly. The next smallest problem is to sum "n - 1" elements.

sum(arr, n - 1)    #sums first n - 1 elements of arr

Assume this problem has been solved for you. How would you solve the original, larger problem? If the first n - 1 elements have already been summed then only the n_th element is left to be summed. The n_th element is actually at index n - 1 (because arrays start at index 0).

So, the solution to solving sum( arr, n ) is to add sum( arr, n - 1 ) to arr[ n - 1 ].

Putting it All Together

Writing a recursive function requires pasting the base case and the recursive case together. Here's the usual format

if base case :

    #return some simple expression

else:

    # THIS IS THE RECURSIVE CASE

    #some work before

    #recursive call

    #some work after

The simplest version of a recursive function is an if-else statement where the "if" part is the base case, and the "else" part is the recursive case.

In the recursive case, there is a recursive call. Most recursive functions do something after the call. After all, you often need the solution of the "smaller" recursive call to create the solution for the "big" problem.

However, on occasion, especially when printing output, you may need to do some work prior to the recursive function call (e.g., printing something).

A more general structure for recursion looks like:

if base case 1:

    #return some simple expression

elif base case 2:

    #return some simple expression

elif base case 3:

    #return some simple expression

elif recursive case 1:

    #some work before

    #recursive call

    #some work after

elif recursive case 2:

    #some work before

    #recursive call

    #some work after

else:

    #recursive case 3

    #some work before

    #recursive call

    #some work after

In this more general version, there may be more than one base case, and there may be more than one recursive case.

Back to our problem:

To solve the problem from the previous section, we use the simpler of the two versions.

arr = [1,2,3,4]

n = 3

def sum(arr, size):

    if size == 0:    #base case

        return 0

    else:

        # recursive call

        smallResult = sum( arr, size - 1 )

        #use solution of recursive call to solve this problem

        return smallResult + arr[ size - 1 ]

answer = sum(arr, n)

print(answer)

Multiple return statements got you down?

Some people don't like multiple return statements. That can be easily handled:

arr = [1,2,3,4]

n = 4

def sum(arr, size):

    result = None   #None allows the creation of empty types

    if size == 0:    #base case

        result = 0

    else:

        # recursive call

        smallResult = sum( arr, size - 1 )

        #use solution of recursive call to solve this problem

        result =  smallResult + arr[ size - 1 ]

    return result

answer = sum(arr, n)

print(answer)

Hone the recursion:

You may even think there's no reason to declare smallResult and prefer to write:

arr = [1,2,3,4]

n = 4

def sum(arr, size):

    if size == 0:    #base case

        return 0

    else:   # recursive call

        return sum( arr, size - 1 ) + arr[ size - 1 ]

answer = sum(arr, n)

print(answer)

Certainly, once you gain more experience with recursive functions, this is the preferable version.

However, declaring a local variable to store the result of the recursive call helps beginners to think about the "small" solution and then thinking about how to use that "small" solution to solve the bigger problem.

Thinking Backwards

Classic recursion involves thinking "backwards". Instead of building a solution from nothing, you pretend you are at the solution, and want to take a step back and ask how to solve the problem if you were a step back.

Things Get Big - Fast (Exponential Growth)

Because the stack holds each instance of the recursive call in an 'open' position until the return statement occurs (thereby closing the individual open instances), the memory footprint of the recursion can cause performance issues.

What on Earth is Recursion

If that wasn't enough:

Please take some advice from a past student (Michael R.). He wrote this student generated Recursion Notes tutorial. Read it!

Supercool

Labyrinth Algorithms explores 3 wonderful searches:

Breadth-first searching (BFS)

Depth-first search (DFS)

A* (A Star)

Laurent Luce's very neat article Solving mazes using Python: Simple recursivity and A* search

A3.6C2.4

The Importance of Recursion

The folks at Rice University posted The Importance of Recursion. Clever and simple.

Introducing Recursion

You'll notice quickly that these videos deal with recursion and Java, and that's ok. I want you to focus on the idea of the stack as it is explained in the third video.

Tracing Recursive Methods

You'll notice quickly that these videos deal with recursion and Java, and that's ok. I want you to focus on the idea of the stack as it is explained in the third video.

How Recursion Works

You'll notice quickly that these videos deal with recursion and Java, and that's ok. I want you to focus on the idea of the stack as it is explained in the third video.

Maze Solving with Recursion

Maze searching with recursion is cool. Here's a great example that I slightly edited. Thanks to Laurent Luce.

# Original http://www.laurentluce.com/posts/solving-mazes-using-python-simple-recursivity-and-a-search/

maze = [

    [0,1,1,1,1,1],  #1 is wall

    [0,0,1,1,1,1],  #0 is open

    [1,0,1,1,1,1],  #2 is exit

    [1,0,1,0,0,0],  #row 0, col 0 is start

    [0,0,0,0,1,2],  #3 is visited

    [1,1,1,1,1,1]

def showLocation(row,col):      #show content of current position

    if maze[row][col] == 1:     #1 is wall

        print('WALL at ' + str(row) + ',' + str(col))

    elif maze[row][col] == 0:   #0 is open

        print('OPEN at ' + str(row) + ',' + str(col))

    elif maze[row][col] == 2:   #2 is exit

        print('EXIT at ' + str(row) + ',' + str(col))

def printMaze(maze):

    for row in maze:

        for element in row:

            print(element, end=" ")

        print()

def move(row,col):

    if maze[row][col] == 2:     #2 is exit

        showLocation(row,col)

        print('FINISH')

        return True

    elif maze[row][col] == 1:   #1 is wall

        showLocation(row,col)

        return False

    elif maze[row][col] == 3:   #3 is visited

        showLocation(row,col)

        return False

    showLocation(row,col)

    maze[row][col] = 3      #3 used to mark visited

    if ((row < len(maze)-1 and move(row+1, col))

        or (col > 0 and move(row, col-1))

        or (row > 0 and move(row-1, col))

        or (col < len(maze)-1 and move(row, col+1))):

        return True

    return False

printMaze(maze)

move(0, 0)

printMaze(maze)

It's a simple to understand path-finding algorithm that uses recursion. Not too scary to understand.

Problem Set 7

Wondering how you got here? George Dyson: The birth of the computer
Visualizing Data with Processing - Hello World! Processing on Vimeo
Visualizing Data with D3.js

Google Sites

Report abuse