Lesson 4 Lab 1: Wiki Page Data Extraction

We exported the Wikipedia page listing the postal codes for Toronto to a text file. You will find the source file in the course material under the name List_of_M_postal_codes_of_Canada.txt.

The page provides an area name for each postal code. We want to extract this information into two columns:

  • postalcode
  • area

Once completed, rename the project postalcode. The area column should contain unique values. In Lesson 5 we will see how to import the area name into the Toronto Building data set.
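
For reference, the extraction described above can be sketched in Python. This is only an illustration under stated assumptions, not the expected lab solution: the sample lines, the regular expressions, and the function name extract_rows are hypothetical, and the real export may lay out its cells differently. The sketch assumes each table cell sits on one line, starts with the postal code, and names the area in its last wiki link. It also reads "unique" as "each area name is kept once, at its first occurrence".

```python
import re

# Hypothetical sample lines; the real export may use a different layout.
SAMPLE = """\
|M1A<br>''Not assigned''
|M3A<br>[[North York]]<br>([[Parkwoods]])
|M4A<br>[[North York]]<br>([[Victoria Village]])
|M5A<br>[[Downtown Toronto]]<br>([[Regent Park]])
|M6A<br>[[North York]]<br>([[Parkwoods]])
"""

# Toronto postal codes (forward sortation areas) look like M1A, M5V, ...
POSTAL_CODE = re.compile(r"\bM\d[A-Z]\b")
# Captures the visible label of [[Target]] or [[Target|Label]] wiki links.
WIKI_LINK = re.compile(r"\[\[(?:[^\]|]*\|)?([^\]]+)\]\]")


def extract_rows(text):
    """Return (postalcode, area) pairs, keeping each area name only once."""
    rows = []
    seen_areas = set()
    for line in text.splitlines():
        code = POSTAL_CODE.search(line)
        links = WIKI_LINK.findall(line)
        if not code or not links:
            continue  # no postal code, or no area (e.g. "Not assigned")
        area = links[-1].strip()  # assumption: the last link names the area
        if area in seen_areas:
            continue  # keep the area column unique
        seen_areas.add(area)
        rows.append((code.group(0), area))
    return rows


if __name__ == "__main__":
    print("postalcode,area")
    for postalcode, area in extract_rows(SAMPLE):
        print(f"{postalcode},{area}")
```

Taking the last link on each line is a design choice for this sample only; if the export lists several areas per postal code, that rule would need to change.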

You will find an example output file in the course material.

Tip: To see the page syntax in Wikipedia, click edit page.

Last modified: Saturday, 19 September 2015, 11:00 AM