Answer to Twitter Stream Analysis - Part 3: Extracting URLs

Select all the tweets that contain a URL

  1. Create a text search for http.

  2. However, the text search doesn’t let you select all the strings that don’t contain http. To do this, we will use a new function: star/flag rows, available from the All drop-down menu (first column).

  3. Star all the rows that contain http, then remove all your facets. Now create a facet by star and select false to select all the non-starred rows.

  4. Still under the All menu, choose Edit rows, then Remove all matching rows. Only the tweets that contain http remain.
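
For readers who want to check the logic outside the tool, the star-then-remove steps above can be sketched in Python. This is only an illustration under stated assumptions: the tweets are assumed to be a list of dictionaries with a "text" field, and the sample rows and the function name keep_starred_rows are hypothetical, not part of the exercise data.

```python
def keep_starred_rows(tweets, needle="http"):
    """Mimic the star / facet / remove sequence on a list of tweet dicts."""
    # Star every row whose text contains the needle (steps 1 and 3).
    starred = [needle in tweet["text"] for tweet in tweets]
    # Drop the non-starred rows (step 4), keeping the original order.
    return [tweet for tweet, is_starred in zip(tweets, starred) if is_starred]


# Hypothetical sample rows, for illustration only.
tweets = [
    {"text": "Reading http://example.com/post today"},
    {"text": "No link in this one"},
    {"text": "Secure link https://example.org"},
]

with_urls = keep_starred_rows(tweets)
```

Note that searching for http also matches https links, since the shorter string is a prefix of the longer one.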


Extract the URL in a new column

  1. First, split the column entities_str using the string "expanded_url": as the delimiter.

  2. Split the newly created column on a comma, into 2 columns at most. The first of the two resulting columns holds the URL.

  3. Create a facet on the new field and sort by count.
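
The same split-and-count idea can be sketched in Python as a rough cross-check. The sample entities_str values below are hypothetical and only imitate the general shape of the field; the function name extract_first_url is likewise an assumption made for this illustration. The sketch keeps only the first URL found in each row.

```python
from collections import Counter

DELIMITER = '"expanded_url":'


def extract_first_url(entities_str):
    """Return the first expanded URL in entities_str, or None if there is none."""
    # Step 1: split on the delimiter; the piece after it starts with the URL.
    parts = entities_str.split(DELIMITER)
    if len(parts) < 2:
        return None
    # Step 2: split on the first comma only (at most 2 pieces), keep the left one.
    url_field = parts[1].split(",", 1)[0]
    # Remove surrounding whitespace and the JSON double quotes.
    return url_field.strip().strip('"')


# Hypothetical sample rows, for illustration only.
rows = [
    '{"urls":[{"url":"http://t.co/a","expanded_url":"http://example.com/page","display_url":"example.com/page"}]}',
    '{"urls":[{"url":"http://t.co/b","expanded_url":"http://example.com/page","display_url":"example.com/page"}]}',
    '{"urls":[{"url":"http://t.co/c","expanded_url":"http://example.org/other","display_url":"example.org/other"}]}',
    '{"urls":[]}',
]

urls = [u for u in (extract_first_url(r) for r in rows) if u is not None]

# Step 3: count each URL and sort by count, most frequent first.
url_counts = Counter(urls).most_common()
```

As with the comma split in step 2, a URL that itself contains a comma would be cut short by this approach.
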
Last modified: Friday, 18 September 2015, 6:00 PM