Answer to Twitter Stream Analysis - Part 3: Extracting URLs
Select all the tweets that contain a URL
- Create a text search with http.
- However, the text search doesn't let you select all the strings that don't contain http. To do this we will use a new function, star/flag rows, available from the All drop-down menu (first column).
- Star all the rows that contain http, then remove all your facets. Now create a facet by star and select false to select all the non-starred rows.
- Still under the All menu, select Edit Rows, then Remove all matching rows.
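For readers who want to check the result outside the tool, the same selection can be sketched in a few lines of Python. This is only an illustration of the logic, not what the tool does internally: the sample tweets and the `text` field name are made up, and the case-insensitive match is an assumption about how the text search behaves.

```python
# Hypothetical sample rows; the "text" field name is an assumption.
tweets = [
    {"text": "Reading about data cleaning http://t.co/abc123"},
    {"text": "No link in this one"},
    {"text": "Another link https://t.co/xyz789"},
]


def has_url(tweet):
    """Mimic the text search on 'http' with a case-insensitive substring match."""
    return "http" in tweet["text"].lower()


# Equivalent of starring the matches, faceting on star = false,
# and removing the rows that facet selects.
with_urls = [t for t in tweets if has_url(t)]
removed = [t for t in tweets if not has_url(t)]

print(len(with_urls), "kept,", len(removed), "removed")
```

Note that a substring match on http also keeps https links, which is what we want here.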
Extract the URL in a new column
- First, split the column entities_str using the string "expanded_url": as the delimiter.
- Split the newly created column on a comma, into 2 columns at most.
- Create a facet on the new field and sort by count.
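The two splits and the final facet can also be sketched in Python, which may help in seeing why the steps work. The sample entities_str values below are invented and only assume that the field is a JSON-like string in which "expanded_url": is followed by a quoted URL and then a comma; the function name `extract_expanded_url` is hypothetical, and only the first URL of each tweet is kept.

```python
from collections import Counter

DELIMITER = '"expanded_url":'


def extract_expanded_url(entities_str):
    """Return the first expanded URL in the string, or None if there is none."""
    # Step 1: split on the delimiter; the URL starts the second piece.
    parts = entities_str.split(DELIMITER)
    if len(parts) < 2:
        return None
    # Step 2: split on the first comma only (2 pieces at most) and keep the left one.
    first_field = parts[1].split(",", 1)[0]
    # Drop the surrounding double quotes left over from the JSON string.
    return first_field.strip().strip('"')


# Hypothetical entities_str values.
rows = [
    '{"urls":[{"url":"http://t.co/a","expanded_url":"http://example.com/page","display_url":"example.com/page"}]}',
    '{"urls":[{"url":"http://t.co/b","expanded_url":"http://example.com/page","display_url":"example.com/page"}]}',
    '{"urls":[{"url":"http://t.co/c","expanded_url":"http://example.org/other","display_url":"example.org/other"}]}',
    '{"urls":[]}',
]

urls = [extract_expanded_url(r) for r in rows]

# Equivalent of the facet sorted by count: most frequent URLs first.
counts = Counter(u for u in urls if u is not None)
for url, n in counts.most_common():
    print(n, url)
```

Like the column splits, this sketch cuts a URL short if it contains a comma, so a few values may need a manual check.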