Practice Problem Set 5A: Strings
Run the following code to load the data. View the codebook and other information on the data here.
Separate the project title into the main title and sub-title.
Create a subset of the data with projects that have the word STEM in their title.
Count the number of grants by directorate. Do you see anything odd?
Clean up the directorate variable and create the table again. Arrange the table in descending order.
Count the length of each value in the org_zip variable. Look at the distribution of the lengths.
Split the org_zip variable into two variables: the first containing the 5-digit zip code and the second containing the 4-digit add-on code.
Challenge problem: Create a graph identifying the most commonly occurring word across the abstracts from grants that are on Ted Cruz’s list.
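
The string operations these problems call for (splitting a title, matching a word, normalizing a category label, measuring and splitting a zip code) can be sketched in a few lines. This is a minimal Python illustration using only the standard library, not a solution tied to the course's own tooling. It rests on several assumptions that the data may not satisfy: that sub-titles follow the first colon in a title, that the oddities in directorate are differences in case and whitespace, and that org_zip holds 5 or 9 digits with an optional hyphen. All function names and sample values below are hypothetical.

```python
import re
from collections import Counter


def split_title(title):
    """Split a title at the first colon (assumed delimiter).

    Returns (main_title, sub_title); sub_title is None when there is no colon.
    """
    main, sep, sub = title.partition(":")
    return main.strip(), (sub.strip() if sep else None)


def has_stem(title):
    """True when the upper-case word STEM appears as a whole word.

    The match is case-sensitive, so words such as "system" do not count.
    """
    return re.search(r"\bSTEM\b", title) is not None


def clean_directorate(name):
    """Collapse repeated whitespace, trim the ends, and upper-case the label.

    Assumes the inconsistencies are only case and spacing differences.
    """
    return re.sub(r"\s+", " ", name).strip().upper()


def split_zip(org_zip):
    """Return (zip5, plus4); plus4 is None when fewer than 9 digits are present."""
    digits = re.sub(r"\D", "", org_zip)  # drop hyphens and any other non-digits
    zip5 = digits[:5]
    plus4 = digits[5:9] if len(digits) >= 9 else None
    return zip5, plus4


# Made-up sample values, for illustration only.
sample_zips = ["021394307", "02139", "10027-6902"]

# Distribution of raw string lengths, before any cleaning.
length_counts = Counter(len(z) for z in sample_zips)

for z in sample_zips:
    print(z, split_zip(z))
```

Checking the length distribution before splitting is the useful habit here: any length other than 5 or 9 (for example 10, from a hyphenated value) signals a format the split needs to handle.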