End Activity Session (Day 3)
1. Setup
- Create a repo on GitHub named eds212-day3-activities
- Clone to create a version-controlled R Project
- Create some subfolder infrastructure (docs, data)
2. Conditional statements & for loops
Create a new Quarto document in your docs folder, saved as conditionals_loops.qmd. Complete all tasks for Part 2 in this .qmd.
Complete each of the following in a separate code chunk.
Conditional statements
Task 1
Create an object called pm2_5 with a value of 48 (representing Particulate Matter 2.5, an indicator for air quality, in \(\frac{\mu g}{m^3}\); see more about PM2.5 here).
Write an if - else if - else statement that returns “Low to moderate risk” if pm2_5 is less than 100, “Unhealthy for sensitive groups” if 100 <= pm2_5 < 150, and “Health risk present” if pm2_5 >= 150.
Test by changing the value of your pm2_5 object and re-running your statement to check.
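As a minimal sketch of the if - else if - else pattern described in Task 1, the chain below checks the thresholds from lowest to highest, so each later branch only runs when the earlier conditions were false. The object name risk is an assumption for illustration and is not part of the task.

```r
# Value from Task 1; change it and re-run to test the other branches
pm2_5 <- 48

# Conditions are checked top to bottom, so the second branch
# only runs when pm2_5 is already known to be >= 100
if (pm2_5 < 100) {
  risk <- "Low to moderate risk"
} else if (pm2_5 < 150) {
  risk <- "Unhealthy for sensitive groups"
} else {
  risk <- "Health risk present"
}

# `risk` is a hypothetical name used here to hold the result
print(risk)
```

Because the first condition already rules out values below 100, the second branch does not need to repeat the lower bound.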
Task 2
Store the string “blue whale” as an object called species. Write an if statement that returns “You found a whale!” if the string “whale” is detected in species, otherwise return nothing. Test by changing the species string and re-running to see the output.
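One possible shape for Task 2 is sketched below using base R's grepl() so that it runs without attaching any packages. If stringr is attached, stringr::str_detect(species, "whale") could be used as the condition instead. The object name found_whale is an assumption for illustration.

```r
species <- "blue whale"

# grepl() returns TRUE when the pattern occurs anywhere in the string;
# fixed = TRUE treats "whale" as literal text rather than a regular expression
found_whale <- grepl("whale", species, fixed = TRUE)

# An if statement with no else branch returns nothing when the condition is FALSE
if (found_whale) {
  print("You found a whale!")
}
```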
Task 3
Store the base price of a burrito as base_burrito with a value of 6.50. Store main_ingredient with a starting string of “veggie”. Write a statement that returns the price of a burrito based on what a user specifies as main_ingredient (either “veggie”, “chicken”, or “steak”), given the following:
- A veggie burrito is the cost of a base burrito
- A chicken burrito costs 3.00 more than a base burrito
- A steak burrito costs 3.25 more than a base burrito
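The pricing rules above could be written as an if - else if - else chain. The sketch below uses switch() instead, which is one of several reasonable choices for matching a string against a fixed set of options. The object name burrito_price and the error raised for an unrecognized ingredient are assumptions, not part of the task.

```r
base_burrito <- 6.50
main_ingredient <- "veggie"

# switch() evaluates only the branch whose name matches main_ingredient;
# the final unnamed argument is the fallback when nothing matches
burrito_price <- switch(main_ingredient,
  "veggie" = base_burrito,
  "chicken" = base_burrito + 3.00,
  "steak" = base_burrito + 3.25,
  stop("Unknown main ingredient: ", main_ingredient)
)

burrito_price
```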
For loops
Complete each of the following in a separate code chunk.
Task 4
Create a new vector called fish that contains the values 8, 10, 12, 23, representing counts of different fish types in a fish tank (goldfish, tetras, guppies, and mollies, respectively). Write a for loop that iterates through fish and returns what proportion of the total fish in the tank each species makes up. Assume that these counts represent all fish in the tank.
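A minimal sketch of the loop in Task 4 follows. It pre-allocates a vector for the results and fills it one position at a time. The names total_fish and fish_prop are assumptions for illustration.

```r
fish <- c(8, 10, 12, 23)

# Total count, computed once outside the loop
total_fish <- sum(fish)

# Pre-allocate a numeric vector to hold one proportion per species
fish_prop <- vector(mode = "numeric", length = length(fish))

# seq_along(fish) yields the index positions of fish
for (i in seq_along(fish)) {
  fish_prop[i] <- fish[i] / total_fish
}

fish_prop
```

Since the counts are assumed to cover every fish in the tank, the proportions should sum to 1, which makes a quick sanity check.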
Task 5
There is an existing vector in R called month.name that contains all month names (just try running month.name in the Console to check it out). Write a for loop that iterates over all months in month.name and prints “January is month 1”, “February is month 2”, etc.
Hint: you can index values in the month.name vector just like you would any other vector (e.g., try running month.name[5]).
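Following the hint, one way to approach Task 5 is to loop over index positions rather than over the month names themselves, so that both the name and its position are available inside the loop. This is a sketch, and paste() is only one of several ways to build the message.

```r
# Loop over positions 1 through 12 so that i can be used
# both to index month.name and as the month number
for (i in seq_along(month.name)) {
  print(paste(month.name[i], "is month", i))
}
```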
3. Real data
You will complete Part 3 in a separate .qmd.
Explore this data package from EDI, which contains a “Data file describing the biogeochemistry of samples collected at various sites near Toolik Lake, North Slope of Alaska”. Familiarize yourself with the metadata (particularly, View full metadata > expand ‘Data entities’ to learn more about the variables in the dataset).
Kling, G. 2016. Biogeochemistry data set for soil waters, streams, and lakes near Toolik on the North Slope of Alaska, 2011. ver 5. Environmental Data Initiative. https://doi.org/10.6073/pasta/362c8eeac5cad9a45288cf1b0d617ba7
Download the CSV containing the Toolik biogeochemistry data
Take a look at it - how are missing values stored? Keep that in mind.
Drop the CSV into your data folder of your project
Create a new Quarto document, saved in docs as toolik_chem.qmd
Attach the tidyverse, here, and janitor packages in your setup code chunk
Read in the data as toolik_biochem. Remember, you’ll want to specify here how NA values are stored. Pipe directly into janitor::clean_names() following your import code to get all column names into lower snake case.
Create a subset of the data that contains only observations from the “Toolik Inlet” site, and that only contains the variables (columns) for pH, dissolved organic carbon (DOC), and total dissolved nitrogen (TDN) (hint: see dplyr::select()). Store this subset as inlet_biochem. Make sure to look at the subset you’ve created.
Find the mean value of each column in inlet_biochem two different ways:
- Write a for loop from scratch to calculate the mean for each column
- Use one other method (e.g. apply, across, or purrr::map_df) to find the mean for each column.
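The two approaches can be compared on a small stand-in data frame, as sketched below. The data frame toy_biochem, its column names, and its values are made up for illustration only and are not taken from the Toolik dataset. The same pattern would apply to inlet_biochem once it has been created.

```r
# Toy stand-in for inlet_biochem; names and values are invented for illustration
toy_biochem <- data.frame(
  ph  = c(7.1, 6.9, NA),
  doc = c(400, 420, 410),
  tdn = c(12, NA, 14)
)

# Method 1: for loop over column positions, storing each mean as we go
col_means <- vector(mode = "numeric", length = ncol(toy_biochem))
for (i in seq_along(toy_biochem)) {
  # [[i]] extracts column i as a vector; na.rm = TRUE skips missing values
  col_means[i] <- mean(toy_biochem[[i]], na.rm = TRUE)
}
names(col_means) <- names(toy_biochem)

# Method 2: apply() over columns (MARGIN = 2), passing na.rm through to mean()
apply_means <- apply(X = toy_biochem, MARGIN = 2, FUN = mean, na.rm = TRUE)

col_means
apply_means
```

Both methods should agree. If the real columns contain missing values, na.rm = TRUE (or an equivalent) is needed, otherwise mean() returns NA.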