[1] "George Washington: February 22, 1732"
[2] "Thomas Jefferson: April 13, 1743"
[3] "Abraham Lincoln: February 12, 1809"
[4] "Theodore Roosevelt: October 27, 1858"
Exam 2 review
Below are questions to help you study for Exam 2. These are some examples of the kinds of questions I might ask.
- This is not a practice exam. There will be fewer questions on the actual exam.
- The questions cover most, but not all, possible material for the exam.
- The distribution of questions here is not necessarily reflective of the distribution of questions on the actual exam.
Strings and regular expressions
Example 1
Consider the following strings:
For each question below, fill in the R code to produce the desired output.
Solution:
str_extract(strings, ".+(?=:)")[1] "George Washington" "Thomas Jefferson" "Abraham Lincoln"
[4] "Theodore Roosevelt"
Solution:
str_extract(strings, "(?<=: ).+")[1] "February 22, 1732" "April 13, 1743" "February 12, 1809"
[4] "October 27, 1858"
Solution:
str_extract(strings, "(?<=\\s).+(?=:)")[1] "Washington" "Jefferson" "Lincoln" "Roosevelt"
Solution:
str_extract(strings, "(?<=, ).+")[1] "1732" "1743" "1809" "1858"
Example 2
Consider the following strings:
strings [1] "apple" "banana" "canteloupe" "durian"
[5] "eggplant" "french fries" "goat cheese" "pizza"
[9] "99 red balloons" "101 dalmatians" "route 66"
For each question below, fill in the R code to produce the desired output.
Solution:
str_subset(strings, "\\d+")[1] "99 red balloons" "101 dalmatians" "route 66"
Solution:
str_subset(strings, "^\\d+")[1] "99 red balloons" "101 dalmatians"
Solution:
str_subset(strings, "a")[1] "apple" "banana" "canteloupe" "durian"
[5] "eggplant" "goat cheese" "pizza" "99 red balloons"
[9] "101 dalmatians"
Solution:
str_subset(strings, "\\s")[1] "french fries" "goat cheese" "99 red balloons" "101 dalmatians"
[5] "route 66"
Solution:
str_subset(strings, "(.)\\1{1}")[1] "apple" "eggplant" "goat cheese" "pizza"
[5] "99 red balloons" "route 66"
Example 3
Suppose you receive a vector containing the following email addresses:
emails <- c("evansc@wfu.edu",
"dalzelnm@wfu.edu",
"heplersa@wfu.edu")- Fill in the following code to produce the desired output:
Solution:
str_extract(emails, "^.+(?=@)")[1] "evansc" "dalzelnm" "heplersa"
Example 4
You have a folder with the following filenames:
filenames <- c(".bash_profile",
"workspace.doc",
"img0912.jpg",
"updated_img0912.png",
"documentation.html",
"favicon.gif",
"access.lock")- Fill in the following code to produce the desired output:
Solution:
str_subset(filenames, "(gif|png|jpg)")[1] "img0912.jpg" "updated_img0912.png" "favicon.gif"
Example 5
Consider the following string:
ex_str <- "The kurtosis is defined by the equation $\\mu_4/\\sigma^4$,
where $\\mu_4$ is the fourth central moment
and $\\sigma$ is the standard deviation."- Fill in the following code to produce the desired output:
Solution:
str_extract_all(ex_str, "\\$[^\\$]+\\$")[[1]]
[1] "$\\mu_4/\\sigma^4$" "$\\mu_4$" "$\\sigma$"
Example 6
Here is another small dataset, ex_df, which has a single column (faculty):
ex_df faculty
1 Rob Erhardt, Professor: MAN 343
2 Daniel Beavers, Associate Professor: MAN 337
3 Kenneth Berenhaut, Professor: MAN 379
4 Leonardo Cella, Assistant Professor: MAN 344
5 Lucy D'Agostino McGowan, Assistant Professor: MAN 342
6 Nicole Dalzell, Associate Teaching Professor: MAN 338
- Write code to produce the following output using
ex_df:
Solution:
Option 1:
ex_df |>
mutate(name = str_extract(faculty, ".+(?=,)"),
position = str_extract(faculty, "(?<=, ).+(?=:)"),
office = str_extract(faculty, "(?<=: ).+")) |>
select(-faculty) name position office
1 Rob Erhardt Professor MAN 343
2 Daniel Beavers Associate Professor MAN 337
3 Kenneth Berenhaut Professor MAN 379
4 Leonardo Cella Assistant Professor MAN 344
5 Lucy D'Agostino McGowan Assistant Professor MAN 342
6 Nicole Dalzell Associate Teaching Professor MAN 338
Option 2:
ex_df |>
separate_wider_regex(faculty,
patterns = c(
name = ".+",
", ",
position = ".+",
": ",
office = ".+"
))# A tibble: 6 × 3
name position office
<chr> <chr> <chr>
1 Rob Erhardt Professor MAN 343
2 Daniel Beavers Associate Professor MAN 337
3 Kenneth Berenhaut Professor MAN 379
4 Leonardo Cella Assistant Professor MAN 344
5 Lucy D'Agostino McGowan Assistant Professor MAN 342
6 Nicole Dalzell Associate Teaching Professor MAN 338
Web scraping
Example 1
Consider the following portion of a web page:
STA 279
Welcome to STA 279! This is a 16-week course offered at Wake Forest University, which meets 3 hours each week. The syllabus can be found here
Below are some useful dates for the semester:
| Exam 1 | March 1 |
| Exam 2 | April 12 |
Now here is the source HTML for that simple web page:
<h1> STA 279 </h1>
<p>
Welcome to STA 279!
This is a 16-week course offered at Wake Forest University,
which meets 3 hours each week.
The syllabus can be found
<a href='https://sta279-f25.github.io/about/'>here</a>
</p>
<p>
Below are some useful dates for the semester:
<table>
<tr>
<td>Exam 1</td>
<td>March 1</td>
</tr>
<tr>
<td>Exam 2</td>
<td>April 12</td>
</tr>
</table>
</p>
- Suppose that the url for this page is “https://fake_web_page.html”. Write code that will import the table of exam dates into R, as a data frame.
Solution:
read_html("https://fake_web_page.html") |>
html_element("table") |>
html_table()- Suppose that the url for this page is “https://fake_web_page.html”. What will the following code return?
read_html("https://fake_web_page.html") |>
html_element("h1") |>
html_text2() |>
str_extract("\\d+")Solution: 279
Example 2
Consider the following short portion of a web page, with URL https://fake_page.html.
<div class="mesa">
<span>
<p> Hello! </p>
</span>
<p class="small"> Welcome to STA 279! </p>
<span>
<p class="small"> Room: Carswell 102 </p>
</span>
<span>
<h3 class="small"> Grading policy </h3>
</span>
<span>
<p class="small"> Exam dates </p>
</span>
</div>
- Write code in R that will extract the following string from this web page:
[1] "Carswell 102"
Solution:
read_html("https://fake_page.html") |>
html_element("span > p.small") |>
html_text2() |>
str_extract("(?<=: ).+")- Using your code from the previous question, write a short function with the following specifications:
- Input:
urlthe URL for a web page with the same format as above - Output: the building and room number for the class, scraped from the web page
- Input:
Solution:
get_room <- function(url){
read_html(url) |>
html_element("span > p.small") |>
html_text2() |>
str_extract("(?<=: ).+")
}- Suppose that we have the web pages for 20 different classes at Wake Forest, all of which have the same format as shown above, but which take places in different rooms. The URLS are stored in a vector:
class_urls [1] "https://fake_page_1.html" "https://fake_page_2.html"
[3] "https://fake_page_3.html" "https://fake_page_4.html"
[5] "https://fake_page_5.html" "https://fake_page_6.html"
[7] "https://fake_page_7.html" "https://fake_page_8.html"
[9] "https://fake_page_9.html" "https://fake_page_10.html"
[11] "https://fake_page_11.html" "https://fake_page_12.html"
[13] "https://fake_page_13.html" "https://fake_page_14.html"
[15] "https://fake_page_15.html" "https://fake_page_16.html"
[17] "https://fake_page_17.html" "https://fake_page_18.html"
[19] "https://fake_page_19.html" "https://fake_page_20.html"
Using your answers to the previous questions, iterate over this vector of URLs to produce a vector containing the building and room number for each class.
Solution:
map_chr(class_urls, get_room)I’m using map_chr here since I want to return a vector rather than a list. If I asked this question on the exam, I would also accept a solution with map in place of map_chr.
Polite web scraping
Why is it important to be polite when scraping from a website?
What are the features of polite web scraping that we discussed in class?
SQL
Here is a rough summary of what you should know from the SQL unit:
- Pieces of a SQL query you should know:
SELECTFROMJOIN(an inner join) andLEFT JOIN(a left join)WHEREGROUP BYHAVINGORDER BYLIMIT
- You should also know the order in which queries must be written
- Other functions to know for queries:
DISTINCT- Summary functions like
SUM,AVG,MIN,MAX,COUNT LIKE, and the wildcards%and_CASE
- The function of primary and foreign keys, and the constraints they place on tables
- How to read and interpret an ERD
- You should be able to read and interpet the results of
DESCRIBE,ALTER TABLE, andINSERT INTOstatements. You will not have to write these statements - Subqueries
Practice with SQL queries
Here is a small table in a SQL database. The name of the table is small_penguins.
species island bill_length_mm bill_depth_mm
1 Gentoo Biscoe 43.3 14.0
2 Gentoo Biscoe 59.6 17.0
3 Adelie Dream 39.7 17.9
4 Adelie Dream 39.2 21.1
5 Chinstrap Dream 50.8 19.0
6 Gentoo Biscoe 49.9 16.1
7 Chinstrap Dream 50.7 19.7
8 Gentoo Biscoe 47.3 15.3
9 Gentoo Biscoe 49.3 15.7
10 Adelie Dream 37.5 18.9
In each of the following examples, write a SQL query to produce the desired output from small_penguins.
- Output:
species island n
1 Adelie Dream 3
2 Chinstrap Dream 2
3 Gentoo Biscoe 5
SELECT species, island, COUNT(*) as n
FROM small_penguins
GROUP BY species, island;| species | island | n |
|---|---|---|
| Adelie | Dream | 3 |
| Chinstrap | Dream | 2 |
| Gentoo | Biscoe | 5 |
- Output:
island species mean_length
1 Biscoe Gentoo 49.88
2 Dream Adelie 38.80
3 Dream Chinstrap 50.75
SELECT island, species, AVG(bill_length_mm) AS mean_length
FROM small_penguins
GROUP BY island, species;| island | species | mean_length |
|---|---|---|
| Biscoe | Gentoo | 49.88 |
| Dream | Adelie | 38.80 |
| Dream | Chinstrap | 50.75 |
- Output:
species island bill_length_mm bill_depth_mm bill_ratio
1 Gentoo Biscoe 43.3 14.0 3.092857
2 Gentoo Biscoe 59.6 17.0 3.505882
3 Adelie Dream 39.7 17.9 2.217877
4 Adelie Dream 39.2 21.1 1.857820
5 Chinstrap Dream 50.8 19.0 2.673684
6 Gentoo Biscoe 49.9 16.1 3.099379
7 Chinstrap Dream 50.7 19.7 2.573604
8 Gentoo Biscoe 47.3 15.3 3.091503
9 Gentoo Biscoe 49.3 15.7 3.140127
10 Adelie Dream 37.5 18.9 1.984127
SELECT species, island, bill_length_mm, bill_depth_mm,
bill_length_mm/bill_depth_mm AS bill_ratio
FROM small_penguins;| species | island | bill_length_mm | bill_depth_mm | bill_ratio |
|---|---|---|---|---|
| Gentoo | Biscoe | 43.3 | 14.0 | 3.092857 |
| Gentoo | Biscoe | 59.6 | 17.0 | 3.505882 |
| Adelie | Dream | 39.7 | 17.9 | 2.217877 |
| Adelie | Dream | 39.2 | 21.1 | 1.857820 |
| Chinstrap | Dream | 50.8 | 19.0 | 2.673684 |
| Gentoo | Biscoe | 49.9 | 16.1 | 3.099379 |
| Chinstrap | Dream | 50.7 | 19.7 | 2.573604 |
| Gentoo | Biscoe | 47.3 | 15.3 | 3.091503 |
| Gentoo | Biscoe | 49.3 | 15.7 | 3.140127 |
| Adelie | Dream | 37.5 | 18.9 | 1.984127 |
- Output:
species island bill_length_mm bill_depth_mm
1 Adelie Dream 39.7 17.9
2 Adelie Dream 39.2 21.1
3 Adelie Dream 37.5 18.9
SELECT species, island, bill_length_mm, bill_depth_mm
FROM small_penguins
WHERE species = 'Adelie' AND island = 'Dream';| species | island | bill_length_mm | bill_depth_mm |
|---|---|---|---|
| Adelie | Dream | 39.7 | 17.9 |
| Adelie | Dream | 39.2 | 21.1 |
| Adelie | Dream | 37.5 | 18.9 |
Joins
In each of the following questions, write a SQL query to produce the desired output from the two input tables
df1 id x
1 1 7
2 2 9
3 3 13
df2 id y
1 1 10
2 2 12
3 4 14
Output:
id x y
1 1 7 10
2 2 9 12
3 3 13 NA
SELECT df1.id, x, y
FROM df1
LEFT JOIN df2 ON df1.id = df2.id;| id | x | y |
|---|---|---|
| 1 | 7 | 10 |
| 2 | 9 | 12 |
| 3 | 13 | NA |
df1 id x
1 1 7
2 2 9
3 3 13
df2 id y
1 1 10
2 2 12
3 4 14
Output:
id x y
1 1 7 10
2 2 9 12
SELECT df1.id, x, y
FROM df1
JOIN df2 ON df1.id = df2.id;| id | x | y |
|---|---|---|
| 1 | 7 | 10 |
| 2 | 9 | 12 |
More practice with SQL queries
The nycflights13 package in R contains several datasets describing flights from NY airports in 2013, including the airports and weather tables.
The airports table has one row for each different airport in the database. There are a total of 1458 rows and 8 columns in the airports table; here are a few random rows from the table, showing the relevant columns for this question:
faa name
1 AOS Amook Bay Seaplane Base
2 2H0 Shelby County Airport
3 CEF Westover Arb Metropolitan
4 AVW Marana Regional
5 DLF Laughlin Afb
The weather table has one row for each date-time, for each of the three NY airports (EWR, JFK, LGA), and records a variety of weather variables at each time for each location. Here are a few random rows from the table, showing the relevant columns for this question:
origin month temp humid
1 EWR 12 24.98 62.55
2 LGA 2 33.98 88.42
3 LGA 5 57.92 71.90
4 EWR 6 64.04 93.28
5 LGA 4 42.98 88.85
The weather table uses a 3-character FAA code to identify each airport. It does not include the full name for the airports.
We would like to produce a table with the mean temperature, and humidity for each month, for each of the 3 NYC airports in the database. The output table should also include the full name of each airport. Here is what the output table looks like:
origin name month mean_temp mean_humid
1 EWR Newark Liberty Intl 1 35.56216 62.12451
2 EWR Newark Liberty Intl 2 34.26332 63.33558
3 EWR Newark Liberty Intl 3 40.11865 57.91382
4 EWR Newark Liberty Intl 4 52.97750 54.23513
5 EWR Newark Liberty Intl 5 63.32024 64.07371
6 EWR Newark Liberty Intl 6 73.26725 67.00229
7 EWR Newark Liberty Intl 7 80.70300 66.67641
8 EWR Newark Liberty Intl 8 74.53748 65.75175
9 EWR Newark Liberty Intl 9 67.30478 64.03556
10 EWR Newark Liberty Intl 10 59.77821 65.31928
11 EWR Newark Liberty Intl 11 44.57734 56.52011
12 EWR Newark Liberty Intl 12 37.95008 69.72141
13 JFK John F Kennedy Intl 1 35.38555 61.66162
14 JFK John F Kennedy Intl 2 34.19246 62.55848
15 JFK John F Kennedy Intl 3 39.54472 58.02222
16 JFK John F Kennedy Intl 4 50.14270 60.05764
17 JFK John F Kennedy Intl 5 59.31476 70.98926
18 JFK John F Kennedy Intl 6 69.95825 73.76147
19 JFK John F Kennedy Intl 7 78.73492 71.74469
20 JFK John F Kennedy Intl 8 73.81878 67.84420
21 JFK John F Kennedy Intl 9 66.89775 65.25182
22 JFK John F Kennedy Intl 10 59.80195 65.06362
23 JFK John F Kennedy Intl 11 45.13419 56.79961
24 JFK John F Kennedy Intl 12 38.60487 68.31341
25 LGA La Guardia 1 35.95927 59.15609
26 LGA La Guardia 2 34.35612 60.68603
27 LGA La Guardia 3 39.97652 56.55947
28 LGA La Guardia 4 52.11450 53.15060
29 LGA La Guardia 5 62.75000 61.49618
30 LGA La Guardia 6 73.32650 61.22686
31 LGA La Guardia 7 80.76425 62.26404
32 LGA La Guardia 8 75.04825 60.34574
33 LGA La Guardia 9 67.91125 59.67743
34 LGA La Guardia 10 60.63244 60.13875
35 LGA La Guardia 11 45.26093 53.49296
36 LGA La Guardia 12 38.76976 63.62769
- Write a SQL query to produce the desired results.
Solution:
SELECT weather.origin, name, month,
AVG(temp) AS mean_temp, AVG(humid) AS mean_humid
FROM weather
JOIN airports ON weather.origin = airports.faa
GROUP BY origin, name, month
ORDER BY origin, month;| origin | name | month | mean_temp | mean_humid |
|---|---|---|---|---|
| EWR | Newark Liberty Intl | 1 | 35.56216 | 62.12451 |
| EWR | Newark Liberty Intl | 2 | 34.26332 | 63.33558 |
| EWR | Newark Liberty Intl | 3 | 40.11865 | 57.91382 |
| EWR | Newark Liberty Intl | 4 | 52.97750 | 54.23512 |
| EWR | Newark Liberty Intl | 5 | 63.32024 | 64.07371 |
| EWR | Newark Liberty Intl | 6 | 73.26725 | 67.00229 |
| EWR | Newark Liberty Intl | 7 | 80.70300 | 66.67641 |
| EWR | Newark Liberty Intl | 8 | 74.53748 | 65.75175 |
| EWR | Newark Liberty Intl | 9 | 67.30478 | 64.03556 |
| EWR | Newark Liberty Intl | 10 | 59.77821 | 65.31928 |
- How would you modify your above query to keep only the rows in the results for which the mean temperature is at least 60?
Solution:
SELECT weather.origin, name, month,
AVG(temp) AS mean_temp, AVG(humid) AS mean_humid
FROM weather
JOIN airports ON weather.origin = airports.faa
GROUP BY origin, name, month
HAVING mean_temp >= 60
ORDER BY origin, month;| origin | name | month | mean_temp | mean_humid |
|---|---|---|---|---|
| EWR | Newark Liberty Intl | 5 | 63.32024 | 64.07371 |
| EWR | Newark Liberty Intl | 6 | 73.26725 | 67.00229 |
| EWR | Newark Liberty Intl | 7 | 80.70300 | 66.67641 |
| EWR | Newark Liberty Intl | 8 | 74.53748 | 65.75175 |
| EWR | Newark Liberty Intl | 9 | 67.30478 | 64.03556 |
| JFK | John F Kennedy Intl | 6 | 69.95825 | 73.76147 |
| JFK | John F Kennedy Intl | 7 | 78.73492 | 71.74469 |
| JFK | John F Kennedy Intl | 8 | 73.81878 | 67.84420 |
| JFK | John F Kennedy Intl | 9 | 66.89775 | 65.25182 |
| LGA | La Guardia | 5 | 62.75000 | 61.49618 |
- How would you modify your above query to keep only the rows for which the mean temperature is the highest average temperature?
Solution: This is quite tricky, and requires us to combine some queries.
Assuming that we want to get the highest average monthly temperature for each airport, a partial answer is the following, though note that this only gives us the origin and max mean temperature columns:
SELECT origin, MAX(mean_temp)
FROM
(
SELECT weather.origin, name, month,
AVG(temp) AS mean_temp, AVG(humid) AS mean_humid
FROM weather
JOIN airports ON weather.origin = airports.faa
GROUP BY origin, name, month
ORDER BY origin, month
)
GROUP BY origin;| origin | max(mean_temp) |
|---|---|
| EWR | 80.70300 |
| LGA | 80.76425 |
| JFK | 78.73492 |
This partial solution uses a subquery, and this is about the level of difficulty I would put on the exam.
A full solution is beyond the scope of the exam, and I think requires us to use either multiple subqueries, or to create an intermediate view/table. Here is one solution which uses two subqueries:
SELECT weather.origin, name, month,
AVG(temp) AS mean_temp, AVG(humid) AS mean_humid
FROM weather
JOIN airports ON weather.origin = airports.faa
GROUP BY origin, name, month
HAVING (origin, mean_temp) IN
(SELECT origin, MAX(mean_temp)
FROM
(
SELECT weather.origin, name, month,
AVG(temp) AS mean_temp, AVG(humid) AS mean_humid
FROM weather
JOIN airports ON weather.origin = airports.faa
GROUP BY origin, name, month
ORDER BY origin, month
)
GROUP BY origin
);| origin | name | month | mean_temp | mean_humid |
|---|---|---|---|---|
| EWR | Newark Liberty Intl | 7 | 80.70300 | 66.67641 |
| JFK | John F Kennedy Intl | 7 | 78.73492 | 71.74469 |
| LGA | La Guardia | 7 | 80.76425 | 62.26404 |
However, the problem with this solution is that it requires us to create the results from question 27 twice, which is quite inefficient. Instead, we could store the results from question 27, then use these results. Here is what those queries might look like (again, I don’t expect you to be able to do this part for the exam):
CREATE VIEW monthly_weather AS
(SELECT weather.origin, name, month,
AVG(temp) AS mean_temp, AVG(humid) AS mean_humid
FROM weather
JOIN airports ON weather.origin = airports.faa
GROUP BY origin, name, month);SELECT *
FROM monthly_weather
WHERE (origin, mean_temp) IN
(SELECT origin, MAX(mean_temp)
FROM monthly_weather
GROUP BY origin);| origin | name | month | mean_temp | mean_humid |
|---|---|---|---|---|
| JFK | John F Kennedy Intl | 7 | 78.73492 | 71.74469 |
| LGA | La Guardia | 7 | 80.76425 | 62.26404 |
| EWR | Newark Liberty Intl | 7 | 80.70300 | 66.67641 |
Conceptual questions
- What are the differences between
SELECTin a SQL query, and theselectfunction fromdplyrin R?
Solution: The select function from the dplyr package has one function: selecting columns from the data frame. We don’t use the select function to create new columns or summary statistics – those roles are performed by, respectively, the mutate and summarize functions. However, in SQL SELECT does all of this. Choosing existing columns, creating a new column, or creating aggregates of existing columns all happen with SELECT.
- Suppose you are managing a database for a hospital. List four different tables which might be included in the hospital database. For each table:
- Describe what information would be included (what would a row represent? what would be some of the columns?)
- Is there a natural primary key for the table? What column or set of columns would form that primary key?
- Are there any columns in the table that would be foreign keys for other tables in the database?
Solution: There is not just one right answer here. Here is one possible answer.
Tables:
Patients: information on each patient that has been treated at the hospital- Primary key:
MRN(medical record number), a unique identifier assigned to each patient in the healthcare organization
- Other columns
- Date of birth
- Sex
- Insurance
- Address
- Phone number
- Primary key:
Staff: information on doctors, nurses, etc. who have worked at the hospital- Primary key:
IDemployee ID
- Other columns
- Role
- Employment status
- Age
- Area(s) of specialty
- Department
- Primary key:
Pharmacy: information on drugs prescribed to patients in the hospital- Foreign keys:
PatientIDrefers toMRNin thePatientstable. Patient to whom drug was prescribedStaffIDrefers toIDin theStafftable. Staff member who prescribed the drug
- Other columns
- Date prescribed
- Drug
- Dosage
- Foreign keys:
Visits: information on each visit- Foreign keys:
PatientIDrefers toMRNin thePatientstable
- Other columns
- Check-in time
- Check-out time
- Reason for visit
- Foreign keys:
Practice with function scoping
- What will be the output of the following code?
x <- 10
test_fun <- function(x = 11){
return(x)
}
test_fun()[1] 11
x[1] 10
- What will be the output of the following code?
x <- 10
test_fun <- function(y = 11){
return(x + 1)
}
test_fun()[1] 11
x[1] 10
- What will be the output of the following code?
x <- 10
test_fun <- function(y = 11){
x <- x + 1
return(x + 1)
}
test_fun()[1] 12
x[1] 10
- What will be the output of the following code?
x <- 10
test_fun <- function(x = 11){
x <- x + 1
return(x + 1)
}
test_fun()[1] 13
x[1] 10
- What will be the output of the following code?
x <- 10
test_fun <- function(x = 11){
x <- x + 1
return(x + 1)
}
x <- test_fun(x)
x[1] 12
Practice with lists
- Create a list
xin R such that:
x[[1]]returns the functionmeanx[[2]]returns the functionsdx[[3]][[1]]returns the vectorc(0, 1, 2)x[[3]][[2]]returns a function which calculates the cube root of a vector
Solution:
x <- list(mean, sd, list(c(0, 1, 2),
function(y) y^(1/3)))
x[[1]](1:10)[1] 5.5
x[[2]](1:10)[1] 3.02765
x[[3]][[1]][1] 0 1 2
x[[3]][[2]](c(8, 27, 64))[1] 2 3 4