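How about analyzing your midweek results?
To select multiple elements from a vector, put a vector of positions, such as c(1, 5), between the square brackets. For example, the code below selects the first and fifth elements of poker_vector:
poker_vector[c(1, 5)]
- Assign the poker results of Tuesday, Wednesday and Thursday to the variable poker_midweek.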
# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector
# Define a new variable based on a selection
poker_midweek <- poker_vector[c(2,3,4)]
console:
> # Poker and roulette winnings from Monday to Friday:
> poker_vector <- c(140, -50, 20, -120, 240)
> roulette_vector <- c(-24, -50, 100, -350, 10)
> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
> names(poker_vector) <- days_vector
> names(roulette_vector) <- days_vector
>
> # Define a new variable based on a selection
> poker_midweek <- poker_vector[c(2,3,4)]
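The selection by position described above can be sketched as follows. This is a minimal illustration that reuses the exercise's values; the variable name first_and_fifth is only an assumption made for the example.

```r
# Poker winnings from Monday to Friday, named by day (values from the exercise).
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# A vector of positions inside the square brackets picks those elements.
# first_and_fifth is a hypothetical name used only in this sketch.
first_and_fifth <- poker_vector[c(1, 5)]
print(first_and_fifth)          # Monday 140, Friday 240

# The element names travel with the selected values.
print(names(first_and_fifth))
```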
Vector selection: the good times (3)
Selecting multiple elements of poker_vector with c(2, 3, 4) is not very convenient. Many statisticians are lazy people by nature, so they created an easier way to do this: c(2, 3, 4) can be abbreviated to 2:4, which generates a vector with all natural numbers from 2 up to 4.
So another way to find the midweek results is poker_vector[2:4]. The vector 2:4 is placed between the square brackets to select elements 2 up to 4. (Written this way, the sequence is increasing.)
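As a quick check of the abbreviation, the sketch below compares the two ways of writing the positions. It assumes the same poker_vector as in the exercises; the names by_colon and by_c are hypothetical.

```r
# Poker winnings from Monday to Friday, named by day (values from the exercise).
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# 2:4 generates the positions 2, 3 and 4.
print(2:4)

# Both spellings select the same midweek elements.
by_colon <- poker_vector[2:4]          # hypothetical name for this sketch
by_c     <- poker_vector[c(2, 3, 4)]   # hypothetical name for this sketch
print(identical(by_colon, by_c))       # TRUE
```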
- Assign to roulette_selection_vector the roulette results from Tuesday up to Friday; make use of : if it makes things easier for you.
# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector
# Define a new variable based on a selection
roulette_selection_vector <- roulette_vector[2:5]
console:
> # Poker and roulette winnings from Monday to Friday:
> poker_vector <- c(140, -50, 20, -120, 240)
> roulette_vector <- c(-24, -50, 100, -350, 10)
> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
> names(poker_vector) <- days_vector
> names(roulette_vector) <- days_vector
>
> # Define a new variable based on a selection
> roulette_selection_vector <- roulette_vector[2:5]
Vector selection: the good times (4)
Another way to tackle the previous exercise is by using the names of the vector elements (Monday, Tuesday, ...) instead of their numeric positions. For example,
poker_vector["Monday"]
will select the first element of poker_vector, since "Monday" is the name of that first element.
Just like you did in the previous exercise with numerics, you can also use the element names to select multiple elements, for example:
poker_vector[c("Monday","Tuesday")]
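To see that selecting by name and selecting by position can give the same result, here is a small sketch using the exercise's values. The names by_name and by_position are assumptions introduced only for this example.

```r
# Poker winnings from Monday to Friday, named by day (values from the exercise).
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# A single name selects one element.
print(poker_vector["Monday"])

# A character vector of names selects several elements at once.
by_name <- poker_vector[c("Monday", "Tuesday")]   # hypothetical name
by_position <- poker_vector[1:2]                  # hypothetical name

# Here both selections contain the same named values.
print(identical(by_name, by_position))            # TRUE
```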
- Select the first three elements in poker_vector by using their names: "Monday", "Tuesday" and "Wednesday". Assign the result of the selection to poker_start.
- Calculate the average of the values in poker_start with the mean() function. Simply print out the result so you can inspect it.
# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector
# Select poker results for Monday, Tuesday and Wednesday
poker_start <- poker_vector[c("Monday","Tuesday","Wednesday")]
# Calculate the average of the elements in poker_start directly with the built-in function mean()
mean(poker_start)
# Note: key point to understand
console:
> # Poker and roulette winnings from Monday to Friday:
> poker_vector <- c(140, -50, 20, -120, 240)
> roulette_vector <- c(-24, -50, 100, -350, 10)
> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
> names(poker_vector) <- days_vector
> names(roulette_vector) <- days_vector
>
> # Select poker results for Monday, Tuesday and Wednesday
> poker_start <- poker_vector[c("Monday","Tuesday","Wednesday")]
>
> # Calculate the average of the elements in poker_start
> mean(poker_start)
[1] 36.66667
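The printed average can be checked by hand: (140 - 50 + 20) / 3 = 110 / 3, which is about 36.66667. The sketch below does the same arithmetic in R; manual_mean is a hypothetical name used only here.

```r
# The first three poker results, as selected in the exercise.
poker_start <- c(Monday = 140, Tuesday = -50, Wednesday = 20)

# The average is the sum of the values divided by how many there are.
manual_mean <- sum(poker_start) / length(poker_start)   # 110 / 3
print(manual_mean)

# mean() gives the same value in one call.
print(mean(poker_start))
```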
Selection by comparison - Step 1
By making use of comparison operators, we can approach the previous question in a more proactive way.
The (logical) comparison operators known to R are:
- < for less than
- > for greater than
- <= for less than or equal to
- >= for greater than or equal to
- == for equal to each other
- != for not equal to each other
As seen in the previous chapter, stating 6 > 5 returns TRUE. The nice thing about R is that you can use these comparison operators also on vectors. For example:
> c(4, 5, 6) > 5
[1] FALSE FALSE TRUE
This tests every element of the vector against the condition, and the result for each element is TRUE or FALSE.
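The element-by-element behaviour can be tried with several of the operators listed above. This is only a sketch on the same three numbers; the variable name values is an assumption for the example.

```r
# The three numbers from the example above; 'values' is a hypothetical name.
values <- c(4, 5, 6)

# Each comparison is evaluated separately for every element.
greater   <- values > 5    # FALSE FALSE  TRUE
at_least  <- values >= 5   # FALSE  TRUE  TRUE
equal_to  <- values == 5   # FALSE  TRUE FALSE
different <- values != 5   #  TRUE FALSE  TRUE

print(greater)
print(at_least)
print(equal_to)
print(different)
```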
- Check which elements in poker_vector are positive (i.e. > 0) and assign this to selection_vector.
- Print out selection_vector so you can inspect it. The printout tells you whether you won (TRUE) or lost (FALSE) any money for each day.
# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector
# Which days did you make money on poker?
selection_vector <- poker_vector[c(1,2,3,4,5)] > 0
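# Note: key point to understand
# My first attempt was
#   selection_vector <- poker_vector[c(1,2,3,4,5) > 0]
# and it went wrong: the comparison was applied to the positions (think of them as indices),
# not to the elements of the vector, so the winnings themselves were never compared with 0.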
# Print out selection_vector
selection_vector
console:
> # Poker and roulette winnings from Monday to Friday:
> poker_vector <- c(140, -50, 20, -120, 240)
> roulette_vector <- c(-24, -50, 100, -350, 10)
> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
> names(poker_vector) <- days_vector
> names(roulette_vector) <- days_vector
>
> # Which days did you make money on poker?
> selection_vector <- poker_vector[c(1,2,3,4,5)]>0
>
> # Print out selection_vector
> selection_vector
Monday Tuesday Wednesday Thursday Friday
TRUE FALSE TRUE FALSE TRUE
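The difference between comparing the positions and comparing the elements can be made visible with a short sketch. It assumes the same poker_vector as above; index_test and wrong_result are hypothetical names.

```r
# Poker winnings from Monday to Friday, named by day (values from the exercise).
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# With the comparison inside the brackets, the positions 1 to 5 are compared with 0.
index_test <- c(1, 2, 3, 4, 5) > 0
print(index_test)                      # every position is greater than 0

# Indexing with an all-TRUE vector returns every element: numbers, not TRUE/FALSE.
wrong_result <- poker_vector[index_test]
print(wrong_result)

# Comparing the winnings themselves gives the intended logical vector.
selection_vector <- poker_vector > 0
print(selection_vector)
```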
Selection by comparison - Step 2
Working with comparisons will make your data analytical life easier. Instead of selecting a subset of days to investigate yourself (like before), you can simply ask R to return only those days where you realized a positive return for poker.
In the previous exercises you used selection_vector <- poker_vector > 0 to find the days on which you had a positive poker return. Now, you would like to know not only the days on which you won, but also how much you won on those days.
You can select the desired elements by putting selection_vector between the square brackets that follow poker_vector:
poker_vector[selection_vector]
R then selects only the elements that correspond to TRUE in selection_vector.
- Use selection_vector in square brackets to assign the amounts that you won on the profitable days to the variable poker_winning_days.
# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector
# Which days did you make money on poker?
selection_vector <- poker_vector > 0
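# Selects the profitable days: every element of poker_vector is compared with 0
# Select from poker_vector these days
# Note: key point to understand
poker_winning_days <- poker_vector[selection_vector]
# Assigns the winnings of the profitable days to poker_winning_days
poker_winning_days
# Print the result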
console:
> # Poker and roulette winnings from Monday to Friday:
> poker_vector <- c(140, -50, 20, -120, 240)
> roulette_vector <- c(-24, -50, 100, -350, 10)
> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
> names(poker_vector) <- days_vector
> names(roulette_vector) <- days_vector
>
> # Which days did you make money on poker?
> selection_vector <- poker_vector > 0
>
> # Select from poker_vector these days
> poker_winning_days <- poker_vector[selection_vector]
> poker_winning_days
Monday Wednesday Friday
140 20 240
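The two steps above can also be written in one line by placing the comparison directly inside the brackets. The sketch below is one possible shorthand, not the exercise's required form; the name inline_selection is hypothetical.

```r
# Poker winnings from Monday to Friday, named by day (values from the exercise).
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Two steps: build the logical vector, then use it to select.
selection_vector <- poker_vector > 0
poker_winning_days <- poker_vector[selection_vector]

# One step: elements at TRUE positions are kept, those at FALSE positions are dropped.
inline_selection <- poker_vector[poker_vector > 0]   # hypothetical name
print(identical(poker_winning_days, inline_selection))   # TRUE
```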
Advanced selection
- Create the variable selection_vector, this time to see if you made a profit with roulette on the different days.
- Assign the amounts that you made on the days that you ended positively for roulette to the variable roulette_winning_days. This vector thus contains the positive winnings of roulette_vector.
# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector
# Which days did you make money on roulette?
selection_vector <- roulette_vector > 0
# Select from roulette_vector these days
roulette_winning_days <- roulette_vector[selection_vector]
console:
> # Poker and roulette winnings from Monday to Friday:
> poker_vector <- c(140, -50, 20, -120, 240)
> roulette_vector <- c(-24, -50, 100, -350, 10)
> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
> names(poker_vector) <- days_vector
> names(roulette_vector) <- days_vector
>
> # Which days did you make money on roulette?
> selection_vector <- roulette_vector > 0
>
> # Select from roulette_vector these days
> roulette_winning_days <- roulette_vector[selection_vector]
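The console session above does not print the result. As a small sketch that reuses the exercise's values, printing roulette_winning_days shows which days ended positively:

```r
# Roulette winnings from Monday to Friday, named by day (values from the exercise).
roulette_vector <- c(-24, -50, 100, -350, 10)
names(roulette_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Which days did you make money on roulette?
selection_vector <- roulette_vector > 0

# Keep only the winnings on those days and print them.
roulette_winning_days <- roulette_vector[selection_vector]
print(roulette_winning_days)   # Wednesday 100, Friday 10
```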
That's another chapter finished.