R_Studio(关联)对dvdtrans.csv数据进行关联规则分析

时间:2024-11-05 23:36:33

  dvdtrans.csv数据:该原始数据仅仅包含了两个字段(ID, Item) 用户ID,商品名称(共30条)

  R_Studio(关联)对dvdtrans.csv数据进行关联规则分析

  

#导入arules包
#install.packages("arules")
library (arules) setwd('D:\\data')
Gary=read.csv(file="dvdtrans.csv",header=T) # 将数据转换为arules关联规则方法apriori 可以处理的数据形式.交易数据
# transactions "事务"
Gary<- as(split(Gary$Item, Gary$ID),"transactions") # 查看一下数据
#attributes(Gary)
summary(Gary) # 使用apriori函数生成关联规则
rules <- apriori(Gary, parameter=list(support=0.3,confidence=0.5)) # 查看一下数据
inspect(rules)

Gary.R

实现过程

  导入arules包

  对数据进行预处理

#导入arules包
#install.packages("arules")
library (arules) setwd('D:\\data')
Gary=read.csv(file="dvdtrans.csv",header=T) # 将数据转换为arules关联规则方法apriori 可以处理的数据形式.交易数据
# transactions "事务"
Gary<- as(split(Gary$Item, Gary$ID),"transactions")
> # 查看一下数据
> #attributes(Gary)
> summary(Gary)
transactions as itemMatrix in sparse format with
10 rows (elements/itemsets/transactions) and            10行(元素/项集/事务)
10 columns (items) and a density of 0.3                10列(项)和0.3的密度 most frequent items:                           最常见的项目(频率):
Gladiator Patriot Sixth Sense Green Mile Harry Potter1 (Other)
7 6 6 2 2 7 element (itemset/transaction) length distribution:          元素(项集/事务)长度分布:
sizes
2 3 4 5
3 5 1 1 Min. 1st Qu. Median Mean 3rd Qu. Max.
2.00 2.25 3.00 3.00 3.00 5.00 includes extended item information - examples:
labels
1 Braveheart
2 Gladiator
3 Green Mile includes extended transaction information - examples:
transactionID
1 1
2 2
3 3

  生成关联规则

> # 使用apriori函数生成关联规则
> rules <- apriori(Gary, parameter=list(support=0.3,confidence=0.5))
Apriori Parameter specification:
confidence minval smax arem aval originalSupport maxtime support minlen maxlen target ext
0.5 0.1 1 none FALSE TRUE 5 0.3 1 10 rules FALSE Algorithmic control:
filter tree heap memopt load sort verbose
0.1 TRUE TRUE FALSE TRUE 2 TRUE Absolute minimum support count: 3 set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[10 item(s), 10 transaction(s)] done [0.00s].
sorting and recoding items ... [3 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 done [0.00s].
writing ... [12 rule(s)] done [0.00s].
creating S4 object ... done [0.00s].
>
> # 查看一下数据
> inspect(rules)
lhs rhs support confidence lift count
[1] {} => {Patriot} 0.6 0.6000000 1.000000 6
[2] {} => {Sixth Sense} 0.6 0.6000000 1.000000 6
[3] {} => {Gladiator} 0.7 0.7000000 1.000000 7
[4] {Patriot} => {Sixth Sense} 0.4 0.6666667 1.111111 4
[5] {Sixth Sense} => {Patriot} 0.4 0.6666667 1.111111 4
[6] {Patriot} => {Gladiator} 0.6 1.0000000 1.428571 6
[7] {Gladiator} => {Patriot} 0.6 0.8571429 1.428571 6
[8] {Sixth Sense} => {Gladiator} 0.5 0.8333333 1.190476 5
[9] {Gladiator} => {Sixth Sense} 0.5 0.7142857 1.190476 5
[10] {Patriot,Sixth Sense} => {Gladiator} 0.4 1.0000000 1.428571 4
[11] {Gladiator,Patriot} => {Sixth Sense} 0.4 0.6666667 1.111111 4
[12] {Gladiator,Sixth Sense} => {Patriot} 0.4 0.8000000 1.333333 4