Data description

1. This unit covers market basket association analysis only.

2. The data set has 786 records and 15 fields.

[Set up the required libraries and load the data]

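# Set the working directory to the folder that contains Shopping.txt
setwd("/home/m600/Working Area/Rdata Practice/Customer Course/shopping list")

library(arules)  # provides apriori() and inspect()
# Read the comma-separated file; the first line holds the column names
shopping <- read.table("./Shopping.txt", header = TRUE, sep = ",")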

[Part 1].Data-ETL

1-1. Get an overview of the data set

head(shopping)
##   Ready.made Frozen.foods Alcohol Fresh.Vegetables Milk Bakery.goods
## 1          1            0       0                0    0            0
## 2          1            0       0                0    0            0
## 3          1            0       0                0    0            0
## 4          1            0       0                0    1            1
## 5          1            0       0                0    0            0
## 6          1            0       0                0    0            1
##   Fresh.meat Toiletries Snacks Tinned.Goods GENDER      Age   MARITAL
## 1          0          0      1            0 Female 18 to 30   Widowed
## 2          0          1      0            0 Female 18 to 30 Separated
## 3          0          1      1            0   Male 18 to 30    Single
## 4          0          0      0            0 Female 18 to 30   Widowed
## 5          0          0      0            0 Female 18 to 30 Separated
## 6          0          0      1            1   Male 18 to 30    Single
##   CHILDREN WORKING
## 1       No     Yes
## 2       No     Yes
## 3       No     Yes
## 4       No     Yes
## 5       No     Yes
## 6       No      No
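# Keep only the 10 product columns and drop the demographic fields
shopping <- shopping[, 1:10]
# Drop rows that contain missing values
shopping <- na.exclude(shopping)

Example counts for the rule {Milk, Frozen foods} => {Bakery goods}:
  • 786 records in total
  • 85 records bought both Milk and Frozen foods
  • 337 records bought Bakery goods
  • 71 records bought Milk, Frozen foods, and Bakery goods
  • 14 records bought Milk and Frozen foods but not Bakery goods
  • Consequent (rhs in R): Bakery goods
  • Antecedent (lhs in R): Milk and Frozen foods
  • Instances: 85, the number of records that match the antecedent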
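
The counts above are enough to work out the three usual rule measures by hand. The following is a minimal sketch in base R; the variable names are illustrative only and do not come from the original script.

```r
# Counts taken from the list above
n_total <- 786  # all records
n_lhs   <- 85   # bought Milk and Frozen foods (antecedent)
n_rhs   <- 337  # bought Bakery goods (consequent)
n_both  <- 71   # bought Milk, Frozen foods, and Bakery goods

# Support: share of all records that match the whole rule
support <- n_both / n_total
# Confidence: share of antecedent records that also match the consequent
confidence <- n_both / n_lhs
# Lift: confidence relative to the baseline rate of the consequent
lift <- confidence / (n_rhs / n_total)

print(round(c(support = support, confidence = confidence, lift = lift), 4))
```

With these counts the support is about 0.09, the confidence about 0.84, and the lift about 1.95. A lift above 1 suggests that buyers of Milk and Frozen foods pick up Bakery goods more often than shoppers in general.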

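1-2. Convert to a matrix

# apriori() coerces this 0/1 matrix to transactions (one row per basket, one column per item)
shopping <- as.matrix(shopping)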
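
To see what a 0/1 matrix looks like once it is treated as transactions, here is a small sketch on made-up data. The `toy` matrix and its item names are hypothetical and are not part of Shopping.txt; the explicit `as(..., "transactions")` coercion is shown for illustration and is not a step in the original script.

```r
library(arules)

# Hypothetical 0/1 purchase matrix: 4 baskets (rows) x 3 items (columns)
toy <- matrix(c(1, 0, 1,
                1, 1, 0,
                0, 1, 0,
                1, 1, 1),
              ncol = 3, byrow = TRUE,
              dimnames = list(NULL, c("Milk", "Alcohol", "Snacks")))

# Turn the logical incidence matrix into a transactions object
toy_trans <- as(toy == 1, "transactions")

# Share of baskets that contain each item
toy_freq <- itemFrequency(toy_trans)
print(toy_freq)
```

Each row becomes one basket and each column with a 1 becomes an item in that basket, which is the form the rule mining step works on.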

[Part 2].Apriori analysis

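# Keep rules with support >= 0.2, confidence >= 0.5, at most 5 items, and only Alcohol on the rhs
rule <- apriori(shopping, parameter = list(supp = 0.2, conf = 0.5, maxlen = 5), appearance = list(rhs = "Alcohol", default = "lhs"))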
## Apriori
## 
## Parameter specification:
##  confidence minval smax arem  aval originalSupport support minlen maxlen
##         0.5    0.1    1 none FALSE            TRUE     0.2      1      5
##  target   ext
##   rules FALSE
## 
## Algorithmic control:
##  filter tree heap memopt load sort verbose
##     0.1 TRUE TRUE  FALSE TRUE    2    TRUE
## 
## Absolute minimum support count: 157 
## 
## set item appearances ...[1 item(s)] done [0.00s].
## set transactions ...[10 item(s), 786 transaction(s)] done [0.00s].
## sorting and recoding items ... [6 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 done [0.00s].
## writing ... [2 rule(s)] done [0.00s].
## creating S4 object  ... done [0.00s].
# Show up to 10 rules, highest support first
inspect(head(sort(rule, by = "support"), 10))
##   lhs               rhs       support   confidence lift    
## 1 {Frozen.foods} => {Alcohol} 0.2302799 0.5727848  1.452287
## 2 {Bakery.goods} => {Alcohol} 0.2150127 0.5014837  1.271504
# Show up to 10 rules, highest confidence first
inspect(head(sort(rule, by = "confidence"), 10))
##   lhs               rhs       support   confidence lift    
## 1 {Frozen.foods} => {Alcohol} 0.2302799 0.5727848  1.452287
## 2 {Bakery.goods} => {Alcohol} 0.2150127 0.5014837  1.271504
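
The reported measures can be cross-checked against the record counts. The sketch below, in base R, recovers the underlying counts from the two rules printed above; the numbers are copied from that output, the variable names are illustrative, and the recovered counts are inferred from the rounded values, not read from the data.

```r
n <- 786  # number of transactions, from the apriori() log

# Measures copied from the inspect() output, keyed by the lhs item
support    <- c(Frozen.foods = 0.2302799, Bakery.goods = 0.2150127)
confidence <- c(Frozen.foods = 0.5727848, Bakery.goods = 0.5014837)
lift       <- c(Frozen.foods = 1.452287,  Bakery.goods = 1.271504)

# support = count(lhs and rhs) / n
both_count <- round(support * n)
# confidence = support / support(lhs), so support(lhs) = support / confidence
lhs_count <- round(support / confidence * n)
# lift = confidence / support(rhs), so support(rhs) = confidence / lift
alcohol_share <- confidence / lift

print(both_count)
print(lhs_count)
print(round(alcohol_share, 4))
```

Both rules imply the same share of Alcohol buyers (about 0.394), as they should, and the implied count for Bakery.goods (337) agrees with the figure given in Part 1. Because both lifts are above 1, buyers of either item purchase Alcohol more often than the overall rate.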