Decision Tree

This is 류동균's R study room.


R쟁이 2019. 9. 20. 23:46

A decision tree classifies data by making a sequence of decisions at its nodes, branching out the way a tree grows branches.
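To make "a decision at a node" concrete, here is a minimal sketch of how one candidate split can be scored. It assumes Gini impurity as the criterion (the default for classification in rpart). The helper names gini() and split_gini() are hypothetical and do not come from any package.

```r
# Gini impurity of a vector of class labels: 1 - sum(p_k^2)
gini <- function(y) {
  p <- table(y) / length(y)
  1 - sum(p^2)
}

# weighted impurity of the two children after splitting on x < threshold
# (hypothetical helper, for illustration only)
split_gini <- function(x, y, threshold) {
  left <- y[x < threshold]
  right <- y[x >= threshold]
  (length(left) * gini(left) + length(right) * gini(right)) / length(y)
}

# impurity of the whole iris data before any split: 2/3
root_impurity <- gini(iris$Species)

# impurity after splitting on Petal.Length < 2.45: 1/3
after_split <- split_gini(iris$Petal.Length, iris$Species, 2.45)
```

The lower the weighted impurity after the split, the better the split separates the classes. A tree learner tries many such thresholds and keeps the best one.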

Let's use a decision tree to classify Species in the iris data.

## data set

# required packages
library(rpart)
library(rpart.plot)
library(caret)
library(e1071)

# create the data
df <- iris

# explore the data
plot(iris)
str(iris)
summary(iris)
sum(is.na(iris))

# train/test sampling (70% / 30%)
# no seed is set, so the split and the results below change from run to run
training_sampling <- sort(sample(1:nrow(df), nrow(df) * 0.7))
test_sampling <- setdiff(1:nrow(df), training_sampling)

# training_set, test_set
training_set <- df[training_sampling, ]
test_set <- df[test_sampling, ]
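A simple random split like this one does not guarantee the same class proportions in both sets. As a hedged alternative, here is a sketch of a stratified 70/30 split in base R. The seed value and the names strat_idx, strat_train, and strat_test are assumptions for illustration and are not from the original post.

```r
set.seed(42)  # assumed seed, only to make the sketch reproducible
df <- iris

# sample 70% of the row indices within each Species level
strat_idx <- sort(unlist(lapply(
  split(seq_len(nrow(df)), df$Species),
  function(idx) sample(idx, size = round(length(idx) * 0.7))
)))

strat_train <- df[strat_idx, ]
strat_test <- df[-strat_idx, ]

table(strat_train$Species)  # 35 rows per species
table(strat_test$Species)   # 15 rows per species
```

Because iris has exactly 50 rows per species, each class contributes 35 training rows and 15 test rows.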

Now that training_set and test_set are ready, let's fit a model on training_set.

# fit the model with rpart()
rpart_m <- rpart(Species ~ ., data = training_set)

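Running rpart_m prints the fitted tree. It shows that observations with Petal.Length less than 2.45 are classified as setosa, and that the remaining observations are then split into the other species by further criteria.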
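The root split can also be read off the fitted object programmatically. This is a sketch that, as an assumption, fits on the full iris data rather than on training_set, so that it runs on its own and does not depend on a random split. The name fit_all is hypothetical.

```r
library(rpart)

# assumption: fit on all 150 rows instead of a random training sample
fit_all <- rpart(Species ~ ., data = iris)

print(fit_all)  # text listing of the nodes and their split rules

# the first row of the frame is the root node; var holds its split variable
root_var <- as.character(fit_all$frame$var[1])

# the first row of the splits matrix is the root's primary split;
# the "index" column holds the cut point
root_cut <- fit_all$splits[1, "index"]

root_var  # "Petal.Length"
root_cut  # 2.45
```

With a random 70% sample the cut point can differ slightly from 2.45, since it depends on which rows end up in the training set.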

Next, let's visualize the model as a decision tree, starting with the basic plot().

# visualize the rpart model (plot and text methods from the rpart package)
plot(rpart_m, margin = .2)
text(rpart_m, cex = 1.5)

Now let's draw it with prp() from the rpart.plot package.

# using the rpart.plot package
prp(rpart_m, type = 4, extra = 2, digits = 3)

The result is cleaner. A plot is clearly easier to understand at a glance than the text output.

Checking the model

# fitted values: predictions on the training set itself
rpart_f <- predict(rpart_m, type = "class")
table(rpart_f, training_set$Species)
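Besides type = "class", predict() on an rpart model also accepts type = "prob", which returns the class proportions of the leaf each observation falls into. The sketch below assumes a fit on the full iris data so that it runs on its own; fit_all, prob, cls, and max_col are hypothetical names.

```r
library(rpart)

# assumption: fit on all 150 rows so the sketch is self-contained
fit_all <- rpart(Species ~ ., data = iris)

# one row per observation, one column per class
prob <- predict(fit_all, type = "prob")

# predicted class labels for the same observations
cls <- predict(fit_all, type = "class")

# the predicted class is the column with the highest probability
max_col <- colnames(prob)[max.col(prob, ties.method = "first")]
all(max_col == as.character(cls))
```

Looking at the probabilities shows how pure each leaf is, which the class labels alone hide.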

Now let's apply the model to test_set and look at the predictions and the accuracy.

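# predict the class of each test observation
rpart_p <- predict(rpart_m, newdata = test_set, type = "class")

# accuracy: share of test observations predicted correctly
sum(rpart_p == test_set$Species) / nrow(test_set)

# confusion table: predicted (rows) vs. actual (columns)
table(rpart_p, test_set$Species)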

The accuracy came out to 0.933333. In test_set, setosa and virginica were all classified correctly, while 3 versicolor observations were misclassified as virginica.
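The same confusion table can also give per-class figures, not only the overall accuracy. The sketch below is self-contained and uses an assumed seed, so its numbers will not match the 0.933333 reported above. The names idx, fit, pred, cm, accuracy, and recall are hypothetical.

```r
library(rpart)

set.seed(1)  # assumed seed, only to make the sketch reproducible
idx <- sample(nrow(iris), 105)  # 70% of the 150 rows for training

fit <- rpart(Species ~ ., data = iris[idx, ])
pred <- predict(fit, newdata = iris[-idx, ], type = "class")

# confusion table: predicted class in rows, actual class in columns
cm <- table(predicted = pred, actual = iris$Species[-idx])

# overall accuracy: correct predictions on the diagonal over all predictions
accuracy <- sum(diag(cm)) / sum(cm)

# per-class recall: correct predictions over the actual count of each class
recall <- diag(cm) / colSums(cm)

cm
accuracy
recall
```

The caret package loaded at the top of the post also offers confusionMatrix(), which reports these and other statistics in one call.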
