Monday, June 5, 2017

Classification - K-NN ( K Nearest Neighbor )

Using K-NN (K Nearest Neighbour classification) to detect breast cancer on the basis of the Wisconsin Diagnostic Breast Cancer sample data provided by the UCI Machine Learning Repository.

Step 1 : Data collection

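# Read the data from the UCI repository (the file has no header row)
# stringsAsFactors = TRUE keeps the diagnosis column a factor on R >= 4.0
http.path <- "https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data"
cancer <- read.csv(http.path, header = FALSE, stringsAsFactors = TRUE)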

Step 2 : Prepare and explore data

# Remove the 1st column (patient ID); it carries no information useful for classification
cancer <- cancer[-1]
# Check for missing values and the distribution of the data
summary(cancer)
##  V2            V3               V4              V5        
##  B:357   Min.   : 6.981   Min.   : 9.71   Min.   : 43.79  
##  M:212   1st Qu.:11.700   1st Qu.:16.17   1st Qu.: 75.17  
##          Median :13.370   Median :18.84   Median : 86.24  
##          Mean   :14.127   Mean   :19.29   Mean   : 91.97  
##          3rd Qu.:15.780   3rd Qu.:21.80   3rd Qu.:104.10  
##          Max.   :28.110   Max.   :39.28   Max.   :188.50  
##        V6               V7                V8                V9         
##  Min.   : 143.5   Min.   :0.05263   Min.   :0.01938   Min.   :0.00000  
##  1st Qu.: 420.3   1st Qu.:0.08637   1st Qu.:0.06492   1st Qu.:0.02956  
##  Median : 551.1   Median :0.09587   Median :0.09263   Median :0.06154  
##  Mean   : 654.9   Mean   :0.09636   Mean   :0.10434   Mean   :0.08880  
##  3rd Qu.: 782.7   3rd Qu.:0.10530   3rd Qu.:0.13040   3rd Qu.:0.13070  
##  Max.   :2501.0   Max.   :0.16340   Max.   :0.34540   Max.   :0.42680  
##       V10               V11              V12               V13        
##  Min.   :0.00000   Min.   :0.1060   Min.   :0.04996   Min.   :0.1115  
##  1st Qu.:0.02031   1st Qu.:0.1619   1st Qu.:0.05770   1st Qu.:0.2324  
##  Median :0.03350   Median :0.1792   Median :0.06154   Median :0.3242  
##  Mean   :0.04892   Mean   :0.1812   Mean   :0.06280   Mean   :0.4052  
##  3rd Qu.:0.07400   3rd Qu.:0.1957   3rd Qu.:0.06612   3rd Qu.:0.4789  
##  Max.   :0.20120   Max.   :0.3040   Max.   :0.09744   Max.   :2.8730  
##       V14              V15              V16               V17          
##  Min.   :0.3602   Min.   : 0.757   Min.   :  6.802   Min.   :0.001713  
##  1st Qu.:0.8339   1st Qu.: 1.606   1st Qu.: 17.850   1st Qu.:0.005169  
##  Median :1.1080   Median : 2.287   Median : 24.530   Median :0.006380  
##  Mean   :1.2169   Mean   : 2.866   Mean   : 40.337   Mean   :0.007041  
##  3rd Qu.:1.4740   3rd Qu.: 3.357   3rd Qu.: 45.190   3rd Qu.:0.008146  
##  Max.   :4.8850   Max.   :21.980   Max.   :542.200   Max.   :0.031130  
##       V18                V19               V20          
##  Min.   :0.002252   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.013080   1st Qu.:0.01509   1st Qu.:0.007638  
##  Median :0.020450   Median :0.02589   Median :0.010930  
##  Mean   :0.025478   Mean   :0.03189   Mean   :0.011796  
##  3rd Qu.:0.032450   3rd Qu.:0.04205   3rd Qu.:0.014710  
##  Max.   :0.135400   Max.   :0.39600   Max.   :0.052790  
##       V21                V22                 V23             V24       
##  Min.   :0.007882   Min.   :0.0008948   Min.   : 7.93   Min.   :12.02  
##  1st Qu.:0.015160   1st Qu.:0.0022480   1st Qu.:13.01   1st Qu.:21.08  
##  Median :0.018730   Median :0.0031870   Median :14.97   Median :25.41  
##  Mean   :0.020542   Mean   :0.0037949   Mean   :16.27   Mean   :25.68  
##  3rd Qu.:0.023480   3rd Qu.:0.0045580   3rd Qu.:18.79   3rd Qu.:29.72  
##  Max.   :0.078950   Max.   :0.0298400   Max.   :36.04   Max.   :49.54  
##       V25              V26              V27               V28         
##  Min.   : 50.41   Min.   : 185.2   Min.   :0.07117   Min.   :0.02729  
##  1st Qu.: 84.11   1st Qu.: 515.3   1st Qu.:0.11660   1st Qu.:0.14720  
##  Median : 97.66   Median : 686.5   Median :0.13130   Median :0.21190  
##  Mean   :107.26   Mean   : 880.6   Mean   :0.13237   Mean   :0.25427  
##  3rd Qu.:125.40   3rd Qu.:1084.0   3rd Qu.:0.14600   3rd Qu.:0.33910  
##  Max.   :251.20   Max.   :4254.0   Max.   :0.22260   Max.   :1.05800  
##       V29              V30               V31              V32         
##  Min.   :0.0000   Min.   :0.00000   Min.   :0.1565   Min.   :0.05504  
##  1st Qu.:0.1145   1st Qu.:0.06493   1st Qu.:0.2504   1st Qu.:0.07146  
##  Median :0.2267   Median :0.09993   Median :0.2822   Median :0.08004  
##  Mean   :0.2722   Mean   :0.11461   Mean   :0.2901   Mean   :0.08395  
##  3rd Qu.:0.3829   3rd Qu.:0.16140   3rd Qu.:0.3179   3rd Qu.:0.09208  
##  Max.   :1.2520   Max.   :0.29100   Max.   :0.6638   Max.   :0.20750

Step 3A : Normalize Numeric data

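Since the features are on different scales, best practice advises normalizing the data by transforming it to a common scale. K-NN is distance-based, so features with large ranges (such as V6, which reaches 2501) would otherwise dominate the distance calculation.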

Using Min-Max Normalization of Numeric Data

normalize <-  function(x){
    return (( x - min(x))/(max(x) - min(x)))
}
cancer_n <- as.data.frame(lapply(cancer[2:31],normalize))
cancer_n$diagnosis <- cancer$V2 
# summary(cancer_n)
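
To make the effect of min-max normalization concrete, here is a minimal sketch that applies the same formula to a small made-up vector. The `toy` values are hypothetical and are not taken from the cancer data.

```r
# Same min-max formula as above, applied to a made-up toy vector
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

toy <- c(10, 20, 30, 50)   # hypothetical values, not from the dataset
toy_n <- normalize(toy)
print(toy_n)               # smallest value maps to 0, largest maps to 1
```

Note that a constant column would give `max(x) - min(x) = 0`, so this formula would return NaN for it; the summary above suggests no feature in this dataset is constant.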

Step 4A : Split data into training and validation set

Use 80% of the data as the training set and the remaining 20% as the validation set.

library(caTools)
set.seed(2017)
SplitRatio <- 0.8
# sample.split keeps the B/M proportions the same in both subsets
split <- sample.split(cancer_n$diagnosis, SplitRatio = SplitRatio)
train.sample_n <- subset(cancer_n, split == TRUE)
valid.sample_n <- subset(cancer_n, split == FALSE)
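
The idea behind a split that preserves class proportions can be sketched in base R. This is an illustration only, not how `caTools::sample.split` is implemented; `stratified_split` is a hypothetical helper, and the labels are rebuilt from the class counts shown in `summary(cancer)` (357 B, 212 M).

```r
# Base-R sketch of a stratified 80/20 split (illustrative only)
set.seed(2017)
labels <- factor(c(rep("B", 357), rep("M", 212)))   # class counts from summary(cancer)

stratified_split <- function(y, ratio) {   # hypothetical helper
  in_train <- logical(length(y))
  for (lev in levels(y)) {
    idx <- which(y == lev)                      # rows belonging to this class
    n_train <- round(length(idx) * ratio)       # how many of them go to training
    in_train[idx[sample.int(length(idx), n_train)]] <- TRUE
  }
  in_train
}

split_demo <- stratified_split(labels, 0.8)
print(table(labels, in_train = split_demo))
```

With these counts the validation side ends up with 71 B and 42 M cases, the same row totals that appear in the evaluation tables of this post.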

Step 5A : Training a KNN Model on training dataset

# Include classification package
library(class)
# Calculate the K value: square root of the number of observations in the training set
k.value <- round(sqrt(nrow(train.sample_n)))
# Predict the cancer data
cancer_predict_n <- knn(
  train = train.sample_n[,1:30], 
  test = valid.sample_n[,1:30], 
  cl = train.sample_n[,31], 
  k = k.value
  )
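
To show what `knn()` does conceptually, here is a minimal from-scratch sketch for a single query point: compute Euclidean distances, take the k closest training rows, and return the majority label. `knn_one` and the toy data are hypothetical and exist only for illustration; the analysis in this post uses `class::knn`.

```r
# Minimal from-scratch k-NN for one query point (illustrative only)
knn_one <- function(train, labels, query, k) {   # hypothetical helper
  # Euclidean distance from the query to every training row
  d <- sqrt(rowSums(sweep(train, 2, query)^2))
  nearest <- order(d)[seq_len(k)]                # indices of the k closest rows
  votes <- table(labels[nearest])                # count labels among the neighbours
  names(votes)[which.max(votes)]                 # majority label (first level wins ties)
}

# Made-up, already normalized toy data
train_toy <- matrix(c(0.1, 0.2,
                      0.2, 0.1,
                      0.9, 0.8,
                      0.8, 0.9,
                      0.7, 0.7), ncol = 2, byrow = TRUE)
labels_toy <- factor(c("B", "B", "M", "M", "M"))

pred <- knn_one(train_toy, labels_toy, query = c(0.75, 0.8), k = 3)
print(pred)
```

One difference to keep in mind: this sketch resolves ties by taking the first factor level, whereas `class::knn` breaks ties at random.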

Step 6A : Evaluate Model Performance

library(gmodels)
CrossTable( x = valid.sample_n[,31] , 
            y = cancer_predict_n, 
            prop.chisq = FALSE,
            addmargins = FALSE
            )
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
## 

All 71 actual benign (B) cases were predicted correctly, but 4 of the 42 malignant (M) cases were predicted as benign, so 75 cases were predicted B. With 4 of 113 cases wrongly classified, the accuracy of the model is (113 - 4)/113 = 96.5%.
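
The arithmetic can be checked directly from the counts in the table. The sketch below rebuilds the confusion matrix by hand; the object names (`conf_n`, `sensitivity_m`, `specificity_b`) are introduced here for illustration only.

```r
# Confusion matrix copied from the CrossTable output above (rows = actual, columns = predicted)
conf_n <- matrix(c(71, 4, 0, 38), nrow = 2,
                 dimnames = list(actual = c("B", "M"), predicted = c("B", "M")))

accuracy      <- sum(diag(conf_n)) / sum(conf_n)          # correctly classified / all cases
sensitivity_m <- conf_n["M", "M"] / sum(conf_n["M", ])    # malignant cases detected
specificity_b <- conf_n["B", "B"] / sum(conf_n["B", ])    # benign cases kept benign

print(round(c(accuracy = accuracy,
              sensitivity = sensitivity_m,
              specificity = specificity_b), 3))
```

For a screening task the 4 missed malignant cases (false negatives) arguably matter more than the overall accuracy, which is why the sketch also reports sensitivity.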

Next, evaluate model performance after transforming the data with z-score standardization.

Step 3B : Z-Score standardization

cancer_z <- as.data.frame(scale(cancer[2:31]))
cancer_z$diagnosis <- cancer$V2 
summary(cancer_z)
##        V3                V4                V5                V6         
##  Min.   :-2.0279   Min.   :-2.2273   Min.   :-1.9828   Min.   :-1.4532  
##  1st Qu.:-0.6888   1st Qu.:-0.7253   1st Qu.:-0.6913   1st Qu.:-0.6666  
##  Median :-0.2149   Median :-0.1045   Median :-0.2358   Median :-0.2949  
##  Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.4690   3rd Qu.: 0.5837   3rd Qu.: 0.4992   3rd Qu.: 0.3632  
##  Max.   : 3.9678   Max.   : 4.6478   Max.   : 3.9726   Max.   : 5.2459  
##        V7                 V8                V9               V10         
##  Min.   :-3.10935   Min.   :-1.6087   Min.   :-1.1139   Min.   :-1.2607  
##  1st Qu.:-0.71034   1st Qu.:-0.7464   1st Qu.:-0.7431   1st Qu.:-0.7373  
##  Median :-0.03486   Median :-0.2217   Median :-0.3419   Median :-0.3974  
##  Mean   : 0.00000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.63564   3rd Qu.: 0.4934   3rd Qu.: 0.5256   3rd Qu.: 0.6464  
##  Max.   : 4.76672   Max.   : 4.5644   Max.   : 4.2399   Max.   : 3.9245  
##       V11                V12               V13               V14         
##  Min.   :-2.74171   Min.   :-1.8183   Min.   :-1.0590   Min.   :-1.5529  
##  1st Qu.:-0.70262   1st Qu.:-0.7220   1st Qu.:-0.6230   1st Qu.:-0.6942  
##  Median :-0.07156   Median :-0.1781   Median :-0.2920   Median :-0.1973  
##  Mean   : 0.00000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.53031   3rd Qu.: 0.4706   3rd Qu.: 0.2659   3rd Qu.: 0.4661  
##  Max.   : 4.48081   Max.   : 4.9066   Max.   : 8.8991   Max.   : 6.6494  
##       V15               V16               V17               V18         
##  Min.   :-1.0431   Min.   :-0.7372   Min.   :-1.7745   Min.   :-1.2970  
##  1st Qu.:-0.6232   1st Qu.:-0.4943   1st Qu.:-0.6235   1st Qu.:-0.6923  
##  Median :-0.2864   Median :-0.3475   Median :-0.2201   Median :-0.2808  
##  Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.2428   3rd Qu.: 0.1067   3rd Qu.: 0.3680   3rd Qu.: 0.3893  
##  Max.   : 9.4537   Max.   :11.0321   Max.   : 8.0229   Max.   : 6.1381  
##       V19               V20               V21               V22         
##  Min.   :-1.0566   Min.   :-1.9118   Min.   :-1.5315   Min.   :-1.0960  
##  1st Qu.:-0.5567   1st Qu.:-0.6739   1st Qu.:-0.6511   1st Qu.:-0.5846  
##  Median :-0.1989   Median :-0.1404   Median :-0.2192   Median :-0.2297  
##  Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.3365   3rd Qu.: 0.4722   3rd Qu.: 0.3554   3rd Qu.: 0.2884  
##  Max.   :12.0621   Max.   : 6.6438   Max.   : 7.0657   Max.   : 9.8429  
##       V23               V24                V25               V26         
##  Min.   :-1.7254   Min.   :-2.22204   Min.   :-1.6919   Min.   :-1.2213  
##  1st Qu.:-0.6743   1st Qu.:-0.74797   1st Qu.:-0.6890   1st Qu.:-0.6416  
##  Median :-0.2688   Median :-0.04348   Median :-0.2857   Median :-0.3409  
##  Mean   : 0.0000   Mean   : 0.00000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.5216   3rd Qu.: 0.65776   3rd Qu.: 0.5398   3rd Qu.: 0.3573  
##  Max.   : 4.0906   Max.   : 3.88249   Max.   : 4.2836   Max.   : 5.9250  
##       V27               V28               V29               V30         
##  Min.   :-2.6803   Min.   :-1.4426   Min.   :-1.3047   Min.   :-1.7435  
##  1st Qu.:-0.6906   1st Qu.:-0.6805   1st Qu.:-0.7558   1st Qu.:-0.7557  
##  Median :-0.0468   Median :-0.2693   Median :-0.2180   Median :-0.2233  
##  Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 0.5970   3rd Qu.: 0.5392   3rd Qu.: 0.5307   3rd Qu.: 0.7119  
##  Max.   : 3.9519   Max.   : 5.1084   Max.   : 4.6965   Max.   : 2.6835  
##       V31               V32          diagnosis
##  Min.   :-2.1591   Min.   :-1.6004   B:357    
##  1st Qu.:-0.6413   1st Qu.:-0.6913   M:212    
##  Median :-0.1273   Median :-0.2163            
##  Mean   : 0.0000   Mean   : 0.0000            
##  3rd Qu.: 0.4497   3rd Qu.: 0.4504            
##  Max.   : 6.0407   Max.   : 6.8408
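
As a quick check of what `scale()` computes, the sketch below compares it with the z-score formula written out by hand, (x - mean) / sd, on a made-up vector. The `toy` values are hypothetical.

```r
# z-score by hand versus scale(), on a made-up vector
toy <- c(2, 4, 4, 4, 5, 5, 7, 9)               # hypothetical values

z_manual <- (toy - mean(toy)) / sd(toy)        # subtract the mean, divide by the sample sd
z_scale  <- as.vector(scale(toy))              # scale() returns a matrix; drop to a vector

print(all.equal(z_manual, z_scale))
print(round(mean(z_scale), 10))                # standardized values are centred on 0
```

Unlike min-max normalization, z-scores are not bounded to [0, 1], which is why the summary above shows maxima as large as 12 for some columns.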

Step 4B : Split data into training and validation set

Use 80% of the data as the training set and the remaining 20% as the validation set.

# library(caTools)
set.seed(2017)
SplitRatio <- 0.8
# Same seed and labels as in Step 4A, so the same rows land in each subset
split <- sample.split(cancer_z$diagnosis, SplitRatio = SplitRatio)
train.sample_z <- subset(cancer_z, split == TRUE)
valid.sample_z <- subset(cancer_z, split == FALSE)

Step 5B : Training a KNN Model on training dataset

k.value <- round(sqrt(nrow(train.sample_z)))
# Predict the cancer data
cancer_predict_z <- knn(
  train = train.sample_z[,1:30], 
  test = valid.sample_z[,1:30], 
  cl = train.sample_z[,31], 
  k = k.value
)

Step 6B : Evaluate Model Performance

CrossTable( 
  x = valid.sample_z[,31] , 
  y = cancer_predict_z, 
  prop.chisq = FALSE,
  addmargins = FALSE)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_z 
## valid.sample_z[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        70 |         1 |        71 | 
##                      |     0.986 |     0.014 |     0.628 | 
##                      |     0.921 |     0.027 |           | 
##                      |     0.619 |     0.009 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         6 |        36 |        42 | 
##                      |     0.143 |     0.857 |     0.372 | 
##                      |     0.079 |     0.973 |           | 
##                      |     0.053 |     0.319 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        76 |        37 |       113 | 
##                      |     0.673 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
## 
## 

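Unfortunately, classification performance declined: the model predicted 7 observations incorrectly (1 benign case as malignant and 6 malignant cases as benign), giving an accuracy of (113 - 7)/113 = 93.8%. Model performance may be improved by changing the K value.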
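
One compact way to compare many K values is to collect a single accuracy number per K instead of reading a full table for each. The sketch below shows the pattern on R's built-in `iris` data as a stand-in, so that it runs without downloading anything; it is not the cancer data, and `accuracy_for_k` is a hypothetical helper.

```r
library(class)   # provides knn(); the same package loaded earlier in this post

set.seed(2017)
data(iris)       # built-in stand-in data set, NOT the cancer data
features  <- as.data.frame(scale(iris[, 1:4]))
train_idx <- sample(nrow(iris), size = 0.8 * nrow(iris))

accuracy_for_k <- function(k) {   # hypothetical helper
  pred <- knn(train = features[train_idx, ],
              test  = features[-train_idx, ],
              cl    = iris$Species[train_idx],
              k     = k)
  mean(pred == iris$Species[-train_idx])   # share of correct predictions
}

k_grid <- 1:30
acc <- sapply(k_grid, accuracy_for_k)
print(data.frame(k = k_grid, accuracy = round(acc, 3)))
```

The same pattern should carry over to `train.sample_n` and `valid.sample_n` by swapping in those data frames. Keep in mind that a K chosen on the validation set is tuned to that set, so the resulting accuracy is likely to be somewhat optimistic.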

Test Alternate K values (1..30)

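# Fit K-NN on the min-max normalized data for a given k and print the cross table
k_test_n <- function(k.val){
  print(paste("Model for K value", k.val))
  cancer_predict_n <- knn(
    train = train.sample_n[,1:30], 
    test = valid.sample_n[,1:30], 
    cl = train.sample_n[,31], 
    k = k.val
  )
  # Named ct rather than t to avoid masking base::t()
  ct <- CrossTable( x = valid.sample_n[,31] , y = cancer_predict_n, prop.chisq = FALSE)
  # Return the raw count table
  ct$t
}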

# K = 1 to 30 ( Normalized transformation )
for ( k.val in 1:30 ){
  k_test_n(k.val)
}
## [1] "Model for K value 1"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        68 |         3 |        71 | 
##                      |     0.958 |     0.042 |     0.628 | 
##                      |     0.958 |     0.071 |           | 
##                      |     0.602 |     0.027 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         3 |        39 |        42 | 
##                      |     0.071 |     0.929 |     0.372 | 
##                      |     0.042 |     0.929 |           | 
##                      |     0.027 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        71 |        42 |       113 | 
##                      |     0.628 |     0.372 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 2"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        69 |         2 |        71 | 
##                      |     0.972 |     0.028 |     0.628 | 
##                      |     0.945 |     0.050 |           | 
##                      |     0.611 |     0.018 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.055 |     0.950 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        73 |        40 |       113 | 
##                      |     0.646 |     0.354 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 3"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        67 |         4 |        71 | 
##                      |     0.944 |     0.056 |     0.628 | 
##                      |     0.957 |     0.093 |           | 
##                      |     0.593 |     0.035 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         3 |        39 |        42 | 
##                      |     0.071 |     0.929 |     0.372 | 
##                      |     0.043 |     0.907 |           | 
##                      |     0.027 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        70 |        43 |       113 | 
##                      |     0.619 |     0.381 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 4"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        66 |         5 |        71 | 
##                      |     0.930 |     0.070 |     0.628 | 
##                      |     0.957 |     0.114 |           | 
##                      |     0.584 |     0.044 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         3 |        39 |        42 | 
##                      |     0.071 |     0.929 |     0.372 | 
##                      |     0.043 |     0.886 |           | 
##                      |     0.027 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        69 |        44 |       113 | 
##                      |     0.611 |     0.389 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 5"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        68 |         3 |        71 | 
##                      |     0.958 |     0.042 |     0.628 | 
##                      |     0.958 |     0.071 |           | 
##                      |     0.602 |     0.027 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         3 |        39 |        42 | 
##                      |     0.071 |     0.929 |     0.372 | 
##                      |     0.042 |     0.929 |           | 
##                      |     0.027 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        71 |        42 |       113 | 
##                      |     0.628 |     0.372 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 6"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        68 |         3 |        71 | 
##                      |     0.958 |     0.042 |     0.628 | 
##                      |     0.958 |     0.071 |           | 
##                      |     0.602 |     0.027 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         3 |        39 |        42 | 
##                      |     0.071 |     0.929 |     0.372 | 
##                      |     0.042 |     0.929 |           | 
##                      |     0.027 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        71 |        42 |       113 | 
##                      |     0.628 |     0.372 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 7"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        69 |         2 |        71 | 
##                      |     0.972 |     0.028 |     0.628 | 
##                      |     0.958 |     0.049 |           | 
##                      |     0.611 |     0.018 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         3 |        39 |        42 | 
##                      |     0.071 |     0.929 |     0.372 | 
##                      |     0.042 |     0.951 |           | 
##                      |     0.027 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        72 |        41 |       113 | 
##                      |     0.637 |     0.363 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 8"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        69 |         2 |        71 | 
##                      |     0.972 |     0.028 |     0.628 | 
##                      |     0.958 |     0.049 |           | 
##                      |     0.611 |     0.018 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         3 |        39 |        42 | 
##                      |     0.071 |     0.929 |     0.372 | 
##                      |     0.042 |     0.951 |           | 
##                      |     0.027 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        72 |        41 |       113 | 
##                      |     0.637 |     0.363 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 9"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        68 |         3 |        71 | 
##                      |     0.958 |     0.042 |     0.628 | 
##                      |     0.958 |     0.071 |           | 
##                      |     0.602 |     0.027 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         3 |        39 |        42 | 
##                      |     0.071 |     0.929 |     0.372 | 
##                      |     0.042 |     0.929 |           | 
##                      |     0.027 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        71 |        42 |       113 | 
##                      |     0.628 |     0.372 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 10"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        70 |         1 |        71 | 
##                      |     0.986 |     0.014 |     0.628 | 
##                      |     0.946 |     0.026 |           | 
##                      |     0.619 |     0.009 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.054 |     0.974 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        74 |        39 |       113 | 
##                      |     0.655 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 11"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        68 |         3 |        71 | 
##                      |     0.958 |     0.042 |     0.628 | 
##                      |     0.958 |     0.071 |           | 
##                      |     0.602 |     0.027 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         3 |        39 |        42 | 
##                      |     0.071 |     0.929 |     0.372 | 
##                      |     0.042 |     0.929 |           | 
##                      |     0.027 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        71 |        42 |       113 | 
##                      |     0.628 |     0.372 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 12"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        69 |         2 |        71 | 
##                      |     0.972 |     0.028 |     0.628 | 
##                      |     0.945 |     0.050 |           | 
##                      |     0.611 |     0.018 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.055 |     0.950 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        73 |        40 |       113 | 
##                      |     0.646 |     0.354 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 13"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        69 |         2 |        71 | 
##                      |     0.972 |     0.028 |     0.628 | 
##                      |     0.945 |     0.050 |           | 
##                      |     0.611 |     0.018 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.055 |     0.950 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        73 |        40 |       113 | 
##                      |     0.646 |     0.354 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 14"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 15"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 16"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        70 |         1 |        71 | 
##                      |     0.986 |     0.014 |     0.628 | 
##                      |     0.946 |     0.026 |           | 
##                      |     0.619 |     0.009 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.054 |     0.974 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        74 |        39 |       113 | 
##                      |     0.655 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 17"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 18"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        70 |         1 |        71 | 
##                      |     0.986 |     0.014 |     0.628 | 
##                      |     0.946 |     0.026 |           | 
##                      |     0.619 |     0.009 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.054 |     0.974 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        74 |        39 |       113 | 
##                      |     0.655 |     0.345 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 19"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 20"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 21"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 22"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 23"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 24"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.934 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         5 |        37 |        42 | 
##                      |     0.119 |     0.881 |     0.372 | 
##                      |     0.066 |     1.000 |           | 
##                      |     0.044 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        76 |        37 |       113 | 
##                      |     0.673 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 25"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 26"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.947 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         4 |        38 |        42 | 
##                      |     0.095 |     0.905 |     0.372 | 
##                      |     0.053 |     1.000 |           | 
##                      |     0.035 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        75 |        38 |       113 | 
##                      |     0.664 |     0.336 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 27"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.934 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         5 |        37 |        42 | 
##                      |     0.119 |     0.881 |     0.372 | 
##                      |     0.066 |     1.000 |           | 
##                      |     0.044 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        76 |        37 |       113 | 
##                      |     0.673 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 28"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.934 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         5 |        37 |        42 | 
##                      |     0.119 |     0.881 |     0.372 | 
##                      |     0.066 |     1.000 |           | 
##                      |     0.044 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        76 |        37 |       113 | 
##                      |     0.673 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 29"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.934 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         5 |        37 |        42 | 
##                      |     0.119 |     0.881 |     0.372 | 
##                      |     0.066 |     1.000 |           | 
##                      |     0.044 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        76 |        37 |       113 | 
##                      |     0.673 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
## 
##  
## [1] "Model for K value 30"
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  113 
## 
##  
##                      | cancer_predict_n 
## valid.sample_n[, 31] |         B |         M | Row Total | 
## ---------------------|-----------|-----------|-----------|
##                    B |        71 |         0 |        71 | 
##                      |     1.000 |     0.000 |     0.628 | 
##                      |     0.934 |     0.000 |           | 
##                      |     0.628 |     0.000 |           | 
## ---------------------|-----------|-----------|-----------|
##                    M |         5 |        37 |        42 | 
##                      |     0.119 |     0.881 |     0.372 | 
##                      |     0.066 |     1.000 |           | 
##                      |     0.044 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
##         Column Total |        76 |        37 |       113 | 
##                      |     0.673 |     0.327 |           | 
## ---------------------|-----------|-----------|-----------|
## 
## 

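K Value test Conclusion : K values 14, 15, 17, 19, 20, 21, 22, 23, 25 and 26 give identical results on the validation sample: 109 of 113 observations are classified correctly (about 96.5%), with 4 malignant cases predicted as benign and no benign cases predicted as malignant.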
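
To compare K values without reading every table by eye, the counts can be reduced to a couple of summary numbers. The snippet below is only a sketch: the counts are typed in by hand from the CrossTable output above for K = 14, 16 and 24, and the helper names (make_table, accuracy, sensitivity_m) are made up for this illustration and are not part of the original script.

```r
# Counts copied by hand from the CrossTable output above.
# Rows are the actual class, columns are the predicted class.
# make_table() is a hypothetical helper, not part of the original script.
make_table <- function(bb, bm, mb, mm) {
  matrix(c(bb, bm, mb, mm), nrow = 2, byrow = TRUE,
         dimnames = list(actual = c("B", "M"), predicted = c("B", "M")))
}

tables <- list(
  "14" = make_table(71, 0, 4, 38),
  "16" = make_table(70, 1, 4, 38),
  "24" = make_table(71, 0, 5, 37)
)

# Accuracy: share of all validation samples that sit on the diagonal.
accuracy <- function(tab) sum(diag(tab)) / sum(tab)

# Sensitivity for malignant: share of actual M samples predicted as M.
sensitivity_m <- function(tab) tab["M", "M"] / sum(tab["M", ])

results <- data.frame(
  k = as.integer(names(tables)),
  accuracy = sapply(tables, accuracy),
  sensitivity_m = sapply(tables, sensitivity_m)
)
print(round(results, 3))
```

For a cancer screen the malignant sensitivity is probably the more important column, since a malignant case predicted as benign is the costlier mistake.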
