The data contain measurements on cells in suspicious breast lumps. Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. All samples are classified as either benign or malignant.

data(wdbc)

Format

wdbc is a data.frame with 569 rows and 31 columns. The first column indicates whether the sample is classified as benign (B) or malignant (M). The remaining columns contain measurements for 30 features.

Details

Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)

The references listed below contain detailed descriptions of how these features are computed.
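
As a small worked illustration of the compactness formula in item f), the sketch below evaluates perimeter^2 / area - 1.0 for two idealised shapes. The helper name `compactness` and the circle and square inputs are assumptions made for this example; they are not part of the data set or of the original feature-extraction software.

```r
# Compactness as defined in item f): perimeter^2 / area - 1.0
compactness <- function(perimeter, area) {
  perimeter^2 / area - 1.0
}

# Hypothetical shapes, used only to illustrate the formula
r <- 2                                   # circle radius
circle <- compactness(perimeter = 2 * pi * r, area = pi * r^2)
s <- 3                                   # square side length
square <- compactness(perimeter = 4 * s, area = s^2)

circle  # equals 4 * pi - 1, whatever the radius
square  # equals 15, whatever the side length
```

The measure does not depend on the size of the shape, and a circle gives the smallest possible value, so larger values indicate a less compact outline.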

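The mean, standard error, and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. In the column names these three groups carry the prefixes mean, sd and worst. Note that in the example output the columns prefixed worst hold values smaller than the corresponding means (which a largest value cannot be), while those prefixed sd hold larger ones; this suggests the two prefixes may be swapped, so check the values before relying on the labels.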
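
The three per-image summaries can be sketched as follows for a single feature. This is only an illustration: the function name `summarise_feature` and the per-nucleus radius values are made up, and the standard error is assumed to be the sample standard deviation divided by the square root of the number of nuclei, which the text above does not spell out.

```r
# Summarise one feature measured on every nucleus in an image.
# Assumption: standard error = sd(x) / sqrt(n).
summarise_feature <- function(x) {
  c(
    mean  = mean(x),
    se    = sd(x) / sqrt(length(x)),
    worst = mean(sort(x, decreasing = TRUE)[1:3])  # mean of the three largest
  )
}

# Hypothetical radii for five nuclei in one image
radius <- c(10, 12, 14, 16, 18)
res <- summarise_feature(radius)
res
```

Applying such a summary to each of the ten features yields the 30 columns per image.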

Note

This breast cancer database was obtained from Dr. William H. Wolberg of the University of Wisconsin Hospitals, Madison.

Source

https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)

Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.

References

O. L. Mangasarian and W. H. Wolberg: "Cancer diagnosis via linear programming",
SIAM News, Volume 23, Number 5, September 1990, pp 1 & 18.

W. H. Wolberg and O. L. Mangasarian: "Multisurface method of pattern separation for medical diagnosis applied to breast cytology",
Proceedings of the National Academy of Sciences, U.S.A., Volume 87, December 1990, pp 9193-9196.

K. P. Bennett & O. L. Mangasarian: "Robust linear programming discrimination of two linearly inseparable sets",
Optimization Methods and Software 1, 1992, 23-34 (Gordon & Breach Science Publishers).

Examples

data(wdbc)
str(wdbc)
#> 'data.frame':	569 obs. of  31 variables:
#>  $ diagnosis              : Factor w/ 2 levels "B","M": 2 2 2 2 2 2 2 2 2 2 ...
#>  $ mean radius            : num  18 20.6 19.7 11.4 20.3 ...
#>  $ mean texture           : num  10.4 17.8 21.2 20.4 14.3 ...
#>  $ mean perimeter         : num  122.8 132.9 130 77.6 135.1 ...
#>  $ mean area              : num  1001 1326 1203 386 1297 ...
#>  $ mean smoothness        : num  0.1184 0.0847 0.1096 0.1425 0.1003 ...
#>  $ mean compactness       : num  0.2776 0.0786 0.1599 0.2839 0.1328 ...
#>  $ mean concavity         : num  0.3001 0.0869 0.1974 0.2414 0.198 ...
#>  $ mean concave points    : num  0.1471 0.0702 0.1279 0.1052 0.1043 ...
#>  $ mean symmetry          : num  0.242 0.181 0.207 0.26 0.181 ...
#>  $ mean fractal dimension : num  0.0787 0.0567 0.06 0.0974 0.0588 ...
#>  $ worst radius           : num  1.095 0.543 0.746 0.496 0.757 ...
#>  $ worst texture          : num  0.905 0.734 0.787 1.156 0.781 ...
#>  $ worst perimeter        : num  8.59 3.4 4.58 3.44 5.44 ...
#>  $ worst area             : num  153.4 74.1 94 27.2 94.4 ...
#>  $ worst smoothness       : num  0.0064 0.00522 0.00615 0.00911 0.01149 ...
#>  $ worst compactness      : num  0.049 0.0131 0.0401 0.0746 0.0246 ...
#>  $ worst concavity        : num  0.0537 0.0186 0.0383 0.0566 0.0569 ...
#>  $ worst concave points   : num  0.0159 0.0134 0.0206 0.0187 0.0188 ...
#>  $ worst symmetry         : num  0.03 0.0139 0.0225 0.0596 0.0176 ...
#>  $ worst fractal dimension: num  0.00619 0.00353 0.00457 0.00921 0.00511 ...
#>  $ sd radius              : num  25.4 25 23.6 14.9 22.5 ...
#>  $ sd texture             : num  17.3 23.4 25.5 26.5 16.7 ...
#>  $ sd perimeter           : num  184.6 158.8 152.5 98.9 152.2 ...
#>  $ sd area                : num  2019 1956 1709 568 1575 ...
#>  $ sd smoothness          : num  0.162 0.124 0.144 0.21 0.137 ...
#>  $ sd compactness         : num  0.666 0.187 0.424 0.866 0.205 ...
#>  $ sd concavity           : num  0.712 0.242 0.45 0.687 0.4 ...
#>  $ sd concave points      : num  0.265 0.186 0.243 0.258 0.163 ...
#>  $ sd symmetry            : num  0.46 0.275 0.361 0.664 0.236 ...
#>  $ sd fractal dimension   : num  0.1189 0.089 0.0876 0.173 0.0768 ...
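
The column names shown above contain spaces, so they need `[[ ]]` indexing or backticks rather than plain `$` access. The sketch below demonstrates this on a small stand-in data frame with made-up values, so that it runs without the package; `toy` and its contents are assumptions, and only the column naming pattern is taken from the output above.

```r
# Stand-in for wdbc with invented values; check.names = FALSE keeps
# the spaces in the column names instead of converting them to dots.
toy <- data.frame(
  diagnosis      = factor(c("M", "B", "B", "M"), levels = c("B", "M")),
  `mean radius`  = c(18, 11, 13, 21),
  `mean texture` = c(10, 20, 16, 18),
  check.names = FALSE
)

# Class counts
counts <- table(toy$diagnosis)

# Names with spaces: use [[ ]] or backticks
toy[["mean radius"]]
toy$`mean radius`

# All columns belonging to the "mean" group
mean_cols <- grep("^mean ", names(toy), value = TRUE)

# Average of one feature per diagnosis
by_dx <- tapply(toy[["mean radius"]], toy$diagnosis, mean)
by_dx
```

The same calls should apply to wdbc itself once it has been loaded with data(wdbc).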