
Commit 1a2d164

added tutorials for UNAM
0 parents  commit 1a2d164

35 files changed: 13,689 additions, 0 deletions

C50/breast-cancer.data

Lines changed: 500 additions & 0 deletions
Large diffs are not rendered by default.

C50/breast-cancer.names

Lines changed: 139 additions & 0 deletions
@@ -0,0 +1,139 @@
+| Citation Request:
+| This breast cancer database was obtained from the University of Wisconsin
+| Hospitals, Madison from Dr. William H. Wolberg. If you publish results
+| when using this database, then please include this information in your
+| acknowledgements. Also, please cite one or more of:
+|
+| 1. O. L. Mangasarian and W. H. Wolberg: "Cancer diagnosis via linear
+| programming", SIAM News, Volume 23, Number 5, September 1990, pp 1 & 18.
+|
+| 2. William H. Wolberg and O.L. Mangasarian: "Multisurface method of
+| pattern separation for medical diagnosis applied to breast cytology",
+| Proceedings of the National Academy of Sciences, U.S.A., Volume 87,
+| December 1990, pp 9193-9196.
+|
+| 3. O. L. Mangasarian, R. Setiono, and W.H. Wolberg: "Pattern recognition
+| via linear programming: Theory and application to medical diagnosis",
+| in: "Large-scale numerical optimization", Thomas F. Coleman and Yuying
+| Li, editors, SIAM Publications, Philadelphia 1990, pp 22-30.
+|
+| 4. K. P. Bennett & O. L. Mangasarian: "Robust linear programming
+| discrimination of two linearly inseparable sets", Optimization Methods
+| and Software 1, 1992, 23-34 (Gordon & Breach Science Publishers).
+|
+| 1. Title: Wisconsin Breast Cancer Database (January 8, 1991)
+|
+| 2. Sources:
+| -- Dr. William H. Wolberg (physician)
+| University of Wisconsin Hospitals
+| Madison, Wisconsin
+| USA
+| -- Donor: Olvi Mangasarian ([email protected])
+| Received by David W. Aha ([email protected])
+| -- Date: 15 July 1992
+|
+| 3. Past Usage:
+|
+| Attributes 2 through 10 have been used to represent instances.
+| Each instance has one of 2 possible classes: benign or malignant.
+|
+| 1. Wolberg,~W.~H., \& Mangasarian,~O.~L. (1990). Multisurface method of
+| pattern separation for medical diagnosis applied to breast cytology. In
+| {\it Proceedings of the National Academy of Sciences}, {\it 87},
+| 9193--9196.
+| -- Size of data set: only 369 instances (at that point in time)
+| -- Collected classification results: 1 trial only
+| -- Two pairs of parallel hyperplanes were found to be consistent with
+| 50% of the data
+| -- Accuracy on remaining 50% of dataset: 93.5%
+| -- Three pairs of parallel hyperplanes were found to be consistent with
+| 67% of data
+| -- Accuracy on remaining 33% of dataset: 95.9%
+|
+| 2. Zhang,~J. (1992). Selecting typical instances in instance-based
+| learning. In {\it Proceedings of the Ninth International Machine
+| Learning Conference} (pp. 470--479). Aberdeen, Scotland: Morgan
+| Kaufmann.
+| -- Size of data set: only 369 instances (at that point in time)
+| -- Applied 4 instance-based learning algorithms
+| -- Collected classification results averaged over 10 trials
+| -- Best accuracy result:
+| -- 1-nearest neighbor: 93.7%
+| -- trained on 200 instances, tested on the other 169
+| -- Also of interest:
+| -- Using only typical instances: 92.2% (storing only 23.1 instances)
+| -- trained on 200 instances, tested on the other 169
+|
+| 4. Relevant Information:
+|
+| Samples arrive periodically as Dr. Wolberg reports his clinical cases.
+| The database therefore reflects this chronological grouping of the data.
+| This grouping information appears immediately below, having been removed
+| from the data itself:
+|
+| Group 1: 367 instances (January 1989)
+| Group 2: 70 instances (October 1989)
+| Group 3: 31 instances (February 1990)
+| Group 4: 17 instances (April 1990)
+| Group 5: 48 instances (August 1990)
+| Group 6: 49 instances (Updated January 1991)
+| Group 7: 31 instances (June 1991)
+| Group 8: 86 instances (November 1991)
+| -----------------------------------------
+| Total: 699 points (as of the donated database on 15 July 1992)
+|
+| Note that the results summarized above in Past Usage refer to a dataset
+| of size 369, while Group 1 has only 367 instances. This is because it
+| originally contained 369 instances; 2 were removed. The following
+| statements summarize changes to the original Group 1's set of data:
+|
+| ##### Group 1 : 367 points: 200B 167M (January 1989)
+| ##### Revised Jan 10, 1991: Replaced zero bare nuclei in 1080185 & 1187805
+| ##### Revised Nov 22,1991: Removed 765878,4,5,9,7,10,10,10,3,8,1 no record
+| ##### : Removed 484201,2,7,8,8,4,3,10,3,4,1 zero epithelial
+| ##### : Changed 0 to 1 in field 6 of sample 1219406
+| ##### : Changed 0 to 1 in field 8 of following sample:
+| ##### : 1182404,2,3,1,1,1,2,0,1,1,1
+|
+| 5. Number of Instances: 699 (as of 15 July 1992)
+|
+| 6. Number of Attributes: 10 plus the class attribute
+|
+| 7. Attribute Information: (class attribute has been moved to last column)
+|
+| #   Attribute                    Domain
+| --  -----------------------------------------
+| 1.  Sample code number           id number
+| 2.  Clump Thickness              1 - 10
+| 3.  Uniformity of Cell Size      1 - 10
+| 4.  Uniformity of Cell Shape     1 - 10
+| 5.  Marginal Adhesion            1 - 10
+| 6.  Single Epithelial Cell Size  1 - 10
+| 7.  Bare Nuclei                  1 - 10
+| 8.  Bland Chromatin              1 - 10
+| 9.  Normal Nucleoli              1 - 10
+| 10. Mitoses                      1 - 10
+| 11. Class:                       (2 for benign, 4 for malignant)
+|
+| 8. Missing attribute values: 16
+|
+| There are 16 instances in Groups 1 to 6 that contain a single missing
+| (i.e., unavailable) attribute value, now denoted by "?".
+|
+| 9. Class distribution:
+|
+| Benign: 458 (65.5%)
+| Malignant: 241 (34.5%)
+
+2, 4
+
+Sample code number: ignore
+Clump Thickness: continuous
+Uniformity of Cell Size: continuous
+Uniformity of Cell Shape: continuous
+Marginal Adhesion: continuous
+Single Epithelial Cell Size: continuous
+Bare Nuclei: continuous
+Bland Chromatin: continuous
+Normal Nucleoli: continuous
+Mitoses: continuous
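
The end of `breast-cancer.names` shows the layout this commit relies on: text after `|` is a comment, the first entry lists the class values, and each later entry has the form `attribute: type`. A minimal Python sketch of reading that layout follows. It assumes only the constructs that appear in this commit; `parse_names` is a hypothetical helper, not part of C5.0, and the real names-file syntax has features this sketch does not handle.

```python
def parse_names(text):
    """Parse a simplified C5.0-style .names text into (classes, attributes)."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("|", 1)[0].strip()   # drop '|' comments
        if line.endswith("."):
            line = line[:-1].rstrip()          # trailing period is optional here
        if line:
            entries.append(line)
    if not entries:
        raise ValueError("no entries found")

    # First entry: comma-separated class values.
    classes = [c.strip() for c in entries[0].split(",")]

    # Remaining entries: "name: continuous", "name: ignore",
    # or "name: value1, value2, ..." for discrete attributes.
    attributes = {}
    for entry in entries[1:]:
        name, _, spec = entry.partition(":")
        spec = spec.strip()
        if spec in ("continuous", "ignore"):
            attributes[name.strip()] = spec
        else:
            attributes[name.strip()] = [v.strip() for v in spec.split(",")]
    return classes, attributes


sample = """\
| comment lines start with a vertical bar
2, 4

Sample code number: ignore
Clump Thickness: continuous
Bare Nuclei: continuous
"""
classes, attributes = parse_names(sample)
print(classes)
print(attributes)
```

The sketch keeps class values as strings (`"2"`, `"4"`) because the names file treats them as labels rather than numbers.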

C50/breast-cancer.test

Lines changed: 199 additions & 0 deletions
@@ -0,0 +1,199 @@
+1133991,4,1,1,1,1,1,2,1,1,2
+1285531,1,1,1,1,2,1,3,1,1,2
+167528,4,1,1,1,2,1,3,6,1,2
+1213273,2,1,1,1,2,1,1,1,1,2
+1212422,3,1,1,1,2,1,3,1,1,2
+1243256,10,4,3,2,3,10,5,3,2,4
+1299924,5,1,1,1,2,1,2,1,1,2
+274137,8,8,9,4,5,10,7,8,1,4
+1258549,9,10,10,10,10,10,10,10,1,4
+566346,3,1,1,1,2,1,2,3,1,2
+850831,2,7,10,10,7,10,4,9,4,4
+1344121,8,10,4,4,8,10,8,2,1,4
+1049815,4,1,1,1,2,1,3,1,1,2
+536708,1,1,1,1,2,1,1,1,1,2
+1182404,5,1,4,1,2,1,3,2,1,2
+1143978,5,2,1,1,2,1,3,1,1,2
+1223793,6,10,7,7,6,4,8,10,2,4
+1331405,4,1,1,1,2,1,3,2,1,2
+1193683,1,1,2,1,3,?,1,1,1,2
+1298484,10,3,4,5,3,10,4,1,1,4
+1345452,1,1,3,1,2,1,2,1,1,2
+434518,3,1,1,1,2,1,2,1,1,2
+1184184,1,1,1,1,2,5,1,1,1,2
+1208301,1,2,3,1,2,1,3,1,1,2
+1199219,1,1,1,2,1,1,1,1,1,2
+666942,1,1,1,1,2,1,3,1,1,2
+1217952,4,1,1,1,2,1,2,1,1,2
+1061990,4,1,1,1,2,1,2,1,1,2
+1074610,2,1,1,2,2,1,3,1,1,2
+1211594,3,1,1,1,1,1,2,1,1,2
+1002945,5,4,4,5,7,10,3,2,1,2
+752904,10,1,1,1,2,10,5,4,1,4
+1287282,3,1,1,1,2,1,1,1,1,2
+1238777,1,1,1,1,2,1,1,1,1,2
+1070935,3,1,1,1,1,1,2,1,1,2
+1168736,10,10,10,10,10,1,8,8,8,4
+1201936,5,10,10,3,8,1,5,10,3,4
+1183516,3,1,1,1,2,1,1,1,1,2
+1213375,8,4,4,5,4,7,7,8,2,2
+1035283,1,1,1,1,1,1,3,1,1,2
+764974,5,1,1,1,2,1,3,1,2,2
+803531,5,10,10,10,5,2,8,5,1,4
+1218982,4,1,1,1,2,1,1,1,1,2
+1321348,2,1,1,1,2,1,2,1,1,2
+1241232,3,1,4,1,2,?,3,1,1,2
+1334015,7,8,8,7,3,10,7,2,3,4
+474162,8,7,8,5,5,10,9,10,1,4
+792744,1,1,1,1,2,1,1,1,1,2
+836433,5,1,1,3,2,1,1,1,1,2
+1227481,10,5,7,4,4,10,8,9,1,4
+1339781,4,1,1,1,2,1,3,1,1,2
+1222464,6,10,10,10,4,10,7,10,1,4
+1017023,6,3,3,5,3,10,3,5,3,2
+1120559,8,3,8,3,4,9,8,9,8,4
+486283,3,1,1,1,2,1,3,1,1,2
+1200772,1,1,1,1,2,1,2,1,1,2
+1257648,4,3,3,1,2,1,3,3,1,2
+452264,1,1,1,1,2,1,2,1,1,2
+1334659,5,2,4,1,1,1,1,1,1,2
+1304595,3,1,1,1,1,1,2,1,1,2
+1169049,7,3,4,4,3,3,3,2,7,4
+1321264,5,2,2,2,1,1,2,1,1,2
+476903,10,5,7,3,3,7,3,3,8,4
+1065726,5,2,3,4,2,7,3,6,1,4
+1222047,10,10,10,10,3,10,10,6,1,4
+1111249,10,6,6,3,4,5,3,6,1,4
+1296593,5,2,1,1,2,1,1,1,1,2
+1155546,2,1,1,2,3,1,2,1,1,2
+1225382,6,2,3,1,2,1,1,1,1,2
+666090,1,1,1,1,2,1,3,1,1,2
+1350319,5,7,4,1,6,1,7,10,3,4
+1260659,3,1,4,1,2,1,1,1,1,2
+1224329,1,1,1,2,2,1,3,1,1,2
+1185609,3,4,5,2,6,8,4,1,1,4
+1165297,2,1,1,2,2,1,1,1,1,2
+704097,1,1,1,1,1,1,2,1,1,2
+1173681,3,2,1,1,2,2,3,1,1,2
+1190386,4,6,6,5,7,6,7,7,3,4
+1311875,5,1,2,1,2,1,1,1,1,2
+1239420,1,1,1,1,2,1,1,1,1,2
+1002504,3,2,2,2,2,1,3,2,1,2
+466906,1,1,1,1,2,1,1,1,1,2
+1371920,5,1,1,1,2,1,3,2,1,2
+704097,1,1,1,1,1,1,2,1,1,2
+521441,5,1,1,2,2,1,2,1,1,2
+1368882,2,1,1,1,2,1,1,1,1,2
+1313982,4,3,1,1,2,1,4,8,1,2
+145447,8,4,4,1,2,9,3,3,1,4
+1198128,10,8,10,10,6,1,3,1,10,4
+1218860,1,1,1,1,1,1,3,1,1,2
+1151734,10,8,7,4,3,10,7,9,1,4
+1276091,6,1,1,3,2,1,1,1,1,2
+1160476,2,1,1,1,2,1,3,1,1,2
+183913,1,2,2,1,2,1,1,1,1,2
+1182404,4,2,1,1,2,1,1,1,1,2
+760001,8,10,3,2,6,4,3,10,1,4
+1318169,9,10,10,10,10,5,10,10,10,4
+128059,1,1,1,1,2,5,5,1,1,2
+1354840,5,3,2,1,3,1,1,1,1,2
+1334667,1,1,1,1,2,1,1,1,1,2
+798429,1,1,1,1,2,1,3,1,1,2
+1323477,1,2,1,3,2,1,2,1,1,2
+822829,8,10,10,10,6,10,10,10,10,4
+1200952,5,8,7,7,10,10,5,7,1,4
+1166630,7,5,6,10,5,10,7,9,4,4
+1276091,5,1,1,3,4,1,3,2,1,2
+1108449,5,3,3,4,2,4,3,4,1,4
+654546,1,1,1,1,2,1,1,1,8,2
+1238186,4,1,1,1,2,1,2,1,1,2
+1206695,1,5,8,6,5,8,7,10,1,4
+1343374,10,10,8,10,6,5,10,3,1,4
+1217717,5,1,1,6,3,1,1,1,1,2
+1173509,4,5,5,10,4,10,7,5,8,4
+1107684,6,10,5,5,4,10,6,10,1,4
+1103608,10,10,10,4,8,1,8,10,1,4
+770066,5,2,2,2,2,1,2,2,1,2
+1286943,8,10,10,10,7,5,4,8,7,4
+1018561,2,1,2,1,2,1,3,1,1,2
+1116132,6,3,4,1,5,2,3,9,1,4
+1324572,5,1,1,1,2,1,2,2,1,2
+1223282,1,1,1,1,2,1,2,1,1,2
+493452,1,1,3,1,2,1,1,1,1,2
+1190394,4,1,1,1,2,3,1,1,1,2
+1181356,5,1,1,1,2,2,3,3,1,2
+535331,3,1,1,1,3,1,2,1,1,2
+1212251,1,1,1,1,2,1,3,1,1,2
+1261751,5,1,1,1,2,1,2,2,1,2
+1371026,5,10,10,10,4,10,5,6,3,4
+1067444,2,1,1,1,2,1,2,1,1,2
+1192325,5,5,5,6,3,10,3,1,1,4
+1066373,3,2,1,1,1,1,2,1,1,2
+1054593,10,5,5,3,6,7,7,10,1,4
+1207986,5,8,4,10,5,8,9,10,1,4
+1168736,5,6,6,2,4,10,3,6,1,4
+466906,1,1,1,1,2,1,1,1,1,2
+1183983,9,5,5,4,4,5,4,3,3,4
+1315807,5,10,10,10,10,2,10,10,10,4
+1000025,5,1,1,1,2,1,3,1,1,2
+1324681,4,1,1,1,2,1,2,1,1,2
+1171710,1,1,1,1,2,1,2,3,1,2
+1026122,2,1,1,1,2,1,1,1,1,2
+1277792,4,1,1,1,2,1,1,1,1,2
+763235,3,1,1,1,2,1,2,1,2,2
+718641,1,1,1,1,5,1,3,1,1,2
+1145420,6,1,1,1,2,1,2,1,1,2
+832567,4,2,3,5,3,8,7,6,1,4
+826923,1,1,1,1,2,1,1,1,1,2
+640744,10,10,10,7,9,10,7,10,10,4
+1002025,1,1,1,3,1,3,1,1,1,2
+1330361,5,1,1,1,2,1,2,1,1,2
+1218860,1,1,1,1,1,1,3,1,1,2
+1259008,8,8,9,6,6,3,10,10,1,4
+1171710,6,5,4,4,3,9,7,8,3,4
+640712,1,1,1,1,2,1,2,1,1,2
+1190485,1,1,1,1,2,1,1,1,1,2
+1298416,10,6,6,2,4,10,9,7,1,4
+1204558,4,1,1,1,2,1,2,1,1,2
+1225799,10,6,4,3,10,10,9,10,1,4
+486662,2,1,1,2,2,1,3,1,1,2
+95719,6,10,10,10,8,10,7,10,7,4
+1102573,5,6,5,6,10,1,3,1,1,4
+1200892,8,6,5,4,3,10,6,1,1,4
+1231387,6,8,7,5,6,8,8,9,2,4
+1193210,2,1,1,1,2,1,3,1,1,2
+673637,3,1,1,1,2,5,5,1,1,2
+1296263,4,1,1,1,2,1,1,1,1,2
+1295529,2,5,7,6,4,10,7,6,1,4
+734111,1,1,1,3,2,3,1,1,1,2
+824249,1,1,1,1,2,1,3,1,1,2
+1227244,1,1,1,1,2,1,2,1,1,2
+1167439,2,3,4,4,2,5,2,5,1,4
+1112209,8,10,10,1,3,6,3,9,1,4
+859164,5,3,3,1,3,3,3,3,3,4
+1287971,3,1,1,1,2,1,2,1,1,2
+1105524,1,1,1,1,2,1,2,1,1,2
+1301945,5,1,1,1,1,1,1,1,1,2
+1049837,1,1,1,1,2,1,1,1,1,2
+1113038,8,2,4,1,5,1,5,4,4,4
+1348851,3,1,1,1,2,1,3,1,1,2
+1149548,1,1,1,1,2,1,1,1,1,2
+809912,10,3,3,1,2,10,7,6,1,4
+1113906,9,5,5,2,2,2,5,1,1,4
+1182404,3,1,1,1,2,1,2,1,1,2
+616240,5,3,4,3,4,5,4,7,1,2
+1339781,1,1,1,1,2,1,2,1,1,2
+1198641,10,10,6,3,3,10,4,3,2,4
+1118039,5,3,4,1,8,10,4,9,1,4
+1343068,8,4,4,1,6,10,2,5,2,4
+1234554,1,1,1,1,2,1,2,1,1,2
+1177027,3,1,1,1,2,1,3,1,1,2
+1311108,1,1,1,3,2,1,1,1,1,2
+1272166,5,1,1,1,2,1,1,1,1,2
+1220330,1,1,1,1,2,1,3,1,1,2
+527337,4,1,1,1,2,1,1,1,1,2
+888820,5,10,10,3,7,3,8,10,2,4
+690557,5,1,1,1,2,1,2,1,1,2
+1241035,7,8,3,7,4,5,7,8,2,4
+603148,4,1,1,1,2,1,1,1,1,2
+534555,1,1,1,1,2,1,1,1,1,2
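
Each row of `breast-cancer.test` holds a sample code, nine scores in the range 1 to 10, and a class code (2 for benign, 4 for malignant), with `?` marking a missing value. A minimal Python sketch of decoding such rows follows. It assumes the column order given in `breast-cancer.names`; `parse_row` and `CLASS_LABELS` are hypothetical names, and the three sample rows are copied from the file for illustration only.

```python
from collections import Counter

# Class codes as documented in breast-cancer.names.
CLASS_LABELS = {"2": "benign", "4": "malignant"}


def parse_row(line):
    """Split one comma-separated row into (sample_id, scores, label)."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    sample_id = fields[0]
    # "?" marks a missing value; keep it as None rather than guessing.
    scores = [None if f == "?" else int(f) for f in fields[1:10]]
    label = CLASS_LABELS[fields[10]]
    return sample_id, scores, label


rows = [
    "1133991,4,1,1,1,1,1,2,1,1,2",
    "1243256,10,4,3,2,3,10,5,3,2,4",
    "1193683,1,1,2,1,3,?,1,1,1,2",
]
parsed = [parse_row(r) for r in rows]
label_counts = Counter(label for _, _, label in parsed)
missing = sum(score is None for _, scores, _ in parsed for score in scores)
print(label_counts, missing)
```

Sample codes stay as strings because the same code can appear on more than one row (for example `1182404`), so it is not safe to treat them as unique keys.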

C50/c5.0

256 KB
Binary file not shown.

C50/golf.data

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
+sunny, 85, 85, false, Don't Play
+sunny, 80, 90, true, Don't Play
+overcast, 83, 78, false, Play
+rain, 70, 96, false, Play
+rain, 68, 80, false, Play
+rain, 65, 70, true, Don't Play
+overcast, 64, 65, true, Play
+sunny, 72, 95, false, Don't Play
+sunny, 69, 70, false, Play
+rain, 75, 80, false, Play
+sunny, 75, 70, true, Play
+overcast, 72, 90, true, Play
+overcast, 81, 75, false, Play
+rain, 71, 80, true, Don't Play
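
The fourteen rows of `golf.data` are small enough to check by hand. A minimal Python sketch that counts class labels per `outlook` value follows. It assumes the field order outlook, temperature, humidity, windy, class, as declared in `golf.names`; the variable names are illustrative.

```python
from collections import Counter

# Rows copied from golf.data in this commit.
GOLF_DATA = """\
sunny, 85, 85, false, Don't Play
sunny, 80, 90, true, Don't Play
overcast, 83, 78, false, Play
rain, 70, 96, false, Play
rain, 68, 80, false, Play
rain, 65, 70, true, Don't Play
overcast, 64, 65, true, Play
sunny, 72, 95, false, Don't Play
sunny, 69, 70, false, Play
rain, 75, 80, false, Play
sunny, 75, 70, true, Play
overcast, 72, 90, true, Play
overcast, 81, 75, false, Play
rain, 71, 80, true, Don't Play
"""

counts = Counter()
for line in GOLF_DATA.strip().splitlines():
    outlook, temperature, humidity, windy, label = [
        field.strip() for field in line.split(",")
    ]
    counts[(outlook, label)] += 1

for (outlook, label), n in sorted(counts.items()):
    print(f"{outlook:9s} {label:11s} {n}")
```

Every `overcast` row is labelled `Play`, which is why this data set is often used to illustrate a first split on `outlook` in decision-tree tutorials.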

C50/golf.names

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+Play, Don't Play.
+
+outlook: sunny, overcast, rain.
+temperature: continuous.
+humidity: continuous.
+windy: true, false.
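
`golf.names` lists the class values first, then one attribute per line, each either `continuous` or a list of allowed discrete values. A minimal Python sketch of checking a data row against those declarations follows, with the schema transcribed by hand from the file. `GOLF_SCHEMA`, `GOLF_CLASSES`, and `check_row` are hypothetical names, not part of C5.0, and the sketch ignores names-file features that do not appear here.

```python
# Declarations transcribed from golf.names.
GOLF_CLASSES = {"Play", "Don't Play"}
GOLF_SCHEMA = [
    ("outlook", {"sunny", "overcast", "rain"}),
    ("temperature", "continuous"),
    ("humidity", "continuous"),
    ("windy", {"true", "false"}),
]


def check_row(line):
    """Return a list of problems found in one comma-separated row."""
    fields = [f.strip() for f in line.split(",")]
    expected = len(GOLF_SCHEMA) + 1  # attributes plus the class column
    if len(fields) != expected:
        return [f"expected {expected} fields, got {len(fields)}"]

    problems = []
    for (name, spec), value in zip(GOLF_SCHEMA, fields):
        if spec == "continuous":
            try:
                float(value)
            except ValueError:
                problems.append(f"{name}: {value!r} is not numeric")
        elif value not in spec:
            problems.append(f"{name}: {value!r} is not a declared value")
    if fields[-1] not in GOLF_CLASSES:
        problems.append(f"class: {fields[-1]!r} is not a declared class")
    return problems


print(check_row("sunny, 85, 85, false, Don't Play"))  # row from golf.data
print(check_row("foggy, warm, 85, false, Play"))      # made-up invalid row
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a row at once.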
