From ALI@CORNELLA.cit.cornell.edu Tue Mar 17 19:44:22 1992
Return-Path:
Received: from CORNELLA.cit.cornell.edu (801fd80) by maths.lth.se (1.2/MATHS BL-2.4s); Tue, 17 Mar 92 19:43:36 -0100 (MET)
Message-Id: <9203171843.AA07821@maths.lth.se>
Received: from CORNELLA by CORNELLA.cit.cornell.edu (IBM VM SMTP V2R2) with BSMTP id 7430; Tue, 17 Mar 92 13:42:15 EST
Received: from CORNELLA (ALI) by CORNELLA (Mailer R2.08A) with BSMTP id 0134; Tue, 17 Mar 92 13:42:09 EST
Date: Tue, 17 Mar 92 13:41:51 EST
From: "Ali S. Hadi"
Subject: Data sets
To: andersh@maths.lth.se
Status: R

***********************
Phosphorus Data

Source: Snedecor, G. W. and Cochran, W. G. (1967), "Statistical Methods" (6th Edition), Iowa State University, Ames, Iowa, p. 384.
Taken From: Chatterjee and Hadi (1988), p. 82.
Dimension: 18 observations on 3 variables
Description: An investigation of the source from which corn plants obtain their phosphorus was carried out. Concentrations of phosphorus in parts per million in each of 18 soils were measured.

Column  Description
1       Concentrations of inorganic phosphorus in the soil
2       Concentrations of organic phosphorus in the soil
3       Phosphorus content of corn grown in the soil at 20 degrees C

0.4   53  64
0.4   23  60
3.1   19  71
0.6   34  61
4.7   24  54
1.7   65  77
9.4   44  81
10.1  31  93
11.6  29  93
12.6  58  51
10.9  37  76
23.1  46  96
23.1  50  77
21.6  44  93
23.1  56  95
1.9   36  54
26.8  58  168
29.9  51  99

***********************
Scottish Hill Race Data

Source: Scottish Hill Runners Association
Taken From: Atkinson's discussion of Chatterjee and Hadi (1986), Statistical Science, p. 400.
Dimension: 35 observations on 3 variables
Description: The observations are:

1. Greenmantle New Year Dash    19. Black Hill
2. Carnethy 5 Hill Race         20. Creag Beag
3. Craig Dunain                 21. Kildoon
4. Ben Rha                      22. Meall Ant-Suiche
5. Ben Lomond                   23. Half Ben Nevis
6. Goatfell                     24. Cow Hill
7. Bens of Jura                 25. North Berwick Law
8. Cairnpapple                  26. Creag Dubh
9. Scolty                       27. Burnswark
10. Traprain Law                28. Largo
11. Lairig Ghru                 29. Criffel
12. Dollar                      30. Achmony
13. Lomonds of Fife             31. Ben Nevis
14. Cairn Table                 32. Knockfarrel
15. Eildon Two                  33. Two Breweries Fell
16. Cairngorm                   34. Cockleroi
17. Seven Hills of Edinburgh    35. Moffat Chase
18. Knock Hill

Column  Definition
1       Distance (miles)
2       Climb (ft)
3       Time (seconds)

2.5  650   965
6    2500  2901
6    900   2019
7.5  800   2736
8    3070  3736
8    2866  4393
16   7500  12277
6    800   2182
5    800   1785
6    650   2385
28   2100  11560
5    2000  2583
9.5  2200  3900
6    500   2648
4.5  1500  1616
10   3000  4335
14   2200  5905
3    350   4719
4.5  1000  1045
5.5  600   1954
3    300   957
3.5  1500  1674
6    2200  2859
2    900   1076
3    600   1121
4    2000  1573
6    800   2066
5    950   1714
6.5  1750  3030
5    500   1257
10   4400  5135
6    600   1943
18   5200  10215
4.5  850   1686
20   5000  9590

***********************
Salary Survey Data

Source: Chatterjee, S. and Hadi, A. S. (1988), p. 88
Taken From: Source
Dimension: 31 observations on 6 variables
Description: The data set is a result of a study relating the monthly salary of a random sample of employees in a given company to several factors thought to determine salary differentials.

Column  Description
1       Job evaluation points
2       Sex (1 = male, 0 = female)
3       Number of years with the company
4       Number of years on present job
5       Performance rating (1 = unsatisfactory, 5 = outstanding)
6       Monthly salary

350  1  2   2   5  1000
350  1  5   5   5  1400
350  0  4   4   4  1200
350  1  20  20  1  1800
425  0  10  2   3  2800
425  1  15  10  3  4000
425  0  1   1   4  2500
425  1  5   5   4  3000
600  1  10  5   2  3500
600  0  8   8   3  2800
600  0  4   3   4  2900
600  1  20  10  2  3800
600  1  7   7   5  4200
700  1  8   8   1  4600
700  0  25  15  5  5000
700  1  19  16  4  4600
700  0  20  14  5  4700
400  0  6   4   3  1800
400  1  20  8   3  3400
400  0  5   3   5  2000
500  1  22  12  3  3200
500  1  25  10  3  3200
500  0  8   3   4  2800
500  0  2   1   5  2400
800  1  10  10  3  5200
475  1  10  4   3  2400
475  0  3   3   4  2400
475  1  8   8   2  3000
475  1  6   6   4  2800
475  0  12  4   3  2500
475  0  4   2   5  2100

***********************
Health Club Data

Source: Chatterjee, S. and Hadi, A. S. (1988), Sensitivity Analysis in Linear Regression, New York: John Wiley and Sons, Table 4.9, p. 129.
Taken From: Source
Dimension: 30 observations on 5 variables
Description: The data set is taken from health records of 30 employees who were regular members of a given company's health club. The measured variables are:

Column  Description
1       Weight in pounds
2       Resting pulse rate per minute
3       Arm and leg strength (lbs an employee was able to lift)
4       Time (in seconds) in a 1/4-mile trial run
5       Time (in seconds) in a one-mile run

217  67  260  91  481
141  52  190  66  292
152  58  203  68  338
153  56  183  70  357
180  66  170  77  396
193  71  178  82  429
162  65  160  74  345
180  80  170  84  469
205  77  188  83  425
168  74  170  79  358
232  65  220  72  393
146  68  158  68  346
173  51  243  56  279
155  64  198  59  311
212  66  220  77  401
138  70  180  62  267
147  54  150  75  404
197  76  228  88  442
165  59  188  70  368
125  58  160  66  295
161  52  190  69  391
132  62  163  59  264
257  64  313  96  487
236  72  225  84  481
149  57  173  68  374
161  57  173  65  309
198  59  220  62  367
245  70  218  69  469
141  63  193  60  252
177  53  183  75  338

***********************
Brain and Body Weight Data

Source: Jerison, H. J. (1973), "Evolution of the Brain and Intelligence," New York: Academic Press.
Taken From: Rousseeuw, P. J. and Leroy, A. M. (1987), "Robust Regression and Outlier Detection," New York: John Wiley & Sons, p. 57.
Dimension: 28 observations on 2 variables
Description: The sample was taken from a larger data set. It is to be determined whether a larger brain is required to govern a heavier body.
The observations represent the following animals:

1 Mountain beaver    2 Cow             3 Gray wolf          4 Goat
5 Guinea pig         6 Diplodocus      7 Asian elephant     8 Donkey
9 Horse              10 Potar monkey   11 Cat               12 Giraffe
13 Gorilla           14 Human          15 African elephant  16 Triceratops
17 Rhesus monkey     18 Kangaroo       19 Hamster           20 Mouse
21 Rabbit            22 Sheep          23 Jaguar            24 Chimpanzee
25 Brachiosaurus     26 Rat            27 Mole              28 Pig

Column  Description
1       Body weight in kilograms
2       Brain weight in grams

1.35   8.1
465    423
36.33  119.5
27.66  115
1.04   5.5
11700  50
2547   4603
187.1  419
521    655
10     115
3.3    25.6
529    680
207    406
62     1320
6654   5712
9400   70
6.8    179
35     56
0.12   1
0.023  0.4
2.5    12.1
55.5   175
100    157
52.16  440
87000  154.5
0.28   1.9
0.122  3
192    180

***********************
Cement Data

Source: Daniel, C. and Wood, F. S. (1980), "Fitting Equations to Data: Computer Analysis of Multifactor Data", Second Edition, New York: John Wiley and Sons
Taken From: Chatterjee and Hadi (1988), p. 259.
Dimension: 14 observations on 6 variables
Description: The data set is one of a sequence of several data sets taken at different times in an experimental study relating the heat evolved during hardening of 14 samples of cement to the composition of the cement. The explanatory variables are weights (measured as percentages of the weight of each sample) of five clinker compounds. The dependent variable (heat) is the last column in the data set.

6   7   26  60  2.5  85.5
15  1   29  52  2.3  76
8   11  56  20  5    110.4
8   11  31  47  2.4  90.6
6   7   52  33  2.4  103.5
9   11  55  22  2.4  109.8
17  3   71  6   2.1  108
22  1   31  44  2.2  71.6
18  2   54  22  2.3  97
4   21  47  26  2.5  122.7
23  1   40  34  2.2  83.1
9   11  66  12  2.6  115.4
8   10  68  12  2.4  116.3
18  1   17  61  2.1  62.6

***********************
Colon Cancer Data

Source: Cameron and Pauling (1978)
Taken From: Rawlings (1988), Applied Regression Analysis, Wadsworth, p. 83.
Dimension: 22 observations on 4 variables.
Description: A study of the effects of supplemental ascorbate (vitamin C) on the treatment of colon cancer.

Column  Description
1       Sex (female = 1, male = -1)
2       Age
3       Days (number of days survival after date of untreatability)
4       Control (average number of days survival of ten control patients for each case)

1  76  135   18
1  58  50    30
1  49  189   65
1  69  1267  17
1  70  155   57
1  68  534   16
1  50  502   25
1  74  126   21
1  66  90    17
1  76  365   42
1  56  911   40
1  65  743   14
1  74  366   28
1  58  156   31
1  60  99    28
1  77  20    33
1  38  274   80

***********************
Growth Data

Source: Rawlings (1988), page 325
Taken From: Rawlings (1988), Applied Regression Analysis, Wadsworth, p. 325
Dimension: 24 observations on 2 variables
Description: Growth data taken on four independent experimental units at each of six ages.

Column  Description
1       Age in weeks
2       Dry weight in grams

1  8
1  10
1  12
1  15
2  35
2  38
2  42
2  48
3  57
3  63
3  68
3  74
5  68
5  76
5  86
5  90
7  76
7  95
7  103
7  105
9  85
9  98
9  105
9  110

***********************
Consumption Function

Source: Belsley (1991), Conditioning Diagnostics: Collinearity and Weak Data in Regression, New York: Wiley, Exhibit 5.10.
Taken From: Source
Dimension: 28 observations on 4 variables
Description: Annual aggregate consumption function for the U.S. for the years 1947-1974.

Column  Abbrev.  Description
1       t        year
2       C(t-1)   total consumption, 1958 dollars
3       DPI(t)   disposable income, 1958 dollars
4
5       r(t)     interest rate (Moody's Aaa)
6       C(t)     total consumption, 1958 dollars

Model: E{C(t)} = b0 + b1 C(t-1) + b2 DPI(t) + b3 r(t) + b4

48  206.275  2.81667  229.7    11.625  210.775
49  210.775  2.66     230.925  1.225   216.5
50  216.5    2.6225   249.65   18.725  230.5
51  230.5    2.86     255.675  6.025   232.825
52  232.825  2.95583  263.25   7.575   239.425
53  239.425  3.19917  275.475  12.225  250.775
54  250.775  2.90083  278.4    2.925   255.725
55  255.725  3.0525   296.625  18.225  274.2
56  274.2    3.36417  309.35   12.725  281.4
57  281.4    3.885    316.075  6.725   288.15
58  288.15   3.7875   318.8    2.725   290.05
59  290.05   4.38167  333.05   14.25   307.3
60  307.3    4.41     340.325  7.275   316.075
61  316.075  4.35     350.475  10.15   322.5
62  322.5    4.325    367.25   16.775  338.425
63  338.425  4.25917  381.225  13.975  353.3
64  353.3    4.40417  408.1    26.875  373.725
65  373.725  4.49333  434.825  26.725  397.7
66  397.7    5.13     458.875  24.05   418.1
67  418.1    5.50667  477.55   18.675  430.1
68  430.1    6.175    499.05   21.5    452.725
69  452.725  7.02917  513.5    14.45   469.125
70  469.125  8.04     534.75   21.25   477.55
71  477.55   7.38667  555.425  20.675  496.425
72  496.425  7.21333  580.45   25.025  527.35
73  527.35   7.44083  619.5    39.05   552.075
74  552.075  8.56583  602.875  16.625  539.45

***********************
Cost-of-Living Data

Source: Beck, J.H. and Quan, N.T. (1983)
Taken From: Quan, N.T. (1988), JBES, 6, 4, 501-504.
Dimension: 38 observations on 8 variables
Description: A study of the effects of right-to-work laws and geographic differences on the cost of living. Observations are for the following cities:

1 Atlanta            14 Dayton                 27 New York
2 Austin             15 Denver                 28 Orlando
3 Bakersfield        16 Detroit                29 Philadelphia
4 Baltimore          17 Green Bay              30 Pittsburgh
5 Baton Rouge        18 Hartford               31 Portland
6 Boston             19 Houston                32 St. Louis
7 Buffalo            20 Indianapolis           33 San Diego
8 Champaign-Urbana   21 Kansas City            34 San Francisco
9 Cedar Rapids       22 Lancaster, PA          35 Seattle
10 Chicago           23 Los Angeles            36 Washington
11 Cincinnati        24 Milwaukee              37 Wichita
12 Cleveland         25 Minneapolis, St. Paul  38 Raleigh-Durham
13 Dallas            26 Nashville

Column  Definition
1       Population density in the ith SMSA, in 1975, in terms of persons per square mile. From the U.S. Bureau of the Census, 1978, County and City Data Book, 1977.
2       State unionization rate in 1978. From the U.S. Bureau of Labor Statistics, Handbook of Labor Statistics, December 1980, p. 414.
3       Total population in the ith SMSA, in 1975. From the U.S. Bureau of the Census, 1978, County and City Data Book, 1977.
4       The average cost of living for a four-person family living on an intermediate budget in the ith SMSA in 1978. From the U.S. Bureau of the Census, Statistical Abstract of the United States, 1979.
5       The 1972 per capita level of property taxes in the ith SMSA. From the U.S. Bureau of the Census, 1978, County and City Data Book, 1977. The exceptions are New York and Washington, DC, for which data were from the U.S. Bureau of the Census, 1974.
6       The 1974 per capita income in the ith SMSA. From the U.S. Bureau of the Census, 1978, County and City Data Book, 1977.
7       Temperature, measured as average annual degree days, 1931-1960. From the U.S. Weather Bureau, 1962, Decennial Census of United States Climate - Monthly Normals of Temperature, Precipitation, and Heating Degree Days. (Due to lack of published data for Champaign, IL, Cedar Rapids, IA, and Lancaster, PA, degree days for Springfield, IL, Waterloo, IA, and Harrisburg, PA, respectively, were used for the missing data.)
8       A dummy variable with a value of 1 if there is right-to-work legislation in the state where the ith SMSA is located, 0 otherwise. From the U.S. Bureau of Labor Statistics, 1980, Handbook of Labor Statistics, December 1980, p. 414.
414   13.6  1790128  169  5128  2961  1  16897
239   11    396891   143  4303  1711  1  16221
43    23.7  349874   339  4166  2122  0  17168
951   21    2147850  173  5001  4654  0  18699
255   16    411725   99   3965  1620  1  16806
1257  24.4  3914071  363  4928  5634  0  22117
834   39.2  1326848  253  4471  7213  0  19517
162   31.5  162304   117  4813  5535  0  19076
229   18.2  164145   294  4839  7224  1  18224
1886  31.5  7015251  291  5408  6113  0  18794
643   29.5  1381196  170  4637  4806  0  18354
1295  29.5  1966725  239  5138  6432  0  18987
302   11    2527224  174  4923  2363  1  16714
489   29.5  835708   183  4787  5606  0  17430
304   15.2  1413318  227  5386  5982  0  18565
1130  34.6  4424382  255  5246  6275  0  19145
323   27.8  169467   249  4289  8214  0  18490
696   21.9  1062565  326  5134  6235  0  19392
337   11    2286247  194  5084  1278  1  17114
371   29.3  1138753  251  4837  5699  0  18193
386   30    1290110  201  5052  4868  0  18262
362   34.2  342797   124  4377  5205  0  17982
1717  23.7  6986898  340  5281  1349  0  17722
968   27.8  1409363  328  5176  7635  0  20025
433   24.4  2010841  265  5206  8392  0  19389
183   17.7  748493   120  4454  3578  1  16627
6908  39.2  9561089  323  5260  4862  0  21587
230   11.7  582664   117  4613  782   1  16334
1353  34.2  4807001  182  4877  5144  0  19416
762   34.2  2322224  169  4677  5987  0  18008
201   23.1  228417   267  4123  7511  0  19186
480   30    2366542  184  4721  4809  0  17897
372   23.7  1584583  256  4837  1458  0  17707
1266  23.7  3140306  381  5940  3015  0  19427
333   33.1  1406746  195  5416  4424  0  18671
1073  21    3021801  205  6404  4224  0  20105
157   12.8  384920   206  4796  4620  1  17783
302   6.5   468512   126  4614  3393  1  18074

***********************
Demographic Data

Source: Gunst, R. F., and Mason, R. L. (1980), Regression Analysis and Its Application: A Data-Oriented Approach, New York: Marcel Dekker, p. 358.
Taken From: Source
Dimension: 49 observations on 7 variables
Description: As described in Gunst and Mason, the data set is a subset of a larger data set.
The variables are demographic characteristics of the following 49 countries:

1 Australia        2 Austria         3 Barbados         4 Belgium
5 British Guiana   6 Bulgaria        7 Canada           8 Chile
9 Costa Rica       10 Cyprus         11 Czechoslovakia  12 Denmark
13 El Salvador     14 Finland        15 France          16 Guatemala
17 Hong Kong       18 Hungary        19 Iceland         20 India
21 Ireland         22 Italy          23 Jamaica         24 Japan
25 Luxembourg      26 Malaya         27 Malta           28 Mauritius
29 Mexico          30 Netherlands    31 New Zealand     32 Nicaragua
33 Norway          34 Panama         35 Poland          36 Portugal
37 Puerto Rico     38 Romania        39 Singapore       40 Spain
41 Sweden          42 Switzerland    43 Taiwan          44 Trinidad
45 United Kingdom  46 United States  47 USSR            48 West Germany
49 Yugoslavia

Column  Description
1       infant deaths per 1,000 live births
2       # of inhabitants per physician
3       population per square kilometer
4       population per 1,000 hectares of agricultural land
5       percentage literate of population aged 15 years and over
6       # of students enrolled in higher education per 100,000 population
7       gross national product per capita (1957 U.S. dollars)

19.5   860   1     21      98.5  856   1316
37.5   695   84    1720    98.5  546   670
60.4   3000  548   7121    91.1  24    200
35.4   819   301   5257    96.7  536   1196
67.1   3900  3     192     74    27    235
45.1   740   72    1380    85    456   365
27.3   900   2     257     97.5  645   1947
127.9  1700  11    1164    80.1  257   379
78.9   2600  24    948     79.4  326   357
29.9   1400  62    1042    60.5  78    467
31     620   108   1821    97.5  398   680
23.7   830   107   1434    98.5  570   1057
76.3   5400  127   1497    39.4  89    219
21     1600  13    1512    98.5  529   794
27.4   1014  83    1288    96.4  667   943
91.9   6400  36    1365    29.4  135   189
41.5   3300  3082  98143   57.5  176   272
47.6   650   108   1370    97.5  258   490
22.4   840   2     79      98.5  445   572
225    5200  138   2279    19.3  220   73
30.5   1000  40    598     98.5  362   550
48.7   746   164   2323    87.5  362   516
58.7   4300  143   3410    77    42    316
37.7   930   254   7563    98    750   306
31.5   910   123   2286    96.5  36    1388
68.9   6400  54    2980    38.4  475   356
38.3   980   1041  8050    57.6  142   377
69.5   4500  352   4711    51.8  14    225
77.7   1700  18    296     50    258   262
16.5   900   346   4855    98.5  923   836
22.8   700   9     170     98.5  839   1310
71.7   2800  10    824     38.4  110   160
20.2   946   11    3420    98.5  258   1130
54.8   3200  15    838     65.7  371   329
74.7   1100  96    1411    95    351   475
77.5   1394  100   1087    55.9  272   224
52.4   2200  271   4030    81    1192  563
75.7   788   78    1248    89    226   360
32.3   2400  2904  108214  50    437   400
43.5   1000  61    1347    87    258   293
16.6   1089  17    1705    98.5  401   1380
21.1   765   133   2320    98.5  398   1428
30.5   1500  305   10446   54    329   161
45.4   2300  168   4383    73.8  61    423
24.1   935   217   2677    98.5  460   1189
26.4   780   20    399     98    1983  2577
35     578   10    339     95    539   600
33.8   798   217   3631    98.5  528   927
100    1637  73    1215    77    524   265
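
Each data set above states a "Dimension" (observations on variables) and then lists its values as whitespace-delimited numbers. As a minimal sketch of how such a block could be read back into rows and compared with the stated dimension, the following Python may help. The function names parse_block and check_dimension are hypothetical and not part of the original message, and the sample is only the first three observations of the Demographic Data, so the check against 49 observations is expected to fail for this excerpt.

```python
def parse_block(text, n_vars):
    """Turn a whitespace-delimited block of numbers into rows of n_vars floats."""
    values = [float(tok) for tok in text.split()]
    if len(values) % n_vars != 0:
        raise ValueError(
            f"{len(values)} values cannot be split into rows of {n_vars}"
        )
    return [values[i:i + n_vars] for i in range(0, len(values), n_vars)]


def check_dimension(rows, n_obs):
    """Return True when the number of parsed rows matches the stated count."""
    return len(rows) == n_obs


# First three observations of the Demographic Data (7 variables per row).
sample = """
19.5 860 1 21 98.5 856 1316
37.5 695 84 1720 98.5 546 670
60.4 3000 548 7121 91.1 24 200
"""

rows = parse_block(sample, n_vars=7)
print(len(rows), rows[0])
# Only an excerpt was parsed, so this does not match the stated 49 observations.
print(check_dimension(rows, n_obs=49))
```

Because the parser ignores line breaks and relies only on the number of variables, it should work equally well on a block whose rows have been run together onto one line.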