Laboratory I:
To download additional .arff data sets go to:
or search the Internet for the required .arff files
· What’s the difference between a “training set” and a “test set”?
· Why might a pruned decision tree that doesn’t fit the data so well be better than an un-pruned one?
· What’s the first thing that 1R does when making a rule based on a numeric attribute?
· How does 1R avoid overfitting when making a rule based on an enumerated and/or numeric attribute?
· What is the difference between an attribute, an instance, and a training set?
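The questions above ask how 1R builds a rule from a single attribute. As a rough illustration, the following Python sketch covers enumerated (nominal) attributes only. It is not Weka's OneR implementation: the function name `one_r` and the small weather-style data set are made up for this example. For each attribute, every value predicts its majority class, and the attribute whose rule makes the fewest errors on the training set is kept.

```python
from collections import Counter, defaultdict

def one_r(instances, class_index=-1):
    """Return (attribute_index, rule, errors) for the best single-attribute rule.

    instances: list of tuples of nominal values; the class sits at class_index.
    """
    n_attrs = len(instances[0])
    class_index = class_index % n_attrs
    best = None
    for a in range(n_attrs):
        if a == class_index:
            continue
        # Count class frequencies for each value of attribute a.
        counts = defaultdict(Counter)
        for row in instances:
            counts[row[a]][row[class_index]] += 1
        # Each attribute value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Every instance outside the majority class of its value is an error.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (a, rule, errors)
    return best

# Hypothetical toy data: (outlook, windy, play)
data = [
    ("sunny", "false", "no"),
    ("sunny", "true", "no"),
    ("overcast", "false", "yes"),
    ("rainy", "false", "yes"),
    ("rainy", "true", "no"),
    ("overcast", "true", "yes"),
]

attr, rule, errors = one_r(data)
print("best attribute index:", attr, "rule:", rule, "training errors:", errors)
```

Numeric attributes need an extra discretization step that this sketch deliberately leaves out.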
OneR
– weka.classifiers.OneR
Decision table
– weka.classifiers.DecisionTable -R
C4.5
– weka.classifiers.j48.J48
· Do the decisions made by the classifiers make sense to you? Why?
· What can you say about the accuracy of these classifiers when classifying iris instances that have not been used for training?
· How did each one of the methods perform?
Decision Tree
– weka.classifiers.j48.J48
Decision table
– weka.classifiers.DecisionTable -R
Linear regression
– weka.classifiers.LinearRegression
M5′
– weka.classifiers.M5′
· The dataset describes the time needed by a machine to produce and count 20 bolts. (More details can be found in the file containing the dataset.)
· Analyze the data. What adjustments have the greatest effect on the time to count 20 bolts?
· According to each classifier, how would you adjust the machine to get the shortest time to count 20 bolts?
Laboratory II:
To download additional .arff data sets go to:
weka data folder for
zoo.arff, wine.arff, bodyfat.arff, sleep.arff, pollution.arff
OneR
– weka.classifiers.OneR
Decision table
– weka.classifiers.DecisionTable -R
C4.5
– weka.classifiers.j48.J48
K-means
– weka.clusterers.SimpleKMeans
Try using reduced-error pruning for C4.5. Did it change the produced model? Why?
For K-means, set k=10 for the first run and adjust as needed. What was the final value of k? Why?
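To make the role of k concrete, here is a minimal sketch of the k-means loop in plain Python. It is not Weka's SimpleKMeans: the seeding rule (the first k points become the initial centroids), the sample points, and the function names are all assumptions chosen for illustration.

```python
import math

def k_means(points, k, max_iter=100):
    """Cluster points (tuples of floats) into k groups; returns (centroids, labels)."""
    centroids = [points[i] for i in range(k)]  # assumption: seed with the first k points
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # an empty cluster keeps its centroid
        if new_labels == labels and new_centroids == centroids:
            break  # nothing changed, so the clustering has converged
        labels, centroids = new_labels, new_centroids
    return centroids, labels

def within_cluster_sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))

# Hypothetical sample with two visibly separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (7.8, 8.2), (0.9, 1.1), (8.1, 7.9)]
centroids, labels = k_means(points, k=2)
sse = within_cluster_sse(points, centroids, labels)
print(labels, centroids, sse)
```

Running the sketch for several values of k and comparing the within-cluster sum of squared errors is one common way to judge whether a given k is reasonable.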
Linear regression
– weka.classifiers.LinearRegression
M5′
– weka.classifiers.M5′
Regression Tree
– weka.classifiers.M5′
K-means clustering
– weka.clusterers.SimpleKMeans
A) How many leaves did the Model tree produce? Regression Tree? What happens if you change the pruning factor?
How many clusters did you choose for the K-means method? Was that a good choice? Did you try a different value for k?
B) Now perform the same analysis on the bodyfat.arff data set.
Change the acuity and cutoff parameters in order to produce a model similar to the one obtained in the book. Use the classes-to-clusters evaluation. What does that tell you?
Laboratory III:
To download additional .arff data sets go to:
zoo.arff, wine.arff, soybean.arff, zoo2_x.arff,
sunburn.arff, disease.arff
8. Use the following learning schemes to compare the training set and 10-fold stratified cross-validation scores of the disease data (in disease.arff):
Decision table
– weka.classifiers.DecisionTable -R
C4.5
– weka.classifiers.j48.J48
Id3
– weka.classifiers.Id3
A) What does the training set evaluation score tell you?
B) What does the cross-validation score evaluate?
C) Which one of these models would you say is the best? Why?
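The two kinds of score in this exercise can be contrasted with a small Python sketch. It makes several assumptions: a 1-nearest-neighbour learner on one numeric feature stands in for the Weka schemes listed above, the folds are built by dealing each class's instances round-robin, and the data set is invented. It shows the mechanics of stratified k-fold cross-validation, not how Weka implements it.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Assign instance indices to k folds so class proportions stay similar."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    pos = 0
    for indices in by_class.values():
        # Deal each class's instances round-robin across the folds.
        for i in indices:
            folds[pos % k].append(i)
            pos += 1
    return folds

def predict_1nn(train, x):
    """Class of the training instance whose feature value is closest to x."""
    return min(train, key=lambda inst: abs(inst[0] - x))[1]

def training_set_accuracy(data):
    """Evaluate on the same instances the model was built from."""
    return sum(1 for x, y in data if predict_1nn(data, x) == y) / len(data)

def cross_validated_accuracy(data, k):
    """Each instance is predicted by a model that never saw it."""
    folds = stratified_folds([y for _, y in data], k)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [inst for i, inst in enumerate(data) if i not in held_out]
        correct += sum(1 for i in test_idx if predict_1nn(train, data[i][0]) == data[i][1])
    return correct / len(data)

# Hypothetical data: (feature value, class)
data = [(1.0, "a"), (1.4, "a"), (2.1, "a"), (5.2, "a"),
        (4.0, "b"), (4.6, "b"), (5.5, "b"), (2.4, "b")]

folds = stratified_folds([y for _, y in data], k=2)
fold_class_counts = [Counter(data[i][1] for i in fold) for fold in folds]
train_acc = training_set_accuracy(data)
cv_acc = cross_validated_accuracy(data, k=2)
print(fold_class_counts, train_acc, cv_acc)
```

In this toy case the learner memorizes its training instances, so the training-set score is far higher than the cross-validated one.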
9. Use the following learning schemes to analyze the wine data (in wine.arff).
C4.5
– weka.classifiers.j48.J48
Decision List
– weka.classifiers.PART
A) What is the most important descriptor (attribute) in wine.arff?
B) How well were these two schemes able to learn the patterns in the dataset? How would you quantify your answer?
C) Compare the training set and 10-fold cross-validation scores of the two schemes.
D) Would you trust these two models? Did they really learn what is important for proper classification of wine?
E) Which one would you trust more, even if just very slightly?
10. Perform the same analysis on sunburn.arff as in exercise 9, but use 5-fold instead of 10-fold cross-validation.
A)-E) Same as in exercise 9.
F) Why could we not use 10-fold evaluation in this example?
11. Choose one of the following three files: soybean.arff, zoo.arff or zoo2_x.arff, and use any two schemes of your choice to build and compare the models.