For each exercise, make full use of the tabular, calculation and graphical functions of Microsoft Excel.

EXERCISE 1
For the given set of data (not shown):
a) Arrange the observations into an appropriate frequency table.
b) Derive the frequency of each class.
c) Visualise the frequencies using a tally chart and a stem-and-leaf plot or table.
d) Produce a histogram for the data.
e) Calculate, using the appropriate formula for frequency distributions, the mean, mode, variance and standard deviation.
f) Derive the range, median and interquartile range.
g) Produce a box plot, showing any outliers.
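
As a cross-check on the Excel workings, the calculations in parts (e) to (g) can be sketched in Python. Because the data set is not shown, the class table and the raw observations below are invented purely for illustration and are unrelated to each other. The sketch assumes the population form of the variance (divide by n; use n - 1 for a sample), the interpolation formula for the mode of grouped data, quartiles computed as Excel's QUARTILE.INC would, and the common 1.5 x IQR rule for flagging outliers. Your course may specify different conventions.

```python
import math
import statistics

# Hypothetical frequency table: (lower bound, upper bound, frequency).
classes = [(0, 10, 3), (10, 20, 7), (20, 30, 12), (30, 40, 6), (40, 50, 2)]
freqs = [f for _, _, f in classes]
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
n = sum(freqs)

# Mean and variance from class midpoints x and frequencies f.
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n
std_dev = math.sqrt(variance)

# Mode by interpolation within the modal class (assumed formula):
# L + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * class width
i = freqs.index(max(freqs))
lower, upper, f1 = classes[i]
f0 = freqs[i - 1] if i > 0 else 0
f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
mode = lower + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * (upper - lower)

# Hypothetical raw observations for the range, quartiles and outliers.
raw = [4, 7, 12, 15, 18, 21, 22, 24, 25, 27, 31, 36, 95]
data_range = max(raw) - min(raw)
# method="inclusive" interpolates over positions (n - 1) * p.
q1, median, q3 = statistics.quantiles(raw, n=4, method="inclusive")
iqr = q3 - q1

# Points beyond 1.5 * IQR from the quartiles are flagged as outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in raw if x < low_fence or x > high_fence]

print("mean", mean, "variance", variance, "std dev", std_dev, "mode", mode)
print("range", data_range, "median", median, "IQR", iqr, "outliers", outliers)
```

Comparing these figures with the spreadsheet results for the same inputs is a quick way to catch a mistyped cell range or a sample/population mix-up.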

EXERCISE 2
For the given data set (not shown) for a two-class problem, and using Microsoft Excel, set out the table. Using appropriate formulae:
a) Calculate the information gain for each attribute in the set.
b) Calculate the Gini index for each attribute in the set.
c) From the results of these calculations, state which attribute would be chosen by ID3 for the root of the decision tree.
d) Using appropriate calculations in Excel, supplemented if necessary with explanations in a Word document, complete the construction of the decision tree for the table of data using the ID3 algorithm.
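
The calculations in parts (a) to (d) can be sketched in Python as a check on the spreadsheet. The table below is invented for illustration, since the data set is not shown; the attribute names Outlook and Windy and the class labels are hypothetical. The sketch assumes entropy with base-2 logarithms, the Gini index of an attribute taken as the size-weighted Gini impurity of its partitions, and a basic ID3 that picks the attribute with the highest information gain and labels a leaf with the majority class when no attributes remain.

```python
from collections import Counter
from math import log2

# Hypothetical two-class table: (Outlook, Windy, class label).
ROWS = [
    ("Sunny", True, "No"), ("Sunny", True, "No"),
    ("Sunny", False, "No"), ("Sunny", False, "No"),
    ("Rain", True, "No"), ("Rain", False, "Yes"),
    ("Rain", False, "Yes"), ("Rain", False, "Yes"),
]
ATTRIBUTES = {"Outlook": 0, "Windy": 1}  # attribute name -> column index
LABEL = 2                                # column holding the class


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_impurity(rows, col, impurity):
    """Average impurity of the partitions induced by column col."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[LABEL])
    return sum(len(g) / len(rows) * impurity(g) for g in groups.values())


def information_gain(rows, col):
    labels = [r[LABEL] for r in rows]
    return entropy(labels) - weighted_impurity(rows, col, entropy)


def id3(rows, attributes):
    """Return a class label (leaf) or {attribute: {value: subtree}}."""
    labels = [r[LABEL] for r in rows]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, attributes[a]))
    col = attributes[best]
    rest = {a: c for a, c in attributes.items() if a != best}
    return {best: {v: id3([r for r in rows if r[col] == v], rest)
                   for v in {r[col] for r in rows}}}


gains = {a: information_gain(ROWS, c) for a, c in ATTRIBUTES.items()}
ginis = {a: weighted_impurity(ROWS, c, gini) for a, c in ATTRIBUTES.items()}
root = max(gains, key=gains.get)
tree = id3(ROWS, ATTRIBUTES)

print("information gain:", gains)
print("Gini index:", ginis)
print("root:", root)
print("tree:", tree)
```

Note that the two measures rank in opposite directions: ID3 selects the attribute with the largest information gain, whereas a Gini-based split prefers the attribute with the smallest weighted Gini index. On this invented table both point to the same attribute.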

