BUA 6315: Business Analytics for Decision Making
Module 5 Assignment Handout:
Supervised Data Mining
Overview:
In this assignment, you will learn how to apply supervised data mining techniques in business cases.
Prompt:
For this assignment, you will analyze the three case studies below and address the questions associated
with each.
For all the cases, you will first partition data sets into 50% training, 30% validation, and 20% test and use
12345 as the default random seed. If the predictor variable values are in the character format, then treat the
predictor variable as a categorical variable. Otherwise, treat the predictor variable as a numerical variable.
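As a rough illustration of the partitioning and variable-typing rules above, here is a minimal Python sketch. It is not the Analytic Solver or R procedure: the function names partition and variable_kind are hypothetical, the records are stand-ins, and Python's shuffle with seed 12345 will not reproduce the partition that Analytic Solver or R produces with the same seed.

```python
import random

def partition(records, seed=12345, train=0.5, valid=0.3):
    """Shuffle a copy of the records and cut it into training/validation/test."""
    rows = list(records)
    random.Random(seed).shuffle(rows)   # seeded, so the split is repeatable
    n_train = round(len(rows) * train)
    n_valid = round(len(rows) * valid)
    # Whatever is left after training and validation becomes the test set (about 20%).
    return (rows[:n_train],
            rows[n_train:n_train + n_valid],
            rows[n_train + n_valid:])

def variable_kind(values):
    """Character (string) values mark a categorical predictor; otherwise numerical."""
    return "categorical" if any(isinstance(v, str) for v in values) else "numerical"

records = list(range(284))  # stand-in for the 284 consumer rows in Case 1
train_set, valid_set, test_set = partition(records)
sizes = (len(train_set), len(valid_set), len(test_set))
```

With 284 records, this sketch yields partitions of 142, 85, and 57 rows; a tool that rounds differently may shift a row between partitions.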
Case 1:
For this case, first download the data: Social Media data (available in Blackboard).
Next review the following case study:
A social media marketing company is conducting consumer research to see how the income level and age
might correspond to whether or not consumers respond positively to a social media campaign. Aliyah
Turner, a new college intern, is assigned to collect data from the past marketing campaigns. She compiled
data on 284 consumers who participated in the marketing campaigns in the past, including income (in
$1,000s), age, and whether or not each individual responded to the campaign (1 if yes, 0 otherwise).
Then complete the actions below and record your answers in a Microsoft Word document.
Note: For step-by-step instructions on how to use Excel and Analytic Solver to estimate and predict with
KNN method and how to interpret results, refer to the following videos from Lesson 1: Supervised Data
Mining – KNN Algorithm:
● K – Nearest Neighbors (KNN) – Introduction (7:51)
● KNN Model with Analytic Solver (12:57)
1. Perform KNN analysis to estimate a classification model for the social media campaign using the
data. What is the optimal value of k?
2. Report the overall accuracy, specificity, sensitivity, and precision rates for the test data set (for
Analytic Solver) or validation data set (for R) using the cutoff value of 0.50. Explain each of them with
one sentence.
3. What is the area under the ROC curve (or the AUC value)?
4. Comment on the performance of the KNN classification model. Is the KNN method an effective way
to predict whether or not a consumer responds positively? Note: Interpret the results of ROC curve,
lift chart, and decile-wise lift chart.
5. What is the predicted outcome for the first new consumer record?
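The quantities named in the questions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Analytic Solver output: the function names and all data values are made up, the KNN step skips the standardization of predictors that a real analysis would normally apply, and a probability equal to the cutoff is assumed to count as a positive response.

```python
import math

def knn_probability(train_rows, new_point, k):
    """Share of the k nearest training rows (Euclidean distance) labeled 1."""
    nearest = sorted(train_rows, key=lambda row: math.dist(row[0], new_point))[:k]
    return sum(label for _, label in nearest) / k

def ratio(numerator, denominator):
    """Division that returns NaN instead of failing on an empty class."""
    return numerator / denominator if denominator else float("nan")

def classification_rates(actual, probs, cutoff=0.5):
    """Accuracy, sensitivity, specificity, and precision at the given cutoff."""
    predicted = [1 if p >= cutoff else 0 for p in probs]  # assumed tie rule
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),  # share of all cases classified correctly
        "sensitivity": ratio(tp, tp + fn),       # share of actual 1s predicted as 1
        "specificity": ratio(tn, tn + fp),       # share of actual 0s predicted as 0
        "precision": ratio(tp, tp + fp),         # share of predicted 1s that are actual 1s
    }

def auc(actual, probs):
    """Chance that a random actual 1 scores above a random actual 0 (ties count half)."""
    pos = [p for a, p in zip(actual, probs) if a == 1]
    neg = [p for a, p in zip(actual, probs) if a == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return ratio(wins, len(pos) * len(neg))

# Made-up rows: ((income in $1,000s, age), responded)
train_rows = [((30, 25), 0), ((45, 30), 0), ((60, 35), 1),
              ((80, 40), 1), ((95, 50), 1), ((35, 45), 0)]
new_prob = knn_probability(train_rows, (70, 38), k=3)

# Made-up actual outcomes and predicted probabilities for a scored partition
actual = [1, 0, 1, 1, 0, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1, 0.8, 0.3]
rates = classification_rates(actual, probs)
area = auc(actual, probs)
```

An AUC near 0.5 would suggest the model ranks consumers no better than chance, while a value near 1 would suggest strong separation between responders and non-responders.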
Case 2:
For this case, first download the data: Mobile Banking data (available in Blackboard).
Next review the following case study:
Sunnyville Bank wants to identify customers who may be interested in its new mobile banking app. The
worksheet called Mobile Banking Data contains 500 customer records collected from a previous marketing
campaign for the bank’s mobile banking app. Each observation in the data set contains the customer’s age
(Age), gender (Male/Female), education level (Edu, ranging from 1 to 3), income (Income in $1,000s),
whether the customer has a certificate of deposit account (CD), and whether the customer downloaded the
mobile banking app (App equals 1 if downloaded, 0 otherwise). Create a classification tree model for
predicting whether a customer will download the mobile banking app.
Then complete the actions below and record your answers in a Microsoft Word document.
Note: For step-by-step instructions on how to use Excel and Analytic Solver to estimate and predict with a
classification tree, and how to interpret results, refer to the following video from Lesson 2: Supervised Data
Mining – Decision Trees: Using Analytic Solver to Build a Classification Tree (8:17).
1. How many leaf nodes are in the best-pruned tree? What are the predictor variable and the split value
for the root node of the best-pruned tree?
2. What are the accuracy rate, specificity, sensitivity, and precision of the best-pruned tree on the test
data? Explain each of them with one sentence.
3. Generate the ROC curve. What is the area under the ROC curve?
4. Score the 20 customers in the data set you have downloaded using the best-pruned tree. How many
of the 20 new customers will likely download the mobile banking app based on your classification
model? What is the probability of the first new customer downloading the app?
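To give a sense of how a classification tree might choose the split at its root node, the sketch below searches one numerical predictor for the split value that minimizes weighted Gini impurity. Gini impurity is a common criterion in CART-style trees; whether Analytic Solver uses exactly this criterion is an assumption here, and the function names and data values are made up for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 means the node is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (split_value, weighted_gini) with the lowest impurity for one predictor."""
    ordered = sorted(set(values))
    best = None
    for low, high in zip(ordered, ordered[1:]):
        split = (low + high) / 2  # candidate: midpoint of consecutive distinct values
        left = [y for x, y in zip(values, labels) if x < split]
        right = [y for x, y in zip(values, labels) if x >= split]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (split, score)
    return best

# Made-up records: income in $1,000s and whether the app was downloaded
income = [20, 35, 50, 65, 80, 95]
app = [0, 0, 0, 1, 0, 1]
split_value, impurity = best_split(income, app)
```

A full tree would repeat this search over every predictor, pick the best overall split, and then recurse on each branch before pruning against the validation data.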
Case 3:
For this case, first download the data: Electricity data (available in Blackboard).
Next review the following case study:
Kyle Robson, an energy researcher for the U.S. Energy Information Administration, is trying to build a model
for predicting annual electricity retail sales for states. Kyle has compiled a data set for the 50 states and the
District of Columbia that contains average electricity retail price (Price in cents/kWh), per capita electricity
generation (Generation), median household income (Income), and per capita electricity retail sales (Sales in
MWh). Create a regression tree model for predicting per capita electricity retail sales (Sales).
Then complete the actions below and record your answers in a Microsoft Word document.
Note: For step-by-step instructions on how to use Excel and Analytic Solver to estimate and predict with a
regression tree and how to interpret results, refer to the following video from Lesson 2: Supervised Data
Mining – Decision Trees: Using Analytic Solver to Build a Prediction Tree (5:53).
1. How many leaf nodes are in the best-pruned tree and minimum error tree?
2. What are the predictor variable and the split value for the first split of the best-pruned tree? What are
the rules that can be derived from the root node?
3. What are the RMSE and MAD of the best-pruned tree on the test data?
4. What is the predicted per capita electricity retail sales for a state with the following values: Price =
11, Generation = 25, and Income = 65,000?
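The two error measures asked for above can be computed as in the following Python sketch. It assumes MAD means the mean absolute deviation, that is, the average absolute prediction error; the function names and the sales figures are made up and are not taken from the Electricity data.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: square root of the average squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mad(actual, predicted):
    """Mean absolute deviation: average of the absolute errors."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)

# Made-up per capita sales (MWh) and tree predictions for four test states
actual_sales = [10.0, 12.0, 15.0, 9.0]
predicted_sales = [11.0, 12.0, 13.0, 10.0]

test_rmse = rmse(actual_sales, predicted_sales)
test_mad = mad(actual_sales, predicted_sales)
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAD does, so RMSE is never smaller than MAD on the same predictions.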
Submission Guidelines:
Your completed assignment must be submitted as a Microsoft Word document, 1-2 pages in length, double
spacing, 12-point Times New Roman font, and 1-inch margins. The submission must be accompanied by
three Microsoft Excel spreadsheets showing your work. Only the Word document will be assessed for
grading purposes; however, the case spreadsheets are required and must be submitted to show your work.
Note: No tables or charts need to be included in your Word document for this assignment.
Note About Grading:
This assignment will be assessed based on the accuracy of your responses to each question in the
worksheet.