Please limit your report to 10 pages.
1. Estimation (20 points)
Randomly split the data into an estimation sample and a validation sample:
    set.seed(2019)
    L = nrow(catalog_DF)
    ind = sample(1:L, L/2)
    catalog_DF$validation_sample = 0
    catalog_DF$validation_sample[ind] = 1
(1) Using information from the estimation sample only, estimate a logistic regression model of the purchase decision (buytabw), using all customer attributes in the data file (except customer_no and the validation_sample indicator) as independent variables. Display and briefly discuss the marginal effects of the customer attributes on the catalog purchase choice using the maBina function in the erer package.
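The estimation step might look like the sketch below. The real data file is not shown here, so the block simulates a stand-in catalog_DF; the attributes recency and orders are hypothetical, as are the names est_DF and fit. The maBina call is skipped when the erer package is not installed.

```r
# Hypothetical stand-in for catalog_DF; the real columns will differ.
set.seed(2019)
n <- 2000
catalog_DF <- data.frame(
  customer_no = 1:n,
  recency     = rexp(n, rate = 1 / 12),  # assumed attribute
  orders      = rpois(n, lambda = 3)     # assumed attribute
)
catalog_DF$buytabw <- rbinom(
  n, 1, plogis(-2 - 0.05 * catalog_DF$recency + 0.3 * catalog_DF$orders)
)

# Same split logic as in the assignment.
L <- nrow(catalog_DF)
ind <- sample(1:L, L / 2)
catalog_DF$validation_sample <- 0
catalog_DF$validation_sample[ind] <- 1

# Estimation sample only, without the ID and the split indicator.
est_DF <- subset(catalog_DF, validation_sample == 0,
                 select = -c(customer_no, validation_sample))

# x = TRUE stores the model matrix, which maBina() needs.
fit <- glm(buytabw ~ ., data = est_DF,
           family = binomial(link = "logit"), x = TRUE)

# Marginal effects evaluated at the sample means of the regressors.
if (requireNamespace("erer", quietly = TRUE)) {
  print(erer::maBina(fit, x.mean = TRUE, rev.dum = TRUE, digits = 3))
}
```

Each marginal effect is read as the change in purchase probability for a one-unit change in the attribute, holding the other attributes at their means.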
(2) Try one of your favorite machine learning methods. The purchase decision (buytabw) is a binary outcome, so if you use a “regression” method, restrict the predictions to [0, 1]. You can simply set all predicted values below 0 to 0 and all predicted values above 1 to 1.
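The clipping rule can be sketched in one line of base R. The vector raw_pred is a made-up example of unrestricted predictions, and Pr_ml is a hypothetical name for the clipped result.

```r
# Hypothetical raw predictions from a regression-type learner.
raw_pred <- c(-0.12, 0.03, 0.48, 0.97, 1.08)

# Values below 0 become 0, values above 1 become 1, the rest are unchanged.
Pr_ml <- pmin(pmax(raw_pred, 0), 1)
```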
For all analyses below, you should compare results between the logistic regression and your chosen method.
2. Predicted purchase probability in the validation sample (10 points)
Predict the purchase probability for all customers in the validation sample using the predict function. Verify that the predicted purchase probability variable was created and that it has reasonable values.
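A minimal sketch of the prediction step follows, again on simulated data because the real file is not available here. The attribute orders and the names fit and val_DF are assumptions; Pr matches the variable name used in the box plot code of this assignment.

```r
# Hypothetical stand-in data with one assumed attribute.
set.seed(2019)
n <- 2000
catalog_DF <- data.frame(customer_no = 1:n, orders = rpois(n, 3))
catalog_DF$buytabw <- rbinom(n, 1, plogis(-3 + 0.4 * catalog_DF$orders))
catalog_DF$validation_sample <- 0
catalog_DF$validation_sample[sample(1:n, n / 2)] <- 1

# Fit on the estimation sample only.
fit <- glm(buytabw ~ orders,
           data = subset(catalog_DF, validation_sample == 0),
           family = binomial)

# Predict for the validation sample only.
val_DF <- subset(catalog_DF, validation_sample == 1)
# type = "response" returns probabilities; the default returns log-odds.
val_DF$Pr <- predict(fit, newdata = val_DF, type = "response")

# Sanity check: no missing values, everything strictly between 0 and 1.
summary(val_DF$Pr)
```

A mean of Pr far from the observed purchase rate, or values outside (0, 1), would suggest that type = "response" was omitted or that the wrong sample was used.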
From now on, you should only work with observations in the validation sample. Make sure that you do not accidentally include observations from the estimation sample in the analysis!
3. Box plot of predicted purchase probabilities (10 points)
Show box plots of the predicted purchase probabilities using the boxplot function, separately for customers who made a purchase after receiving the catalog and those who did not respond:

    par(cex = 0.80)
    boxplot(Pr ~ buytabw, data = catalog_DF,
            col = "hotpink1",
            xlab = "Customer did not buy (0) or bought (1)",
            ylab = "Predicted purchase probability")

Do the box plots indicate that the model has some power to predict who is likely to purchase in the validation sample?
4. Scoring and segmentation (10 points)
Score the customers and segment the customers into ten deciles, where score = 1 corresponds to the customers with the lowest predicted purchase probabilities and score = 10 corresponds to the customers with the highest predicted purchase probabilities. Employ the createBins function for this task.
    # createBins ------------------------------------------------------------------
    # Inputs: x, a vector of numbers (or a column of a data frame)
    #         N, the number of bins (groups) to create
    createBins <- function(x, N) {
       cut_points = quantile(x, probs = seq(1/N, 1 - 1/N, by = 1/N), type = 2)
       cut_points = unique(cut_points)
       bins = cut(x, c(-Inf, cut_points, +Inf), label = 1:(length(cut_points) + 1))
       return(as.numeric(bins))
    }
Now create a summary data set, score_DF, that contains some key summary statistics separately for each segment (score). Include these summary statistics:
- Number of observations in segment
- Number of buyers in segment
- Mean predicted purchase probability
- Mean observed purchase rate (based on buytabw)
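One way to build the summary table with base R is sketched below. The predicted probabilities and outcomes are simulated, and the column names n_obs, n_buyers, mean_Pr and mean_buy are hypothetical choices; createBins is repeated from the assignment so that the block runs on its own.

```r
# createBins as given in the assignment.
createBins <- function(x, N) {
  cut_points <- quantile(x, probs = seq(1/N, 1 - 1/N, by = 1/N), type = 2)
  cut_points <- unique(cut_points)
  bins <- cut(x, c(-Inf, cut_points, +Inf), labels = 1:(length(cut_points) + 1))
  return(as.numeric(bins))
}

# Hypothetical validation sample: simulated probabilities and purchases.
set.seed(2019)
n <- 1000
val_DF <- data.frame(Pr = rbeta(n, 1, 12))
val_DF$buytabw <- rbinom(n, 1, val_DF$Pr)

# Score 1 = lowest predicted probabilities, score 10 = highest.
val_DF$score <- createBins(val_DF$Pr, 10)

# One row per segment; tapply() returns results in increasing score order.
score_DF <- data.frame(
  score    = sort(unique(val_DF$score)),
  n_obs    = as.vector(tapply(val_DF$buytabw, val_DF$score, length)),
  n_buyers = as.vector(tapply(val_DF$buytabw, val_DF$score, sum)),
  mean_Pr  = as.vector(tapply(val_DF$Pr, val_DF$score, mean)),
  mean_buy = as.vector(tapply(val_DF$buytabw, val_DF$score, mean))
)
```

If the model is well calibrated, mean_Pr and mean_buy should be close within each segment, and both should rise with the score.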
5. Lift and gains (10 points)
Create a table indicating the lift, cumulative lift, and cumulative gains from the predictive model. Plot the lift, cumulative lift, and cumulative gains charts.
Interpret and discuss the lifts and gains: Is the predictive model useful for targeting purposes?
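The table can be sketched as follows, under one common convention that the assignment does not spell out: lift is a segment's purchase rate relative to the overall purchase rate (times 100), and the cumulative measures start from the highest score. The segment counts in score_DF are made up for illustration, and lift_DF and its columns are hypothetical names.

```r
# Hypothetical segment summary (counts are invented).
score_DF <- data.frame(
  score    = 1:10,
  n_obs    = rep(100, 10),
  n_buyers = c(0, 1, 1, 2, 3, 4, 6, 9, 14, 20)
)

# Start with the best segment (score 10) and work downward.
lift_DF <- score_DF[order(-score_DF$score), ]
overall_rate <- sum(lift_DF$n_buyers) / sum(lift_DF$n_obs)

# Lift: segment purchase rate relative to the overall rate (100 = average).
lift_DF$lift <- 100 * (lift_DF$n_buyers / lift_DF$n_obs) / overall_rate

# Cumulative lift: purchase rate among all segments mailed so far.
lift_DF$cum_lift <- 100 *
  (cumsum(lift_DF$n_buyers) / cumsum(lift_DF$n_obs)) / overall_rate

# Cumulative gains: percent of all buyers captured so far.
lift_DF$cum_gains <- 100 * cumsum(lift_DF$n_buyers) / sum(lift_DF$n_buyers)

# Percent of customers mailed, for the x-axis of the charts.
lift_DF$pct_mailed <- 100 * cumsum(lift_DF$n_obs) / sum(lift_DF$n_obs)
```

In this invented example the top decile has a lift of about 333 and captures a third of all buyers. Each chart is then a call such as plot(lift_DF$pct_mailed, lift_DF$cum_gains, type = "b"), with the 45-degree line as the no-model benchmark for the gains chart.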
6. Profitability analysis (10 points)
From now on work again with the customer-level data in catalog_DF. Use the following data:
- Based on past data, the average dollar margin per customer is $26.90
- The cost of printing and mailing one tabloid is $1.40
Using the predicted purchase probability, calculate expected profits. Provide a histogram of the expected profits variable. Discuss the graph. Calculate the fraction of customers who are expected to be profitable, i.e. have positive expected profits.
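A small sketch of the expected-profit calculation follows. It assumes the margin is earned only when a customer buys while the mailing cost is paid for every customer mailed; the probabilities in Pr are made up, and the variable names are hypothetical.

```r
margin    <- 26.90  # average dollar margin per purchasing customer
mail_cost <- 1.40   # cost of printing and mailing one tabloid

# Hypothetical predicted purchase probabilities.
Pr <- c(0.01, 0.04, 0.06, 0.20)

# Expected profit from mailing each customer.
expected_profit <- Pr * margin - mail_cost

# Fraction of customers with positive expected profit.
frac_profitable <- mean(expected_profit > 0)

# Break-even probability: mail only if Pr exceeds this value.
breakeven_Pr <- mail_cost / margin
```

Under this assumption a customer is worth mailing only when the predicted probability exceeds roughly 5.2%. On the real data, hist(expected_profit) gives the requested histogram.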
Now rank customers according to their expected profitability. Then calculate realized profits, based on the observed purchase decision of each customer. Calculate the cumulative sum of realized profits for a targeting strategy where customers are targeted in descending order of expected profits. Plot the cumulative realized profits on the y-axis versus the percent of customers mailed on the x-axis. Discuss your findings.
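The ranking and cumulative-profit steps might be sketched as below, using five invented customers. The names val_DF, ranked, cum_profit and pct_mailed are hypothetical, and realized profit is assumed to be the margin for buyers minus the mailing cost for everyone mailed.

```r
margin    <- 26.90
mail_cost <- 1.40

# Hypothetical validation customers: predicted probability and outcome.
val_DF <- data.frame(
  Pr      = c(0.30, 0.02, 0.10, 0.01, 0.06),
  buytabw = c(1, 0, 0, 0, 1)
)

val_DF$expected_profit <- val_DF$Pr * margin - mail_cost
val_DF$realized_profit <- val_DF$buytabw * margin - mail_cost

# Target customers in descending order of expected profit.
ranked <- val_DF[order(-val_DF$expected_profit), ]
ranked$cum_profit <- cumsum(ranked$realized_profit)
ranked$pct_mailed <- 100 * seq_len(nrow(ranked)) / nrow(ranked)
```

Plotting ranked$cum_profit against ranked$pct_mailed with type = "l" gives the requested chart; the peak of the curve marks the mailing depth with the highest realized profit.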
7. Recommended targeting strategy (20 points)
What mailing strategy do you recommend? Compare the actual profitability from your proposed strategy to
- The expected profitability based on your model,
- A mass mailing strategy where each customer receives a catalog.
What is the percent improvement in profits from your recommended strategy relative to a mass mailing strategy?
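One candidate strategy, mailing only customers with positive expected profit, can be compared with mass mailing as sketched below. This is an illustration on five invented customers, not the required answer; all variable names are hypothetical.

```r
margin    <- 26.90
mail_cost <- 1.40

# Hypothetical validation customers.
Pr      <- c(0.30, 0.02, 0.10, 0.01, 0.06)
buytabw <- c(1, 0, 0, 0, 1)

expected_profit <- Pr * margin - mail_cost
realized_profit <- buytabw * margin - mail_cost

# Candidate rule: mail only if expected profit is positive.
target <- expected_profit > 0

profit_target_expected <- sum(expected_profit[target])  # model's forecast
profit_target_realized <- sum(realized_profit[target])  # actual outcome
profit_mass            <- sum(realized_profit)          # mail everyone

# Percent improvement of targeting over mass mailing.
pct_improvement <- 100 * (profit_target_realized - profit_mass) / profit_mass
```

The percentage is only meaningful when mass-mailing profit is positive; if it is zero or negative, report the dollar difference instead.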