Assignment 3. Predicting E-commerce Promotional Sales
Tayko is an e-commerce website that sells games and educational software. It recently ran a “Double 11” promotion, which yielded 2000 purchases. Based on these data, Tayko wants to build a model that predicts how much a purchasing customer will spend in a similar promotion. The file Tayko.csv contains information on the 2000 purchases. The following table describes the variables used in this problem.
Table 1. Variable Description
Variable | Description |
TransactionID | Transaction ID |
Freq | Number of transactions in the preceding year |
LastUpdate | Number of days since last update to customer record |
WebOrder | Whether customer purchased by web order at least once: 1=Yes, 0=No |
Gender | 1=Male, 0=Female |
Address_Res | Whether it is a residential address: 1=Yes, 0=No |
Address_HK | Whether it is a Hong Kong address: 1=Yes, 0=No |
Spending (outcome) | Amount spent by customer in Hong Kong dollars |
Based on the background above, answer the following questions:
1.
How many transactions have spending greater than 0? How many transactions have spending equal to 0? (10 points)
Step 1: Predict which customers are more likely to respond (logistic regression).
Step 2: Predict which customers are likely to spend more money (linear regression, using only the responders' data).
There are 1001 transactions with spending greater than 0 and 999 transactions with spending equal to 0.
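The count above can be reproduced with a short pandas sketch. This assumes the column is named Spending exactly as in Table 1; the helper name `count_spending_groups` and the toy data frame are made up for illustration and are not the Tayko data.

```python
import pandas as pd


def count_spending_groups(df: pd.DataFrame, col: str = "Spending") -> dict:
    """Count rows whose value in `col` is positive and rows where it is zero."""
    positive = int((df[col] > 0).sum())
    zero = int((df[col] == 0).sum())
    return {"positive": positive, "zero": zero}


# Toy frame for illustration only (not the Tayko data).
toy = pd.DataFrame({"Spending": [0.0, 120.5, 0.0, 30.0, 75.0]})
counts = count_spending_groups(toy)
print(counts)
```

With the real file, the same helper would be called as `count_spending_groups(pd.read_csv("Tayko.csv"))`, assuming the CSV header matches Table 1.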
2.
For those customers who have spending greater than 0, what is their average spending and what is the standard deviation? (10 points)
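One possible way to compute these two statistics is sketched below. It assumes the Spending column name from Table 1 and uses the sample standard deviation (pandas' default, with n-1 in the denominator); the helper name and the toy values are hypothetical.

```python
import pandas as pd


def positive_spending_summary(df: pd.DataFrame, col: str = "Spending") -> dict:
    """Mean and sample standard deviation of `col` over rows where it is > 0."""
    spenders = df.loc[df[col] > 0, col]
    return {
        "n": int(spenders.count()),
        "mean": float(spenders.mean()),
        "std": float(spenders.std()),  # ddof=1 by default in pandas
    }


# Toy frame for illustration only (not the Tayko data).
toy = pd.DataFrame({"Spending": [0.0, 10.0, 20.0, 30.0, 0.0]})
summary = positive_spending_summary(toy)
print(summary)
```

If the population standard deviation is wanted instead, `spenders.std(ddof=0)` would be the change to make.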
3.
Explore the relationship between spending and each of the two continuous predictors by creating two scatterplots (Spending vs. Freq, and Spending vs. LastUpdate). Does there seem to be a linear relationship? (10 points) (Hints: answer the question with scatterplots)
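The two scatterplots could be drawn with matplotlib roughly as follows. This is a sketch that assumes the column names from Table 1; the function name `scatter_spending` and the toy values are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in an interactive session
import matplotlib.pyplot as plt
import pandas as pd


def scatter_spending(df, predictors=("Freq", "LastUpdate"), outcome="Spending"):
    """Draw one scatterplot of `outcome` against each predictor, side by side."""
    fig, axes = plt.subplots(1, len(predictors), figsize=(10, 4), squeeze=False)
    for ax, name in zip(axes[0], predictors):
        ax.scatter(df[name], df[outcome], s=10, alpha=0.5)
        ax.set_xlabel(name)
        ax.set_ylabel(outcome)
        ax.set_title(f"{outcome} vs. {name}")
    fig.tight_layout()
    return fig


# Toy frame for illustration only (not the Tayko data).
toy = pd.DataFrame({
    "Freq": [1, 2, 3, 4, 5],
    "LastUpdate": [3000, 2500, 1200, 800, 100],
    "Spending": [0.0, 50.0, 120.0, 90.0, 300.0],
})
fig = scatter_spending(toy)
fig.savefig("spending_scatter.png")
```

A roughly straight band of points would suggest a linear relationship; a fan shape or a cloud piled up at Spending = 0 would suggest otherwise.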
4.
Check the normality of spending and the two continuous predictors. Do they seem to be normally distributed? (15 points) (Hints: answer the question with three histogram figures and skewness values)
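A sketch of the histogram and skewness check is given below. It assumes the Table 1 column names; `Series.skew()` in pandas returns the bias-adjusted sample skewness, which may differ slightly from the value reported by other packages. The helper name and toy values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in an interactive session
import matplotlib.pyplot as plt
import pandas as pd


def histograms_and_skewness(df, columns=("Spending", "Freq", "LastUpdate"), bins=30):
    """Plot one histogram per column and return the figure with the skewness values."""
    fig, axes = plt.subplots(1, len(columns), figsize=(12, 4), squeeze=False)
    for ax, name in zip(axes[0], columns):
        ax.hist(df[name], bins=bins)
        ax.set_title(name)
        ax.set_ylabel("Count")
    fig.tight_layout()
    skewness = df[list(columns)].skew()  # near 0 for a symmetric distribution
    return fig, skewness


# Toy frame for illustration only (not the Tayko data).
toy = pd.DataFrame({
    "Spending": [0.0, 0.0, 0.0, 0.0, 10.0, 500.0],
    "Freq": [1, 2, 3, 4, 5, 6],
    "LastUpdate": [100, 400, 900, 1600, 2500, 3600],
})
fig, skewness = histograms_and_skewness(toy)
print(skewness)
```

Skewness well above 0 indicates a long right tail, and well below 0 a long left tail; either points away from normality.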
5. To fit a predictive model for Spending:
1) Recode Spending as a new variable BinarySpending: when Spending=0, BinarySpending=0; when Spending>0, BinarySpending=1.
2) Using BinarySpending as the outcome variable, run a binary logistic regression with all six predictors (Freq, LastUpdate, WebOrder, Gender, Address_Res, Address_HK) and write down the estimated predictive equation. (10 points)
3) Based on this model, which of the six predictors have a positive impact on customers' response to the “Double 11” promotion? Which have a negative impact? Which have no impact? (15 points) (Hints: write the predictor names and the rationales)
4) How accurately does the model predict “no-response” customers? How accurately does it predict “response” customers? (10 points)
5)Based on this model, what type of customer is most likely to respond to the “Double 11” promotion? (20 points)
Bonus questions (20 points)
Select the transactions with spending greater than 0, then use ONLY these transactions to create a multiple linear regression model:
- Check the assumptions. Are any assumptions violated? (10 points) (Hints: even if there are violations, only report the violations)
- Fit the data with a multiple linear regression model with all six predictors (Freq, LastUpdate, WebOrder, Gender, Address_Res, Address_HK). Write down the predictive equation. (5 points)
- Based on the model, what type of customer is more likely to spend a large amount of money? (5 points)