Part 1: Summary
How would you summarize the data? For each table, write 2-4 sentences with relevant information. Briefly describe what is measured in the data and provide a summary of the information. You can show a table or graphic, but keep things brief.
This part of the report will be directed to your internal team at the consulting company. It is intended to document the sources of information that were used in the project. It will also describe the data in less technical terms to team members who are not data scientists. If another member of the team joins the project later, they will rely on your descriptions to gain familiarity with the data. To that end, we recommend providing some instructions that will help other consultants use the information more effectively.
See the following for further instructions:
Only a single table representing all of the customers is provided.
Only a single table representing all of the products is provided.
A new table of page views will be delivered each month. You should build the initial report based on the file from January of 2020.
A new table showing products purchased will be delivered each month. You should build the initial report based on the file from January of 2020.
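One possible starting point for the per-table summaries is a small helper that lists each column with its type, number of distinct values, and number of missing values. This is only a sketch: the toy table and its column names (customer_id, age, region) are assumptions, since no data dictionary is provided, and in the real report each table would be read from its file with fread.

```r
library(data.table)

# Toy stand-in for the customers table; the column names are hypothetical.
customers <- data.table(
  customer_id = c("c1", "c2", "c3"),
  age         = c(25L, NA, 61L),
  region      = c("West", "South", "West")
)

# One row per column: its type, distinct-value count, and missing-value count.
describe_table <- function(dt) {
  data.table(
    column     = names(dt),
    type       = vapply(dt, function(x) class(x)[1], character(1)),
    n_distinct = vapply(dt, uniqueN, integer(1)),   # NA counts as a value here
    n_missing  = vapply(dt, function(x) sum(is.na(x)), integer(1))
  )
}

overview <- describe_table(customers)
print(overview)
```

The same helper can be applied to each of the four tables, which keeps the summaries consistent and makes it easy to rerun them when a new monthly file arrives.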
Part 2: Specific Questions
In addition to your summary, our prior work has identified specific questions of interest. Please provide these answers in output that is easy to read (e.g. tables).
This part of the report will be directed to product managers throughout the client’s company. The idea is to give them the useful information they need to act on the specific questions they posed. Plan your communication accordingly.
Notes: With data.table, most of these calculations can be done in no more than 5 lines of code. Many of the questions require information from multiple tables; use the merge function to combine tables as needed. HTML-friendly tables can be constructed using the datatable function in the DT package.
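As a rough illustration of the pattern in these notes, the sketch below counts rows in one table, merges the counts with a second table, and sorts the result. The toy tables and the column names (customer_id, product_id, category) are assumptions made for the example and may not match the real files.

```r
library(data.table)

# Toy stand-ins for the real tables; every column name here is hypothetical.
views <- data.table(
  customer_id = c("c1", "c1", "c2", "c3", "c3", "c3"),
  product_id  = c("p1", "p2", "p1", "p1", "p3", "p2")
)
products <- data.table(
  product_id = c("p1", "p2", "p3"),
  category   = c("coat", "shirt", "coat")
)

# Count views per product, then attach the category with merge().
view_counts <- views[, .(n_views = .N), by = product_id]
top_viewed  <- merge(view_counts, products, by = "product_id")

# Most viewed first; ties broken by product identifier. Keep at most 10 rows.
setorder(top_viewed, -n_views, product_id)
top_viewed <- head(top_viewed, 10)
print(top_viewed)

# In the report, the result could be shown as an HTML table (requires DT):
# DT::datatable(top_viewed, rownames = FALSE)
```

Aggregating before merging keeps the joined table small, which matters more on the full monthly files than on this toy data.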
These questions were carefully crafted based upon the client’s needs. It is important to answer them based on what is stated. To that end, please read each question closely and answer it accordingly.
1. During the first week of the month, what were the 10 most viewed products? Show the results in a table with the product’s identifier, category, and count of the number of views.
2. During the whole month, what were the 10 most viewed products for each category? Show the results in separate tables by category in the bullets below. Include only the product’s identifier and the count of the number of views.
3. What was the total revenue for each category of product during the month? Show the results in a single table sorted in decreasing order.
4. Among customers with at least one transaction, show the average, median, and standard deviation of the customers’ monthly spending on the site.
5. What is the percentage distribution of spending by gender? Show the amount of revenue and the percentage.
6. Using linear regression, what is the effect of an extra ten thousand dollars of income on monthly spending for a customer while adjusting for age, gender, and region?
7. Among customers who viewed at least 1 product, how many had at least one purchase during the month? Show the total number and the percentage of the users with a view.
8. Now let’s look at the viewing habits in different age groups, including 18-34, 35-49, 50-64, and 65+. Within each group, what were the mean, median, and standard deviation for the number of unique products viewed per customer?
Note: You can use R’s cut2 function in the Hmisc library to create the age groups.
9. What is the correlation between a user’s total page views and total spending? For customers without a transaction, include their spending as zero.
10. Which customer purchased the largest number of coats? In the event of a tie, include all of the users who reached this value. Show their identifiers and total volume.
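For the age-group question (question 8), one way to structure the calculation is sketched below. It uses base R's cut as a stand-in for the cut2 function mentioned in the note, so that the example runs without Hmisc. The toy data and column names (customer_id, age, product_id) are assumptions.

```r
library(data.table)

# Toy data; all column names are hypothetical.
customers <- data.table(
  customer_id = c("c1", "c2", "c3", "c4"),
  age         = c(22, 40, 41, 70)
)
views <- data.table(
  customer_id = c("c1", "c1", "c1", "c2", "c3", "c3", "c4"),
  product_id  = c("p1", "p1", "p2", "p1", "p2", "p3", "p1")
)

# Left-closed intervals, so age 35 falls in "35-49", age 50 in "50-64", etc.
customers[, age_group := cut(age,
                             breaks = c(18, 35, 50, 65, Inf),
                             right  = FALSE,
                             labels = c("18-34", "35-49", "50-64", "65+"))]

# Number of distinct products each customer viewed (repeat views count once).
unique_views <- views[, .(n_unique = uniqueN(product_id)), by = customer_id]
unique_views <- merge(unique_views, customers, by = "customer_id")

# Summary per age group. as.numeric() keeps the median's type the same in
# every group; sd() is NA for a group with a single customer.
age_summary <- unique_views[, .(mean_unique   = mean(n_unique),
                                median_unique = median(as.numeric(n_unique)),
                                sd_unique     = sd(n_unique)),
                            by = age_group][order(age_group)]
print(age_summary)
```

Customers with no views do not appear in the views table, so they are excluded from these statistics; whether that matches the intent of the question is worth confirming against its wording.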
Part 3: Generalization
This part of the report will be directed internally to your team’s engagement manager. The idea is to present these approaches to your team. The work will then be conveyed to the client’s technical team and middle managers who are working closely with you on the project. Plan your communication accordingly.
1. Did you see any problems with the data set? If so, whom would you report them to, and what would you do to address them? What would be different about the next version of the data?
2. Now generate a version of the same report using the data on views and transactions from the month of February 2020.
In building this report, do not create a new RMarkdown file. Instead, build a small .R file that allows the user to specify some parameters (e.g. the names of the files). Then use the render function in the rmarkdown library to run the report. Supply these new parameters as a list object in the params input. Then you can make use of these parameters within the RMarkdown file. For instance, if your file name is “views – January 2020.csv” and it is stored as params$views.file, then you can read the data with fread(input = params$views.file).
Use the dir.create function to build new subfolders to store each month’s report. Specify a name for the output file when calling the render function. Use this method to generate the separate reports for January and February.
Briefly describe your process for implementing this automated approach. What work would a non-technical user need to perform to run this script without your involvement?
3. What are the advantages of creating an automated approach to routine reporting?
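The automated approach described in item 2 could be organized along the following lines. This is a minimal sketch, not a prescribed solution: the template name report.Rmd, the reports folder, the transactions file names, and the transactions.file parameter are all assumptions, and only views.file comes from the instructions above. Note that render accepts only parameters that are declared under params in the YAML header of the RMarkdown file.

```r
# Hypothetical driver script, e.g. saved as run_reports.R.

# Collect everything that changes from month to month in one place.
report_spec <- function(month_label, views_file, transactions_file,
                        out_root = "reports") {
  list(
    output_dir  = file.path(out_root, month_label),
    output_file = paste0("report - ", month_label, ".html"),
    params      = list(views.file        = views_file,
                       transactions.file = transactions_file)
  )
}

# Create the month's subfolder and render the shared template into it.
build_report <- function(spec, template = "report.Rmd") {
  dir.create(spec$output_dir, recursive = TRUE, showWarnings = FALSE)
  rmarkdown::render(input       = template,
                    output_file = spec$output_file,
                    output_dir  = spec$output_dir,
                    params      = spec$params)
}

# The transactions file names below are placeholders.
jan <- report_spec("January 2020",
                   "views – January 2020.csv",
                   "transactions – January 2020.csv")
feb <- report_spec("February 2020",
                   "views – February 2020.csv",
                   "transactions – February 2020.csv")

# Render only when the template is present in the working directory.
if (file.exists("report.Rmd")) {
  for (spec in list(jan, feb)) build_report(spec)
}
```

With this layout, a non-technical user would only need to add one report_spec call for the new month and run the script; the RMarkdown file itself stays untouched.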
Part 4: Opportunities
This part of the report will be directed externally to your client’s senior leadership. Your work will help to determine the future direction of the project and the company’s contract with this client. Plan your communication accordingly.
1. How would you build on the reporting capabilities that you have created? What would you design next?
2. What are some opportunities to learn valuable information and inform strategic decisions? List a number of questions that you might explore.
3. How would you approach other decision makers within the client’s organization to assess their priorities and help them better utilize the available information?
4. Video Submission: Make a 2-minute pitch to the client with a proposal for the next phase of work. Include in your request a budget, a time frame, and staffing levels. Explain why this proposal would be valuable for the client and worth the investment in your consulting services. Please submit this answer as a short video recording. You may use any video recording program you feel comfortable with. The only requirements are that you are visible and audible in the video. You may also submit a file of slides if that is part of your pitch.
This project involves extensive coding on large data sets. As a training exercise, we are asking you to develop your skills in R and to use the data.table package for processing applications.
Part of your work is to understand what is measured in the tables. The fields in the tables should make intuitive sense. However, the organization has not provided a data dictionary. It will be up to you to gain this understanding of what is measured.
The goal of this project is to present a report that can be delivered to managers of the client’s organization. To that end, the final output should not show your code or calculations. Make the explanations clear and concise.