Applied Analytics from Data to Decisions
- GreatPlants grows and sells plants and trees online
- You are a Data Analyst at the company, collaborating mainly with the Marketing team and the Design team.
- In particular, you often analyze data on “leads”. A lead is a visitor to our website who can come from various sources: direct, search, ads, social media.
- If a lead who has not signed up (regardless of orders) comes back to the website for a new session, they will start again from the landing page. We assign the same lead_id to a returning lead (but a different session_id). This is possible thanks to our algorithm that identifies leads (using information like IP address).
- If a signed-up lead (regardless of orders) comes back to the website for a new session, they will start directly from the shopping page (we recognize from their IP address that they have visited the website in the past and have already signed up).
The AB test
The Marketing team has asked for your help to AB test the 2 landing pages:
- Some leads will see the old landing page (Control)
- Some leads will see the new landing page (Treatment)
- The AB test will run for 2 weeks, between 2023-03-20 and 2023-04-02 (both included)
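Because both end dates are included, the window covers 14 full calendar days. A minimal sketch of one way to filter timestamps to that window follows. It uses a half-open upper bound so that sessions late on 2023-04-02 are not dropped. The helper name in_experiment_window is hypothetical, timestamps are assumed to be timezone-naive, and the pre-period bounds assume that "the 2-week period before the experiment" means the 14 days immediately preceding it.

```python
from datetime import date, datetime, timedelta

# Experiment window from the text: 2023-03-20 to 2023-04-02, both included.
START = date(2023, 3, 20)
END_INCLUSIVE = date(2023, 4, 2)

# Half-open bounds: a timestamp is inside if window_start <= ts < window_end_exclusive.
window_start = datetime.combine(START, datetime.min.time())
window_end_exclusive = datetime.combine(
    END_INCLUSIVE + timedelta(days=1), datetime.min.time()
)

def in_experiment_window(ts: datetime) -> bool:
    """Return True if a (timezone-naive) timestamp falls in the experiment window."""
    return window_start <= ts < window_end_exclusive

# Number of calendar days covered when both end dates are included.
n_days = (END_INCLUSIVE - START).days + 1

# Assumed pre-period: the n_days immediately before the experiment starts.
pre_start = START - timedelta(days=n_days)
pre_end_inclusive = START - timedelta(days=1)
```

Comparing against `<= 2023-04-02` on a timestamp column would usually keep only midnight of the last day, which is why the sketch uses an exclusive bound of 2023-04-03.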
Your project
You should develop a single pdf document that contains information on the entire experiment. In particular, here are some key dates:
- Pretending it is March 19th: you have access to some historical data in the 3 tables mentioned on slide 8. In particular you should use data from the 2-week period before the experiment. You should develop an experiment plan that clarifies all the technical aspects of the AB test.
- Pretending it is March 20th: you can assume the rollout of the experiment goes smoothly.
- Pretending it is April 3rd: The experiment has run between 2023-03-20 and 2023-04-02 (both included). The data has been recorded. You can analyze the data that was collected during the experiment (regardless of the sample size you came up with during planning) and provide a recommendation. The grouping is contained in the table experiment_groups, under the experiment_name “landing_2023”
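If the primary metric turns out to be a proportion (for example, a signup rate per lead), the April 3rd comparison could be sketched as a pooled two-proportion z-test. This is only one possible choice of test, not a prescribed method. The function name and the counts in the usage lines are made-up placeholders, not experiment results.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(conv_c, n_c, conv_t, n_t):
    """Pooled two-sided z-test for the difference Treatment minus Control.

    conv_* are counts of converting units, n_* are group sizes.
    Returns (difference in proportions, z statistic, two-sided p-value).
    """
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    # Pooled proportion under the null hypothesis of no difference.
    p_pool = (conv_c + conv_t) / (n_c + n_t)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_t - p_c, z, p_value

# Placeholder counts for illustration only (NOT real data).
diff, z, p_value = two_proportion_ztest(conv_c=350, n_c=1000, conv_t=427, n_t=1000)
```

For an average-type metric (such as revenue per lead), a test on means would be the analogous choice.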
Some remarks and tips
- During the experiment, new leads (that have never been on the website before) will start their first session, and at that time they will be randomized into the Control and Treatment groups.
- During the experiment, the table fact_landing will log sessions generated from both landing pages.
- fact_order contains more orders than fact_landing (which contains only order_id’s generated by sessions that started from a landing page)
- For planning purposes: the marketing team thinks that the new landing page might have an impact on a primary metric of at least 22% of the baseline. For example:
○ If your primary metric is a percentage with a baseline of 35%, then the assumed impact would be 35% x 22% = 0.077 = 7.7 percentage points.
○ If your primary metric is an average with a baseline of 1.5, then the assumed impact would be 1.5 x 22% = 0.33
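The assumed impact feeds directly into the sample-size calculation. The sketch below shows a common normal-approximation formula for comparing two proportions, using the illustrative 35% baseline from the example above (not the actual baseline, which has to come from the historical data). The significance level of 0.05 and power of 0.80 are assumptions, not values given in the exam, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_proportions(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate units needed per group for a two-sided two-proportion test.

    baseline:      control proportion (e.g. 0.35)
    relative_lift: assumed impact as a fraction of baseline (e.g. 0.22)
    alpha, power:  assumed planning parameters, not given in the source
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    delta = p2 - p1  # absolute impact, e.g. 0.35 * 0.22 = 0.077
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / delta ** 2)

# Illustrative call with the example baseline from the text.
n_per_group = sample_size_two_proportions(baseline=0.35, relative_lift=0.22)
```

A smaller assumed impact increases the required sample size roughly with the inverse square of the absolute difference, so the 22% assumption matters a great deal for planning.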
The 4 tables that you have access to
fabriziopublic.greatplants.dim_lead (one row per lead that ever started a session on a landing page)
- lead_id: the lead identifier
- lead_source: can be one of direct / search / ads / social
- first_session_at: timestamp of the first session of a lead
- signup_at: timestamp of the signup completion. If NULL, the lead never signed up.
- lead_info: encrypted information provided by the lead during signup (name, address, email, etc). If NULL, the lead never signed up.
fabriziopublic.greatplants.fact_landing (one row per session that started on a landing page)
- lead_id: the lead identifier
- session_id: the session identifier
- session_at: timestamp of the session
- signup_at: if a signup was completed during the session, the timestamp of the signup is recorded here.
- order_id: if an order was placed during the session, the order_id is recorded here.
fabriziopublic.greatplants.fact_order (one row per order, which could be generated from a session that started on a landing page or from a different flow)
- lead_id: the lead or client identifier (even after a lead signs up, we still identify them with their lead_id)
- order_id: the order identifier
- order_at: timestamp of the order
- tot_price: total price of the order, before taxes
- taxes: sales tax applied to the order
- tot_cost: total cost paid by the customer, including taxes
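Since fact_order also holds orders from flows that did not start on a landing page, an analysis of landing-page sessions would typically keep only the orders whose order_id appears in fact_landing. The sketch below illustrates that filter in plain Python on toy rows shaped like the two tables. Every value in it is invented for illustration.

```python
# Toy rows shaped like the two tables; all values are invented.
fact_landing = [
    {"lead_id": 1, "session_id": "s1", "order_id": "o1"},
    {"lead_id": 2, "session_id": "s2", "order_id": None},  # session without an order
]
fact_order = [
    {"lead_id": 1, "order_id": "o1", "tot_price": 40.0, "taxes": 3.2, "tot_cost": 43.2},
    # Order placed in a flow that did not start on a landing page.
    {"lead_id": 1, "order_id": "o2", "tot_price": 25.0, "taxes": 2.0, "tot_cost": 27.0},
]

# order_id values that were generated by sessions starting on a landing page.
landing_order_ids = {
    row["order_id"] for row in fact_landing if row["order_id"] is not None
}

# Keep only the orders that can be attributed to a landing-page session.
landing_orders = [o for o in fact_order if o["order_id"] in landing_order_ids]
```

In SQL the same idea would be a join from fact_landing to fact_order on order_id.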
fabriziopublic.greatplants.experiment_groups (one row per unit of any experiment)
- experiment_name: the unique identifier of an experiment
- unit_id: the identifier of the unit of randomization that was used in the experiment
- grouped_at: timestamp of when a unit was assigned to a group in an experiment
- group: the name of the group that a unit was assigned to
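To compare groups, sessions have to be mapped to their experiment group. The sketch below assumes that unit_id holds a lead_id (leads are the unit randomized, according to the remarks above, but the table description does not state this explicitly) and that groups are labeled "Control" and "Treatment". Both are assumptions to verify against the data, and all rows are invented toy values.

```python
from collections import defaultdict

EXPERIMENT = "landing_2023"  # experiment_name given in the text

# Toy rows; values and group labels are invented for illustration.
experiment_groups = [
    {"experiment_name": "landing_2023", "unit_id": 1, "group": "Control"},
    {"experiment_name": "landing_2023", "unit_id": 2, "group": "Treatment"},
    {"experiment_name": "other_test", "unit_id": 1, "group": "Treatment"},
]
fact_landing = [
    {"lead_id": 1, "session_id": "s1", "signup_at": None},
    {"lead_id": 1, "session_id": "s2", "signup_at": "2023-03-22 10:00:00"},
    {"lead_id": 2, "session_id": "s3", "signup_at": None},
    {"lead_id": 3, "session_id": "s4", "signup_at": "2023-03-21 09:00:00"},
]

# Assumption: unit_id is a lead_id. Keep only rows of this experiment.
group_by_lead = {
    row["unit_id"]: row["group"]
    for row in experiment_groups
    if row["experiment_name"] == EXPERIMENT
}

# A lead counts as signed up if any of its landing sessions recorded a signup.
signed_up = defaultdict(bool)
for session in fact_landing:
    lead_id = session["lead_id"]
    if lead_id in group_by_lead:  # ignore leads outside the experiment
        signed_up[lead_id] |= session["signup_at"] is not None

# Leads and signups per group.
summary = defaultdict(lambda: {"leads": 0, "signups": 0})
for lead_id, did_sign_up in signed_up.items():
    group = group_by_lead[lead_id]
    summary[group]["leads"] += 1
    summary[group]["signups"] += int(did_sign_up)
```

Aggregating at the lead level rather than the session level keeps the unit of analysis aligned with the unit of randomization.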
Your deliverable
- You will produce a single pdf file that you will upload to Courseworks (under Final Exam)
- Your pdf document will contain all the details of the experiment
- There is a limit of 8 pages (standard US letter page, 11pt text or larger). You can add extra information (for example queries) in an Appendix, if you want. The Appendix won’t be graded (but might be used to understand metrics definitions).
- The deadline is May 7th, 11:59pm
- This is an individual take-home exam. You cannot work with anybody else on this exam. The exam is intended to be completed in approximately 4 hours.
- If you have technical questions about the Final Exam, please post them as a “Reply” in the corresponding Announcement on Courseworks.
Solutions and grading
- Solutions will be posted on May 8th
○ this exam is worth 35 points
○ the grade will depend on
■ Accuracy of technical results
■ Presentation of results in the document. The intended audience is a mix of Marketing managers and other Data Analysts. The document should contain enough technical information for analysts, and high-level results for Marketing managers.
○ Grades will be posted by May 11
○ If you have questions about your grade, email me within 24 hours from the time grades are posted.
○ 24 hours after grades are posted, course grades will be determined and finalized.