Midterm
Part 1 (10 points) 应用分析期中代考
Given the below subset of Uber’s schema (the 3 tables below), write executable SQL queries to answer the questions. Please answer in a single query script for each question and you can use at most 1 temporary table (either with a WITH statement or a nested query).
A. (5 pts) For each of the cities ‘Qarth’ and ‘Meereen’, calculate the average difference between Actual and Predicted ETA (estimated time of arrival) for all completed trips within the last 90 days.
B. (5 pts) A signup is defined as an event labeled ‘sign_up_success’ within the events table. Restrict the attention to signups in the first 7 days of 2023. For each city (‘Qarth’ and ‘Meereen’) and for each day in which there was a signup, determine the percentage of signups that resulted in a completed trip within 168 hours of the signup.
Data is not provided for this exercise, but you can generate fake data to test your queries.
Submission instructions: You will copy/paste your queries into Courseworks.
Table name: trips 应用分析期中代考
Column name: | Datatype: |
id | integer |
client_id | Integer (Foreign keyed to events.rider_id) |
driver_id | integer |
city_id | Integer (Foreign keyed to cities.city_id) |
client_rating | integer |
driver_rating | integer |
request_at | Timestamp |
predicted_eta | Integer (in minutes) |
actual_eta | Integer (in minutes) |
status | String (can be ‘completed’, canceled’) |
Table name: cities
Column name: | Datatype: |
city_id | integer |
city_name | string |
Table name: events
Column name: | Datatype: |
device_id | integer |
rider_id | integer |
city_id | integer |
event_name | String (can be: ‘sign_up_success’, ‘attempted_sign_up’, ‘sign_up_failure’) |
_ts | Timestamp of the event |
Part 2 (20 points) 应用分析期中代考
Uber’s goal is to make sure that the drivers that start the signup process end up taking their first trip on the platform.
A snapshot of data from the driver_signup table is available in the driver_signup.csv file.
An Uber product manager needs help answering a few (intentionally ambiguous) questions:
- Are there any issues with the data? Feel free to make assumptions to perform your analysis.
- What fraction of the drivers that sign up also take a first trip? How long does it take them to take a first trip?
- What makes a driver more likely to start driving?
- How would you define the most important performance metric that ourcross-functional team should track using this table? What are one or two additional performance metrics that we should track?
- Do you have any ideas for our app to improve the performance metrics that you picked?
Submission instructions: 应用分析期中代考
Please report the results of your analysis in a Google Slides deck of at most 12 pages (title page included). Optional: you can add additional pages or code in an appendix section of your deck (they won’t be graded). When done working on your slides deck, you will click on Share (top right) and change the General Access setting to LionMail (or Public or columbia.edu). You will then copy the url link to your deck and paste it in the Courseworks submission field.Your analysis will be evaluated based on its content and the criteria for building an effective slides deck, discussed throughout the course.
Table name: driver_signup
Column name: | Datatype: |
id | integer |
city_id | Integer (where did the driver sign up) |
signup_os | String (can be: “android”, “ios”, “website”, …) |
signup_channel | string (Can be: “offline”, “paid”, “organic”, “referral”) |
signup_timestamp | Timestamp (timestamp of account creation) |
bgc_date | Date (of background consent) |
vehicle_added_date | Date when driver uploaded vehicle info |
first_trip_date | Date of first trip as a driver |
vehicle_make | String (make of vehicle uploaded) |
vehicle_model | String (model of vehicle uploaded) |
vehicle_year | Year that the car was made |
更多代写:cs澳洲Midterm代考 雅思线上代考 英国留学生哲学代考 北美assignment写作 澳大利亚哲学代考 机器学习课业代做