Below is a structured table displaying various requirements and their descriptions.
| Requirement | Description |
|---|---|
| Book a cab | The user should be able to book a cab from any pickup location to any drop-off location. When they do so, the system should try to match them with one of the closest available drivers. |
| Track the ride | The user should be able to track their journey from source to destination on a map. |
Note: We won't be covering payments in this design, as the focus is on cab sharing.
| Requirement | Description |
|---|---|
| Accept/decline the booking request | The cab driver who accepts the rider first will be given the ride |
| View ride history | Cab driver should be able to view his ride history mentioning payments for each ride. |
| Requirement | End User | Description |
|---|---|---|
| Availability | Customer & Cab Driver | The system should be highly available (99.999999% uptime). |
| Latency | Customer & Cab Driver | Users should receive a booking acknowledgement within 15 seconds. |
| Scalability | Customer & Cab Driver | The system should support global users and traffic from multiple geographic regions. |
| Scalability | Customer | The system should support 36 million Daily Active Users (DAU) and 180 million Monthly Active Users (MAU). |
| Scalability | Cab Driver | The system should support 3 million DAU and 93 million MAU. |
| Extensibility | Customer & Cab Driver | The system should be designed so that it is easy to extend in the future, e.g., adding features like auto-pilot or luxury cab bookings. |
Note: For now, we are not concentrating on non-functional requirements such as user experience, security, and storage reliability.
For capacity estimation, we will consider both customers and cab drivers. Here, customers are the users who book rides through the cab sharing service.
- Daily Active Users (DAU): 36 million
- Monthly Active Users (MAU): 180 million

Note: The customer DAU and MAU estimates are based on Uber's Wikipedia entry. Feel free to adjust these estimates.
- Daily Active Users (DAU): 3 million
- Monthly Active Users (MAU): 93 million

Note: The cab driver DAU and MAU estimates are rough figures based on Google search results. Feel free to adjust these estimates.
Calculation of write requests and read requests to the system
One possible source of write requests to the system:
- Customer booking request information

In most cases, write requests other than booking requests are one-time activities.
Assumptions:
- 50 out of 100 customers book rides daily.

Calculation:
- Total DAU: 36 million
- Write requests per day:
  - Booking: (50/100) x 36,000,000 = 18,000,000 ~ 18 million
One possible source of read requests to the system:
- Users track their ride.

Assumptions:
- 55 out of 100 users track their ride.

Calculation:
- Total DAU: 36 million
- Read requests per day:
  - Ride tracking: (55/100) x 36,000,000 = 19,800,000 ~ 19.8 million
One possible source of write requests to the system:
- Cab driver's booking acceptance/rejection information

In most cases, write requests other than booking and payment requests are one-time activities.

Assumptions:
- 95 out of 100 customer booking requests are accepted by cab drivers daily.

Calculation:
- Total customer DAU: 36 million
- Write requests per day:
  - Ride acceptance: (95/100) x 36,000,000 = 34,200,000 ~ 34.2 million
Some possible sources of read requests to the system:
- Cab drivers read their booking requests.
- Cab drivers check their ride history along with fare details.

Assumptions:
- 95 out of 100 booking requests are read by cab drivers.
- 90 out of 100 rides have their history (with fare details) checked by drivers.

Calculation:
- Total customer DAU: 36 million
- Read requests per day:
  - Booking request reads: (95/100) x 36,000,000 = 34,200,000 ~ 34.2 million
  - Ride history reads: (90/100) x 36,000,000 = 32,400,000 ~ 32.4 million
| End User | Operation | Calculation | Result |
|---|---|---|---|
| Customer | Write | (50/100) x 36 million | 18 million |
| Customer | Read | (55/100) x 36 million | 19.8 million |
| Cab Driver | Write | (95/100) x 36 million | 34.2 million |
| Cab Driver | Read | (95/100) x 36 million | 34.2 million |
| Cab Driver | Read | (90/100) x 36 million | 32.4 million |
| Cab Driver | Total read requests per day | | 66.6 million |
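The arithmetic above can be reproduced with a short script; the DAU figure and the percentages are the assumptions stated earlier in this section (integer math avoids floating-point rounding):

```python
# Rough daily request-volume estimates, using the assumptions above.
CUSTOMER_DAU = 36_000_000

def daily_requests(dau: int, percent: int) -> int:
    """Requests per day = DAU x the percentage of users doing the action."""
    return dau * percent // 100

customer_writes = daily_requests(CUSTOMER_DAU, 50)   # bookings
customer_reads  = daily_requests(CUSTOMER_DAU, 55)   # ride tracking
driver_writes   = daily_requests(CUSTOMER_DAU, 95)   # ride acceptances
driver_reads    = (daily_requests(CUSTOMER_DAU, 95)  # booking-request reads
                   + daily_requests(CUSTOMER_DAU, 90))  # ride-history reads

print(customer_writes)  # 18000000
print(customer_reads)   # 19800000
print(driver_writes)    # 34200000
print(driver_reads)     # 66600000
```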
Assumptions
- Average size of a cab sharing record: 100 KB
- Daily write operations to the system: 18 million

Storage Calculations
- Daily storage requirement: 100 KB x 18 million requests per day = 1.8 TB per day
- Storage requirement for 10 years: 1.8 TB per day x 365 days x 10 years = 6.57 PB
Assumptions
- Average size of a cab sharing record: 100 KB
- Daily write operations to the system: 34.2 million

Storage Calculations
- Daily storage requirement: 100 KB x 34.2 million requests per day = 3.42 TB per day
- Storage requirement for 10 years: 3.42 TB per day x 365 days x 10 years = 12.483 PB
Summary
| End User | Metric | Calculation | Result |
|---|---|---|---|
| Customer | Daily Storage | 100 KB x 18 M requests/day | 1.8 TB |
| Customer | 10-Year Storage | 1.8 TB/day x 365 days x 10 years | 6.57 PB |
| Cab Driver | Daily Storage | 100 KB x 34.2 M requests/day | 3.42 TB |
| Cab Driver | 10-Year Storage | 3.42 TB/day x 365 days x 10 years | 12.483 PB |
Note: The 100 KB average size of a cab sharing user record is a rough estimate. Feel free to adjust it.
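The storage figures follow directly from the record size and the daily write counts estimated above (decimal units, i.e., 1 TB = 10^9 KB):

```python
# Storage estimate: record size x daily writes, extrapolated to 10 years.
RECORD_KB = 100
KB_PER_TB = 1_000_000_000  # decimal units, matching the estimates above

def daily_storage_tb(writes_per_day: int) -> float:
    return writes_per_day * RECORD_KB / KB_PER_TB

def ten_year_storage_pb(daily_tb: float) -> float:
    return daily_tb * 365 * 10 / 1000  # TB -> PB

customer_daily = daily_storage_tb(18_000_000)         # 1.8 TB/day
driver_daily   = daily_storage_tb(34_200_000)         # 3.42 TB/day
customer_10y   = ten_year_storage_pb(customer_daily)  # 6.57 PB
driver_10y     = ten_year_storage_pb(driver_daily)    # 12.483 PB
```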
By memory, we refer to the cache memory size required for faster data access.
Accessing data directly from the database takes time. To speed up data retrieval, cache memory is used.
Customer:
- Daily Storage Requirement: 1.8 TB
- Cache Requirement (1% of Daily Storage): (1/100) x 1.8 TB = 18 GB

Cab Driver:
- Daily Storage Requirement: 3.42 TB
- Cache Requirement (1% of Daily Storage): (1/100) x 3.42 TB = 34.2 GB
Note: You may wonder why we considered 1% of daily storage as the cache requirement. This is because we only need to cache geo-spatial data relevant to the user, i.e., the area close to their location.
The memory size should scale as the system grows to accommodate increasing storage and data access demands.
| End User | Metric | Result |
|---|---|---|
| Customer | Daily Storage | 1.8 TB |
| Customer | Cache Requirement (1% of Daily Storage) | 18 GB |
| Cab Driver | Daily Storage | 3.42 TB |
| Cab Driver | Cache Requirement (1% of Daily Storage) | 34.2 GB |
Network/Bandwidth estimation helps us determine the amount of data flowing in and out of the system per second.
Customer:
- Data stored per day: 1.8 TB
- Calculation: 1.8 TB / (24 x 60 x 60) = 20.833 MB/sec
- Result: Incoming data flow = 20.833 MB/sec

- Total read requests per day: 19.8 million
- Average customer record size: 100 KB
- Daily outgoing data: 19.8 million x 100 KB = 1.98 TB/day
- Calculation: 1.98 TB / (24 x 60 x 60) = 22.92 MB/sec
- Result: Outgoing data flow = 22.92 MB/sec

Cab Driver:
- Data stored per day: 3.42 TB
- Calculation: 3.42 TB / (24 x 60 x 60) = 39.58 MB/sec
- Result: Incoming data flow = 39.58 MB/sec

- Total read requests per day: 66.6 million
- Average record size: 100 KB
- Daily outgoing data: 66.6 million x 100 KB = 6.66 TB/day
- Calculation: 6.66 TB / (24 x 60 x 60) = 77.08 MB/sec
- Result: Outgoing data flow = 77.08 MB/sec
| End User | Type | Calculation | Result |
|---|---|---|---|
| Customer | Ingress (Data flow in) | 1.8 TB / (24 x 60 x 60) | 20.833 MB/sec |
| Customer | Egress (Data flow out) | 1.98 TB / (24 x 60 x 60) | 22.92 MB/sec |
| Cab Driver | Ingress (Data flow in) | 3.42 TB / (24 x 60 x 60) | 39.58 MB/sec |
| Cab Driver | Egress (Data flow out) | 6.66 TB / (24 x 60 x 60) | 77.08 MB/sec |
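The bandwidth rows above are all the same division: daily volume spread over the 86,400 seconds in a day. A tiny helper reproduces them:

```python
# Ingress = data written per day / seconds per day;
# egress = (reads/day x record size) / seconds per day.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def per_second_mb(tb_per_day: float) -> float:
    """Convert a TB/day volume to MB/sec (decimal units)."""
    return tb_per_day * 1_000_000 / SECONDS_PER_DAY

customer_ingress = per_second_mb(1.8)    # ~20.83 MB/sec
customer_egress  = per_second_mb(1.98)   # ~22.92 MB/sec
driver_ingress   = per_second_mb(3.42)   # ~39.58 MB/sec
driver_egress    = per_second_mb(6.66)   # ~77.08 MB/sec
```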
We need a standard and efficient way for the cab sharing systems to communicate. Computers talk to each other through API calls, so let's first try a REST API for this communication.
Let's say our user (Mark) wants to book a ride with a cab sharing company. Mark can send a request to the cab sharing server, and the server can relay that request to a cab driver (John). John can accept the ride, the server can get John's response, and then the server can pass the booking confirmation back to Mark.
Note: John can also reject Mark's booking request. If so, Mark will have to repeat the process by re-initiating the booking request.
- The user (Mark) sends a booking request with his choice of pick-up and drop-off location(s) to the Cab Sharing Server.
- The Cab Sharing Server maps the booking request to a cab driver (John) who is near Mark's pick-up location.
- John acknowledges the booking request with his acceptance or rejection and notifies the Cab Sharing Server.
- The Cab Sharing Server relays that acknowledgement to Mark.
Note: The process of mapping a cab driver involves micro-service gRPC communication, which we will talk about in the high-level design.
We can handle this with a simple REST API call (HTTP POST request).
Here are the technical details.
This tells the server what action to perform. Since we want to book a cab for a user on the server, we use the POST action.
This tells the server where to perform that action. Since we are booking a ride for a user, we will use the server's /v1/bookings endpoint.
Note: 'v1' means version 1. It is good practice to version your APIs. You can customize the endpoint as needed.
We have told the server to book a ride for a user, but we haven't provided the details of the booking itself. This information is sent in the request body:
{
  "userId": "Identification number of the user",
  "source": "Pick-up location of the ride",
  "destination": "Drop-off location of the ride",
  "cabType": "Type of cab for the ride, e.g., Economy/Premium",
  "BookingQuoteId": "Temporary booking ID for the ride"
}

This is more challenging. The Cab Sharing Server needs to send the message to John.
If the second step doesn't go through, we can forget about the third step, even though client-to-server communication is possible over HTTP. The same applies to the fourth step.
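Putting the method, endpoint, and body together, the client-to-server half of the flow might look like the sketch below. The host name is hypothetical; the endpoint and body fields follow the description above.

```python
# Sketch of the client-side booking call described above.
# The host "cab-sharing.example.com" is a made-up placeholder.
import json
import urllib.request

def build_booking_request(user_id, source, destination, cab_type, quote_id):
    """Assemble the POST /v1/bookings request with the documented body."""
    payload = {
        "userId": user_id,
        "source": source,
        "destination": destination,
        "cabType": cab_type,
        "BookingQuoteId": quote_id,
    }
    return urllib.request.Request(
        url="https://cab-sharing.example.com/v1/bookings",  # hypothetical host
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_booking_request("u-42", "Market St", "Airport", "Economy", "q-7")
# urllib.request.urlopen(req) would actually send it; skipped here.
```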
Problem
In HTTP, the server cannot initiate requests to the user. Requests can only be sent from the client to the server - it's a one-way street (user -> cab sharing server).
Solution: WebSockets
WebSockets enable bidirectional communication.
Note: For more details, you can refer to our WebSocket section of Communication Protocols.
The image below represents the creation of bi-directional communication using an HTTP upgrade request.
Once the connection is upgraded, it remains open, allowing both the cab sharing server and user/client to send messages directly to each other in real-time without needing the traditional HTTP request/response model.
- Mark sends messages to the cab sharing server through his connection.
- John receives messages from the cab sharing server through his connection.
- John sends messages to the cab sharing server through his connection.
- Mark receives messages from the cab sharing server through his connection.
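The relay pattern in these four bullets can be made concrete with a minimal in-process sketch. Plain asyncio queues stand in for the real WebSocket connections; the point is that the server can push to John without John issuing a request first.

```python
# In-process stand-in for the WebSocket relay: the server forwards each
# message arriving on one connection to the other connection.
import asyncio

async def relay(inbox, outbox):
    """Server-side relay loop: push every inbound message to the other side."""
    while True:
        message = await inbox.get()
        if message is None:          # sentinel: connection closed
            break
        await outbox.put(message)

async def demo():
    mark_to_server = asyncio.Queue()   # Mark's "connection" to the server
    server_to_john = asyncio.Queue()   # the server's push channel to John
    task = asyncio.create_task(relay(mark_to_server, server_to_john))

    await mark_to_server.put("booking request: Market St -> Airport")
    delivered = await server_to_john.get()   # John receives the pushed message
    await mark_to_server.put(None)           # close the connection
    await task
    return delivered

received = asyncio.run(demo())
```

A real deployment would use actual WebSocket frames (e.g., via a WebSocket library), but the bidirectional push shape is the same.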
In the above API, we saw how Mark booked a ride. Now, let's see how Mark will track his ride.
We will continue with the WebSocket connection as Track The Ride requires bi-directional communication and also a stream of messages from server to client.
Ride tracking involves the steps below.
- Mark will wait for John's cab to arrive at the pick-up location.
- John will start the ride after picking Mark from the pick-up location.
- Mark and John will start tracking their ride along the way.
Imagine Mark is waiting at his pick-up point for John. He has no clue when John will arrive. So, Mark opens the cab sharing application to get John's Estimated Time of Arrival (ETA).
The server will read Mark's pick-up point and John's current location coordinates to calculate the ETA, then share John's ETA to the pick-up point with Mark.
Imagine John has arrived at Mark's pick-up point and Mark has gotten into John's cab. They want to start the ride, so John clicks the Start Ride button in the cab sharing application.
The server will read Mark's/John's current and drop-off location coordinates to calculate the ETA to the drop-off location, along with the distance in miles. After the calculation, the server will share the ETA and the distance between the pick-up and drop-off locations with both Mark and John via the application.
Let's assume Mark is curious to track the cab ride along with John. So, they both open the cab sharing application to track their ride.
The server will continuously consider Mark's/John's current location and the drop-off location coordinates to calculate a dynamic ETA and distance (in miles). It will asynchronously send these updates to both Mark and John so they can track their ride along the way.
Note:
- ETA calculation involves several steps, which will be covered in the Book A Cab high-level design.
- The map, ETA, and distance (in miles) in the above images are examples. You can adjust these factors as needed.
- There can be more cases during this tracking process. Some of them are:
  - Mark may change his pick-up location during the Wait For The Cab period. In this case:
    - The server can notify both Mark and John about the dynamic change of John's ETA to Mark's location, along with the distance to reach it.
    - The server can give John a heads-up about Mark's new pick-up location.
    - The server can suggest alternate ways to reach the drop-off location by considering various factors, such as ETA, distance, traffic, etc.
  - During the trip, Mark may change his drop-off location. In this case:
    - The server can re-calculate the ETA to the drop-off location along with relevant changes, such as price, alternate routes, etc.
    - The server can notify both Mark and John of these changes.
After dropping Mark at the drop-off point, John thought of checking his ride history along with the payment information. So, John opened his Cab Sharing application and clicked the Activity option on his profile page.
Unlike the previous API, we could use a REST API (HTTP GET request) for viewing ride history. However, we will stick to the WebSocket connection to avoid extra HTTP headers and to maintain consistency.
Now, John is able to view his ride history including payment information. He can also view more details of a particular ride if needed.
Let's see what happened in the background when-
- Mark booked a cab,
- Mark and John tracked their ride,
- John viewed his ride history along with payment information.
In the API design, we saw how WebSocket connection can be used to communicate with cab sharing servers. If we go a little deeper, then we can see how services communicate with each other.
For service communication, we can use gRPC.
Google Remote Procedure Call (gRPC) enables efficient communication between services using protocol buffers and a gRPC server.
- Protocol buffers (an interface definition language) are used to define the service interface and the payload message structure.
- The gRPC server handles client calls.
Note: We can relate gRPC to a general example here
Mark performed a couple of steps to book his cab. Let's dive into them one by one on a high-level.
How was Mark able to view the map that allowed him to choose pick-up and drop-off points graphically? The steps below walk us through it.
1. Mark can tap the Cab Sharing App icon on his mobile device (client) to open the Cab Sharing Application.
2. The Client can send a view request to the API gateway via the WebSocket connection.
   - An API gateway acts as a single entry point for all incoming requests. Note: For more details, you can refer to our API gateway template.
3. The API gateway can relay the request to the load balancer.
   - A load balancer acts like a traffic manager, directing incoming user requests to different servers. Note: For more details, you can refer to our ultimate system design template.
4. The Load balancer can relay the request to the Data Fetch service.
   - The Data Fetch service can take the request and pass it to the appropriate service within the cluster for a response.
5. The Data Fetch service can relay the request to a Load balancer within the service cluster. The Load balancer can direct the request to an available Map service.
   - The Map service is responsible for managing map data.
   - There can be multiple Map services to address multiple requests simultaneously.
6. The Map service can take help from the Map Database service to get the map details.
   - The Map Database service is responsible for getting relevant map details and for maintaining the user's map information until the ride completes.
7. The Map Database service can use the S2 index to get the user's region and to find the relevant database partition.
   - S2 is a library used to work with geographical data. For more details, refer to this link.
8. The Map Database service can use this partition to download the map data from the key-value storage.
   - Key-value storage is an example of a NoSQL database. For more details, refer to our Database and Storage Basics.
9. The Map Database service can relay the response to the Map service.
10. The Map service can take help from the GPS signal service to avoid the risk of missing road data.
11. The Map service can provide the final map data output to the Data Fetch service.
12. The Data Fetch service can save the user's map information to the user record database.
    - The user record database is responsible for maintaining the user's information.
    - Internally, you can implement the Cache-Aside strategy for read operations and the Write-Aside strategy for write operations, as user data storage is especially important in a cab sharing system. Note: For more details on caching, refer to our Caching Basics.
    - As there is a chance of stale data, we can use a ZooKeeper service to maintain synchronization between the database and its replica(s).

Note: The number of Map service instances can depend on the number of user requests.

13. The API gateway can send the response back to the Client. Now, Mark can view the booking page with a map view.
14. The Client can save the map information to the CDN.
    - A Content Delivery Network (CDN) stores copies of your website's data that doesn't change too often. Note: For more details on CDN, you can refer to the CDN section of Database Storages.
15. We can use an in-memory cache to avoid extra network data usage from duplicate downloads of map data by drivers.
    - The in-memory cache can help us update data only when the server's map data changes.
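The cache-aside read path mentioned for the user record database can be sketched as follows. The in-memory dicts are stand-ins for a real cache and database; the user id and record contents are illustrative.

```python
# Cache-aside: check the cache first; on a miss, read the database
# and populate the cache so the next read is served from memory.
cache = {}
database = {"mark": {"map_region": "cell-947"}}  # stand-in user record DB
db_reads = 0

def get_user_record(user_id):
    global db_reads
    if user_id in cache:              # cache hit
        return cache[user_id]
    db_reads += 1                     # cache miss: go to the database
    record = database.get(user_id)
    if record is not None:
        cache[user_id] = record       # populate for subsequent reads
    return record

first = get_user_record("mark")   # miss -> hits the database
second = get_user_record("mark")  # hit  -> served from cache
```

Stale data is the known trade-off here, which is why the text pairs this strategy with replica synchronization.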
To provide map data with low latency, we can make effective use of the CDN. The image below gives an idea of how a CDN decreases the latency of loading Mark's/John's map details.
- Mark opens his cab sharing application, and the Client can request Mark's map information from the CDN.
- The CDN doesn't have Mark's map information yet, so it requests the relevant information from the Cab Sharing servers.
- The Cab Sharing servers can get the map data from the key-value storage.
- The Cab Sharing servers can provide the response back to the CDN.
Note: The Client can validate CDN cache correctness based on various factors, such as Mark's or John's current location.
- Google's S2 library: Unlike many geometry libraries, S2 is primarily designed to work with spherical geometry.
- Amazon DynamoDB: Supports key-value and document data structures. It is primarily used for scalability and performance.
How was Mark able to view the ETA to the drop-off point? Let's find out.
1. The Client can get static map data from the CDN, and Mark can click the OK button after entering the pick-up and drop-off points.
2. The Client can send a view-ETA request to the API gateway. Note: When Mark clicks the OK button, other options such as cab type can also be considered, but we are not covering them in this design.
3. The API gateway can relay the request to the load balancer.
4. The Load balancer can direct the request to the Data Fetch service.
5. The Data Fetch service can relay the request to the Load Balancer.
   - Here, the Data Fetch service can re-validate the input data before relaying the request further. This validation is optional, because a first input validation can be done at the API Gateway.
6. The Load balancer can direct the request to an available Ride Estimator service, as shown in the image above.
   - The Ride Estimator service is responsible for computing the ETA by considering the various factors discussed in the steps below.
   - There can be multiple Ride Estimator services to address multiple requests simultaneously.
7. The Ride Estimator service can send a map data request to an available Map service through a Load Balancer.
8. The Map service can respond with map data based on Mark's pick-up and drop-off points.
9. The Ride Estimator service can use deep learning algorithms to predict traffic control elements such as stop signs and traffic lights.
10. The Ride Estimator service can get average speeds (based on the locations) from a hash table through the Estimator Database handler.
11. The Ride Estimator service can treat the map data as a graph to compute an accurate ETA.
    - In the map graph, a road intersection is a node and a road segment is an edge.
    - Let's say there are many road intersections in Mark's ride path. In this case, we can partition the map graph to calculate the ETA efficiently. Note: You can refer to this link for more details on graphs.
12. The Ride Estimator service can make use of the GPS signal service to get active and recent GPS-observed points between Mark's pick-up and drop-off points. Note: For more information on GPS signals, you can refer to this link.
13. Now, to get an accurate ETA, we can do map matching between the estimated and GPS-observed points, as shown in the image above.
    - In map matching, if the estimated and GPS-observed points don't match, then step 11 is repeated with the ride path paired with the GPS-observed points, skipping steps 12 and 13.
    - The final calculated ETA can be relayed back to the Data Fetch service.
14. The computed ETA, along with the user's details (such as current location, pick-up and drop-off points), can be registered in the database through the Estimator Database handler for reusability. Note:
    - Based on the business need, you can limit or extend this storage.
    - For instance, to limit it, you can save only the ETA associated with the pick-up and drop-off points to the Ride Estimator Database. You can also use this storage if Mark changes his drop-off location.

Note:
- We can store average speeds in a hash table for fast look-up.
- A hash table generalizes the simpler notion of an array. You can refer to the hashing section of our Extras file for more information.

15. The Ride Estimator service can provide the computed ETA output to the Data Fetch service.
16. The Data Fetch service can save the ETA associated with the pick-up and drop-off points to the User Record Database. You can also use this storage if Mark changes his drop-off location. Note: This storage is not required unless you have a business need.
17. The API gateway can relay the response message to the Client.
18. Mark can view the ETA on his booking page.

Note:
- The average speed and ETA in the reference image are examples. You can update them as needed.
- While calculating the ETA, the Haversine distance can be used. Think of it as a formula to compute the shortest distance between two points on a sphere. More details are here.
- The ETA between two locations keeps changing based on various factors, and storing ETAs for the entire ride path would require huge storage. So we are not storing the ETA for the entire ride path; instant communication is preferred.
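The two ideas above (intersections as nodes, road segments as edges, plus the Haversine distance) combine naturally into a shortest-time search. The sketch below runs Dijkstra over a toy road graph whose edge weights are travel times at an average speed; the coordinates and speeds are made up for illustration.

```python
# Toy ETA computation: Haversine edge lengths + Dijkstra over the road graph.
import heapq
from math import asin, cos, radians, sin, sqrt

def haversine_km(a, b):
    """Shortest distance between two (lat, lon) points on a sphere, in km."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))  # 6371 km: mean Earth radius

def eta_minutes(nodes, edges, source, dest):
    """Dijkstra: edge weight = segment length / average speed, in minutes."""
    graph = {n: [] for n in nodes}
    for u, v, speed_kmh in edges:                 # undirected road segments
        minutes = haversine_km(nodes[u], nodes[v]) / speed_kmh * 60
        graph[u].append((v, minutes))
        graph[v].append((u, minutes))
    best, heap = {source: 0.0}, [(0.0, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == dest:
            return t
        if t > best.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            if t + w < best.get(v, float("inf")):
                best[v] = t + w
                heapq.heappush(heap, (t + w, v))
    return None

# Three intersections; the direct segment A-C is slow (10 km/h),
# so the route via B wins despite being longer.
nodes = {"A": (37.77, -122.42), "B": (37.78, -122.40), "C": (37.79, -122.39)}
edges = [("A", "B", 30), ("B", "C", 30), ("A", "C", 10)]
eta = eta_minutes(nodes, edges, "A", "C")
```

Production map matching and graph partitioning add a lot on top of this, but the core ETA primitive is this distance-over-speed shortest path.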
How was Mark able to find a driver for his booking? Let's look into it.
1. The Client can get static map data from the CDN, and Mark can click the Book button after seeing the ETA to his drop-off point.
2. The Client can send a booking request to the API gateway.
   - The Find-A-Driver request is part of the booking request.
3. The API gateway can relay the request to the load balancer.
4. The Load balancer can direct the request to the Data Fetch service.
5. The Data Fetch service can relay the request to an available Driver Finder service through a Load balancer.
   - The Driver Finder service is responsible for finding active drivers near Mark's location.
6. The Driver Finder service can take help from an available Map service through a Load balancer to get map data for the available drivers' locations.
   - The Driver Finder service can use key-value (Redis cluster) storage to store driver locations.
   - A Redis key-value store is an example of a NoSQL database. For more details, you can refer to Database Concept.
   - The key-value storage can have many instances, meaning driver locations are distributed among all instances.
   - We may encounter two issues with this storage:
     - Hot shard problems
       - Hot shards can occur due to the concentration of drivers in big cities.
       - Solution: We can use the S2 library to divide the map into regions that vary in size, so that only a few cars fit inside a single shard. Note: For more details on sharding, you can refer to Database Sharding under System Design Basics.
     - Inactive drivers
       - These are drivers who have stopped driving for the rest of the day.
       - Solution: We can create buckets to store active drivers and remove old buckets containing inactive drivers. To avoid frequent memory allocation and de-allocation, we can use a sorted set keyed by latest timestamp to find nearby drivers.
7. The Driver Finder service can relay a list of active drivers near Mark's location to an available Request service through a Load balancer.
   - The Request service is responsible for sending requests to users and recording their responses.
8. The Request service can send Mark's booking request to the first driver (Kevin) in the active drivers list.
9. The Client can relay Kevin's rejection message to the Request service.
10. As Kevin rejected the booking request, the Request service can send Mark's booking request to John.
11. The Client can relay John's approval message to the Request service.
12. The Client can save the driver details to the CDN.
13. The Request service can stop sending requests to drivers and send the confirmed driver details to the Driver Finder service.
14. The Driver Finder service can update John's ride status in the database.
    - These details are useful in filtering out active drivers.
15. The Driver Finder service can give the response back to the Data Fetch service.
16. The Data Fetch service can store John's details, associated with Mark's ride, in the user record database.
    - These details are useful for maintaining ride history.
17. The API gateway can send the response back to the Client with a booking-acceptance message.
18. The Client can retrieve John's details from the CDN. Note: The Client could get the driver details from the API gateway, but for a faster load time it can make use of the CDN.
19. Now, Mark is able to view the driver details via the Client.
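The driver lookup and sequential offer flow described above can be sketched in a few lines. The fixed-size grid cells are a simplified stand-in for S2 cells, the timestamp filter plays the role of the "active drivers" buckets, and the driver names follow the walkthrough.

```python
# Simplified driver dispatch: bucket drivers into coarse grid cells
# (a stand-in for S2 cells), keep only recently-seen drivers, and
# offer the booking to each nearby driver until one accepts.
from collections import defaultdict

CELL_SIZE = 0.01  # degrees; a real system would use variable-size S2 cells

def cell_of(lat, lon):
    return (round(lat / CELL_SIZE), round(lon / CELL_SIZE))

buckets = defaultdict(dict)  # cell -> {driver: last_seen_timestamp}

def report_location(driver, lat, lon, ts):
    buckets[cell_of(lat, lon)][driver] = ts

def nearby_active_drivers(lat, lon, now, max_age=60.0):
    """Drivers in the rider's cell seen within max_age seconds, freshest first."""
    seen = buckets[cell_of(lat, lon)]
    active = [(ts, d) for d, ts in seen.items() if now - ts <= max_age]
    return [d for ts, d in sorted(active, reverse=True)]

def dispatch(candidates, accepts):
    """Offer the ride to each candidate in order; return the first acceptor."""
    for driver in candidates:
        if accepts(driver):
            return driver
    return None

report_location("Kevin", 37.771, -122.421, ts=100.0)
report_location("John", 37.772, -122.419, ts=90.0)
report_location("Stale", 37.771, -122.420, ts=10.0)   # inactive, filtered out

drivers = nearby_active_drivers(37.771, -122.420, now=110.0)
matched = dispatch(drivers, accepts=lambda d: d == "John")  # Kevin declines
```

In a real deployment the buckets would live in the Redis cluster (e.g., sorted sets), and neighboring cells would also be searched, but the accept-or-move-on loop is the same.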
- Google's S2 library: As described above, we use this library here to divide a map into grids.
- Redis cluster: Redis is a fast, feature-rich cache, data structure server, and document and vector query engine.
How were Mark and John able to track their ride? Let's take a look.
1. The user can start the ride through the Client device to initiate tracking.
2. The Client can get static data from the CDN to populate the Client device.
3. The Client can use the WebSocket connection to forward the request to the API Gateway.
4. The API gateway can relay the request to the load balancer.
5. The Load balancer can direct the request to the Data Fetch service.
6. The Data Fetch service can send a map data request to an available Map service through a Load Balancer.
7. The Data Fetch service can send an ETA request to an available Ride Estimator service through a Load Balancer.
8. The Map service can take help from the Map Database handler to get the map details.
   - The Map Database handler can use the S2 index to get the user's region and to find the relevant database partition.
   - The Map Database handler can use this partition to download the map data from the key-value storage.
   - The Map Database handler can relay the response to the Map service.
9. The Map service can take help from the GPS signal service to avoid the risk of missing road data.
10. The Map service can provide a copy of the final map data output to the Ride Estimator service for computing the ETA.
    - The Map service can send this information on request from the Ride Estimator service.
11. The Estimator service can use deep learning algorithms to predict traffic control elements, such as stop signs and traffic lights. Note:
    - The traffic control elements are considered only once between the pick-up and drop-off points.
    - If Mark changes either the pick-up point (at the start) or the drop-off point (at the end), then you can repeat this step to get updated data.
12. The Estimator service can get location-based average speeds from the hash table through the Estimator Database handler.
13. The Estimator service can extract GPS signal details for missing road information.
    - This information is useful if the estimated path encounters runtime changes, like John taking a different route to the drop-off location or Mark updating the ride with stopping points in between.
14. The Estimator service can make use of the map data to compute the ETA from the user's current location.
    - The partitioned map graph is still useful for computing the ETA efficiently, considering road intersections and road segments.
    - The Map service can relay the response back to the Data Fetch service.
    - The Ride Estimator service can provide the computed ETA to the Data Fetch service.
    - The Data Fetch service can save the user's current and drop-off location coordinates, and the respective ETA, to the user record database.
      - This step can be optional, because maintaining location coordinates throughout the ride path is a storage overhead.
      - You can store just the location coordinates at the start and end of the ride along with their ETAs. You can also use this storage if Mark changes his drop-off location.
15. The API Gateway can give a response to the Client.
    - As we are tracking the ride, responses can be asynchronous, i.e., the cab sharing servers keep sending responses to the client as Mark's or John's current location progresses.
16. The Client can store the information in the CDN.
17. The static information from the CDN can be used to render the client's User Interface (UI) until it receives the response from the backend system.
18. As Mark progresses with his ride, the backend system responds with an updated ETA and his latest location details to the Client through the API Gateway.
19. The Client can store the information in the CDN.
20. The Client retrieves the static information that was saved in the CDN.
    - This process of retrieving data from the backend system continues until the ride completes.
21. The API Gateway can relay the ride-completion response to the client.
22. The Client can save the ride-completion status to the CDN.
    - This can help Mark view his ride status later on.
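The asynchronous tracking updates described above boil down to a server-side stream: each time the cab reports a position, the server recomputes the remaining distance and a naive ETA and pushes both to the rider and the driver. The straight-line distance and the fixed average speed below are deliberate simplifications for illustration.

```python
# Naive tracking stream: for every reported position, yield the remaining
# distance and an ETA, as the server would push them to both riders.
from math import hypot

AVG_SPEED_UNITS_PER_MIN = 0.5  # made-up average speed for this sketch

def tracking_updates(route, dropoff):
    """Yield (remaining_distance, eta_minutes) for each reported position."""
    for position in route:
        remaining = hypot(dropoff[0] - position[0], dropoff[1] - position[1])
        yield remaining, remaining / AVG_SPEED_UNITS_PER_MIN

route = [(0.0, 0.0), (1.0, 1.0), (3.0, 4.0)]  # cab's reported positions
updates = list(tracking_updates(route, dropoff=(3.0, 4.0)))
# The final update shows zero distance remaining: the ride is complete.
```

A real system would replace the generator with pushes over the open WebSocket connection and the straight-line distance with the road-graph ETA discussed earlier.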
How did John check his ride history? Let's see.
-
John can click Profile option from Client's home page to see an option for his ride history called Activity.
- As John's profile page is static, meaning its options won't change often, the Client can take help from the CDN to load the profile page options.
  - If John is accessing his profile page for the first time, the Client can request the profile page data from the Cab Sharing Servers.
  - You might be wondering what happens if the profile page options change. This case can be handled in two ways:
    - The Cab Sharing system can notify the client via an asynchronous notification service and reload the page with the user's approval.
    - The Cab Sharing system can collect all new features for the client and release them as and when an application upgrade happens (preferred for handheld devices like mobile applications).
- After loading the profile page, John can click the Activity option to send a ride history request.
- The Client can relay the ride history request to the API Gateway.
- The API Gateway can relay the request to the Load Balancer.
- The Load Balancer can direct the request to the Data Fetch service.
- The Data Fetch service can request John's previous ride history from the User Record NoSQL database.
- The User Record database can send John's ride history to the Data Fetch service.
  - Note: John's ride history can have payment information associated with each and every ride.
- The API Gateway can relay the ride history response to the Client.
- John can see his previous ride history along with payment information through the Client.
- The Client can save John's ride history to the CDN until John completes his next ride.
  - The reason for maintaining the ride history until his next ride is that the Cab Sharing system can refresh it based on his next ride status.
Final View Ride History HLD:
The table below provides a high-level comparison of when to use SQL vs NoSQL, marked with ✔ (preferred) and ✖ (not preferred).
| Criteria | SQL | NoSQL |
|---|---|---|
| Fast Data Access | ✖ | ✔ |
| Handles Large Scale | ✖ | ✔ |
| Fixed Structure Data | ✔ | ✖ |
| Flexible Structure Data | ✖ | ✔ |
| Complex Queries | ✔ | ✖ |
| Simple Queries | ✖ | ✔ |
| Frequent/Evolving Data Changes | ✖ | ✔ |
| Database | Deciding Factors | Decision |
|---|---|---|
| Map DB | | NoSQL |
| Ride Estimator DB | | NoSQL |
| Driver Finder DB | | NoSQL |
| User Record DB | | NoSQL |
| Attribute | Details |
|---|---|
| Database Type: | NoSQL |
| Common Queries: | Find the map by region ID |
| Indexing: | Region ID |
Note: Because of this common query, we were able to get the map data for Mark and John as per their Region ID. So, we create an index on the Region ID field. This acts as a shortcut to quickly find the information by Region ID.
| Attribute | Details |
|---|---|
| Database Type: | NoSQL |
| Common Queries: | Find the ETA by current and drop-off points. |
| Indexing: | Current location ID, Drop-off location ID |
Note: Because of this common query, we were able to get the historical ETA for Mark and John as per their Current Location ID and Drop-off Location ID. So, we create an index on the Current Location ID and Drop-off Location ID fields. This acts as a shortcut to quickly find the information by Current Location ID and Drop-off Location ID.
| Attribute | Details |
|---|---|
| Database Type: | NoSQL |
| Common Queries: | Find the Driver(s) by region ID. |
| Indexing: | Region ID |
Note: Because of this common query, we were able to get the list of drivers in Mark's region as per his Region ID. So, we create an index on the Region ID field. This acts as a shortcut to quickly find the information by Region ID.
| Attribute | Details |
|---|---|
| Database Type: | NoSQL |
| Common Queries: | Find the user record by user ID. |
| Indexing: | User ID |
Note: Because of this common query, we were able to get John's ride history as per his User ID. So, we create an index on the User ID field. This acts as a shortcut to quickly find the information by User ID.
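The "index on a key field" pattern shared by all four tables above can be illustrated with a toy in-memory index. A real NoSQL store builds and maintains this structure internally; the records and field names here are made up for the example.

```python
# Toy illustration of the index-by-key pattern from the tables above.
# A real NoSQL database maintains such indexes internally.
from collections import defaultdict

rides = [
    {"user_id": "john", "ride_id": 1, "fare": 12.5},
    {"user_id": "mark", "ride_id": 2, "fare": 9.0},
    {"user_id": "john", "ride_id": 3, "fare": 20.0},
]

# Build the index once: user_id -> list of matching records.
by_user = defaultdict(list)
for r in rides:
    by_user[r["user_id"]].append(r)

# "Find the user record by user ID" is now a direct lookup, not a full scan.
print([r["ride_id"] for r in by_user["john"]])  # -> [1, 3]
```

The same idea applies to Region ID for the Map and Driver Finder DBs, and to the location-ID pair for the Ride Estimator DB.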
We've seen how the Map service provided services to Mark and John by gathering data from different sources. Now, let's dive deeper into those sources.
Introduction:
- OpenStreetMap: OpenStreetMap (OSM) is a free, open map database updated and maintained by a community of volunteers through open collaboration.
- S2 Library: Unlike many geometry libraries, Google's S2 is primarily designed to work with spherical geometry, i.e., shapes drawn on a sphere rather than on a planar 2D map.
  - Some of the S2 features:
    - S2 divides the map into grids called cells and gives each cell a unique ID.
    - Flexible support for spatial indexing, including the ability to approximate arbitrary regions as collections of discrete S2 cells. This feature makes it easy to build distributed spatial indexes.
    - Fast in-memory spatial indexing of collections of points, polylines, and polygons.
- Amazon DynamoDB: Amazon DynamoDB is a managed NoSQL database service provided by Amazon Web Services (AWS). It supports key-value and document data structures, and it is primarily used for scalability and performance.
Usage:
- We can use OSM for internal map data. It gives a free and editable map of the world.
- We can use Google's S2 library on top of OSM to efficiently index and query map data.
- We can store road metadata and road segment sequences in each cell. This helps to understand turn restrictions on the road.
- The client can dynamically build a road network graph for low memory usage. This means the client downloads the S2 cells based on the user's location. Besides, the client buffers nearby cells to have enough map data for navigation.
- We can store map data in serialized format on DynamoDB and query the S2 geospatial index to find the relevant DynamoDB partition. The map data then gets downloaded from DynamoDB, deserialized, and filtered before being returned to the client.
This is how Mark and John were able to see their map information.
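The cell-ID idea above can be illustrated with a toy fixed-size grid. This is only a sketch: the real S2 library uses hierarchical cells on a sphere, and `cell_for`, `CELL_DEG`, and `map_store` are our own hypothetical names.

```python
# A toy flat grid index to illustrate the cell-ID idea described above.
# The real S2 library uses hierarchical spherical cells; this is a sketch.

CELL_DEG = 0.01  # roughly 1 km near the equator; an assumption for this sketch

def cell_for(lat: float, lng: float) -> tuple:
    """Map a coordinate to its grid cell ID (the partition key for map data)."""
    return (int(lat // CELL_DEG), int(lng // CELL_DEG))

# Serialized map data keyed by cell ID, standing in for DynamoDB partitions.
map_store = {
    cell_for(37.7749, -122.4194): "road segments for downtown SF",
}

# Nearby coordinates land in the same cell, so one lookup fetches their map data.
print(map_store[cell_for(37.7749, -122.4194)])
```

With real S2 cell IDs the lookup works the same way: compute the cell ID for the user's location, then fetch that partition from DynamoDB.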
We've seen how the ETA service provided services to Mark and John by following several steps. Now, let's look into those steps.
Conceptual Introduction:
- Graph: There are two types of graphs: directed and undirected.
  - Directed Graph: A directed graph G is a pair (V, E), where V is a finite set and E is a binary relation on V. The set V is called the vertex set of G, and its elements are called vertices. The set E is called the edge set of G, and its elements are called edges.
  - Undirected Graph: In an undirected graph G = (V, E), the edge set E consists of unordered pairs of vertices, rather than ordered pairs.
- Weighted Graph: A graph for which each edge has an associated weight, typically given by a weight function w: E -> R. For example, let G = (V, E) be a weighted graph with weight function w. We simply store the weight w(u, v) of the edge (u, v) ∈ E with vertex v in u's adjacency list.
  - ∈ denotes set membership, and is read "is in", "belongs to", or "is a member of".
- Routing Algorithm (Dijkstra): Dijkstra's algorithm is an algorithm for finding the shortest paths between nodes in a weighted graph, which may represent, for example, a road network.
- Routing Algo:
  - We can relate the map to a graph: every road intersection is modeled as a node, while every road segment is modeled as a directed edge.
  - ETA computation then becomes finding the shortest path in a directed weighted graph.
  - Dijkstra's algorithm is known for finding the shortest path in a graph. But the time complexity of Dijkstra's algorithm is O(n log n), where n is the number of road intersections or nodes in the graph.
  - The busiest areas like the San Francisco Bay Area can have half a million road intersections, so Dijkstra's algorithm alone is not enough at our scale. The solution is to partition the map graph and precompute the best path within each partition. Then interacting with the boundaries of the graph partitions is enough to find the best path.
  - Without partitioning, every single node in the map graph must be traversed to find the best path between 2 points, so the cost grows like the area of a circle: pi * r^2.
  - Partitioning and precomputing make it more efficient:
    - It becomes possible to find the best path by interacting with only the nodes on the partition boundaries.
    - So the cost grows like the perimeter of the circle: 2 * pi * r.
  - Put another way, the number of nodes to consider when finding the best path in the San Francisco Bay Area gets reduced from 500 thousand to roughly 700.
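As a concrete reference for the shortest-path step, here is a minimal Dijkstra implementation over a small made-up road graph. The graph, node names, and edge weights are illustrative only; a production router would run this per partition over precomputed boundary paths.

```python
# Minimal Dijkstra's algorithm over a weighted directed graph, as described
# above: intersections are nodes, road segments are weighted directed edges.
import heapq

def dijkstra(graph: dict, source: str) -> dict:
    """Return the shortest travel time from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, already found a shorter path
        for neighbor, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

# Edge weights are travel times in minutes (made-up values).
roads = {
    "A": [("B", 4), ("C", 2)],
    "C": [("B", 1), ("D", 7)],
    "B": [("D", 3)],
}
print(dijkstra(roads, "A"))  # -> {'A': 0, 'B': 3, 'C': 2, 'D': 6}
```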
- Traffic Information:
  - The traffic on the road segments must be considered to find the fastest path between 2 points.
  - Traffic is a function of the time of day, the weather, and the number of vehicles on the road.
  - We can use traffic information to populate the edge weights of the graph, because it makes the ETA more accurate.
  - Besides, we can combine aggregated historical speed information that was stored in the hash table with real-time speed information, because extra traversal data makes the traffic information more accurate.
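The edge-weight idea above can be sketched as a simple blend of historical and real-time speeds. The 0.7/0.3 split and the function name `edge_weight` are assumptions for illustration, not figures from the design.

```python
# Sketch of populating an edge weight from traffic data, per the idea above.
# The 70/30 real-time vs historical blend is an assumed ratio for illustration.

def edge_weight(length_km: float, historical_kmh: float, realtime_kmh: float,
                realtime_share: float = 0.7) -> float:
    """Travel time in minutes, using a blend of real-time and historical speed."""
    speed = realtime_share * realtime_kmh + (1 - realtime_share) * historical_kmh
    return length_km / speed * 60

# A 2 km segment; history says 40 km/h, live probes report 20 km/h right now.
print(round(edge_weight(2.0, 40.0, 20.0), 1))  # blended travel time in minutes
```

The resulting minutes-per-segment values become the weights Dijkstra (or the partitioned variant) minimizes over.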
- Map Matching:
  - GPS signals can get noisy, especially when the vehicle enters enclosed areas like tunnels.
  - Also, the multi-path effect can worsen the GPS signal. The multi-path effect occurs when buildings reflect the GPS signal. A poor GPS signal decreases the ETA accuracy.
  - So we do map matching to find the best ETA. Map matching works by mapping raw GPS signals to actual road segments.
| GPS Signals | Road Segments |
|---|---|
| Latitude | Latitude |
| Longitude | Longitude |
| Direction | Direction |
| Speed | Road Name |
| | Segment ID |
- We can use the Kalman filter for map matching. It takes GPS signals and matches them to road segments.
  - Imagine the Kalman filter as a person who makes a correct guess about something's location. Both new and old information are taken into consideration for the guess.
- Besides, we can use the Viterbi algorithm to find the most probable road segments. It's a dynamic programming approach.
  - Imagine the Viterbi algorithm as a person who figures out the correct story even if some words were spelled wrong, by looking at the nearby words and fixing the mistakes so that the story makes more sense.
- Mark may avoid future trips if the actual trip time is higher than the ETA. Also, more than 30 million trips can be completed daily as per our estimation.
- So at our scale, a bad ETA could cost the cab sharing company billions of USD in losses. The current approach can allow us to scale to half a million requests per second.
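The Viterbi step above can be sketched with a tiny two-segment example: given a sequence of noisy GPS fixes, pick the most probable sequence of road segments. All probabilities, segment names, and observation labels here are made up for illustration; a real map matcher derives them from GPS error models and road topology.

```python
# A tiny Viterbi sketch for map matching, as described above. Probabilities
# and names ("main_st", "tunnel", "good_fix", "noisy_fix") are illustrative.

def viterbi(observations, segments, start_p, trans_p, emit_p):
    """Return the most probable segment sequence for the observations."""
    # best[s] = (probability, path) of the best path ending in segment s
    best = {s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in segments}
    for obs in observations[1:]:
        best = {
            s: max(
                ((p * trans_p[prev][s] * emit_p[s][obs], path + [s])
                 for prev, (p, path) in best.items()),
                key=lambda t: t[0],
            )
            for s in segments
        }
    return max(best.values(), key=lambda t: t[0])[1]

segments = ["main_st", "tunnel"]
start_p = {"main_st": 0.6, "tunnel": 0.4}
trans_p = {"main_st": {"main_st": 0.8, "tunnel": 0.2},
           "tunnel": {"main_st": 0.3, "tunnel": 0.7}}
# Emission: how likely each kind of GPS fix is, given the true segment.
emit_p = {"main_st": {"good_fix": 0.9, "noisy_fix": 0.1},
          "tunnel": {"good_fix": 0.2, "noisy_fix": 0.8}}

# Two noisy fixes in a row suggest the vehicle has entered the tunnel.
print(viterbi(["good_fix", "noisy_fix", "noisy_fix"],
              segments, start_p, trans_p, emit_p))
```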
We saw how the Driver Finder service helped Mark find John by following a process. Now, let's zoom into that sequence.
Conceptual Introduction:
- Redis: Redis is preferred for real-time, data-driven applications like a cab sharing system. It is a fast, feature-rich cache, data structure server, and document and vector query engine.
The process of finding a driver:
- We can store driver locations in a Redis cluster for scalability and low latency. A Redis cluster contains many Redis instances, as shown in the HLD image. This means driver locations are spread across many Redis instances, preventing global write locks and contention issues when many rides get ordered at the same time.
- But sharding Redis based on region causes a hot-shard problem, because big cities have more drivers.
  - So we can use Google's S2 library and divide the map into grids.
  - S2 is hierarchical, meaning the cell size varies from square centimeters to square kilometers.
  - We can choose Geohash (level 5) by default to find nearby drivers. It represents a square kilometer, so only a few cars will fit inside a single shard, and the hot-shard problem wouldn't occur.
- The Redis cluster may contain drivers who have stopped driving for the rest of the day. But we want only active drivers, meaning drivers who are still driving during the day.
  - A simple approach is to create in-memory time buckets periodically and store the list of active drivers in them.
  - Old buckets are removed every 20-30 seconds (approx.), so only active drivers stay in the latest bucket.
  - Yet this results in constant allocation and freeing of memory.
  - So we can instead use a Redis sorted set in each Geohash to find nearby drivers, storing the last timestamp reported by each driver in sorted order. Inactive driver data is expired using the ZREMRANGEBYSCORE command; that is, data of drivers who haven't reported in the last 30 seconds will expire. Simply put, we overwrite memory instead of reallocating it. Imagine the sorted set as key-value pairs sorted by score.
  - Besides, we can store the driver location in a hash data structure. It's also queried to ensure that a driver doesn't show up in 2 different Geohashes while driving through.
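The sorted-set expiry idea can be sketched without a live Redis instance by simulating the per-Geohash set with a plain dict. ZADD and ZREMRANGEBYSCORE are real Redis commands; everything else here (`heartbeat`, `active_drivers`, the driver IDs) is a hypothetical simulation of the behavior described above.

```python
# Simulation of the per-Geohash sorted set described above, using a dict in
# place of Redis. In Redis: ZADD on each heartbeat overwrites the driver's
# score (timestamp), and ZREMRANGEBYSCORE drops drivers whose last report
# is older than the active window.

ACTIVE_WINDOW_SEC = 30

def heartbeat(drivers: dict, driver_id: str, now: float) -> None:
    """ZADD equivalent: overwrite the driver's last-reported timestamp."""
    drivers[driver_id] = now

def active_drivers(drivers: dict, now: float) -> list:
    """ZREMRANGEBYSCORE equivalent: expire stale entries, return the rest."""
    for stale in [d for d, ts in drivers.items() if ts < now - ACTIVE_WINDOW_SEC]:
        del drivers[stale]
    return sorted(drivers)

geohash_shard = {}
heartbeat(geohash_shard, "driver_a", now=100.0)
heartbeat(geohash_shard, "driver_b", now=110.0)
print(active_drivers(geohash_shard, now=125.0))  # both within 30s window
print(active_drivers(geohash_shard, now=135.0))  # driver_a has expired
```

Because each heartbeat overwrites the existing entry, memory is reused rather than reallocated, which is the advantage over rotating time buckets.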
Note:
- The Map and other reference icons are considered just as an example. We can always change them as per our convenience.

















































