Learn to Earn: Enabling Coordination within a Ride Hailing Fleet
Abstract: The problem of optimizing social welfare objectives on multi sided ride hailing platforms such as Uber, Lyft, etc., is challenging, due to misalignment of objectives between drivers, passengers, and the platform itself. An ideal solution aims to minimize the response time for each hyper local passenger ride request, while simultaneously maintaining high demand satisfaction and supply utilization across the entire city. Economists tend to rely on dynamic pricing mechanisms that stifle price sensitive excess demand and resolve the supply demand imbalances emerging in specific neighborhoods. In contrast, computer scientists primarily view it as a demand prediction problem with the goal of preemptively repositioning supply to such neighborhoods using black box coordinated multi agent deep reinforcement learning based approaches. Here, we introduce explainability in the existing supply repositioning approaches by establishing the need for coordination between the drivers at specific locations and times. Explicit need based coordination allows our framework to use a simpler non deep reinforcement learning based approach, thereby enabling it to explain its recommendations ex post. Moreover, it provides envy free recommendations i.e., drivers at the same location and time do not envy one another’s future earnings. Our experimental evaluation demonstrates the effectiveness, the robustness, and the generalizability of our framework. Finally, in contrast to previous works, we make available a reinforcement learning environment for end to end reproducibility of our work and to encourage future comparative studies.