Deep reinforcement learning for stochastic last-mile delivery with crowd shipping
Abstract
We study a setting in which a company has its own fleet of capacitated vehicles and drivers to make deliveries but may also use the services of occasional drivers (ODs), who are willing to make deliveries with their own vehicles in return for a small compensation. Under this business model, known as crowd shipping, the company seeks to make all deliveries at minimum total cost, i.e., the cost of its own vehicles and drivers plus the compensation paid to the ODs. We consider a stochastic and dynamic last-mile delivery environment in which customer delivery orders, as well as ODs willing to make deliveries, arrive randomly throughout the day and present themselves for deliveries made within fixed time windows. We present a novel deep reinforcement learning (DRL) approach that can handle large real-life problem instances, in which the action selection problem is formulated as a mixed-integer optimization program. The DRL approach is compared against other approaches to optimization under uncertainty, namely sample-average approximation (SAA) and distributionally robust optimization (DRO). The results, based on out-of-sample performance, show that the DRL approach is effective and well suited to processing large samples of uncertain data.
Origin: Files produced by the author(s)