Title: Personalized autonomous driving maneuver based on reinforcement learning


Current research trends suggest that road traffic will gradually shift from human-driven cars to automated cars. However, the adoption of autonomous vehicles in daily life remains controversial in several respects, because people are not yet ready to accept them. The main concerns are safety and reliability. According to Reuters [1], 67 percent of respondents considered that the safety standard for autonomous driving should be higher than that for conventional driving. In addition, in March 2020, almost half of Americans responded that they would never ride in an autonomously driven car, and 60% of respondents said they would trust such a car more if they understood how it works [2]. People tend to think that ten more years are required before they can trust automated driving systems [2].
To address these concerns, car manufacturers such as Tesla and Hyundai have been investing considerable effort to bring the autonomous vehicle era closer [3, 4]. However, autonomous technology is not yet mature enough for real-world use. So far, manufacturers offer only driving assistance systems, such as the Advanced Driver Assistance System (ADAS), and recommend using them only on highways. The primary reason is that automated driving in urban areas is complicated by the lack of communication infrastructure and by the many more factors that must be considered for safe driving than on highways [5-8].
In urban areas, intersections can typically be categorized as signalized or unsignalized. At a signalized intersection, the traffic signal maximizes traffic flow efficiency while ensuring crossing priority. An unsignalized intersection is a different matter. Although fewer cars may cross it than a signalized one, crossing priority depends on each driver's decision making, since most countries outside North America and South Africa do not adopt the stop-sign rule [9]. Therefore, driving at unsignalized intersections must not only ensure a safe crossing but also resemble how humans drive. Many researchers have begun to focus on unsignalized intersection control to give people a trustworthy autonomous driving experience, so that people will trust and ride autonomous vehicles in the future. This study investigates techniques that take human experience into account while making a car drive autonomously.
To take human experience into account, this study uses human-in-the-loop (HITL) and virtual reality (VR) technology. The HITL approach has gained attention because computer systems cannot accurately accomplish certain tasks that require human participation [10-12]. VR technology is widely used to offer participants a realistic experience [13]. Integrating the two techniques has the potential to produce practical driving behavior datasets that reflect human experience.
To obtain autonomous driving maneuvers, both supervised learning (ref) and reinforcement learning (ref) have been applied. Among them, reinforcement learning (RL) has attracted attention because the agent can learn by itself through interaction with the environment. A notable benefit of reinforcement learning is that it can find solutions that humans have never thought of, or reproduce techniques that only very few people can perform. In addition, researchers in computer science have started to incorporate human experience or human knowledge into the reinforcement learning procedure to reach an optimized result more quickly [14]. Likewise, agents that consider an individual's preferences through the human-in-the-loop technique have been used in other areas. For example, in the fashion industry, Amazon's StyleSnap service provides personalized outfit recommendations according to each customer's taste. Therefore, it is worthwhile to research optimized autonomous driving maneuvers based on an individual's driving habits. The research gaps that this study can fill are as follows:

  • Research on optimizing autonomous driving maneuvers at unsignalized intersections is insufficient compared with research at signalized intersections.
  • Most existing studies do not consider human behavior and focus only on the efficiency of movement, whereas this study will consider people's characteristics.
  • Training based on personalized driving data is expected to shorten the time reinforcement learning needs to reach the optimized value.
  • The realism of the trained model can be assessed by comparing it with human driving data collected in a virtual environment.