Scene compliant spatio-temporal multi-modal multi-agent long-term trajectory forecasting

Ridel, Daniela Alves

Scene compliant spatio-temporal multi-modal multi-agent long-term trajectory forecasting

Detalhes bibliográficos
Ano de defesa:	2021
Autor(a) principal:	Ridel, Daniela Alves
Orientador(a):	Não Informado pela instituição
Banca de defesa:	Não Informado pela instituição
Tipo de documento:	Tese
Tipo de acesso:	Acesso aberto
Idioma:	eng
Instituição de defesa:	Biblioteca Digitais de Teses e Dissertações da USP
Programa de Pós-Graduação:	Não Informado pela instituição
Departamento:	Não Informado pela instituição
País:	Não Informado pela instituição
Palavras-chave em Português:	Aprendizado de máquina Convolutional neural networks Machine learning Multimodal trajectory forecasting Predição multimodal de trajetórias Redes neurais convolucionais
Link de acesso:	https://www.teses.usp.br/teses/disponiveis/55/55134/tde-08112021-112852/
Resumo:	Predicting long-term human motion is challenging due to the non-linearity, multi-modality, and inherent uncertainty in future trajectories. Such type of prediction is important to ensure safety in the context of self-driving vehicles, especially when driving inside cities where vulnerable road agents, as cyclists and pedestrians, might be more commonly seen. By predicting the trajectories of surrounding agents, the self-driving car can plan safer routes and avoid possible collisions. Prior studies have used different types of input information depending on the type of agent (cars, pedestrians, or cyclists), the length of the predicted trajectory (long or short-term), and the number of predicted trajectories (unimodal or multimodal). Related work either rely on highdefinition maps or processes scene and past trajectories as disconnected features, therefore the spatial inference of context in future trajectories is lost. We propose a new approach to trajectory forecasting that aligns the input information in space and time in an agent-centered manner. By aligning the input information we can take advantage of convolutional neural networks to compute the most plausible paths. Our model automatically learns and enforces scene context and therefore can predict multiple plausible paths according to the input information. The proposed approach achieved competitive results compared to the state-of-the-art in the Stanford Drone Dataset (SDD) for long-term trajectory forecasting, using five predicted trajectories. For critical applications, like self-driving cars, it is important to predict several possible future trajectories of each target agent, as it covers a broader range of possible futures, increasing self-driving car safety. Accordingly, the prediction of trajectories is a crucial task to be developed and included in the self-driving cars pipeline.

Scene compliant spatio-temporal multi-modal multi-agent long-term trajectory forecasting

Registros relacionados