Add further information to the representation vectors

The Trajectron++ paper suggests that one could "_add further additional information (e.g., raw LIDAR data, camera images, pedestrian skeleton or gaze direction estimates) in this framework by encoding it as a vector and adding it to this backbone of representation vectors, **e**<sub>x</sub>_".
Has anyone tried this? If so, where in the code did you add it, and did the results improve?
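
To make the question concrete, this is roughly how I read the suggestion. It is only a minimal, framework-agnostic sketch in plain Python, not code from the repository. I am assuming that "adding" means concatenating the new encoding onto the existing vector. The names `encode_extra` and `augment_representation`, the dimensions, and the random projection weights are all hypothetical; in the real model the encoder would presumably be a learned module.

```python
import math
import random


def encode_extra(raw, weights, bias):
    """Project raw extra features (e.g. flattened gaze or skeleton values)
    to a fixed-length vector with a single tanh layer.
    Hypothetical stand-in for a learned encoder."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, raw)) + b)
        for row, b in zip(weights, bias)
    ]


def augment_representation(e_x, extra_encoding):
    """Concatenate the extra encoding onto the existing representation vector."""
    return list(e_x) + list(extra_encoding)


# Hypothetical sizes: a 6-dim e_x, 4 raw extra features, a 3-dim encoding.
rng = random.Random(0)
e_x = [rng.uniform(-1, 1) for _ in range(6)]
raw_gaze = [0.1, -0.3, 0.7, 0.0]
weights = [[rng.uniform(-1, 1) for _ in raw_gaze] for _ in range(3)]
bias = [0.0, 0.0, 0.0]

e_x_aug = augment_representation(e_x, encode_extra(raw_gaze, weights, bias))
print(len(e_x), len(e_x_aug))
```

If this reading is right, every layer that consumes the representation vector would also need its input size increased by the length of the new encoding, which is part of why I am asking where in the code this is best done.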
Thanks and Best Regards