UPDATING HD MAPS WITH VIDEOS FROM VEHICLES

The Abstract
A method to keep high-definition city maps up to date using crowdsourced videos from cars on roads.

The Authors
CMU master's students Yi-Chun Kuo, Shining Yu, and Ti-Teng Lin, and CMU Principal Project Scientists Christoph Mertz and John Dolan

The Problem
High-definition maps are essential for self-driving systems. A robust HD map can complement real-time sensor data, providing a vehicle with information about objects and shapes of roadways that are outside sensor range. The challenge with HD maps, however, is keeping them up to date as road conditions change.

The Breakthrough
The research group is utilizing more than six years of GPS-tagged video data recorded around Pittsburgh to train their mapping system, keeping it up to date with smartphone and transit-bus cameras. This dataset has allowed them to develop a computer vision algorithm that compares a current video with an existing HD map to detect any relevant changes. For example, a current video may note a traffic sign not captured on an existing map. Their system notes the change and immediately adds it to the HD map.

TOWARD STREAMING PERCEPTION

The Abstract
A metric that makes it possible for the first time to compare and jointly optimize perception systems for both accuracy and speed.

The Authors
CMU PhD students Mengtian Li and Yuxiong Wang, and Argo Principal Scientist Deva Ramanan

The Problem
Until now, no metric has tracked the delay between the time that a sensor sees something and the time it takes for a computer to process that information, making it difficult to compare the relative quality of different perception systems.

The Breakthrough
The new metric, known as streaming perception, makes it possible to compare perception systems for both accuracy and reaction time, and in turn, to optimize them for such metrics. Thinking in these terms has helped Argo fine-tune its approach to tracking perception data. Specifically, Argo has lowered the amount of perception data its system needs to process by using dynamic scheduling and asynchronous tracking of perception data. Instead of tracking each sensor stream (lidar, radar, cameras) in isolation, Argo's perception stack asynchronously fuses sensor inputs to minimize overall latency. The approach also helps reduce compute power, meaning it reduces the amount of computing hardware needed on-vehicle, which is hugely important for lowering autonomous vehicle costs and scaling up production.

EXPLOITING VISIBILITY FOR 3D OBJECT DETECTION

The Abstract
Deep 3D networks can use depth sensors to estimate the free space between a sensor and a measured point in order to significantly improve accuracy in 3D object detection.

The Authors
CMU PhD student Peiyun Hu, Assistant Professor David Held, Argo Software Lead Jason Ziglar, and Deva Ramanan

The Problem
Most popular representations of lidar sensor data are meant for processing truly 3D data, ignoring the fact that much of what's considered 3D sensor data, such as a lidar sweep, is in fact 2.5D. The result: the loss of hidden information about free space (space that isn't occupied by any objects) in representations of lidar data.

The Breakthrough
The knowledge lost from representing 2.5D data as collections of (x, y, z) points can be efficiently recovered through 3D raycasting: using depth sensors to estimate the free space between a sensor and a measured 3D point. Recovering this knowledge is crucial, because it makes the data fit for use in the deep-learning models used to train a self-driving system to understand its surroundings. This significantly improves the overall detection accuracy of a system's lidar sensor.

WHAT-IF MOTION PREDICTION FOR AUTONOMOUS DRIVING

The Abstract
A new machine-learning approach for context-aware, multimodal behavior forecasting.

The Authors
William Qi, Jagjeet Singh, and Andrew Hartnett, Argo Prediction Engineers; Siddhesh Khandelwal, Computer Science master's student from the University of British Columbia; and Deva Ramanan

The Problem
Forecasting the future states of other actors (cars, cyclists, pedestrians, animals, etc.) on busy roads is one of the greatest challenges in developing safe autonomous vehicles. It's made more difficult because actors are multimodal (actors move in different ways, and they can vary in types of motion), and because their future states depend on other actors, road structures, and even a self-driving vehicle's intended motion plan.

The Breakthrough
A new deep-learning approach to prediction that takes into account geometric (car-to-lane) and social (car-to-car) relationships. This approach will allow self-driving systems to make what-if predictions of future states, based on different potential configurations of road lanes and multi-car interactions. This new method improves a self-driving vehicle's motion-planning capabilities by considering unlikely futures that might impact its intended route.

JUST GO WITH THE FLOW: SELF-SUPERVISED SCENE FLOW ESTIMATION

The Abstract
A new machine-learning method that helps engineers train tracking systems on unlabeled autonomous driving datasets, reducing the immense amount of time required to annotate data and train a system.

The Authors
CMU PhD student Brian Okorn, CMU Assistant Professor David Held, and CMU Research Intern Himangi Mittal

The Problem
Scene flow is a technology that helps autonomous systems track the motion of multiple independent actors (cars, cyclists, pedestrians) in a scene. But training scene flow systems has historically required massive annotated datasets.

The Breakthrough
A new machine-learning method to train scene flow systems that relies on self-supervised learning: machine learning with no human oversight and no human-annotated data. This is crucial, because it reduces the huge amount of time and energy needed to create a suitable dataset to train (and improve) scene flow systems.

Image courtesy of Carnegie Mellon University: students at work inside the Robotics Institute at Carnegie Mellon University.
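The core idea behind the streaming perception metric can be illustrated with a toy sketch: score each prediction against the ground truth at the moment the prediction becomes available, not at the moment its input frame was captured, so a slow-but-accurate model is penalized for stale outputs. All names and the scoring function here are illustrative assumptions, not the paper's actual evaluation code.

```python
def streaming_accuracy(frames, predictions, match_fn):
    """Toy streaming-perception score.

    frames: list of (timestamp, ground_truth) pairs.
    predictions: list of (ready_timestamp, predicted_state) pairs,
        sorted by ready_timestamp; ready_timestamp includes latency.
    match_fn: scores a prediction against the truth (1.0 = perfect).
    """
    total = 0.0
    for t, truth in frames:
        # Use the most recent prediction that had FINISHED by time t;
        # unprocessed frames contribute nothing, however accurate the model.
        ready = [pred for (tp, pred) in predictions if tp <= t]
        latest = ready[-1] if ready else None
        total += match_fn(latest, truth)
    return total / len(frames)
```

With a zero-latency oracle the score is 1.0; with even 0.1 s of latency on a world that changes every frame, every output is one frame stale and the score collapses, which is exactly the accuracy-versus-reaction-time trade-off the metric makes visible.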
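The raycasting idea in the visibility work can be sketched as a simple voxel sweep: every lidar return implies that the space between the sensor and the hit point was visibly empty. The sampling-based traversal below is a minimal illustration (assuming points in the positive octant of the grid), not the paper's implementation, which uses a proper efficient raycasting scheme.

```python
import numpy as np

def raycast_freespace(points, grid_shape, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Mark voxels along each sensor ray: 0 = unknown, 1 = free, 2 = occupied."""
    grid = np.zeros(grid_shape, dtype=np.uint8)
    origin = np.asarray(origin, dtype=float)
    for p in np.asarray(points, dtype=float):
        ray = p - origin
        dist = np.linalg.norm(ray)
        if dist == 0:
            continue
        step = ray / dist * (voxel_size * 0.5)   # half-voxel sampling steps
        for i in range(int(dist / (voxel_size * 0.5))):
            idx = tuple(((origin + step * i) / voxel_size).astype(int))
            if all(0 <= idx[d] < grid_shape[d] for d in range(3)):
                grid[idx] = 1                     # visibly empty space
        end = tuple((p / voxel_size).astype(int))
        if all(0 <= end[d] < grid_shape[d] for d in range(3)):
            grid[end] = 2                         # the measured return itself
    return grid
```

The free/occupied/unknown distinction is the "hidden knowledge" the article describes: a raw (x, y, z) point cloud records only the returns, while the swept grid also records where the rays passed through empty air.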
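The self-supervised scene flow idea can be sketched with two label-free signals: a nearest-neighbor loss (points moved by the predicted flow should land on the next lidar sweep) and a cycle-consistency loss (flowing forward then backward should return each point home). The function names and the combination below are illustrative assumptions, not the authors' exact training objective.

```python
import numpy as np

def nn_distance(a, b):
    """Mean distance from each point in a to its nearest neighbor in b."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean()

def self_supervised_flow_loss(p1, p2, flow_fwd, flow_bwd):
    """Label-free loss for scene flow between point clouds p1 and p2.

    nn term: the warped cloud should coincide with the next sweep.
    cycle term: forward flow plus the backward flow at the warped
    points should cancel, returning each point to its origin.
    """
    warped = p1 + flow_fwd
    nn_loss = nn_distance(warped, p2)
    cycle_loss = np.linalg.norm(flow_fwd + flow_bwd, axis=-1).mean()
    return nn_loss + cycle_loss
```

Neither term references human annotations; both are computed from consecutive raw sweeps, which is what lets the method train on unlabeled driving logs.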