Reinforcement Learning in Aerospace (Final Exam)
Steve Barriault
December 13, 2009

Contents
1 Introduction
2 The Challenges Addressed by the Reviewed Papers
2.1 Description
2.2 Common Technical Challenges to All Situations
3 How can Reinforcement Learning be used to Solve these Challenges
3.1 Genetic Algorithm Reinforcement Learning (GARL)
3.2 Associative Search Element and Adaptive Critic Element
3.3 Co-Evolutionary Perception
3.4 Galerkin Sequential Function Approximator (SFA)
3.5 Recapitulation of the Methods Used
4 Results
5 Analysis of the Results
5.1 Common Points that are Mutually Self-Supporting in Most or All Papers
5.2 Points that Contradict Each Other
6 Conclusion

1 Introduction
This paper reviews four recent peer-reviewed articles discussing the application of machine learning techniques in aerospace. More precisely, they propose different implementations of the reinforcement learning paradigm to address different challenges.

I chose this topic first and foremost out of personal interest. I work in embedded software, especially with companies creating safety-critical (or mission-critical) systems, many of them in the aerospace and defense industries. Also, because these applications are critical in nature, they offer both an opportunity and a challenge to machine learning algorithms such as reinforcement learning. As we saw throughout the semester, these algorithms can offer optimizing solutions to problems.
But are they efficient enough, fast enough, and especially reliable enough to become a viable solution in the aerospace industry, even in a safety-oriented, conservative-leaning environment such as civilian aerospace (which uses one of the strictest software quality standards, DO-178B, for certification purposes)? This survey is a first step toward answering this question.

The papers were selected based on a simple search in the IEEE Xplore digital library.[1] I used keywords such as "reinforcement learning", "machine learning" and "aerospace" in conjunction with one another to retrieve a list of articles, which I then sorted in reverse chronological order so as to choose the most recent ones. I also briefly reviewed the abstract of each paper to determine whether the article was appropriate (some were not, as they belonged to other academic disciplines). I then made a final choice among the remaining papers. I selected four of them so as to highlight different challenges in the aerospace industry, ranging from the "mundane" task of maintaining altitude (and a smooth, safe ride) in a Boeing 747 to the futuristic goal of optimizing the flight of a morphing vehicle.

This paper is divided into several parts. In Section 2, the challenges addressed by each paper will be explained, and their common characteristics highlighted. Then, in Section 3, I will explain the different techniques that were chosen to implement the reinforcement learning algorithm. Section 4 will summarize the results and Section 5 will analyze them, highlighting where results are compatible with one another and where they are not.
The Conclusion will explain what I learned from this exercise and the questions that are left unanswered.

2 The Challenges Addressed by the Reviewed Papers

2.1 Description
The first paper (by Jiang, Gong, Liu, Xu and Chen) applies reinforcement learning to altitude control for airplanes, in particular the Boeing 747, one of the largest, highest-performance civilian planes in the world.[2] This is no small challenge, as the altitude of the plane is determined by a number of factors, both internal and external to the aircraft. Pressure, wind direction, speed and gusts are examples of environmental impacts on the plane that will, ceteris paribus, modify the altitude of the B747. To compensate, the plane may use its ailerons, flaps, elevators and rudder, usually in conjunction with one another.

To make matters worse, obtaining a predictable mathematical model for both the endogenous and exogenous factors is either very difficult or plainly impossible. As explained by the authors, not only is an aircraft a non-linear, high-dimensional system for which it is difficult to obtain a precise mathematical model,[2] but the environment will also often change during a 15-hour flight to Australia, most of the time suddenly and without warning. And since flight protocols instruct eastbound planes to fly at an odd multiple of 1000 feet and westbound planes at an even multiple of 1000 feet, the altitude control error of an aircraft should be less than a hundred feet in order to be safe.[2] Not to mention that a constantly changing altitude may result in passenger discomfort, extending to the frequent use of certain plastic-lined paper bags.

The second paper, written by C.-K. Lin, deals with a much faster, more mission-oriented kind of aircraft: missiles.[3] Here, gross performance, not safety, is the primary goal.
The author explains that bank-to-turn (BTT) missiles have been widely studied because they have greater maneuverability and aerodynamic acceleration than traditional skid-to-turn (STT) missiles. A skid-to-turn missile or aircraft does not roll its entire body when turning; it merely uses the rudder or a differential in thrust to accomplish a horizontal turn.[4] Bank-to-turn simply describes an aircraft that does roll on its axis when turning, just like any modern plane.

On the other hand, an STT maneuver can be performed much more quickly than a BTT one.[4] Given the great speeds achieved by a missile (especially in the case of Intercontinental Ballistic Missiles, or ICBMs), the slower BTT turn may have deleterious effects on the mission. Also, BTT introduces strong coupling between pitch and yaw motions, which makes the missile difficult to control at great speeds.[3]

The third paper (by Berenji, Vengerov and Ametha) deals with Unmanned Aerial Vehicles (UAVs) and the need to maximize their efficiency.[5] A UAV must pursue two goals at once. First, it must make sure to maximize the use of its own sensors