    Approximating Partially Observable Markov Decision Processes with Parametric Belief Distributions for Continuous State Spaces

    View/Open
Knapik_grad.sunysb_0771E_11143.pdf (1.367 MB)
    Date
1-Dec-2012
    Author
    Knapik, Timothy Ryan
    Publisher
    The Graduate School, Stony Brook University: Stony Brook, NY.
    Abstract
This dissertation focuses on training autonomous agents to plan and act under uncertainty, specifically for cases where the underlying state space is continuous. Partially Observable Markov Decision Processes (POMDPs) are a class of models aimed at training agents to seek high rewards or low costs while navigating a state space without knowing their true location. Information about an agent's location is gathered in the form of possibly nonlinear, noisy measurements of the true location. An exactly solved POMDP allows an agent to optimally balance seeking rewards against seeking information about its position in the state space. Solving POMDPs exactly over continuous state domains is computationally intractable, however, motivating the need for efficient approximate solutions. The algorithm considered in this thesis is the Parametric POMDP (PPOMDP) method, which represents an agent's knowledge as a parameterised probability distribution and can infer the impact of future actions and observations. The contribution of this thesis is to enhance the PPOMDP algorithm, yielding significant improvements in training and plan-execution times. Several aspects of the original algorithm are generalized, and the impact on training time, execution time, and performance is measured on a variety of classic robot navigation models from the literature. In addition, a mathematically principled, threefold adaptive sampling scheme is implemented, in which the algorithm automatically varies its sampling effort according to the complexity of the posterior distributions. Finally, a forward search algorithm is proposed that improves execution performance for sparse belief sets by searching several plies deeper than previous implementations allowed.
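To make the flavor of the approach concrete, the following is a minimal Python sketch of a parametric (Gaussian) belief update and a shallow forward search over beliefs, in the spirit of the abstract above. The 1-D dynamics, observation model, reward, goal location, adaptive sample-count heuristic, and all parameter values are hypothetical illustrations, not the dissertation's actual models, threefold sampling scheme, or implementation.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D robot model: additive-drift dynamics with process noise.
def transition(x, a):
    return x + a + rng.normal(0.0, 0.1, size=x.shape)

# Likelihood of a noisy measurement z of the true location x.
def observe_likelihood(z, x):
    sigma_z = 0.2
    return np.exp(-0.5 * ((z - x) / sigma_z) ** 2) / (sigma_z * np.sqrt(2.0 * np.pi))

# Hypothetical reward: cost grows with distance from an assumed goal at x = 5.
def reward(x, a):
    return -abs(x - 5.0)

def belief_update(mean, var, a, z, n_base=200):
    """Monte Carlo moment-matching update of a Gaussian belief.

    Sample from the current belief, push the samples through the
    dynamics, weight them by observation likelihood, and refit a
    Gaussian. The sample count grows with the belief spread -- a crude
    stand-in for the adaptive sampling idea in the abstract.
    """
    n = int(n_base * max(1.0, var / 0.1))            # adaptive sample count
    xs = transition(rng.normal(mean, np.sqrt(var), size=n), a)
    w = observe_likelihood(z, xs)
    w = w / w.sum()
    new_mean = float(np.sum(w * xs))
    new_var = float(np.sum(w * (xs - new_mean) ** 2)) + 1e-6
    return new_mean, new_var

def forward_search(mean, var, depth, actions=(-1.0, 0.0, 1.0), gamma=0.95):
    """Depth-limited search over Gaussian beliefs; returns (value, best action).

    For brevity the observation is taken at the predicted mean; a fuller
    implementation would branch over a sampled observation set.
    """
    if depth == 0:
        return 0.0, None
    best_v, best_a = -np.inf, None
    for a in actions:
        r = reward(mean, a)                # reward evaluated at the belief mean
        z = mean + a                       # assumed (mean) observation
        m2, v2 = belief_update(mean, var, a, z)
        v_next, _ = forward_search(m2, v2, depth - 1, actions, gamma)
        v = r + gamma * v_next
        if v > best_v:
            best_v, best_a = v, a
    return best_v, best_a

value, action = forward_search(mean=0.0, var=1.0, depth=3)
print(f"3-ply lookahead value {value:.2f}, first action {action:+.1f}")

Running the sketch prints a lookahead value and a first action; the point is only to show how a parametric belief can be updated by sampling and refitting, and how a depth-limited search over belief states selects actions, not to reproduce the dissertation's algorithm or results.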
    Description
132 pages.
    URI
    http://hdl.handle.net/1951/59730
    Collections
• Stony Brook Theses & Dissertations [SBU]

    SUNY Digital Repository Support
DSpace software copyright © 2002-2023 DuraSpace
    Contact Us | Send Feedback
DSpace Express is a service operated by Atmire NV