Classification Problems for MDPs and Optimal Customer Admission to M/M/k/N Queues
MetadataShow full item record
My dissertation addresses the unichain classification problem for any finite Markov decision process (MDP) with a recurrent or stopping state and the optimal admission problem for an M/M/k/N controlled queue with holding costs. In the first chapter, we study the unichain classification problem for MDPs. The unichain classification problem is to detect whether an MDP with finite states and actions is unichain or not. This problem has been proven to be NP-hard. We study this problem while an MDP contains a state which is either recurrent under all deterministic policies or absorbing under some action. We introduce the definitions of avoidable and reachable sets and provide the corresponding polynomial algorithms that finds the states from which a given set is avoidable or reachable. We also provide a polynomial algorithm that detects whether a state is recurrent and solves the unichain classification problem for an MDP with a recurrent state and a polynomial algorithm for finding all recurrent and stopping states and solving the unichain classification problem with recurrent or stopping states. At the end of the first chapter, we discuss detecting all transient states in an MDP in polynomial time.In the second chapter, we study optimal admission of arrivals to an M/M/k/N queue. The arriving customers are classified into m types. The rewards and holding costs depend on customer types. Upon admitting an arriving customer, the system collects the reward from the admitted customer and pays the holding cost to the admitted customer. We study average reward per unit time for the problem. We prove the existence of an optimal trunk reservation policy and describe the structures of stationary optimal, canonical, bias optimal, and Blackwell optimal policies. If there exist two or more stationary optimal policies, we apply more sensitive optimality criteria to detect the best policy among all stationary optimal policies. We show that bias optimal and Blackwell optimal policies are unique, coincide, and are the trunk reservation policies with the largest optimal control level for each customer type.