Changsha, Hunan 410073
China
National University of Defense Technology
opponent hidden information inference, state estimation, stable feature, action model, Texas Hold'em
structural developmental neural network, information saturation, continual learning, unsupervised learning
Overestimation reduction \sep Multi-agent Operator switching \sep Value averaging \sep Reinforcement Learning.
Multi-agent Interpretability \sep Risk-sensitive Reinforcement Learning Cooperative Policy.
Multi-agent \sep Adaptive risk attitudes \sep Distributional \sep Reinforcement learning \sep Risk-sensitive.