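To address the poor learning performance and large fluctuations of the deep deterministic policy gradient (DDPG) algorithm on some tasks with large state spaces, a multi-actor deep deterministic policy gradient algorithm based on progressive k-means clustering (MDDPG-PK-Means) is proposed. During training, when an action is selected for the state at each time step, the discrimination result of the k-means algorithm assists the decision of the actor networks, and the number of k-means cluster centers is gradually increased as the number of training time steps grows. The MDDPG-PK-Means algorithm is applied on the MuJoCo simulation platform, and the experimental results show that, compared with DDPG and other algorithms, it achieves better results on most continuous tasks.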
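The action-selection idea can be illustrated with a minimal sketch. It is not the authors' implementation: the linear growth schedule (`num_clusters`), the rule "one actor per cluster", and every function name below are assumptions made for illustration, and the actors are plain callables standing in for actor networks.

```python
import numpy as np

def num_clusters(step, k_max, grow_every):
    """Hypothetical schedule: start with 1 cluster, add one every `grow_every` steps."""
    return min(k_max, 1 + step // grow_every)

def kmeans(states, k, iters=20, seed=0):
    """Plain Lloyd's k-means on an (n, d) float array; returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    centroids = states[rng.choice(len(states), size=k, replace=False)].copy()
    for _ in range(iters):
        # Distance from every state to every centroid, then nearest-centroid labels.
        dists = np.linalg.norm(states[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            members = states[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def nearest_centroid(state, centroids):
    """Index of the cluster whose center is closest to `state`."""
    return int(np.argmin(np.linalg.norm(centroids - state, axis=1)))

def select_action(state, centroids, actors):
    """Assumed rule: the cluster index of the state picks which actor acts."""
    idx = nearest_centroid(state, centroids)
    return idx, actors[idx](state)
```

In a training loop under these assumptions, the centroids would be refit on buffered states whenever `num_clusters(step, k_max, grow_every)` increases, with `len(actors)` equal to `k_max`.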
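To address the low estimation accuracy of existing direction of arrival (DOA) estimation methods under low signal-to-noise ratio (SNR), few snapshots, and multiple sources, a DOA estimation method based on the parallel coordinate descent algorithm is proposed. First, the spatial domain is divided uniformly into equal-angle intervals to construct an overcomplete redundant dictionary. Second, the sparse signal is reconstructed following the idea of the parallel coordinate descent algorithm, which yields the sparse coefficient matrix of the signal in the spatial domain. Finally, the l2-norms of the row vectors of the sparse matrix are mapped onto the spatial grid to obtain accurate DOA estimates. Simulation results show that, under low SNR, few snapshots, and multiple sources, the method outperforms subspace-based, greedy, and convex-optimization-based algorithms, with lower root mean square error (RMSE), higher DOA estimation accuracy, and higher running efficiency.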
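The three steps can be sketched as follows. This is a sketch under stated assumptions, not the paper's algorithm: it assumes a uniform linear array with half-wavelength spacing, an l2,1-regularized least-squares objective with weight `lam`, and a parallel coordinate descent step in which all rows are updated from the same residual and the combined step is chosen by backtracking. All function names are hypothetical.

```python
import numpy as np

def steering_dictionary(n_sensors, grid_deg, spacing=0.5):
    """Overcomplete dictionary: one steering vector per grid angle.
    Assumes a uniform linear array; `spacing` is in wavelengths."""
    theta = np.deg2rad(np.asarray(grid_deg, dtype=float))
    m = np.arange(n_sensors)[:, None]
    return np.exp(-2j * np.pi * spacing * m * np.sin(theta)[None, :])

def row_soft_threshold(Z, tau):
    """Shrink the l2-norm of row i of Z by tau[i] (zero if the norm is below tau[i])."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau[:, None] / np.maximum(norms, 1e-12))
    return Z * scale

def objective(A, Y, X, lam):
    """0.5 * ||Y - A X||_F^2 + lam * (sum of row l2-norms of X)."""
    return 0.5 * np.linalg.norm(Y - A @ X) ** 2 + lam * np.linalg.norm(X, axis=1).sum()

def pcd_doa(Y, A, lam, n_iter=200):
    """Returns the spatial spectrum (row l2-norms of X) and the objective history."""
    col_sq = np.sum(np.abs(A) ** 2, axis=0)
    X = np.zeros((A.shape[1], Y.shape[1]), dtype=complex)
    history = [objective(A, Y, X, lam)]
    for _ in range(n_iter):
        R = Y - A @ X
        # Exact minimizer of every row taken separately, all from the same residual.
        Z = X + (A.conj().T @ R) / col_sq[:, None]
        D = row_soft_threshold(Z, lam / col_sq) - X
        mu = 1.0
        while mu > 1e-6:  # backtrack until the objective does not increase
            cand = X + mu * D
            f = objective(A, Y, cand, lam)
            if f <= history[-1]:
                break
            mu *= 0.5
        else:
            break  # no acceptable step found; stop iterating
        X = cand
        history.append(f)
    return np.linalg.norm(X, axis=1), history

def pick_peaks(spectrum, grid_deg, k):
    """Map the spectrum onto the grid: angles of the k largest interior local maxima."""
    s = np.asarray(spectrum)
    inner = np.arange(1, len(s) - 1)
    cand = inner[(s[inner] >= s[inner - 1]) & (s[inner] > s[inner + 1])]
    top = cand[np.argsort(s[cand])[::-1][:k]]
    return np.sort(np.asarray(grid_deg, dtype=float)[top])
```

A typical call, again under these assumptions, builds `A = steering_dictionary(8, np.arange(-90, 91, 1.0))`, passes the sensors-by-snapshots data matrix `Y` to `pcd_doa`, and reads the estimates from `pick_peaks`. The value of `lam` has to be tuned to the noise level.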
"Journal of Jilin University (Science Edition)" is a comprehensive academic journal in the sciences, sponsored by Jilin University and administered by the Ministry of Education of the People's Republic of China. The journal began publication in 1955. Its original title was "Journal of Natural Science of Northeast People University", which was changed to "Acta Scientiarum Naturalium Universitatis Jilinensis" in 1958 following the renaming of the university.