Data-driven Convex Policy Optimization in an Assemble-to-order System
Published: 2023-10-11

Time: Friday, October 13, 2023, 2:30-3:30 PM

Venue: Second Academic Lecture Hall

Speaker: Tianhu Deng (邓天虎)

Affiliation: Department of Industrial Engineering, Tsinghua University


Abstract:

This paper investigates the optimization of periodic-review assemble-to-order (ATO) production systems, in which multiple products are assembled from multiple components, in a data-driven setting where only historical demand data are available and the demand distributions are unknown. To address this challenge, we propose a semi-model-based fitted Q iteration (S-FQI) algorithm framework that leverages the known transition dynamics. We prove a statistical convergence rate for the proposed algorithm with respect to the number of iterations, the number of demand samples, and the number of generated trajectories.
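
To illustrate the idea of combining known transition dynamics with historical demand samples, here is a minimal sketch on a deliberately simplified problem. It is a single-item, lost-sales inventory toy with a tabular Q function, not the multi-product ATO system and not the S-FQI algorithm from the talk. The function name `sampled_q_iteration` and the parameters `max_inv`, `hold`, `lost_sale`, `gamma`, and `iters` are all assumptions made for illustration.

```python
def sampled_q_iteration(demands, max_inv=5, hold=1.0, lost_sale=4.0,
                        gamma=0.9, iters=50):
    """Toy Q iteration using historical demand samples (hypothetical setup).

    The transition rule (next inventory = max(inventory + order - demand, 0))
    is assumed known; only the demand distribution is unknown, so the
    expectation in the Bellman backup is replaced by a sample average.
    """
    states = range(max_inv + 1)
    # Action a = order quantity; capacity keeps s + a <= max_inv.
    Q = {(s, a): 0.0 for s in states for a in range(max_inv + 1 - s)}
    for _ in range(iters):
        # Greedy (cost-minimizing) value of each state under the current Q.
        V = {s: min(Q[s, a] for a in range(max_inv + 1 - s)) for s in states}
        new_Q = {}
        for (s, a) in Q:
            total = 0.0
            for d in demands:
                y = s + a                      # inventory after ordering
                nxt = max(y - d, 0)            # known transition dynamics
                cost = hold * nxt + lost_sale * max(d - y, 0)
                total += cost + gamma * V[nxt]
            new_Q[s, a] = total / len(demands)  # sample-average backup
        Q = new_Q
    policy = {s: min(range(max_inv + 1 - s), key=lambda a: Q[s, a])
              for s in states}
    return Q, policy
```

Calling `sampled_q_iteration([2, 2, 2])` returns the Q table and a greedy ordering policy indexed by inventory level. With a function approximator in place of the table, the averaging step would become a regression fit, which is where the "fitted" in fitted Q iteration comes from.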

Additionally, we introduce the convex-TD3 (CTD3) algorithm, which addresses practical challenges by exploiting the convexity of ATO systems and using an input convex neural network (ICNN) to improve efficiency and effectiveness.
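
For readers unfamiliar with input convex neural networks, the following is a minimal sketch of the structural constraint that makes the output convex in the input: weights applied to hidden activations are kept nonnegative, and the activation (ReLU) is convex and nondecreasing. The class name `TinyICNN`, the layer sizes, and the random untrained weights are assumptions for illustration. This is not the CTD3 architecture from the talk.

```python
import random


def relu(v):
    return [max(0.0, a) for a in v]


def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, nonneg=False):
    M = [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    if nonneg:
        # Nonnegative weights preserve convexity of the previous layer.
        M = [[abs(a) for a in row] for row in M]
    return M


class TinyICNN:
    """One-hidden-layer input convex network with random weights (sketch)."""

    def __init__(self, in_dim, hidden, seed=0):
        rng = random.Random(seed)
        self.Wx0 = rand_matrix(hidden, in_dim, rng)          # unconstrained
        self.b0 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        self.Wz1 = rand_matrix(1, hidden, rng, nonneg=True)  # must be >= 0
        self.Wx1 = rand_matrix(1, in_dim, rng)               # skip connection
        self.b1 = rng.uniform(-1.0, 1.0)

    def __call__(self, x):
        # Hidden layer: ReLU of an affine map, so each unit is convex in x.
        z1 = relu([a + b for a, b in zip(matvec(self.Wx0, x), self.b0)])
        # Output: nonnegative combination of convex units plus an affine term.
        return matvec(self.Wz1, z1)[0] + matvec(self.Wx1, x)[0] + self.b1
```

Because the output is a nonnegative combination of convex functions plus an affine term, `net(x)` is convex in `x` for any choice of the unconstrained weights. In a trained network the nonnegativity would have to be maintained during optimization, for example by clipping or reparameterizing `Wz1`.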

About the speaker:

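Tianhu Deng (邓天虎), Ph.D., is an Associate Professor in the Department of Industrial Engineering at Tsinghua University. He received his Ph.D. in Industrial Engineering and Operations Research from the University of California, Berkeley in 2013 and his bachelor's degree from the Department of Industrial Engineering at Tsinghua University in 2008. His current research focuses on smart supply chains. As first author or corresponding author, he has published more than 20 papers in international journals and conferences, including Manufacturing & Service Operations Management and Operations Research.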
