Data-Driven Thermal Monitoring and Run-Time Management for Manycore Processor and Chiplet Designs
Principal Investigators
- Dr. Sheldon Tan (PI)
Graduate Students
Current Students
- Sheriff Sadiqbatcha
- Shuyuan Yu
- Wentian Jin
- Jinwei Zhang
- Mohammadamir Kavousi
- Yibo Liu
- Liang Chen (post-doc, SJTU)
- Subed Lamichhane
- Jincong Lu
Graduate Students (graduated)
- Han Zhou (First job: Synopsys)
Industry Liaisons
Funding
- National Science Foundation CISE CCF Core Small program (CCF-1816361), "SHF: Small: Data-Driven Thermal Monitoring and Run-Time Management for Manycore Processor and Chiplet Designs", $500,000, Oct. 1st, 2021 to Sept. 30th, 2024, single PI.
Project Descriptions
Background
Publications
Journal publications
- S. Sadiqbatcha, J. Zhang, H. Amrouch and S. X.-D. Tan, “Real-time full-chip thermal tracking: a post-silicon, machine learning perspective”, IEEE Transactions on Computers (TC), (accepted), 2021.
- J. Zhang, S. Sadiqbatcha, M. O’Dea, H. Amrouch and S. X.-D. Tan, “Full-chip power density and thermal map characterization for commercial microprocessors under heat sink cooling”, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 41, No. 5, pp. 1453-1466, May 2022, 10.1109/TCAD.2021.3088081.
- L. Chen, S. Sadiqbatcha, H. Amrouch and S. X.-D. Tan, “Electrothermal simulation and optimal design of thermoelectric cooler using analytic approach”, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 41, No. 9, pp. 3066-3077, 2022, 10.1109/TCAD.2021.3120533.
- J. Zhang, S. Sadiqbatcha and S. X.-D. Tan, “Hot-Trim: Thermal and reliability management for commercial multi-core processors considering workload dependent hot spots”, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), accepted, 10.1109/TCAD.2022.3216552.
Conference publications
- S. Sadiqbatcha, H. Zhao, H. Amrouch, J. Henkel and S. X.-D. Tan, “Hot spot identification and system parameterized thermal modeling for multi-core processors through infrared thermal imaging”, Proc. Design, Automation and Test in Europe (DATE’19), Florence, Italy, March 2019.
- Z. Sun, H. Zhou, and S. X.-D. Tan, “Dynamic reliability management for multi-core processor based on deep reinforcement learning”, International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD’19), Lausanne, Switzerland, July 2019.
- S. Sadiqbatcha, Y. Zhao, J. Zhang, H. Amrouch, J. Henkel and S. X.-D. Tan, “Machine learning based online full-chip heatmap estimation”, Proc. Asia South Pacific Design Automation Conference (ASP-DAC’20), Beijing, China, Jan. 2020. (35% acceptance rate)
- J. Zhang, S. Sadiqbatcha, W. Jin and S. X.-D. Tan, “Accurate power density map estimation for commercial multi-core microprocessors”, Proc. Design, Automation and Test in Europe (DATE’20), Grenoble, France, March 2020. (26% acceptance rate)
- S. Yu, H. Zhou, H. Amrouch, J. Henkel and S. X.-D. Tan, “Run-time accuracy reconfigurable stochastic computing for dynamic reliability and power management: work-in-progress”, Proc. International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’20), ESWeek 2020, Sept. 2020.
- W. Jin, S. Sadiqbatcha, J. Zhang and S. X.-D. Tan, “Full-chip thermal map estimation for multi-core commercial CPUs with generative adversarial learning”, Proc. IEEE/ACM International Conf. on Computer-Aided Design (ICCAD’20), San Diego, CA, Nov. 2020. (invited)
- J. Zhang, S. Sadiqbatcha, Y. Gao, M. O’Dea, N. Yu, and S. X.-D. Tan, “HAT-DRL: Hotspot-Aware Task Mapping for Lifetime Improvement of Multicore System using Deep Reinforcement Learning”, Proc. 2nd IEEE/ACM Workshop on Machine Learning for CAD (MLCAD’20), Virtual Event, Nov. 2020.
- L. Chen, W. Jin and S. X.-D. Tan, “Fast thermal analysis for chiplet design based on graph convolution networks”, Proc. Asia South Pacific Design Automation Conference (ASP-DAC’22), virtual, Jan. 2022. (invited)
- J. Zhang, J. Lu, W. Jin, S. Sachdeva and S. X.-D. Tan, “Learning based spatial power characterization and full-chip power estimation for commercial TPUs”, Proc. Asia South Pacific Design Automation Conference (ASP-DAC’23), Japan, Jan. 2023. (invited)