Kalman Predictor-Based Proactive Dynamic Thermal Management for 3D NoC Systems with Noisy Thermal Sensors

Yuxiang Fu, Li, Kun Wang, Chuan Zhang
DOI: https://doi.org/10.1109/tcad.2017.2661808
IF: 2.9
2017-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: Thermal sensor noise strongly affects the efficiency and effectiveness of a dynamic thermal management (DTM) strategy. To address the problem of forecasting temperature from noisy thermal sensors, we first propose a Kalman-based runtime thermal prediction scheme. To obtain accurate temperature predictions, a multivariate linear power model and a physically based state-space thermal model for 3-D network-on-chip are also proposed. Simulation results show that the proposed prediction scheme reduces the standard deviations of the prediction error by 46%-53% compared with the auto-regressive-based scheme under sensor noise with sigma = 2. Conventional reactive DTM techniques suffer from significant performance degradation due to their pessimistic reaction. Based on the proposed prediction scheme, we therefore further propose a proactive DTM strategy that primarily consists of a thermal-aware routing algorithm and a proactive throttling scheme: 1) To take into account both thermal and congestion issues, we propose a proactive congestion- and thermal-aware routing algorithm. Simulation results demonstrate that it achieves better throughput and a better thermal balance. Specifically, under uniform traffic, it reduces the maximum chip temperature by about 3.9 °C and achieves 78.3% higher throughput compared with the competing thermal optimization approach based on a dynamic programming network. 2) When the temperature exceeds the threshold, existing coarse-grained reactive throttling schemes cool down the overheated nodes at the cost of significant performance loss. In this paper, a proactive quota-based throttling scheme is proposed. Simulation results show that it improves throughput by up to 11.1% compared with the reactive throttling schemes.
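To illustrate the general idea of Kalman-based prediction from noisy sensor readings, here is a minimal scalar sketch. It is not the paper's method: the abstract describes a multivariate linear power model and a state-space thermal model for 3-D network-on-chip, whereas this example assumes a hypothetical one-node model T[k+1] = a*T[k] + b*P[k] + w with measurement z[k] = T[k] + v. The function names, the coefficients `a` and `b`, the noise variances `q` and `r`, and the initial values are all illustrative assumptions.

```python
import random


def simulate_noisy_sensor(steps, power=5.0, a=0.95, b=0.5,
                          process_sigma=0.1, sensor_sigma=2.0, t0=50.0):
    """Generate true temperatures and noisy readings for the assumed model.

    Returns (true_temps, readings); true_temps has steps + 1 entries so that
    true_temps[k + 1] is the value a prediction made at step k should match.
    """
    true_temps = [t0]
    for _ in range(steps):
        noise = random.gauss(0.0, process_sigma)
        true_temps.append(a * true_temps[-1] + b * power + noise)
    readings = [t + random.gauss(0.0, sensor_sigma) for t in true_temps[:-1]]
    return true_temps, readings


def kalman_predict_temperature(readings, powers, a=0.95, b=0.5,
                               q=0.01, r=4.0, t0=50.0, p0=1.0):
    """Return one-step-ahead temperature predictions from noisy readings.

    q is the assumed process-noise variance and r the assumed sensor-noise
    variance (r = 4.0 corresponds to a sensor noise sigma of 2).
    """
    t_est, p_est = t0, p0  # prior state estimate and its variance
    predictions = []
    for z, u in zip(readings, powers):
        # Measurement update: blend the prior with the noisy reading.
        gain = p_est / (p_est + r)
        t_est = t_est + gain * (z - t_est)
        p_est = (1.0 - gain) * p_est
        # Time update: propagate through the assumed thermal model to
        # obtain the prediction for the next sampling instant.
        t_est = a * t_est + b * u
        p_est = a * a * p_est + q
        predictions.append(t_est)
    return predictions
```

A proactive scheme would act on the value returned for the next step (for example, compare it against a thermal threshold) rather than on the raw reading; the quality of the prediction depends on how well the assumed model and noise variances match the real system.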