# Logic Picture-Based Dynamic Power Estimation for Unit Gate-Delay Model CMOS Circuits

## Omnia S. Ahmed<sup>1</sup>, Mohamed F. Abu-Elyazeed<sup>1</sup>, Mohamed B. Abdelhalim<sup>2</sup>, Hassanein H. Amer<sup>3\*</sup>, Ahmed H. Madian<sup>4</sup>

<sup>1</sup>Faculty of Engineering, Cairo University, Giza, Egypt <sup>2</sup>College of Computing and Information Technology, Arab Academy for Science, Technology & Maritime Transport, Cairo, Egypt <sup>3</sup>Electronics Engineering Department, American University in Cairo, Cairo, Egypt <sup>4</sup>Radiation Engineering Department, Egyptian Atomic Energy Authority, Cairo, Egypt Email: \*hamer@aucegypt.edu

Received February 1, 2013; revised March 1, 2013; accepted March 9, 2013

Copyright © 2013 Omnia S. Ahmed *et al.* This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

# ABSTRACT

In this research, a fast methodology to calculate the exact value of the average dynamic power consumption for CMOS combinational logic circuits is developed. The delay model used is the unit-delay model, where all gates have the same propagation delay. The main advantages of this method over other techniques are its accuracy, since it is deterministic, and its lower computational effort compared with exhaustive simulation. The methodology uses the Logic Pictures concept to obtain the nodes' toggle rates. The proposed method is applied to well-known circuits and the results are compared with those of exhaustive simulation and Monte Carlo simulation.

Keywords: Dynamic Power Estimation; Logic Pictures; CMOS Digital Logic Circuits; Toggle Rate; Unit-Delay Model

## **1. Introduction**

Power dissipation is an important parameter for digital VLSI circuits, as excessive power consumption may lead to runtime errors or permanent damage due to overheating. Hence, along with low-power design techniques at different levels of the design, accurate power estimation tools are highly needed. Currently, there are many methods for estimating the power consumption; they are mainly categorized as non-simulative-based [1-5] and simulative-based methods [6-9]. Non-simulative-based methods can be either probabilistic or statistical. They rely on probabilistic measures for the inputs and the switching activities to estimate the power. While they are efficient for large circuits with acceptable margins of error, the power produced is not accurate but only an estimate. For simulative-based methods, the circuit is simulated with different inputs to obtain the power consumption. The main problems in simulative-based methods are the large memory requirements, the time consumption and the difficulty of finding the representative input vector set needed to exercise the circuit. Exhaustive simulations (where all pairs of input vectors are applied to the circuit) are very accurate but, obviously, time consuming, especially for large circuits.

\*Corresponding author.

In [10], an accurate method was introduced for calculating the average and the maximum dynamic power at the gate level. The paper developed the concept of Logic Pictures (LPs) for calculating the average power. As an LP is the status of the gate outputs, it was found that the number of LPs was much smaller than the number of input patterns; hence, LPs were used instead of input patterns to obtain all the possible transitions for the circuit nodes and then the power consumption. The main advantage of this method is that it is deterministic and the simulations required are much less time-consuming than exhaustive simulations. The logic picture concept was modified in [11] to calculate the average power consumption for sequential circuits. In [12], the method was generalized and extended to calculate the maximum power consumption for sequential circuits, including all types of Flip-Flops and the power consumed at their internal nodes; it was also shown how the tool could be used for design space exploration to select the appropriate Flip-Flop that consumed less power. While the method in [10-12] is accurate, it assumed that no propagation delay was associated with logic gates, *i.e.*, a zero-delay model.

In this research, a method to calculate an accurate toggle rate, assuming the unit-delay model, is presented using the LP concept. The toggle rate can be directly related to the dynamic power consumption. The proposed method is backward-compatible as it can be easily modified to obtain the power consumption for the zero-delay gate model.

The rest of this paper is organized as follows. Section 2 introduces the methodology to calculate the switching activity of the circuit nodes under unit-delay model assumption for all the gates. Section 3 contains the experimental results while Section 4 has the conclusions.

## **2. Methodology**

For CMOS logic circuits, the average dynamic power can be calculated as follows [13]:

$$P_{avg} = \frac{1}{2} V_{dd}^2 f_{clk} \sum_N \alpha_i c_i \tag{1}$$

where $V_{dd}$ is the supply voltage, $f_{clk}$ is the clock frequency, N is the number of gate outputs (circuit nodes), $\alpha_i$ is the toggle rate of the output of gate i and $c_i$ is the output capacitance of gate i. From this equation, it can be seen that $V_{dd}$ and $f_{clk}$ depend on the fabrication technology while $c_i$ is linearly proportional to the gate fan-out; the only parameter that depends on the circuit operation is $\alpha_i$. Therefore, the toggle rate of the nodes is a good indicator of power dissipation [14].
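As a minimal sketch, Equation (1) can be evaluated directly once the toggle rates are known. The function name and all numeric values below (supply voltage, clock frequency, toggle rates and capacitances) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def average_dynamic_power(v_dd, f_clk, toggle_rates, capacitances):
    """Evaluate Eq. (1): P_avg = 0.5 * V_dd^2 * f_clk * sum(alpha_i * c_i)."""
    if len(toggle_rates) != len(capacitances):
        raise ValueError("one toggle rate and one capacitance per node are required")
    switched_capacitance = sum(a * c for a, c in zip(toggle_rates, capacitances))
    return 0.5 * v_dd ** 2 * f_clk * switched_capacitance


# Hypothetical operating point and node data (illustrative values only).
p_avg = average_dynamic_power(
    v_dd=1.0,                       # supply voltage in volts
    f_clk=100e6,                    # clock frequency in hertz
    toggle_rates=[0.25, 0.5],       # alpha_i for two nodes
    capacitances=[10e-15, 20e-15],  # c_i in farads
)
print(f"P_avg = {p_avg:.3e} W")
```

Because every other factor is fixed by the technology and the netlist, the rest of the method only needs to supply the list of toggle rates.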

Consider the circuit in **Figure 1** and assume that all inputs have equal probabilities to be 0 or 1.

The circuit has 2 gate outputs (nodes): d and e. As shown in the truth table in **Table 1**, column 3 indicates that the circuit has 3 LPs for the output nodes: 00, 01 and 11. Each LP is associated with a Logic Group (LG) composed of the input vectors that lead to this picture. For LP<sub>1</sub>, LG<sub>1</sub> contains the 3 input vectors that lead to LP<sub>1</sub>: 000, 010 and 100. Hence, $||LG_1|| = 3$. Similarly, $||LG_2|| = 3$ and $||LG_3|| = 2$.
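The grouping of input vectors into LGs can be sketched in a few lines. The gate functions used here (d = a AND b, e = d OR c) are an assumption inferred from the truth table in Table 1, since the schematic of Figure 1 is not reproduced in this text; the function and variable names are illustrative.

```python
from itertools import product


def logic_picture(a, b, c):
    """Zero-delay node values (d, e); gate functions are inferred from Table 1."""
    d = a & b
    e = d | c
    return (d, e)


# Map each logic picture to its logic group (the input vectors that produce it).
logic_groups = {}
for vector in product((0, 1), repeat=3):
    logic_groups.setdefault(logic_picture(*vector), []).append(vector)

for lp, group in sorted(logic_groups.items()):
    print("LP", "".join(map(str, lp)), "| group size", len(group), "|", group)
```

The dictionary has one entry per LP, so its length is the number of LGs and the list lengths are the group sizes used later as weights.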

Now, if the unit-delay model is assumed, a propagation delay  $\delta$  is assigned for each gate and **Table 2** can be easily constructed.

Starting from an initial LP at time t = 0, an input vector from the LG that leads to the same LP is not considered, as it causes no transition and hence no power consumption; all the input vectors that belong to the other LGs must be applied to obtain different LPs. The status of the nodes temporarily changes into a transient LP at $t = \delta$ and finally changes into a third, and final, LP at $t = 2\delta$, since there are 2 gates in the critical path. The transient LP and the final LP are merged into one LP in the rightmost column of **Table 2**.
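The evolution from the initial LP to the transient and final LPs can be sketched as a two-step simulation in which every gate output is recomputed from the values present one delay earlier. As before, the gate functions d = a AND b and e = d OR c are an assumption inferred from Table 1, and `unit_delay_trace` is a hypothetical helper name.

```python
def unit_delay_trace(initial_lp, vector, stages=2):
    """Return the LPs at t = 0, delta, ..., stages*delta after applying `vector`."""
    a, b, c = vector
    d, e = initial_lp
    trace = [(d, e)]
    for _ in range(stages):
        # All gates switch together: e still sees the previous value of d.
        d, e = a & b, d | c
        trace.append((d, e))
    return trace


# Initial LP "00" with input 110 applied: glitch-free rise through "10" to "11".
print(unit_delay_trace((0, 0), (1, 1, 0)))
# Initial LP "11" with input 000 applied: e falls one delay after d.
print(unit_delay_trace((1, 1), (0, 0, 0)))
```

The transient LP depends only on the initial LP and the applied vector, which is why the initial LP can stand in for every input vector of its LG.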



Figure 1. A simple 3-input circuit.

Table 1. Circuit logic pictures with zero-delay model.

| Inputs (a b c) | Outputs (d e) | Logic picture | Logic group |
|----------------|---------------|---------------|-------------|
| 0 0 0 | 0 0 | LP<sub>1</sub> = "00" | LG<sub>1</sub> |
| 0 0 1 | 0 1 | LP<sub>2</sub> = "01" | LG<sub>2</sub> |
| 0 1 0 | 0 0 | LP<sub>1</sub> | LG<sub>1</sub> |
| 0 1 1 | 0 1 | LP<sub>2</sub> | LG<sub>2</sub> |
| 1 0 0 | 0 0 | LP<sub>1</sub> | LG<sub>1</sub> |
| 1 0 1 | 0 1 | LP<sub>2</sub> | LG<sub>2</sub> |
| 1 1 0 | 1 1 | LP<sub>3</sub> = "11" | LG<sub>3</sub> |
| 1 1 1 | 1 1 | LP<sub>3</sub> | LG<sub>3</sub> |

The number of transitions between the initial LPs and the merged LPs is calculated in **Table 3**. As an example, the transition between LP<sub>1,0</sub> and LP<sub>1</sub> can be obtained as follows: from **Table 1**, the number of inputs that lead to LP<sub>1,0</sub> is 3 (remember that $||LG_1|| = 3$) while, from **Table 2**, LP<sub>1</sub> appears after LP<sub>1,0</sub> for 3 different inputs; hence, the number of different combinations of inputs that could lead from LP<sub>1,0</sub> to LP<sub>1</sub> is $3 \times 3 = 9$. LP<sub>2</sub> and LP<sub>3</sub> each appear only once in **Table 2** after LP<sub>1,0</sub>; hence, the number of different input combinations that lead to each of LP<sub>2</sub> and LP<sub>3</sub> is $1 \times 3 = 3$. Finally, there is no input that leads from LP<sub>1,0</sub> to LP<sub>4</sub> or LP<sub>5</sub>, which means there are no direct transitions between them.
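The counting rule described above (group size of the initial LP times the number of applied vectors that produce a given merged LP) can be sketched as follows. The gate functions are an assumption inferred from Table 1, and the helper names `steady` and `merged_lp` are hypothetical.

```python
from collections import Counter
from itertools import product

VECTORS = list(product((0, 1), repeat=3))


def steady(vector):
    """Zero-delay LP (d, e); gate functions are inferred from Table 1."""
    a, b, c = vector
    d = a & b
    return (d, d | c)


def merged_lp(initial_lp, vector):
    """Concatenate the LPs at t = delta and t = 2*delta, as in Table 2."""
    a, b, c = vector
    d, e = initial_lp
    states = []
    for _ in range(2):
        d, e = a & b, d | c  # both gates switch one unit delay later
        states.append((d, e))
    return tuple(states)


group_size = Counter(steady(v) for v in VECTORS)  # ||LG|| of each initial LP
transitions = Counter()
for initial_lp, size in group_size.items():
    for v in VECTORS:
        if steady(v) == initial_lp:
            continue  # same LG: no transition, so it is not counted
        transitions[(initial_lp, merged_lp(initial_lp, v))] += size

for (initial_lp, merged), count in sorted(transitions.items()):
    print(initial_lp, "->", merged, ":", count)
```

Only one simulation per (initial LP, applied vector) pair is run; the group size then accounts for all the input vectors that share that initial LP.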

To obtain the node transitions, a node is considered to toggle if its value in the logic picture changes from 1 to 0 or from 0 to 1. Each toggle is multiplied by the number of all possible input vectors that lead to it. The same is done for all LPs through time. The number of transitions of each node is then accumulated and divided by 2<sup>2n</sup>, the number of ordered pairs of input vectors for a circuit with n inputs, to obtain the toggle rate. For example, LP<sub>3,0</sub> is "11"; the logic picture changes to LP<sub>1,δ</sub>, which is "01". This means that node d toggles from 1 to 0. All possible input transitions from LP<sub>3,0</sub> to LP<sub>1,δ</sub> can be obtained from **Table 3**: as LP<sub>1,δ</sub> is a part of LP<sub>1</sub> and LP<sub>5</sub>, the number of input transitions is 6 + 6 = 12. In addition, node d toggles from LP<sub>1,0</sub> to LP<sub>2,δ</sub>, LP<sub>1,0</sub> to LP<sub>3,δ</sub>, LP<sub>2,0</sub> to LP<sub>2,δ</sub> and LP<sub>2,0</sub> to LP<sub>3,δ</sub>, with 3 possible input transitions in each of the 4 cases. This leads to 12 other possible transitions. Hence, for node d, the number of transitions is 24. The same can be done for node e, resulting in 36 transitions.
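The accumulation of weighted toggles can be sketched as below, under the same assumption about the gate functions (d = a AND b, e = d OR c, inferred from Table 1). The names `steady` and `trace` are hypothetical; the loop weights every toggle by the group size of the initial LP rather than enumerating all pairs of input vectors.

```python
from collections import Counter
from itertools import product

N_INPUTS = 3
VECTORS = list(product((0, 1), repeat=N_INPUTS))
NODES = ("d", "e")


def steady(vector):
    """Zero-delay LP (d, e); gate functions are inferred from Table 1."""
    a, b, c = vector
    d = a & b
    return (d, d | c)


def trace(initial_lp, vector):
    """LPs at t = 0, delta and 2*delta under the unit-delay model."""
    a, b, c = vector
    d, e = initial_lp
    states = [(d, e)]
    for _ in range(2):
        d, e = a & b, d | c
        states.append((d, e))
    return states


group_size = Counter(steady(v) for v in VECTORS)
toggles = dict.fromkeys(NODES, 0)
for initial_lp, size in group_size.items():
    for v in VECTORS:
        states = trace(initial_lp, v)
        # Compare consecutive LPs: 0 -> delta, then delta -> 2*delta.
        for before, after in zip(states, states[1:]):
            for name, x, y in zip(NODES, before, after):
                if x != y:
                    toggles[name] += size  # one toggle per vector in the LG

toggle_rates = {name: count / 2 ** (2 * N_INPUTS) for name, count in toggles.items()}
print(toggles, toggle_rates)
```

Vectors from the same LG as the initial LP leave the trace unchanged, so they contribute nothing and need no special handling in this sketch.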

To conclude, the following equation can be used to obtain the toggle rate  $\alpha_i$  with the unit-delay model:

$$\alpha_{i} = \frac{\sum_{t=1}^{s} \sum_{l=1}^{k_{1}(t)} \sum_{m=1}^{k_{2}(t+\delta)} R_{l,m}\, tr_{t}\left( p_{l}, p_{m} \right)}{2^{2n}} \tag{2}$$

where s is the number of stages in the critical path, and $k_1$ and $k_2$ are the numbers of LPs in two states that must be consecutive with respect to the gate delays, *i.e.*, a state at $\delta$ and the other state at $2\delta$. $R_{l,m}$ is the number of repetitions of the LPs $p_l$ and $p_m$ within a state, and $tr_t(p_l, p_m) = 1$ if there is a transition of node i between $p_l$ and $p_m$ and equals 0 otherwise.

Table 2. All possible logic pictures for the unit-delay model.

| Logic group | Applied inputs (a b c) | Initial LP, t = 0 (d e) | Transient LP, t = δ (d e) | Final LP, t = 2δ (d e) | Merged LP |
|-------------|------------------------|-------------------------|---------------------------|------------------------|-----------|
| LG<sub>2</sub> | 0 0 1 | 0 0 "LP<sub>1,0</sub>" | 0 1 "LP<sub>1,δ</sub>" | 0 1 "LP<sub>1,2δ</sub>" | 0 1 0 1 "LP<sub>1</sub>" |
| LG<sub>2</sub> | 0 1 1 | 0 0 "LP<sub>1,0</sub>" | 0 1 "LP<sub>1,δ</sub>" | 0 1 "LP<sub>1,2δ</sub>" | 0 1 0 1 "LP<sub>1</sub>" |
| LG<sub>2</sub> | 1 0 1 | 0 0 "LP<sub>1,0</sub>" | 0 1 "LP<sub>1,δ</sub>" | 0 1 "LP<sub>1,2δ</sub>" | 0 1 0 1 "LP<sub>1</sub>" |
| LG<sub>3</sub> | 1 1 0 | 0 0 "LP<sub>1,0</sub>" | 1 0 "LP<sub>2,δ</sub>" | 1 1 "LP<sub>2,2δ</sub>" | 1 0 1 1 "LP<sub>2</sub>" |
| LG<sub>3</sub> | 1 1 1 | 0 0 "LP<sub>1,0</sub>" | 1 1 "LP<sub>3,δ</sub>" | 1 1 "LP<sub>2,2δ</sub>" | 1 1 1 1 "LP<sub>3</sub>" |
| LG<sub>1</sub> | 0 0 0 | 0 1 "LP<sub>2,0</sub>" | 0 0 "LP<sub>4,δ</sub>" | 0 0 "LP<sub>3,2δ</sub>" | 0 0 0 0 "LP<sub>4</sub>" |
| LG<sub>1</sub> | 0 1 0 | 0 1 "LP<sub>2,0</sub>" | 0 0 "LP<sub>4,δ</sub>" | 0 0 "LP<sub>3,2δ</sub>" | 0 0 0 0 "LP<sub>4</sub>" |
| LG<sub>1</sub> | 1 0 0 | 0 1 "LP<sub>2,0</sub>" | 0 0 "LP<sub>4,δ</sub>" | 0 0 "LP<sub>3,2δ</sub>" | 0 0 0 0 "LP<sub>4</sub>" |
| LG<sub>3</sub> | 1 1 0 | 0 1 "LP<sub>2,0</sub>" | 1 0 "LP<sub>2,δ</sub>" | 1 1 "LP<sub>2,2δ</sub>" | 1 0 1 1 "LP<sub>2</sub>" |
| LG<sub>3</sub> | 1 1 1 | 0 1 "LP<sub>2,0</sub>" | 1 1 "LP<sub>3,δ</sub>" | 1 1 "LP<sub>2,2δ</sub>" | 1 1 1 1 "LP<sub>3</sub>" |
| LG<sub>1</sub> | 0 0 0 | 1 1 "LP<sub>3,0</sub>" | 0 1 "LP<sub>1,δ</sub>" | 0 0 "LP<sub>3,2δ</sub>" | 0 1 0 0 "LP<sub>5</sub>" |
| LG<sub>1</sub> | 0 1 0 | 1 1 "LP<sub>3,0</sub>" | 0 1 "LP<sub>1,δ</sub>" | 0 0 "LP<sub>3,2δ</sub>" | 0 1 0 0 "LP<sub>5</sub>" |
| LG<sub>1</sub> | 1 0 0 | 1 1 "LP<sub>3,0</sub>" | 0 1 "LP<sub>1,δ</sub>" | 0 0 "LP<sub>3,2δ</sub>" | 0 1 0 0 "LP<sub>5</sub>" |
| LG<sub>2</sub> | 0 0 1 | 1 1 "LP<sub>3,0</sub>" | 0 1 "LP<sub>1,δ</sub>" | 0 1 "LP<sub>1,2δ</sub>" | 0 1 0 1 "LP<sub>1</sub>" |
| LG<sub>2</sub> | 0 1 1 | 1 1 "LP<sub>3,0</sub>" | 0 1 "LP<sub>1,δ</sub>" | 0 1 "LP<sub>1,2δ</sub>" | 0 1 0 1 "LP<sub>1</sub>" |
| LG<sub>2</sub> | 1 0 1 | 1 1 "LP<sub>3,0</sub>" | 0 1 "LP<sub>1,δ</sub>" | 0 1 "LP<sub>1,2δ</sub>" | 0 1 0 1 "LP<sub>1</sub>" |
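As a worked instance of Equation (2) for the circuit of Figure 1 (n = 3, two unit delays), the weighted toggles can be grouped by time step. The totals of 24 and 36 are those quoted in the text; the split of the 36 transitions of node e into individual terms is our own reading of Tables 2 and 3 and is not given explicitly in the original.

$$\alpha_d = \frac{\overbrace{(6+6) + (3+3+3+3)}^{0 \,\to\, \delta} + \overbrace{0}^{\delta \,\to\, 2\delta}}{2^{2 \cdot 3}} = \frac{24}{64} = 0.375$$

$$\alpha_e = \frac{\overbrace{(9+3+9+3)}^{0 \,\to\, \delta} + \overbrace{(3+3+6)}^{\delta \,\to\, 2\delta}}{2^{2 \cdot 3}} = \frac{36}{64} = 0.5625$$

Node d is driven only by primary inputs, so it settles after one delay and contributes nothing in the second step, whereas node e can still change at $2\delta$.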

## **3. Experimental Results**

The circuit used in [10] is shown in **Figure 2**. It was studied with the unit-delay model and it was noticed that the number of node transitions increased (compared to the zero-delay model) due to the glitches arising from the gate delays, as shown in **Table 4**.

To validate the results of the proposed method, exhaustive and Monte Carlo simulations (as in [8]) are applied to the ISCAS-85 C17 benchmark circuit, the 7483 4-bit binary adder and the 74157 Quad 2-input multiplexer. The characteristics of these circuits are shown in **Table 5**. The resulting power is compared to that obtained using the proposed method. It is found that the difference between the obtained results from the proposed method and the Monte Carlo approach is negligible. Moreover, the results obtained are identical to those obtained using exhaustive simulations.
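To illustrate the kind of comparison described above, the following is a plain fixed-sample Monte Carlo sketch for the small circuit of Figure 1, not the procedure of [8] (which is not reproduced here) and not one of the benchmark circuits. The gate functions are an assumption inferred from Table 1, and the sample count, seed and helper names are arbitrary choices.

```python
import random


def steady(vector):
    """Zero-delay LP (d, e); gate functions are inferred from Table 1."""
    a, b, c = vector
    d = a & b
    return (d, d | c)


def toggles_for_pair(v_old, v_new):
    """Toggles at nodes d and e over two unit delays when v_new follows v_old."""
    a, b, c = v_new
    d, e = steady(v_old)
    count = [0, 0]
    for _ in range(2):
        d_next, e_next = a & b, d | c
        count[0] += d != d_next
        count[1] += e != e_next
        d, e = d_next, e_next
    return count


def monte_carlo_toggle_rates(samples, seed=0):
    """Average toggles per random input transition (uniform, independent bits)."""
    rng = random.Random(seed)
    total = [0, 0]
    for _ in range(samples):
        v_old = tuple(rng.randint(0, 1) for _ in range(3))
        v_new = tuple(rng.randint(0, 1) for _ in range(3))
        pair = toggles_for_pair(v_old, v_new)
        total[0] += pair[0]
        total[1] += pair[1]
    return [t / samples for t in total]


estimate = monte_carlo_toggle_rates(20000)
print("Monte Carlo estimate (d, e):", estimate)
print("Deterministic values (d, e):", [24 / 64, 36 / 64])
```

The estimate fluctuates around the deterministic values with the usual sampling error, whereas the LP-based count involves no randomness.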

Copyright © 2013 SciRes.

Since the simulation requires building the truth table (with size of $2^n$), the tool running time is less than the time required for the exhaustive simulation approach. The memory saving ratio can be calculated as the ratio of the memory space needed for the exhaustive simulations to the memory space required to store the LPs [10]. For exhaustive simulations, $[2^n (2^n - 1)]/2$ vectors must be stored; each vector represents a circuit input transition and consists of all the possible values for all circuit nodes; hence its size, in bits, is equal to the number of circuit nodes times the number of LPs. In the methodology proposed in this research, only $2^n (K - 1)$ vectors are required, where K is the number of LGs. The size of each vector is identical to that mentioned in the exhaustive simulation method.
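Since the vector size is the same in both approaches, the memory saving ratio reduces to a ratio of vector counts. The sketch below evaluates it for the circuits of Table 5, taking the number of LGs equal to the LP count listed there (each LP has one LG); the function name is hypothetical.

```python
def memory_saving_ratio(n_inputs, n_logic_groups):
    """Exhaustive-simulation vector count divided by the LP-based vector count."""
    exhaustive_vectors = 2 ** n_inputs * (2 ** n_inputs - 1) // 2
    lp_vectors = 2 ** n_inputs * (n_logic_groups - 1)
    return exhaustive_vectors / lp_vectors


# (inputs, LP count) as listed in Table 5.
circuits = {"ISCAS85-C17": (5, 10), "7483": (9, 162), "74157": (10, 34)}
for name, (n, k) in circuits.items():
    print(f"{name}: {memory_saving_ratio(n, k):.1f}")
```

The ratio simplifies to $(2^n - 1) / [2(K - 1)]$, so the saving grows with the number of inputs and shrinks as the number of LGs approaches the number of input vectors.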

Finally, the proposed method could be used to obtain the power consumption for the zero-delay model by considering only the initial and final states; the power consumption obtained is found to be identical to the one calculated using the technique in [10].
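A minimal sketch of this zero-delay variant for the circuit of Figure 1 is given below: only the initial and final LPs are compared, and each pair is weighted by the product of the two group sizes. The gate functions are an assumption inferred from Table 1, and the code is not the tool of [10].

```python
from collections import Counter
from itertools import product

VECTORS = list(product((0, 1), repeat=3))
NODES = ("d", "e")


def steady(vector):
    """Zero-delay LP (d, e); gate functions are inferred from Table 1."""
    a, b, c = vector
    d = a & b
    return (d, d | c)


group_size = Counter(steady(v) for v in VECTORS)

zero_delay_toggles = dict.fromkeys(NODES, 0)
for initial_lp, n_initial in group_size.items():
    for final_lp, n_final in group_size.items():
        for name, x, y in zip(NODES, initial_lp, final_lp):
            if x != y:
                # Every vector of the first LG followed by every vector of the second.
                zero_delay_toggles[name] += n_initial * n_final

print(zero_delay_toggles)
```

For this small circuit the sketch gives 24 transitions at node d and 30 at node e; compared with the 24 and 36 transitions obtained in Section 2 under the unit-delay model, the extra transitions at node e come from its intermediate states.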

Table 3. All transitions between initial LPs and merged LPs.

|                 | LP<sub>1,0</sub> | LP<sub>2,0</sub> | LP<sub>3,0</sub> |
|-----------------|------------------|------------------|------------------|
| LP<sub>1</sub>  | 9                | 0                | 6                |
| LP<sub>2</sub>  | 3                | 3                | 0                |
| LP<sub>3</sub>  | 3                | 3                | 0                |
| LP<sub>4</sub>  | 0                | 9                | 0                |
| LP<sub>5</sub>  | 0                | 0                | 6                |

Table 4. Node transitions with different delay models.

| Node transitions | E  | F   | G   | H   |
|------------------|----|-----|-----|-----|
| Zero-delay model | 96 | 120 | 110 | 126 |
| Unit-delay model | 96 | 144 | 152 | 144 |

Table 5. Test circuit characteristics.

| Circuit     | Inputs | Nodes | LP count | Critical-path gates | Memory saving |
|-------------|--------|-------|----------|---------------------|---------------|
| ISCAS85-C17 | 5      | 6     | 10       | 3                   | 1.7           |
| 7483        | 9      | 36    | 162      | 4                   | 1.6           |
| 74157       | 10     | 15    | 34       | 4                   | 15.5          |

Figure 2. 4-input combinational circuit.

## **4. Conclusion**

This paper discussed a deterministic and accurate method to calculate the node toggle rate, and hence the dynamic power consumption, under the unit-delay model assumption for CMOS combinational circuits. The method is based on the logic picture concept and takes into account the intermediate logic pictures that may appear due to gate delays. The proposed method was applied to several circuits (ISCAS85-C17, the 7483 4-bit binary adder and the 74157 quad 2-input multiplexer) and compared with both Monte Carlo and exhaustive simulations. The results are identical to those of exhaustive simulation but are obtained with much lower complexity.

### REFERENCES

- [1] G. Theodoridis, S. Theoharis, D. Soudris and C. Goutis, "An Efficient Probabilistic Method for Logic Circuits Using Real Gate Delay Model," *Proceedings of the International Symposium on Circuits and Systems ISCAS*, Orlando, 30 May-2 June 1999, pp. 286-289.
- [2] S. Bhanja and N. Ranganathan, "Switching Activity Estimation of VLSI Circuits Using Bayesian Networks," *IEEE Transactions on VLSI Systems*, Vol. 11, No. 4, 2003, pp. 558-567. doi:10.1109/TVLSI.2003.816144

- [3] M. Xakellis and F. Najm, "Statistical Estimation of the Switching Activity in Digital Circuits," *Proceedings of the Conference on Design Automation DAC*, San Diego, 6-10 June 1994, pp. 728-733.
- [4] M. Nemani and F. Najm, "Towards a High-Level Power Estimation Capability," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 15, No. 6, 1996, pp. 588-598. doi:10.1109/43.503929
- [5] A. Ghosh, S. Devadas, K. Keutzer and J. White, "Estimation of Average Switching Activity in Combinational and Sequential Circuits," *Proceedings of the Conference on Design Automation DAC*, Anaheim, 8-12 June 1992, pp. 253-259.
- [6] C. S. Ding, C. Y. Tsui and M. Pedram, "Gate-Level Power Estimation Using Tagged Probabilistic Simulation," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 17, No. 11, 1998, pp. 1099-1107. doi:10.1109/43.736184
- [7] S. M. Kang, "Accurate Simulation of Power Dissipation in VLSI Circuits," *IEEE Transactions on Solid-State Circuits*, Vol. 21, No. 5, 1986, pp. 889-891. doi:10.1109/JSSC.1986.1052622
- [8] R. Burch, F. N. Najm, P. Yang and T. N. Trick, "A Monte Carlo Approach for Power Estimation," *IEEE Transactions on VLSI Systems*, Vol. 1, No. 1, 1993, pp. 63-71. doi:10.1109/92.219908
- [9] J. Monteiro, S. Devadas, A. Ghosh, K. Keutzer and J. White, "Estimation of Average Switching Activity in Combinational Logic Circuits Using Symbolic Simulation," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 16, No. 1, 1997, pp. 121-127. doi:10.1109/43.559336
- [10] M. F. Fouda, M. B. Abdelhalim and H. H. Amer, "Average and Maximum Power Consumption of Digital CMOS Circuits Using Logic Pictures," *Proceedings of the International Conference on Computer Engineering and Systems ICCES*, Cairo, 14-16 December 2009, pp. 14-16.
- [11] M. F. Fouda, M. B. Abdelhalim and H. H. Amer, "Power Consumption of Sequential CMOS Circuits Using Logic Pictures," *Proceedings of the Biennial Baltic Electronics Conference BEC*, Tallinn, 4-6 October 2010, pp. 133-136.
- [12] M. H. Amin, M. F. Fouda, A. M. Eltantawy, M. B. Abdelhalim and H. H. Amer, "Generalization of Logic Picture-Based Power Estimation Tool," *Proceedings of the First Annual International Conference on Energy Aware Computing ICEAC*, Cairo, 16-18 December 2010, pp. 133-136.
- [13] F. Najm, "A Survey of Power Estimation Techniques in VLSI Circuits," *IEEE Transactions on VLSI Systems*, Vol. 2, No. 4, 1994, pp. 446-455. doi:10.1109/92.335013
- [14] F. Najm, "Transition Density: A New Measure of Activity in Digital Circuits," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 12, No. 2, 1993, pp. 310-323. doi:10.1109/43.205010