

# **In-situ Health Monitoring of IGBT Power Modules in EV Applications**

Bing Ji

B.Eng., M.Sc.

A thesis submitted for the degree of Doctor of Philosophy

December, 2011

School of Electrical and Electronic Engineering

Newcastle University, United Kingdom

### Abstract

Power electronics is an enabling technology that plays a critical role in establishing an environmentally friendly and sustainable low-carbon economy. The electrification of passenger vehicles is one way of achieving this goal. It is well acknowledged that electric vehicles (EVs) have inherent advantages over conventional internal combustion engine (ICE) vehicles owing to the absence of emissions, high efficiency, and quiet and smooth operation. Over the last 20 years, EVs have improved significantly in system integration, dynamic performance and cost, and they have attracted much attention in research communities as well as in the market. In 2011, electric vehicle sales were estimated to reach about 20,000 units worldwide, increasing to more than 500,000 units by 2015 and 1.3 million by 2020, which would account for 1.8 per cent of the total number of passenger vehicles expected to be sold that year.

In general, electric vehicles use electric motors for traction drive, power converters for energy transfer and control, and batteries, fuel cells, ultracapacitors, or flywheels for energy storage. These are the core elements of the electric power drive train and are therefore required to provide high reliability over the lifetime of the vehicle. The IGBT switching devices in the inverter are among the vulnerable components of an electric power drive train. During operation, IGBT power modules experience high mechanical and thermal stresses, which lead to bond wire lift-off and solder joint fatigue. These faults can cause the IGBT power modules to malfunction. A short circuit or open circuit in any of the power modules may result in an instantaneous loss of traction power, which is dangerous for the driver and other road users. These reliability issues are complex in nature and demand the development of analytical models and experimental validation.

This work sets out to develop an online measurement technique for health monitoring of the IGBTs and freewheeling diodes inside power modules. The technique can provide an early warning prior to a power device failure. Bond wire lift-off and solder fatigue are the two most frequently occurring faults in power electronic modules. The former increases the forward voltage drop across the terminals of the power device, while the latter increases the thermal resistance of the solder layers. Bond wire lift-off can therefore be detected by a highly sensitive, fast-operating in-situ monitoring circuit, and solder joint fatigue can be detected by measuring the thermal impedance of the power module.

This thesis focuses on the design and optimisation of the in-situ health monitoring circuit, with the aim of reducing the influence of noise, temperature variations and measurement uncertainties. Experimental work is carried out on a set of IGBT power modules that have been modified to meet different testing requirements. The lifetime of the power module can then be estimated on this basis.

The proposed health monitoring system can be integrated into existing IGBT driver circuits and can also be used in other applications such as industrial drives, aerospace and renewable energy.

### Acknowledgements

A project of this nature requires the input of many individuals. I would like to express my deep and sincere gratitude to my supervisor, Prof. Volker Pickert, for his continuous guidance, encouragement and support. His broad knowledge and his logical way of thinking have been of great value to me. What I learned from him goes far beyond how to solve technical problems. I am also deeply grateful to my supervisor Dr. Bashar Zahawi for his guidance, advice, assistance and support throughout the duration of the work. I would also like to thank other academic staff, Dr. Wenping Cao, Dr. Matthew Armstrong, Dr. Dave Atkinson, Prof. Barrie Mecrow, Dr. Glynn Atkinson and Dr. John Bennett, for providing technical advice on practical issues.

Many thanks are also due to the technical staff within the department: Jack, Stuart, Allan and Luke for their help with the construction of the experimental equipment, and Jeff, Darren and Steve for their help with the manufacture of the printed circuit boards. I would also like to acknowledge the significant contribution of James Richardson throughout the project, in particular his considerable help with the design of printed circuit boards and the assembly of the electronic hardware.

It has been a great pleasure to work in PEDM, not only because of the talented colleagues but also because of the friendship. Whilst carrying out the work for this project I have made many valuable friends in the UG lab: Jackie, Hong, Min, Andrew, Richard, Simon, Nelson, Rachel, and Ahmed, to name only a few. I would like to thank them all for their valued friendship and help throughout, and in particular for making the UG lab such an enjoyable place to work.

My heartfelt appreciation goes to my parents, Chunjiang Ji and Jinfeng Shi, for their love, support and encouragement throughout my further education.

Finally, with my deepest love, I would like to thank my wife, Weiwei Ren, for her support and encouragement in both my life and my studies. Her company made my life in Newcastle fruitful and meaningful. I would also like to thank ORS and the School of EEE, Newcastle University, as without their funding this work would not have been carried out. Additional support from industrial partners is also highly appreciated: SEMIKRON Elektronik GmbH & Co. and StarPower Semiconductor Ltd., China, kindly donated semiconductor devices for the tests.

## List of Figures

| Figure 1.1 Common configurations of HEVs (a) series hybrid (b) parallel hybrid                                                                                                       | 3  |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| Figure 1.2 Electric motor drive system                                                                                                                                               | 4  |
| Figure 1.3 Failure distribution                                                                                                                                                      | 5  |
| Figure 1.4 Fragile components distribution                                                                                                                                           | 5  |
| Figure 1.5 The component's health as a function of time                                                                                                                              | 7  |
| Figure 1.6 Change in failure rate over time (bathtub curve)                                                                                                                          | 8  |
| <b>Figure 2.1</b> 3-D view (a) and cross-sectional view (b) of a typical IGBT power module (not to scale)                                                                            | 12 |
| <b>Figure 2.2</b> MiniSKiiP, an insulated press pack module with spring pin contact system allowing the removal of the baseplate                                                     | 13 |
| <b>Figure 2.3</b> SKiN half bridge module with bond wires replaced by sintered flex foil                                                                                             | 14 |
| Figure 2.4 Three major regions for thermomechanical failures                                                                                                                         | 21 |
| Figure 2.5 Examples of bond wire damages                                                                                                                                             | 23 |
| Figure 2.6 Examples of emitter metallization damage                                                                                                                                  | 23 |
| <b>Figure 2.7</b> Cross-sectional SEM-images of the copper–solder interface of PbSn solder (a) before and (b) after ruptures                                                         | 24 |
| <b>Figure 2.7</b> Optical microscopy analysis of the DCB ceramic (AlN) (a) Ceramic crack located under the copper layer (b) Lift up of the copper layer induced by the ceramic crack | 26 |
| Figure 3.1 Change in failure rate over time (bathtub curve)                                                                                                                          | 28 |
| <b>Figure 3.2</b> The idealized bathtub reliability curves for the test circuit and the prognostic cell                                                                              | 30 |
| Figure 3.3 Degradation progress as observed on threshold voltage                                                                                                                     | 33 |
| Figure 3.4 C-V measurement                                                                                                                                                           | 33 |

| Figure 3.5 Threshold voltage variation with temperature                                                                                                                                    | 34 |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| <b>Figure 3.6</b> Emitter bond wire lift-off detection for IGBTs in parallel [69] (a) Modified IGBT module. (b) Equivalent circuits without (left) and with (right) emitter bond lift-off. | 36 |
| Figure 3.7 The embedded sensor for emitter bond wire monitoring                                                                                                                            | 36 |
| Figure 3.8 Dependence of the number of cycles to failure, $N_f$, on the mean ($T_m$) and amplitude ($\Delta T_j$) of the temperature cycling                                              | 40 |
| Figure 3.9 Flow diagram of IGBT power module lifetime consumption estimation                                                                                                               | 40 |
| Figure 3.10 Example of rainflow cycle counting                                                                                                                                             | 41 |
| Figure 3.11 Fusion prognostics approach                                                                                                                                                    | 44 |
| <b>Figure 4.1</b> A differential thermocouple measurement with a grounded signal source can create a ground loop.                                                                          | 49 |
| <b>Figure 4.2</b> Isolation eliminates ground loops by separating the earth ground from the amplifier ground reference                                                                     | 49 |
| Figure 4.3 AD629B high common-mode voltage amplifier                                                                                                                                       | 51 |
| <b>Figure 4.4</b> Isolated data acquisition system (a) analog isolation amplifier; (b) isolation ADC; (c) digital isolation                                                                | 51 |
| Figure 4.5 Losses in power semiconductor devices                                                                                                                                           | 54 |
| Figure 4.6 Turn-on losses, turn-off losses and on-state losses                                                                                                                             | 55 |
| Figure 4.7 Traditional steady state $R_{thjc}$ thermal resistance measurement                                                                                                              | 58 |
| Figure 4.8 Change in heating curve slope as a function of location within package                                                                                                          | 60 |
| Figure 4.9 Time/temperature plot for two MOSFETs with 10 000 power cycles                                                                                                                  | 61 |
| <b>Figure 4.10</b> $Z_{th}$ curves for a package measured with and without thermal grease at the interface between case and heat sink                                                      | 62 |
| <b>Figure 4.11</b> Derivatives (da/dz) and the difference $\Delta$(da/dz) of the $Z_{th}$ curves                                                                                           | 63 |
| Figure 4.12 Structure functions for TO263 package                                                                                                                                          | 64 |
| Figure 4.13 (a) Thermal model of a cube; (b) a single time constant thermal RC model; and (c) graphical representation of the thermal model                                                | 65 |

| Figure 4.14 (a) An n-stage Foster model; (b) graphical representation of the thermal model                                                                                                                        | 66 |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| Figure 4.15 An n-stage Cauer model (Cauer network)                                                                                                                                                                | 67 |
| <b>Figure 4.16</b> Measured K-factor for the forward voltage of two power diodes in series and the gate–emitter voltage for an IGBT                                                                               | 73 |
| <b>Figure 4.17</b> Sensitivity analysis of the K-factor for IGBT IRG4CH30K: (a) K-values with different collector current ($V_{CE} = 5\,\mathrm{V}$); (b) K-values with different collector-to-emitter voltage ($I_C = 1\,\mathrm{mA}$) | 74 |
| <b>Figure 5.1</b> Principal schematic of the proposed test circuit. M stands for motor and V1-V6 is interface to the isolated data acquisition circuit                                                            | 77 |
| <b>Figure 5.2</b> (a) New European Driving Cycle (NEDC) based urban/suburban drive cycles and (b) the relevant load current                                                                                       | 78 |
| Figure 5.3 Flowchart of (a) operation of health monitoring and (b) ISR                                                                                                                                            | 79 |
| Figure 5.4 Bridge leg and high side in-situ $V_{CE(on)}$ and $V_F$ measurement circuit                                                                                                                            | 80 |
| Figure 5.5 Isolated data acquisition circuitry                                                                                                                                                                    | 81 |
| <b>Figure 5.6</b> Protection circuits for different off-state voltage levels (a) up to 600V; (b) 600V-2500V                                                                                                       | 82 |
| <b>Figure 5.7</b> Overvoltage protection circuit (a) Diode equivalent circuit; (b) Overvoltage protection circuit                                                                                                 | 82 |
| Figure 5.8 Schematic of sense current source                                                                                                                                                                      | 84 |
| Figure 5.9 System schematic of experiment set-up                                                                                                                                                                  | 85 |
| <b>Figure 5.10</b> Voltage measurement circuits (a) with difference amplifier (b) with digital isolation                                                                                                          | 86 |
| Figure 5.11 The TSEP test setup for IGBT modules                                                                                                                                                                  | 88 |
| Figure 5.12 The reference points for the case temperature measurement                                                                                                                                             | 89 |
| Figure 5.13 Positioning tool and thermocouples attached at reference point                                                                                                                                        | 90 |
| Figure 5.14 Temperature of IGBT chip and reference points (TC1–TC4) at 50A                                                                                                                                        | 91 |
| Figure 5.15 Temperature of reference points (1–4) with IGBT injected by 50A                                                                                                                                       | 92 |

| Figure 6.1 (a) Static equivalent resistance model of a half-bridge module; (b) Photo of a half-bridge module | 96 |
|---|---|
| Figure 6.2 (a) The part-to-part variation and (b) bond wire degradation interference for $V_{CE(on)}$ and $V_F$ as a function of temperature and their linear approximation | 98 |
| Figure 6.3 Linear regression for junction temperature extrapolation | 99 |
| Figure 6.4 Experiment set-up for two driver circuits under test | 100 |
| Figure 6.5 Impacts of gate drive temperature variation on the on-state voltage | 101 |
| Figure 6.6 Diagram of pulse trains for bond wire lift-off monitoring | 102 |
| Figure 6.7 Voltage and current measurements during pulse train monitoring | 103 |
| Figure 6.8 Flowchart of the bond wire lift-off monitoring process | 104 |
| Figure 6.9 Voltages vs bond wire cuts for (a) IGBTs and (b) diodes | 104 |
| Figure 6.10 Bond wires of IGBTs and diodes | 105 |
| Figure 6.11 Relative errors of $V_{CE(n)}$ and $V_{F(n)}$ as a function of number of lift-off bond wires | 105 |
| Figure 6.12 Cross section view of the heat conduction in IGBT modules | 107 |
| Figure 6.13 Two Cauer network cells. (a) R-C cell; (b) R-C-R cell | 109 |
| Figure 6.14 Thermal model of the IGBT power module | 110 |
| Figure 6.15 Simulated transient thermal impedance and manufacturer's data | 110 |
| Figure 6.16 Comparison of thermal characteristics under different cooling conditions | 111 |
| Figure 6.17 Transient junction-to-ambient temperature changes for different thermal path scenarios | 112 |
| Figure 6.18 Comparison of thermal characteristics under different cooling conditions | 113 |
| Figure 6.19 Thermal cycling behaviour of a power module | 114 |
| Figure 6.20 C-SAM results for different layers of a power IGBT module (Sample A) | 116 |
| Figure 6.21 Diagram of pulse trains for thermal conduction degradation monitoring | 119 |
| Figure 6.22 Transient junction-to-reference temperature curves for different references | 120 |
| Figure 6.23 Changes of $\Delta T_{iri}$ for different references | 120 |
| Figure 6.24 $V_{CE(h)}(t)$, $I_{C(h)}(t)$ and $P(t)$ waveforms of the IGBT during heating pulses at four different ambient temperatures | 122 |
| Figure 6.25 Thermal impedance $Z_{thjr1}$ values (at t = 1s and ambient temperature = -20°C, 0°C, 20°C, 40°C) for (a) the IGBT and (b) their relative errors | 123 |
| Figure 6.26 Influence of thermal path degradation on $Z_{thjr}$ values (at t = 1s, 2s and ambient temperature = 0°C, 20°C) with reference temperature measured at Reference 1 for IGBT(left) and diode(right) | 124 |
| Figure 6.27 The averaged thermal impedance of IGBT(upper) and diode(lower) taken at t = 1s corresponding to all temperatures and linear regression | 125 |

| Figure 6.28 Flow chart of the bond wire lift-off monitoring algorithm    | 126 |
|--------------------------------------------------------------------------|-----|
| Figure 6.29 Flow chart of the solder fatigue monitoring algorithm        | 127 |
| Figure C.1 Schematics of relay drivers and device voltage clamp circuits | 141 |

| Figure C.2 Schematics of IGBT drivers                                                                                                                                                                               | 142 |
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|
| Figure C.3 Schematics of thermocouple measurement circuit with AD595C                                                                                                                                               | 143 |
| Figure D.1 Temperature dependence of $V_{CE}$ (upper) and $V_F$ (lower) for currents ranging from 35 A to 50 A (step: 5 A)                                                                                          | 144 |
| <b>Figure E.1</b> $V_{F(h)}(t)$ , $I_{C(h)}(t)$ and $P(t)$ waveforms of the diode during heating pulses at four different ambient temperatures                                                                      | 147 |
| <b>Figure E.2</b> Thermal impedance $Z_{thjr1}$ values (at t = 1s and ambient temperature = -20°C, 0°C, 20°C, 40°C) for the IGBT(upper) and the relative errors(lower)                                              | 148 |
| <b>Figure E.3</b> Thermal impedance $Z_{thjr1}$ values (at t = 2s and ambient temperature = -20°C, 0°C, 20°C, 40°C) for the IGBT(upper) and the relative errors(lower)                                              | 148 |
| <b>Figure E.4</b> Thermal impedance $Z_{thjr1}$ values (at t = 1s and ambient temperature = -20°C, 0°C, 20°C, 40°C) for the diode(upper) and the relative errors(lower)                                             | 149 |
| <b>Figure E.5</b> Thermal impedance $Z_{thjr1}$ values (at t = 2s and ambient temperature = -20°C, 0°C, 20°C, 40°C) for the diode(upper) and the relative errors(lower)                                             | 149 |
| <b>Figure E.6</b> Influence of thermal path degradation on $Z_{thjr}$ values (at t = 1s, 2s and ambient temperature = 0°C, 20°C) with reference temperature measured at Reference 1 for IGBT(left) and diode(right) | 152 |
| <b>Figure E.7</b> Influence of thermal path degradation on $Z_{thjr}$ values (at t = 1s, 2s and ambient temperature = 0°C, 20°C) with reference temperature measured at Reference 5 for IGBT(left) and diode(right) | 152 |
| <b>Figure E.8</b> Influence of thermal path degradation on $Z_{thjr}$ values (at t = 1s, 2s and ambient temperature = 0°C, 20°C) with reference temperature measured at Reference 3 for IGBT(left) and diode(right) | 152 |
| <b>Figure E.9</b> Influence of thermal path degradation on $Z_{thjr}$ values (at t = 1s, 2s and ambient temperature = 0°C, 20°C) with reference temperature measured at Reference 4 for IGBT(left) and diode(right) | 153 |
| Figure E.10 The averaged thermal impedance of IGBT(upper) and diode(lower) taken at t = 2s corresponding to all temperatures and linear regression                                                                  | 153 |

| Figure F.1 SAM scans of IGBT module S1               | 157 |
|------------------------------------------------------|-----|
| Figure F.2 SAM scans of IGBT module S2               | 159 |
| Figure F.3 SAM scans of IGBT module S3               | 161 |
| Figure F.4 SAM scans of IGBT module S4               | 163 |
| Figure F.5 SAM scans of IGBT module S5               | 165 |
| Figure F.6 SAM scans of IGBT module S6               | 167 |
| Figure H.1 A pictorial description of the experiment | 171 |

## List of Tables

| <b>Table 1.1</b> Typical environmental and operational requirements for HEV applications                                     | 6   |
|------------------------------------------------------------------------------------------------------------------------------|-----|
| Table 2.1 Comparison of material properties                                                                                  | 17  |
| Table 2.2 CTEs for some common electronic materials                                                                          | 22  |
| <b>Table 3.1</b> Comparative overview of failure mechanisms, relevant loading conditions and models in power modules         | 38  |
| Table 3.2 Comparative overview of the analytical lifetime models for IGBTs                                                   | 39  |
| Table 4.1 Seebeck coefficients                                                                                               | 48  |
| <b>Table 4.2</b> Void growth in the die-attach layer of some power MOSFET samples                                            | 60  |
| Table 4.3 Analogies of electrical to thermal parameters                                                                      | 64  |
| Table 5.1 Individual voltage measurements with different relay settings                                                      | 84  |
| Table 6.1 Equivalent stray resistance of an IGBT module                                                                      | 95  |
| Table 6.2 Power semiconductor material data                                                                                  | 108 |
| Table 6.3 Power semiconductor internal layer thermal parameters                                                              | 109 |
| <b>Table 6.4</b> Comparison of active power cycling and passive temperature cycling                                          | 114 |
| Table 6.5 Comparisons of voided DCB solder layer (a) for top IGBT devices and (b) bottom IGBT devices                        | 117 |
| Table 6.6 Relay operation time                                                                                               | 125 |
| Table A.1 Comparisons of HEV technologies                                                                                    | 133 |

| <b>Table B.1</b> List of TSEPs that can be used to measure the semiconductor device temperature           | 134 |
|-----------------------------------------------------------------------------------------------------------|-----|
| <b>Table B.2</b> Summary of four types of commonly used contact temperature sensors in industry           | 135 |
| Table B.3 Voltage Output Temperature Sensors                                                              | 136 |
| Table B.4 Current Output Temperature Sensors                                                              | 137 |
| Table B.5 Current Output Temperature Sensors                                                              | 138 |
| Table B.6 Resistance Output Silicon Temperature Sensors                                                   | 139 |
| Table E.1 Measurement and Estimate Quantities for healthy IGBTs                                           | 145 |
| Table E.2 Measurement and Estimate Quantities for healthy diodes                                          | 146 |
| Table E.3 Thermal impedance at predefined conditions for IGBTs                                            | 150 |
| <b>Table E.4</b> Thermal impedance at predefined conditions for diodes                                    | 151 |
| Table F.1 Correlating each scan to the correspondent layer                                                | 155 |

## List of Abbreviations

| ADC, A/D  | analog to digital converter      |
|-----------|----------------------------------|
| Al        | aluminium                        |
| $Al_2O_3$ | aluminium oxide (alumina)        |
| AlSiC     | aluminium-silicon-carbide        |
| ALT       | accelerated life testing         |
| AlN       | aluminium nitride                |
| AMB       | active metal brazing             |
| CMR       | common mode rejection            |
| CMV       | common mode voltage              |
| $CO_2$    | carbon dioxide                   |
| CTE       | coefficient of thermal expansion |
| DC        | direct current                   |
| DCB       | direct copper bonded (bonding)   |
| DPCO      | double pole changeover           |
| DSP       | digital signal processor         |
| DUT       | device under test                |
| EMF       | electromotive force              |
| EMI       | electromagnetic interference     |
| ESD       | electrostatic discharge          |
| EOS       | electrical overstress            |
| EV        | electric vehicle                 |
| FCV       | fuel cell vehicles               |
| FEM       | finite element method            |
| FIT       | failures in time                 |

| FMMEA          | failure modes, mechanisms and effects analysis    |  |
|----------------|---------------------------------------------------|--|
| FWD            | freewheeling diode                                |  |
| HEV            | hybrid electric vehicle                           |  |
| ICE            | internal combustion engine                        |  |
| IGBT           | insulated gate bipolar transistor                 |  |
| IR             | infrared                                          |  |
| ISR            | interrupt service routine                         |  |
| LED            | light-emitting diode                              |  |
| MCU            | microcontroller unit                              |  |
| MOS            | metal-oxide-semiconductor                         |  |
| MOSFET         | metal oxide semiconductor field effect transistor |  |
| MTBF           | mean time between failures                        |  |
| MTTR           | mean time to repair                               |  |
| NEDC           | new European driving cycle                        |  |
| Ni             | nickel                                            |  |
| NPT            | non-punch through                                 |  |
| PCB            | printed circuit board                             |  |
| PD             | partial discharge                                 |  |
| PHEV           | plug-in hybrid vehicle                            |  |
| PHM            | prognostic and health monitoring                  |  |
| PoF            | physics of failure                                |  |
| RC             | resistor-capacitor                                |  |
| $R_{th}C_{th}$ | thermal resistance-thermal capacitance            |  |
| RUL            | remaining useful lifetime                         |  |
| SAM            | scanning acoustic microscope                      |  |
| SiO<sub>2</sub> | silicon dioxide                                  |  |
| SOA            | safe operating area                               |  |
| TDDB           | time dependent dielectric breakdown               |  |
| TDIM           | transient dual interface measurement              |  |
| TIM            | thermal interfacial material                      |  |
| TSEP           | temperature sensitive electrical parameters       |  |
| USB            | universal serial bus                              |  |

# List of Symbols

| Symbol | Description |
|--------|-------------|
| A | effective area for heat dissipation |
| A<sub>V</sub> | availability |
| B | noise bandwidth |
| c | specific heat (J/(kg·K)) |
| C | electrical capacitance |
| C<sub>D</sub> | diode stray capacitance |
| C<sub>DC</sub> | DC-link capacitance |
| C<sub>OX</sub> | oxide capacitance per unit area |
| C<sub>P</sub> | effective capacitance of both the diode and the zener diode |
| C<sub>th</sub> | thermal capacitance (J/K) |
| C<sub>z</sub> | zener stray capacitance |
| d | length of the material |
| D | diameter of the bond wires (m) |
| E<sub>a</sub> | activation energy |
| E<sub>g</sub> | bandgap energy |
| f | frequency for temperature cycles |
| f<sub>s</sub> | switching frequency (Hz) |
| H | humidity |
| I | current (A) |
| i<sub>con</sub> | collector current during turn-on transient |
| i<sub>coff</sub> | collector current during turn-off transient |
| i<sub>ccond</sub> | collector current during on state |
| I<sub>L</sub> | load current |
| I<sub>n</sub> | Johnson RMS current noise |
| I<sub>s</sub> | saturation current |
| I<sub>sense</sub> | TSEP sensing current |
| I<sub>test</sub> | test current |
| J | current density |
| k | thermal conductivity (W/(m·K)) |
| k<sub>B</sub> | Boltzmann constant |
| L<sub>C</sub> | channel length |
| N<sub>f</sub> | number of cycles to failure |
| P<sub>av</sub> | average dissipated power |
| P<sub>cond</sub> | on-state power loss |
| P<sub>on</sub> | turn-on power loss |
| P<sub>off</sub> | turn-off power loss |
| q | charge of an electron |
| Q | stored heat (J) |
| Q<sub>AB</sub> | Seebeck coefficient (μV/°C) |
| R | electrical resistance (Ω) |
| R<sub>EH</sub> | sense resistor connected in parallel to the emitter bond wires |
| R<sub>LIMIT</sub> | current limiting resistance |
| R<sub>on</sub> | on-state resistance |
| R<sub>ON(TH)</sub> | on-state resistance of the channel region |
| R<sub>th</sub> | thermal resistance (K/W) |
| R<sub>thjc</sub> | junction-to-case thermal resistance (K/W) |
| R<sub>thjr</sub> | junction-to-reference thermal resistance (K/W) |
| t<sub>d(on)</sub> | turn-on delay time |
| t<sub>d(off)</sub> | turn-off delay time |
| t<sub>on</sub> | heating time during temperature cycling |
| t<sub>off</sub> | turn-off time |
| t<sub>f</sub> | collector current fall time |
| T | temperature (°C) |
| T<sub>a</sub> | ambient temperature |
| T<sub>c</sub> | case temperature |
| T<sub>ch0</sub> | temperature in the middle region of the channel |
| T<sub>m</sub> | mean value of a temperature cycle |
| T<sub>j</sub> | junction temperature (°C) |
| T<sub>j-max</sub> | maximum junction temperature (°C) |
| T<sub>PNP</sub> | temperature in the bipolar transistor base |
| T<sub>r</sub> | reference temperature |
| T<sub>vj</sub> | virtual junction temperature |
| ΔT | temperature swing (°C) |
| ΔT<sub>j</sub> | junction temperature swings |
| ΔT<sub>ja</sub> | junction-to-ambient temperature difference |
| ΔT<sub>jr</sub> | junction-to-reference temperature difference |
| V | voltage (V) |
| V<sub>BE(TH)</sub> | forward voltage drop of the base-emitter diode of the internal PNP transistor |
| V<sub>CE(on)</sub> | on-state collector-emitter voltage |
| V'<sub>CE(on)</sub> | on-state collector-emitter voltage (across silicon device) |
| V<sub>CE(h)</sub> | on-state collector-emitter voltage (across terminals, under high current) |
| V'<sub>CE(h)</sub> | on-state collector-emitter voltage (across the IGBT die, under high current) |
| V<sub>CE(l)</sub> | on-state collector-emitter voltage (across terminals, under low current) |
| V<sub>DC</sub> | DC-link voltage |
| V<sub>DS</sub> | drain-source voltage |
| V<sub>F</sub> | diode forward voltage |
| V<sub>F(h)</sub> | diode forward voltage (across terminals, under high current) |
| V<sub>GE,th</sub> | gate-emitter threshold voltage |
| V<sub>GS</sub> | gate-source voltage |
| V<sub>M</sub> | reading voltage (the actual measured voltage) |
| V<sub>n</sub> | Johnson RMS voltage noise |
| V<sub>OFFSET</sub> | offset voltage |
| V<sub>s</sub> | signal source voltage |
| V<sub>stray</sub> | voltage drop across the series stray resistance |
| V<sub>z</sub> | zener voltage |
| Z | function of logarithmic time |
| Z<sub>C</sub> | channel width |
| Z<sub>th</sub> | thermal impedance |
| Z<sub>thjc</sub>(t) | junction-to-case thermal impedance |
| Z<sub>thjr</sub>(t) | junction-to-reference thermal impedance |
| σ | electrical conductivity (A/(m·V)) |
| τ | thermal time constant |
| ρ | density (kg/m<sup>3</sup>) |
| β<sub>PNP</sub> | current gain |
| μ<sub>ns</sub> | surface mobility of electrons in the channel |
| 3 | temperature coefficient (temperature dependency) of TSEP (mV/°C) |
| λ | failure rate |

## List of Contents

- Abstract
- Acknowledgement
- List of Figures
- List of Tables
- List of Abbreviations
- List of Symbols
- List of Contents
- CHAPTER 1 INTRODUCTION TO THESIS
  - 1.1 BACKGROUND AND MOTIVATION
    - 1.1.1 Power drive trains
    - 1.1.2 Reliability of motors and inverters
  - 1.2 RELIABILITY OF POWER CONVERTERS
    - 1.2.1 Reliability of power semiconductor devices
    - 1.2.2 Reliability of IGBT power modules for EV applications
  - 1.3 METHODS OF IMPROVING RELIABILITY OF POWER MODULES
    - 1.3.1 Improvement of system architecture
    - 1.3.2 Fault tolerant systems
    - 1.3.3 Online health monitoring
  - 1.4 RESEARCH GOALS AND CONTRIBUTIONS TO KNOWLEDGE
  - 1.5 THESIS OVERVIEW
- CHAPTER 2 OVERVIEW OF IGBT POWER MODULES
  - 2.1 IGBT MODULE STRUCTURES
    - 2.1.1 Components of conventional IGBTs
    - 2.1.2 Trends toward the next generation of IGBT modules
  - 2.2 FAILURE MODES, MECHANISMS AND EFFECTS ANALYSIS IN IGBT MODULES
    - 2.2.1 Chip-related failures
    - 2.2.2 Package-related failures
- CHAPTER 3 RELIABILITY, PROGNOSTICS AND HEALTH MONITORING
  - 3.1 GENERAL INTRODUCTION TO RELIABILITY
  - 3.2 FUSES OR CANARIES
  - 3.3 DATA DRIVEN METHODS
    - 3.3.1 Inherent prognostic parameters
    - 3.3.2 Externally embedded sensors (redundant indicators)
  - 3.4 MODEL DRIVEN METHODS (POF MODELS)
  - 3.5 A FUSION PROGNOSTIC METHOD
- CHAPTER 4 MEASUREMENT TECHNIQUES
  - 4.1 NOISE
  - 4.2 HIGH COMMON MODE VOLTAGE MEASUREMENT TECHNIQUES
  - 4.3 SEMICONDUCTOR JUNCTION TEMPERATURE ACQUISITION
    - 4.3.1 Thermal model based estimation
      - 4.3.1.1 Measurement of power losses in IGBT power modules
      - 4.3.1.2 Thermal characterization of power semiconductors
      - 4.3.1.3 Thermal modelling of power modules
    - 4.3.2 Direct temperature measurement
    - 4.3.3 Indirect temperature measurement
- CHAPTER 5 IN-SITU MEASUREMENT CIRCUIT
  - 5.1 DESIGN AND EXPERIMENTAL SETUP FOR THE IN-SITU HEALTH MONITORING
    - 5.1.1 The circuit design
    - 5.1.2 Test circuit
  - 5.2 VOLTAGE MEASUREMENT
  - 5.3 MEASUREMENT OF JUNCTION AND CASE TEMPERATURES
    - 5.3.1 Junction temperature measurement
    - 5.3.2 Case temperature measurement
- CHAPTER 6 EXPERIMENTAL RESULTS
  - 6.1 MONITORING BOND WIRE LIFT-OFF
    - 6.1.1 Terminal voltages and their dependencies
    - 6.1.2 Junction temperature measurement and discussions
      - 6.1.2.1 Calibration of TSEPs
      - 6.1.2.2 Junction temperature extrapolations
    - 6.1.3 Impacts of the gate-emitter voltage
    - 6.1.4 Results and analysis
  - 6.2 MODELLING AND SIMULATION
    - 6.2.1 Construction of a 1-D Cauer model
    - 6.2.2 Thermal characterization for critical layers under different thermal conditions
  - 6.3 SOLDER FATIGUE MONITORING
    - 6.3.1 Thermal cycling results with SAM analysis
    - 6.3.2 Test result and discussion
    - 6.3.3 Discussion of power dissipation
    - 6.3.4 Discussion of thermal impedance results
  - 6.4 IN-SITU MEASUREMENT AND ELECTRIC VEHICLE DRIVING CYCLE
- CHAPTER 7 CONCLUSIONS
  - 7.1 DETERMINATION OF COMMON FAILURES IN AN IGBT MODULE
    - 7.1.1 In-situ bond wire lift-off measurement
    - 7.1.2 In-situ solder fatigue measurement
  - 7.2 FUTURE WORK
  - 7.3 RESEARCH OUTCOMES
- APPENDIX A A COMPARISON OF HYBRID TECHNOLOGIES
- APPENDIX B TEMPERATURE MEASUREMENT TECHNOLOGIES
- APPENDIX C SCHEMATICS OF EXPERIMENT CIRCUITS
- APPENDIX D NON-SWITCHED TSEP MEASUREMENT METHOD
- APPENDIX E EXPERIMENT DATA AND RESULTS
- APPENDIX F C-SAM ANALYSIS
- APPENDIX G LABVIEW PROGRAM AND DESCRIPTIONS
- APPENDIX H A PICTORIAL DESCRIPTION OF THE EXPERIMENT
- REFERENCES

### **CHAPTER 1**

#### **INTRODUCTION TO THESIS**

Road-based transport currently accounts for approximately 22 per cent of UK $CO_2$ emissions [1]. Compared to conventional oil-fuelled vehicles with internal combustion engines (ICEs), electric vehicles (EVs) have the potential to reduce $CO_2$ emissions.

Power electronics are the key components in an EV and are essential to convert and control electrical power. Due to the low cost, high power-to-weight ratio and high power-to-volume ratio of power semiconductor devices, it is understood that power converters can be packaged more easily to reduce manufacturing costs, and that lighter power electronics reduce the total weight of the car, increasing the overall efficiency. Increasing power densities, however, will lead to more stress, particularly in power modules [2]. More stress in the power modules will cause a higher risk of failure which, in traction applications, means loss of power and therefore loss of control of the car. Although the reliability of power modules has been improved using new manufacturing techniques [3], power devices still fail, mainly due to thermomechanical stress. Bond wire lift-off and solder fatigue are the most common failure mechanisms that limit reliability.

This research project illustrates the need for an on-board health monitoring system for power semiconductor modules. The EV driver will be warned well in advance to service the degraded power modules, long before any device within the power module fails. An early warning system has been developed that monitors the health of the power modules online, through which stress due to degradation is measured and recorded. In order to receive online data, an in-situ health monitoring circuit is developed that can be integrated into any existing driver circuit.

#### **1.1 Background and Motivation**

EVs have existed for over one hundred years and were popular in the late 19th century and early 20th century, until advances in ICE technology and mass production of cheaper gasoline vehicles led to a decline in the use of electric drive vehicles. However, they have recently regained the focus of research and development in the automotive industry. Rising concerns over decreasing oil resources and accelerating oil prices have triggered recent developments in EVs. In addition, increasingly rigorous environmental legislation limiting pollution from cars makes EVs attractive to customers.

The emerging EV market presents a tremendous business opportunity for the semiconductor industry, but at the same time, it poses a great technical challenge in improving performance, compactness, efficiency, reliability and cost of the power semiconductor products.

#### **1.1.1** Power drive trains

EV power drive trains come in various topologies. The main difference between an EV and an ICE powered vehicle is the power drive train. In an EV, the power drive train is electrified in order to improve the efficiency of the vehicle. According to a report by the UK Department for Transport [1], EVs can be divided into three categories:

- Pure EVs, which use only a battery to power an electric motor;
- Hybrid Electric Vehicles (HEVs), which are driven by the combination of the electric motor and an ICE. Some hybrid vehicles have the ability to recharge their batteries from the grid. These are also termed plug-in hybrid vehicles (PHEVs). Commonly, hybrid vehicles can come in two drive train topologies, as shown in Figure 1.1. However, there are more topologies which have been summarised in Appendix A:
  - Series Hybrid: 100% of the wheel power is produced by the electric motor, taking its electricity from the battery and an ICE driven generator.
  - Parallel Hybrid: The power to the wheels is mostly generated from the ICE and the battery assists during acceleration and/or deceleration;
- Fuel Cell Vehicles (FCVs), which convert hydrogen into electricity with the help of fuel cells, which power the electric motor.

EVs and HEVs are already on the road; examples are the HEV Toyota Prius and the EV Nissan Leaf. It is reported that mass produced FCVs will not be available before 2050 due to cost and lack of hydrogen infrastructure. EVs are known to have a short range and require long charging times, and for these reasons public interest currently lies in HEVs.



Figure 1.1 Two common configurations of HEVs (a) series hybrid (b) parallel hybrid [1]

#### **1.1.2 Reliability of motors and inverters**

A simplified motor drive system is shown in Figure 1.2. It consists of a DC power input, a power converter, a motor, sensors, a control unit and interfaces. The electric motor and the inverter are the major components in an EV. Although various types of the previously mentioned EVs have different power train topologies, an electric drive system is always needed to convert electric energy into kinetic energy and/or recuperate kinetic energy back to electrical energy. The energy storage system is generally a battery or double layer capacitors. Safety and reliability are critical features of a power drive train. An automotive drive system must cope with vibrations and changes of humidity and ambient temperatures. Moreover, the driving profile generates a continuous change of power demand, which results in continuous changes in heat loss in the power semiconductors. All these factors will accelerate wear in the power drive train and leave it susceptible to faults. It is therefore of the utmost importance to monitor the health of the power drive train throughout its lifespan, in order to improve system safety and reliability.

Faults in an electric traction drive can generally be classified into two groups: 1) power electronics-related faults and 2) motor-related faults [4]. Motor faults mainly encompass bearing, rotor and stator faults [5]; however, these components are more robust than the components of a power inverter. Thus, the power electronics inverter can be considered to be the weakest link in a motor drive system [4], and is therefore the focus of this work.



Figure 1.2 Electric motor drive system

#### **1.2 Reliability of Power Converters**

The word reliability has been interpreted in different ways over time. The generally accepted specifications in relation to reliability considerations are: mean time between failures (MTBF), mean time to repair (MTTR), and availability [6]. By considering the reliability aspect of power converters, the power electronics manufacturer can guarantee high availability of their products. Henceforth, the term reliability is used to express the capability of a component or a system to remain functional between scheduled maintenance periods. A power inverter is made up of power modules, gate drives, DC link capacitors, filters, a digital controller unit, a sensor system and other components. Any of these components may break down at some point, which could lead to a system malfunction.
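The relationship between the three reliability specifications named above can be made concrete with the standard steady-state availability formula, $A_V = \mathrm{MTBF} / (\mathrm{MTBF} + \mathrm{MTTR})$. The following is a minimal sketch only: the function name and the numerical values are illustrative assumptions and are not taken from this thesis or from [6].

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability A_V = MTBF / (MTBF + MTTR).

    Both arguments are in hours; the result is a fraction between 0 and 1.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical inverter figures, chosen only to illustrate the formula.
    a_v = availability(mtbf_hours=50_000.0, mttr_hours=8.0)
    print(f"A_V = {a_v:.6f}")
```

The formula shows why availability can be raised either by making failures rarer (larger MTBF) or by making repairs faster (smaller MTTR).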

#### **1.2.1** Reliability of power semiconductor devices

There have been some studies on power converter reliability in order to determine the most fragile components in an inverter. As shown in Figure 1.3, semiconductor and soldering failures in device modules total 34% of converter system failures, according to a survey based on over 200 products from 80 companies [7]. A similar conclusion was published in Shaoyong's study [8]. Figure 1.4 shows that power semiconductor devices were the most fragile components, causing 31% of failures, followed by capacitors and gate drives. A study of the reliability of servo drives reported by Fuchs [9] came to the same conclusion, with a relative failure rate of 38%.





Figure 1.4 Fragile components distribution [8]

It is reasonable to conclude that the weakest link of an inverter is the power semiconductors, followed by capacitors, gate drives and connectors. Therefore, the focus of this research project is to enhance the reliability of power semiconductors.

#### 1.2.2 Reliability of IGBT power modules for EV applications

In EV applications, IGBT power modules are widely used due to their robust short-circuit capability, simple driver requirements, availability and relatively low cost. However, they are prone to excess electrical and thermal stress. Hence, there is a growing need for, and interest in, improving the reliability of IGBT modules, especially for safety-critical applications such as EVs, where an unpredicted failure may trigger a catastrophic accident or unscheduled maintenance, resulting in high penalty costs. Table 1.1 gives a typical overview of environmental and operational requirements for power modules in HEV applications.

Table 1.1 shows that, due to their wide range of operation and diverse usage profiles, power modules in EV applications face a stringent reliability requirement, which is higher than in other industrial motor drive applications.

**Table 1.1** Typical environmental and operational requirements for HEV applications [3]

| Environmental requirements |                         |
|----------------------------|-------------------------|
| Ambient air                | -40°C to 135°C          |
| Coolant water              | -40°C to 105°C          |
| Junction temperature       | -40°C to 175°C          |
| Vibration                  | 10g                     |
| Shock                      | 50g                     |
| Operational requirements   |                         |
| Operational Life           | 15 years (=131,400 h)   |
| Power Cycling              | 30,000 cycles @ΔT 100°C |
| Temperature Cycling        | 1,000 cycles @ΔT 165°C  |

#### **1.3** Methods of Improving Reliability of Power Modules

Factors such as electrical loading, thermal conditions, and environmental conditions, all exert stresses on power modules. Above all, the capability to withstand the considerable stress induced by high junction temperature and thermal cycles is a prerequisite for high reliability and lifetime of power modules [3]. Efforts to improve power module reliability have been made and are described below.

#### **1.3.1** Improvement of system architecture

The reliability of power modules is usually considered during the system design phase and can be improved in terms of system architecture and device performance. In recent years, manufacturers have made great contributions to interconnection technologies [3], efficient cooling facilities [10-12] and device derating (or over-design) [13], and the reliability of power devices has been greatly improved. For example, the average failure rate for power modules in traction dropped from 1,000 FITs in 1995 to 20 FITs in 2000 [14], where FIT stands for failures in time and 1 FIT is equal to 1 failure per 10<sup>9</sup> device hours. However, Figure 1.5 illustrates that, although the FIT number has been significantly reduced, power modules will fail eventually given enough stress and time [15].
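To give a feel for the FIT figures quoted above, the sketch below converts a FIT value into a failure rate and an MTBF, and estimates the probability of at least one failure over the 131,400 h operational life listed in Table 1.1. It assumes a constant failure rate (an exponential lifetime model), which is a simplification introduced here for illustration and ignores wear-out; the function names are hypothetical.

```python
import math

HOURS_PER_FIT_UNIT = 1.0e9  # 1 FIT = 1 failure per 10^9 device hours


def fit_to_failure_rate(fit: float) -> float:
    """Convert a FIT value to a failure rate in failures per hour."""
    return fit / HOURS_PER_FIT_UNIT


def fit_to_mtbf_hours(fit: float) -> float:
    """MTBF in hours, assuming a constant failure rate (MTBF = 1 / lambda)."""
    return HOURS_PER_FIT_UNIT / fit


def failure_probability(fit: float, mission_hours: float) -> float:
    """Probability of at least one failure within mission_hours.

    Assumes an exponential lifetime model: P = 1 - exp(-lambda * t).
    """
    return 1.0 - math.exp(-fit_to_failure_rate(fit) * mission_hours)


if __name__ == "__main__":
    life_hours = 131_400.0  # 15-year operational life from Table 1.1
    for fit in (1000.0, 20.0):  # traction figures for 1995 and 2000 [14]
        print(
            f"{fit:6.0f} FIT: MTBF = {fit_to_mtbf_hours(fit):.3g} h, "
            f"P(fail in 15 years) = {failure_probability(fit, life_hours):.4f}"
        )
```

Under this simplified model, the reduction from 1,000 to 20 FITs lowers the per-module failure probability over the vehicle life from roughly one in eight to well under one per cent, yet it never reaches zero.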



Figure 1.5 The component's health as a function of time

#### **1.3.2** Fault tolerant systems

Much research has been conducted on fault-tolerant systems which contain redundant power devices or other fault tolerance mechanisms to increase reliability [16, 17]. In comparison to the method described in section 1.3.1, a fault tolerant system can maintain its performance in the presence of component failures. When one device fails, the other takes over until the faulty device is replaced. This time-to-service is normally very short, reducing the likelihood of a failure of the redundant power module. However, any fault tolerant solution increases the number of switching devices and other passive components, which consequently increases cost, size and weight, all of which are undesirable in the automotive industry.

#### **1.3.3** Online health monitoring

Traditional reliability prediction methods are based on the use of reliability prediction handbooks, such as MIL-HDBK-217 [18]. Empirical data statistically obtained from the field are linked with the failure-rate curve, which is often termed the bathtub curve, as shown in Figure 1.6 [19]. The bathtub curve can be divided into three stages according to time: the infant period (early failure period), the grace period (random failure period), and the breakdown period (wear-out failure period). These reliability prediction methods establish stress and damage models based on data statistically obtained from the field or from tests. Reliability prediction is completed by calculating the failure rate and the mean time between failures (MTBF). MIL-HDBK-217 used to be updated periodically, but it has not been revised recently: the most recent revision is Revision F, last amended by Notice 2 in February 1995, and it is now out of date. Other well-known handbooks describing lifetime models developed for power semiconductor devices are presented in Chapter 3. This handbook-based approach has fundamental flaws in its reliability assessment and health estimation, since the actual operating and environmental conditions of the product are not considered.



Figure 1.6 Change in failure rate over time (bathtub curve)
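The three stages of the bathtub curve can be reproduced qualitatively by adding three Weibull hazard functions: a shape parameter below one gives a decreasing failure rate (early failures), a shape parameter of one gives a constant rate (random failures), and a shape parameter above one gives an increasing rate (wear-out). The sketch below is illustrative only; the superposition approach and every parameter value are assumptions chosen to produce the characteristic shape, not data from this thesis or from [19].

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale)**(shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def bathtub_hazard(t_hours: float) -> float:
    """Illustrative bathtub curve as the sum of three Weibull hazards.

    All shape and scale values below are arbitrary assumptions.
    """
    early = weibull_hazard(t_hours, shape=0.5, scale=2.0e5)    # decreasing rate
    random_ = weibull_hazard(t_hours, shape=1.0, scale=1.0e5)  # constant rate
    wearout = weibull_hazard(t_hours, shape=5.0, scale=1.5e5)  # increasing rate
    return early + random_ + wearout


if __name__ == "__main__":
    for t in (10.0, 2.0e4, 2.0e5):
        print(f"t = {t:>9.0f} h  failure rate = {bathtub_hazard(t):.3e} per hour")
```

With these assumed parameters the failure rate is high shortly after start-up, falls to a low plateau in mid-life, and rises again as wear-out begins to dominate.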

Online health monitoring for effective health evaluation has been developed recently. It was originally introduced for safety-critical mechanical systems and structures [20]. In general, an online health monitoring system (also termed condition monitoring) captures data from a component or device that can be used to determine its health. Depending on the data type, the data either indicate the health of the component or device directly, or must first be processed. Online health monitoring has been applied to a variety of components, such as machines [21], transformers [22] and capacitors [23], and it has also been used for power semiconductor devices [24-28].

All published work on online health monitoring for power semiconductor devices makes use of the physics of failure and thermal models. The combination of both models promises an accurate prediction of accumulated damage to the device. The physics of failure model is generated based on in-field data or test data in order to calculate how much damage has taken place.

All published online health monitoring systems for power modules are reliability predictions based on short-term accelerated life testing data, owing to limited historical field data. One of the problems with this is the extrapolation needed to estimate a lifetime of 20 years from, for example, an accelerated test lasting two months or less. Their application to health monitoring is typically constrained and can be significantly inaccurate and inconsistent when compared to actual field performance, particularly in the case of long-term EV applications. In other words, systems that are based on physics of failure and thermal models are limited. As a consequence, the work that has been presented so far regarding online health monitoring systems for power modules is inaccurate and has little meaning in terms of the true health of the power module.
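The sensitivity of such an extrapolation can be illustrated with a simple power-law (Coffin-Manson type) lifetime relation, N_f = A · ΔT_j^(-n), a form commonly used for power cycling data. The sketch below is an illustration under assumed numbers only: the test point, the field temperature swing and the two exponents are hypothetical values, not results from this thesis.

```python
def cycles_to_failure(delta_t, a, n):
    """Power-law lifetime model: N_f = a * delta_t**(-n)."""
    return a * delta_t ** (-n)


def fit_scale(delta_t_test, cycles_test, n):
    """Choose the scale factor a so the model passes through one test point."""
    return cycles_test * delta_t_test ** n


def extrapolate(delta_t_test, cycles_test, delta_t_field, n):
    """Extrapolate cycles to failure from test conditions to field conditions."""
    a = fit_scale(delta_t_test, cycles_test, n)
    return cycles_to_failure(delta_t_field, a, n)


if __name__ == "__main__":
    # Hypothetical accelerated test: 50,000 cycles to failure at a 100 K swing.
    # Hypothetical field condition: 30 K swing.
    for n in (4.0, 5.0):
        field_cycles = extrapolate(100.0, 5.0e4, 30.0, n)
        print(f"n = {n}: predicted field life = {field_cycles:.3e} cycles")
```

With these assumed numbers, changing the exponent from 4 to 5 changes the predicted field life by a factor of 100/30, roughly 3.3, even though both models agree exactly at the test point.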

The work presented here proposes for the first time an online health monitoring system that does not rely on any models, statistical data or any other means of extrapolation. Instead, it uses measurement circuitry that provides the status of the IGBT power module's health directly. The circuitry is embedded in the IGBT driver circuits and therefore the proposed method is called an in-situ health monitoring system for IGBT power modules.

#### **1.4 Research Goals and Contributions to Knowledge**

The main objectives of the research project were to:

- Obtain a comprehensive understanding of typical failure modes within IGBT power modules
- Provide an overview of prognostics and health monitoring methods for IGBT power modules
- Compare and study various in-situ condition monitoring techniques used in electric drives
- Evaluate and discuss the challenges of existing in-situ measurements and develop an online in-situ health monitoring system for EV applications
- Gain an in-depth knowledge of the thermal management of power modules and measurement methods for the junction-to-case thermal impedance and junction temperature
- Correlate the failure precursors (parameters indicative of degradation) with the main failure mechanisms and assess the health state with ongoing device degradation

All of the objectives have been met and are described in the thesis. The novel aspects and contributions to knowledge of this project are outlined below:

- A comprehensive assessment and comparison of health monitoring systems for power semiconductor devices
- An online health monitoring system for IGBT power modules that does not require any modelling for data processing
- Development of an online in-situ health monitoring system for bond wire lift-off and solder fatigue embedded in conventional IGBT driver circuits
- Implementation of an in-situ health monitoring system for IGBT power devices in EV applications

#### **1.5 Thesis Overview**

This thesis is divided into two parts: the chapters and the appendices. Chapter 1 describes the motivation, objectives and background of this project and includes a general discussion on in-situ health monitoring of IGBT power modules. Chapter 2 provides a review of conventional IGBT power module structures and reviews the failure modes and failure mechanisms of IGBT modules. Chapter 3 considers health monitoring techniques for IGBT modules and in particular analyses performance-related parameters indicative of the product's health; this is followed by determination of the parameters that need to be monitored and of lifetime prognostic methods. Chapter 4 is the summary of measurement methods related to power module health monitoring. Various measurement techniques are compared and the advantages and disadvantages of various electrical and thermal parameter measurement methods are presented. Online junction temperature measurement techniques are discussed in detail and the method of using the chip itself as a temperature sensor shows advantages which are later used for experimental research. Chapters 5 and 6 describe the novel in-situ health monitoring circuit and its operation. Chapter 5 describes techniques to monitor the health condition of bond wires and solder layers. The circuit measuring the forward voltage drop and junction-to-case thermal impedance of each individual IGBT and diode is described in detail. Chapter 6 describes the results from the practical experiments. Finally, Chapter 7 summarizes the research work that was carried out in the project and discusses the pros and cons of the proposed in-situ health monitoring circuit. The Chapter ends with a recommendation for future work to enhance the proposed technique.

The appendices follow the chapters. Appendix A presents an introduction to and general classification of HEVs. Appendix B first lists commonly used TSEPs, then introduces commonly used physical contact temperature sensors, followed by a general discussion of integrated circuit (IC) temperature sensors. Schematics of the experimental circuits are presented in Appendix C. These are followed by the results of junction temperature measurement with the non-switched method introduced in Chapter 5. Appendix E extends the experimental results presented in Chapter 6, and Appendix F compares C-SAM results from a group of samples in both the healthy and aged states. The LabVIEW programs used are presented in Appendix G. Appendix H gives a pictorial description of the experiment.
## **CHAPTER 2**

### **OVERVIEW OF IGBT POWER MODULES**

#### 2.1 IGBT Module Structures

IGBT power semiconductor modules are the key devices for medium and high power converters. IGBT power modules come in different sizes, shapes and with different functionalities, and they have become much more compact, cost efficient and reliable ever since the launch of the first IGBT power modules in the 1970s.



**Figure 2.1** 3-D view (a) and cross-sectional view [29] (b) of a typical IGBT power module (not to scale)

Figure 2.1 shows the 3-D view (a) and cross-sectional view (b) of a standard wire-bonded IGBT power module. Such a structure consists of a plastic case, sometimes called a cover, which is connected to a baseplate. A Direct Copper Bonded (DCB) ceramic substrate is soldered to the baseplate and the IGBT and diode power chips are then soldered to the DCB. The layered structure from chip to baseplate is often called a multilayer structure in the literature [30]. Bond wires are commonly used to connect the upper side of silicon chips to substrates and to connect substrates to terminals. In the final production step, silicone gel is poured into the module. The assembly components shown in Figure 2.1 are discussed in detail in Section 2.1.1.

IGBT power modules are required to operate reliably and efficiently over a large range of load conditions and environmental conditions. Special assembly materials and different packaging techniques have therefore been developed for different applications. For example, in order to reduce the junction-to-case thermal resistance, power modules without baseplates have been developed, such as the MiniSKiiP from Semikron (Figure 2.2). In order to reduce the performance-limiting effect of the wire bond interface, SKiN technology has recently been developed to eliminate bond wires (Figure 2.3). However, conventional wire-bonded power modules with baseplates are still in mass production owing to their flexibility and economic advantages. Therefore, the conventional IGBT power module SKM 50GB063D from Semikron was chosen for this research project.



**Figure 2.2** MiniSKiiP, an insulated press pack module with spring pin contact system [31] allowing the removal of the baseplate



Figure 2.3 SKiN half bridge module with bond wires replaced by sintered flex foil [32]

#### 2.1.1 Components of conventional IGBTs

This section describes briefly the different components required to manufacture an IGBT power module.

#### 1) Device metallization and bond wires

The assembly of power modules with the help of bond wires is a well-established technique and provides high flexibility in the power module production process. Therefore, it is still used as a primary method in the manufacturing of power IGBT modules. As shown in Figures 2.1 and 2.2, traditional wire bond modules use bond wires to connect the upper side of chips to substrates and/or to connect substrates to terminals. The diameter of the bond wires ranges from 200 µm to more than 500 µm, depending on the current rating requirement. Automated wire bonders use ultrasonic wedge bonding to join the wire to the chip metal pads. A metallization layer is deposited on the top side of the chips for wire bonding. It also adds thermal capacitance directly at the chip surface, which helps to buffer excessive heat from short transient loads such as a short circuit.

#### 2) DCB substrate

A DCB substrate has three functions. First, it enables internal electrical interconnection for multiple silicon devices using copper tracks. Second, it ensures high insulation between the power chips and the baseplate, and third, it conducts the heat generated by the power chips to the cooling system. Copper layers are metallized to the top and bottom areas of the ceramic substrate. Common ceramics are Alumina (Al<sub>2</sub>O<sub>3</sub>) or Aluminium Nitride (AlN). After the power module circuit pattern is etched, the top copper layer allows the chips to be soldered on, and the bottom copper layer provides a solder connection with the baseplate. The metallization is typically realized by a thick (300 µm) copper layer, either connected to the oxide by a eutectic bonding process (e.g. DCB) [33], or by an active metal brazing process (AMB) [33]. The copper-ceramic-copper sandwich combines good thermal conductivity and electrical insulation.

#### 3) Die-attach and DCB solder

Chips are attached to the metallized surface of the isolation substrate using solder reflow processing (or glue). The bottom side of the isolation substrate is mainly soldered to the module baseplate.

#### 4) Baseplate and heat sink

The baseplate is used to hold the housing and also provides thermal capacity and helps thermal spreading to increase the contact area to the heat sink. A baseplate (normally 3-5 mm thick) is made either from copper or a metal-matrix compound material, such as aluminium silicon carbide (AlSiC). It is usually mounted to the heat sink by means of pressure screws positioned at the margins. The unavoidable unevenness between baseplate and heat sink therefore requires an interface layer of thermal interface material (TIM) to fill the air gap between the module and the heat sink. Thermal grease is a commonly used TIM, and the practical TIM thickness is about 100 µm.
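The contribution of such an interface layer to the thermal path can be estimated with the one-dimensional conduction relation R_th = d / (k · A). The sketch below is a rough illustration under stated assumptions: uniform one-dimensional heat flow over the full contact area, a hypothetical thermal grease conductivity of 1 W/(m·K) and a hypothetical baseplate footprint of 94 mm × 34 mm. Only the 100 µm thickness comes from the text above.

```python
def layer_thermal_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """One-dimensional conduction resistance R_th = d / (k * A), in K/W."""
    if thickness_m < 0 or conductivity_w_per_mk <= 0 or area_m2 <= 0:
        raise ValueError("thickness must be >= 0; conductivity and area must be > 0")
    return thickness_m / (conductivity_w_per_mk * area_m2)


if __name__ == "__main__":
    tim_thickness = 100e-6        # m, practical TIM thickness quoted in the text
    grease_conductivity = 1.0     # W/(m*K), assumed value for thermal grease
    baseplate_area = 0.094 * 0.034  # m^2, hypothetical baseplate footprint

    r_tim = layer_thermal_resistance(tim_thickness, grease_conductivity, baseplate_area)
    print(f"Estimated TIM thermal resistance: {r_tim:.4f} K/W")
```

Because heat does not spread uniformly over the whole baseplate in practice, a figure obtained this way should be read as an optimistic estimate rather than a measured value.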

#### 5) Encapsulation and casing

A plastic case holds the terminal leads and houses the module for the purpose of protection and isolation. Once the module is mounted into a plastic case, it is then filled with a dielectric filling material, silicone gel, which provides better insulation than air, mechanical protection and protection against contamination. Epoxy resin is also overfilled to provide rigidity.


#### 2.1.2 Trends toward the next generation of IGBT modules

High reliability, high performance, high operation temperature, optimal operation efficiency, low losses, low cost, compactness, high power densities and ease of mass production are some of the major development trends in power electronics. Progress in power module technologies has been made in the following areas: further reduction of the losses generated in silicon dies, improvement of cooling system efficiency, improvement of thermal impedance and increase of the maximum operating junction temperature. With the new generation of IGBT modules, many research studies have shown that the silicon die does not set the limit in junction temperature (up to 200°C) [34]; rather, the assembly and interconnections set the limit in terms of reliability. In order to break through the lifetime limitations imposed by the three major wear-out/fatigue areas within the IGBT module (chip front side, chip-to-substrate, and substrate-to-baseplate interconnects, as will be discussed in Chapter 2.3 in more detail), new assembly and interconnect technologies have been developed and presented in studies by Guth, Ciliox, Ott, Reinhold and others [35-38].

#### 1) Chip front side connection

In recent years, advanced wire bonding techniques such as composition of materials, bonding parameters, chip surface metallization, ultrasonic welding connections and protective coating, have been applied and the reliability of bond wires has been considerably improved. Today, aluminium (Al) is still the standard material used for bond wire and chip front side metallization. However, recently developed wire bonding techniques with copper bond wires and copper front side metallization show some clear advantages for improving reliability [35-37]. A general comparison of the relevant material properties for both copper and aluminium is shown in Table 2.1, which illustrates the fact that copper has better compatibility of coefficient of thermal expansion (CTE), higher electrical and thermal conductivity and superior mechanical properties.

#### 2) Chip-to-substrate connection

The chip-to-substrate connection not only sets the limits for solder fatigue but also has a direct impact on the lifetime of the wire bond interconnects. Low temperature silver sintering is the benchmark concerning power cycling reliability. A sintered module survives about three times as many cycles to failure as a soldered module in an active temperature cycling test, where temperature excursions are mainly due to the power dissipation from the power semiconductor. However, high material costs and incompatibility with today's soldering technologies make this technology expensive. A diffusion soldering process for power semiconductors, which forms a high melting bond between chip and substrate, has recently been proposed [38] and is still under development.

| Property               | Copper      | Aluminium   |
|------------------------|-------------|-------------|
| Electrical resistivity | 1.7 µΩ·cm   | 2.7 µΩ·cm   |
| Thermal conductivity   | 400 W/(m·K) | 220 W/(m·K) |
| CTE                    | 16.5 ppm/°C | 25 ppm/°C   |
| Yield strength         | 140 MPa     | 29 MPa      |
| Elastic modulus        | 110-140 GPa | 50 GPa      |
| Melting point          | 1083 °C     | 660 °C      |

Table 2.1 Comparison of material properties

#### 3) Substrate-to-baseplate connection

Some new IGBT modules utilize optimized solder formula and ceramic pattern layout to maximize the life of the substrate solder joint [39]. Others press the DCB substrate directly to the heat sink by using multiple stamped and folded busbars. In this way, the main reliability risk, the large area substrate solder layer, has been eliminated [3].

## 2.2 Failure Modes, Mechanisms and Effects Analysis in IGBT Modules

Failure modes, mechanisms and effects analysis (FMMEA) is a methodology to identify failure mechanisms and models for all potential failure modes, to assess the root causes and to prioritize the failure mechanisms of a given product [40]. A potential failure mode is the manner in which a failure can occur, that is, the ways in which the item fails to perform its intended design function. Failure mechanisms are the processes by which a specific combination of different stresses (e.g. physical, electrical, thermal, chemical, mechanical, etc.) induces failures. Effects analysis refers to studying the consequences of those failures for an entire product or system [41].

In power semiconductor modules, the output of the FMMEA process is a list of critical failure mechanisms that help to identify the precursors to monitor and the relevant Physics of Failure (PoF) models to use, enabling prediction of the component's Remaining Useful Lifetime (RUL). A failure mode is the recognizable symptom by which failure is observed, e.g. bond wire lift-off or solder crack. Each failure mode could be caused by one or more different failure mechanisms under certain operating conditions (stresses). For IGBT modules, the stresses can be driven by temperature, voltage, current, mechanical vibration and shock, humidity, cosmic radiation level, etc. [42, 43].

Failure mechanisms are broadly categorized as random (or sudden) failure and wear-out (gradual) failure mechanisms [44, 45]. Random failures are catastrophic failures resulting from a momentary over-stress condition that exceeds the threshold of a strength property. This type of failure is not related to the length of service or the age of the device. It is generally caused by external events such as particle radiation, voltage transients and electrical failures resulting in electrical discharge. Wear-out failures are attributed to the accumulation of incremental physical damage under the operating load (stress) conditions, altering the device properties beyond endurance limits. Examples of wear-out failures include bond wire heel cracking and lift-off [46, 47], chip metallization reconstruction [46-48], solder fatigue [49, 50], cracking of the DCB substrate [35-37, 51, 52], and gate oxide breakdown [19, 53-55]. The wear-out failures can be assessed with a prognostics and health monitoring (PHM) method, which is the main focus of this thesis. However, to date no monitoring or prediction method has been found that is suitable for both random and wear-out failures. Random failure mechanisms are always taken into consideration by means of the average failure rate. These failure processes are related to the IGBT module type and to the stress levels caused by abnormal overload conditions, such as overcurrent, overvoltage, over-temperature and short-circuit [56, 57]. Since the random failure mechanisms are still not fully understood and the relevant research is ongoing, the main approaches to preventing their occurrence are to enhance device performance by over-design and to reduce electrical and thermal stresses.

In terms of failure sites, failure modes in power modules can be categorized as chip-related failures and packaging-related failures. Chip-related failures always interact with packaging-related failures, which tend to be more significant and more frequently observed. Some of the frequently observed failure modes will now be discussed.

#### 2.2.1 Chip-related failures

Chip-related failures are relevant to semiconductor physics and are silicon die intrinsic failures which can ultimately destroy a device. These failures are separate from packaging-related failures, but may often be associated with several structural changes in the package. This is because the overstress and wear out mechanisms during operation are generally believed to cause chip performance degradation and potential failures in the long term [15]. Some of the chip-related failures are listed below.

#### 1) Time dependent dielectric breakdown (TDDB)

Dielectric breakdown occurs when a strong electric field induces a current channel through an originally insulated medium. The gate oxide for a MOS semiconductor device is subject to a wear-out failure mechanism, that is, destruction of gate oxide caused by chronic defect accumulation in the thin SiO<sub>2</sub> insulator layers [54, 55]. This is referred to as Time Dependent Dielectric Breakdown (TDDB). Gate oxides can suffer breakdown under long-time application and even under normal bias operation.

High temperature and strong electric field may impart energy into an electron or a hole which then becomes a "hot carrier" due to the high kinetic energy stored [55]. Hot carriers have sufficient energy to overcome the energy barrier and to flow through the gate oxide by tunneling. These trapped hot carriers are the primary cause of TDDB and gate oxide degradation [54] which ultimately lead to loss of gate controllability. Gate oxide degradation affects the threshold voltage  $V_{GE(th)}$  as well as transconductance  $g_m$  and reduces the collector-emitter saturation current  $I_S$  [58]. Changes in any of the three parameters will result in excessive leakage current and a change in transistor response time [54].

#### 2) Electromigration

Electromigration is a wear-out mechanism in silicon interconnects as a result of high current densities. The current flow of extremely high density knocks off atoms within thin-film conductors and causes the displacement of metal atoms, leaving a gap or void at one end. In a semiconductor chip, the formation of such a void will cause an open circuit or high resistive paths [59].

#### 3) Transient electrical stresses

Electrostatic discharge (ESD), electrical overstress (EOS), and power supply transients from switching or lightning can cause chip failures [60, 61].

#### 4) Over-temperature

There is a theoretical maximum of the internal temperature of a semiconductor device. Today, the maximum junction temperature of commercially available silicon power chips is  $T_j=175$  °C. Almost all electrical characteristics of IGBTs and freewheeling diodes (FWD) are a function of the junction temperature. As the junction temperature increases, the failure rate increases exponentially, leading to latch-up.
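An exponential dependence of failure rate on temperature is often described with the Arrhenius relation, in which the acceleration factor between a reference temperature and an operating temperature is AF = exp[(E_a / k_B) · (1/T_ref − 1/T)]. The text above does not state which model is meant, so the sketch below should be read as one possible illustration: the Arrhenius form, the activation energy of 0.7 eV and the two temperatures are assumptions, not values from this thesis.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(temp_ref_c, temp_c, activation_energy_ev):
    """Acceleration factor of the failure rate at temp_c relative to temp_ref_c.

    Temperatures are given in degrees Celsius and converted to kelvin.
    """
    t_ref_k = temp_ref_c + 273.15
    t_k = temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_ref_k - 1.0 / t_k)
    return math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical comparison: 150 degC junction temperature versus 125 degC.
    factor = arrhenius_acceleration(125.0, 150.0, activation_energy_ev=0.7)
    print(f"Failure rate acceleration factor: {factor:.2f}")
```

The appropriate activation energy depends on the specific failure mechanism, so a single value cannot represent a whole module.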

#### 5) Cosmic rays

MOS devices and diodes are affected by external radiation and cosmic rays (e.g. high-energy mobile ions and neutrons), which impose severe limitations on the device's maximum DC blocking voltage [62]. It is common practice to derate the maximum rated blocking voltages by as much as 50% to meet the lifetime requirements. IGBT devices show an increased sensitivity to cosmic rays compared with diodes [46].

#### 2.2.2 Package-related failures

Package-related degradation is of more concern in many applications than chip-related failure. This is because they could incur higher working temperatures which lead to degraded loci of the device's Safe Operating Area (SOA) and its operating performance [57]. In particular, the switching transitions become very stressed in the case of bond wire and die-attach degradations.

For a qualified off-the-shelf power electronic module, thermal limitations of the power device play an equally important role as the maximum electrical ratings. The system reliability and operation lifetime are limited by the component temperature profile, as well as the chip and packaging technology. In the FMMEA analysis, bond wire fatigue and solder fatigue (both the die-attach and the DCB substrate solder) are identified as critical failure mechanisms due to thermomechanical stress in IGBT power modules.

These wear-out/fatigue failures have been reported extensively to be the predominant causes of failure of standard wire-bonded IGBT power modules, and research projects such as LESIT (1994-1996) [42] and RAPSDRA (E.U., 1996-1998) [63] have looked into this in great detail. The main reliability issues come from the fact that the module packaging is made up of several layers, each of which has its own thermal properties, characterized in particular by the coefficient of thermal expansion (CTE). Due to the CTE mismatch between dissimilar materials and temperature cycling, thermal deformation generates shear stress at various locations within the assembled structure. The material interfaces are exposed to this repetitive thermomechanical stress, leading to fatigue/wear-out and eventually to failure. In theory, thermomechanical stresses are distributed across the whole power module package. However, results from reliability tests and field tests identify three major regions for these wear-out/fatigue failures: (i) the chip front side connection (bond wire-to-silicon); (ii) the chip-to-substrate connection, also called die-attachment (chip-solder joint); and (iii) the substrate-to-baseplate connection (DCB substrate-solder joint) [64-67]. Figure 2.4 shows the structure of an IGBT module, including the locations of failure regions (i) to (iii). Table 2.2 lists the CTEs of the basic elements inside power modules.



Figure 2.4 Three major regions for thermomechanical failures

The die-attach has a comparatively lower CTE mismatch, and its wear-out is therefore seldom observed compared with bond wire lift-off and DCB solder fatigue, which are the two most frequently observed failure mechanisms [68-70] of present packaging technology. Therefore, this research work has focused on developing an in-situ condition monitoring circuit that can capture both failures. The method that has been developed is also able to capture the health of the die-attach, as explained in Chapter 6.

#### 1) Bond wire and chip metallization damage

One major reliability bottleneck of standard power modules is wire bond interface. Bond wire failures (Figure 2.5) and chip metallization reconstructions (Figure 2.6) are mainly caused by temperature swings and the large CTE mismatch between aluminium bond wires and the silicon die (Table 2.2).

| Parts               | Material          | CTE (ppm/°C) |
|---------------------|-------------------|--------------|
| Silicone gel        | Silicone resins   | 30–300       |
| Epoxy resins        | Epoxy             | 15–100       |
| Terminal            | Copper            | 16.5         |
| Ni plating          | Nickel            | 13.4         |
| Bond wire           | Aluminium         | 24           |
| Chip metallization  | Aluminium         | 27           |
| Chip                | Silicon           | 3.2          |
| Isolation substrate | Alumina           | 6.8          |
|                     | Aluminium Nitride | 4.7          |
| Baseplate           | Copper            | 16.5         |
|                     | AlSiC             | 8            |

Table 2.2 CTEs for some common electronic materials [71]
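To give a feel for the size of the mismatch, the unconstrained thermal mismatch strain between two bonded materials can be estimated to first order as Δε = |α₁ − α₂| · ΔT. The sketch below uses the CTE values from Table 2.2; the 80 °C temperature swing is a hypothetical value, and the estimate ignores geometry, plasticity and the temperature dependence of the CTEs.

```python
# CTE values in ppm/degC, taken from Table 2.2
CTE_PPM_PER_C = {
    "aluminium": 24.0,
    "copper": 16.5,
    "silicon": 3.2,
    "alumina": 6.8,
    "aluminium nitride": 4.7,
    "AlSiC": 8.0,
}


def mismatch_strain(material_a, material_b, delta_t):
    """First-order mismatch strain |alpha_a - alpha_b| * delta_t (dimensionless)."""
    d_alpha = abs(CTE_PPM_PER_C[material_a] - CTE_PPM_PER_C[material_b]) * 1e-6
    return d_alpha * delta_t


if __name__ == "__main__":
    swing = 80.0  # degC, hypothetical temperature swing
    pairs = [
        ("aluminium", "silicon"),  # Al bond wire on the silicon chip
        ("copper", "silicon"),     # copper bond wire on the silicon chip
        ("copper", "alumina"),     # copper baseplate against an alumina substrate
        ("AlSiC", "alumina"),      # AlSiC baseplate against an alumina substrate
    ]
    for a, b in pairs:
        print(f"{a:>10} / {b:<8} strain = {mismatch_strain(a, b, swing):.2e}")
```

Under these assumptions the aluminium-silicon pair shows the largest mismatch of the four, which is consistent with the bond wire interface being identified as a reliability bottleneck.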

Bond wire lift-off has been observed in both IGBTs and freewheeling diodes and has been recognized as a major failure mode in wire bonded power modules. The difference in strain between the two materials leads to high thermomechanical stress at the interface during power cycling [42]. Due to the low yield strength of aluminium bond wires and the weakness of wire bond joints, initial cracks always start near the wire-to-chip interface and propagate along the grain boundaries of the interface [72]. Bond wire lift-off happens to both emitter and gate bond wires. Another failure mode is bond wire heel cracking, which rarely occurs in advanced IGBT power modules, though it can be observed mainly after long endurance tests (Figure 2.5 (b) and (c)) [46, 48, 73].

Held et al. [42] found that damage of chip metallization and bond wires is closely related to both the medium junction temperature $T_m$ and the junction temperature swing $\Delta T_j$. In devices with multiple bond wires, this failure mechanism preferentially affects those wires located close to the center of the chip, where the junction temperature reaches its maximum. As the most central emitter bond wire contact fails first, the remaining bond wires must then carry the full load at a higher current density. As the degradation proceeds, an increasing number of bond wire contacts fail and the current density in the remaining contacts continues to increase towards a critical value. This eventually causes final chip destruction, such as localized over-current and chip burn-out failures [46, 54].
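The way the load concentrates on the surviving wires can be sketched with a simple equal-sharing assumption, in which the remaining wires divide the total emitter current evenly. Real current sharing depends on the wire layout and contact resistances, so the equal split, the wire count and the current used below are simplifying, hypothetical choices rather than data from this work.

```python
def current_per_wire(total_current_a, total_wires, failed_wires):
    """Current carried by each remaining wire, assuming equal sharing."""
    remaining = total_wires - failed_wires
    if remaining <= 0:
        raise ValueError("no bond wires left to carry the current")
    return total_current_a / remaining


if __name__ == "__main__":
    load_current = 60.0  # A, hypothetical emitter current
    wires = 6            # hypothetical number of parallel emitter bond wires
    for failed in range(wires):
        per_wire = current_per_wire(load_current, wires, failed)
        print(f"{failed} wires lifted: {per_wire:.1f} A per remaining wire")
```

Each additional lift-off raises the current in the remaining wires by a larger step than the previous one, which illustrates why the degradation accelerates towards the end of life.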



Figure 2.5 Examples of bond wire damage [48]: (a) metallurgic damage; (b) heel crack; (c) fracture; (d) lift-off

In addition to the effect described above, reconstruction of the aluminium metallization pad can also occur on the chip top surface [46-48]. It can be understood as the extrusion of Al grains from the surface of the metallization, so that the surface becomes rough. Figure 2.6 shows three different states of the metal pads. The pictures show clearly how the reconstruction of the Al metallization progresses during power cycling tests, which could lead to non-uniformity of the current density and increased metal sheet resistance.



**Figure 2.6** Examples of emitter metallization damage [48]: (a) healthy state; (b) initial damage; (c) progressive damage

#### 2) Solder fatigue

Another known reliability limiting factor is the solder joints (Figure 2.7). Due to the ever-growing improvement in bonding techniques, fatigue within chip solder and substrate solder joints is considered as the most notable failure of power modules.

Deterioration in these solder layers increases the module thermal impedance, leading to a distortion of the thermal flux, which consequently leads to overheating of the chip itself. Solder layer deterioration is reported as initiating from either the periphery or the center underneath the chip as a consequence of the used solder material [49] and thermal conditions [50].



**Figure 2.7** Cross-sectional SEM-images of the copper–solder interface of PbSn solder (a) before and (b) after ruptures [43]

Due to the CTE mismatch and the repetitive temperature swing, high thermomechanical stress is imposed at both the chip-substrate solder and substrate-baseplate solder layers. However, this is not the only impact factor in causing damage. The size of the solder joints, boundary constraints, and material yield stress and melting points also have impact on the solder fatigue. Degradation in these solder layers therefore occurs in the forms of voids, cracks and delamination. A consequent thermal resistance increase follows the solder fatigue in the heat dissipation path. This will adversely accelerate the wire bonding interface degradation and has a direct impact on the lifetime of the bond wires. Meanwhile, the increased junction temperature could induce hot spots and thermal runaway in the affected areas of the module.
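The link between a thermal resistance increase and junction temperature can be illustrated with the steady-state relation T_j = T_c + P_loss · R_th(j-c). The sketch below assumes steady-state, one-dimensional heat flow; the case temperature, power loss and thermal resistance values are hypothetical and are not measurements from this thesis.

```python
def junction_temperature(case_temp_c, power_loss_w, rth_jc_k_per_w):
    """Steady-state junction temperature T_j = T_c + P_loss * R_th(j-c), in degC."""
    return case_temp_c + power_loss_w * rth_jc_k_per_w


if __name__ == "__main__":
    case_temp = 80.0     # degC, hypothetical case temperature
    power_loss = 100.0   # W, hypothetical chip power loss
    rth_healthy = 0.30   # K/W, hypothetical healthy junction-to-case resistance

    for increase in (0.0, 0.1, 0.2, 0.5):
        rth = rth_healthy * (1.0 + increase)
        tj = junction_temperature(case_temp, power_loss, rth)
        print(f"R_th increase {increase:4.0%}: T_j = {tj:.1f} degC")
```

For the same power loss, every increment in thermal resistance translates directly into a higher junction temperature, which in turn enlarges the temperature swing seen by the bond wires.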

In addition to the above mentioned solder failures, terminal connector-to-substrate solder degradation is also reported during the temperature cycling. Again, crack propagation and condition deterioration cause an increase of the thermal resistance. Besides an increase in the thermal resistance, cracks in the solder joint will also disturb the current path with the consequence of an increased on-state voltage drop across the solder joint [43].

Besides the CTE mismatch of the materials, terminal connector-to-substrate joints must also withstand ever higher temperatures. Nominal IGBT current ratings have increased by a factor of three with the same module layout from the first IGBT generation to the fourth IGBT generation and are continuously increasing. With continuous growth in the power density of power electronics, the power loss in terminal leads and the terminal connector-to-substrate joint has increased. These losses elevate the temperature in the connector-to-substrate joint and have an additional impact on the joint reliability.

The terminal, baseplate and foils of insulation substrate are made of copper. CTE mismatch is the principal cause of thermomechanical stress but it is not the only impact factor in causing damage. The size of the solder joint, boundary constraints and material yield stress also have an impact on the solder fatigue.

Since the implementation of advanced soldering techniques, including void free soldering and new baseplate material, solder layer reliability has been improved to a large extent [74]. However, all solder joints (chip-to-substrate solder, substrate-to-baseplate solder [51] and terminal connector-to-substrate solder [43]) are still regarded as critical points during active and passive temperature cycling [43, 51].

#### 3) DCB failures

Apart from bond wire failure and solder failure, DCB ceramic crack and DCB metallization delamination are also observed under high temperature cycles [35-37, 51, 52]. Direct copper bonding denotes a process in which copper foils are directly bonded to the top and bottom surface of a ceramic material at a high temperature. CTE mismatch between the copper metallization and ceramic material leads to mechanical stress during temperature variations.

With new assembly and interconnection methods that either improve or circumvent the solder layer, as discussed in Chapter 2.1.2, the DCB becomes a potential weak spot which may set the lifetime limit. Ceramic cracks, followed by fracture underneath all metallizations, have occurred in DCB substrates during thermal cycling, as shown in Figure 2.8.




**Figure 2.8** Optical microscopy analysis of the DCB ceramic (AlN) [51] (a) Ceramic crack located under the copper layer; (b) Lift up of the copper layer induced by the ceramic crack

#### 4) Partial discharge (PD)

Partial discharge (PD) is a partial breakdown of the insulation material. The main source of PD is from metallization edges and interfaces in silicone gel at high voltages. The shape of the edge of the substrate metallization has a much greater influence on PD, and poor etching or solder residues can adversely affect PD. The type of silicone gel and the process parameters also show secondary effects on the magnitude of the PD [53]. In particular, PD resistance may be much lowered when cracks occur in ceramic substrates. If the voltage exceeds the breakdown voltage of the air included, a sudden flash-over discharges across the substrate ceramic.

## **CHAPTER 3**

### RELIABILITY, PROGNOSTICS AND HEALTH MONITORING

#### **3.1** General Introduction to Reliability

The word reliability has been given different expectations and individual interpretations throughout history. In spoken English, the term reliability is used to express the capability of a product to stay functional. In the theory of reliability, this capability is quantified as the probability R(t) of a product to perform its designed function for a specified time interval. The probability of survival, R(t), plus the probability of failure, F(t), is always unity. Reliability provides information about the failure-free interval and is generally characterized by Mean Time Between Failure (MTBF) or failure rate ( $\lambda$ ) [6]. The failure rate ( $\lambda$ ) indicates the mean number of failures per unit time. The MTBF with a constant failure rate and no redundancy can therefore be calculated by MTBF= $1/\lambda$ . Good reliability is indicated by long MTBF and low failure rate. Other commonly used terms in traditional reliability considerations include Mean Time To Repair (MTTR) and availability [6]. MTTR is the average time it takes to eliminate a failure and emphasises the length of the period necessary to restore the required function, which is indicative of maintainability [6]. It is the mean value of the entire downtime period of the product. Availability (A) is the probability of finding a product in a proper service condition at any point of time and can be measured by

$$A = MTBF / (MTBF + MTTR)$$
(3.1)
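As a minimal numerical sketch of the relation MTBF = 1/λ and Equation 3.1, the following Python snippet evaluates both quantities. The function names and the example values (a failure rate of 10⁻⁵ per hour and an MTTR of 8 hours) are hypothetical and are not taken from this thesis.

```python
def mtbf_from_failure_rate(failure_rate_per_hour):
    """MTBF for a constant failure rate and no redundancy: MTBF = 1 / lambda."""
    if failure_rate_per_hour <= 0:
        raise ValueError("failure rate must be positive")
    return 1.0 / failure_rate_per_hour


def availability(mtbf_hours, mttr_hours):
    """Availability per Equation 3.1: A = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical example values, chosen only for illustration.
mtbf = mtbf_from_failure_rate(1e-5)
a = availability(mtbf, mttr_hours=8.0)
print(f"MTBF = {mtbf:.0f} h, A = {a:.6f}")
```

The sketch makes the distinction in the text concrete: a long MTBF describes reliability, whereas availability also depends on how quickly a failure can be repaired.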

It is a fact that engineers of different disciplines often use different definitions for the above terms for reliability considerations, and reliability is often confused with availability [6]. The term reliability of IGBT power modules used in this thesis expresses the capability of a component or a system to stay functional outside of a scheduled maintenance period.


Based on field data, product failures are generally found not to occur at a uniform rate, but follow a distribution in time which can be described by a classic bathtub curve, as shown in Figure 3.1 [19]. The early failure stage begins at time zero and is characterized by a high but rapidly decreasing failure rate. In the case of semiconductors, these failures are usually due to defects that could not be removed during production, such as micro dust collecting on the wafer, or to material defects. After the early failure stage, the failure rate decreases to a lower value and remains almost constant for a long period of time. This long period of an almost constant failure rate is known as the useful life period. There is always the possibility of a random failure occurring after a long time. Consequently, the failure rate never decreases to zero. Ultimately, the failure rate begins to increase over time as a result of age-related wear-out and fatigue. This region in the bathtub curve is commonly referred to as the wear-out or end-of-life period [904]. In the case of an IGBT module, bond wire lift-off, chip metallization degradation, solder fatigue and gate oxide breakdown may occur.



Figure 3.1 Change in failure rate over time (bathtub curve) [19]

Traditionally, reliability considerations of power electronics have been studied and predicted with a methodology handbook such as MIL-HDBK-217 [18] and IEEE 1314 [75]. This calculation of MTBF is based on the failure rates of the product, which have been statistically obtained from field data; however, these do not accurately account for the actual environmental and operating conditions that the product is subjected to in its life cycle [76]. It is well documented that this technique can result in inaccurate predictions. In fact, this method has been shown to be misleading and to provide false life predictions [77].

Higher field reliability requires knowledge of in-service use and life cycle environmental and operational conditions. Prognostics and health monitoring (PHM) methods have therefore been introduced as they have the advantage of allowing the reliability performance of a product to be assessed in actual life cycle conditions, and early work on PHM was applied to military aircraft and civil helicopters throughout the late 1990s and early 2000s in the U.S. and Europe [20].

PHM enables in-situ assessment of the deviation or degradation of a product from an actual application environment. Furthermore, based on current and historical application conditions, the system future health and RUL for the intended load profiles may also be predicted [20].

To use the PHM for power electronic components, the failure modes and mechanisms that can take place in the components need to be identified. The significance of their impact is weighted and the major failure mechanisms are selected. Then, appropriate performance data (thermal or electrical operating parameters such as junction temperature and voltage) need to be selected and monitored, and then used in subsequent health assessment and RUL estimations. When building a PHM system, three basic steps are required: estimation of the current state of the system, assessment of the impact of degradation on system performance and determination of the need for corrective or mitigating action, and prediction of future state [78]. Once all three components are in place, PHM is able to predict the impact of wear-out failures and fatigue. However, PHM is not capable of predicting random failures.

Sheppard et al. [79] propose four PHM methods which will be discussed in more detail: 1) use of expendable devices, such as "canaries" and fuses; 2) data driven method; 3) model driven method; and 4) fusion techniques. All of these methods have been explored in electronics and other applications, but work in power electronics is still in its infancy. Their applications in IGBT modules will be discussed in the following Sections 3.2 to 3.5.

#### **3.2 Fuses or Canaries**

Expendable devices are normally embedded into the host products and they have the same failure modes and mechanisms as the real products. Since their failures will take place much sooner than those of the host product to provide advance warning of failure, they are termed fuses or canaries.

In order to provide advance warning signs of chip-related failure mechanisms for semiconductors, the prognostic cell is used as a fuse/canary device and is mounted onto the silicon chip [20]. The prognostic cell is designed to fail faster by scaling and is subjected to accelerated conditions compared to the original component [77]. Loading conditions (i.e. operating parameters) are altered in a controlled way by scaling and the current density inside the prognostic cell is increased in a controlled manner. Therefore, the prognostic cell is expected to fail faster when a current of higher density passes through it, as compared to the actual components. In addition, the prognostic cell is located on the same chip as the actual components; thus, it will experience substantially the same manufacturing process and operating conditions as the actual component. This ensures that any parameter that affects the real product reliability will also affect the fuse/canary device, causing its failure. Figure 3.2 shows two idealized bathtub curves: one for the actual component and the other for the prognostic cell. The failure of the fuse/canary devices can be used to estimate the time to failure of actual components [80], given that the time to failure of the prognostic cell is pre-calibrated with respect to the time to failure of the actual component. Measurable parameters which are indicative of the condition of the prognostic cell are required; they are monitored and compared to a preset reference value in order to indicate impending failures.



**Figure 3.2** The idealised bathtub reliability curves for the test circuit and the prognostic cell. The shaded region is the failure region of the prognostic cells, which is before the wear-out region of the actual component [80].

The prognostic cell may consist of a family of prognostic devices with different trigger points so that the degradation level can be evaluated. Mishra et al. [80] experimented with prognostic cells (circuits) and their work has led to the so-called Sentinel Semiconductor™ technology, which has been commercialized by the Ridgetop Group [81]. Work by Goodman et al. has demonstrated a prognostic cell to monitor time-dependent dielectric breakdown (TDDB) of MOSFETs [82].

Whilst the majority of research is currently oriented to semiconductor applications, little research work has been reported on embedded fuses/canary devices in power semiconductors (i.e. power MOSFETs and IGBTs) applications. The present prognostic cells are mainly used for chip-related failure detection; however, they are not indicative of packaging-related failures, which tend to be more significant and more frequently observed. In addition, canaries/fuses provide limited insight into the degraded condition and RUL prior to their failures [83].

#### **3.3 Data Driven Methods**

A key step in the data driven method is to collect significant operating parameters which are indicative of performance degradation by in-situ monitoring. These parameters are sometimes called precursor parameters and their changes indicate impending failures [77]. Based on the statistical characteristics of the device's historical (training) data, state of health can be evaluated. In order to estimate the RUL, the existing damage is first estimated, and then a suitable extrapolation to the precursor shifts is performed to find out the intersection of the extrapolated damage and the failure criterion. This method is often referred to as the data trending technique.
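The data trending idea described above can be sketched as follows. This is a minimal illustration that assumes a linear degradation trend; the function names, the straight-line fit and the example values are hypothetical choices made here, not a method prescribed by the cited works.

```python
def fit_line(x, y):
    """Ordinary least-squares straight line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_rul(times, precursor, failure_threshold):
    """Extrapolate the fitted precursor trend to the failure criterion.

    Returns the remaining useful life measured from the last sample,
    or None when the fitted trend does not increase.
    """
    slope, intercept = fit_line(times, precursor)
    if slope <= 0:
        return None  # no upward drift to extrapolate
    time_at_failure = (failure_threshold - intercept) / slope
    return max(time_at_failure - times[-1], 0.0)


# Hypothetical history: a precursor drifting upwards with the cycle count.
cycles = [0, 1000, 2000, 3000, 4000]
values = [2.00, 2.01, 2.02, 2.03, 2.04]
rul = estimate_rul(cycles, values, failure_threshold=2.10)
print(f"Estimated RUL: {rul:.0f} cycles")
```

A linear extrapolation is the simplest possible choice; real precursor drift is often non-linear close to end-of-life, so the fitted model would need to be selected from the training data.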

There are two kinds of precursors for IGBT power modules. The first can be termed inherent parameters since these are built-in prognostic parameters which correlate specific degradation modes with relative parameter shifts. They can be selected based on knowledge of the appropriate parameters established by past experience and field application data. More systematic methods, such as the previously mentioned FMMEA method, can also be used to determine the parameters to be monitored. The second kind consists of externally embedded sensors which are additionally implemented for degradation monitoring. Although this work focuses on the most significant IGBT power module failures, which develop under thermomechanical stresses, some commonly used failure precursor parameters will be introduced in the following section.

#### **3.3.1** Inherent prognostic parameters

The major semiconductor failure mechanisms that can be monitored with inherent prognostic parameters are discussed here. Sometimes, more than one prognostic parameter is available for a particular failure mechanism. Most of these prognostic parameters are discussed in the literature for laboratory study purposes, and commercially available equipment, such as curve tracers and semiconductor device analysers, is often used for parameter measurement. Examples of such equipment are the Agilent B1505A [84], the Tektronix 371B [85], the T3Ster [86], the 9624-KT Thermal Tester [87] and the Phase 11 Thermal Analyser [88]. However, to take these parametric measurements, the electric circuitry of the inverter system needs to be changed, and achieving a high degree of accuracy may even require the inverter system to be disassembled or destroyed. In addition, this equipment is normally costly and requires sophisticated software to operate. As a consequence, appropriate parameters for specified failure modes need to be carefully selected and their feasibility for on-board EV application considered. These monitoring techniques cannot be directly converted into real applications since their measurements require detailed practical considerations. In this section, a selection of frequently used inherent prognostic parameters for laboratory studies is reviewed; further knowledge regarding device or packaging failure mechanisms is still under development. Their in-situ monitoring techniques will be further discussed in detail in Chapter 5 and Chapter 6.

#### 1) To detect TDDB and gate oxide degradation

Gate threshold voltage  $V_{GE,th}$ , capacitance-voltage (C-V) [43, 54, 89, 90] and leakage current are measured to identify TDDB and gate oxide degradation by detecting changes in their electrical behaviour.

Gate threshold voltage is the lowest gate voltage at which the IGBT turns on and a specified small amount of  $I_D$  begins to flow. The test is run by shorting the gate to the drain so that  $V_{GS} = V_{DS}$  and two general methods are used:

- Applying a voltage to the gate contact, monitoring the current through the dielectric layer (voltage test);
- Injecting a current from the gate and measuring the gate voltage needed to sustain such current (current test).



Figure 3.3 Degradation progress as observed on threshold voltage [89]

The changes in threshold voltage under defined drain current and temperature (as shown in Figure 3.3) are observed in the published literature [43, 54, 89, 90]. This is correlated with the presence of trapped electrons in the gate oxide, which was verified by the Capacitance-Voltage (C-V) measurements conducted in the work of Patil et al. [54]. The C-V plot translates to the right if the oxide trapped charge is negative, and to the left if the trapped charge is positive [91]. Holes trapped in the gate oxide lead to a negative threshold shift, while trapped electrons lead to a positive threshold shift. For all the parts in Patil's research [54], a right shift in the C-V measurements was observed, showing the presence of trapped electrons in the gate oxide as a result of aging (Figure 3.4).



Figure 3.4 C-V measurement [54]

As observed in Patil's experiment [54] (Figure 3.5), the threshold voltage has negative temperature dependency for both aged and new devices. This is because the increase in temperature leads to a decrease in the band-gap of the silicon, which reduces the threshold voltage.



Figure 3.5 Threshold voltage variation with temperature [54]

#### 2) To detect the bond wire lift-off and chip metallization reconstruction

Optical microscope imaging can be used to check for cracks at the bond wire interface and the roughness of the chip metallization. The shear stress test is an alternative, commonly used approach to quantify the degradation condition between the bond wires and chip metallization. However, both approaches are off-line tests and always lead to device destruction. The forward voltage drop  $V_{CE(on)}$  /  $V_F$  or resistance  $R_{on}$  are traditionally used to detect bond wire or emitter metallization damage [92]. Their increases are mainly due to degradation of the bond wires and their interface to the metallized top side of the silicon. Chip metallization reconstruction leads to the reduction of the effective cross section of the metallization layer, which could also contribute to increased electrical resistance [46]. The aging failure criterion proposed for bond wire lift-off, which is generally accepted as the end-of-life criterion for IGBT modules, is a 5% increase of the on-state voltage  $V_{CE(on)}$  / forward voltage  $V_F$  with respect to its initial value at fixed conducting current and junction temperature [42, 63, 93-95]. The relative variations due to this damage are very low because the voltage across the connections constitutes only a small part of the total on-state voltage. Therefore, this measurement must be made with a very high degree of accuracy.
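The 5% end-of-life criterion can be expressed as a simple check. The sketch below is illustrative only: the function name and the voltage readings are hypothetical, and both readings are assumed to have been taken at the same conducting current and junction temperature, as the criterion requires.

```python
def bond_wire_eol_reached(v_ce_initial, v_ce_measured, limit=0.05):
    """Return True once V_CE(on) has risen by at least `limit`
    (5 % by default) relative to its initial value.

    Both readings are assumed to be taken at the same conducting
    current and junction temperature.
    """
    relative_increase = (v_ce_measured - v_ce_initial) / v_ce_initial
    return relative_increase >= limit


# Hypothetical readings in volts.
print(bond_wire_eol_reached(2.00, 2.06))  # 3 % increase
print(bond_wire_eol_reached(2.00, 2.12))  # 6 % increase
```

With an on-state voltage of about 2 V, the criterion corresponds to a shift of only 100 mV, which illustrates why the measurement accuracy is critical.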

#### 3) To detect the thermal path degradation

Degradation of the assembly integrity between the IGBT die and the baseplate, most frequently in the DCB substrate and solder layers, can hinder heat conduction and lead to the deterioration of the thermal path. In [96, 97], scanning acoustic microscope (SAM) images are used to evaluate the DCB solder layer condition and detect solder fatigue in terms of voids, cracks, and delaminations. SAM is an instrument which uses ultrasonic sound waves and the echoed signal to investigate, measure, or image an object, and it is based on the principle that propagation and reflection of acoustic waves change at interfaces where a change of acoustic impedance occurs. Cracks and voids usually lead to a stronger echo. By scanning over the desired area, the microscope converts the signal into a greyscale image in which voids and cracks appear brighter than the intact solder [43]. However, this method is an off-line test and is usually used in the laboratory to check DCB solder layer fatigue. Furthermore, it is difficult to quantify die-attach (chip solder) degradation. To achieve an in-situ assessment of the thermal path condition (mainly for die-attach and DCB solder), approaches relating to thermal characterization are widely used and thermal impedance is taken as the preferred precursor. It is nonintrusive, can be performed in situ and also has an advantage in evaluating the thermal condition of the assembly, which is of significance for thermal safety. However, in-situ measurement requires detailed considerations, which will be discussed in detail in Chapter 4.3.

#### 4) To detect other degradation

The level of partial discharge is indicative of insulation failure within the silicone gel, caused by environmental stresses such as high temperature and high moisture [98].

Since this thesis focuses on the two most significant failure modes due to thermal aging: bond wire lift-off and solder fatigue, forward voltage drop and thermal impedance are used as precursors and they are frequently measured to evaluate the health of power modules.

#### **3.3.2** Externally embedded sensors (redundant indicators)

Dedicated sensors embedded in the device can be used to monitor physical degradation directly. For example, mechanical stress can be monitored using strain gauges and bonding wire lift-off can be detected through localized electrical resistance measurement within a power module.



**Figure 3.6** Emitter bond wire lift-off detection for IGBTs in parallel [68] (a) Modified IGBT module; (b) Equivalent circuits without (left) and with (right) emitter bond lift-off.

Lehmann et al. presented the diagnostic functionality of bond wire lift-off in an operating power module which is realized using specific bond assemblies and integrated sub-circuits [68]. The additional sense pads, as shown in Figure 3.6, are monolithically integrated sense resistors and a sub-circuit for instrumentation. They are dedicated to detecting the occurrence of an emitter bond wire lift-off in both types of IGBT module structures: (a) single chip assembly with parallel bond wires and (b) parallel connected IGBT chips inside a module.

Another technique for early detection of bond wire lift-off in a power module is shown in Figure 3.7, which employs a resistor  $R_{EH}$  connected in parallel to the emitter bond wires to sense any increased voltage drop across the primary bond wires [99].



Figure 3.7 The embedded sensor for emitter bond wire monitoring [99]

An embedded sensor-based method to directly monitor the degradation of power electronics is reliable and accurate. However, embedded sensors are still purpose-built and limited to specific failure modes. They require additional sensors and monitoring circuits which increase the complexity and the cost of the power circuitry and packaging. In addition, the modifications made to the assembly and the direct copper bonding (DCB) layout may limit their applicability. In practice, their applicability needs to be carefully evaluated by taking many factors into account, such as physical dimensions, power dissipation, cost, and system reliability.

#### **3.4 Model Driven Methods (PoF Models)**

The two methods described in Chapters 3.2 and 3.3 (fuses or canaries and prognostic degradation parameters) are able to identify failures and to warn of potential failures in the near future, but neither method can precisely estimate the RUL. The model driven method is widely used in RUL estimation for information processing and electronics-rich systems [100]. It is also referred to as the lifetime prognostic modelling method due to its potential for deducing the future lifetime based on the upcoming mission profile.

In this method, damage accumulation is estimated using a PoF model. Specific data gathered from power electronic modules online is inputted to a damage model to identify firstly how much damage has taken place, and then the current state of health. Therefore, prognosis of the RUL can be carried out under certain operating conditions (stresses).

The PoF model emphasizes understanding of the physical processes and failure mechanisms. The product's physical processes are ascribed to stresses due to mechanical, electrical, thermal, chemical and radiant causes [45]. PoF models for evaluating the combined effects of humidity and pressure are reported in the study of Abbad et al. [101]; models considering mechanical effects are well documented in the work of Jie et al. [102] and models of electrical stress accumulation are considered by Larcher et al. [103]. They are not repeated here as this thesis is concerned with thermomechanical stress and wear-out failures, which are the most important area of concern. In the PoF modelling approach, the first step is generally to carry out FMMEA to identify the failure modes and mechanisms. The PoF model-based method is widely applied to evaluate the fatigue/wear-out failure mechanisms of power modules and some of the frequently used models are listed in Table 3.1.

**Table 3.1** Comparative overview of failure mechanisms, relevant loading conditions and models in power modules

| Failure Mechanisms | Failure Sites | Relevant Loads | Lifetime Models |
|--------------------|---------------|----------------|-----------------|
| Bond wire fatigue<br>Solder joint fatigue<br>Ceramic cracks | Bond pads and bond wires<br>Die-attach, DCB solder, terminal lead<br>DCB substrate | $\Delta T$, $T_m$, $dT/dt$, dwell time, $\Delta H$ | Coffin-Manson [42, 64, 93], Norris-Landzberg [104, 105], or Bayerer [106] |
| Time-dependent dielectric breakdown | Dielectric layers | V, T | Arrhenius (Fowler-Nordheim) [107] |
| Electromigration | Metallizations | T, J | Eyring (Black) |

V: Voltage; T: Temperature; T<sub>m</sub>: Mean temperature; J: Current density; H: Humidity

Mathematical representations are used to correlate the physical changes of the product and both operating and environmental parameter variations. The development of these models requires detailed knowledge of the underlying physical processes that lead to failures. Data from the product when it is subjected to field operating and environmental conditions can be used for lifetime model parameterization. However, IGBT modules have a relatively long lifetime under normal use conditions; therefore, accelerated life testing (ALT) is used. ALT on IGBT power modules is performed with the help of thermal cycling and power cycling [42]. Based on results from quantitative samples, PoF models can be established for specified failure modes at each failure site as a function of loading profile, geometry, material properties of the product, etc. [100].

The commonly used PoF models for RUL estimation are normally modelled for specific degradation mechanisms only and are based on the assumption that a certain failure site is the weakest part of the module assembly. To develop a universal PoF model including all failure modes is very complex and even unrealistic. The weakest parts of a power module in terms of wear-out/fatigue are the solder joints and the bond wire interface, and both are related to material fatigue. Much research has been conducted and an overview of some well-known analytical lifetime models, aiming to predict bond wire interconnect fatigue and solder layer fatigue, is shown in Table 3.2. They are compared according to the different parameters and variables considered in these models. Many analytical models have been developed in the literature and these include the Coffin-Manson model [42, 64, 93], the Norris-Landzberg model [104, 105], and Bayerer's model [106].

| Analytical PoF model         | Model parameters                                          | Model variables                                 |
|------------------------------|-----------------------------------------------------------|-------------------------------------------------|
| Coffin-Manson model          | $a, n$                                                    | $\Delta T_j$                                    |
| Modified Coffin-Manson model | $a, n, E_a$                                               | $\Delta T_j, T_m$                               |
| Norris-Landzberg model       | $a, n_1, n_2, E_a$                                        | $\Delta T_j, T_m, f$                            |
| Bayerer's model              | $K, \beta_1, \beta_2, \beta_3, \beta_4, \beta_5, \beta_6$ | $\Delta T_j, T_{j-max}, t_{on}, I, V, D$        |

**Table 3.2** Comparative overview of the analytical lifetime models for IGBTs

A simple model is the Coffin-Manson method, allowing calculation of the mean number of cycles to failure  $N_f$  due to thermomechanical cyclic stress. The empirical formula for thermomechanical fatigue is shown in Equation 3.2 which only takes temperature swing into account.

$$N_f = a \cdot \left(\Delta T_j\right)^{-n} \tag{3.2}$$

where a and n are numerical constants which are determined empirically from curve fittings of cycle-to-failure versus  $\Delta T_j$  plot. The parameter n is material dependent and has been noted to range from 1 to 3 for solder alloy fatigue and from 3 to 5 for a number of metal alloys [108].  $\Delta T_j$  is the junction temperature swing.
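The curve fitting mentioned above can be sketched as a straight-line fit in log-log space, since Equation 3.2 gives ln N_f = ln a − n · ln ΔT_j. The following Python example is a minimal illustration; the function names and the constants a = 10¹² and n = 4 used to generate the synthetic data are hypothetical and do not come from any test in this thesis.

```python
import math


def coffin_manson_cycles(delta_tj, a, n):
    """Equation 3.2: N_f = a * (delta_Tj)^(-n)."""
    return a * delta_tj ** (-n)


def fit_coffin_manson(delta_tj_values, nf_values):
    """Estimate a and n from cycles-to-failure data by least squares on
    ln(N_f) = ln(a) - n * ln(delta_Tj)."""
    x = [math.log(dt) for dt in delta_tj_values]
    y = [math.log(nf) for nf in nf_values]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope


# Synthetic, noise-free data generated from hypothetical constants.
swings = [40.0, 60.0, 80.0, 100.0]  # junction temperature swings in K
cycles = [coffin_manson_cycles(dt, a=1.0e12, n=4.0) for dt in swings]
a_fit, n_fit = fit_coffin_manson(swings, cycles)
print(f"a = {a_fit:.3e}, n = {n_fit:.3f}")
```

With measured data the points scatter around the fitted line, so the recovered constants would carry an uncertainty that this noise-free sketch does not show.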

The above Coffin-Manson equation was improved by taking into consideration the mean temperature, since the parameter  $\Delta T_j$  alone is not sufficient to describe the characteristics of failures in power cycling tests. A large sample of devices were tested across a wide range of cycle conditions in the LESIT project [42], as shown in Figure 3.8, and a modified equation was developed:

$$N_f = a \cdot \left(\Delta T_j\right)^{-n} \cdot e^{E_a / (k_B \cdot T_m)}$$
(3.3)

where  $k_B$  is the Boltzmann constant and  $T_m$  is the mean temperature, which is taken into consideration by means of an Arrhenius term. The remaining three parameters, a, the exponent n and the activation energy  $E_a$ , can be determined from experimental results [42].



**Figure 3.8** Dependence of the number of cycles to failure (N<sub>f</sub>) on the mean temperature (T<sub>m</sub>) and amplitude ( $\Delta$ T<sub>j</sub>) of the temperature cycling [42]

The Norris-Landzberg model is derived from the improved Coffin-Manson model given in Equation 3.3 and it additionally includes the frequency f of the temperature cycles, as given in Equation 3.4:

$$N_{f} = a \cdot f^{-n_{2}} \left( \Delta T_{j} \right)^{-n_{1}} \cdot e^{E_{a} / (k_{B} \cdot T_{m})}$$
(3.4)
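Equations 3.3 and 3.4 can be evaluated with one small function, because Equation 3.4 reduces to Equation 3.3 when the frequency exponent is zero. The sketch below assumes that the activation energy is given in eV and the mean temperature in kelvin; the function name and all parameter values are hypothetical placeholders, not fitted constants from [42] or [104, 105].

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def norris_landzberg_cycles(delta_tj, t_mean_kelvin, freq, a, n1, n2, e_a_ev):
    """Equation 3.4: N_f = a * f^(-n2) * (delta_Tj)^(-n1) * exp(E_a / (k_B * T_m)).

    Setting n2 = 0 removes the frequency dependence and gives Equation 3.3.
    """
    arrhenius = math.exp(e_a_ev / (K_B_EV * t_mean_kelvin))
    return a * freq ** (-n2) * delta_tj ** (-n1) * arrhenius


# Hypothetical placeholder parameters, chosen only to exercise the formula.
params = dict(a=1.0e8, n1=5.0, n2=0.0, e_a_ev=0.6)
cool = norris_landzberg_cycles(40.0, 333.0, 0.1, **params)
hot = norris_landzberg_cycles(40.0, 373.0, 0.1, **params)
print(f"N_f at T_m = 333 K: {cool:.3e}")
print(f"N_f at T_m = 373 K: {hot:.3e}")
```

For the same temperature swing, the Arrhenius term predicts fewer cycles to failure at the higher mean temperature, which is the dependence shown in Figure 3.8.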

Another recently proposed multi-parameter model, also derived from the Coffin-Manson model, is the Bayerer model. It is the most comprehensive analytical model as it has the highest number of model parameters and variables. The influence of the numerous variables is extracted from both the power cycling tests and the power module properties, which are junction temperature swing  $\Delta T_j$ , the maximum junction temperature  $T_{j-max}$ , the heating time  $t_{on}$ , the amplitude of applied DC current I, the diameter D of the bond wires and the blocking voltage V.

$$N_{f} = K \cdot (\Delta T_{j})^{\beta_{1}} \cdot e^{\beta_{2}/(T_{j-\max} + 273K)} \cdot t_{on}^{\beta_{3}} \cdot I^{\beta_{4}} \cdot V^{\beta_{5}} \cdot D^{\beta_{6}}$$
(3.5)

The parameters K and  $\beta_1 \sim \beta_6$  are extracted from a large data set collected in long-term reliability testing experiments.



Figure 3.9 Flow diagram of IGBT power module lifetime consumption estimation

The health monitoring and lifetime consumption of IGBT modules based on model driven methods can be carried out as shown in the flow diagram in Figure 3.9. Given that the appropriate PoF models have been established, the PHM for an IGBT power module is carried out in four steps:

Firstly, under the mission profile, the power losses of each semiconductor device within the module can be determined based on acquired device current and temperature.

Secondly, a compact thermal model is constructed to investigate the relationship between the power dissipation and the temperature. This compact thermal model can be used to calculate junction temperature and temperature at other layers of interest in real time [109].
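One common form of compact thermal model is a Foster-type network of parallel RC pairs, and the sketch below assumes that form; the text above does not specify which network is used. Each pair is updated with the exact step response for piecewise-constant power. The function name, thermal resistances, time constants, power level and reference temperature are hypothetical values.

```python
import math


def simulate_junction_temperature(power_w, dt_s, r_th, tau, t_ref_c):
    """Step a Foster-type RC network through a power loss profile.

    power_w : power loss samples in W, held constant over each step
    dt_s    : time step in s
    r_th    : thermal resistance of each RC pair in K/W
    tau     : time constant of each RC pair in s
    t_ref_c : reference (e.g. case or coolant) temperature in deg C
    Returns the junction temperature at the end of every step.
    """
    decay = [math.exp(-dt_s / tau_i) for tau_i in tau]
    rise = [0.0] * len(r_th)  # temperature rise across each RC pair
    t_junction = []
    for p in power_w:
        rise = [d * x + r * p * (1.0 - d) for d, x, r in zip(decay, rise, r_th)]
        t_junction.append(t_ref_c + sum(rise))
    return t_junction


# Hypothetical two-pair network driven by a constant 100 W loss for 10 s.
r_th = [0.05, 0.10]  # K/W
tau = [0.01, 0.5]    # s
tj = simulate_junction_temperature([100.0] * 1000, 0.01, r_th, tau, t_ref_c=60.0)
print(f"T_j after first step: {tj[0]:.2f} degC, after 10 s: {tj[-1]:.2f} degC")
```

Because the update is only a few multiplications per RC pair, a model of this kind is light enough to run in real time alongside the loss calculation of the first step.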

Thirdly, a temperature cycle counting algorithm is used to calculate the number of cycles effectively imposed by the random temperature-time data profile. Differing from the temperature profile used in PoF models, the temperature profiles estimated from the compact thermal model consist of temperature cycles with different amplitude and mean values. A widely used method is the rain flow algorithm [110]. It starts by reducing the temperature values in the time history to a sequence of peaks and valleys (maximum and minimum values). Then the peaks and valleys are analyzed in turn, with a rain flow analogy applied to a rotated copy of the history to pick out the temperature cycles, as shown in Figure 3.10. Hence, the number of observed cycles with a combination of temperature mean and range is obtained through the rain flow process.



**Figure 3.10** Example of rainflow cycle counting [110]

Finally, the lifetime models for both the bond wire and the solder joint wear-out mechanisms are applied to give degradation indications based on linear damage assumptions. The number of cycles obtained from the cycle counting algorithm above is used as the input, and the accumulated damage in the power module due to its thermal history is estimated [111].
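The third and fourth steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used in [110] or [111]: the cycle counter uses the three-point rainflow rule commonly associated with ASTM E1049, and the lifetime model is a plain Coffin-Manson power law whose constants `a` and `n` are hypothetical placeholders. Damage is accumulated with the linear (Miner) rule.

```python
def turning_points(series):
    """Reduce a temperature history to its sequence of peaks and valleys."""
    pts = [series[0]]
    for v in series[1:]:
        if v == pts[-1]:
            continue  # flat segment carries no new information
        if len(pts) >= 2 and (pts[-1] - pts[-2]) * (v - pts[-1]) > 0:
            pts[-1] = v  # same direction: extend the current excursion
        else:
            pts.append(v)  # direction reversed: keep the previous extreme
    return pts

def rainflow(series):
    """Return (range, mean, count) tuples; count is 1.0 (full) or 0.5 (half cycle)."""
    stack, cycles = [], []
    for p in turning_points(series):
        stack.append(p)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])  # most recent range
            y = abs(stack[-2] - stack[-3])  # previous range
            if x < y:
                break
            mean = 0.5 * (stack[-2] + stack[-3])
            if len(stack) == 3:
                cycles.append((y, mean, 0.5))  # range contains the start point
                stack.pop(0)
            else:
                cycles.append((y, mean, 1.0))  # closed loop: one full cycle
                del stack[-3:-1]
    for a, b in zip(stack, stack[1:]):
        cycles.append((abs(b - a), 0.5 * (a + b), 0.5))  # leftover half cycles
    return cycles

def cycles_to_failure(delta_t, a=1.0e12, n=5.0):
    """Hypothetical Coffin-Manson law N_f = a * dT**(-n); a and n are placeholders."""
    return a * delta_t ** (-n)

def miner_damage(cycles):
    """Linear damage accumulation: sum of n_i / N_f,i over all counted cycles."""
    return sum(count / cycles_to_failure(rng) for rng, _mean, count in cycles if rng > 0)

history = [-2, 1, -3, 5, -1, 3, -4, 4, -2]  # made-up peak/valley sequence
counted = rainflow(history)
damage = miner_damage(counted)
print(counted, damage)
```

A damage value of 1 corresponds to the predicted end of life under the linear damage assumption. The mean temperature of each cycle is returned but not used here; a lifetime model with a mean-temperature term could consume it in the same way.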

Although the model driven method allows online lifetime consumption to be calculated by combining PoF models, compact thermal models and the rain flow process, there are still no published reliability test results that validate this method. On the contrary, a few anomalous phenomena have been reported by Yang [112] and Kovačević [65], both of which indicate inaccuracies in the PoF models. According to Kovačević [65], none of the above analytical models comply exactly with the experimental results for high power IGBT modules. This is because the PoF models are normally built from experimental results which use specific life-cycle loading conditions, design details of the component, and material properties as input. However, these parameters are not always available and they vary from product to product. Another limitation of the PoF models is that they are empirical models normally built on a single failure mode. All of the above PoF models share the same difficulty: the interaction between individual failure mechanisms is not known with certainty and the mutual effects between different failure modes are not included. Since the power module wear-out/fatigue process is complex, with multiple physical processes, it is difficult to create a universal model applicable to various failure mechanisms [113]. Yet another source of inaccuracy for the model driven method is the online estimation of the temperature of interest. Since thermal performance degrades due to thermal fatigue, a compact thermal model that is updated in real time is required for accurate temperature estimation. In other words, a dynamic thermal model representing the multiple physical degradation processes occurring in the power assembly is required.

#### 3.5 A Fusion Prognostic Method

A fusion prognostic method (see Figure 3.11) is newly proposed. It essentially combines the PoF model-based and data-driven approaches to take advantage of the strengths of both methods while overcoming their limitations. Assessment of a system's health is carried out in real time using in-situ data, which is used for both anomaly detection and lifetime consumption estimation. Although this method is in its infancy, it is gaining ground very rapidly. It gives rise to the hope that one can reliably detect failure precursors in semiconductor devices and use them in an intelligent prediction framework to derive RUL estimates. The fusion prognostics method can be implemented in several steps.

Firstly, the precursor parameters to be monitored are identified. In general these parameters can consist of any available variables, including operational and environmental loads, as well as performance parameters [114]. FMMEA, virtual simulations and field data from maintenance records can be used to identify these precursor parameters.

Secondly, the chosen parameter is continuously monitored with appropriate sensing technology in real time. The healthy (or faulty or both) baseline of the chosen precursor parameters used for the data driven method is created for anomaly detection. Meanwhile, the failure criterion for the model based method needs to be determined. All these baselines and the failure criterion are prerequisites and misclassification leads to problems such as false indications of faults (false alarms) or failure to predict faults (missed alarms).

Thirdly, the chosen parameters are continuously monitored in real time and impending failures (anomalies) are prognosed with both methods. Anomalies are detected either when the precursor parameter deviation is beyond the failure threshold or the accumulated damage is beyond the failure criterion.

Fourthly, parameter isolation is used to identify the parameters that contribute significantly to failures. In the data driven method, the parameter isolation step helps to determine the precursor parameters most relevant to the type of failure the power module is undergoing. The model driven method (PoF method), which uses the isolated parameter as the primary input, is selected in this step.

Fifthly, parameter trending is a process of predicting the behaviour of parameters in the future, based on current and historical trends. The relationship between the time and the trending parameters should be estimated by the regression method. Based on this relationship, the value of parameters in the future can be predicted.

Finally, a calculation of the RUL for the system based on the combination of anomaly detection, parameter isolation, PoF based models, and data-driven techniques can be made. Alarms can be set off to warn the system operator of impending failure based on the value of the RUL reported.
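The parameter trending and RUL steps above can be illustrated with a small Python sketch. It assumes, purely for illustration, that the precursor parameter drifts linearly with time and that failure is declared when the fitted trend crosses a fixed threshold; the data values and the 5 % threshold are hypothetical, and a real implementation could use any regression model that suits the observed trend.

```python
def fit_line(t, y):
    """Ordinary least-squares fit y = slope * t + intercept."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean

def remaining_useful_life(t, y, threshold):
    """Extrapolate the fitted trend to the threshold; None if there is no upward trend."""
    slope, intercept = fit_line(t, y)
    if slope <= 0:
        return None
    t_fail = (threshold - intercept) / slope
    return max(0.0, t_fail - t[-1])

# Hypothetical monitoring record: operating hours and a precursor value in volts.
hours = [0.0, 100.0, 200.0, 300.0, 400.0]
precursor = [2.000, 2.005, 2.010, 2.015, 2.020]
threshold = 1.05 * precursor[0]  # assumed failure criterion: +5 % over the baseline

rul_hours = remaining_useful_life(hours, precursor, threshold)
print(rul_hours)
```

In a fusion scheme this data-driven estimate would be compared or combined with the damage-based estimate from the PoF model, and an alarm raised when either indicates impending failure.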



Figure 3.11 Fusion prognostics approach

## **CHAPTER 4**

### **MEASUREMENT TECHNIQUES**

It has been previously mentioned that bond wire lift-off and solder fatigue are the two most frequently observed failure modes for power IGBT modules. To detect IGBT bond wire lift-off, the on-state voltage drop at specified conditions (e.g. collector current, junction temperature and gate-emitter voltage) can be used as the failure indicator. To detect solder fatigue, the junction-to-case thermal impedance can be used as a failure indicator. The literature review has revealed that most research in both cases has been conducted in laboratory studies, and only limited work has addressed in-situ application. In this chapter, an overview of relevant measurement techniques is presented. It also considers the problems associated with in-situ measurement and the appropriate methods to use, which are primarily aimed at power train connected inverter systems. This sets the background for measurement in the inverter systems of EV applications, although the problems addressed and the techniques proposed are equally applicable to many other power inverter applications.

#### 4.1 Noise

Noise is an unwanted signal in any electrical system and can originate from a large number of sources. Intrinsic and extrinsic noises are the two fundamental types of noise [115]. The proposed in-situ health monitoring system can operate effectively only when measurement circuits are carefully designed in order to minimize noise levels. Therefore, this section lists the most common noise sources in the proposed data acquisition system, which helps to design an in-situ measurement that deals with them and achieves high accuracy.


#### 1) Thermal noise

The theoretical limit of resolution of any voltage or current measurement is determined by thermal noise, sometimes referred to as Johnson-Nyquist noise, which is associated with the motion of electrons due to their thermal energy at temperatures above absolute zero [116]. Thermal noise is generated by all the resistors present in the circuit and its RMS value can be calculated using Nyquist's relation:

$$V_n = \sqrt{4k_B TBR} \tag{4.1}$$

$$I_n = \sqrt{\frac{4k_B TB}{R}} \tag{4.2}$$

In (4.1) and (4.2), $V_n$ is the Johnson RMS voltage noise; $I_n$ is the Johnson RMS current noise; $k_B$ is Boltzmann's constant; T is the absolute temperature; B is the noise bandwidth and R is the resistance.

The thermal noise level is the limiting minimum noise any circuit can attain at a given temperature and all resistors in the circuit have a thermal noise, which can be further amplified by the gain when they are used as the input resistor of an op amp gain circuit [115]. In the proposed measurement circuit of this thesis, the largest value of resistance is the current limiting resistor of  $100k\Omega$ . At room temperature  $25^{\circ}C$  (298K) with 100kHz bandwidth, noise will be added to the circuit as below

$$V_n = \sqrt{4k_B TBR} = \sqrt{4 \times 1.38 \times 10^{-23} \times 298 \times 100 \times 10^3 \times 100 \times 10^3} = 12.8 \mu V$$

Whilst thermal noise of this level is negligible in this study, it may be reduced by scaling down the resistors and by decreasing the bandwidth of the measurement.
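The calculation above, and the effect of scaling down the resistance and bandwidth, can be checked with a short Python sketch of Equations (4.1) and (4.2). The second operating point (10 kΩ, 10 kHz) is a hypothetical example and not a value used in the proposed circuit.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K

def johnson_voltage_noise(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS thermal noise voltage, Eq. (4.1)."""
    return math.sqrt(4.0 * K_B * temperature_k * bandwidth_hz * resistance_ohm)

def johnson_current_noise(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS thermal noise current, Eq. (4.2)."""
    return math.sqrt(4.0 * K_B * temperature_k * bandwidth_hz / resistance_ohm)

# Values from the text: 100 kOhm, 298 K, 100 kHz.
v_n = johnson_voltage_noise(100e3, 298.0, 100e3)

# Hypothetical alternative: resistance and bandwidth both reduced tenfold.
v_n_reduced = johnson_voltage_noise(10e3, 298.0, 10e3)

print(f"{v_n * 1e6:.1f} uV, {v_n_reduced * 1e6:.2f} uV")
```

Because the noise voltage scales with the square root of both R and B, reducing each by a factor of ten lowers the RMS noise by a factor of ten overall.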

#### 2) Flicker noise

Flicker noise is also sometimes called 1/f noise and is present in all active and many passive devices [115]. The noise amplitude increases as the frequency decreases. Its origin is still an unsolved problem in physics. In relation with resistors, it is often referred to as "excess current noise", which appears in addition to the thermal noise when current passes through the resistor. Wire-wound resistors have the least flicker noise, while metal film resistors have somewhat greater flicker noise, and carbon composition resistors show significant flicker noise. It can generally be neglected in our experiment.

#### 3) Radiated noise

Another source of noise is radiated noise [117]. Commonly, this noise occurs because of capacitive and/or inductive coupling of signals, for example from a poor layout of tracks on a PCB. The capacitive/inductive coupling operates as an emitter radiating noise to other tracks or components. Radiated noise also includes noise generated by radio stations and other electromagnetic pulse generators. It is therefore desirable to add capacitive couplings between tracks or to shield the inverter, for example with a metal cage. Parasitic inductive coupling is generated by loop and stray inductances. These inductances can be reduced by shielding cables or by placing tracks in close proximity. In addition, power tracks and signal tracks should be laid out at a 90 degree angle to each other. Furthermore, external filters such as inductive chokes or resistor-capacitor (R/C) filters can be added to reduce radiated noise effectively. In the proposed measurement circuit, high power cables are shielded with one end star connected to the earth. The data acquisition device is powered by cables with inductive chokes.

#### 4) Offset voltage

As shown in equation 4.3, any measurement will add or subtract an error, called offset voltage ( $V_{OFFSET}$ ), from the "real" signal source voltage ( $V_S$ ), so that the reading voltage ( $V_M$ ) is not the exact "real voltage" (the same applies to current measurements)[118].

$$V_M = V_S \pm V_{OFFSET} \tag{4.3}$$

The offset voltage can normally be "nulled out" by zero offset adjustments. However, the voltage offset often drifts as a function of the temperature and the age of the instrument. Good measurement practice therefore dictates periodic checking and regular zero offset adjustment.

Thermal electromotive force (EMF), or the Seebeck voltage, is a frequently observed source of offset drift due to varying ambient temperatures. It is generated when conductors made of dissimilar metals are joined together and different parts of a circuit are at different temperatures. This causes such connections to act as thermocouple pairs. The phenomenon is known as the thermoelectric effect, or Seebeck effect. The voltage generated depends on the ambient temperature and the metals' Seebeck coefficient (Table 4.1).

To minimize the voltage offset, the same kind of material (copper) is used to a large extent for our proposed circuitry, such as PCB tracks, relay power terminals and cables. Special attention is given to the electric path of the analog measurement circuit, which should be kept clean and free of copper oxides to avoid an increase in the Seebeck coefficient. Meanwhile, the temperature gradient throughout the test circuit is minimized by placing all junctions in close proximity to one another and keeping measurement circuits at room temperature. Taking one step further, the thermoelectric EMF could be further reduced by shielding the circuit from heat sources in the final design.

| Paired materials* | Seebeck coefficient, $Q_{AB}$ |
|-------------------|-------------------------------|
| Cu – Cu           | ≤ 0.2 µV/°C                   |
| Cu – Ag           | 0.3 µV/°C                     |
| Cu – Au           | 0.3 µV/°C                     |
| Cu – Pb/Sn        | 1–3 µV/°C                     |
| Cu – Si           | 400 µV/°C                     |
| Cu – Kovar        | 40–75 µV/°C                   |
| Cu – CuO          | 1000 µV/°C                    |

Table 4.1 Seebeck coefficients [118]

\* Ag = silver; Au = gold; Cu = copper; CuO = copper oxide; Pb = lead; Si = silicon; Sn = tin
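To give a feel for the magnitudes involved, the following Python sketch estimates the thermoelectric offset as the Seebeck coefficient multiplied by the temperature difference between two junctions. This first-order relation and the choice of the upper bound of each range in Table 4.1 are assumptions made for illustration only.

```python
# Upper-bound Seebeck coefficients from Table 4.1, in microvolts per degree Celsius.
SEEBECK_UV_PER_C = {
    "Cu-Cu": 0.2,
    "Cu-Ag": 0.3,
    "Cu-Au": 0.3,
    "Cu-Pb/Sn": 3.0,
    "Cu-Si": 400.0,
    "Cu-Kovar": 75.0,
    "Cu-CuO": 1000.0,
}

def thermal_emf_uv(pair, delta_t_c):
    """First-order thermoelectric offset in microvolts for a junction pair."""
    return SEEBECK_UV_PER_C[pair] * delta_t_c

# Offset for a 2 degC gradient across clean and oxidised copper connections.
clean_uv = thermal_emf_uv("Cu-Cu", 2.0)
oxidised_uv = thermal_emf_uv("Cu-CuO", 2.0)
print(clean_uv, oxidised_uv)
```

Under these assumptions an oxidised copper joint produces an offset in the millivolt range for a gradient of only a few degrees, which is why clean copper-to-copper connections and small temperature gradients are both emphasised.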

#### 5) Ground loop

Ground loop is another common source of noise and errors [119]. Figure 4.1 shows an example of a differential thermocouple measurement. Both ground references of the signal source and the measurement device are supposed to be at the same potential, but they are actually at different potentials due to non-ideal ground conductors. This unexpected ground loop interference voltage can cause the current to circulate and lead to significant error in the measurement. In addition, the induced interference current can couple voltages into nearby wires as well.



**Figure 4.1** A differential thermocouple measurement with a grounded signal source can create a ground loop.

Using isolated hardware with differential input can eliminate the noise caused by a path between the ground of the signal source and that of the measurement device, thereby preventing any current from flowing between multiple ground points (Figure 4.2). In the proposed measurement circuit, ground loops are addressed by selecting hardware with digital isolation.



**Figure 4.2** Isolation eliminates ground loops by separating the earth ground from the amplifier ground reference.

#### 4.2 High Common Mode Voltage Measurement Techniques

To detect IGBT bond wire lift-off and solder fatigue, the associated failure prognostic parameters are to be monitored under specified conditions (e.g. collector current, junction temperature etc.). Research has been widely performed for both failure modes in laboratories. However, only a little research has been done to apply the mature laboratory techniques to real applications, and this has had limited success and insufficiently detailed methodologies [120, 121].

The deceptively simple appearance of laboratory measurement circuitry masks some potential problems in real applications. High common mode voltage (CMV), generated by the DC link voltage, is one of them and is a long-standing issue. This is because the top switches in a power converter are usually subject to a high common-mode voltage (e.g. 400 V in EV applications). When measuring the on-state voltage, the current shunt voltage or TSEPs for the top switches, a high common mode voltage is inevitable. Such a high voltage is beyond the voltage breakdown capabilities of most practical semiconductor components, particularly if accurate measurement is required. To accurately extract the voltages concerned while rejecting common mode voltages, various approaches are compared below, and a measurement technique with digital isolation is proposed for in-situ health monitoring for the first time.

The high CMV can in principle be handled by resistor networks acting as voltage dividers. This is cost-effective and simple to build. However, this approach is seriously flawed by mismatch or variation in the resistor network, which can degrade the common mode rejection (CMR). Moreover, CMR is degraded by the loading effect of the measurement source, particularly if a high resistance source exists. Additionally, the resistors drain current from the measurement source, which is an unacceptable disadvantage. This approach also lacks isolation between the electronic controller and the power circuit, which can raise a potential safety problem. Recently, some high common-mode voltage difference amplifiers, e.g. the AD629 [122] (Figure 4.3) and INA117 [123], have been developed; however, their measurement accuracy and common-mode voltage range need to be improved for wider applications.
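The sensitivity of the divider approach to resistor mismatch can be illustrated with a short Python calculation. The sketch assumes two ideal, unloaded dividers (one per measurement lead) feeding a differential input; the resistor values and the 0.1 % mismatch are hypothetical, while the 400 V common-mode level is the EV example quoted above.

```python
def divider_ratio(r_top, r_bottom):
    """Attenuation of an unloaded resistive divider."""
    return r_bottom / (r_top + r_bottom)

def common_mode_error(v_cm, r_top_a, r_bot_a, r_top_b, r_bot_b):
    """Differential error produced when both leads sit at the same voltage v_cm."""
    return v_cm * (divider_ratio(r_top_a, r_bot_a) - divider_ratio(r_top_b, r_bot_b))

V_CM = 400.0   # common-mode voltage in volts
R_TOP = 1.0e6  # hypothetical nominal top resistor in ohms
R_BOT = 10.0e3 # hypothetical nominal bottom resistor in ohms

# One top resistor is 0.1 % higher than the other.
error_v = common_mode_error(V_CM, R_TOP, R_BOT, R_TOP * 1.001, R_BOT)

# Refer the error back to the undivided input to compare it with the measured signal.
input_referred_v = error_v / divider_ratio(R_TOP, R_BOT)
print(error_v, input_referred_v)
```

Under these assumptions a 0.1 % mismatch turns the common-mode voltage into an input-referred error of several hundred millivolts, which is comparable with the on-state voltage changes that the measurement is meant to resolve.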



Figure 4.3 AD629B high common-mode voltage amplifier [122]

To enhance system safety and break the ground loops, isolation can be achieved through the use of isolation amplifiers, isolation ADCs or digital isolators (Figure 4.4). Amplifiers with an internal isolation barrier at the analogue front end can protect the ADC and the subsequent circuitry from high common-mode voltage and voltage spikes. However, analogue isolation amplifiers add errors caused by nonlinearity and offsets. Moreover, they are costly and can suffer from long settling times (>10 µs). The same statements apply to isolation ADC devices. Another method is to convert the voltage into digital information that can be transported to the controller through digital isolators. Digital isolators have lower cost and higher data transfer speed, and for those reasons they have been widely used.



**Figure 4.4** Isolated data acquisition system (a) analogue isolation amplifier; (b) isolation ADC; (c) digital isolation

The most common isolator is the opto-coupler [124]. Opto-couplers, based on optical coupling principles, are one of the earliest methods used for digital isolation. They can withstand high voltages and offer high immunity to electrical and magnetic noise. However, they suffer from limited speed, high power dissipation in high-speed analogue measurements, and LED wear-out. Texas Instruments offers digital isolation components based on capacitive coupling. These isolators provide high data transfer rates and high transient immunity [125]. The *i*Coupler technology, introduced by Analog Devices, Inc. in 2001, uses inductive coupling to offer digital isolation for high-speed and high-channel-count applications. The *i*Coupler technology offers additional benefits such as small size and low power consumption. *i*Coupler products consume one-tenth to one-sixth of the power of opto-couplers at comparable signal data rates [126].

#### 4.3 Semiconductor Junction Temperature Acquisition

The semiconductor junction temperature is one of the most critical parameters due to the fact that the static and dynamic electrical performances, service lifetime, and reliability of semiconductor devices are strongly dependent on the junction temperature profile [127]. In this section, a selection of different temperature measurement techniques is introduced, which can be divided into thermal model based estimation, direct measurement and indirect measurement. Although there are many more reported in the literature, only those techniques that can be used for in-situ measurements have been considered.

#### 4.3.1 Thermal model based estimation

In both the design and application phases, the operating junction temperature must be guaranteed not to exceed the maximum ratings under all conditions. Calculating the junction temperature in order to determine the SOA is an integral step in the converter design phase. Based on the thermal model method, some semiconductor manufacturers provide online simulation software for this purpose [128, 129]. Recently, there has been a great deal of development in real time calculation of the operating junction temperature at working conditions, using information on power losses, thermal models of the packaging and the temperatures of reference points (e.g. heat sink or baseplate). Therefore, historical thermal data in standard converter operation conditions can be estimated and logged. The power losses can be calculated based on operation conditions, while thermal models based on thermal impedance analysis can be extracted from experimental measurements or from three dimensional (3-D) thermal simulations, both of which will be further discussed in the subsequent sections.
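A minimal Python sketch of such a real time calculation is given below. It assumes a Foster-type RC network as the compact thermal model, with the junction temperature obtained as the reference temperature plus the sum of the temperature rises of the individual RC pairs. The class name `FosterModel` and all resistance and time-constant values are hypothetical and are not taken from any datasheet.

```python
import math

class FosterModel:
    """Foster network: parallel RC pairs in series, each with R_i (K/W) and tau_i (s)."""

    def __init__(self, r_th, tau):
        self.r_th = list(r_th)
        self.tau = list(tau)
        self.rise = [0.0] * len(self.r_th)  # temperature rise of each RC pair in K

    def step(self, power_w, t_ref_c, dt_s):
        """Advance by dt_s with constant power over the step; return T_j in degC."""
        for i, (r, tau) in enumerate(zip(self.r_th, self.tau)):
            decay = math.exp(-dt_s / tau)
            # Exact solution of the first-order RC response over one time step.
            self.rise[i] = self.rise[i] * decay + power_w * r * (1.0 - decay)
        return t_ref_c + sum(self.rise)

# Hypothetical three-pair model and operating point.
model = FosterModel(r_th=[0.02, 0.05, 0.03], tau=[0.001, 0.05, 1.0])
t_j = 0.0
for _ in range(2000):  # 20 s of operation at 10 ms steps
    t_j = model.step(power_w=100.0, t_ref_c=40.0, dt_s=0.01)
print(t_j)
```

Once the transient has settled, the result approaches the reference temperature plus the power multiplied by the total thermal resistance, in agreement with the steady-state definition. In an online implementation the power and reference temperature would be updated at every step from the loss estimate and the temperature sensor.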

In conventional power modules, the heat sink is usually selected as the reference point and its temperature is measured with sensors mounted on the heat sink. In this case, the effective thermal impedance $Z_{thjh}$ will vary depending on the module mounting conditions, the distance between the sensor and each individual chip, and the thermal properties of the heat sink. Therefore, to estimate the junction temperature accurately, the case temperature is used as an alternative reference and is normally measured with integrated thermistors (e.g. Semikron SEMiX). By using a superior heat sink that maintains a constant temperature, a constant reference temperature can be assumed. For thermal characterization, where the case and heat sink temperatures are not constant, the reference temperature should be directly measured and included in the calculation.

The thermal model based estimation method is theoretically feasible and many real time junction temperature measurement methods have been presented in recent literature [127, 130, 131]. However, the accuracy may be questionable in the following regard:

Firstly, even if the simultaneous measurement of $T_i$ is trivial, the precise junction temperature calculation depends on the accurate measurement of power loss and reference point temperature, and on a realistic thermal model. To reduce online measurement requirements and relieve computation pressure, power loss is usually estimated with a predefined look-up table, which is a potential error source. The prevalent equivalent thermal models, especially those for real time applications, simplify the heat transfer mechanism, so only a tolerable approximation of the average junction temperature is achieved (please refer to Chapter 3.3.2.3 for more details). In addition, the temperature dependency of the physical parameters of the packaging and the correlated electrical and thermal behaviour are usually neglected in these models. The accuracy might deteriorate further when a thermal model is developed from datasheet values, since a safety margin of a few kelvin is always allowed for the junction temperature ripple, due to different drive strategies. Secondly, due to random variations in device characteristics, the model based estimation may not necessarily be consistent for each individual device. Finally, the aging process causes structural degradation and drifts of both electrical and thermal parameters. It is apparent that thermal path degradation such as solder fatigue will change the packaging thermal resistance, and bond wire lift-off will change the power loss. This results in disturbances in the measurements and requires a self-calibration process which is not applicable to most thermal models.

On the other hand, this thermal model based estimation method cannot circumvent the temperature measurement but only relieves the harsh conditions for sensor based measurement. This is because the junction temperature is required in order to determine the thermal model during the model build-up phase, and the reference temperature is required as an input to the already established model during the application phase.

#### 4.3.1.1 Measurement of power losses in IGBT power modules

In power semiconductor applications, IGBTs and diodes are mainly used as electric switches and the internal self-heating power is a byproduct of electrical current flowing through these devices. Figure 4.5 shows the states that generate losses in a power device.



Figure 4.5 Losses in power semiconductor devices

Driving losses and blocking losses can be neglected in most applications and therefore the total power loss is the sum of the turn-on losses, turn-off losses and on-state losses of each IGBT and diode in a power module.

The average power loss is given by

$$P_{av} = \frac{1}{T} \int_{0}^{T} P(t) \cdot dt = f_{s} \int_{0}^{T} P(t) \cdot dt = f_{s} \int_{0}^{T} v_{CE}(t) \cdot i_{C}(t) \cdot dt \tag{4.4}$$

which can be divided into three parts, including turn-on transient, turn-off transient and device conduction or on state

$$P_{av} = \frac{1}{T} \int_0^T P(t) \cdot dt = f_s \int_0^T P(t) \cdot dt = f_s \left( \int_{t_{on}} P_{on}(t) \cdot dt + \int_{t_{off}} P_{off}(t) \cdot dt + \int_{t_{cond}} P_{cond}(t) \cdot dt \right) \tag{4.5}$$

where T is the period, $P_{on}$ is the turn-on loss, $P_{off}$ is the turn-off loss, $P_{cond}$ is the on-state loss, $f_s$ is the switching frequency, and the three integrals are taken over the turn-on, turn-off and conduction intervals respectively. Normally, each of these contributions to the average power is calculated individually, as shown in Figure 4.6. The switching losses are computed using the switching energies, as expressed in (4.6) and (4.7)

$$\int_{t_{on}} P_{on}(t) \cdot dt = E_{on}(V_{DC}, i_{con}, T_j) \tag{4.6}$$

$$\int_{t_{off}} P_{off}(t) \cdot dt = E_{off}(V_{DC}, i_{coff}, T_j) \tag{4.7}$$

$$\int_{t_{cond}} P_{cond}(t) \cdot dt = E_{cond}(V_{CE}, i_{ccond}, T_j) = \int_{t_{cond}} v_{CE}(t) \cdot i_C(t) \cdot dt \tag{4.8}$$

where $V_{DC}$ is the DC-link voltage, $i_{con}$, $i_{coff}$ and $i_{ccond}$ are the currents during the turn-on transient, the turn-off transient and the on state respectively, and $T_j$ is the junction temperature. Switching losses are dependent on the DC-link voltage, forward conduction current, the junction temperature, dv/dt and di/dt, while the on-state losses as expressed in (4.8) are dependent on the forward voltage drop and the current level [132].



Figure 4.6 Turn-on losses, turn-off losses and on-state losses during inductive switching

It is apparent that on-state losses can be easily determined since the device can be assumed to operate in a static state. However, difficulties exist in calculating the switching losses since the electric transients occur within hundreds of nanoseconds. The precise measurement of the voltage and current profiles requires a sophisticated measuring system with superior common-mode rejection, frequency response and accuracy. Furthermore, the computational cost of the system can be extremely high. In practice, precise online measurement is not cost-effective. Most commonly, transient power losses are determined with the help of look-up tables [132]. Both the IGBT and diode switching losses are measured in the laboratory under different conditions, such as DC-link voltage, forward conduction current and junction temperature, and they are correlated to current and device temperature by using look-up tables for online estimation [132]. Consequently, the switching losses can be directly determined by measuring the relevant parameters, with less computational cost. Another method of calculating the switching losses is to describe power losses by pulse functions such as sine, sine square, square or triangle. The controller detects the actual load profile and associates a specific pulse function with the load profile [134]. The losses associated with each pulse function are determined at the laboratory stage.
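The look-up table approach can be sketched in Python as below. All table entries are hypothetical and stand in for laboratory measurements. The sketch additionally assumes that the switching energy scales linearly with the DC-link voltage relative to the characterization voltage `V_REF`, and that the on-state loss can be approximated by a constant on-state voltage and current during the conduction fraction `duty` of each period.

```python
from bisect import bisect_right

def interp1(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the ends of the table."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])

# Hypothetical table: E_on + E_off in mJ per switching period, measured at V_REF.
CURRENTS_A = [0.0, 50.0, 100.0, 200.0]
TEMPS_C = [25.0, 125.0]
E_SW_MJ = {
    25.0: [0.0, 4.0, 8.0, 18.0],
    125.0: [0.0, 6.0, 12.0, 26.0],
}
V_REF = 300.0  # hypothetical characterization voltage in volts

def switching_energy_mj(i_c, t_j, v_dc):
    """Interpolate in current at both table temperatures, then in temperature."""
    e_cold = interp1(i_c, CURRENTS_A, E_SW_MJ[TEMPS_C[0]])
    e_hot = interp1(i_c, CURRENTS_A, E_SW_MJ[TEMPS_C[1]])
    e_ref = interp1(t_j, TEMPS_C, [e_cold, e_hot])
    return e_ref * v_dc / V_REF  # linear voltage scaling is an assumption

def average_loss_w(i_c, t_j, v_dc, f_s, v_ce_on, duty):
    """Average loss: f_s * (E_on + E_off) plus a static on-state term."""
    p_switching = f_s * switching_energy_mj(i_c, t_j, v_dc) * 1e-3
    p_conduction = duty * v_ce_on * i_c
    return p_switching + p_conduction

p_av = average_loss_w(i_c=100.0, t_j=25.0, v_dc=300.0, f_s=10e3, v_ce_on=2.0, duty=0.5)
print(p_av)
```

Only the collector current, the DC-link voltage and a temperature estimate are needed online, which is what keeps the computational cost low compared with integrating the measured switching waveforms.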

#### 4.3.1.2 Thermal characterization of power semiconductors

Before laying out the thermal models, the thermal characterization techniques based on thermal resistance  $R_{th}$  and thermal impedance  $Z_{th}$  are first introduced, from which the thermal model based junction temperature measurement originates.

Another important reason to introduce thermal characterization here is that it denotes the thermal performance of the power module. Power density, per unit volume and per unit weight, is an important feature for power electronics. The specific difficulty with increasing the power density is the removal of dissipated heat from the silicon dies, which requires each interface within the system to have optimized thermal characteristics. Both steady-state thermal resistance and transient-state thermal impedance are regarded as the key performance metrics for deciding whether a device can be used in thermally critical applications.

Another purpose which should be particularly emphasized for this project is that  $R_{th}$  and  $Z_{th}$  are also recognized as the main criteria for the thermal path degradation that are utilized in subsequent experiments. Hence, accurate and reproducible methods for thermal parameter measurements are preferred.

#### 1) Steady-state thermal resistance $R_{th}$

By employing the thermal equivalent of Ohm's law in the steady state, thermal resistance can be calculated as the difference in temperature between two closed isothermal surfaces, divided by the total heat flow between them in a thermal equilibrium state. The definition of the junction-to-reference thermal resistance $R_{thjr}$ is outlined in MIL-STD 883 [135] and in JESD51 [136] and is expressed as (4.9)

$$R_{thjr} = \frac{T_j - T_r}{P} \tag{4.9}$$

In equation (4.9), $T_j$ is the junction temperature, $T_r$ is the reference temperature, and P is the amount of heat flow under steady-state conditions. The definition requires the same total heat flow across both isothermal surfaces. The traditional junction-to-case thermal resistance $R_{thjc}$ measurement method, normally referred to as the thermocouple method, is outlined in MIL-STD 883 [135] and shown in Figure 4.7. For modules with baseplates, the junction-to-case thermal resistance $R_{thjc}$ is the generally accepted value representing the thermal properties, and $T_c$ is the case temperature on the bottom side of the module, measured directly beneath the chip via a drill hole in the heat sink. Thermal resistance is normally given in datasheets with some safety margin and can only be used to calculate the steady state junction temperature.



**Figure 4.7** Traditional measurement of thermal resistance ($R_{thjc}$)

 $R_{thjc}$  has been widely used in literature [95, 137-140] as a failure indicator for the heat path from the junction to the case (i.e. DCB delamination and solder layer fatigue). The failure criterion is generally accepted as an increase of 20% of the internal thermal resistance from junction-to-case  $R_{thjc}$  with respect to its initial value [95, 137-139].
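Equation (4.9) and the 20 % failure criterion can be expressed as a short Python sketch. The temperature and power values are hypothetical measurement results used only to exercise the functions.

```python
def thermal_resistance(t_j_c, t_ref_c, power_w):
    """Steady-state thermal resistance in K/W, Eq. (4.9)."""
    return (t_j_c - t_ref_c) / power_w

def thermal_path_degraded(r_now, r_initial, limit=0.20):
    """True when R_th has risen by at least `limit` relative to its initial value."""
    return r_now >= (1.0 + limit) * r_initial

# Hypothetical readings at the same dissipated power, before and after power cycling.
r_initial = thermal_resistance(t_j_c=125.0, t_ref_c=80.0, power_w=300.0)
r_aged = thermal_resistance(t_j_c=137.0, t_ref_c=80.0, power_w=300.0)

print(r_initial, r_aged, thermal_path_degraded(r_aged, r_initial))
```

The comparison is only meaningful when both measurements are taken under identical power dissipation and cooling conditions, as discussed in the text that follows.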

Although, in theory, the deviant thermal resistance  $R_{thjc}$  values under identical power dissipation conditions could indicate thermal path degradation, some shortcomings of this traditional thermocouple test method are listed below.

Firstly, the method involves difficulties in accurately measuring the package case temperature $T_c$. Since the case temperature distribution is non-uniform, the measured temperature depends on the exact position of the thermocouple. Moreover, the commonly used thermocouple is prone to errors because the hole or groove provided for the thermocouple will distort the temperature distribution. Given that the bottom side of the package case is selected for measurement, it is extremely difficult to ensure that the thermocouple actually measures the case temperature and not the temperature of the heat sink or some value in between. Secondly, variables induced in the measurement, such as cooling conditions and ambient temperatures, could affect the results and produce deviant $R_{thjc}$ values. Thirdly, since thermal resistance is defined under steady-state conditions, having to wait a considerable amount of time for thermal equilibrium is unavoidable, and this is power and time consuming. A robust cooling mechanism, normally a programmable water circulator, is often required to regulate the heat sink temperature. Finally, the thermal resistance is a lumped characteristic parameter for the whole structure, without any internal information on the contributions of the various layers along the heat path.

#### 2) Transient thermal analysis

Thermal impedance has been introduced to analyze non-equilibrium thermal transients. The dissipated power requires a finite amount of time to propagate from the junction through the various layers inside the package to the case surface of the package, and finally dissipates through the heat sink to the surrounding environment. When a power step is applied to the device starting from time t=0, thermal impedance is measured as the difference in temperature between two isothermal surfaces divided by the constant power applied during this thermal transition. Since the transient thermal state depends on the duration, it is given in Equation (4.10) as

$$Z_{thjr}(t) = \frac{T_j(t) - T_r(t)}{P} \tag{4.10}$$

where  $T_j$  and  $T_r$  are measured against time *t* before thermal equilibrium. The thermal impedance at thermal equilibrium is synonymous with thermal resistance.
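As a minimal sketch of Equation (4.10), the thermal impedance can be evaluated sample by sample from recorded temperatures. The function name and the sample values below are hypothetical and serve only as an illustration; a constant heating power is assumed.

```python
def thermal_impedance(t_junction, t_reference, power):
    """Return Z_th samples in K/W from junction and reference temperatures
    recorded on the same time base, for a constant heating power in W."""
    if power <= 0:
        raise ValueError("heating power must be positive")
    if len(t_junction) != len(t_reference):
        raise ValueError("temperature records must have equal length")
    return [(tj - tr) / power for tj, tr in zip(t_junction, t_reference)]


# Hypothetical samples in degC taken during a 50 W heating step.
tj_samples = [25.0, 31.0, 38.5, 45.0]
tr_samples = [25.0, 25.0, 25.5, 26.0]
zth = thermal_impedance(tj_samples, tr_samples, power=50.0)
```

Once the temperatures stop changing, the last value of such a record approaches the thermal resistance between the two surfaces.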

The transient thermal analysis is usually performed by evaluating the thermal characteristic curves (heating or cooling curve) or part thereof, followed by some data post-processing techniques. The heating curve is employed as the transient thermal impedance and is taken during the transient response to a step-change in power, starting from an unpowered equilibrium condition. If the system behaves linearly, the same change of impedance with a negative sign can be observed for the cooling curve when the heating power is turned off after the thermal equilibrium state is reached. Although the heating and cooling curves can both be used in principle, the conditions for implementing the heating curve are much stricter. Firstly, constant heating power P during the heating phase is required. Since the on-state resistance of power semiconductor devices is dependent on the junction temperature, the implementation of constant heating power needs to be carefully considered; it may be realised by active gate control or a controlled power source. Secondly, heating the semiconductor device and sensing the junction temperature simultaneously may require additional measurement facilities. The normally used junction temperature measurement technique (i.e. switched TSEP method, as described in Chapter 4.3.3) requires a compromise between the temporal resolution and the constant power. Meanwhile, the heating power pulse causes a short delay before the junction temperature can be measured effectively, due to the electrical disturbances at the beginning of the TSEP sensing phase.

The transient thermal results obtained from the thermal test require a considerable amount of post-processing in order to get the most useful information. Several methods have been reported in literature to show that the transient thermal analysis represents a type of cross-sectional view and reflects the internal thermal contribution of each layer [141-144]. Some of the methods are described now.

#### • *Slope (derivative) of heating curve*

The thermal characteristic curve plots reflect a cross sectional view of the physical assembly as a function of distance from the junction. For instance, a pronounced "bend" or abrupt rise in the thermal characteristic curve at some time stamp can probably be related to the heat arriving at a particular interface (or edge). The relationship between the slope of the heating curve and the device's thermal characteristics has already been explored by Katsis et al. [145].

A typical power MOSFET in the TO-247 package was divided into silicon die, solder layer and copper plate. The general shape of the temperature versus time curve is shown in Figure 4.8. This curve starts with an initial steep slope region (Zth1), which is followed by a transitional slope region (Zth2) and finally a constant shallow slope (Zth3). A constant slope has been roughly estimated for each region. The heating curve slope changes with time. As heat diffuses from the silicon die to the heat spreader within a finite time, certain parts in the characteristic curves may be identified with a physical location within the package.



**Figure 4.8** Change in heating curve slope as a function of location within package [145]

The first thermal impedance region is the silicon dominant area defined within the first five milliseconds of the heating curve [145], when heat initially spreads through the silicon die and the volume element of the die region dominates the thermal capacitance and thermal time constant. Therefore, the silicon die has a decisive influence on the initial temperature rise within this very short duration before significant heat has reached the rest of the package. Hardly any divergence is visible in the thermal impedance over the first 5 ms, and the short-duration impedances are overwhelmingly representative of the heat flow in the silicon region. The next region denotes a transition from the silicon layer to the top of the copper plate. It extends from 5 to 10 ms. This region is the part where the solder layer starts to affect the slope of the temperature curve. Thermal impedance Zth3 is defined in the last stage from 10 to 50 ms. Here, the effect of solder layer degradation should be most evident. The thermal impedance of copper is naturally very low and its high thermal capacity works to stabilize the heating to a constant slope. However, the solder layer creates a bottleneck for heat flux.
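The region-wise slope evaluation described above can be sketched as a least-squares line fit inside each time window. The window limits follow the 0 to 5 ms, 5 to 10 ms and 10 to 50 ms regions quoted from [145]; the heating curve itself is synthetic and purely illustrative, and all function names are hypothetical.

```python
def window_slope(times_ms, temps, t_start_ms, t_end_ms):
    """Least-squares slope (K/ms) of the heating curve inside one time window."""
    pts = [(t, temp) for t, temp in zip(times_ms, temps)
           if t_start_ms <= t <= t_end_ms]
    if len(pts) < 2:
        raise ValueError("need at least two samples in the window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_temp = sum(temp for _, temp in pts) / n
    num = sum((t - mean_t) * (temp - mean_temp) for t, temp in pts)
    den = sum((t - mean_t) ** 2 for t, _ in pts)
    return num / den


def synthetic_heating_curve(t_ms):
    """Hypothetical piecewise-linear heating curve in degC (not measured data)."""
    if t_ms <= 5.0:
        return 25.0 + 2.0 * t_ms
    if t_ms <= 10.0:
        return 35.0 + 0.8 * (t_ms - 5.0)
    return 39.0 + 0.2 * (t_ms - 10.0)


# Time windows of the three slope regions, in milliseconds.
REGIONS_MS = {"Zth1": (0.0, 5.0), "Zth2": (5.0, 10.0), "Zth3": (10.0, 50.0)}

times_ms = [0.5 * i for i in range(101)]  # 0 to 50 ms in 0.5 ms steps
temps = [synthetic_heating_curve(t) for t in times_ms]
slopes = {name: window_slope(times_ms, temps, lo, hi)
          for name, (lo, hi) in REGIONS_MS.items()}
```

Comparing the Zth2 and Zth3 slopes of an aged sample against those of a fresh one is then a matter of comparing two such dictionaries.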

| Samples                                         | A   | E   | G   | D   | B   | C   | F   |
|-------------------------------------------------|-----|-----|-----|-----|-----|-----|-----|
| Percentage of voided area at 10000 power cycles | 40% | 40% | 42% | 48% | 54% | 55% | 62% |

**Table 4.2** Void growth in the die-attach layer of some power MOSFET samples [145]

The slope difference of thermal impedance, shown by the temperature versus time curves at 10 000 power cycles, is presented in Figure 4.9 for seven samples. The percentage of voided areas at 10 000 cycles is given in Table 4.2. The effect is evident, showing a difference in the slope of both the Zth2 and the Zth3 regions, which are approximately indicated with blue and red lines.



**Figure 4.9** Time/temperature plot for two MOSFETs with 10,000 power cycles [145]

#### • *$\Delta$-value*

By properly choosing the width of the heating pulse duration  $t_H$ , the heat propagation distance from the junction to the heat sink can be controlled. Thus, the thermal interface of a particular package-depth of interest can be evaluated with a high degree of sensitivity and the interference of the remaining layers in the physical assembly can be minimized. A specified heating pulse is defined in Xiao's work [141] to compare the thermal impedance and the degradation conditions of different die-attach materials. A time window during the transient thermal characterization is employed by Samuel et al. [146] for better visibility of the internal structural defects.

Based on heat propagation, the thermal impedance and time constant of each layer is calculated and then combined to represent the overall thermal impedance from the junction to the heat sink.

#### • *Transient dual interface measurement (TDIM)*

The so-called transient dual interface measurement (TDIM) method was developed recently and has branched out into various evaluation methods, allowing measurement of the  $R_{thjc}$  with a higher degree of accuracy and better reproducibility than traditional methods. The basic idea remains to characterize the package's internal heat flow path from junction-to-case without measurement of the case temperature  $T_c$ , aiming to avoid interference from the case-to-heat-sink interface and the heat sink. The usual TDIM method uses two thermal impedance measurements  $Z_{thjc}(t)$  of the same power semiconductor device under different cooling conditions (i.e. with a different interface layer between the device case and heat sink).



**Figure 4.10**  $Z_{th}$  curves for a package measured with and without thermal grease at the interface between case and heat sink [147]

To evaluate these measurements, three methods are applied. Method 1 determines  $R_{thjc}$  directly from the separation of the  $Z_{th}$ -curves.  $R_{thjc}$  is defined as the thermal impedance  $Z_{thjc}(t_s)$  at the time  $t_s$  where the two  $Z_{thjc}(t)$  curves separate (Figure 4.10). Compared to the TC-method, the measurement of  $T_c$  with thermocouples is no longer necessary, thus avoiding the related problems. However, the point of separation is often not clearly defined.
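A minimal sketch of Method 1 is given here, assuming both curves are sampled on the same time grid and that separation is declared once their difference exceeds a chosen tolerance. The tolerance, the synthetic curves and all names are hypothetical; the result depends on the tolerance, which illustrates why the separation point is hard to pin down.

```python
import math


def separation_index(z_a, z_b, tolerance):
    """Index of the first sample where two Z_th curves differ by more
    than `tolerance` (K/W), or None if they never separate."""
    for i, (a, b) in enumerate(zip(z_a, z_b)):
        if abs(a - b) > tolerance:
            return i
    return None


T_DIVERGE = 0.05  # s, instant at which the synthetic curves start to differ


def synthetic_zth(t, r_interface):
    """Hypothetical Z_th curve: a shared package part plus an
    interface-dependent part that only appears after T_DIVERGE."""
    common = 0.5 * (1.0 - math.exp(-t / 0.01))
    if t <= T_DIVERGE:
        return common
    return common + r_interface * (1.0 - math.exp(-(t - T_DIVERGE) / 0.5))


# Logarithmically spaced time grid from 0.1 ms to 1 s.
times = [10 ** (-4 + 4 * i / 200) for i in range(201)]
z_dry = [synthetic_zth(t, 0.6) for t in times]      # no thermal grease
z_grease = [synthetic_zth(t, 0.3) for t in times]   # with thermal grease

idx = separation_index(z_dry, z_grease, tolerance=0.005)
t_s = times[idx]
r_thjc = z_grease[idx]
```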



**Figure 4.11** Derivatives (da/dz) and the difference  $\Delta$(da/dz) of the Z<sub>th</sub> curves [147]

Method 2 compares the derivatives of the  $Z_{th}$  functions by examining the da/dz of the  $Z_{th}$ -curves instead of the  $Z_{th}$ -curves [147, 148] (Figure 4.11). a(z) denotes the  $Z_{th}$ -value as a function of logarithmic time z=ln(t), i.e. a(z) =  $Z_{th}(t)$ . Thus, da/dz is simply the slope of the  $Z_{th}$ -curve in the usual log-linear representation. The derivatives not only eliminate the potential errors from different offsets of the two  $Z_{th}$ -curves, but also make it easier to identify the separation point.
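The derivative da/dz can be approximated numerically with central differences on the logarithmic time axis. The sketch below is an illustration only: it uses a synthetic single-time-constant curve, for which da/dz peaks at t = τ with a height of R<sub>th</sub>/e, and the function name is hypothetical.

```python
import math


def log_time_derivative(times, zth):
    """Central-difference estimate of da/dz with z = ln(t).
    Returns a list of (z, da/dz) pairs for the interior samples."""
    z = [math.log(t) for t in times]
    result = []
    for i in range(1, len(z) - 1):
        slope = (zth[i + 1] - zth[i - 1]) / (z[i + 1] - z[i - 1])
        result.append((z[i], slope))
    return result


# Synthetic single-exponential Z_th curve (hypothetical values).
R_TH = 0.5   # K/W
TAU = 0.01   # s
times = [10 ** (-4 + 4 * i / 400) for i in range(401)]
zth = [R_TH * (1.0 - math.exp(-t / TAU)) for t in times]

dadz = log_time_derivative(times, zth)
z_peak, peak = max(dadz, key=lambda pair: pair[1])
```

For two measured curves, subtracting their da/dz values sample by sample gives the difference curve used to locate the separation point.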



**Figure 4.12** Structure functions for TO263 package [147]

Method 3 first calculates the cumulative structure functions introduced by Szekely et al. [149] and uses their separation point to determine  $R_{thjc}$  (Figure 4.12). Meanwhile, it allows the identification of internal thermal properties like die-attach and individual samples of the system. This method has already been implemented into T3Ster® by MicRed, which is an advanced thermal tester for thermal characterization of semiconductor packages [86]. Last year the TDIM procedure was accepted to become part of the JESD standard as another means of measuring thermal resistance.

#### 4.3.1.3 Thermal modelling of power modules

Heat transfer is categorised in three ways: conduction, convection and radiation. Conduction is the dominant heat transfer method in power modules. Since the thermal-electrical analogy is well established, thermal equivalent networks are frequently employed to model a thermal system. Table 4.3 summarizes the equivalence between thermal and electrical circuits.

| Electrical parameter   | Symbol/Unit         | Thermal parameter   | Symbol/Unit            |
|------------------------|---------------------|---------------------|------------------------|
| Voltage                | V in V              | Temperature         | T in K                 |
| Current                | I in A              | Heat flow           | P in W                 |
| Conductivity           | $\sigma$ in A/(m·V) | Conductivity        | k in W/(m·K)           |
| Stored charge          | q in C              | Stored heat         | Q in J                 |
| Electrical resistance  | R in V/A            | Thermal resistance  | R<sub>th</sub> in K/W  |
| Electrical capacitance | C in C/V            | Thermal capacitance | C<sub>th</sub> in J/K  |

**Table 4.3** Analogies of electrical to thermal parameters

Circuit simulators like PSpice and SABER are widely used to simulate the thermal system as a lumped network of discrete thermal resistances and capacitances. A variety of RC equivalent circuit thermal models can be built and will be discussed later to solve one-dimensional (1-D) thermal problems in this study. Although extensive 2-D and 3-D thermal RC networks can also be built for thermal analysis with circuit simulators, the number of RC cells is normally limited by convergence problems. To obtain an accurate thermal solution for a 3-D model, the Finite Element Method (FEM) is normally used for simulations.

Among numerous possible networks, Foster and Cauer networks are widely used to describe the power module thermal behaviour. For a given thermal system, there is an equivalent representation in the Foster and Cauer networks, so a change from one to the other is always possible. Their advantages and disadvantages will be discussed below.

#### 1) Foster model

The model is based on the assumption that the semiconductor device is a little cube with thermally insulated sidewalls, attached to an ideal (isothermal) heat sink. A power step, uniformly distributed along the top surface, is applied there to emulate the dissipating heat source of the junction, as shown in Figure 4.13(a). A very simple thermal model of a one-stage RC network is shown in Figure 4.13(b). In the simplest case, the thermal model of a semiconductor device consists of a thermal resistance and a thermal capacitance, which are connected in parallel.




**Figure 4.13** (a) Thermal model of a cube; (b) A single time constant thermal RC model; and (c) graphical representation of the thermal model

If a step power input  $P_H$  is applied to this model, the junction temperature will rise following an exponential function expressed in (4.11)

$$T(t) = T_0 + P_H \cdot R_{th} \cdot \left[1 - \exp(-t/\tau)\right] \tag{4.11}$$

where  $T_0$  is the reference temperature at the heat sink and  $\tau = R_{th} \cdot C_{th}$  is the thermal time constant. An equivalent transformation diagram of the thermal model is shown in Figure 4.13(c), represented by the  $\tau$  time-constant value and the  $R_{th}$  value describing its relative magnitude.




**Figure 4.14** (a) An n-stage Foster model; (b) Graphical representation of the thermal model

The physical structures of semiconductor modules are usually more complex, having several thermal time constants. In this case, the temperature response at the power step is more precisely represented by the sum of a few exponential terms:

$$T(t) = T_0 + P_H \cdot \sum_{i=1}^n R_{thi} \cdot \left[1 - \exp(-t/\tau_i)\right] \tag{4.12}$$

Similarly, the structure can be characterized by the n pairs of  $R_{thi}$  -  $C_{thi}$  values and represented by a thermal model of a number of serially connected  $R_{th}C_{th}$  cells, as shown in Figure 4.14(a), which is called a Foster RC network. These pairs of  $R_{thi}$  -  $C_{thi}$  values correspond to the pairs of  $R_{thi}$  -  $\tau_i$  in graphical form, also shown in Figure 4.14(b). The position of the lines along the horizontal axis corresponds to the time-constant, whereas their height is proportional to the  $R_{th}$  value. This diagram can be regarded as a discrete spectrum displaying the thermal time constants occurring in the step input response and their relative amplitude. The knowledge of all the parameters  $\tau_i$  and  $R_{thi}$  allows the mathematical expression of the transient thermal impedance which is given by Equation (4.13)

$$Z_{th}(t) = \frac{T(t) - T_0}{P_H} = \sum_{i=1}^n R_{thi} \cdot \left[1 - \exp\left(-t/\tau_i\right)\right] \tag{4.13}$$
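Equations (4.12) and (4.13) translate directly into a few lines of code. The sketch below evaluates a four-stage Foster model; the R<sub>thi</sub> and  $\tau_i$  values are hypothetical placeholders rather than datasheet or measured values.

```python
import math


def foster_zth(t, r_th, tau):
    """Transient thermal impedance of a Foster network, Equation (4.13)."""
    return sum(r * (1.0 - math.exp(-t / tk)) for r, tk in zip(r_th, tau))


def junction_temperature(t, t0, p_h, r_th, tau):
    """Junction temperature after a power step P_H applied at t = 0,
    Equation (4.12)."""
    return t0 + p_h * foster_zth(t, r_th, tau)


# Hypothetical four-stage Foster parameters.
R_TH = [0.02, 0.08, 0.15, 0.05]   # K/W
TAU = [0.001, 0.01, 0.1, 1.0]     # s

zth_10ms = foster_zth(0.01, R_TH, TAU)
tj_steady = junction_temperature(100.0, t0=25.0, p_h=100.0, r_th=R_TH, tau=TAU)
```

At long times the impedance tends to the sum of the R<sub>thi</sub> values, which is the steady-state thermal resistance of the modelled path.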

The Foster RC network (also called the chain model or partial-fraction circuit) is frequently given in manufacturers' datasheets, due to its simplicity both in determining the parameters of the equivalent circuit and in the analytical calculation of temperature curves. The thermal time constants  $\tau_i$  and the coefficients R<sub>thi</sub> can be obtained by numerical analysis of the transient characterization curves. Given a heating or cooling curve measured in an experiment or obtained from a datasheet, nonlinear least-squares fitting can be implemented in Matlab to fit the curve with a set (theoretically infinite) of thermal time constants  $\tau_i$  and the corresponding amplitude factors R<sub>thi</sub> to arbitrary precision. Nonetheless, knowledge of the first n (n≥4) pairs of the coefficients is enough for the characterization.

The Foster RC network is purely a mathematically convenient model and has no physical basis. It would be unsuitable to associate the Foster network with the different physical regions of the structure since it contains node-to-node heat capacitances, having no physical reality. The  $R_{th}C_{th}$  cells are in arbitrary order and neither  $R_{th}C_{th}$  elements nor thermal time constant  $\tau_i$  has any physical significance.

#### 2) Cauer model

An equivalent model exists: the Cauer network (also called the ladder model or continued-fraction circuit), which is shown in Figure 4.15. The Cauer network derives from fundamental heat-transfer physics and reflects the physical regions of the semiconductor. Therefore, it enables the network nodes to be correlated with the intermediate temperature of real physical layers (junction, solder layer, DCB, etc.), which allows the detection of internal assembly degradation [150].



**Figure 4.15** An n-stage Cauer model (Cauer network)

The physical information (i.e. material property, geometry, position etc.) of each layer can be represented by one or several  $R_{th}C_{th}$  cells in order to accurately describe the heat conduction mechanism. The thermal resistors and thermal capacitors of each layer can be calculated from the following equations:

$$R_{th} = \frac{d}{k \cdot A} \tag{4.14}$$

$$C_{th} = c \cdot \rho \cdot V = c \cdot \rho \cdot d \cdot A \tag{4.15}$$

where d is the length along the heat conduction direction, k is the thermal conductivity, A is the effective area for heat dissipation that is perpendicular to the heat conduction direction, c is the specific heat, and  $\rho$  is the density.
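As an illustration of Equations (4.14) and (4.15), the sketch below computes the thermal resistance and capacitance of a single layer. The die dimensions are hypothetical, and the silicon properties are approximate room-temperature textbook values, assumed here for the example only.

```python
def layer_rc(thickness_m, area_m2, conductivity, specific_heat, density):
    """Thermal resistance (K/W) and capacitance (J/K) of one layer,
    following Equations (4.14) and (4.15)."""
    r_th = thickness_m / (conductivity * area_m2)
    c_th = specific_heat * density * thickness_m * area_m2
    return r_th, c_th


# Hypothetical silicon die: 0.3 mm thick, 10 mm x 10 mm.
# Assumed approximate silicon properties:
# k = 148 W/(m.K), c = 700 J/(kg.K), rho = 2330 kg/m^3.
r_si, c_si = layer_rc(thickness_m=0.3e-3, area_m2=1.0e-4,
                      conductivity=148.0, specific_heat=700.0, density=2330.0)
tau_si = r_si * c_si  # time constant of this single layer, in seconds
```

With these assumed numbers the die layer has a time constant of about one millisecond, in line with the millisecond-scale initial region of the heating curve.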

Normally, a single RC chain is not sufficient to model a real physical system, especially when the heat flow follows separate paths from junction to ambient. However, the Cauer network provides to some degree good correlations between the model temperature and the real temperature. The Cauer model can generally be achieved by the complex physics-based finite element method (FEM) in the first place.
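A one-dimensional Cauer ladder can also be integrated directly in the time domain. The sketch below uses a simple forward-Euler scheme, chosen here only for brevity, with hypothetical three-stage parameters; each node temperature can be read out, which is what allows the nodes to be associated with physical layers.

```python
def simulate_cauer(r_th, c_th, power, t_ref, t_end, dt):
    """Forward-Euler step response of an n-stage Cauer ladder.
    r_th[i] connects node i to node i+1 (the last one to the heat sink),
    c_th[i] connects node i to thermal ground. Returns final node temperatures."""
    n = len(r_th)
    temps = [t_ref] * n
    for _ in range(int(round(t_end / dt))):
        new_temps = temps[:]
        for i in range(n):
            # Heat entering node i: source power at the junction, else flow from node i-1.
            heat_in = power if i == 0 else (temps[i - 1] - temps[i]) / r_th[i - 1]
            downstream = temps[i + 1] if i + 1 < n else t_ref
            heat_out = (temps[i] - downstream) / r_th[i]
            new_temps[i] = temps[i] + dt * (heat_in - heat_out) / c_th[i]
        temps = new_temps
    return temps


# Hypothetical three-stage ladder (junction -> ... -> heat sink).
R_TH = [0.02, 0.05, 0.10]   # K/W
C_TH = [0.05, 0.5, 2.0]     # J/K

# dt is kept well below the smallest R*C product so the explicit scheme stays stable.
final_temps = simulate_cauer(R_TH, C_TH, power=100.0, t_ref=25.0,
                             t_end=5.0, dt=1.0e-4)
```

After the transient has settled, the junction node sits at the reference temperature plus the power times the total thermal resistance, and the intermediate nodes show the temperature drop across each stage.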

#### **4.3.2** Direct temperature measurement

The accurate measurement of the junction temperature of operating power devices is not straightforward. Direct measurement using contact-less or contact sensors is widely employed for laboratory studies [151].

Infrared (IR) temperature measurement techniques [152] and optical fibers [131] are widely used contact-less methods for junction temperature measurement. Although they provide a good dynamic response, immunity to electric interference, and high precision, they are not suitable for in-situ measurement. This is because a) the state-of-the-art power modules are highly integrated and there is only limited free space left to set up the IR/optical sensors, and b) they always require direct optical access to the silicon die surface, which is not available for the real power module structure [153]. In addition, they are expensive and require additional sophisticated software. Another frequently used approach would be to add a contact sensor, a thermocouple in most cases, on top of the chip (a summary of different contact sensors is shown in Appendix B2 and B3). However, the use of thermocouples is limited due to the fact that a) access to the chip is limited because of the encapsulating resin and case, together with the chip junction being immersed in the silicone gel, and b) the tiny temperature-dependent parameter variations, e.g. of K-type thermocouples in the range of several tens of microvolts per kelvin, cannot be resolved in the EMI-rich environment of an operating converter. In addition, the thermal mass of the thermocouple can be significantly bigger than the small dimensions and small thermal mass of the die, resulting in incorrect readings.

These direct measurement methods with contact-less or contact sensors are therefore not practical for in-situ measurement, and other forms of semiconductor junction temperature acquisition techniques have evolved. The indirect temperature measurement technique is the most widely used method for online junction temperature measurement.

#### **4.3.3** Indirect temperature measurement

As both thermal model based estimation and direct measurement methods have their own disadvantages, a third, indirect temperature measurement method can be used, which is described in detail by Jian et al. [151], where device physical values like voltage, current and resistance correlate with temperature. These parameters are called Temperature Sensitive Electrical Parameters (TSEPs). They utilize internal electrical parameters to indirectly determine the junction temperature of the silicon chip [94, 151, 154]. These electric parameters can be resistance, voltage or current, and a comparison of commonly used TSEPs is shown in Appendix B. The TSEP method is a non-invasive measurement technique which makes it attractive for in-situ measurements. TSEP measurements can be performed on fully packaged power modules without the need for physical modification of the package. TSEP measurements also provide a high time resolution and measurement bandwidth to capture all thermal device-transients [151].

The typical TSEP measurement is done in two phases, the calibration phase and the measurement phase. In the calibration phase, the selected TSEP is measured against a temperature range to generate a calibration curve. To vary the temperature, the device being tested is put into either a thermal chamber, a recirculating bath, a thermochuck or on a temperature-controlled hot plate. From the calibration curve, a look-up table of the chosen TSEP changes versus temperature can be generated and the TSEP within the whole temperature range can be interpolated. Furthermore, for certain TSEPs within the proper temperature range, a linear or near linear characteristic exists and the temperature coefficient and the intercept of the TSEP can be obtained from linear curve fitting of the results. In the measurement phase, the same test conditions as that of the calibration phase apply and the junction temperature is estimated by measured TSEP values.
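The two phases can be sketched as a straight-line fit followed by its inversion, assuming the chosen TSEP is close to linear over the calibrated range. The calibration points below are hypothetical, as are the function names.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


# Calibration phase: hypothetical TSEP voltage readings at known temperatures.
cal_temps_c = [25.0, 50.0, 75.0, 100.0, 125.0]
cal_volts = [0.600, 0.550, 0.500, 0.450, 0.400]
k_factor, intercept = fit_line(cal_temps_c, cal_volts)  # slope in V/degC


def estimate_junction_temperature(v_measured):
    """Measurement phase: invert the calibration line."""
    return (v_measured - intercept) / k_factor


tj_estimate = estimate_junction_temperature(0.470)
```

For a TSEP that is not linear, the same idea applies with interpolation in the look-up table in place of the straight line.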

The TSEP method is not suited for junction temperature measurement under standard converter operating conditions. This is because a TSEP is normally measured at a small sense current when no load current passes through the measured device, so complicated changes in the control would be required to include measurement phases. Therefore, in our project interrupted monitoring is applied when there is no load current. Another disadvantage of using a TSEP is that only a mean value of chip temperature, the so-called virtual junction temperature  $(T_{vj})$ , can be determined, and it cannot provide any information on the chip's temperature distribution or local temperature [151]. However, this is of minor importance for our experiment and this 'current weighted' mean value agrees well with the area weighted value measured by the IR camera [153]. In addition, the TSEP calibration process is required for each individual silicon die unless they are products from the same production lot [152, 155]. In our project, the calibration for each device can be easily performed with an additional relay network. Some of the TSEP measurements take place during a switching event, meaning that the TSEP measurement circuit must cope with noise and a high common mode voltage. Commonly used TSEPs [151, 156] are compared in a summary in Appendix B1 and will be reviewed in this section.

#### 1) Forward voltage drop of a diode

The forward voltage drop of a diode is the most commonly used TSEP. The voltage is caused by the pn-junction which is temperature dependent. The forward voltage can be expressed as

$$V_F = A + B \cdot I_F \tag{4.16}$$

where A and B are coefficients without any physical meaning. The current  $I_F$  can be described by Equation (4.17) [157]:

$$I_{F,Diode} = I_s \left[ \exp\left(\frac{qV_{F,Diode}}{Nk_BT_j}\right) - 1 \right] \tag{4.17}$$

where  $I_s$  is the saturation current, q is the charge of electron, N is the "non-ideality" coefficient (typ. between 1 and 2),  $k_B$  is the Boltzmann's constant and  $T_j$  is the junction temperature. The saturation current  $I_s$  can be expressed as:

$$I_s = I_0 T_j^{\gamma} \exp\left(\frac{-E_g}{k_B T_j}\right) \tag{4.18}$$

where  $I_0$  is a constant current value that is not dependent upon temperature,  $\gamma$  is a constant value of about 3 and  $E_g$  is the bandgap (e.g. for Si it is 1.12eV at T=275K).

From the above equations, the forward bias voltage across the diode depends on both the current through it and the junction temperature. A typical pn diode shows a negative temperature coefficient of approximately -1 mV/°C to -4 mV/°C.
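The negative temperature coefficient can be checked numerically by solving the Shockley form of Equation (4.17) for the forward voltage and inserting the saturation current of Equation (4.18). The prefactor  $I_0$ , the sense current and N = 1 are hypothetical choices made only so that the example gives a forward voltage of roughly 0.6 V at 300 K.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K (equals k_B/q in V/K)


def saturation_current(t_j, i0=2.0, gamma=3.0, e_g=1.12):
    """Saturation current per Equation (4.18); i0 is a hypothetical prefactor,
    e_g is the bandgap in eV."""
    return i0 * t_j ** gamma * math.exp(-e_g / (K_B_EV * t_j))


def forward_voltage(i_f, t_j, n=1.0):
    """Forward voltage obtained by solving the Shockley equation for V_F."""
    i_s = saturation_current(t_j)
    return n * K_B_EV * t_j * math.log(i_f / i_s + 1.0)


I_SENSE = 0.1  # A, hypothetical constant sense current
v_300 = forward_voltage(I_SENSE, 300.0)
v_310 = forward_voltage(I_SENSE, 310.0)
temp_coefficient = (v_310 - v_300) / 10.0  # V/K
```

With these assumptions the coefficient comes out at about -2 mV/K, inside the range quoted for a typical pn diode.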

The forward voltage of a PIN diode can be calculated using Equation (4.19). A PIN diode has an additional intrinsic layer compared to the pn diode, increasing the internal resistance, which makes the calculation of the forward bias voltage different to the pn diode.

$$V_F = K_0 + K_1 T + K_2 I_F \tag{4.19}$$

In Equation (4.19), K<sub>0</sub>, K<sub>1</sub> and K<sub>2</sub> are coefficients without physical meaning.

#### 2) On-state voltage drop & on-state resistance

The method of using the on-state voltage drop as a TSEP for obtaining junction temperature is widely used for both IGBT and MOSFET devices and is known as the  $V_{CE}(T)$ -measurement method [71, 153]. The on-state voltage drop is a function of the junction temperature, the collector-emitter current, and the gate-emitter voltage [121], as expressed in (4.20). Assuming a constant collector-emitter current and a constant gate-emitter voltage, the on-state voltage drop  $V_{CE}$ , which can also be converted into the on-state resistance  $R_{ON}$ , is only temperature dependent. There are two internal parameters that are temperature-dependent, shown in (4.21). These two parameters are the on-state resistance of the channel region  $R_{ON(TH)}$ , which increases rapidly with increasing temperature, and the forward voltage drop of the base-emitter diode of the internal PNP transistor  $V_{BE(TH)}$ , which has a negative temperature coefficient. Therefore, the on-state voltage of an IGBT is the combination of a voltage with a positive temperature coefficient and a voltage with a negative temperature coefficient.

$$V_{CE} = f(T_{j}, I_{C}, V_{GE}) \tag{4.20}$$

$$V_{CE} = V_{BE(TH)} + R_{ON(TH)} \cdot \frac{I_{CE}}{1 + \beta_{PNP}} \tag{4.21}$$

As the second term is not only dependent on  $R_{ON(TH)}$ , but also on the current level  $I_{CE}$ , it can be observed that at very low current level the  $V_{BE(TH)}$  dominates and therefore  $V_{CE}$  shows a drop with the temperature, whereas at a higher current level, approaching the nominal current, the on-state voltage  $V_{CE}$  is largely dependent on the second term of Equation (4.21), meaning that  $V_{CE}$  rises with the temperature.
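The change of sign can be illustrated with Equation (4.21) and simple linear temperature models for its two terms. Every number in this sketch (the diode drop and its slope, the channel resistance and its slope, and the PNP gain) is a hypothetical placeholder, chosen only to show the mechanism and not taken from any device.

```python
def v_ce(t_j, i_ce, beta_pnp=0.5):
    """On-state voltage per Equation (4.21) with assumed linear
    temperature models for both terms (t_j in degC)."""
    v_be = 0.7 - 2.0e-3 * (t_j - 25.0)               # negative coefficient
    r_on = 0.02 * (1.0 + 8.0e-3 * (t_j - 25.0))      # positive coefficient
    return v_be + r_on * i_ce / (1.0 + beta_pnp)


def temp_coefficient(i_ce, t_low=25.0, t_high=125.0):
    """Average dV_CE/dT in V/K between two junction temperatures."""
    return (v_ce(t_high, i_ce) - v_ce(t_low, i_ce)) / (t_high - t_low)


k_sense = temp_coefficient(0.1)    # small sensing current
k_load = temp_coefficient(100.0)   # current near a nominal load level
```

With these assumed values the coefficient is negative at the sensing current and positive at the load current, so a crossover current exists in between at which  $V_{CE}$  is almost insensitive to temperature.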

Because the current  $I_{CE}$  has an impact on  $V_{CE}(T)$ , two  $V_{CE}(T)$ -measurement methods have been established, called switched measurement and non-switched measurement. In the switched measurement (also termed pulsed measurement), a small sensing current (e.g. 100 mA) is applied to the device for TSEP measurement during the calibration/sensing procedure. Parasitic effects are small at such a small current, and the measurement takes advantage of the temperature dependence of the diffusion voltage of a pn-junction. A nearly linear relationship between  $V_{CE}$  and the junction temperature was observed with a K factor of around -2 mV/°C in the works of both Scheuermann and Sofia [153, 155]. Due to the switching event, the device requires an initial delay time for the electrical transient before the measurement takes place, which makes instantaneous measurement impossible. These electrical transients result from a combination of switching noise, tail current and measurement circuit response time. Consequently, the junction cools and there is an unknown temperature drop during the delay period. This temperature drop can be accounted for by using either the Finite Element Method (FEM) or extrapolation from the continuous recordings of the cooling response after the delay period. However, accuracy is critical when applying FEM or extrapolation techniques.
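One way of extrapolating back over the delay period is sketched below. It assumes that the junction temperature falls linearly with the square root of time shortly after the heating current is removed, which is the approach taken for example in JEDEC JESD51-14; this is an assumption made for the sketch and not necessarily the procedure used in this work. The cooling data are synthetic.

```python
import math


def extrapolate_to_switch_off(times, temps):
    """Fit T = T0 + m * sqrt(t) to samples taken after the delay period
    and return (T0, m), where T0 is the estimate at t = 0."""
    x = [math.sqrt(t) for t in times]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(temps) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, temps))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope


# Synthetic cooling samples from 100 us to 1 ms after switch-off (hypothetical):
# the first 100 us are treated as lost to the electrical transient.
times = [100e-6 + 50e-6 * i for i in range(19)]
temps = [100.0 - 50.0 * math.sqrt(t) for t in times]

tj_at_switch_off, slope = extrapolate_to_switch_off(times, temps)
```

The fit window matters in practice: it has to start after the electrical disturbance has decayed and end before heat reaches layers beyond the die.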

In non-switched measurement, the TSEP is measured without switching between the load and the sensing current. Therefore, continuous changes in the temperature can be monitored and trade-offs related to the sensing current are avoided. This method also eliminates the problems brought on by the switched (pulsed) measurement, as the operating junction temperature can be recorded at once and the overhead in delay time incurred by the switched (pulsed) measurement can be eliminated. Meanwhile, this method provides a potential way to measure  $T_j$  online. On-state voltage monitoring at a high current has been proposed in a few publications [121, 158, 159]. However, other problems now appear. Firstly, during the calibration, the actual junction temperature is elevated due to self-heating. Normally, a set of various current values is injected into the device for the calibration, including the rated current. Therefore, the junction temperature is increased by the large current and cannot be maintained at the known environment temperature. Secondly, the large test current possibly gives prominence to parasitic effects which can cause reading errors.

#### 3) Threshold voltage

The threshold voltage was employed as the TSEP for MOSFETs by Blackburn, Chen, Baliga et al. [160-162] and for IGBTs in the work of Patil, Xiao, Jakopovic et al. [54, 141, 163]. The threshold voltage of MOS-gated devices decreases with increasing temperature due to an increase in the intrinsic carrier concentration [162]. The threshold voltage is not measured directly but is inferred from measurements made at a fixed collector current. For a given collector–emitter voltage across an IGBT, the gate–emitter voltage was observed to decrease to maintain the same collector current at elevated temperatures [54, 141, 163]. The sensitivity of the K-factor of  $V_{GE(th)}$  was investigated and compared with that of the p-n junction by Cao [141].

Figure 4.16 shows the comparison result.  $V_{GE(th)}$  is much more sensitive than the p-n junction voltage. More curves of junction temperature versus  $V_{GE(th)}$  are plotted in Figure 4.17(a) when the IGBT collector current was changed from 1 to 2 mA during the measurement period. It clearly shows that there is only a 5% change in  $V_{GE(th)}$  when the collector current changes from 1 to 2 mA. Figure 4.17(b) shows the impact of  $V_{CE}$  on the K-factor of the IGBT. It can be seen that the K-factor remains constant even if  $V_{CE}$  changes from 2.5 to 10V. These results show that  $V_{GE(th)}$  is a candidate for a robust TSEP.



**Figure 4.16** Measured K-factor for the forward voltage of two power diodes in series and the gate–emitter voltage for an IGBT (Drive current for the power diodes is 0.4 mA; drive current for the IGBT is 1.5 mA; the collector–emitter voltage  $V_{CE}$  for the IGBT is 5V) [141]



**Figure 4.17** Sensitivity analysis of the K-factor for IGBT IRG4CH30K: (a) K-values with different collector current ( $V_{CE}=5V$ ); (b) K-values with different collector-to-emitter voltage ( $I_C=1mA$ ) [141].

#### 4) Saturation current

The saturation current at a specified gate voltage value is proposed as a TSEP in the work conducted by Ammous, Bergogne, Hefner et al. [164-166]. All references report that the saturation current shows a large dependency on the channel temperature, compared to the threshold voltage [164]. In Ammous and Bergogne's work [164, 165], the saturation current measurement is implemented by applying a low gate-source voltage (slightly smaller than the threshold voltage  $V_{GS(th)}$ ). A low gate-source voltage is preferred since the sensitivity is far higher than at a high gate-source voltage [164], and therefore the signal-to-noise ratio is improved. Typically, saturation current calibration/sensing is completed with less than 1/100 of the rated current under low gate-source voltage and with pulse mode measurement, which can prevent significant self-heating [165].

The IGBT saturation current is given by Ammous and Bergogne [164, 165], as shown in Equation (4.22)

$$I_{s} = \left(1 + \beta_{PNP}(T_{PNP})\right) \cdot \frac{\mu_{ns}(T_{ch0}) \cdot C_{OX} \cdot Z_{C}}{2L_{C}} \cdot \left(V_{GS} - V_{GS(th)}(T_{ch0})\right)^2 \tag{4.22}$$

where  $I_s$  is the saturation current,  $\beta_{PNP}$  is the current gain of the internal PNP bipolar transistor in the IGBT (for MOSFETs:  $\beta_{PNP} = 0$ ),  $\mu_{ns}$  is the surface mobility of electrons in the channel,  $C_{OX}$  is the oxide capacitance per unit area,  $Z_C$  is the channel width,  $L_C$  is the channel length,  $V_{GS}$  is the gate-source voltage and  $V_{GS(th)}$  is the threshold voltage.

The surface mobility of electrons in the channel  $\mu_{ns}$  and the threshold voltage  $V_{GS(th)}$  depend on the temperature in the middle region of the channel ( $T_{ch0}$ ), while the PNP transistor current gain depends on the temperature in the bipolar transistor base ( $T_{PNP}$ ). The threshold voltage shows only a small dependence on the channel temperature, while the saturation current changes noticeably with channel temperature and thus has a good signal-to-noise ratio [164, 165]. In addition, the measurement set-up for the saturation current is simpler than that for the threshold voltage.

#### 5) Emitter to Gate Voltage

Another parameter that is commonly used for TSEP measurement is the emitter-gate voltage, which is often measured as a negative voltage [167-169]. This parameter can be measured using an apparatus similar to that used for base-emitter measurements of a bipolar transistor. Essentially, the temperature dependencies of the gate-emitter voltage are the same as described for the saturation current. The only difference is that, during testing, instead of applying a known gate voltage and measuring the resulting saturation current, a known current is applied at a given voltage and the required gate voltage is measured. Measurements of the gate voltage are often made in the linear region; therefore, the current and voltage applied during the measurement phase are fairly small, which also limits self-heating in the device.

#### 6) dV/dt, dI<sub>C</sub>/dt, $t_{d(on)}$ , and $t_{off}$

The temperature dependency of the IGBT voltage slope dV/dt [170], the maximum current slope at turn-on (dI<sub>C</sub>/dt)<sub>max</sub> [171, 172], the dI<sub>C</sub>/dt at a constant  $V_{ge}$  close to  $V_{th}$  [171], the turn-on delay time  $t_{d(on)}$  [171, 172] and the turn-off time  $t_{off}$  (the sum of the turn-off delay time  $t_{d(off)}$  and the collector current fall time  $t_f$ ) [172] enables their potential use as a means of detecting the junction temperature. These measurements can easily be made under normal operating conditions when the device is operated in pulsed mode. This technique provides the thermal impedance of the module in the time domain, allowing the calculation of the temperature. The maximum current slope at turn-on, the turn-on delay time and the turn-off time are compared in Kuhn's study [172]. The latter two TSEPs, or a combination of both, tend to be preferable. However, because these parameters are measured during the switching transients, they provide only limited accuracy, e.g. within 20°C for  $t_{d(on)}$ , and cannot meet the accuracy requirement of this project.

#### 7) Avalanche breakdown voltage

It has also been reported that the avalanche breakdown voltage of an IGBT increases with temperature, with a positive temperature coefficient of  $0.7V/^{\circ}C$  [173]. However, as this kind of measurement operates the device at its limit, it is not a preferred option for TSEP measurement.

# **CHAPTER 5**

# **IN-SITU MEASUREMENT CIRCUIT**

This Chapter proposes, for the first time, an in-situ health measurement technique for power modules used in electric cars. The health monitoring circuit is able to detect bond wire lift-off and solder cracks at an early stage, giving warning well before the power device fails or the car is next serviced. The Chapter presents the circuit, its individual functions and their operation and control. The Chapter also describes the experimental set-up for the in-situ health monitoring system. Results on operation, accuracy and reproducibility are shown in Chapter 6.

# 5.1 Design and Experimental Setup for the In-situ Health Monitoring

Chapter 3 described health measurement techniques for power modules. For the two most commonly observed thermomechanical failure modes, bond wire lift-off and solder fatigue, precursor parameter monitoring based on a data-driven method is effective in monitoring the device's state of health. Specially designed laboratory experiment circuits [42, 96, 97, 137] or off-the-shelf equipment [84-88, 174] were normally employed at a stage where a power module had not yet become an integral part of the inverter. Once a power module has become an integral part of the converter, such measurements become unattainable. Thus, the major drawback of these methods is that they are usually restricted to offline diagnosis and are unsuitable for general industrial applications. It was therefore the focus of this research project to develop measurement techniques and methods that allow an accurate health measurement of power modules which are already an integral part of an inverter, where the inverter is part of the electric power drive train of an electric vehicle.

For general applications, it is ideal to have an in-situ monitoring system which provides condition monitoring in the field. Nonetheless, this calls for considerable development effort and capital cost, and some technical obstacles must be overcome. The proposed in-situ measurement circuit embedded in an inverter for an electric car is based on the concepts described in Chapter 4. Effectively, the proposed in-situ measurement circuit is a modified, scaled-down version of high-precision laboratory measurement circuitry, designed for the harsh environments experienced by electric vehicles. Such an in-situ measurement faces harsh conditions in terms of power inverter switching transients, full temperature range, common-mode voltages, fluctuating ground potentials, humidity and vibration. Vibration and humidity issues are outside the scope of this thesis.

To demonstrate the concept and verify feasibility of the proposed in-situ measurement a prototype test system has been constructed. The sample devices examined in this study are non-punch through (NPT) IGBTs (600V and 75A) from Semikron. The sample devices fulfil industry standards and have a footprint of 94mm by 34mm. The usual silicon gel is not present in the modules for the ease of bond wire lift-off test and thermal camera measurement. Each sample module contained two series connected IGBTs and anti-parallel diodes, forming an inverter leg. Characterization tests were performed on individual devices.

#### 5.1.1 The circuit design

The principal schematic of the proposed in-situ health monitoring circuit is shown in Figure 5.1. A three-phase inverter is connected to a dc-link capacitor at the input side (in the experimental setup a dc-link voltage of  $V_{DC}$ = 200V was used with an input capacitor of  $C_{DC}$ = 1.88mF) and to a motor at the output side. The in-situ circuit comprises a) six modified driver circuits, each including highly accurate voltage measurement facilities and data processing tools, b) two controlled current sources and c) selector switches.



**Figure 5.1** Principal schematic of the proposed test circuit (M stands for motor and V1-V6 is interface to the isolated data acquisition circuit).

One of the biggest concerns with in-situ measurements is the noise generated by the switching devices. For example, it was shown in the previous chapters that TSEP measurements require millivolt and milliamp accuracy in order to determine the junction temperature. Measuring voltages and currents with such accuracy during inverter switching is only achievable with high-end filters, shielding and other expensive means of minimising noise. This would make the in-situ measurement too expensive and unattractive. For that reason the proposed in-situ circuit operates only when the electric car is not moving (e.g. at a red traffic light or in stop-and-go traffic). In this mode the inverter that powers the electric motor is not switching (except in hill-holding mode) and the in-situ health monitoring circuit becomes active. The proposed technique therefore does not monitor the health of the power devices continuously. However, continuous health monitoring is not required, as bond wire lift-off and solder joint fatigue are slow degradation processes. Table 1.1 gave a typical overview of operational lifetime requirements in HEVs; a lifetime of 131,400 device hours is required for power modules in HEV applications [3]. The stop-and-go nature of urban/suburban driving allows the monitoring process to be carried out frequently. Driving cycles and load current patterns based on the New European Driving Cycle (NEDC) standard are shown in Figure 5.2. The vehicle is expected to stop for 818s per hour with a mean stop time of 21s per stop for the urban/suburban drive cycle [175]. The health monitoring requires 92 seconds to check all IGBTs and diodes in the inverter. This means that the condition of all devices is known within five stops.
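The stop-time budget quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch using only the three figures given in the text; the function names are illustrative, and it assumes that monitoring time simply accumulates across stops with no restart overhead.

```python
import math

# Figures quoted in the text for the NEDC urban/suburban cycle.
STOP_TIME_PER_HOUR_S = 818.0   # total stationary time per hour of driving
MEAN_STOP_DURATION_S = 21.0    # mean duration of a single stop
FULL_CHECK_DURATION_S = 92.0   # time needed to check all IGBTs and diodes


def stops_needed(check_s: float, stop_s: float) -> int:
    """Mean-length stops needed to accumulate check_s seconds of monitoring.

    Assumption: measurement time adds up across stops without any overhead.
    """
    return math.ceil(check_s / stop_s)


def full_checks_per_hour(stop_time_per_hour_s: float, check_s: float) -> float:
    """Upper bound on complete inverter checks per hour of driving."""
    return stop_time_per_hour_s / check_s


if __name__ == "__main__":
    print(stops_needed(FULL_CHECK_DURATION_S, MEAN_STOP_DURATION_S))
    print(round(full_checks_per_hour(STOP_TIME_PER_HOUR_S, FULL_CHECK_DURATION_S), 1))
```

Under these assumptions 92 s of monitoring fits into five mean-length stops, and the stationary time in one hour of urban/suburban driving would allow almost nine complete checks.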



**Figure 5.2** (a) New European Driving Cycle (NEDC) based urban/suburban drive cycles and (b) the relevant load current [175]

The operation of the circuit can be described with the flowchart shown in Figure 5.3a. If the car stop interval is not sufficient to complete the monitoring process, an interrupt routine is activated that halts the health monitoring process. Information on the device being measured is stored, and further measurements take place on the same device once the car has stopped again. The interrupt service routine (ISR) is shown in Figure 5.3b.
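The resume-after-interrupt behaviour can be sketched as a small state holder. This is an illustrative model only, not the firmware of Figure 5.3: the class name, the fixed per-device measurement time and the idea of modelling the interrupt as an exhausted time budget are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

# Test order: all IGBTs first, then all diodes.
DEVICES = ["T1", "T2", "T3", "T4", "T5", "T6",
           "D1", "D2", "D3", "D4", "D5", "D6"]


@dataclass
class HealthMonitor:
    seconds_per_device: float   # hypothetical time for one complete device measurement
    next_index: int = 0         # device to start (or restart) with at the next stop
    completed_passes: int = 0   # number of full passes over all devices

    def __post_init__(self) -> None:
        if self.seconds_per_device <= 0:
            raise ValueError("seconds_per_device must be positive")

    def run_during_stop(self, stop_duration_s: float) -> List[str]:
        """Measure as many devices as fit into one vehicle stop."""
        measured = []
        remaining = stop_duration_s
        while remaining >= self.seconds_per_device:
            measured.append(DEVICES[self.next_index])
            remaining -= self.seconds_per_device
            self.next_index = (self.next_index + 1) % len(DEVICES)
            if self.next_index == 0:
                self.completed_passes += 1
        # The leftover time is too short for another measurement. This models
        # the interrupt: next_index still points at the unfinished device, so
        # the next stop resumes with that same device.
        return measured


if __name__ == "__main__":
    monitor = HealthMonitor(seconds_per_device=8.0)
    print(monitor.run_during_stop(21.0))
    print(monitor.run_during_stop(21.0))
```

Keeping only the index of the next device as persistent state is the simplest way to make the routine safe to interrupt at any time; a real implementation would also have to discard partial samples of the interrupted device.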



Figure 5.3 Flowchart of (a) operation of health monitoring and (b) ISR

In the following, the voltage measurement circuit, the current sources and the selector are described in detail.

#### 1) High common mode voltage isolation

The top-leg IGBT and anti-parallel diode present the challenge of high common-mode voltage isolation. In order to protect the control circuit from high voltage, digital isolation is used. Figure 5.4 shows the principal schematic for one bridge leg with a shunt current sensor and the high-side in-situ  $V_{CE(on)}$  (using terminals 2'-2) and  $V_F$  (using terminals 2-2') measurement circuit based on digital isolation technology.



Figure 5.4 Bridge leg and high side in-situ V<sub>CE(on)</sub> and V<sub>F</sub> measurement circuit

The device terminals 2'-2 are connected to an overvoltage protection circuit and a buffer, which are described later in detail. The analogue output voltage is then digitised by an ADC and the digital information is isolated before being fed to a microcontroller for processing. Figure 5.5 shows the isolated data acquisition circuit in detail. The front-end measurement system includes an isolated 16-bit analogue input signal interface for the power converter. Channels for the top bridge devices and the bottom bridge devices are banked in two groups (Bank 1 and Bank 2). Bank 1 shares a single reference point, ISO1, which is the positive DC bus voltage, and Bank 2 shares a single reference point, ISO2, which is the negative DC bus voltage. The two banks are isolated from each other. All inputs are differential to minimise exposure to noise, and each input gain can be set to 1, 2, 5 or 10. The module can sample signal inputs at 125k samples per channel per second. The circuit is constructed around the ADuM1402C and AD7685 from Analog Devices, Inc. and illustrates the important features of the technique. The ADuM1402C isoPower coupler is used to provide isolated power and isolated data interface signals for the ADC. The AD7685 is a 16-bit ADC with an SPI data interface. An ADR391 is used as the voltage reference. The circuit is shown without many of the passive components, such as bypass capacitors and pull-up resistors, for simplicity, but these components must be added appropriately for a complete schematic. The circuit communicates with the microcontroller (dsPIC30F6014A from Microchip) through an SPI interface.
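To illustrate how a raw sample from the isolated 16-bit ADC maps back to a device voltage, the sketch below converts an ADC code to volts for the selectable gains 1, 2, 5 and 10. It is not taken from the thesis firmware. It assumes a unipolar, straight-binary ADC, a reference voltage of 2.5 V and a gain stage placed in front of the ADC; all three are assumptions of this example, as are the function names.

```python
ADC_BITS = 16
V_REF = 2.5                      # assumed reference voltage in volts
ALLOWED_GAINS = (1, 2, 5, 10)    # input gain settings named in the text


def lsb_volts(gain: int = 1, v_ref: float = V_REF) -> float:
    """Input-referred size of one ADC step for the given gain."""
    if gain not in ALLOWED_GAINS:
        raise ValueError("gain must be one of %s" % (ALLOWED_GAINS,))
    return v_ref / (2 ** ADC_BITS) / gain


def code_to_volts(code: int, gain: int = 1, v_ref: float = V_REF) -> float:
    """Convert a straight-binary ADC code to the voltage at the amplifier input."""
    if not 0 <= code < 2 ** ADC_BITS:
        raise ValueError("code out of range for a 16-bit converter")
    return code * lsb_volts(gain, v_ref)


if __name__ == "__main__":
    print(code_to_volts(32768))          # mid-scale code at gain 1
    print(lsb_volts(gain=10) * 1e6)      # step size in microvolts at gain 10
```

With these assumed values one step corresponds to roughly 38 µV at gain 1 and 3.8 µV at gain 10, which is well below the millivolt-level changes that the TSEP measurement has to resolve.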



Figure 5.5 Isolated data acquisition circuitry

#### 2) Protection and buffer circuit

The voltage across the IGBT/diode of a power inverter changes from a few hundred volts or kilovolts (in the device off-state) to several volts (in the device on-state) under normal drive operation. Since the off-state voltage, which is of the same level as the DC-link voltage, normally exceeds the maximum allowable input voltage of conventional data-acquisition devices (e.g. op-amps), an additional protection and buffer circuit is required to protect the voltage measurement circuit against transient voltage spikes and the high voltage level during the off-state. This protection and buffer circuit must be able to measure the low voltages during the ON-state (typically < 3V for IGBT and diode) with considerable accuracy and resolution, and to block the unwanted high voltages. It is embedded in the IGBT driver circuits, and two different types of protection circuit are proposed for medium (up to 600V) and high (600V-2500V) off-state voltages, which are shown in Figure 5.6(a) and (b), respectively.

Since the voltage clamping circuit has inevitable leakage in the power switch off-state, protection relays are used as an alternative at high DC-link voltages, as shown in Figure 5.6(b). The voltage amplifier is gated so as to blank out the switching transients and the high voltage in the device off-state. For example, the relay is turned on 1µs after the IGBT/diode is turned on and turned off 1µs before the device is turned off. However, this buffer circuit involves a compromise between the maximum acceptable DC-link voltage and the duration of the monitoring intervals.



**Figure 5.6** Protection circuits for different off-state voltage levels (a) up to 600V; (b) 600V-2500V

For the EV application in this study, with a DC-link voltage of typically 100-400V, the protection circuit using a Zener diode is chosen; its circuit diagram, including parasitic parameters, is shown in Figure 5.7.



**Figure 5.7** Overvoltage protection circuit (a) Diode equivalent circuit; (b) Overvoltage protection circuit

Figure 5.7 illustrates the proposed overvoltage protection circuit for the IGBTs/diodes in the OFF-state. This circuit consists of a current limiting resistor ( $R_{LIMIT}$ ) and a Zener diode (Z1) in series with a diode (D1), connected between the non-inverting and inverting inputs of the op-amp. The Zener diode is used to clamp the OFF-state voltage to the reverse Zener voltage, which should be rated higher than the maximum voltage drop of the IGBT during the ON-state. However, the Zener voltage should be smaller than the absolute maximum allowable input voltage of the op-amp to prevent the op-amp from saturating. Generally, a Zener diode with a reverse voltage of 10V or less is used.

In order to limit the current through the Zener diode during the OFF-state, a resistor  $R_{LIMIT}$  is inserted. The measured on-state voltage, however, is reduced by the voltage drop across  $R_{LIMIT}$  during the ON-state. Therefore,  $R_{LIMIT}$  must be chosen carefully. On the one hand, a large value of  $R_{LIMIT}$  is preferred: during the OFF-state a large limiting resistor reduces the current passing through the Zener diode, which reduces the power rating required for the limiting resistor, the Zener diode and the diode. The current is limited by  $R_{LIMIT}$  and can be calculated as:

$$I_{LIMIT} = \frac{V_{DC} - V_Z - V_F}{R_{LIMIT}}$$
(5.1)

where  $V_{DC}$  is the DC-link voltage of the inverter,  $V_Z$  is the Zener voltage and  $V_F$  is the forward drop voltage of the diode.
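As a quick numerical check of Equation (5.1), the sketch below evaluates the OFF-state clamp current and the resulting dissipation in the limiting resistor. The DC-link voltage of 200V and the limiting resistor of 100kΩ are the prototype values stated in this chapter; the Zener voltage of 5.6 V and the diode forward drop of 0.7 V are assumed nominal values chosen for illustration, and the function names are hypothetical.

```python
def clamp_current(v_dc: float, v_z: float, v_f: float, r_limit: float) -> float:
    """OFF-state current through the clamp, Equation (5.1)."""
    if r_limit <= 0:
        raise ValueError("r_limit must be positive")
    return (v_dc - v_z - v_f) / r_limit


def resistor_power(current: float, r_limit: float) -> float:
    """Power dissipated in the limiting resistor at the given current."""
    return current ** 2 * r_limit


if __name__ == "__main__":
    V_DC = 200.0       # DC-link voltage of the experimental setup (from the text)
    R_LIMIT = 100e3    # prototype limiting resistor (from the text)
    V_Z = 5.6          # assumed nominal Zener voltage
    V_F = 0.7          # assumed diode forward drop

    i_limit = clamp_current(V_DC, V_Z, V_F, R_LIMIT)
    print(i_limit * 1e3)                      # clamp current in mA
    print(resistor_power(i_limit, R_LIMIT))   # resistor dissipation in W
```

With these assumed values the clamp current is just under 2 mA and the resistor dissipates a little under 0.4 W for as long as the device stays off, which shows why a large limiting resistor relaxes the component ratings.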

On the other hand, a small value of  $R_{LIMIT}$  is advantageous to increase the bandwidth and reduce the response time of the data-acquisition system. As will be shown later a current pulse train is injected into the device during the measurement mode. The actual voltage measurement, however, will not take place until the parasitic capacitance  $C_P$ across the protection diodes is charged (Figure 5.7). This can be explained with the help of Equation (5.2).

$$V_{M} = V_{S} (1 - e^{-t/R_{LIMIT}C_{P}})$$
(5.2)

In Equation (5.2),  $V_M$  is the actual measured voltage,  $V_S$  is the signal source voltage and  $C_P$  is the effective capacitance of both diodes, where  $C_Z$  is the Zener stray capacitance and  $C_D$  is the diode stray capacitance. Hence, a resistor of low value is preferred to improve the bandwidth. In our prototype circuit,  $R_{LIMIT}$  is 100k $\Omega$ , the Zener diode is a 1N5339B, the diode is a 1N4001 and the op-amp is an AD8615.
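The trade-off expressed by Equation (5.2) can be made concrete by computing how long the measured voltage takes to settle to within a given error of the source voltage. This is a sketch under stated assumptions: the limiting resistor of 100kΩ is the prototype value, whereas the effective parasitic capacitance of 10 pF is a purely hypothetical figure (the thesis does not quote one here), and the function names are illustrative.

```python
import math


def measured_fraction(t: float, r_limit: float, c_p: float) -> float:
    """V_M / V_S after time t, following Equation (5.2)."""
    return 1.0 - math.exp(-t / (r_limit * c_p))


def settling_time(rel_error: float, r_limit: float, c_p: float) -> float:
    """Time until V_M is within rel_error (e.g. 1e-3 for 0.1 %) of V_S."""
    if not 0.0 < rel_error < 1.0:
        raise ValueError("rel_error must lie between 0 and 1")
    return -r_limit * c_p * math.log(rel_error)


if __name__ == "__main__":
    R_LIMIT = 100e3    # prototype value from the text
    C_P = 10e-12       # hypothetical effective parasitic capacitance

    print(measured_fraction(R_LIMIT * C_P, R_LIMIT, C_P))   # after one time constant
    print(settling_time(1e-3, R_LIMIT, C_P) * 1e6)          # 0.1 % settling time in µs
```

Because the settling time scales linearly with the product of resistance and capacitance, halving the limiting resistor halves the waiting time before a valid sample can be taken, at the cost of doubling the OFF-state clamp current.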

#### 3) Current sources

The in-situ measurement circuit is equipped with two current sources. The small current source used for TSEP sensing is called  $I_{sense}$  and provides a continuous current of 100mA. This current source is built around the LT3080 current source IC from Linear Technology Corporation, and its schematic is shown in Figure 5.8. The second current source is called High Current and provides a continuous current of up to 80A. Although an 80A current source could be constructed for the in-situ measurement, it was decided to use the LAB-SM 320 DC power supply instead because of the limited time available for this research. The current source  $I_{sense}$  has a typical output noise of 1nA and the LAB-SM 320 current source has a typical output ripple of 0.2%.



Figure 5.8 Schematic of sense current source

#### 4) Selector

As shown in Figure 5.1, a relay arrangement is used to connect the current sources to the selected IGBT or diode. The setting is controlled by the MCU; the first device tested is IGBT T1, followed by IGBT T2 and so on. After all IGBTs have been measured, the MCU starts the diode measurements (starting with diode D1). Each IGBT/diode is measured individually and the relays determine the sequence of the measurement (see Table 5.1). Only the IGBT under test is turned on and all other switches are off, so there is no load current (if a diode is under test, all switches are turned off). The selector circuit is made of DG85C automotive relays from DURAKOOL, Inc.

Table 5.1 Individual voltage measurements with different relay settings

| Relay state | S1 on, DPCO on | S1 on, DPCO off | S1 off, DPCO on | S1 off, DPCO off |
|----------------|----|----|----|----|
| S2 on | T1 | D1 | T2 | D2 |
| S3 on | T3 | D3 | T4 | D4 |
| S4 on | T5 | D5 | T6 | D6 |
| S2, S3, S4 off | no assistant power going through any device | | | |
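The relay logic of Table 5.1 amounts to a lookup from three relay states to the device under test. The following sketch encodes the table as a dictionary; the function and variable names are illustrative and are not taken from the thesis firmware, and it assumes that at most one of S2, S3 and S4 is closed at a time.

```python
from typing import Optional

# (phase switch, S1 on?, DPCO on?) -> device under test, as listed in Table 5.1.
RELAY_TABLE = {
    ("S2", True, True): "T1", ("S2", True, False): "D1",
    ("S2", False, True): "T2", ("S2", False, False): "D2",
    ("S3", True, True): "T3", ("S3", True, False): "D3",
    ("S3", False, True): "T4", ("S3", False, False): "D4",
    ("S4", True, True): "T5", ("S4", True, False): "D5",
    ("S4", False, True): "T6", ("S4", False, False): "D6",
}


def selected_device(phase_switch: Optional[str], s1_on: bool, dpco_on: bool) -> Optional[str]:
    """Return the device connected to the current sources, or None if S2-S4 are all off."""
    if phase_switch is None:
        return None
    return RELAY_TABLE[(phase_switch, s1_on, dpco_on)]


if __name__ == "__main__":
    print(selected_device("S2", True, True))
    print(selected_device("S3", False, False))
    print(selected_device(None, True, True))
```

In this reading of the table, S2 to S4 select the inverter leg, S1 selects the upper or lower device of the leg, and the DPCO relay selects between the IGBT and its anti-parallel diode.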

#### 5.1.2 Test circuit

For the prototype test system only one phase was built, for simplicity. However, the devices of the other two phases would be monitored in exactly the same way as in this test circuit. The test bench is controlled by a combination of a control board with a digital signal controller (dsPIC30F6014A from Microchip) and a CompactDAQ data-acquisition system (cDAQ-9174 and modules from National Instruments). The microcontroller is used to control the test sequence and the LabVIEW application is used to coordinate the test current injection with the voltage, current and temperature acquisition.



Figure 5.9 System schematic of experiment set-up

The system schematic is shown in Figure 5.9. The connections between the control board and the high voltage power circuit are secured by fibre optic devices and relays. Data transfer between the ADC on the measurement board and the control board is realised by digital isolation. Data are transferred from the data-acquisition hardware to the host computer via Universal Serial Bus (USB) for post-processing with LabVIEW and Matlab. The schematics of the drive circuit and the relay drive circuit are shown in Appendix C.

# 5.2 Voltage Measurement

The proposed health monitoring tests need to measure the device voltage drop, junction temperature and load current, all of which ultimately reduce to the measurement of voltage signals.

High accuracy and temporal resolution are prerequisites of the measurement set-up. Figure 5.10(a) shows the first measurement circuit built, which uses difference amplifiers (AD629B) as the front-end data acquisition device. Voltage signals are acquired with the 12-bit ADC of the dsPIC30F6014A. The AD629 is a difference amplifier equipped with high precision voltage dividers based on laser-trimmed resistor networks. It tolerates a common-mode voltage of  $\pm 270$  volts, which is adequate for the experimental conditions. This measurement circuit requires little design effort and is cost-effective. However, the large effective resistance of the buffer circuit unbalances the input resistor network and therefore introduces a common-mode error, which needs to be compensated by subsequent circuits. In addition, a train of test current pulses with a magnitude of up to 40 A was found to induce ground noise large enough to disable the microcontroller. Therefore, a second measurement circuit with isolated data acquisition was implemented, as shown in Figure 5.10(b).



**Figure 5.10** Voltage measurement circuits (a) with difference amplifier (b) with digital isolation

In this circuit, the floating input signal is acquired with an op-amp and then digitised by a dedicated A/D converter (AD7685), which is powered by an isolated power supply; the output of the A/D converter is transmitted across an isolation barrier via a digital isolator (ADuM1402C). This circuitry is capable of voltage measurement under high common-mode voltage and has a good response time (less than 4 µs) for the designated application. However, achieving high accuracy with multi-channel inputs demands considerable printed circuit board design and signal conditioning expertise. Therefore, a compact data acquisition system with digital isolation from National Instruments is adopted in the experiments. All voltage measurements were made with a National Instruments acquisition module (NI 9206, 16-bit A/D converter with isolation). The voltage acquisition is performed at a sampling frequency of 125 kHz. The voltage uncertainty is less than  $\pm 0.8$  mV over the measurement range at room temperature.

#### **5.3** Measurement of Junction and Case Temperatures

Both junction-to-case and junction-to-heat sink thermal resistances/impedances can be used as precursors to detect solder degradation. Take junction-to-case thermal resistance for example. It requires the measurement of the dissipation power P, the junction temperature  $T_j$  and the case temperature  $T_c$ . The dissipation power can be controlled and measured with device voltage drop and current during conduction. The in-situ measurement measures the junction temperature with the help of TSEP. The case temperature is measured with the help of a thermocouple attached to the case.
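The thermal resistance calculation described above reduces to two short formulas: the dissipated power is the product of on-state voltage and current, and the thermal resistance is the temperature difference divided by that power. The sketch below shows this with purely hypothetical operating values; none of the numbers are measurements from this work.

```python
def dissipated_power(v_on: float, current: float) -> float:
    """Conduction loss P from the device voltage drop and the conducted current."""
    return v_on * current


def thermal_resistance(t_junction: float, t_reference: float, power: float) -> float:
    """Junction-to-reference thermal resistance in K/W."""
    if power <= 0:
        raise ValueError("power must be positive")
    return (t_junction - t_reference) / power


if __name__ == "__main__":
    # Hypothetical operating point, for illustration only.
    p = dissipated_power(v_on=1.8, current=50.0)                  # about 90 W
    r_thjc = thermal_resistance(t_junction=85.0, t_reference=58.0, power=p)
    print(p, r_thjc)
```

The same function applies to the junction-to-heat sink resistance; only the reference temperature changes.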

#### **5.3.1** Junction temperature measurement

The proposed health monitoring method for both bond wire lift-off and solder fatigue requires the accurate measurement of junction temperature. This is necessary both for diagnosing bond wire degradation and for performing thermal characterisation during device aging. The TSEP method is widely used to obtain the junction temperature indirectly in the laboratory environment. The research work here describes how TSEPs can also be used for in-situ measurements. Among the various TSEPs discussed in Chapter 4, the on-state voltage ( $V_{CE(on)}$ ) for IGBTs and the forward voltage drop ( $V_F$ ) for freewheeling diodes are selected. The selected TSEPs can be calibrated with low current injection (also called the switched method) or high current injection (also called the non-switched method). The switched method with low current injection is used in this thesis because it is more accurate than the non-switched method when bond wire degradation occurs. Nevertheless, the calibration of the TSEPs using the non-switched method was also recorded and is shown in Appendix D.



Figure 5.11 The TSEP test setup for IGBT modules

Figure 5.11 shows the TSEP test setup for IGBT modules. An open IGBT module (SKM 50GB063D from Semikron) is the device under test (DUT). This module is mounted on a temperature-controlled hotplate, and a thin layer of thermal grease is sandwiched between the baseplate and the hotplate in order to minimise the thermal contact resistance. A K-type thermocouple is located in the copper substrate as close as possible to both silicon chips. Two more thermocouples are glued in grooves on the hotplate's top surface, with their tips directly beneath the IGBT and diode dies, to monitor the hotplate conditions. The TSEP calibration is performed once thermal equilibrium has been reached after a sufficient settling time.

The switched method is used for junction temperature measurement in the proposed health monitoring process. With this method, the high-level current, either the normal load current or the test current for health monitoring, needs to be switched to the low sense current for temperature measurement. The load current  $I_L$  or test current  $I_{test}$  is interrupted repetitively to perform the TSEP measurements at the low sense current  $I_{sense}$ . Designated test pulse patterns are used, with the sense current continuously injected into the device under test. This is discussed in detail in Sections 6.1 and 6.2.
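A TSEP calibration of this kind is commonly reduced to a straight-line fit of the sensed voltage against the hotplate temperature, which is then inverted to read the junction temperature from a voltage sample. The sketch below shows that idea with an ordinary least-squares fit. The calibration points are hypothetical, exactly linear values chosen for illustration and are not data from this work; the function names are likewise assumptions.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching calibration points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def junction_temperature(v_sense: float, slope: float, intercept: float) -> float:
    """Invert the calibration line to obtain the temperature from a sensed voltage."""
    return (v_sense - intercept) / slope


if __name__ == "__main__":
    # Hypothetical calibration at the sense current: temperature in °C, voltage in V.
    temps = [25.0, 50.0, 75.0, 100.0, 125.0]
    volts = [0.55, 0.50, 0.45, 0.40, 0.35]

    k, v0 = fit_line(temps, volts)
    print(k * 1e3)                                # fitted slope in mV/°C
    print(junction_temperature(0.42, k, v0))      # temperature for a 0.42 V reading
```

The fitted slope plays the role of the temperature coefficient of the TSEP, so its magnitude sets how many millivolts of measurement uncertainty correspond to one degree of temperature error.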

#### 5.3.2 Case temperature measurement

In thermal resistance measurement, the case temperature is always measured at the outside surface of the case directly beneath the chip, and the variation of the surface temperature is minimised by effective cooling. In a real application, however, the case surface temperature is not isothermal. Thus, in order to determine the reference point for case temperature measurement, it is important to investigate the temperature profile at different reference locations. For this purpose, four reference points (TC1-TC4) for attaching thermocouples were investigated, as shown in Figure 5.12. TC1, at the outside surface of the case (base plate) directly below the chip, is the desired point for most applications, and an appropriate method would be to make a groove in the back surface of the case of the power module so that a thermocouple can be glued close under the chip. However, this method was not used in this study because all power devices are scanned using scanning acoustic microscopy (SAM), which allows the detection of delamination and cracks in the internal layers of the device package and requires a flat surface. Therefore, the groove is made in the top surface of the heat sink instead. The tip of the thermocouple is glued to the end of the groove, at the centre directly beneath the die. As a consequence, there is unavoidable interference from the thermal interface material (TIM), such as thermal grease, between the base plate and the heat sink. TC2 is glued onto the top surface of the DCB substrate in close vicinity to the IGBT and diode chips to achieve close thermal coupling. TC3 is embedded in the heat sink, with a distance of 2mm between the top of the hole and the heat sink top surface. TC4 is on the outside edge of the baseplate. An additional point, TC5, is located at the far corner of the DCB top surface and is used for the experiments in Chapter 6. Since it is on a separate copper stripe, heat flows through the substrate ceramic and subsequent layers towards TC5, while the majority of the heat dissipated by the chip flows directly to the heat sink.



Figure 5.12 The reference points for the case temperature measurement

A layer of thermal grease of 30 µm thickness is used as the TIM between the case surface and the heat sink. A constant torque of 6Nm is applied to the terminal screws with a torque wrench in order to maintain a defined pressure when mounting the sample on the heat sink. The construction can be seen in Figure 5.13.



Figure 5.13 Positioning tool and thermocouples attached at reference point

Figure 5.14 shows the heating curves of TC1 to TC4 and the junction temperature during the heating phase. The temperature changes at the reference locations TC1 to TC4 indicated in Figure 5.12 are measured with thermocouples. The junction temperature is measured indirectly with the TSEP method. The sampling frequency is 1 kHz for the junction temperature and 50 Hz for the reference temperature measurement. The heating phase is defined as a 50A current applied for 200s. If the same power dissipation is assumed for the thermal resistance calculation at the different reference points, a lower temperature difference to the junction indicates a smaller thermal resistance. Therefore, the thermal resistances can be ranked as  $R_{thj-TC2} < R_{thj-TC1} < R_{thj-TC4} < R_{thj-TC3}$ . Figure 5.14 shows that TC2 is the hottest reference point of all; however, it is outside the main heat conduction path and ahead of the DCB solder, and is therefore not suitable for thermal resistance calculation. TC1 is the hottest accessible spot along the dominant conduction path of the heat generated by the silicon die and therefore indicates the smallest thermal resistance along that path. The solder resistance only makes up a small part of the junction-to-reference resistance, so its changes have an even smaller impact on the total thermal resistance. For this reason, TC1 is an ideal reference point for fine resolution of degraded solder condition evaluations.



Figure 5.14 Temperature of IGBT chip and reference points (TC1-TC4) at 50A

The thermal time constant of the heat sink is assumed to be long compared to the time constants of each layer. Figure 5.15 shows the pulse duration up to which a 50A current pulse leaves the case temperature constant. If the current pulse is shorter than 200ms, the case temperature can be assumed to be constant. If the heating pulse duration exceeds 200ms, it is no longer valid to assume that the reference temperature is constant, since the pulse time is no longer small with respect to the thermal time constant of the heat sink. However, given constant dissipated power, the temperature difference ( $T_j - TC_i$ ) can reach a relatively stationary value, and the power module assembly can be assumed to be temporarily in thermal equilibrium.



Figure 5.15 Temperature of reference points (1 - 4) with IGBT injected by 50A

# **CHAPTER 6**

# **EXPERIMENTAL RESULTS**

In order to verify the feasibility of the proposed health monitoring system a test circuit was constructed that allows the operation of the in-situ measurement circuitry described in Chapter 5. In addition an accelerated lifetime technique has been applied to emulate aging of the devices in order to track the changes of the three diagnostic parameters:  $V_{CE(h)}$ ,  $V_{F(h)}$  and  $R_{thjc}$ . Simulation of degraded thermal path is implemented with Matlab/Simulink and its result is analysed. Experimental results presented here were taken from a set of power module samples.

## 6.1 Monitoring Bond Wire Lift-off

Chapter 3 reported that bond wire lift-off increases the effective resistance of the module, which consequently causes drifts in the terminal voltages  $V_{CE(on)}$  and  $V_F$ . The proposed in-situ measurement circuit uses the terminal voltages  $V_{CE(on)}$  and  $V_F$  as precursors to monitor the bond wire health of the IGBT and the diode. The bond wire condition diagnosis and monitoring process is carried out at constant load current and constant IGBT gate-emitter voltage, on the assumption that no degradation of electrical conduction occurs at the terminal connections. At constant current and constant gate-emitter voltage, the parasitic inductive and capacitive stray components make no contribution, and only the voltage drop across the series stray resistance  $V_{stray}$  and the on-state voltage of the semiconductor die itself affect the terminal voltages  $V_{CE(on)}/V_F$ .

It should be noted here that although  $V_{CE(on)}$  and  $V_F$  are widely used as failure precursors for bond wire lift-off in the literature, they actually indicate degradation of the electrical conduction path caused by one or a combination of failure modes such as bond wire lift-off, bond wire heel cracking, chip metallization reconstruction and terminal joint cracking [30, 43, 46]. Whilst all of the above degradation modes can change the stray resistances and therefore the terminal voltage, they may also alter the internal current distribution of the silicon dies, given that significant changes to the chip surface interface structure occur under severely degraded conditions, which also contributes to the observed increase of the terminal voltages V<sub>CE(on)</sub> and V<sub>F</sub>. The terminal voltages V<sub>CE(on)</sub> and V<sub>F</sub> therefore represent a good approximation of electrical conduction degradation. Since bond wire lift-off affects the precursors more than the other degradation modes, and is also the main driving force of uneven current distribution in the semiconductor chip, the term is used in this thesis for simplicity instead of citing each of the degradation modes mentioned above. The in-situ bond wire lift-off monitoring requires data from the "healthy" devices. The terminal voltages of the IGBT and the diode are therefore recorded for various junction temperatures  $T_j$  before the IGBT power module becomes an integral part of the inverter. The two functions  $V_{CE(0)}(T_j)$  and  $V_{F(0)}(T_j)$  are called healthy baselines, and each IGBT and diode has its own specific baseline. The baseline is stored in the microprocessor. Once the in-situ measurement provides field data of V<sub>CE(on)</sub> and V<sub>F</sub> for each IGBT/diode, the measured data are compared with the baseline. Any rise above the baseline indicates degradation of the bond wire connection. A 5% increase in on-state voltage is generally proposed as the failure criterion for bond wire lift-off [42, 95, 139].

#### 6.1.1 Terminal voltages and their dependencies

In the laboratory, the on-state voltage drop  $V_{CE(on)}$  is normally measured across the power module terminals. With the ever-increasing current density of power modules, any change in the pure resistance between the collector and emitter terminals results in a change of the terminal voltage. This change is more pronounced at high load currents. In the following sections, voltages that are generated by high currents carry the subscript h and voltages that are generated by low currents carry the subscript l for clarity. At high currents the terminal voltage  $V_{CE(h)}$  is the sum of the voltage drop across the IGBT die (denoted as  $V'_{CE(h)}$ ) and the voltage drop across the series stray resistance (denoted as  $V_{stray(h)}$ ).

$$V_{CE(h)} = V'_{CE(h)} + V_{stray(h)}$$
(6.1)

The IGBT on-state voltage drop  $V'_{CE(h)}$  can be expressed as a function of collector current and junction temperature under constant gate-emitter voltage. If the collector current is constant, the on-state voltage drop  $V'_{CE(h)}$  is sufficiently linear with regard to junction temperature and can be expressed by a straight-line equation as shown in Equation (6.2):

$$V_{CE(h)}' = V_{CE0(h)}' + \varepsilon(I_{C(h)}) \cdot (T_j - T_{j0})$$
(6.2)

where  $V'_{CE0(h)}$  is the on-state voltage at the temperature  $T_{j0}$  and  $\varepsilon$  (mV/°C) is the temperature coefficient of the linearised  $V'_{CE(h)}$  graph at the given collector current  $I_{C(h)}$ .  $\varepsilon$  is also sometimes termed the "K factor" [408]. Assuming that the current  $I_{C(h)}$  is constant,  $V'_{CE(h)}$  depends only on the junction temperature. Consequently the on-state voltage  $V_{CE(h)}$  measured at the power terminals depends on the junction temperature and the stray resistance, as expressed in Equation (6.3):

$$V_{CE(h)} = V'_{CE0(h)} + \varepsilon(I_{C(h)}) \cdot (T_j - T_{j0}) + I_{C(h)} \cdot R_{stray} \quad (6.3)$$

The same statement can be made for diodes. Although  $V_{F(h)}$  is not linearly dependent on junction temperature, at a constant current the forward voltage drop measured across the power terminals is again only a function of the junction temperature and the stray resistance, as shown in Equation (6.4), where  $V'_{F(h)}$  denotes the forward voltage drop across the diode die:

$$V_{F(h)} = V'_{F(h)} + I_{F(h)} \cdot R_{stray}$$
(6.4)

The terminal voltage  $V_{CE(on)}$  is determined by the voltage across the internal stray resistances of the power module and the on-state voltage V'<sub>CE(on)</sub>.
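As a rough illustration of Equation (6.3), the sketch below evaluates the terminal voltage for a healthy and a degraded stray resistance at the same current and junction temperature. The function name and all numerical values are hypothetical and are chosen only to show that, under these conditions, the voltage shift equals $I_{C(h)} \cdot \Delta R_{stray}$.

```python
def terminal_voltage(v_ce0, eps, t_j, t_j0, i_c, r_stray):
    """Terminal on-state voltage following Equation (6.3).

    v_ce0   : die on-state voltage at t_j0 [V]
    eps     : temperature coefficient at the given current [V/degC]
    t_j     : junction temperature [degC]
    t_j0    : reference junction temperature [degC]
    i_c     : collector current [A]
    r_stray : series stray resistance [ohm]
    """
    return v_ce0 + eps * (t_j - t_j0) + i_c * r_stray


# Hypothetical illustration values, not measured data.
common = dict(v_ce0=1.50, eps=2e-3, t_j=75.0, t_j0=25.0, i_c=50.0)
v_healthy = terminal_voltage(r_stray=1.0e-3, **common)
v_degraded = terminal_voltage(r_stray=1.2e-3, **common)

# With current and temperature unchanged, only the resistive term differs.
shift = v_degraded - v_healthy
print(f"healthy {v_healthy:.3f} V, degraded {v_degraded:.3f} V, shift {shift * 1e3:.1f} mV")
```

In this example a 0.2 mΩ increase in stray resistance at 50 A produces a 10 mV shift, which is why the comparison against the baseline has to be made at the same junction temperature.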

The voltage drop  $V_{stray(h)}$  is determined by the stray resistance of the IGBT power module, which consists of (1) the effective resistance of the multiple bond wires and (2) their contact resistance, (3) the sheet resistance of the chip metallization, (4) the chip-to-substrate contact resistance, (5) the resistance of the DCB copper stripes, (6) the power terminal resistance and (7) its contact resistance. Figure 6.1 and Table 6.1 present the static model of the half-bridge IGBT module SKM 50GB063D. Since DCB copper stripes and power terminals are less susceptible to degradation, their stray resistance only induces a constant voltage offset at a given load current. The other four parts, (1) to (4), are in the immediate vicinity of the silicon chip and their degradation has a direct effect on the current distribution and the silicon temperature. Any change in (1) to (4) is picked up by the health monitoring circuit.

**Table 6.1** Equivalent stray resistance of an IGBT module

| Equivalent stray resistance              | Constituent parts |
|------------------------------------------|-------------------|
| $R_{L1}, R_{L2}, R_{L3}, R_{S12}$        | (6)(7)            |
| $R_{T1_C}, R_{T2_C}, R_{D1_K}, R_{D2_K}$ | (4)               |
| $R_{T1_E}, R_{T2_E}, R_{D1_A}, R_{D2_A}$ | (1)(2)(3)         |








**Figure 6.1** (a) Static equivalent resistance model of a half-bridge module; (b) Photo of a half-bridge module

Although bond wire lift-off and metallisation reconstruction are the most frequently observed failures that increase the equivalent stray resistance [30, 46], an increase in the forward voltage drop caused by degrading joints of the terminal connectors has also been observed [43, 79]. Although these degraded terminal joints increase both the electric resistance and the thermal resistance, and hence the terminal temperature, they play only a limited role in raising the chip temperature and changing its current distribution. Terminal joint failures are rare in comparison with bond wire lift-off, which justifies the proposed system. Nevertheless, a mechanism to minimise the impact of terminal bond degradation is proposed here. Rather than measuring the voltage  $V_{CE(on)}$  between the collector and emitter power terminals, the voltage is measured between the collector power terminal and the Kelvin connector as shown in Figure 5.4. A further step can be taken by adding a measurement pin within the power module to measure the collector voltage (see Figure 6.1a). This proposal, however, would need a change in the power module design.

In this thesis, since a healthy terminal joint condition is maintained and no degradation is expected, the voltage drop across the stray resistance (6) should remain constant and will not affect the determination of bond wire lift-off. Hence, the measurement error induced by degraded terminal joints is negligible.

#### 6.1.2 Junction temperature measurement and discussions

Since the chosen TSEPs differ between IGBTs and diodes and, moreover, the temperature dependency of the measured parameter varies from device to device of the same type, the calibrations were performed on each device individually. The selected TSEPs ( $V_{CE(on)}$  and  $V_F$ ) were calibrated at a small current (100 mA); their part-to-part variations across different devices are compared and the effects of bond wire degradation on them are then evaluated. In order to avoid the errors induced by the switched measurement method, the junction temperature of interest is finally estimated by extrapolation.

#### 6.1.2.1 Calibration of TSEPs

Two IGBT power modules from the same production lot were selected in this study. The TSEP calibration was carried out with multiple measurements for both IGBTs and diodes. Figure 6.2a gives the experimental TSEP ( $V_{CE(on)}$  and  $V_F$ ) calibration at a low injected current (100 mA) over the temperature range 20 °C to 140 °C. The calibration data are approximated linearly and can be expressed by a straight-line equation obtained from linear curve fitting in Matlab:

$$TSEP = \varepsilon \cdot T_j + TSEP_0 \tag{6.5}$$

where  $\varepsilon$  (mV/°C) is the temperature dependency of the TSEP (sometimes also termed the "K factor") and TSEP<sub>0</sub> is the Y-intercept or TSEP offset.  $\varepsilon$  is almost constant for both IGBT and diode and has a negative gradient, which demonstrates a negative temperature dependency. From the figure a value of approximately -2.1 mV/°C at 100 mA can be read. Figure 6.2a also shows that the part-to-part variation of the calibration curves for the same batch of modules is negligible. As a consequence, the calibration over the full temperature range and the linear approximation have to be performed only once; a further device can then be recalibrated using a single pair of measured voltage-temperature data. This method is called one point recalibration in [155].
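The calibration of Equation (6.5) and its use for temperature estimation can be sketched as follows. This is a minimal illustration in Python rather than the Matlab fit used in the thesis; the function names are hypothetical and the calibration points are synthetic, generated with the slope of about -2.1 mV/°C quoted above.

```python
def fit_tsep_line(temps_c, volts):
    """Least-squares fit of TSEP = eps * T_j + tsep0; returns (eps, tsep0)."""
    n = len(temps_c)
    mean_t = sum(temps_c) / n
    mean_v = sum(volts) / n
    s_tt = sum((t - mean_t) ** 2 for t in temps_c)
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(temps_c, volts))
    eps = s_tv / s_tt
    return eps, mean_v - eps * mean_t


def one_point_offset(eps, t_ref_c, v_ref):
    """Offset for another device from a single (T, V) pair, reusing the slope."""
    return v_ref - eps * t_ref_c


def junction_temperature(v_meas, eps, tsep0):
    """Invert the calibration line to estimate T_j from a low-current voltage."""
    return (v_meas - tsep0) / eps


# Synthetic calibration points (assumed values, not the data of Figure 6.2a).
cal_temps = [20.0, 50.0, 80.0, 110.0, 140.0]
cal_volts = [0.600 - 2.1e-3 * (t - 20.0) for t in cal_temps]

eps, tsep0 = fit_tsep_line(cal_temps, cal_volts)
t_est = junction_temperature(0.495, eps, tsep0)
print(f"eps = {eps * 1e3:.2f} mV/degC, TSEP0 = {tsep0:.3f} V, T_j = {t_est:.1f} degC")
```

The helper `one_point_offset` reflects the one point recalibration idea: the slope is kept and only the offset is re-derived from one voltage-temperature pair.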

In Figure 6.2b the selected TSEPs under different bond wire degradation conditions (i.e. with different number of lift-off bond wires) are characterized.

Although the terminal voltages depend on the parasitic resistance, the resulting voltage drop is negligible at small values of  $I_{sense}$ . It can therefore be concluded that good consistency and accuracy of the TSEPs can be expected for each device under all test conditions with the proposed techniques.



**Figure 6.2** (a) Part-to-part variation and (b) bond wire degradation interference for  $V_{CE(on)}$  and  $V_F$  as a function of temperature and their linear approximation

#### 6.1.2.2 Junction temperature extrapolations

One should note that the junction temperature can never be measured directly during a heating interval when using the TSEP at low sense current method. A sufficiently long delay time should be allowed because the monitored TSEP voltage is very easily contaminated by noise. First, the large change of collector current from the heating interval to the sensing interval stimulates electrical transient noise and second, the control switch has a recombination time.

Within the transition from high testing current to low sensing current, the junction temperature always slightly decreases to an unknown value below the practical value of interest. In most temperature measurement applications, such as bond wire lift-off monitoring and solder fatigue monitoring, this difference is not taken into account. This is because the measured junction temperature is used as a relative value with the temperature difference taken as an offset.

However, if needed, the temperature difference can be deduced either by simulation or by extrapolation from the continuous electrical measurement after the dead time. The junction temperature, which corresponds to the TSEP, is linear with respect to the square root of the cooling time after the heating pulse [176]. As shown in Figure 6.3, linear regression is performed between the junction temperature and the square root of the cooling time after the heating pulse. The temperature within the dead time can be found by extrapolating the line to the t = 0 axis; the intercept denotes the last temperature before switching, which is about 3.5 °C higher than the first effective measurement.
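The extrapolation described above can be sketched as a linear regression of the junction temperature against the square root of the cooling time. The function name is hypothetical and the cooling samples are synthetic (they are not the data of Figure 6.3); they are generated from an assumed starting temperature of 80 °C purely so that the intercept can be checked.

```python
import math


def extrapolate_to_switch_off(times_s, temps_c):
    """Fit T_j = intercept + slope * sqrt(t) and return (intercept, slope).

    The intercept is the estimated junction temperature at t = 0,
    i.e. at the end of the heating pulse.
    """
    x = [math.sqrt(t) for t in times_s]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(temps_c) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, temps_c))
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


# Synthetic samples taken after an assumed 100 us dead time.
times = [100e-6, 200e-6, 400e-6, 800e-6, 1600e-6]
temps = [80.0 - 200.0 * math.sqrt(t) for t in times]

t_j0, slope = extrapolate_to_switch_off(times, temps)
print(f"T_j at switch-off = {t_j0:.2f} degC, first sample = {temps[0]:.2f} degC")
```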



Figure 6.3 Linear regression for junction temperature extrapolation

High temporal resolution and accuracy of the junction temperature are of paramount importance in TSEP measurement. For the switched measurement method, the temporal resolution is mainly limited by the delay time induced by the electrical transient. The time stamp for the measurement is also subject to the pulse train patterns. Typically, an effective TSEP measurement should be taken no earlier than 50 µs after the switching event. Furthermore, the TSEP measurement in a real application always differs from that in the calibration phase, since both the sense current distribution and the junction temperature gradient differ [153].

#### 6.1.3 Impacts of the gate-emitter voltage

One of the assumptions made for monitoring bond wire lift-off is a constant gate-emitter voltage. However, this in practice is not always the case and a change in gate-emitter voltage has a dramatic effect on the on-state voltage of the device.

The gate emitter voltage  $V_{GE}$  is supplied from the gate driver, which is an integrated circuit including transistor, comparator, and DC/DC converter. The output characteristics are temperature dependent. Even if the temperature dependency can be compensated by using gate-emitter voltage regulation techniques as described in [177], off-the-shelf driver circuits do not have this technique and the effect needs to be considered.



Figure 6.4 Experiment set-up for two driver circuits under test

Figure 6.4 shows the test circuit for the two driver circuits. Driver A from Concept Technology and Driver B, made in house, are both able to drive the SKM 50GB063D IGBT from Semikron. The terminal voltage as a function of the gate drive voltage at various current levels is shown in Figure 6.5a. As expected, a larger collector current produces a higher terminal voltage, whereas a larger gate-emitter voltage reduces the terminal on-state voltage  $V_{CE(h)}$ . Figure 6.5b shows the measured gate voltage as a function of temperature. An increase in temperature also produces an increase in the gate voltage, because the gate driver temperature is affected by both self-heating and the ambient temperature. The slopes of the gate driver output voltages are around 4.8 mV/°C and 5.65 mV/°C for driver circuit A and driver circuit B, respectively. To minimise the changes caused by temperature variation, slope drift cancellation techniques have been reported in the literature [177]. Neither of the tested driver circuits had these techniques. The impact on the terminal voltage is shown in Figure 6.5c: the terminal voltage drifts by +/-20 mV when operating from -50 °C to 50 °C. In the subsequent experiments, the gate drive regulation technique is not included. Instead,  $V_{GE}$  is frequently measured and kept constant by maintaining the gate driver's ambient temperature constant whilst the temperature of the device under test is varied. It is recommended to design a driver circuit that regulates the gate-emitter voltage as described in [177] for the in-situ measurement circuit.



Figure 6.5 Impacts of gate drive temperature variation on the on-state voltage drop

#### 6.1.4 Results and analysis

The aim of the following experiments is to verify the overall feasibility of the proposed approach and the behaviour of the diagnostic parameters. The test is carried out using one open Semikron module (without silicone gel). The gate-emitter voltage is frequently checked. In order to establish that the measured voltage rise is caused by the deterioration of the bond wire connections rather than by the chip temperature, the chip temperature is measured indirectly using  $V_{CE(l)}(T_j)$  as the TSEP, as discussed in Chapter 4. The measurement circuit includes two current sources. I<sub>sense</sub> forces a 100 mA current through the chip to determine the junction temperature. A second, high current source I<sub>test</sub> is also integrated in the inverter and injects a designated pattern of current pulse trains into the IGBT/diode. The current pulses are controlled by an auxiliary IGBT switch. In order to measure the voltage drop across the bond wire connections, a pulse train of 2.5 kHz is injected. The pulse train is shown in Figure 6.6.



Figure 6.6 Diagram of pulse trains for bond wire lift-off monitoring

The voltage  $V_{CE(h)}$  is measured directly at the end of the high current pulse ( $I_{test}+I_{sense}$  in Figure 6.6). Multiple measurements (3 samples) are taken and averaged. In order to link the voltage measurement  $V_{CE(h)}$  with the junction temperature, a second measurement takes place shortly after the 344 µs pulse. The delay time is chosen to be short enough to avoid significant changes in the junction temperature between the two sets of measurements. Multiple measurements (3 samples) of the voltage drop  $V_{CE(l)}$  are recorded and converted into the temperature  $T_j$  using the TSEP calibration curve. Figure 6.7 shows the experimental waveforms. Because the test current takes time to settle to the desired value  $I_{test}$ , a pulse train of up to 16 ms is injected; it can be concluded that a total of 4 ms is sufficient for the data logging per device.



Figure 6.7 Voltage and current measurements during pulse train monitoring

The process takes place in the microprocessor and the flowchart is shown in Figure 6.8. Having associated the junction temperature  $T_j$  with the measured voltage  $V_{CE(h)}$ ,  $V_{CE(h)}(T_j)$  is now compared with the baseline. As long as the difference between the baseline and the measured voltage  $V_{CE(h)}$  is less than 5% at the measured junction temperature, no warning is sent to the driver. Once the bond wire health has received the okay status, the selector chooses the next IGBT or diode for measurement.
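The comparison step can be sketched as follows, assuming that the healthy baseline is stored as a small table of (junction temperature, voltage) pairs and interpolated linearly between them. The storage format, the function names and the baseline values are assumptions made for illustration; the thesis specifies only that the baseline is held in the microprocessor and that a 5% rise is the failure criterion.

```python
def baseline_at(t_j, baseline):
    """Linearly interpolate the healthy baseline voltage at t_j.

    baseline: list of (temperature degC, voltage V) pairs sorted by temperature.
    Values outside the table are clamped to the nearest end point.
    """
    if t_j <= baseline[0][0]:
        return baseline[0][1]
    if t_j >= baseline[-1][0]:
        return baseline[-1][1]
    for (t0, v0), (t1, v1) in zip(baseline, baseline[1:]):
        if t0 <= t_j <= t1:
            return v0 + (v1 - v0) * (t_j - t0) / (t1 - t0)


def bond_wire_status(v_meas, t_j, baseline, limit=0.05):
    """Return ('ok' or 'warning', relative rise) for one IGBT or diode."""
    v_ref = baseline_at(t_j, baseline)
    rise = (v_meas - v_ref) / v_ref
    return ("warning" if rise >= limit else "ok"), rise


# Hypothetical baseline for one device (not measured data).
baseline_igbt = [(25.0, 1.60), (75.0, 1.70), (125.0, 1.80)]

status_a, rise_a = bond_wire_status(1.66, 50.0, baseline_igbt)
status_b, rise_b = bond_wire_status(1.75, 50.0, baseline_igbt)
print(status_a, f"{rise_a:.2%}", status_b, f"{rise_b:.2%}")
```

Interpolating the baseline at the measured $T_j$ is what separates a voltage rise caused by temperature from one caused by degraded bond wire connections.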

In the experiment, bond wire lift-off is emulated by cutting the bond wires individually. Figures 6.9a and 6.9b show, respectively, the drift of the IGBT collector-to-emitter on-state voltage and of the diode forward voltage with junction temperature at different stages of bond wire health. The measured voltages increase by approximately 12 mV for the IGBT and 7 mV for the diode with one bond wire cut. It should be noted that the peak-to-peak accuracy of the in-situ circuit for the voltage measurement is better than 1.2 mV, which is sufficient to detect this  $V_{CE}$  voltage rise. The voltage increases further when a second bond wire is cut. Both figures show that the state of bond wire health of each device can be determined by evaluating its voltage increase.



Figure 6.8 Flowchart of the bond wire lift-off monitoring process



Figure 6.9 Voltages vs bond wire cuts for (a) IGBTs and (b) diodes

The experimental tests brought forward a design specific impact on the in-situ measurement. The SKM 50GB063D power module has one die per IGBT switch with two aluminium metallization pads. Each pad is connected with the DCB using three bond wires (Figure 6.10). Thus each IGBT has a total of 6 bond wires to handle a rated current of up to 75A ( $T_{case}=25^{\circ}C$ ) per die. Consequently each bond wire carries a maximum current of about 12A.



Figure 6.10 Bond wires of IGBTs and diodes

When a bond wire fails, the current density in the die does not change dramatically since the current is still distributed between both of the metallization pads. This leads to a gradual increase in  $V_{CE(on)}$ , mainly due to the increased voltage drop across the bond wires. However, when one pad loses all of its electric connections, the current density in the other pad increases twofold, and this leads to a significant jump in the  $V_{CE(on)}$  measurement, which can be seen in Figure 6.11 at three lifted bond wires.



**Figure 6.11** Relative errors of  $V_{CE(n)}$  and  $V_{F(n)}$  as a function of number of lift-off bond wires

The y-axis in Figure 6.11 shows the relative error for the IGBT and diode. The relative error is given by Equations 6.6 and 6.7:

$$\frac{\Delta V_{CE}}{V_{CE(0)}} = \frac{V_{CE(n)} - V_{CE(0)}}{V_{CE(0)}}$$
(6.6)

$$\frac{\Delta V_F}{V_{F(0)}} = \frac{V_{F(n)} - V_{F(0)}}{V_{F(0)}}$$
(6.7)

where  $V_{CE(0)}$  and  $V_{F(0)}$  are the on-state voltage drop and the forward voltage drop with no bond wire lift-off for the IGBT and diode, respectively.  $V_{CE(n)}$  and  $V_{F(n)}$  describe the on-state voltage and the forward voltage of the IGBT and diode, respectively, with *n* lifted bond wires. Figure 6.11 also shows that the diode does not exhibit a "jump" in the voltage as seen for the IGBT. That is because each diode die has only one pad with six bond wires. Therefore, as the number of lifted bond wires increases, the current density in the die does not change dramatically, and the increased voltage drop across the remaining bond wires results in a gradual increase in  $V_{F}$ .

## 6.2 Modelling and Simulation

This section describes a simulation approach to estimate the thermal characteristics associated with degradation within the solder layer and other critical layers of the IGBT power module. As described in Chapter 4, a thermal network based on a 1-D Cauer model representing the power assembly is presented here. Matlab/Simulink is used to simulate the thermal performance of the IGBT power module. Simulation results and the analysis of different heat path scenarios are compared with the experimental results of the following section. The contribution of individual layers and the effects of their degradation are analysed on the basis of transient thermal impedance curves, which provides a better understanding of the impact of degradation occurring at different layers. In addition, thermal impedance curves with two different thermal pads are compared with the original results, and these results are validated by the experiment in the following section.

#### 6.2.1 Construction of a 1-D Cauer model

In general a Cauer model shows a close relationship with physical reality and is used here to associate the thermal equivalent circuit elements with physical regions. This model is a ladder network, as shown in Figure 4.15. Generally, thermal resistance and thermal capacitance per unit length need to be considered for the thermal model, and a larger number of  $R_{th}C_{th}$  cells achieves higher accuracy. However, a single  $R_{th}C_{th}$  cell of lumped parameters is commonly used to represent each layer of the device for simplicity. In this work, a 1-D Cauer model is constructed based on the multilayer structure of an IGBT power module. Although the thermal resistance and thermal capacitance of each layer can be calculated directly from material data and the given physical dimensions, some sources of inaccuracy remain. The assumptions required when using a 1-D Cauer model are therefore listed below:

1) We assume a uniform virtual temperature of each layer, and a heat spreader with back-surface temperature being held at ambient.

2) Heat transfer by radiation and convection is negligible and is not included in this model. Heat conduction is assumed to be mainly confined to the thermal path of the assembly, in one dimension, leading outwards via the heatsink to the surroundings [351]. Since the interface area of each layer of the internal structure is much larger than its thickness along the heat conducting path, the three-dimensional gradient can be neglected. However, the accuracy is improved by introducing a heat spreading angle.

3) A heat spreading angle of  $45^{\circ}$  is assumed. A heat spreading angle  $\alpha$  is empirically defined as  $45^{\circ}$  for heat conduction in homogeneous media without obstruction by subsequent layers in the heat propagation path [351] (Figure 6.12).



Figure 6.12 Cross section view of the heat conduction in IGBT modules

However, it should be mentioned that the fixed angle model may lead to inaccuracies in many practical situations. In fact, the spreading angle should be made to vary with system characteristics such as material properties, layer size and thickness and boundary conditions.

4) Temperature dependent material properties are omitted. Some material properties such as the thermal conductivity, specific thermal capacity and density are generally temperature dependent especially for silicon.

5) Thermal coupling effects are negligible. In other words, only one heat source is considered.

6) The adhesion between layers is assumed to be perfect and therefore the interface thermal resistance is zero. Although the physical thickness of these interfaces is negligibly small in comparison to the entire layer, their effect on the thermal resistance and thermal capacitance of the thermal network is still unknown in real application.

The device parameters were calculated from standard values of the material specific thermal capacity, density and thermal conductivity and from the element dimensions. Material quantities were taken from the specified datasheets and the dimensions of each layer were given by the manufacturer. As the area of the heat generating silicon die is smaller than that of the subsequent layers along the heat path, a heat spreading angle  $\alpha = 45^{\circ}$  is assumed and the equivalent dimensions of each layer are determined accordingly. Table 6.2 contains the material properties for all the layers of the power IGBT module encased in the SEMITRANS 2 package. Table 6.3 was generated with the help of Equations 3.21 and 3.22. Each layer is considered to be a one-stage RC network at a given step power input, and the resulting thermal time constants are also shown in Table 6.3.

|                        | Silicon | Solder | Copper | Alumina  | TIM  | Aluminium | GapPad | Units    |
|------------------------|---------|--------|--------|----------|------|-----------|--------|----------|
| Density ρ              | 2330    | 9000   | 8950   | 3900     | 2100 | 2700      | 3200   | kg/m^3   |
| Thermal conductivity k | 163     | 50     | 400    | 27       | 0.55 | 160       | 3      | W/(m*K)  |
| Specific heat c        | 703     | 150    | 385    | 900      | 1000 | 900       | 1000   | J/(kg*K) |

 Table 6.2 Power semiconductor material data

|                    | Length<br>[mm] | Width<br>[mm] | Effective<br>Area [mm^2] | Thickness<br>[mm] | Thermal<br>resistance<br>[K/W] | Thermal<br>capacitance<br>[J/K] | Time<br>constant<br>[ms] |
|--------------------|----------------|---------------|--------------------------|-------------------|--------------------------------|---------------------|--------------------------|
| Silicon die        | 6.5            | 6.5           | 42.25                    | 0.22              | 0.0319                         | 0.015               | 0.49                     |
| Die attach         | 6.5            | 6.5           | 43.30                    | 0.08              | 0.0370                         | 0.0047              | 0.17                     |
| Copper             | 28.40          | 25.80         | 48.44                    | 0.3               | 0.0155                         | 0.050               | 0.78                     |
| Aluminium<br>oxide | 30.60          | 28.00         | 58.37                    | 0.38              | 0.2411                         | 0.078               | 18.77                    |
| Copper             | 28.40          | 25.80         | 69.22                    | 0.3               | 0.0108                         | 0.072               | 0.78                     |
| DCB<br>solder      | 28.40          | 25.80         | 75.69                    | 0.08              | 0.0211                         | 0.0082              | 0.17                     |
| Baseplate          | 91.40          | 31.40         | 138.77                   | 3                 | 0.0540                         | 1.394               | 75.36                    |
| TIM                | 92.00          | 32.00         | 219.34                   | 0.03              | 0.2487                         | 0.0138              | 3.4                      |
| Heatsink           | 125            | 100           | 10000                    | 20                | 0.0125                         | 486                 | 6075                     |
| Gap pad<br>3000S   | 91.40          | 91.40         | 1426                     | 0.17              | 0.0397                         | 0.7757              | 30.83                    |
| Gap pad<br>1500    | 91.40          | 91.40         | 1426                     | 0.17              | 0.0795                         | 0.5091              | 40.46                    |

 Table 6.3 Power semiconductor internal layer thermal parameters
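The first rows of Table 6.3 can be reproduced with the short sketch below. Equations 3.21 and 3.22 are not repeated in this chapter, so the sketch assumes the usual lumped expressions $R_{th} = d/(k \cdot A)$ and $C_{th} = c \cdot \rho \cdot d \cdot A$. It further assumes that the effective area of a layer is taken at its mid-plane after 45° spreading, with no spreading inside the silicon die itself; this assumption is inferred from the tabulated areas and is not stated explicitly in the text. The function name is hypothetical.

```python
import math


def layer_parameters(side_top_mm, thickness_mm, k, rho, c, angle_deg=45.0):
    """Lumped thermal parameters of one square layer with heat spreading.

    Returns (area_mm2, r_th [K/W], c_th [J/K], tau [ms], side_bottom_mm).
    k in W/(m*K), rho in kg/m^3, c in J/(kg*K).
    """
    spread = math.tan(math.radians(angle_deg))
    side_mid = side_top_mm + thickness_mm * spread        # heated width at mid-plane
    side_bottom = side_top_mm + 2.0 * thickness_mm * spread
    area_m2 = (side_mid * 1e-3) ** 2
    d_m = thickness_mm * 1e-3
    r_th = d_m / (k * area_m2)
    c_th = c * rho * d_m * area_m2
    return side_mid ** 2, r_th, c_th, r_th * c_th * 1e3, side_bottom


# Material data from Table 6.2; thicknesses and die size from Table 6.3.
si = layer_parameters(6.5, 0.22, k=163, rho=2330, c=703, angle_deg=0.0)
solder = layer_parameters(si[4], 0.08, k=50, rho=9000, c=150)
copper = layer_parameters(solder[4], 0.3, k=400, rho=8950, c=385)

for name, p in (("Silicon die", si), ("Die attach", solder), ("Copper", copper)):
    print(f"{name:12s} A={p[0]:.2f} mm^2  R={p[1]:.4f} K/W  C={p[2]:.4f} J/K  tau={p[3]:.2f} ms")
```

Each layer starts from the bottom width of the layer above it, so the heated cross-section grows along the heat path as in Figure 6.12.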



Figure 6.13 Two Cauer network cells. (a) R-C cell; (b)R-C-R cell

There are two popular ways of modelling a Cauer network: the CR cell and the RCR cell, shown in Figure 6.13(a) and (b). The RCR cell is known to be more accurate and was therefore used in the simulation. The RCR cell splits the thermal resistance equally into two values connected in series, with the capacitance connected in parallel between them. This arrangement is capable of representing the temperature at the centre of any particular layer. Exceptions were made for the first layer, representing the silicon region, and the last layer, representing the heatsink. This is because a step change in the input power to an RCR cell would cause a step change in the temperature, whilst in practice the temperature does not go through a step change. Similarly, the thermal time constant of the heatsink is considered to be much longer than that of the semiconductor device and the temperature does not go through a step change either. Therefore the thermal behaviour of the first layer and the last layer is represented by CR cells. Figure 6.14 shows the complete device model.



Figure 6.14 Thermal model of the IGBT power module
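To illustrate how a Cauer ladder responds to a power step, the sketch below integrates a ladder of simple CR cells with the backward Euler method. It is not the Matlab/Simulink model of Figure 6.14: it uses CR cells only (not the RCR cells described above), it truncates the ladder after the first three layers of Table 6.3 and ties the last node to ambient, and the 100 W step is an assumed value. The function name is hypothetical.

```python
def simulate_cauer_step(r_th, c_th, power_w, t_end_s, dt_s):
    """Junction temperature rise of a CR Cauer ladder for a power step.

    r_th[i] links node i to node i+1 (the last one links to ambient),
    c_th[i] is the capacitance from node i to ambient. Node 0 is the junction.
    Backward Euler is used, so each step solves a tridiagonal system.
    """
    n = len(r_th)
    g = [1.0 / r for r in r_th]
    temps = [0.0] * n
    junction = []
    for _ in range(int(round(t_end_s / dt_s))):
        lower, diag, upper, rhs = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
        for i in range(n):
            g_in = g[i - 1] if i > 0 else 0.0
            diag[i] = c_th[i] / dt_s + g_in + g[i]
            lower[i] = -g_in
            upper[i] = -g[i] if i < n - 1 else 0.0
            rhs[i] = c_th[i] / dt_s * temps[i] + (power_w if i == 0 else 0.0)
        # Thomas algorithm: forward elimination, then back substitution.
        for i in range(1, n):
            m = lower[i] / diag[i - 1]
            diag[i] -= m * upper[i - 1]
            rhs[i] -= m * rhs[i - 1]
        temps[n - 1] = rhs[n - 1] / diag[n - 1]
        for i in range(n - 2, -1, -1):
            temps[i] = (rhs[i] - upper[i] * temps[i + 1]) / diag[i]
        junction.append(temps[0])
    return junction


# Silicon die, die attach and upper copper layer from Table 6.3.
r_layers = [0.0319, 0.0370, 0.0155]
c_layers = [0.015, 0.0047, 0.050]
rise = simulate_cauer_step(r_layers, c_layers, power_w=100.0, t_end_s=0.1, dt_s=1e-5)
print(f"rise after 1 ms: {rise[99]:.2f} K, steady state: {rise[-1]:.2f} K")
```

In steady state the junction rise equals the step power times the sum of the thermal resistances, which gives a simple check of the ladder wiring.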

#### 6.2.2 Thermal characterization for critical layers under different thermal conditions

The transient thermal impedance of the IGBT is simulated with the Cauer model as shown in Figure 6.14.



Figure 6.15 Simulated transient thermal impedance and manufacturer's data

Figure 6.15 shows the simulation result for the tested module in comparison with the manufacturer's datasheet. The two transient thermal impedance curves have almost the same slope in each segment of the thermal transient, and the accuracy of the Cauer model is verified by this comparison. The differences between the curves can be explained by the approximations made earlier and by errors in the manufacturer's laboratory test circuits. In addition, the manufacturer's datasheet always includes a safety margin, which explains why the datasheet shows a thermal resistance that is 0.1 K/W higher at thermal equilibrium.

Figure 6.16 Comparison of thermal characteristics under different cooling conditions

Further simulations were carried out in Matlab/Simulink as shown in Figure 6.16. A 2 s response was simulated to show the junction temperature in response to a power dissipation step generated by a 210 A step current under various conditions.

Thermal path conditions investigated are as follows:

1) an IGBT device in its initial healthy condition, mounted on a heat sink;

2) an IGBT device with a 20% increase of junction-to-case thermal resistance at the die-attach layer;

3) an IGBT device with a 20% increase of junction-to-case thermal resistance at the DCB solder layer;

4) an IGBT device with Gap Pad® 3000S inserted between baseplate and heat sink;

5) an IGBT device with Gap Pad® 1500 inserted between baseplate and heat sink;

6) a healthy IGBT device mounted on an ideal heat sink.

The simulation results of the junction-to-ambient temperature  $\Delta T_{ja}$  for the six thermal path scenarios are shown in Figure 6.17. To make the differences between the curves visible, the absolute temperature changes with respect to the healthy device (scenario 1) are plotted in Figure 6.18. The dissipated power takes a finite amount of time to propagate from the junction through the various layers inside the package to the case surface of the package, before it finally dissipates through the heat sink to the surrounding environment. It can be concluded that thermal degradation at different layers (i.e. die-attach, substrate solder, baseplate to heat sink interface and heat sink to ambient) produces different transient temperature rises. Basically, the further from the silicon chip along the heat conduction path the degradation occurs, the later the corresponding temperature curve separates from the healthy temperature curve.



**Figure 6.17** Transient junction-to-ambient temperature curves for different thermal path scenarios



**Figure 6.18** Transient junction-to-ambient temperature changes for different thermal path scenarios

In order to detect thermal path degradation with transient heating curves, two methods are proposed. One is to acquire the transient junction temperature at a high sampling rate (e.g. 1 kHz). To relate the transient impedance curve accurately to the degradation conditions (i.e. location, level and effect), transient junction temperatures with a high temporal resolution are preferred. This places a direct requirement on the heating power source, which must guarantee a constant heating power  $P_H$  during the heating pulse. The other method is to measure the thermal impedance at specified time stamps of significance. As the in-situ measurement targets solder fatigue, only scenarios 2) and 3) are of importance. In both scenarios the junction temperature change at 2 s is nearly negligible. It is also apparent that in both scenarios a considerable temperature rise occurs well before 400 ms, which can be detected with the proposed measurement circuit. In addition, with appropriate control of the measurement time stamp, the degradation conditions of both the die-attach and the substrate solder layer can be disclosed.
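The idea of locating a degraded layer from the time at which the measured heating curve leaves the healthy curve can be sketched as below. The function name, the 0.5 K tolerance and all temperature values are hypothetical; the two degraded curves are synthetic and only mimic the qualitative behaviour described for scenarios 2) and 3).

```python
def first_separation_time(times_s, rise_healthy, rise_measured, tol_k=0.5):
    """Return the first time stamp at which the measured junction temperature
    rise deviates from the healthy curve by more than tol_k, or None."""
    for t, healthy, measured in zip(times_s, rise_healthy, rise_measured):
        if abs(measured - healthy) > tol_k:
            return t
    return None


# Synthetic junction-to-ambient temperature rises [K] at selected time stamps [s].
stamps = [0.001, 0.01, 0.1, 0.4, 2.0]
healthy = [5.0, 12.0, 20.0, 26.0, 30.0]
die_attach_degraded = [5.2, 13.0, 21.1, 27.1, 31.1]   # separates early
dcb_solder_degraded = [5.0, 12.1, 20.9, 27.0, 31.0]   # separates later

t_die = first_separation_time(stamps, healthy, die_attach_degraded)
t_dcb = first_separation_time(stamps, healthy, dcb_solder_degraded)
print(f"die-attach curve separates at {t_die} s, DCB solder curve at {t_dcb} s")
```

An earlier separation time points to a layer closer to the chip, which is why a small number of well-chosen time stamps may be sufficient in place of a full high-rate acquisition.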

## 6.3 Solder Fatigue Monitoring

As previously discussed in Chapters 3 and 4, solder fatigue occurs at the die-attach layer and the DCB solder layer and has a significant impact on the thermal performance of a power module. The quality of the solder layers is often characterized by the value of the thermal resistance or thermal impedance, as described in Chapter 3. An increase of 20% in the internal junction-to-case thermal resistance $R_{thjc}$ with respect to its initial value is a generally accepted failure criterion. The in-situ measurement makes use of this information. The circuit described in Chapter 5 is capable of monitoring the thermal path degradation, especially that of the die-attach and DCB solder layers.

| Symbol | Type | Causes | Features |
|------------------------|-----------------------------|----------------------------------------------------------------------------------------|------------------------------------------------------------------------|
| $\Delta T_j$ | Active power cycling | Power dissipation within semiconductor chips | Short cycling period, 3-D temperature gradient |
| $\Delta T_{heat-sink}$ | Passive temperature cycling | Operational environment changes (e.g. ambient temperature, coolant temperature, etc.) | Long cycling period, large variation, identical temperature excursion |

Table 6.4 Comparison of active power cycling and passive temperature cycling

A power module is generally subject to active power cycling and passive temperature cycling during operation. The main features of these two thermal cycling types are compared in Table 6.4 and their temperature changes against time are described in Figure 6.19. Normally, passive temperature cycling allows growth of cracks in the solder layers and this is implemented with an air-air thermal shock chamber to age the power devices under test.



Figure 6.19 Thermal cycling behaviour of a power module

#### 6.3.1 Thermal cycling results with SAM analysis

Since the air-to-air thermal chamber only became accessible at the end of the project and the thermal cycling test had not been completed before the completion of the thesis, only part of the results are shown at this stage. An alternative way to validate the proposed measurement method, using thermal pads, is introduced in the following section.

A thermal cycling profile between -55°C and +140°C at a rate of three cycles per hour is run in the air-to-air thermal chamber. During these temperature swings each layer of the power module stretches and shrinks. The die-attach is the interlayer between silicon, which is a brittle material, and copper, which is harder than solder. The DCB solder is sandwiched between the substrate and the copper baseplate. Since the physical properties of the individual layers differ, deformation is driven by the mismatch of CTE between the layers and by the temperature gradient. Upon heating, the solder and the copper are found to be in compression with respect to the rest of the structure. Upon cooling, the solder and the copper layers are found to be under tension with respect to the rest of the structure. Solder and copper, with higher CTE values, attempt to expand at a faster rate than the silicon and alumina layers. The silicon and alumina layers, with lower CTE values, attempt to hold these layers back, putting them in compression. The silicon and alumina layers are in turn in tension for the opposite reason. The repetition of these temperature cycles not only grows cracks but also increases the size of existing voids in the die-attach layer and the DCB solder layer.

After cycling, the health of the IGBTs can be examined and characterized. There are two well-known techniques for this. The first, scanning acoustic microscopy (SAM), generates an image of each layer. This technique is non-destructive and can be applied at any stage of the ageing procedure. The second, scanning electron microscopy (SEM), takes backscattered images of a cross section of the solder layers. SEM is destructive and is always carried out at the end of the thermal cycling test. During the course of the thermal cycling ageing process, SAM is to be carried out at several intervals and SEM is only to be applied at the end of the project.

An initial 700 thermal cycles were applied to a group of six power IGBT modules and their C-SAM results are compared with those of healthy modules. More ageing tests and experiments are to be carried out to further validate the proposed health monitoring techniques. A general discussion of C-SAM is given in Appendix F. Comparisons of the C-SAM results from the group of six samples in both the healthy and the aged state are then made in detail, showing the degradation of the tested samples after thermal cycling. In Figure 6.20, the C-SAM results of Sample A are compared. Scan 2 and Scan 3 clearly indicate a delamination in the layers between the baseplate and the lower copper layer, while the weaker contrast of Scan 2 compared to Scan 3 arises because part of its gate trigger is still inside the baseplate layer. Scan 5 and Scan 6 clearly indicate a delamination between the upper copper layer and the aluminium oxide isolation.



Figure 6.20 C-SAM results for different layers of a power IGBT module (Sample A; figure title: SAM Scans of IGBT Module S1)

The DCB solder layer degradation ratios of all samples are calculated from Scan 3 of each IGBT module with the help of an image processing program in Matlab, and the results are shown in Table 6.5. Similar initial conditions, and similar degraded conditions after 700 thermal cycles, are found in all six samples. However, the initial conditions of the two solder pads within a power module, and hence their conditions after thermal cycling, differ considerably. This is due to non-identical soldering conditions for the two pads at the manufacturing stage, although the soldering conditions for power modules of the same batch are consistent.

**Table 6.5** Comparisons of voided DCB solder layer (a) for top IGBT devices and (b) bottom IGBT devices

(a)

| Percentage of voided area | SA1   | SA2   | SA3   | SA4   | SA5   | SA6   | mean ratio |
|---------------------------|-------|-------|-------|-------|-------|-------|------------|
| at 0 thermal cycles       | 2.84  | 2.05  | 4.12  | 1.26  | 2.14  | 3.38  | 2.63       |
| at 700 thermal cycles     | 26.62 | 30.02 | 19.98 | 25.21 | 25.46 | 25.57 | 25.48      |

(b)

| Percentage of voided area | SB1   | SB2   | SB3   | SB4   | SB5   | SB6   | mean ratio |
|---------------------------|-------|-------|-------|-------|-------|-------|------------|
| at 0 thermal cycles       | 1.71  | 1.01  | 2.21  | 0.92  | 1.55  | 1.24  | 1.71       |
| at 700 thermal cycles     | 10.68 | 14.37 | 10.86 | 13.14 | 14.25 | 12.36 | 10.68      |
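The mean ratios in Table 6.5 can be cross-checked with a few lines of Python. The sketch below simply averages the per-sample percentages listed in the table; the variable and function names are illustrative and the growth factor is an added convenience, not a quantity reported in the text. Recomputed this way, the means for (a) reproduce the tabulated 2.63 and 25.48, whereas the means for (b) come out at about 1.44 and 12.61 rather than the tabulated 1.71 and 10.68 (which coincide with the SB1 entries), so the mean column of (b) should be read with some caution.

```python
from statistics import mean

# Per-sample voided area in percent, copied from Table 6.5.
VOIDED_AREA = {
    "a": {0: [2.84, 2.05, 4.12, 1.26, 2.14, 3.38],
          700: [26.62, 30.02, 19.98, 25.21, 25.46, 25.57]},
    "b": {0: [1.71, 1.01, 2.21, 0.92, 1.55, 1.24],
          700: [10.68, 14.37, 10.86, 13.14, 14.25, 12.36]},
}


def summarise(samples):
    """Return (mean at 0 cycles, mean at 700 cycles, growth factor)."""
    mean_initial = mean(samples[0])
    mean_aged = mean(samples[700])
    return mean_initial, mean_aged, mean_aged / mean_initial


summary = {label: summarise(samples) for label, samples in VOIDED_AREA.items()}
for label, (m0, m700, growth) in summary.items():
    print(f"({label}) mean at 0 cycles: {m0:.2f} %, "
          f"at 700 cycles: {m700:.2f} %, growth factor: {growth:.1f}")
```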

#### 6.3.2 Test result and discussion

As no power module with thermal path degradation was available for this project, ageing of the power module was emulated by placing thermal pads between the baseplate and the heat sink. First, data were captured without thermal pads, representing a healthy power module. Then, thermal pads were attached to emulate the aged device and data were captured again using the in-situ measurement circuitry. The same approach was taken in [178]. As will be shown later, the experimental results agree well with the theoretical analysis. The proposed method shows high accuracy and reproducibility.

Two thermal pads were used, Gap Pad® 1500 [179] and Gap Pad® 3000 [180]. Given the thermal conductivity, the effective heat conduction area and the dimensions of the Gap Pad, their thermal resistances are calculated based on Equation 3.21. The effective heat conduction area depends on the dimensions of the silicon die. Therefore, the effective values of the added thermal resistance differ between IGBT and diode power generation. The calculation and relevant parameters are discussed in detail in Section 6.2. During IGBT power generation, Gap Pad® 3000 has a thermal resistance of 0.04 °C/W and Gap Pad® 1500 has a thermal resistance of 0.08 °C/W. These values are approximately 8% and 16% of the IGBT junction-to-case thermal resistance $R_{thjc}(T)$ of the SKM 50G063D, which is rated at 0.5 °C/W in the datasheet. Similarly, the pads introduce 0.08 °C/W and 0.16 °C/W during diode power generation, which are approximately 8% and 16% of the diode junction-to-case thermal resistance $R_{thjc}(D)$ of the SKM 50G063D, rated at 1 °C/W in the datasheet. The Gap Pad therefore changes the thermal behaviour of the module.
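The percentages quoted above follow directly from the pad resistances and the datasheet ratings. The Python sketch below repeats that arithmetic and, for illustration only, includes the standard one-dimensional slab conduction formula $R_{th} = d/(\lambda A)$, which is assumed here to be the form of Equation 3.21. The pad thickness, conductivity and areas passed to the slab function are hypothetical and are not the values used in the thesis.

```python
def slab_thermal_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """1-D conduction resistance of a uniform slab in K/W.

    Assumed form of Equation 3.21; not confirmed by the text shown here.
    """
    return thickness_m / (conductivity_w_per_mk * area_m2)


def added_fraction(pad_resistance, rated_rthjc):
    """Pad resistance expressed as a fraction of the rated R_thjc."""
    return pad_resistance / rated_rthjc


# Values quoted in the text (K/W).
RTHJC_IGBT = 0.5
RTHJC_DIODE = 1.0
PADS_IGBT = {"Gap Pad 3000": 0.04, "Gap Pad 1500": 0.08}
PADS_DIODE = {"Gap Pad 3000": 0.08, "Gap Pad 1500": 0.16}

fractions_igbt = {name: added_fraction(r, RTHJC_IGBT) for name, r in PADS_IGBT.items()}
fractions_diode = {name: added_fraction(r, RTHJC_DIODE) for name, r in PADS_DIODE.items()}

for name in PADS_IGBT:
    print(f"{name}: IGBT +{fractions_igbt[name]:.0%}, diode +{fractions_diode[name]:.0%}")

# Hypothetical geometry: halving the effective area doubles the slab resistance,
# which is the kind of area dependence that makes the diode values larger.
ratio = (slab_thermal_resistance(1e-3, 3.0, 2e-4)
         / slab_thermal_resistance(1e-3, 3.0, 4e-4))
```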

The in-situ measurement makes use of Equation (6.8). In order to determine the thermal impedance over time, the junction temperature over time $T_j(t)$, the reference temperature over time $T_r(t)$ and the power P must be known.

$$Z_{thjr}(t) = \frac{T_{j}(t) - T_{r}(t)}{P}$$
(6.8)

 $T_j(t)$ can be measured with the help of TSEPs ( $V_{CE(t)}/V_{F(t)}$ ) and $T_r(t)$ can be measured with the help of thermocouples. $T_j(t)$ can be captured either during the cooling or during the heating of the device. Capturing the temperature during cooling requires the device to be heated to thermal equilibrium first. Heating is normally provided by the current flowing through the device. Once the device has reached thermal equilibrium, the temperature fall at turn-off is measured with the help of a TSEP. Reaching thermal equilibrium requires a sufficiently effective cooling facility and a long experiment time, which are not cost effective for in-situ EV applications. The heating technique samples the temperature during the heating of the device, again by using a TSEP. This method, however, requires an expensive online calculation of the power. In order to shorten the test and meet the requirements of EV applications, the in-situ measurement therefore uses the heating curve.
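Equation (6.8) translates directly into a sample-by-sample calculation. The following Python sketch is a minimal illustration, assuming that the junction and reference temperatures have already been converted from the TSEP and thermocouple readings and that the heating power is treated as constant; the function name and the example numbers are hypothetical.

```python
def thermal_impedance(t_junction, t_reference, power):
    """Evaluate Z_thjr(t) = (T_j(t) - T_r(t)) / P for each sample.

    t_junction, t_reference: sequences of temperatures in degC taken at
    the same time stamps; power: heating power in W, assumed constant.
    """
    if power <= 0:
        raise ValueError("power must be positive")
    if len(t_junction) != len(t_reference):
        raise ValueError("temperature sequences must have equal length")
    return [(tj - tr) / power for tj, tr in zip(t_junction, t_reference)]


# Hypothetical samples, for illustration only.
z_thjr = thermal_impedance([30.0, 50.0], [25.0, 25.0], 100.0)
print(z_thjr)
```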



Figure 6.21 Diagram of pulse trains for thermal conduction degradation monitoring

The junction temperature is measured using $V_{CE(I)}$ and, as for bond wire lift-off, a current pulse pattern is generated. The pulse pattern, however, has a different function. Each current pulse is about 1ms long and generates heat in the IGBT/diode chip. At the end of the current pulse the terminal voltage $V_{CE(I)}$ is measured and the junction temperature is calculated from the TSEP calibration curve. The temperature sensing intervals of the TSEP ( $V_{CE(I)}$ or $V_{F(I)}$ ) method must be very short in order to avoid any considerable cooling prior to the junction temperature measurement, but long enough to guarantee that the measurement is free of electrical transition noise. Figure 6.21 shows the current pulse train with a duty cycle of 94.4%, which alternates between heating intervals and temperature sensing intervals every millisecond. Each pulse starts with heating at known voltage and current for 944µs and then enters temperature measurement for 56µs. Another 1ms pulse then follows to add further heat to the chip, another measurement records the next junction temperature, and so on. The reference temperature $T_r$ is measured by five thermocouples as shown in Figure 5.12. It must be noted that the in-situ measurement assumes constant heat flow at this stage and hence the junction-to-reference temperature difference is indicative of the thermal conduction performance. This assumption is further discussed in Section 6.2.2.
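The timing of the pulse train can be written down as a simple schedule. The Python sketch below is an illustration only, assuming ideal edges and using the 944µs heating and 56µs sensing intervals quoted above; the function and constant names are hypothetical and do not come from the controller firmware.

```python
HEATING_US = 944   # heating interval per pulse, from the text
SENSING_US = 56    # temperature sensing interval per pulse, from the text


def pulse_schedule(n_pulses):
    """Return a list of (start_us, end_us, phase) tuples for n_pulses periods."""
    period = HEATING_US + SENSING_US
    schedule = []
    for k in range(n_pulses):
        start = k * period
        schedule.append((start, start + HEATING_US, "heat"))
        schedule.append((start + HEATING_US, start + period, "sense"))
    return schedule


duty_cycle = HEATING_US / (HEATING_US + SENSING_US)

# A 400 ms heating window corresponds to 400 one-millisecond periods.
schedule = pulse_schedule(400)
n_samples = sum(1 for _, _, phase in schedule if phase == "sense")
print(f"duty cycle: {duty_cycle:.1%}, junction temperature samples: {n_samples}")
```

With one sensing interval per millisecond, the schedule yields one junction temperature sample per period, which matches the 1kHz sampling rate mentioned for the transient heating curve.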

High resolution and accuracy of the junction temperature and case temperature measurements are particularly critical in calculating the thermal impedance. This can be achieved by using a high performance data-acquisition system. In the proposed circuit, the junction temperature measurement has a peak-to-peak noise of $0.6^{\circ}$C and the case temperature measurement has a peak-to-peak noise of 0.3°C. Combining these in a root-sum-squares manner for uncorrelated noise [132] gives a peak-to-peak noise of 0.67°C for the junction-to-case temperature measurement. An alternative is to increase the power losses, which results in a larger temperature gradient at a given thermal resistance and thus improves the measurement resolution. Although the in-situ measurement system was designed with a 50A dc current source, experiments have shown that, by raising the high current to 75A, a sufficient temperature difference $\Delta T_{jr}$ (above $100^{\circ}$C) can be achieved in less than 400ms. This temperature difference is normally sufficient to ensure good measurement resolution. Therefore the in-situ measurement circuitry should operate at a 75A current level, which has no impact on the bond wire lift-off measurement. A current of 75A allows for enough chip temperature rise within a controlled heating period without subjecting the chip to excessive temperature stress. This power level is also chosen to prevent the silicon junction temperature from exceeding 150°C during the specified heating period (400ms in this case), even in situations of excessive solder layer voiding.
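The root-sum-squares combination of the two noise figures can be checked numerically. The Python sketch below assumes the two noise sources are uncorrelated, as stated in the text, and relates the combined noise to a 100°C temperature difference purely as an illustration of the resulting resolution; the helper name is hypothetical.

```python
import math


def root_sum_squares(*components):
    """Combine uncorrelated noise contributions in a root-sum-squares manner."""
    return math.sqrt(sum(c * c for c in components))


NOISE_JUNCTION = 0.6   # degC peak-to-peak, from the text
NOISE_CASE = 0.3       # degC peak-to-peak, from the text

combined_noise = root_sum_squares(NOISE_JUNCTION, NOISE_CASE)

# Relative noise if the temperature difference is 100 degC (the text quotes
# "above 100 degC", so this is an upper bound under that condition).
relative_noise = combined_noise / 100.0
print(f"combined noise: {combined_noise:.2f} degC, relative: {relative_noise:.2%}")
```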



**Figure 6.22** Transient junction-to-reference temperature  $(T_{jri})$  curves for different reference points

**Figure 6.23** Changes of  $\Delta T_{jri}$  for different reference points

The temperature differences $\Delta T_{jri}$ (*i*=1,2,...,5) between the junction temperature and all five reference points in the healthy and aged conditions (using the two pads) are shown in Figure 6.22. The change in temperature difference from the healthy to the degraded state is small and can hardly be seen in the plot. Therefore, the changes between the degraded state and the healthy state for each reference point are plotted in Figure 6.23. Among the five reference points, Reference 1 shows the most noticeable changes, and these can be easily detected after 400ms. For the other reference points, changes in $\Delta T_{jri}$ can also be detected given enough heating pulses.

#### 6.3.3 Discussion of power dissipation

The dissipated power P inevitably affects the calculation of the thermal impedance. Hence it is necessary to investigate its value during the heating process. All power semiconductors dissipate power internally, both in the static state and during the turn-on and turn-off transients. Although the IGBT/diode under test continuously conducts a current pulse train during the health monitoring, it still suffers transient losses passively while the auxiliary IGBT is switching to control the test current patterns.

The conduction current is injected by the controlled current source, while the on-state voltage is a function of the parameters described in Equations 6.2 and 6.4. Because semiconductor characteristics are highly dependent on the junction temperature, self-heating caused by the current makes it difficult to generate constant conduction power unless both the current flowing through the device and the voltage across the device are regulated to maintain constant power dissipation [141, 181]. This affects the calculation of the thermal impedance using Equation 6.8. As a consequence, the power dissipation of a power device must be discussed in detail.

An experiment was carried out to measure the dissipated power at various case temperatures. The power was generated by two consecutive high current heating pulses with a frequency of 1Hz and a duty ratio of 99.995%. With this pattern, heat is generated by conduction losses rather than switching losses. The case temperature was controlled by an environmental chamber and was taken as the average value from 4 thermocouples (TC1 and TC3~TC5) placed in the vicinity of the module under test. The temperature in the thermal chamber was set to $T_{a1}$=-20°C, $T_{a2}$=0°C, $T_{a3}$=20°C and $T_{a4}$=40°C for the different tests. For each temperature, a complete test was performed three times for the same device (IGBT and diode).

The conduction power was calculated from the product of the on-state voltage and the forward conduction current and it is shown in Equation (6.9):

$$P_{av} = \frac{1}{t_{off} - t_{on}} \int_{t_{on}}^{t_{off}} V_{CE(h)}(t) \cdot I_{C(h)}(t) \cdot dt$$
(6.9)

where  $t_{on}$  and  $t_{off}$  denote the start and end of the high current pulse. However, as voltage and current are sampled the average power can be expressed as:

$$P_{av} = \frac{1}{t_{off} - t_{on}} \sum_{k=1}^{N} V_{CE(h)}(k) \cdot I_{C(h)}(k) \cdot \delta t = \frac{1}{N} \sum_{k=1}^{N} V_{CE(h)}(k) \cdot I_{C(h)}(k)$$
(6.10)

with *N* being the total number of samples.
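Equation (6.10) amounts to averaging the sample-wise product of voltage and current. The Python sketch below is a minimal illustration under the assumption of uniformly spaced samples taken within the high current pulse; the function name and the example waveform values are hypothetical and are not measured data.

```python
def average_power(v_samples, i_samples):
    """Average conduction power per Equation (6.10): (1/N) * sum(V_k * I_k)."""
    if len(v_samples) != len(i_samples):
        raise ValueError("voltage and current must have the same number of samples")
    if not v_samples:
        raise ValueError("at least one sample is required")
    return sum(v * i for v, i in zip(v_samples, i_samples)) / len(v_samples)


# Hypothetical on-state voltage drifting with self-heating at a constant 75 A.
v_ce = [2.0, 2.1, 2.2]
i_c = [75.0, 75.0, 75.0]
p_av = average_power(v_ce, i_c)
print(f"average power: {p_av:.1f} W")
```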

Voltage, current and power at various case temperatures were recorded and are shown in Appendix E. Waveforms of $V_{CE(h)}(t)$, $I_{C(h)}(t)$ and the calculated P(t) from the first IGBT sample are shown in Figure 6.24, while the corresponding graphs for the diode are given in Figure E.1 in Appendix E. The on-state voltage is measured across the power module's terminals. A voltage drop induced by the terminal leads and terminal joints is inevitable. However, as no degradation at the terminal connections is expected during the course of the experiment, this voltage drop will not have an adverse effect on the subsequent thermal impedance estimations.



**Figure 6.24** $V_{CE(h)}(t)$, $I_{C(h)}(t)$ and P(t) waveforms of the IGBT during heating pulses at four different ambient temperatures

Figure 6.25a shows a set of test results of the thermal impedance $Z_{thjr1}$ for the IGBT measured at t = 1s at the defined ambient temperatures ( $T_{a1}$~$T_{a4}$ ). At each ambient temperature three results are taken and the mean value of $Z_{thjr1}$ is obtained by averaging them. A ±0.003°C/W tolerance band around the mean values is indicated by blue and red dotted lines in the figure. The relative error of the thermal impedance with respect to the mean value at each ambient temperature ( $T_{a1}$~$T_{a4}$ ) is calculated and shown in Figure 6.25b. The variations all lie within the ±0.003°C/W tolerance band around the mean values. These graphs suggest that, under predefined conditions, the proposed approach measures $Z_{thjr1}$ with high accuracy and good reproducibility. Therefore, thermal path degradation can be reliably discriminated from the measured values. All junction-to-case thermal impedance $Z_{thjr1}$ values at t=1s and t=2s and their relative errors with respect to the mean values are collected for the IGBT and FWD in Figures E.2-E.5 of Appendix E.



**Figure 6.25** Thermal impedance  $Z_{thjr1}$  values (at t = 1s and ambient temperature =  $-20^{\circ}$ C,  $0^{\circ}$ C,  $20^{\circ}$ C,  $40^{\circ}$ C) for (a) the IGBT and (b) their relative errors

#### 6.3.4 Discussion of thermal impedance results

Gap Pad® 1500 and Gap Pad® 3000 were inserted between the baseplate and the heat sink to emulate heat path degradation. The thermal impedances obtained at predefined conditions (ambient temperature = 0°C, 20°C) are shown in Figure 6.26, where the influence of thermal path degradation on the $Z_{thjr}$ values, with the reference temperature measured at Reference 1, is compared for the IGBT (left) and the diode (right). With Gap Pad 3000S30 inserted, the measured thermal impedance at t = 1s has increased by approximately 0.017°C/W for the IGBT and 0.032°C/W for the diode. The thermal impedance increases further either when Gap Pad 1500, which has a higher thermal resistivity, is inserted or when a longer heating period is allowed. The resolution of the thermal impedance measurement is within a ±0.003°C/W tolerance band, which is sufficient to detect such thermal impedance rises. The thermal impedances measured at all reference points are shown in Figures E.6 to E.9 of Appendix E.



**Figure 6.26** Influence of thermal path degradation on  $Z_{thjr}$  values (at t = 1s, 2s and ambient temperature = 0°C, 20°C) with reference temperature measured at Reference 1 for IGBT (a) and diode (b)

The averaged values of the thermal impedance of the module's IGBT and diode taken at t = 1s for all four temperatures are plotted in Figure 6.27. The thermal impedance has a nearly linear characteristic against ambient temperature and can be approximated by a straight-line equation obtained from linear curve fitting in Matlab. Hence, a look-up table that contains the healthy baseline thermal impedance obtained at specified time stamps and at various ambient temperatures can be built up for each IGBT and diode in the power module. Any rise above the baseline in a subsequent measurement will indicate thermal path degradation. The dotted lines show the $0.017^{\circ}$C/W and $0.032^{\circ}$C/W drift levels of the initial thermal impedance obtained at t = 1s for the IGBT and diode respectively. Hence, thermal path degradation can be reliably discriminated from the measured values. The averaged values of the thermal impedance of the IGBT/diode taken at t = 2s for all four temperatures are plotted in Figure E.10 of Appendix E.
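The baseline look-up and comparison described above can be sketched in a few lines of Python. This is an illustration only: the baseline impedance values are hypothetical (the measured values are in Figure 6.27, not reproduced here), the least-squares fit is written out by hand rather than taken from the Matlab routine used in the thesis, and using the ±0.003°C/W tolerance band as the detection margin is an assumption.

```python
def fit_line(x, y):
    """Ordinary least-squares straight line y = slope * x + intercept."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def is_degraded(z_measured, ambient_c, slope, intercept, margin=0.003):
    """Flag a measurement that rises above the healthy baseline by more than margin."""
    baseline = slope * ambient_c + intercept
    return (z_measured - baseline) > margin


# Hypothetical healthy baseline Z_thjr1 at t = 1 s (degC/W) versus ambient (degC).
AMBIENT_C = [-20.0, 0.0, 20.0, 40.0]
Z_BASELINE = [0.300, 0.310, 0.320, 0.330]

slope, intercept = fit_line(AMBIENT_C, Z_BASELINE)

# A 0.017 degC/W rise at 20 degC (the IGBT drift quoted in the text) is flagged,
# while a 0.001 degC/W rise stays inside the tolerance band.
flag_pad = is_degraded(0.337, 20.0, slope, intercept)
flag_healthy = is_degraded(0.321, 20.0, slope, intercept)
print(flag_pad, flag_healthy)
```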



**Figure 6.27** The averaged thermal impedance of (a) IGBT and (b) diode taken at t = 1s corresponding to all temperatures and linear regression

### 6.4 In-situ Measurement and Electric Vehicle Driving Cycle

The in-situ measurement is not continuously operating. Sections 6.1 and 6.2 provided the timings that are required to measure the health of one IGBT and for one diode.

| Device                                | Condition                             | Time     | Unit |
|---------------------------------------|---------------------------------------|----------|------|
| DURAKOOL DG85C relay (S1-S4 and DPCO) | Operate time                          | typ. 7   | ms   |
|                                       | Release time                          | typ. 2   | ms   |
|                                       | Bounce time, DC coil, NO / NC contact | typ. 1/3 | ms   |

 Table 6.6 Relay operation time [182]

Given the relay operating time (Table 6.6), a 10ms delay for switching between two devices must be added to the total time that it takes to measure the health of all IGBTs and diodes within the inverter. The bond wire lift-off measurement time per device was measured as 4ms. The estimated total time to diagnose bond wire lift-off of all 12 devices (6 IGBTs and 6 diodes) is thus 158ms. The solder fatigue measurement takes much longer and was measured at up to 400ms per device. Therefore the total time to measure the solder health of all 12 devices is 4,910ms, giving a total time of 5.068 seconds for the inverter. The average duration between stops for urban/suburban drive cycles is 92s and the average duration for which a car stops in stop-and-go traffic is 21s [175]. The in-situ measurement is therefore fast enough to complete a full set of measurements within these stops. However, as explained in Chapter 4, the in-situ measurement works with an interrupt routine which runs the measurement process when normal drive operation is interrupted. This can normally be performed during key-on/key-off and gas/charging services. Figures 6.28 and 6.29 summarise the procedures for bond wire lift-off and solder fatigue monitoring in the form of flow charts.
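The timing budget can be reproduced with a short calculation. The Python sketch below assumes that the 10ms relay delay is incurred once for each switch-over between consecutive devices (11 switch-overs for 12 devices), which is the interpretation that reproduces the quoted 158ms and 4,910ms; the function and constant names are illustrative.

```python
N_DEVICES = 12          # 6 IGBTs and 6 diodes
RELAY_DELAY_MS = 10     # delay for switching between two devices
STOP_DURATION_S = 21    # average stop in stop-and-go traffic [175]


def total_test_time_ms(per_device_ms, n_devices=N_DEVICES, delay_ms=RELAY_DELAY_MS):
    """Measurement time for all devices plus one relay delay per switch-over."""
    return n_devices * per_device_ms + (n_devices - 1) * delay_ms


bond_wire_ms = total_test_time_ms(4)      # bond wire lift-off, 4 ms per device
solder_ms = total_test_time_ms(400)       # solder fatigue, up to 400 ms per device
inverter_total_s = (bond_wire_ms + solder_ms) / 1000.0

fits_in_stop = inverter_total_s < STOP_DURATION_S
print(f"bond wire: {bond_wire_ms} ms, solder: {solder_ms} ms, "
      f"total: {inverter_total_s:.3f} s, fits in a stop: {fits_in_stop}")
```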



Figure 6.28 Flow chart of the bond wire lift-off monitoring algorithm



Figure 6.29 Flow chart of the solder fatigue monitoring algorithm

# **CHAPTER 7**

### CONCLUSIONS

Electricity as an energy vector for vehicle propulsion offers the possibility of reducing greenhouse gas emissions globally. In electric vehicles, power inverters play a major role in controlling the power flow from the battery to the electric motor. IGBTs are usually chosen as the switching devices for power flow control.

The loss of any of the switches in an inverter may lead to a malfunction of the EV, where safety is critical. It is therefore required that the inverter operates at the highest reliability levels. Although the reliability of IGBT power modules has increased over the last decades, an IGBT still has a limited lifetime, sometimes shorter than that of other critical components in the car. Therefore, research has focused on techniques to prevent a catastrophic failure by over-rating, adding redundancy and implementing fault tolerant techniques. Redundant or fault tolerant techniques do not stop the IGBT power module from failing but can maintain the system in healthy operation. In general, a dashboard signal will warn the driver and request a visit to the next service station to replace the faulty IGBT module. However, the downsides of these techniques are increased cost and complexity, as additional power devices and other power components must be added to the inverter. Recently, a new technique called online health monitoring has emerged for IGBT power modules. The technique estimates the health of the device by making use of PoF models and thermal models. The model parameters are fed from historical field data and experimental data. In an online health monitoring system, the dashboard warning signals can be provided even prior to actual failures. Such a system is more cost-effective than redundancy or fault tolerant techniques. The drawbacks of this system are the requirement for historical training data from field products to build the PoF models and the processing power needed to run the models. Consequently, all online health monitoring systems in the literature are in need of improvement in measurement accuracy to better indicate the true health of IGBT power modules.

This research project has proposed a new online health monitoring system for IGBTs which depends neither upon modelling nor upon the collection of a large amount of data. The novel system directly measures the prognostic parameters of the health of the IGBT power module by adding a measurement circuit across the devices and provides an early-warning signal of any anomaly. The added measurement and processing hardware is small in size and can be integrated in standard IGBT drivers, reducing the cost of the system.

#### 7.1 Determination of Common Failures in an IGBT Module

As previously discussed, bond wire lift-off and solder fatigue are the most frequently observed failures in power IGBT modules and thus have been the focus of this research.

#### 7.1.1 In-situ bond wire lift-off measurement

An auxiliary monitoring circuit has been introduced which can be embedded into the IGBT driver circuit. The circuit has been designed to operate in a harsh EV environment. The in-situ circuit includes a series of specific functions such as digital isolation, voltage clamping, noise reduction and robustness against false alarms. All of these functions allow an accurate measurement of operating parameters to determine the health of the bond wires and chip metallisation.

The two parameters selected for measuring bond wire lift-off are $V_{CE(h)}$ and $V_{F(h)}$ for IGBT bond wires and diode bond wires, respectively. Any bond wire lift-off, or ongoing progress towards a bond wire lift-off, gives rise to contact resistances. The in-situ measurement circuit supplies a constant current through the bond wire connections and measures the voltage drop. A rise in $V_{CE(h)}$ or $V_{F(h)}$ can therefore be interpreted as a degradation of the health of the bond wire connection under the following defined conditions:

(1)  $V_{CE(h)}$  and  $V_{F(h)}$  are based on reference junction temperature. Since the two are also a function of the junction temperature, a TSEP measurement takes place after measuring  $V_{CE(h)}$  or  $V_{F(h)}$  to offset the effect of junction temperature.

(2) $V_{CE(h)}$ and $V_{F(h)}$ are not affected by parasitic resistances. Since the two are also a function of the terminal connection resistance, Kelvin connectors are used to minimise the effect of these parasitic resistances.

(3) The gate drive output voltage drift is minimised. The conductivity of the IGBT during the on-state improves with an increase in the gate drive voltage, which is a function of the operating temperature. The link between the voltage drift and the operating temperature was identified in this work, but off-the-shelf IGBT gate drivers do not take it into account. As a result, the gate driver's temperature was held constant in the experiments to minimise this effect; in practice it should be compensated by a control algorithm.

The degradation of bond wires is emulated by individually cutting the bond wires. The effectiveness of the bond wire lift-off monitoring circuit for health monitoring has been validated by testing the circuit at different power levels, temperature levels and at different health conditions of bond wires.

#### 7.1.2 In-situ solder fatigue measurement

Solder fatigue is evaluated by the same in-situ measurement circuit as bond wire lift-off. In order to achieve a high $Z_{thjc}$ drift resolution, a 75A current is injected into the DUT for a period that remains within safe operation, so as to keep the junction temperature below its maximum. Solder fatigue degradation is emulated by inserting thermal pads between the baseplate and the heat sink to increase the thermal impedance. Because the dissipated power varies with the ambient temperature in reality, an error is induced in the thermal impedance response and in the consequent solder fatigue estimation.

A TSEP measures the junction temperature whilst the dissipated power is calculated by multiplying the device current by the voltage drop across the chip. With knowledge of the temperature gradients and the power over time, the thermal impedance response is obtained and then compared with previous results. Any change in the response traces indicates a certain degree of solder fatigue. The sampling frequency is 5kHz for the bond wire test and 1kHz for the solder test. The reason for choosing 1kHz for the latter is to mitigate the switching loss while maintaining a satisfactory temporal resolution. The change in the thermal impedance response can be identified within 1s, and therefore a full response curve obtained after reaching thermal equilibrium is no longer needed. This feature permits a fast measurement of the IGBTs and diodes and is particularly favourable for electric vehicle applications.

The effect of the IGBT power module with thermal pads has been measured experimentally and the degradation of the different layers in the IGBT power module has been simulated. These results have been compared and they have confirmed that the auxiliary circuit and the monitoring scheme are capable of reliably monitoring the health of the die solder and the DCB solder.

#### 7.2 Future Work

This work has produced significant results associated with the inverter systems for EV applications and the techniques developed can be easily applied to other low or medium voltage inverters which employ IGBT power modules.

The developed in-situ measurement methods provide an early warning to the driver and can be extended to online data loggers in field applications. Generally, data loggers are capable of recording voltage, current, temperature and other information. To build on this research work, future data loggers can be improved to include information about the health of the IGBT power module. However, this calls for a technique called Transient Dual Interface Measurement (TDIM), which has recently been introduced and adopted in the new JEDEC standard. This method allows a more accurate and reproducible thermal characterization and also provides detailed information about the individual contribution of each layer to the transient response. The combination of the in-situ measurement and TDIM would provide online condition monitoring for each layer.

This would benefit the end-users as well as power semiconductor designers and manufacturers in terms of gaining insight into the ongoing physical deterioration of each layer of the power electronics device, which is of prime importance and offers them an edge to compete in the marketplace.

However, the proposed in-situ measurement techniques are based on a single chip per switch, whilst power modules for high power applications have several chips per switch operating in parallel. Future work should focus on the development of an in-situ measurement circuit for multi-chip power modules.


### 7.3 Research Outcomes

Results from this project have been published at, or submitted to, the following conferences and journals:

- Ji B, Pickert V, Zahawi B. In-situ Measurement of the Bond Wire Lift-off in IGBT Power Modules. *In: Power Conversion, Intelligent Motion and Power Quality (PCIM).* 2011, Nuremberg, Germany.
- Ji B, Pickert V, Zahawi B. In-situ Bond Wire and Solder Layer Health Monitoring Circuit for IGBT Power Modules. Accepted for: *The International Conference on Integrated Power Electronics Systems (CIPS).* 2012, Nuremberg, Germany.
- Ji B, Pickert V, Zahawi B, Zhang M. In-situ Bond Wire Health Monitoring Circuit for IGBT Power Modules. Accepted for: *Biannual international IET Power Electronics, Machines and Drives Conference (PEMD).* 2012, Bristol, UK.
- Ji B, Pickert V, Zahawi B, Cao W. "Improved prognostic health management for IGBT power modules", submitted to *IEEE Transactions on Instrumentation and Measurement*.
- Ji B, Pickert V, Zahawi B, Cao W. "Online diagnosis of the bond wire lift-off in IGBT power modules for electric vehicles", submitted to *IEEE Transactions on Sustainable Energy*.

# **APPENDIX A**

## **A COMPARISON OF HYBRID TECHNOLOGIES**

This section presents an introduction to, and a general classification of, hybrid electric vehicles (HEVs). The first HEVs were sold in Japan in 1997, and it has taken more than ten years for HEVs to achieve 1% of the global car market [183]. Nowadays, several producers offer new generations of HEVs. According to the role performed by the electric motor and the grade of hybridization, there are many configurations of HEVs, including stop-start, micro hybrid, mild hybrid, full hybrid and plug-in hybrid technologies. Their main differences are compared below in Table A.1.

| VEHICLE CAPABILITY / VEHICLE TYPE | Stop-start vehicle | Micro Hybrid | Mild Hybrid | Full Hybrid | Plug-in Hybrid |
|---|---|---|---|---|---|
| Shuts off at idle, and in stop-go traffic | | | | | |
| Uses regenerative braking | | | | | |
| Uses an electric motor to assist a conventional combustion engine | | | | | |
| Can drive for short periods using only the electric motor | | | | | |
| Battery charge from grid electricity in addition to charge supplied under use conditions | | | | | |

Table A.1 Comparisons of HEV technologies [183]

# **APPENDIX B**

## **TEMPERATURE MEASUREMENT TECHNOLOGIES**

In this thesis, the semiconductor device junction temperature is measured with the TSEP method, while the reference temperature is measured with physical contact sensors. This appendix first lists commonly used TSEPs. A summary of commonly used physical contact temperature sensors then follows, together with a general introduction to integrated circuit (IC) temperature sensors.

### **B.1 Common Temperature Sensitive Electrical Parameters**

Commonly used TSEPs and their suitability for measuring the temperature of IGBTs, power MOSFETs and diodes are compared in Table B.1.

**Table B.1** List of TSEPs that can be used to measure the semiconductor device temperature [151]

| Device / TSEP                                   | Diode        | MOSFET       | IGBT |
|-------------------------------------------------|--------------|--------------|------|
| On-state Voltage Drop / Resistance <sup>1</sup> | ~            | $\checkmark$ | ~    |
| Saturation Current                              | X            | ~            | ~    |
| Threshold Voltage                               | X            | ~            | ~    |
| Turn on Energy Loss / Time                      | X            | $\checkmark$ | ~    |
| Turn off Energy Loss / Time                     | ~            | ~            | ~    |
| Transconductance <sup>2</sup>                   | X            | ~            | ~    |
| Breakdown Voltage                               | ~            | ~            | ~    |
| Leakage Current                                 | $\checkmark$ | $\checkmark$ | ~    |
| Gate cathode voltage                            | X            | $\checkmark$ | ~    |

<sup>1</sup> The on-state voltage drop or resistance of a diode is the same as the diode forward voltage drop.

<sup>2</sup> For a bipolar device the transconductance decreases at high temperatures and thus the current capabilities of the device can be limited by the transconductance rather than the thermal limits.

### **B.2 Common Physical Contact Temperature Sensors**

Direct measurement with contact sensors is widely employed in laboratory studies, and a summary of commonly used sensors is shown in Table B.2. K-type thermocouple and DS18B20 IC temperature sensor measurement circuits were built for this project.

**Table B.2** Summary of four types of contact temperature sensors commonly used in industry (thermocouple, RTD, thermistor, and IC temperature sensor) [Microchip AN679]

|  | Thermocouple | RTD | Thermistor | IC sensor |
|---|---|---|---|---|
| Temperature Range | -270 to 1800°C | -250 to 900°C | -100 to 450°C | -55 to 150°C |
| Sensitivity | 10s of µV/°C | 0.00385 Ω/Ω/°C (Platinum) | Several Ω/Ω/°C | Based on technology |
| Accuracy | ±0.5°C | ±0.01°C | ±0.1°C | ±1°C |
| Linearity | Requires at least a 4th order polynomial or equivalent look up table. | Requires at least a 2nd order polynomial or equivalent look up table. | Requires at least a 3rd order polynomial or equivalent look up table. | At best within ±1°C. No linearization required. |
| Ruggedness | The larger gage wires of the thermocouple make this sensor more rugged. Additionally, the insulation materials that are used enhance the thermocouple's sturdiness. | RTDs are susceptible to damage as a result of vibration. This is due to the fact that they typically have 26 to 30 AWG leads which are prone to break. | Generally thermistors are more difficult to handle, but not affected by shock or vibration. | As rugged as any IC housed in a plastic package such as dual-in-line or surface outline ICs. |
| Responsiveness in stirred oil | Less than 1 s | 1 to 10 s | 1 to 5 s | 4 to 60 s |
| Excitation | None Required | Current Source | Voltage Source | Typically Supply Voltage |
| Form of Output | Voltage | Resistance | Resistance | Voltage, Current, or Digital |

### **B.3 Introduction to IC Temperature Sensors**

A semiconductor (or IC, for integrated circuit) temperature sensor is an electronic device that exploits the temperature-dependent characteristics of semiconductors. In general, the IC temperature sensor is best suited for embedded equipment applications. These sensors share a number of characteristics: linear outputs, relatively small size, low cost and limited temperature range. A temperature sensor IC can operate over a nominal temperature range of -55 to 150 °C. Some devices go beyond this range, while others operate over a narrower range. They are reasonably linear and easy to use. The main limitation of semiconductor temperature sensors is their poor thermal contact with an outside surface. The thermal design is less than ideal for transient temperature measurement because the raw sensing element is generally packaged in a standard IC case.

A summary of available IC temperature sensors is presented below, followed by more detail on some of the more popular devices. The sensors can be grouped into five broad categories: voltage output, current output, resistance output, digital output and simple diode types.

• Voltage Output Temperature Sensors

The following sensors provide an output voltage signal proportional to temperature with relatively low output impedance. All require an excitation power source and all are essentially linear.

| Sensor | Manuf. | Output | Tolerance (range) | Package | Comments |
|---|---|---|---|---|---|
| LM35 | National Semi | 10mV/°C | ±1°C & ±1.5°C (-20°C to 120°C) | TO-220 | Needs a negative supply for temperatures < 10°C |
| AD22100 | Analog Devices | 22.5mV/°C at 5V | ±2°C & ±4°C (-50 to +150°C) | TO-92, SO-8 | Output ratiometric with supply voltage - good with ratiometric ADCs |
| FM20 | Fairchild | -11.77 mV/°C | ±5°C (-55°C to 130°C) | SOT23 | Low power |

 Table B.3 Voltage Output Temperature Sensors
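As an illustration of how a voltage output sensor is read by a microcontroller, the following minimal Python sketch converts a raw ADC code into a temperature. It assumes a sensor with the 10 mV/°C scale factor listed for the LM35 in Table B.3 and an output of 0 V at 0 °C. The 10-bit ADC, the 5 V reference and the function name are assumptions made for the example only.

```python
def adc_to_celsius(counts, v_ref=5.0, n_bits=10, scale_v_per_c=0.010, offset_v=0.0):
    """Convert an ADC code from a linear voltage output sensor to °C.

    v_ref and n_bits describe a hypothetical ADC. scale_v_per_c is the
    sensor scale factor (10 mV/°C here) and offset_v is the assumed
    sensor output voltage at 0 °C.
    """
    volts = counts * v_ref / (2 ** n_bits - 1)
    return (volts - offset_v) / scale_v_per_c


# One ADC step corresponds to roughly 0.49 °C with these assumed settings.
print(f"{adc_to_celsius(51):.2f} °C")
```

With these assumed settings the quantisation step of about 0.5 °C is already comparable to the sensor tolerance, so a lower reference voltage or a higher-resolution ADC would normally be preferred.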

• Current Output Temperature Sensors

Current output sensors provide an output current proportional to the temperature. Each acts as a high-impedance, constant-current regulator, typically with a temperature coefficient of  $1\mu A/^{\circ}C$  over a specified temperature range and a wide range of supply voltages.

| Sensor | Manuf.            | Output | Tolerance<br>(range)                      | Package       | Comments                                                                                                  |
|--------|-------------------|--------|-------------------------------------------|---------------|-----------------------------------------------------------------------------------------------------------|
| AD590  | Analog<br>Devices | 1µA/°C | ±5.5°C &<br>±10°C<br>(-55°C to<br>+150°C) | TO-52         | An old favorite, but need<br>to watch cable leakage<br>currents                                           |
| AD592  | Analog<br>Devices | 1µA/°C | ±1°C &<br>±3.5°C<br>(-25°C to<br>+105°C)  | TO-92         | A more precise AD590                                                                                      |
| TMP17  | Analog<br>Devices | 1µA/°C | ±4°C<br>(-40°C to<br>+105°C)              | SO-8          | Thermally faster than<br>AD590 and immune to<br>voltage noise pickup                                      |
| LM134  | Linear<br>Tech    | 1μA/°C | ±3°C<br>(-25°C to<br>+100°C)              | SO-8<br>TO-92 | Device can be used as<br>constant current source<br>and temperature sensor at<br>different current levels |

Table B.4 Current Output Temperature Sensors

• Digital Output Temperature Sensors

The digital temperature sensor is the first sensor to integrate a sensing element and an analog-to-digital converter (ADC) on a single silicon chip. In general, these sensors do not lend themselves to use with standard measuring devices because of their non-standard digital interfaces. Many are designed specifically for the thermal management of microprocessor chips. A selection of representative devices is presented below:

| Sensor | Manuf. | Output | Tolerance (range) | Package | Comments |
|---|---|---|---|---|---|
| DS2435 | Dallas | 1 wire serial, 0.5°C or 1°C resolution | ±4°C (0°C to 127.5°C, -40°C to 85°C) | TO-92 modified | Also builds a time / temperature histogram |
| DS18B20 | Dallas | 1 wire serial, 0.5°C resolution | ±0.5°C (-10°C to +85°C), ±5°C (-55°C to 125°C) | Modified TO-92, SO-8 | High precision over -10°C to +85°C range |
| TMP03, TMP04 | Analog Devices | Pulse width modulation (mark-space ratio) | ±4°C (-25°C to 100°C) | TO-92, SO-8, TSSOP-8 | Nominal 35 Hz output with 1:1 mark-space ratio at 25°C |
| DS1624 | Dallas | 2 wire serial, I2C serial | ±0.5°C (-55°C to 125°C) | SOP-8, DIP-8 | Addressable, multi-drop connection. Also has 256 bytes of EEPROM |

**Table B.5** Digital Output Temperature Sensors

The Analog Devices parts are interesting. They employ a sigma-delta ADC that produces a continuous pulse stream whose mark-space ratio is proportional to the temperature. This makes for easy interfacing to a microprocessor and for isolation by optical or other means. The same signal could also be passed through a low-pass filter to generate an analog voltage.

The Dallas DS2435 goes beyond a sensor plus ADC by providing simple data reduction using an eight-bin time/temperature histogram with definable bin boundaries. It appears to have been specifically designed for battery management, but other applications could include food transport monitoring and machine use monitoring. This sensor demonstrates the way of the future in sensor technology, where sensor, ADC, memory and microcontroller are integrated to perform an application-specific task very cost-effectively.
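The idea of an eight-bin time/temperature histogram can be sketched in a few lines of Python. This is only a software illustration of the data-reduction concept, not a model of the DS2435 register interface. The class name, the bin boundaries and the logged samples are hypothetical.

```python
from bisect import bisect_right


class TemperatureHistogram:
    """Accumulate the time spent in each temperature bin."""

    def __init__(self, boundaries):
        # N ascending boundaries define N + 1 bins.
        if list(boundaries) != sorted(boundaries):
            raise ValueError("bin boundaries must be ascending")
        self.boundaries = list(boundaries)
        self.seconds = [0.0] * (len(self.boundaries) + 1)

    def log(self, temperature_c, duration_s):
        # bisect_right returns the index of the bin containing the sample.
        self.seconds[bisect_right(self.boundaries, temperature_c)] += duration_s


# Seven hypothetical boundaries (°C) give eight bins.
hist = TemperatureHistogram([-20, 0, 10, 20, 30, 40, 60])
for temp, dt in [(25.0, 60), (27.5, 60), (45.0, 30), (-25.0, 10)]:
    hist.log(temp, dt)
print(hist.seconds)
```

Storing only the accumulated time per bin keeps the memory requirement fixed regardless of how long the device is in service, which is what makes this kind of data reduction attractive for embedded health logging.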

• Resistance Output Silicon Temperature Sensors

Some proprietary RTD sensors are constructed by depositing a thin metal film on a silicon substrate and trimming by laser. The temperature-versus-bulk-resistance characteristics of these semiconductor materials allow the manufacture of resistance output silicon temperature sensors using standard silicon semiconductor fabrication equipment. This construction can be more stable than other semiconductor sensors, due to the greater tolerance to ion migration. However, other characteristics (see below) require that care be taken in using these sensors.

| Sensor                                    | Manuf.   | Output                                        | Tolerance<br>(range)                                         | Package                               | Comments                                                                      |
|-------------------------------------------|----------|-----------------------------------------------|--------------------------------------------------------------|---------------------------------------|-------------------------------------------------------------------------------|
| KTY81<br>KTY82<br>KTY83<br>KTY84<br>KTY85 | Philips  | 1K or 2K<br>at 25°C,<br>+0.8%/°C<br>See below | ±1°C to<br>±12°C<br>(-55°C to<br>+150°C<br>some to<br>300°C) | SOD-70,<br>SOT-23<br>SOD-68<br>SOD-80 | Bulk resistance of<br>silicon. Keep excitation<br>current >0.1mA and <<br>1mA |
| KTY10<br>KTY11<br>KTY13                   | Siemens  | 1K or 2K<br>at 25°C,<br>+0.8%/°C<br>See below | ±1°C &<br>±3.5°C<br>(-25°C to<br>+105°C)                     | TO-92<br>modified                     | Bulk resistance of silicon                                                    |

Table B.6 Resistance Output Silicon Temperature Sensors

The silicon temperature sensor's resistance is given by the equation:

$$R = R_r (1 + a(T - T_r) + b(T - T_r)^2 - c(T - T_i)^d)$$

where  $R_r$  is the resistance at temperature  $T_r$  and a, b, c and d are constants.  $T_i$  is an inflection point temperature such that c = 0 for  $T < T_i$ . The resistance of some of these semiconductor sensors is dependent on the excitation current (due to current density effects in the semiconductor) and the polarity of the applied voltage. As with other non-passive temperature sensors, self-heating can induce errors. These proprietary sensors are well suited to HVAC (heating, ventilation and air conditioning) and general use inside the allowable temperature range.
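The resistance equation above can be evaluated, and inverted numerically to recover temperature from a measured resistance, as in the following minimal Python sketch. The coefficient values are placeholders of plausible magnitude chosen for illustration and are not taken from any datasheet. The function names and the bisection search range are likewise assumptions.

```python
def silicon_sensor_resistance(t, r_r=1000.0, t_r=25.0, a=7.9e-3, b=1.9e-5,
                              c=3.4e-8, d=3.7, t_i=100.0):
    """R = R_r (1 + a (T - T_r) + b (T - T_r)^2 - c (T - T_i)^d), with c = 0 for T < T_i."""
    dt = t - t_r
    ratio = 1.0 + a * dt + b * dt ** 2
    if t >= t_i:  # the correction term only applies above the inflection point
        ratio -= c * (t - t_i) ** d
    return r_r * ratio


def temperature_from_resistance(r, t_min=-55.0, t_max=150.0, tol=1e-6):
    """Invert the characteristic by bisection, assuming R rises monotonically with T."""
    lo, hi = t_min, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if silicon_sensor_resistance(mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


r_35 = silicon_sensor_resistance(35.0)
print(f"R(35 °C) = {r_35:.1f} ohm, inverted T = {temperature_from_resistance(r_35):.2f} °C")
```

Bisection is used here instead of a closed-form inverse because the piecewise term above $T_i$ makes the equation awkward to solve analytically.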

• Diode Temperature Sensors

An ordinary semiconductor diode may be used as a temperature sensor. The diode is the lowest-cost temperature sensor and can produce more than satisfactory results given a two-point calibration and a stable excitation current. Almost any silicon diode is suitable. The forward voltage across a diode has a temperature coefficient of about -2.3 mV/°C (the voltage falls as temperature rises) and is reasonably linear.
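A two-point calibration of a diode sensor can be sketched as follows in Python. The sketch assumes a constant excitation current and a linear voltage-temperature relationship between the two calibration points. The calibration voltages are hypothetical values chosen to give a slope near the coefficient quoted above.

```python
def two_point_calibration(t1, v1, t2, v2):
    """Return the slope (V/°C) and a converter from forward voltage to °C."""
    slope = (v2 - v1) / (t2 - t1)  # negative for a silicon diode

    def to_celsius(v_forward):
        return t1 + (v_forward - v1) / slope

    return slope, to_celsius


# Hypothetical calibration points: 0.650 V at 0 °C and 0.420 V at 100 °C.
slope, to_celsius = two_point_calibration(0.0, 0.650, 100.0, 0.420)
print(f"slope = {slope * 1e3:.2f} mV/°C, T(0.535 V) = {to_celsius(0.535):.1f} °C")
```

The same calibration is only valid at the excitation current used to record the two points, since the forward voltage also depends on the current.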

# **APPENDIX C**

## **SCHEMATICS OF EXPERIMENT CIRCUITS**

This appendix presents the schematics of the experiment circuits. Figure C.1 shows the relay drivers and voltage clamp circuits. Figure C.2 shows the IGBT drivers. Figure C.3 shows the thermocouple measurement circuit with AD595C.



Figure C.1 Schematics of relay drivers and device voltage clamp circuits



Figure C.2 Schematics of IGBT drivers



Figure C.3 Schematics of thermocouple measurement circuit with AD595C

# **APPENDIX D**

## **NON-SWITCHED TSEP MEASUREMENT METHOD**

Besides the switched TSEP measurement method used in this thesis, the non-switched measurement method can also be used for junction temperature measurements. It can be used under normal working conditions without switching between the load (or test) current and the sensing current. The disadvantages of this method have been reviewed in Section 4.3.3. Figure D.1 shows the dependence of  $V_{CE}$  and  $V_F$  on the junction temperature at various values of high currents. Given the temperature mapping of voltage drop and current, the junction temperature of healthy devices can be estimated online.
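One possible way to use such a temperature map online is a table lookup with linear interpolation, sketched below in Python. The calibration map holds placeholder values and is not the measured data of Figure D.1. It assumes every current level is calibrated at the same temperatures and that the voltage drop varies monotonically with temperature at each current. All names are hypothetical.

```python
def interp(x, x0, y0, x1, y1):
    """Linear interpolation between (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def temperature_on_curve(curve, v_drop):
    """Invert one V(T) curve given as a list of (T, V) points sorted by T."""
    for (t0, v0), (t1, v1) in zip(curve, curve[1:]):
        if min(v0, v1) <= v_drop <= max(v0, v1):
            return interp(v_drop, v0, t0, v1, t1)
    raise ValueError("voltage outside calibrated range")


def estimate_junction_temperature(cal_map, current, v_drop):
    """Estimate T_j from a {current: [(T, V), ...]} map of a healthy device."""
    currents = sorted(cal_map)
    for i_lo, i_hi in zip(currents, currents[1:]):
        if i_lo <= current <= i_hi:
            # Build the V(T) curve at the measured current, then invert it.
            curve = [(t, interp(current, i_lo, v_lo, i_hi, v_hi))
                     for (t, v_lo), (_, v_hi) in zip(cal_map[i_lo], cal_map[i_hi])]
            return temperature_on_curve(curve, v_drop)
    raise ValueError("current outside calibrated range")


# Placeholder calibration map: current (A) -> [(temperature °C, voltage drop V)].
CAL_MAP = {
    35.0: [(25.0, 1.60), (125.0, 1.80)],
    40.0: [(25.0, 1.70), (125.0, 1.92)],
}
t_j = estimate_junction_temperature(CAL_MAP, current=37.5, v_drop=1.755)
print(f"estimated junction temperature = {t_j:.1f} °C")
```

Because the map is recorded on a healthy device, any increase in voltage drop caused by degradation would be misread as a temperature change, which is consistent with the restriction to healthy devices stated above.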



Figure D.1 Temperature dependence of (a)  $V_{CE}$  and (b)  $V_F$  for currents from 35 A to 50 A (step: 5 A)

# **APPENDIX E**

## **EXPERIMENT DATA AND RESULTS**

This appendix extends the results presented in Chapter 6. The appendix contains a complete set of data for the results obtained from the preliminary test and thermal shock test.

### **E.1 Experiment Results**

| $T_a$           | No  | $T_{c,av}(0)$ | $\Delta T_{jr1}(1)$                   | $P_1$          | $R_{th jr1}(1)$                 | $\Delta T_{jr1}(2)$          | $P_2$   | $R_{th jr1}(2)$                 |
|-----------------|-----|---------------|---------------------------------------|----------------|---------------------------------|------------------------------|---------|---------------------------------|
| Unit            | -   | °C            | °C                                    | W              | °C/W                            | °C                           | W       | °C/W                            |
|                 | 1   | -18.80        | 96.95                                 | 195.731        | 0.495                           | 101.38                       | 201.737 | 0.503                           |
| T <sub>a1</sub> | 2   | -18.72        | 97.04                                 | 195.623        | 0.496                           | 101.37                       | 201.700 | 0.503                           |
|                 | 3   | -18.78        | 97.22                                 | 195.581        | 0.497                           | 101.10                       | 201.578 | 0.502                           |
|                 | 4   | 0.62          | 105.63                                | 204.951        | 0.515                           | 110.02                       | 211.702 | 0.520                           |
| T <sub>a2</sub> | 5   | 0.39          | 105.67                                | 204.769        | 0.516                           | 110.22                       | 211.578 | 0.521                           |
|                 | 6   | 0.52          | 105.78                                | 204.727        | 0.517                           | 110.31                       | 211.557 | 0.521                           |
|                 | 7   | 20.51         | 115.58                                | 215.017        | 0.538                           | 121.23                       | 221.226 | 0.545                           |
| T <sub>a3</sub> | 8   | 20.43         | 115.74                                | 214.533        | 0.539                           | 120.96                       | 222.010 | 0.545                           |
|                 | 9   | 20.47         | 115.56                                | 214.466        | 0.539                           | 120.66                       | 221.926 | 0.544                           |
|                 | 10  | 40.48         | 126.62                                | 222.938        | 0.568                           | 131.03                       | 230.676 | 0.568                           |
| $T_{a4}$        | 11  | 40.39         | 126.64                                | 223.432        | 0.567                           | 130.94                       | 230.963 | 0.567                           |
|                 | 12  | 40.36         | 126.74                                | 223.608        | 0.567                           | 130.87                       | 231.097 | 0.566                           |

**Table E.1** Measurement and Estimate Quantities for healthy IGBTs

| $T_a$           | No | $T_{c,av}(0)$ | $\Delta T_{jr1}(1)$ | $P_1$   | $R_{th jr1}(1)$ | $\Delta T_{jr1}(2)$ | $P_2$   | $R_{th jr1}(2)$ |
|-----------------|----|---------------|---------------------|---------|-----------------|---------------------|---------|-----------------|
| Unit            | -  | °C            | °C                  | W       | °C/W            | °C                  | W       | °C/W            |
|                 | 1  | -18.75        | 84.714              | 121.350 | 0.698           | 85.49               | 121.456 | 0.704           |
| T <sub>a1</sub> | 2  | -18.74        | 84.868              | 121.401 | 0.699           | 85.66               | 121.559 | 0.705           |
|                 | 3  | -18.77        | 84.522              | 121.397 | 0.696           | 85.34               | 121.477 | 0.703           |
|                 | 4  | 0.51          | 88.082              | 122.179 | 0.721           | 89.26               | 121.744 | 0.733           |
| $T_{a2}$        | 5  | 0.47          | 88.180              | 122.125 | 0.722           | 89.04               | 121.712 | 0.732           |
|                 | 6  | 0.47          | 87.975              | 122.096 | 0.721           | 88.66               | 121.683 | 0.729           |
|                 | 7  | 20.53         | 90.991              | 121.593 | 0.748           | 91.51               | 120.653 | 0.758           |
| $T_{a3}$        | 8  | 20.53         | 90.886              | 121.767 | 0.746           | 91.64               | 120.836 | 0.758           |
|                 | 9  | 20.46         | 91.308              | 121.822 | 0.750           | 91.38               | 120.945 | 0.756           |
|                 | 10 | 40.30         | 92.205              | 120.652 | 0.764           | 92.34               | 119.273 | 0.774           |
| $T_{a4}$        | 11 | 40.31         | 92.362              | 120.707 | 0.765           | 92.84               | 119.351 | 0.778           |
|                 | 12 | 40.43         | 92.381              | 120.576 | 0.766           | 92.63               | 119.210 | 0.777           |

**Table E.2** Measurement and Estimate Quantities for healthy diodes

 $T_{c,av}(0)$  is the initial average temperature of points R1 to R4;  $\Delta T_{jr1}(1)$  is the temperature difference between the junction and reference point R1 immediately after the first high current pulse;  $P_1$  is the power loss of the high current pulse;  $R_{th jr1}(1)$  is the thermal impedance at t = 1 s.
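The tabulated thermal impedance values can be cross-checked against the other columns: each one equals the temperature difference divided by the power loss, that is, a temperature rise per watt. The short Python sketch below repeats this check for three rows taken from Tables E.1 and E.2. The variable names are illustrative only.

```python
# (label, delta T in °C, power loss in W, tabulated thermal impedance)
rows = [
    ("IGBT, No 1, pulse 1", 96.95, 195.731, 0.495),
    ("IGBT, No 1, pulse 2", 101.38, 201.737, 0.503),
    ("diode, No 1, pulse 1", 84.714, 121.350, 0.698),
]

computed = []
for label, delta_t, power, tabulated in rows:
    r_th = round(delta_t / power, 3)  # temperature rise per watt
    computed.append(r_th)
    print(f"{label}: computed {r_th:.3f}, tabulated {tabulated:.3f}")
```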



**Figure E.1**  $V_{F(h)}(t)$ ,  $I_{C(h)}(t)$  and P(t) waveforms of the diode during heating pulses at four different ambient temperatures







**Figure E.3** Thermal impedance $Z_{thjr1}$ values (at t = 2s and ambient temperature = $-20^{\circ}$C, $0^{\circ}$C, $20^{\circ}$C, $40^{\circ}$C) for the IGBT (upper) and the relative errors (lower)







**Figure E.5** Thermal impedance $Z_{thjr1}$ values (at t = 2s and ambient temperature = $-20^{\circ}$C, $0^{\circ}$C, $20^{\circ}$C, $40^{\circ}$C) for the diode (upper) and the relative errors (lower)

| $T_a$ | No. | TIM | $T_{c,av}(0)$ (°C) | $Z_{thjr1}$, t = 1s (°C/W) | $Z_{thjr3}$, t = 1s (°C/W) | $Z_{thjr4}$, t = 1s (°C/W) | $Z_{thjr5}$, t = 1s (°C/W) | $Z_{thjr1}$, t = 2s (°C/W) | $Z_{thjr3}$, t = 2s (°C/W) | $Z_{thjr4}$, t = 2s (°C/W) | $Z_{thjr5}$, t = 2s (°C/W) |  |
|-------|----|------------|----------------|-----------------------|--------------------|--------------------|--------------------|--------------------|-----------------------|--------------------|--------------------|--|
| 0°C   | 1  | No GapPad  | 0.62           | 0.515                 | 0.554              | 0.558              | 0.547              | 0.520              | 0.561                 | 0.577              | 0.558              |  |
| 0°C   | 2  | No GapPad  | 0.39           | 0.516                 | 0.555              | 0.558              | 0.548              | 0.521              | 0.562                 | 0.577              | 0.560              |  |
| 0°C   | 3  | No GapPad  | 0.52           | 0.517                 | 0.556              | 0.558              | 0.548              | 0.521              | 0.562                 | 0.578              | 0.560              |  |
| 0°C   | 4  | GapPad3000 | 0.56           | 0.532                 | 0.556              | 0.561              | 0.552              | 0.552              | 0.569                 | 0.594              | 0.575              |  |
| 0°C   | 5  | GapPad3000 | 0.57           | 0.534                 | 0.557              | 0.562              | 0.554              | 0.552              | 0.570                 | 0.595              | 0.576              |  |
| 0°C   | 6  | GapPad3000 | 0.57           | 0.533                 | 0.556              | 0.561              | 0.553              | 0.552              | 0.570                 | 0.594              | 0.576              |  |
| 0°C   | 7  | GapPad1500 | -0.63          | 0.554                 | 0.566              | 0.573              | 0.562              | 0.587              | 0.587                 | 0.620              | 0.594              |  |
| 0°C   | 8  | GapPad1500 | 0.52           | 0.555                 | 0.567              | 0.573              | 0.563              | 0.588              | 0.588                 | 0.620              | 0.594              |  |
| 0°C   | 9  | GapPad1500 | -0.06          | 0.554                 | 0.566              | 0.573              | 0.563              | 0.588              | 0.587                 | 0.620              | 0.594              |  |
| 20°C  | 10 | No GapPad  | 20.51          | 0.538                 | 0.576              | 0.578              | 0.568              | 0.545              | 0.588                 | 0.601              | 0.582              |  |
| 20°C  | 11 | No GapPad  | 20.43          | 0.539                 | 0.577              | 0.580              | 0.570              | 0.545              | 0.587                 | 0.600              | 0.582              |  |
| 20°C  | 12 | No GapPad  | 20.47          | 0.539                 | 0.578              | 0.580              | 0.569              | 0.544              | 0.586                 | 0.600              | 0.582              |  |
| 20°C  | 13 | GapPad3000 | 20.19          | 0.558                 | 0.579              | 0.584              | 0.576              | 0.578              | 0.597                 | 0.619              | 0.600              |  |
| 20°C  | 14 | GapPad3000 | 20.37          | 0.553                 | 0.575              | 0.580              | 0.571              | 0.573              | 0.592                 | 0.615              | 0.595              |  |
| 20°C  | 15 | GapPad3000 | 20.28          | 0.555                 | 0.577              | 0.582              | 0.573              | 0.575              | 0.594                 | 0.617              | 0.597              |  |
| 20°C  | 16 | GapPad1500 | 20.43          | 0.574                 | 0.586              | 0.592              | 0.582              | 0.609              | 0.611                 | 0.640              | 0.615              |  |
| 20°C  | 17 | GapPad1500 | 20.26          | 0.576                 | 0.588              | 0.594              | 0.584              | 0.610              | 0.612                 | 0.641              | 0.616              |  |
| 20°C  | 18 | GapPad1500 | 20.35          | 0.575                 | 0.587              | 0.593              | 0.583              | 0.609              | 0.611                 | 0.641              | 0.615              |  |

**Table E.3** Thermal impedance at predefined conditions for IGBTs

| $T_a$ | No. | TIM | $T_{c,av}(0)$ (°C) | $Z_{thjr1}$, t = 1s (°C/W) | $Z_{thjr3}$, t = 1s (°C/W) | $Z_{thjr4}$, t = 1s (°C/W) | $Z_{thjr5}$, t = 1s (°C/W) | $Z_{thjr1}$, t = 2s (°C/W) | $Z_{thjr3}$, t = 2s (°C/W) | $Z_{thjr4}$, t = 2s (°C/W) | $Z_{thjr5}$, t = 2s (°C/W) |
|-----------|-----|------------|----------------|--------------------|--------------------|--------------------|--------------------|-----------------------|--------------------|--------------------|--------------------|
| 0°C       | 1   | No GapPad  | 0.51           | 0.721              | 0.805              | 0.801              | 0.797              | 0.733                 | 0.838              | 0.835              | 0.828              |
| 0°C       | 2   | No GapPad  | 0.47           | 0.722              | 0.806              | 0.803              | 0.798              | 0.732                 | 0.838              | 0.834              | 0.827              |
| 0°C       | 3   | No GapPad  | 0.47           | 0.721              | 0.805              | 0.802              | 0.798              | 0.729                 | 0.835              | 0.832              | 0.826              |
| 0°C       | 4   | GapPad3000 | 0.68           | 0.754              | 0.806              | 0.806              | 0.801              | 0.776                 | 0.846              | 0.852              | 0.840              |
| 0°C       | 5   | GapPad3000 | 0.59           | 0.752              | 0.806              | 0.804              | 0.801              | 0.773                 | 0.843              | 0.849              | 0.838              |
| 0°C       | 6   | GapPad3000 | 0.64           | 0.753              | 0.806              | 0.805              | 0.801              | 0.774                 | 0.845              | 0.850              | 0.839              |
| 0°C       | 7   | GapPad1500 | 0.58           | 0.780              | 0.815              | 0.815              | 0.808              | 0.812                 | 0.860              | 0.870              | 0.852              |
| 0°C       | 8   | GapPad1500 | 0.47           | 0.779              | 0.814              | 0.814              | 0.809              | 0.810                 | 0.857              | 0.868              | 0.850              |
| 0°C       | 9   | GapPad1500 | 0.53           | 0.780              | 0.814              | 0.814              | 0.808              | 0.811                 | 0.859              | 0.869              | 0.851              |
| 20°C      | 10  | No GapPad  | 20.53          | 0.748              | 0.830              | 0.829              | 0.824              | 0.758                 | 0.864              | 0.861              | 0.852              |
| 20°C      | 11  | No GapPad  | 20.53          | 0.746              | 0.829              | 0.826              | 0.823              | 0.758                 | 0.865              | 0.862              | 0.853              |
| 20°C      | 12  | No GapPad  | 20.46          | 0.750              | 0.832              | 0.830              | 0.826              | 0.756                 | 0.861              | 0.858              | 0.850              |
| 20°C      | 13  | GapPad3000 | 20.48          | 0.781              | 0.830              | 0.830              | 0.826              | 0.800                 | 0.871              | 0.875              | 0.864              |
| 20°C      | 14  | GapPad3000 | 20.61          | 0.780              | 0.830              | 0.829              | 0.825              | 0.801                 | 0.870              | 0.875              | 0.864              |
| 20°C      | 15  | GapPad3000 | 20.55          | 0.780              | 0.830              | 0.829              | 0.825              | 0.800                 | 0.870              | 0.875              | 0.864              |
| 20°C      | 16  | GapPad1500 | 20.56          | 0.809              | 0.841              | 0.842              | 0.838              | 0.841                 | 0.889              | 0.898              | 0.881              |
| 20°C      | 17  | GapPad1500 | 20.31          | 0.808              | 0.841              | 0.842              | 0.835              | 0.841                 | 0.888              | 0.898              | 0.882              |
| 20°C      | 18  | GapPad1500 | 20.43          | 0.808              | 0.841              | 0.842              | 0.836              | 0.841                 | 0.888              | 0.898              | 0.882              |

**Table E.4** Thermal impedance at predefined conditions for diodes



**Figure E.6** Influence of thermal path degradation on $Z_{thjr}$ values (at t = 1s, 2s and ambient temperature = 0°C, 20°C) with reference temperature measured at Reference 1 for IGBT (left) and diode (right)

**Figure E.7** Influence of thermal path degradation on $Z_{thjr}$ values (at t = 1s, 2s and ambient temperature = 0°C, 20°C) with reference temperature measured at Reference 5 for IGBT (left) and diode (right)

**Figure E.8** Influence of thermal path degradation on $Z_{thjr}$ values (at t = 1s, 2s and ambient temperature = 0°C, 20°C) with reference temperature measured at Reference 3 for IGBT (left) and diode (right)

**Figure E.9** Influence of thermal path degradation on $Z_{thjr}$ values (at t = 1s, 2s and ambient temperature = 0°C, 20°C) with reference temperature measured at Reference 4 for IGBT (left) and diode (right)

**Figure E.10** The averaged thermal impedance of IGBT (upper) and diode (lower) taken at t = 2s corresponding to all temperatures and linear regression
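As an illustration of the kind of fit summarised in Figure E.10, the sketch below averages the three repeated diode runs at each ambient temperature and fits a straight line by ordinary least squares. It uses only the $T_{c,av}(0)$ and t = 2s $Z_{thjr1}$ values listed for healthy diodes in Table E.2. Averaging over the repeated runs and the helper names `RUNS`, `mean` and `fit_line` are assumptions made for this example, so it is not necessarily the procedure used to produce the figure.

```python
# (T_c,av(0) in degC, Z_thjr1 in degC/W at t = 2 s) for the healthy diode,
# three repeated runs per ambient temperature, copied from Table E.2.
RUNS = {
    "Ta1": [(-18.75, 0.704), (-18.74, 0.705), (-18.77, 0.703)],
    "Ta2": [(0.51, 0.733), (0.47, 0.732), (0.47, 0.729)],
    "Ta3": [(20.53, 0.758), (20.53, 0.758), (20.46, 0.756)],
    "Ta4": [(40.30, 0.774), (40.31, 0.778), (40.43, 0.777)],
}


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


# Assumption: "averaged" means the mean over the repeated runs at each temperature.
temps = [mean(t for t, _ in runs) for runs in RUNS.values()]
z_avg = [mean(z for _, z in runs) for runs in RUNS.values()]

slope, intercept = fit_line(temps, z_avg)
print(f"slope = {slope:.5f} (degC/W)/degC, intercept = {intercept:.4f} degC/W")
```

A positive slope means that the measured thermal impedance rises with the initial case temperature, so a health indicator based on $Z_{thjr1}$ would need to be compared against a temperature-dependent baseline of this kind.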

# **APPENDIX F**

### **C-SAM ANALYSIS**

This section first gives a general discussion of C-SAM. C-SAM results from a group of six samples in both the healthy and the aged state are then compared, showing the degradation of the tested samples after thermal cycling. More ageing tests and experiments are to be carried out to further validate the proposed health monitoring techniques.

## **F.1 Introduction to SAM Measurement**

C-scan gate start = 1120ns after surface reflection, width =80ns; Each subsequent scan starts at a time 80ns later.

B-scan gate start = 1120ns, width = 960ns to produce 12 images. Gain = 50dB.

Ageing regime: 1 cycle consists of 10 minutes at  $T = -50^{\circ}$ C followed by 10 minutes at  $T = +160^{\circ}$ C.

C-scan gate start = 1120ns after surface trigger.

The gate position for the first scan (scan 1) is calculated as follows:

$$d_{gate} = v_{Cu} \times \frac{t}{2} = 4700\ \text{m s}^{-1} \times \frac{1120\ \text{ns}}{2} \approx 2.632\ \text{mm}$$

To correlate each scan with the corresponding layer of the structure, the time required for the sound to travel through each subsequent layer and back is roughly calculated by the equation below

$$\text{time required} = \frac{2 \times \text{layer thickness}}{\text{sound velocity}}$$

The results are shown in Table F.1.

Scans 2 & 3 clearly indicate a delamination in the layers between the baseplate and the lower copper layer. The contrast of Scan 2 is weaker than that of Scan 3 because part of its gate is still inside the baseplate layer.

Scans 5 & 6 clearly indicate a delamination between the upper copper layer and the aluminium oxide isolation.

Scans 7 & 8 do not indicate any damage to the solder layers between the silicon chips and the upper copper layer.

|                      | sound velocity<br>(m/s)[184] | layer<br>thickness<br>(mm) | time required<br>for each layer<br>(ns) | time<br>required<br>totally (ns) | total number of<br>scans at far ends<br>of each layer |
|----------------------|------------------------------|----------------------------|-----------------------------------------|----------------------------------|-------------------------------------------------------|
| baseplate            | 4700                         | 0.368                      | 156.60                                  | 156.60                           | 2.96                                                  |
| substrate<br>solder  | 2700                         | 0.08                       | 59.26                                   | 215.86                           | 3.70                                                  |
| lower<br>copper      | 4700                         | 0.3                        | 127.66                                  | 343.51                           | 5.29                                                  |
| alumina<br>substrate | 9900                         | 0.38                       | 76.77                                   | 420.28                           | 6.25                                                  |
| upper<br>copper      | 4700                         | 0.3                        | 127.66                                  | 547.94                           | 7.85                                                  |
| die-attach           | 2700                         | 0.08                       | 59.26                                   | 607.20                           | 8.59                                                  |

**Table F.1** Correlating each scan to the corresponding layer
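The entries of Table F.1 can be reproduced with a few lines of arithmetic. The sketch below applies the round-trip time equation to each layer, accumulates the times, and converts the cumulative time into a scan number. The conversion used here (cumulative time divided by the 80ns gate width, plus one for the first scan) is an assumption inferred from the tabulated values, and the names `LAYERS`, `gate_depth_mm` and `layer_table` are illustrative only.

```python
GATE_START_NS = 1120.0  # delay of the first C-scan gate after the surface echo
GATE_WIDTH_NS = 80.0    # gate width; each subsequent scan starts one width later

# (layer, sound velocity in m/s, thickness in mm), copied from Table F.1.
# The baseplate entry is the thickness listed in the table, not the full plate.
LAYERS = [
    ("baseplate", 4700.0, 0.368),
    ("substrate solder", 2700.0, 0.08),
    ("lower copper", 4700.0, 0.3),
    ("alumina substrate", 9900.0, 0.38),
    ("upper copper", 4700.0, 0.3),
    ("die-attach", 2700.0, 0.08),
]


def gate_depth_mm(velocity_m_s, delay_ns):
    """Depth in mm reached by a pulse-echo signal after a round-trip delay."""
    return velocity_m_s * delay_ns * 1e-9 / 2.0 * 1e3


def layer_table(layers, gate_width_ns=GATE_WIDTH_NS):
    """Return (name, layer time, cumulative time, scan number) per layer, times in ns."""
    rows = []
    total_ns = 0.0
    for name, velocity_m_s, thickness_mm in layers:
        layer_ns = 2.0 * thickness_mm * 1e-3 / velocity_m_s * 1e9
        total_ns += layer_ns
        # Assumed mapping: scan 1 starts at the gate start, then one scan per gate width.
        scan_number = total_ns / gate_width_ns + 1.0
        rows.append((name, layer_ns, total_ns, scan_number))
    return rows


print(f"first gate depth in copper: {gate_depth_mm(4700.0, GATE_START_NS):.3f} mm")
for name, layer_ns, total_ns, scan_number in layer_table(LAYERS):
    print(f"{name:18s} {layer_ns:8.2f} ns {total_ns:8.2f} ns  scan {scan_number:.2f}")
```

Under this assumed mapping, a layer boundary that falls between two integer scan numbers lies inside the gate of the lower-numbered scan, which helps when relating a given C-scan image to a layer of the module.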

## **F.2 SAM Results**

The area ratio of the degraded DCB solder layer is calculated from Scan 3 for each IGBT module. The results are compared in Table F.2 and shown in Figures F.1 to F.6.

**Table F.2** Comparisons of voided DCB solder layer (a) for top IGBT devices and (b) bottom IGBT devices

| Percentage of voided area | SA1   | SA2   | SA3   | SA4   | SA5   | SA6   | mean<br>ratio |
|---------------------------|-------|-------|-------|-------|-------|-------|---------------|
| at 0 thermal cycles       | 2.84  | 2.05  | 4.12  | 1.26  | 2.14  | 3.38  | 2.63          |
| at 700 thermal cycles     | 26.62 | 30.02 | 19.98 | 25.21 | 25.46 | 25.57 | 25.48         |

(a)

| Percentage of voided area | SB1   | SB2   | SB3   | SB4   | SB5   | SB6   | mean<br>ratio |
|---------------------------|-------|-------|-------|-------|-------|-------|---------------|
| at 0 thermal cycles       | 1.71  | 1.01  | 2.21  | 0.92  | 1.55  | 1.24  | 1.44          |
| at 700 thermal cycles     | 10.68 | 14.37 | 10.86 | 13.14 | 14.25 | 12.36 | 12.61         |

(b)

### SAM Scans of IGBT Module S1



C-scan gate start = 1120ns after surface reflection, width =80ns; B-scan gate start = 1120ns, width = 960ns to produce 12 images. Gain =50dB. Ageing regime: 1 cycle consists of 10 minutes at  $T = -50^{\circ}$ C followed by 10 minutes at  $T = +160^{\circ}$ C.



**Figure F.1** SAM scans of IGBT module S1




**Figure F.2** SAM scans of IGBT module S2






**Figure F.3** SAM scans of IGBT module S3





**Figure F.4** SAM scans of IGBT module S4






**Figure F.5** SAM scans of IGBT module S5






**Figure F.6** SAM scans of IGBT module S6

# **APPENDIX G**

## LABVIEW PROGRAM AND DESCRIPTIONS

#### Steps:

1a. Create analog input voltage channels for voltage and shunt measurement.

1b. Create analog input voltage channels for thermocouple measurement.

1c. Create digital output channels for pulse output. Create a Counter Output channel to produce a pulse train in terms of frequency. The first transition of the generated signal is from low to high, which is obtained by setting the Idle State of the pulse to low.

2a. Use the DAQmx Timing VI (Implicit) to configure the duration of the pulse generation.

2b. Set the sampling frequency and the number of samples.

3. Use a Property Node to change the NI9219 operation mode.

4a. Call the Get Terminal Name with Device Prefix VI. This will take a Task and a terminal and create a properly formatted device + terminal name to use as the source of the Trigger.

4b. Define the parameters for a Digital Edge Start Trigger. Set the Analog output to trigger off the AI Start Trigger. This is an internal trigger signal.

5. Call the Start VI to arm the two functions. Make sure the digital output is armed before the analog input. This will ensure both will start at the same time.

6. Read waveforms from the task that contains all the analog input channels.

7. Call the Clear VI to stop acquiring samples, clear the task and release any reserved resources.

8. Call the Configure Logging (TDMS) VI and configure the task to log the data.

9-13. Write TDMS files

14. Open the TDMS File Viewer to examine the data file.

15. Use the popup dialog box to display an error if any.





# **APPENDIX H**

## A PICTORIAL DESCRIPTION OF THE EXPERIMENT







### REFERENCES

- [1] "Investigation into the Scope for the Transport Sector to Switch to Electric Vehicles and Plug-in Hybrid Vehicles," Arup and Cenex, Department for Business Enterprise and Regulatory Reform and Department for Transport, London, 2008.
- [2] D. J. Chamund, L. Coulbeck, D. R. Newcombe, and P. R. Waind, "High power density IGBT module for high reliability applications," in *Power Electronics and Motion Control Conference, 2009. IPEMC '09. IEEE 6th International*, 2009, pp. 274-280.
- [3] U. Scheuermann, "Reliability challenges of automotive power electronics," *Microelectronics Reliability*, vol. 49, pp. 1319-1325, 2009.
- [4] M. Yi Lu, M. A. Masrur, C. ZhiHang, and Z. Baifang, "Model-based fault diagnosis in electric drives using machine learning," *Mechatronics, IEEE/ASME Transactions on*, vol. 11, pp. 290-303, 2006.
- [5] S. Nandi, H. A. Toliyat, and L. Xiaodong, "Condition monitoring and fault diagnosis of electrical motors-a review," *Energy Conversion, IEEE Transactions on*, vol. 20, pp. 719-729, 2005.
- [6] P. Wikstrom, L. A. Terens, and H. Kobi, "Reliability, availability, and maintainability of high-power variable-speed drive systems," *Industry Applications, IEEE Transactions on,* vol. 36, pp. 231-241, 2000.
- [7] E. Wolfgang, "Examples for failures in power electronics systems," EPE Tutorial 'Reliability of Power Electronic Systems', April 2007.
- [8] S. Yang, A. Bryant, P. Mawby, D. Xiang, L. Ran, and P. Tavner, "An Industry-Based Survey of Reliability in Power Electronic Converters," *Industry Applications, IEEE Transactions on*, vol. 47, pp. 1441-1451, 2011.
- [9] F. W. Fuchs, "Some diagnosis methods for voltage source inverters in variable speed drives with induction machines - a survey," in *Industrial Electronics Society, 2003. IECON '03. The 29th Annual Conference of the IEEE*, 2003, pp. 1378-1385 Vol.2.
- [10] C. Gillot, C. Schaeffer, C. Massit, and L. Meysenc, "Double-sided cooling for high power IGBT modules using flip chip technology," *Components and Packaging Technologies, IEEE Transactions on*, vol. 24, pp. 698-704, 2001.
- [11] J. B. Campbell, L. M. Tolbert, C. W. Ayers, B. Ozpineci, and K. T. Lowe, "Two-Phase Cooling Method Using the R134a Refrigerant to Cool Power Electronic Devices," *Industry Applications, IEEE Transactions on*, vol. 43, pp. 648-656, 2007.
- [12] G. Mulcahy and J. Santini, "Next Generation Military Vehicle Power Conversion Modules," TDI Power, April 2008.
- [13] "Advanced Modular Inverter Technology Development," Electricore, Inc., January 2006.
- [14] M. Mermet-Guyennet, "Reliability requirement for traction power converters," ECPE Workshop 'Built-in reliability into power electronic systems', June 2008.
- [15] S. Yang, D. Xiang, A. Bryant, P. Mawby, L. Ran, and P. Tavner, "Condition Monitoring for Device Reliability in Power Electronic Converters: A Review," *Power Electronics, IEEE Transactions on*, vol. 25, pp. 2734-2752, 2010.

- [16] K. Ambusaidi, V. Pickert, and B. Zahawi, "New Circuit Topology for Fault Tolerant H-Bridge DC–DC Converter," *Power Electronics, IEEE Transactions on*, vol. 25, pp. 1509-1516, 2010.
- [17] S. Karimi, A. Gaillard, P. Poure, and S. Saadate, "FPGA-Based Real-Time Power Converter Failure Diagnosis for Wind Energy Conversion Systems," *Industrial Electronics, IEEE Transactions on*, vol. 55, pp. 4299-4308, 2008.
- [18] "MIL-HDBK-217 handbook (version A)," United States Department of Defense (DoD) and Quanterion Solutions Inc., 1965.
- [19] "Semiconductor Reliability Handbook," Renesas Electronics Corporation, Nov. 2008.
- [20] N. M. Vichare and M. G. Pecht, "Prognostics and health management of electronics," *Components and Packaging Technologies, IEEE Transactions on*, vol. 29, pp. 222-229, 2006.
- [21] R. M. Tallam, L. Sang Bin, G. C. Stone, G. B. Kliman, Y. Jiyoon, T. G. Habetler, and R. G. Harley, "A Survey of Methods for Detection of Stator-Related Faults in Induction Machines," *Industry Applications, IEEE Transactions on*, vol. 43, pp. 920-933, 2007.
- [22] A. Santosh Kumar, R. P. Gupta, K. Udayakumar, and A. Venkatasami, "Online partial discharge detection and location techniques for condition monitoring of power transformers: A review," in *Condition Monitoring and Diagnosis, 2008. CMD 2008. International Conference on*, 2008, pp. 927-931.
- [23] G. M. Buiatti, J. A. Martín-Ramos, C. H. R. García, A. M. R. Amaral, and A. J. M. Cardoso, "An Online and Noninvasive Technique for the Condition Monitoring of Capacitors in Boost Converters," *Instrumentation and Measurement, IEEE Transactions on*, vol. 59, pp. 2134-2143, 2010.
- [24] M. Musallam, C. M. Johnson, Y. Chunyan, L. Hua, and C. Bailey, "In-service life consumption estimation in power modules," in *Power Electronics and Motion Control Conference, 2008. EPE-PEMC 2008. 13th*, 2008, pp. 76-83.
- [25] L. Hua and C. Bailey, "Lifetime prediction of an IGBT power electronics module under cyclic temperature loading conditions," in *Electronic Packaging Technology & High Density Packaging, 2009. ICEPT-HDP '09. International Conference on*, 2009, pp. 274-279.
- [26] M. Ciappa, "Lifetime Modeling and Prediction of Power Devices," *Integrated Power Systems (CIPS), 2008 5th International Conference on*, pp. 1-9, 2008.
- [27] G. Khatibi, M. Lederer, B. Weiss, T. Licht, J. Bernardi, and H. Danninger, "Accelerated mechanical fatigue testing and lifetime of interconnects in microelectronics," *Procedia Engineering*, vol. 2, pp. 511-519, 2010.
- [28] A. T. Bryant, P. A. Mawby, P. R. Palmer, E. Santi, and J. L. Hudgins, "Exploration of Power Device Reliability Using Compact Device Models and Fast Electrothermal Simulation," *Industry Applications, IEEE Transactions on*, vol. 44, pp. 894-903, 2008.
- [29] "FUJI IGBT modules application manual," Fuji Electric Device Technology Co., Ltd., 2004.
- [30] M. Ciappa and A. Castellazzi, "Reliability of High-Power IGBT Modules for Traction Applications," in *Reliability Physics Symposium, 2007. Proceedings. 45th Annual. IEEE International*, 2007, pp. 480-485.
- [31] T. Stockmeier, "From Packaging to "Un"-Packaging - Trends in Power Semiconductor Modules," in *Power Semiconductor Devices and IC's, 2008. ISPSD '08. 20th International Symposium on*, 2008, pp. 12-19.
- [32] P. Beckedahl, M. Hermann, M. Kind, M. Knebel, J. Nascimento, and A. Wintrich, "Performance Comparison of Traditional Packaging Technologies to a Novel Bond Wire Less All Sintered Power Module," presented at the Intelligent Motion and Power Quality (PCIM) Europe, Nuremberg, Germany, 2011.

- [33] U. Nicolai, W. Tursky, and T. Reimann, *Power Semiconductor Application Manual*. Nuremberg: SEMIKRON INTERNATIONAL GmbH.
- [34] R. Bayerer, "Higher Junction Temperature in Power Modules - a demand from hybrid cars, a potential for the next step increase in power density for various Variable Speed Drives," presented at the Intelligent Motion and Power Quality (PCIM) Europe, Nuremberg, Germany, 2008.
- [35] K. Guth, F. Hille, F. Umbach, D. Siepe, J. Görlich, H. Torwesten, and R. Roth, "New assembly and interconnects beyond sintering methods," presented at the Intelligent Motion and Power Quality (PCIM) Europe, Nuremberg, Germany, 2010.
- [36] A. Ciliox, F. Hille, F. Umbach, J. Görlich, K. Guth, D. Siepe, S. Krasel, and P. Szczupak, "New module generation for higher lifetime," presented at the Intelligent Motion and Power Quality (PCIM) Europe, Nuremberg, Germany, 2010.
- [37] R. Ott, M. Bässler, R. Tschirbs, and D. Siepe, "New superior assembly technologies for modules with highest power densities," presented at the Intelligent Motion and Power Quality (PCIM) Europe, Nuremberg, Germany, 2010.
- [38] R. Bayerer, "Advanced packaging yields higher performance and reliability in power electronics," *Microelectronics Reliability*, vol. 50, pp. 1715-1719, 2010.
- [39] E. R. Motto and J. F. Donlon, "The latest advances in industrial IGBT module technology," in *Applied Power Electronics Conference and Exposition, 2004. APEC '04. Nineteenth Annual IEEE*, 2004, pp. 235-240 Vol.1.
- [40] M. Pecht and A. Dasgupta, "Physics-of-failure: an approach to reliable product development," in *Integrated Reliability Workshop, 1995. Final Report., International*, 1995, pp. 1-4.
- [41] S. Kumar, E. Dolev, and M. Pecht, "Parameter selection for health monitoring of electronic products," *Microelectronics Reliability*, vol. 50, pp. 161-168, 2010.
- [42] M. Held, P. Jacob, G. Nicoletti, P. Scacco, and M. H. Poech, "Fast power cycling test of IGBT modules in traction application," in *Power Electronics and Drive Systems, 1997. Proceedings., 1997 International Conference on*, 1997, pp. 425-430 vol.1.
- [43] L. Feller, S. Hartmann, and D. Schneider, "Lifetime analysis of solder joints in high power IGBT modules for increasing the reliability for operation at 150°C," *Microelectronics Reliability*, vol. 48, pp. 1161-1166, 2008.
- [44] M. Pecht, "Integrated Circuit Hybrid, and Multichip Module Package Design Guidelines [Books and Reports]," *Power Engineering Review, IEEE*, vol. 15, p. 33, 1995.
- [45] M. Pecht and G. Jie, "Physics-of-failure-based prognostics for electronic products," *Transactions of the Institute of Measurement and Control*, vol. 31, pp. 309-322, 2009.
- [46] M. Ciappa, "Selected failure mechanisms of modern power modules," *Microelectronics Reliability*, vol. 42, pp. 653-667, 2002.
- [47] M. Ciappa and W. Fichtner, "Lifetime prediction of IGBT modules for traction applications," in *Reliability Physics Symposium, 2000. Proceedings. 38th Annual 2000 IEEE International*, 2000, pp. 210-216.
- [48] V. Smet, F. Forest, J.-J. Huselstein, F. Richardeau, Z. Khatir, S. Lefebvre, and M. Berkani, "Ageing and failure modes of IGBT modules in high-temperature power cycling," *IEEE Transactions on Industrial Electronics*, vol. 58, pp. 4931-4941, 2011.

- [49] A. Morozumi, K. Yamada, T. Miyasaka, S. Sumi, and Y. Seki, "Reliability of power cycling for IGBT power semiconductor modules," *Industry Applications, IEEE Transactions on*, vol. 39, pp. 665-671, 2003.
- [50] T. Herrmann, M. Feller, J. Lutz, R. Bayerer, and T. Licht, "Power cycling induced failure mechanisms in solder layers," in *Power Electronics and Applications, 2007 European Conference on*, 2007, pp. 1-7.
- [51] L. Dupont, Z. Khatir, S. Lefebvre, R. Meuret, B. Parmentier, and S. Bontemps, "Electrical characterizations and evaluation of thermo-mechanical stresses of a power module dedicated to high temperature applications," in *Power Electronics and Applications, 2005 European Conference on*, 2005, pp. 11 pp.-P.11.
- [52] L. Dupont, Z. Khatir, S. Lefebvre, and S. Bontemps, "Effects of metallization thickness of ceramic substrates on the reliability of power assemblies under high temperature cycling," *Microelectronics Reliability*, vol. 46, pp. 1766-1771, 2006.
- [53] G. Mitic and G. Lefranc, "Localization of electrical-insulation and partial-discharge failures of IGBT modules," *Industry Applications, IEEE Transactions on*, vol. 38, pp. 175-180, 2002.
- [54] N. Patil, J. Celaya, D. Das, K. Goebel, and M. Pecht, "Precursor Parameter Identification for Insulated Gate Bipolar Transistor (IGBT) Prognostics," *Reliability, IEEE Transactions on*, vol. 58, pp. 271-276, 2009.
- [55] E. A. Amerasekera and F. N. Najm, *Failure Mechanisms in Semiconductor Devices*, 2nd ed.: John Wiley & Sons Ltd, 1998.
- [56] C. Busca, R. Teodorescu, F. Blaabjerg, S. Munk-Nielsen, L. Helle, T. Abeyasekera, and P. Rodriguez, "An overview of the reliability prediction related aspects of high power IGBTs in wind power applications," *Microelectronics Reliability*, vol. 51, pp. 1903-1907, 2011.
- [57] X. Perpiñà, J. F. Serviere, X. Jordà, A. Fauquet, S. Hidalgo, J. Urresti-Ibañez, J. Rebollo, and M. Mermet-Guyennet, "IGBT module failure analysis in railway applications," *Microelectronics Reliability*, vol. 48, pp. 1427-1431, 2008.
- [58] N. Patil, D. Das, Y. Chunyan, L. Hua, C. Bailey, and M. Pecht, "A fusion approach to IGBT power module prognostics," in *Thermal, Mechanical and Multi-Physics simulation and Experiments in Microelectronics and Microsystems, 2009. EuroSimE 2009. 10th International Conference on*, 2009, pp. 1-5.
- [59] D. L. Goodman, "Prognostic methodology for deep submicron semiconductor failure modes," *Components and Packaging Technologies, IEEE Transactions on*, vol. 24, pp. 109-111, 2001.
- [60] J.-S. Jeong, S.-H. Hong, and S.-D. Park, "Field failure mechanism and improvement of EOS failure of integrated IGBT inverter modules," *Microelectronics Reliability*, vol. 47, pp. 1795-1799, 2007.
- [61] M. Trivedi and K. Shenai, "Failure mechanisms of IGBT's under short-circuit and clamped inductive switching stress," *IEEE Transactions on Power Electronics*, vol. 14, pp. 108-116, 1999.
- [62] F. D. Bauer, "Accurate analytical modelling of cosmic ray induced failure rates of power semiconductor devices," *Solid-State Electronics*, vol. 53, pp. 584-589, 2009.
- [63] H. Berg and E. Wolfgang, "Advanced IGBT modules for railway traction applications: Reliability testing," *Microelectronics Reliability*, vol. 38, pp. 1319-1323, 1998.
- [64] V. A. Sankaran, C. Chen, C. S. Avant, and X. Xu, "Power cycling reliability of IGBT power modules," in *Industry Applications Conference, 1997. Thirty-Second IAS Annual Meeting, IAS '97., Conference Record of the 1997 IEEE*, 1997, pp. 1222-1227 vol.2.

- [65] F. Kovačević, U. Drofenik, and J. W. Kolar, "New physical model for lifetime estimation of power modules," in *Power Electronics Conference (IPEC), 2010 International*, 2010, pp. 2106-2114.
- [66] J. Lutz, "IGBT-Modules: Design for reliability," in *Power Electronics and Applications, 2009. EPE '09. 13th European Conference on*, 2009, pp. 1-3.
- [67] H. Ye, M. Lin, and C. Basaran, "Failure modes and FEM analysis of power electronic packaging," *Finite Elements in Analysis and Design*, vol. 38, pp. 601-612, 2002.
- [68] Z. Khatir and S. Lefebvre, "Boundary element analysis of thermal fatigue effects on high power IGBT modules," *Microelectronics Reliability*, vol. 44, pp. 929-938, 2004.
- [69] R. Amro, J. Lutz, J. Rudzki, M. Thoben, and A. Lindemann, "Double-sided low-temperature joining technique for power cycling capability at high temperature," in *Power Electronics and Applications, 2005 European Conference on*, 2005, pp. 10 pp.-P.10.
- [70] A. Hamidi, N. Beck, K. Thomas, and E. Herr, "Reliability and lifetime evaluation of different wire bonding technologies for high power IGBT modules," *Microelectronics Reliability*, vol. 39, pp. 1153-1158, 1999.
- [71] B. Ji, V. Pickert, and B. Zahawi, "In-situ Measurement of the Bond Wire Lift-off in IGBT Power Modules," presented at the Intelligent Motion and Power Quality (PCIM) Europe, Nuremberg, Germany, 2011.
- [72] J. Onuki and M. Koizumi, "Reliability of thick Al wire bonds in IGBT modules for traction motor drives," in *Power Semiconductor Devices and ICs, 1995. ISPSD '95. Proceedings of the 7th International Symposium on*, 1995, pp. 428-433.
- [73] B. Nagl, J. Nicolics, and W. Gschohsmann, "Analysis of thermomechanically related failures of traction IGBT power modules at short circuit switching," in *Electronic System-Integration Technology Conference (ESTC)*, 2010 3rd, 2010, pp. 1-6.
- [74] T. Schuetze, H. Berg, and O. Schilling, "The new 6.5kV IGBT module: a reliable device for medium voltage applications," presented at the Intelligent Motion and Power Quality (PCIM) Europe, Nuremberg, Germany, 2001.
- [75] M. Pecht, D. Das, and A. Ramakrishnan, "The IEEE standards on reliability program and reliability prediction methods for electronic equipment," *Microelectronics Reliability*, vol. 42, pp. 1259-1266.
- [76] P. Lall, M. Pecht, and E. B. Hakim, *Influence of Temperature on Microelectronics and System Reliability: A Physics of Failure Approach*, 1st ed.: CRC Press.
- [77] M. Pecht, B. Tuchband, N. Vichare, and J. Y. Qu, "Prognostics and health monitoring of electronics," in *EuroSimE 2007: International Conference on Thermal, Mechanical and Multi-Physics Simulation Experiments in Microelectronics and Micro-Systems, 2007, April 16, 2007 - April 18, 2007*, London, United Kingdom, 2007.
- [78] J. W. Sheppard, M. A. Kaufman, and T. J. Wilmering, "IEEE standards for prognostics and health management," in *IEEE Autotestcon 2008, September 8, 2008 - September 11, 2008*, Salt Lake City, UT, United States, 2008, pp. 97-103.
- [79] H. Lu, C. Bailey, and C. Yin, "Design for reliability of power electronics modules," *Microelectronics Reliability*, vol. 49, pp. 1250-1255, 2009.
- [80] S. Mishra, M. Pecht, and D. L. Goodman, "In-situ sensors for product reliability monitoring," *Proceedings of SPIE The International Society for Optical Engineering*, vol. 4755, pp. 10-19, 2002.
- [81] Hot Carrier (HC) Prognostic Cell [Online].

- [82] D. Goodman, B. Vermeire, J. Ralston-Good, and R. Graves, "A board-level prognostic monitor for MOSFET TDDB," in *Aerospace Conference*, 2006 IEEE, 2006, p. 6 pp.
- [83] P. Lall, M. Hande, C. Bhat, V. More, R. Vaidya, and J. Suhling, "Algorithms for prognostication of prior damage and residual life in lead-free electronics subjected to thermo-mechanical loads," in *Thermal and Thermomechanical Phenomena in Electronic Systems, 2008. ITHERM 2008. 11th Intersociety Conference on*, 2008, pp. 638-651.
- [84] (Dec, 2011). Agilent B1505A Power Device Analyzer/Curve Tracer Datasheet. Available: http://cp.literature.agilent.com/litweb/pdf/5990-3853EN.pdf
- [85] (Dec, 2011). Tektronix 371B Programmable Curve Tracers Datasheet. Available: http://www2.tek.com/cmsreplive/psrep/13558/76W\_10757\_3\_2008.05.15.16.34.12\_13558\_EN.pdf
- [86] (Dec, 2011). *T3Ster Thermal Tester Datasheet*. Available: http://www.mentor.com/products/mechanical/products/upload/t3ster.pdf
- [87] (Dec, 2011). *TESEC 9624-KT/L Thermal Tester*. Available: http://www.tesec.co.jp/english/index.html
- [88] (Dec, 2011). *Phase 11 Thermal Analyzer*. Available: http://www.analysistech.com/semi-product.htm
- [89] J. R. Celaya, V. Vashchenko, P. Wysocki, S. Saha, and K. Goebel, "Accelerated aging system for prognostics of power semiconductor devices," in 45 Years of Support Innovation - Moving Forward at the Speed of Light, AUTOTESTCON 2010, September 13, 2010 - September 16, 2010, Orlando, FL, United States, 2010, pp. 118-123.
- [90] H. Zhang, R. Kang, M. Luo, and M. Pecht, "Precursor parameter identification for power supply prognostics and health management," in 2009 8th International Conference on Reliability, Maintainability and Safety, ICRMS 2009, July 20, 2009 - July 24, 2009, Chengdu, China, 2009, pp. 883-887.
- [91] R. S. Muller, T. I. Kamins, and M. Chan, *Device electronics for integrated circuits*, 3rd ed.: John Wiley & Sons, 2003.
- [92] J. Due, S. Munk-Nielsen, and R. Nielsen, "Lifetime investigation of high power IGBT modules," in *Power Electronics and Applications (EPE 2011)*, *Proceedings of the 2011-14th European Conference on*, 2011, pp. 1-8.
- [93] M. Held, P. Jacob, G. Nicoletti, P. Scacco, and M. H. Poech, "Fast power cycling test for insulated gate bipolar transistor modules in traction application," *International Journal of Electronics*, vol. 86, pp. 1193-1204, 1999.
- [94] A. Hamidi and G. Coquery, "Effects of current density and chip temperature distribution on lifetime of high power IGBT modules in traction working conditions," *Microelectronics Reliability*, vol. 37, pp. 1755-1758, 1997.
- [95] G. Coquery and R. Lallemand, "Failure criteria for long term Accelerated Power Cycling Test linked to electrical turn off SOA on IGBT module. A 4000 hours test on 1200A–3300V module with AlSiC base plate," *Microelectronics Reliability*, vol. 40, pp. 1665-1670.
- [96] E. Herr, T. Frey, R. Schlegel, A. Stuck, and R. Zehringer, "Substrate-to-base solder joint reliability in high power IGBT modules," *Microelectronics Reliability*, vol. 37, pp. 1719-1722, 1997.
- [97] G. Lefranc, T. Licht, H. J. Schultz, R. Beinert, and G. Mitic, "Reliability testing of high-power multi-chip IGBT modules," *Microelectronics Reliability*, vol. 40, pp. 1659-1663.

- [98] J. Göttert, W. Köhler, K. H. Sommer, and G. Lefranc, "Insulation Voltage Test and Partial Discharge Test of 23.3 kV IGBT modules," presented at the Intelligent Motion and Power Quality (PCIM) Europe, Nuremberg, Germany, 1997.
- [99] Farokhzad, "Method for early failure recognition in power semiconductor modules."
- [100] M. Pecht and R. Jaai, "A prognostics and health management roadmap for information and electronics-rich systems," *Microelectronics Reliability*, vol. 50, pp. 317-323, 2010.
- [101] H. Abbad, S. Azzopardi, E. Woirgard, J. Deletage, P. Rollin, K. Marchand, T. Lhommeau, and M. Piton, "A first approach on the failure mechanisms of IGBT inverters for aeronautical applications: Effect of humidity-pressure combination," in *Power Electronics Conference (IPEC), 2010 International*, 2010, pp. 2450-2456.
- [102] G. Jie, D. Barker, and M. Pecht, "Health Monitoring and Prognostics of Electronics Subject to Vibration Load Conditions," *Sensors Journal, IEEE*, vol. 9, pp. 1479-1485, 2009.
- [103] L. Larcher, A. Paccagnella, and G. Ghidini, "A model of the stress induced leakage current in gate oxides," *Electron Devices, IEEE Transactions on*, vol. 48, pp. 285-288, 2001.
- [104] O. Salmela, "Acceleration factors for lead-free solder materials," *IEEE Transactions on Components and Packaging Technologies*, vol. 30, pp. 700-707, 2007.
- [105] L. Yang, J. B. Bernstein, and T. Koschmieder, "Assessment of acceleration models used for BGA solder joint reliability studies," *Microelectronics Reliability*, vol. 49, pp. 1546-1554, 2009.
- [106] R. Bayerer, T. Herrmann, T. Licht, J. Lutz, and M. Feller, "Model for Power Cycling lifetime of IGBT Modules - various factors influencing lifetime," *Integrated Power Systems (CIPS), 2008 5th International Conference on*, pp. 1-6, 2008.
- [107] Q. Jin and J. B. Bernstein, "Non-Arrhenius temperature acceleration and stress-dependent voltage acceleration for semiconductor device involving multiple failure mechanisms," in 2006 IEEE International Integrated Reliability Workshop Final Report, IIRW, October 16, 2006 - October 19, 2006, South Lake Tahoe, CA, United States, 2006, pp. 93-97.
- [108] R. C. Blish II, "Temperature cycling and thermal shock failure rate modeling," in Proceedings of the 1997 35th Annual IEEE International Reliability Physics Symposium, April 8, 1997 - April 10, 1997, Denver, CO, USA, 1997, pp. 110-117.
- [109] M. Musallam and C. M. Johnson, "Real-time compact thermal models for health management of power electronics," *IEEE Transactions on Power Electronics*, vol. 25, pp. 1416-1425, 2010.
- [110] A. T. Bryant, P. A. Mawby, P. R. Palmer, E. Santi, and J. L. Hudgins, "Exploration of power device reliability using compact device models and fast electro-thermal simulation," in 2006 IEEE Industry Applications Conference - Forty-First IAS Annual Meeting, October 8, 2006 - October 12, 2006, Tampa, FL, United States, 2006, pp. 1465-1472.
- [111] M. Musallam, P. P. Acarnley, C. M. Johnson, L. Pritchard, and V. Pickert, "Estimation and control of power electronic device temperature during operation with variable conducting current," *IET Circuits, Devices and Systems,* vol. 1, pp. 111-116, 2007.

- [112] L. Yang, P. A. Agyakwa, and C. M. Johnson, "A time-domain physics-of-failure model for the lifetime prediction of wire bond interconnects," Langford Lane, Kidlington, Oxford, OX5 1GB, United Kingdom, 2011, pp. 1882-1886.
- [113] G. Vachtsevanos, F. L. Lewis, M. Roemer, A. Hess, and B. Wu, *Intelligent fault diagnosis and prognosis for engineering systems*, 1st ed. Hoboken (NJ): John Wiley & Sons, Inc., 2006.
- [114] S. Cheng and M. Pecht, "A fusion prognostics method for remaining useful life prediction of electronic products," in 2009 IEEE International Conference on Automation Science and Engineering, CASE 2009, August 22, 2009 - August 25, 2009, Bangalore, India, 2009, pp. 102-107.
- [115] A. Kay. Analysis And Measurement Of Intrinsic Noise In Op Amp circuits Part I: Introduction And Review of Statistics. Available: http://www.analogzone.com/avt\_0904.pdf
- [116] R. Mancini. (2002, Dec, 2011). Op Amps for Everyone Design Reference (Rev. B), Texas Instruments. Available: http://www.ti.com/lit/an/slod006b/slod006b.pdf
- [117] B. Baker. (2010). Techniques that Reduce System Noise in ADC Circuits.
- [118] Low Level Measurements Handbook: Precision DC Current, Voltage, and Resistance Measurements, 6th ed.: Keithley, 2004.
- [119] CHAPTER 12: PRINTED CIRCUIT BOARD (PCB) DESIGN ISSUES. Available: http://www.analog.com/library/analogdialogue/archives/43-09/EDch%2012%20pc%20issues.pdf
- [120] Y. Xiong, X. Cheng, Z. J. Shen, C. Mi, H. Wu, and V. K. Garg, "Prognostic and warning system for power-electronic modules in electric, hybrid electric, and fuel-cell vehicles," *IEEE Transactions on Industrial Electronics*, vol. 55, pp. 2268-2276, 2008.
- [121] Y.-S. Kim and S.-K. Sul, "On-line estimation of IGBT junction temperature using on-state voltage drop," in *Proceedings of the 1998 IEEE Industry Applications Conference. Part 1 (of 3), October 12, 1998 - October 15, 1998*, St.Louis, MO, USA, 1998, pp. 853-859.
- [122] (Dec, 2011). *High Common-Mode Voltage Difference Amplifier AD629 Datasheet.* Available: http://www.analog.com/static/importedfiles/data\_sheets/AD629.pdf
- [123] (Dec, 2011). *High Common-Mode Voltage Difference Amplifier INA117 Datasheet*. Available: http://www.ti.com/lit/ds/symlink/ina117.pdf
- [124] (Dec, 2011). *High Speed CMOS Digital Optocoupler ACPL-772L Datasheet*. Available: http://www.avagotech.com/docs/AV02-0324EN
- [125] "Interface Guide," Texas Instruments, Inc.
- [126] B. Chen, "iCoupler® Products with isoPower™ Technology: Signal and Power Transfer Across Isolation Barrier Using Microtransformers," Analog Devices, Inc., 2006.
- [127] M. M. R. Ahmed and G. A. Putrus, "A method for predicting IGBT junction temperature under transient condition," in *Industrial Electronics*, 2008. IECON 2008. 34th Annual Conference of IEEE, 2008, pp. 454-459.
- [128] (Dec, 2011). *IPOSIM The Infineon Power Simulation program for loss and thermal calculation of Infineon power modules and disk devices*. Available: http://www.infineon.com/cms/en/product/promopages/designtools/index.html
- [129] (Dec, 2011). *SEMISEL simulation tool*. Available: http://www.semikron.com/skcompub/en/index.htm
- [130] T. Bruckner and S. Bernet, "Estimation and Measurement of Junction Temperatures in a Three-Level Voltage Source Converter," *Power Electronics, IEEE Transactions on*, vol. 22, pp. 3-12, 2007.

- [131] Z. Khatir, S. Carubelli, and F. Lecoq, "Real-time computation of thermal constraints in multichip power electronic devices," *Components and Packaging Technologies, IEEE Transactions on*, vol. 27, pp. 337-344, 2004.
- [132] D. Xu, H. Lu, L. Huang, S. Azuma, M. Kimata, and R. Uchida, "Power loss and junction temperature analysis of power semiconductor devices," *IEEE Transactions on Industry Applications*, vol. 38, pp. 1426-1431, 2002.
- [133] J. Antonios, N. Ginot, C. Batard, Y. Scudeller, and M. Machmoum, "Electrothermal investigations on silicon inverters operating at low frequency," in *Thermal, Mechanical & Multi-Physics Simulation, and Experiments in Microelectronics and Microsystems (EuroSimE), 2010 11th International Conference on,* 2010, pp. 1-5.
- [134] J. Antonios, N. Ginot, C. Batard, Y. Scudeller, and M. Machmoum, "Electrothermal investigations on silicon inverters operating at low frequency," in 2010 11th International Conference on Thermal, Mechanical and Multi-Physics Simulation, and Experiments in Microelectronics and Microsystems, EuroSimE 2010, April 26, 2010 - April 28, 2010, Bordeaux, France, 2010.
- [135] "MIL-STD-883G, DEPARTMENT OF DEFENSE TEST METHOD STANDARD: MICROCIRCUITS," United States Department of Defense, 28 February 2006.
- [136] "JESD51-1, EIA / JEDEC Standard, Electronic Industries Association, Integrated Circuit Thermal Measurement Method - Electrical Test Method," 1995.
- [137] R. Amro, J. Lutz, and A. Lindemann, "Power cycling with high temperature swing of discrete components based on different technologies," in 2004 IEEE 35th Annual Power Electronics Specialists Conference, PESC04, June 20, 2004 - June 25, 2004, Aachen, Germany, 2004, pp. 2593-2598.
- [138] U. Scheuermann and U. Hecht, "Power Cycling Lifetime of Advanced Power Modules for Different Temperature Swings," presented at the Intelligent Motion and Power Quality (PCIM) Europe, Nuremberg, Germany, 2002.
- [139] G. Coquery, G. Lefranc, T. Licht, R. Lallemand, N. Seliger, and H. Berg, "High Temperature Reliability on Automotive Power Modules Verified by Power Cycling Tests up to 150°C," presented at the European Symposium on Reliability of Electron Devices, Failure Physics and Analysis, Arcachon, France, 2003.
- [140] S. Sumi, K. Ohga, and K. Shirai, "Thermal Fatigue Failures of Large Scale Package Type Power Transistor Modules," presented at the International Symposium for Testing and Failure Analysis, Los Angeles, California, USA, 1989.
- [141] C. Xiao, W. Tao, K. D. T. Ngo, and L. Guo-Quan, "Characterization of Lead-Free Solder and Sintered Nano-Silver Die-Attach Layers Using Thermal Impedance," *Components, Packaging and Manufacturing Technology, IEEE Transactions on*, vol. 1, pp. 495-501, 2011.
- [142] J. W. Sofia, "Analysis of thermal transient data with synthesized dynamic models for semiconductor devices," *Components, Packaging, and Manufacturing Technology, Part A, IEEE Transactions on*, vol. 18, pp. 39-47, 1995.
- [143] D. L. Blackburn, "An Electrical Technique for the Measurement of the Peak Junction Temperature of Power Transistors," in *Reliability Physics Symposium*, 1975. 13th Annual, 1975, pp. 142-150.
- [144] D. L. Blackburn and F. F. Oettinger, "Transient Thermal Response Measurements of Power Transistors," *Industrial Electronics and Control Instrumentation, IEEE Transactions on*, vol. IECI-22, pp. 134-141, 1975.

- [145] D. C. Katsis and J. D. van Wyk, "Void-induced thermal impedance in power semiconductor modules: some transient temperature effects," *Industry Applications, IEEE Transactions on*, vol. 39, pp. 1239-1246, 2003.
- [146] S. Hartmann, M. Bayer, D. Schneider, and L. Feller, "Observation of chip solder degradation by electrical measurements during power cycling," in *Integrated Power Electronics Systems (CIPS), 2010 6th International Conference on*, 2010, pp. 1-6.
- [147] D. Schweitzer, H. Pape, and C. Liu, "Transient Measurement of the Junction-To-Case Thermal Resistance Using Structure Functions: Chances and Limits," in Semiconductor Thermal Measurement and Management Symposium, 2008. Semi-Therm 2008. Twenty-fourth Annual IEEE, 2008, pp. 191-197.
- [148] D. Schweitzer, H. Pape, C. Liu, R. Kutscherauer, and M. Walder, "Transient dual interface measurement — A new JEDEC standard for the measurement of the junction-to-case thermal resistance," in *Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM)*, 2011 27th Annual IEEE, 2011, pp. 222-229.
- [149] V. Székely and T. Van Bien, "Fine structure of heat flow path in semiconductor devices: A measurement and identification method," *Solid-State Electronics*, vol. 31, pp. 1363-1368, 1988.
- [150] Y. Jian, J. D. van Wyk, W. G. Odendaal, and L. Zhenxian, "Comparison of transient thermal parameters for different die connecting approaches," in *Industry Applications Conference*, 2003. 38th IAS Annual Meeting. Conference Record of the, 2003, pp. 1818-1825 vol.3.
- [151] D. L. Blackburn, "Temperature measurements of semiconductor devices - a review," in *Semiconductor Thermal Measurement and Management Symposium, 2004. Twentieth Annual IEEE*, 2004, pp. 70-80.
- [152] T. E. Salem, D. Ibitayo, and B. R. Geil, "Validation of Infrared Camera Thermal Measurements on High-Voltage Power Electronic Components," *Instrumentation and Measurement, IEEE Transactions on*, vol. 56, pp. 1973-1978, 2007.
- [153] U. Scheuermann and R. Schmidt, "Investigations on the VCE(T)-Method to Determine the Junction Temperature by Using the Chip Itself as Sensor," presented at the Intelligent Motion and Power Quality (PCIM) Europe, Nuremberg, Germany, 2009.
- [154] S. Rael, C. Schaeffer, and R. Perret, "Electrothermal characterization of IGBT," in *Proceedings of the 29th IAS Annual Meeting. Part 3 (of 3), October 2, 1994 - October 5, 1994*, Denver, CO, USA, 1994, pp. 1269-1276.
- [155] J. W. Sofia, "Component Thermal Characterization: Transient to Steady State," Analysis Tech Inc., 2011.
- [156] F. F. Oettinger and D. L. Blackburn, *Thermal resistance measurements*: NIST Special Publication, 1990.
- [157] S. M. Sze, *Physics of Semiconductor Devices*, 2nd ed.: John Wiley and Sons, 1981.
- [158] X. Perpina, J. F. Serviere, J. Saiz, D. Barlini, M. Mermet-Guyennet, and J. Millan, "Temperature measurement on series resistance and devices in power packs based on on-state voltage drop monitoring at high current," *Microelectronics Reliability*, vol. 46, pp. 1834-1839, 2006.
- [159] A. J. Forsyth, S. Y. Yang, P. A. Mawby, and P. Igic, "Measurement and modelling of power electronic devices at cryogenic temperatures," *IEE Proceedings: Circuits, Devices and Systems*, vol. 153, pp. 407-415, 2006.

- [160] D. L. Blackburn and D. W. Berning, "POWER MOSFET TEMPERATURE MEASUREMENTS," in PESC '82 Record, 13th Annual IEEE Power Electronics Specialists Conference, Cambridge, MA, USA, 1982, pp. 400-407.
- [161] H. Chen, V. Pickert, D. J. Atkinson, and L. S. Pritchard, "On-line monitoring of the MOSFET device junction temperature by computation of the threshold voltage," in *Power Electronics, Machines and Drives, 2006. PEMD 2006. The 3rd IET International Conference on*, 2006, pp. 440-444.
- [162] B. J. Baliga, *Fundamentals of Power Semiconductor Devices*. New York: Springer-Verlag, 2008.
- [163] Z. Jakopovic, Z. Bencic, and F. Kolonic, "Important properties of transient thermal impedance for MOS-gated power semiconductors," in *Proceedings of the 1999 IEEE International Symposium on Industrial Electronics (ISIE'99), July 12, 1999 - July 16, 1999*, Bled, Slovenia, 1999, pp. 574-578.
- [164] A. Ammous, B. Allard, and H. Morel, "Transient temperature measurements and modeling of IGBT's under short circuit," *IEEE Transactions on Power Electronics*, vol. 13, pp. 12-25, 1998.
- [165] D. Bergogne, B. Allard, and H. Morel, "An estimation method of the channel temperature of power MOS devices," in *Power Electronics Specialists Conference, 2000. PESC 00. 2000 IEEE 31st Annual*, 2000, pp. 1594-1599 vol.3.
- [166] A. R. Hefner Jr and D. M. Diebolt, "An experimentally verified IGBT model implemented in the Saber circuit simulator," in 22nd Annual IEEE Power Electronics Specialists Conference - PESC '91, June 24, 1991 - June 27, 1991, Boston, MA, USA, 1991, pp. 10-19.
- [167] D. L. Blackburn, "A REVIEW OF THERMAL CHARACTERIZATION OF POWER TRANSISTORS," in Fourth Annual IEEE Semiconductor Thermal and Temperature Measurement Symposium, Proceedings 1988., San Diego, CA, USA, 1988, pp. 1-7.
- [168] D. Berning, J. Reichl, A. Hefner, M. Hernandez, C. Ellenwood, and J. S. Lai, "High Speed IGBT Module Transient Thermal Response Measurements for Model Validation," in 2003 IEEE Industry Applications Conference; 38th Annual Meeting: Crossroads To Innovation, October 12, 2003 - October 16, 2003, Salt Lake City, UT, United States, 2003, pp. 1826-1832.
- [169] "Thermal Impedance Measurement for Insulated Gate Bipolar Transistors," JEDEC Solid State Technology Association, 2002.
- [170] A. Bryant, Y. Shaoyong, P. Mawby, X. Dawei, R. Li, P. Tavner, and P. Palmer, "Investigation Into IGBT dV/dt During Turn-Off and Its Temperature Dependence," *Power Electronics, IEEE Transactions on*, vol. 26, pp. 3019-3031, 2011.
- [171] D. Barlini, M. Ciappa, M. Mermet-Guyennet, and W. Fichtner, "Measurement of the transient junction temperature in MOSFET devices under operating conditions," *Microelectronics Reliability*, vol. 47, pp. 1707-1712.
- [172] H. Kuhn and A. Mertens, "On-line junction temperature measurement of IGBTs based on temperature sensitive electrical parameters," in 2009 13th European Conference on Power Electronics and Applications, EPE '09, September 8, 2009 - September 10, 2009, Barcelona, Spain, 2009.
- [173] W. Bo, T. Yong, and C. Ming, "Experimental Study on Voltage Breakdown Characteristic of IGBT," in *Power and Energy Engineering Conference (APPEEC), 2011 Asia-Pacific*, 2011, pp. 1-4.
- [174] (Dec, 2011). Agilent 4155C Semiconductor Parameter Analyzer Datasheet. Available: http://cp.literature.agilent.com/litweb/pdf/5988-9238EN.pdf
- [175] F. Ciancetta, A. Ometto, and N. Rotondale, "Analysis of PEM fuel cell Supercapacitor – Battery pack system during standard cycle," in *Power Electronics Electrical Drives Automation and Motion (SPEEDAM), 2010 International Symposium on*, 2010, pp. 1286-1290.

- [176] H. D. Baehr and K. Stephan, *Heat and Mass Transfer*, 2nd ed. New York: Springer-Verlag, 2003.
- [177] T. Wu and A. Castellazzi, "Temperature adaptive IGBT gate-driver design," in *Power Electronics and Applications (EPE 2011), Proceedings of the 2011-14th European Conference on*, 2011, pp. 1-6.
- [178] D. Xiang, L. Ran, P. Tavner, A. Bryant, S. Yang, and P. Mawby, "Monitoring Solder Fatigue in a Power Module Using Case-Above-Ambient Temperature Rise," *Industry Applications, IEEE Transactions on*, vol. 47, pp. 2578-2591, 2011.
- [179] (Dec, 2011). *Gap Pad® 1500 Datasheet*. Available: http://www.bergquistcompany.com/pdfs/dataSheets/PDS\_GP\_1500\_12.08\_E.pdf
- [180] (Dec, 2011). Gap Pad® 3000S30 Datasheet. Available: http://orionind.com/catalog/datasheets/PDS\_GP\_3000S30\_12.08\_E.pdf
- [181] X. Cao, K. D. T. Ngo, and G.-Q. Lu, "Thermal design of power module to minimize peak transient temperature," in 2009 International Conference on Electronic Packaging Technology and High Density Packaging, ICEPT-HDP 2009, August 10, 2009 - August 13, 2009, Beijing, China, 2009, pp. 248-254.
- [182] *Automotive relay DG85C Datasheet*. Available: http://www.aecsensors.com/DG85.pdf
- [183] "Investigation into the Scope for the Transport Sector to Switch to Electric Vehicles and Plug-in Hybrid Vehicles," BERR & DfT, 2008.
- [184] (Dec, 2011). Sound Velocity Chart. Available: http://www.phase2plus.com/sound\_velocity\_chart.htm