Hole burned through CoolMOS P7 tab when it fails


Brewster
Level 1

I'm working on an audio amplifier system where the AC/DC supply uses IPP60R060P7 devices (in a pair to drive a transformer). We see a high failure rate from this device in a consistent pattern: a hole burned through the tab in the same spot each time. The underlying insulator and heatsink don't appear to be the problem (i.e. it's not a short from the tab to the heatsink).

Brewster_0-1666288293365.png

 

The supplies fail within a few weeks of being put into use. It may be that some other system failure causes this part and the gate driver to fail; we are not yet sure what is happening. I'm posting this on the off chance that someone has seen these parts do this and knows what the cause might be. This is Q5 (low side) in the circuit. The HV supply is 400 V, and the supply design supports a load profile of 2 kW peak and 1 kW continuous. The failures all happen at light loads (<100 W), probably when the controller is pulse skipping (we have not been able to make a failure happen on the bench).

Brewster_1-1666288596129.png

I'm not the supply designer but this has production stopped so we're all looking at this.

Thanks,

Brewster

Meghana
Moderator

Hello @Brewster ,

Can you please share some more information so we can analyze this issue further?

1. Which switch in the half-bridge is failing, and is this consistent across all the failed boards?

2. How many devices have failed so far?

3. Is thermal measurement data available for the failed devices?

4. Please share the Vgs, Vds & Id waveforms through both the switches in the half bridge. 

Can you also please share the complete schematic and layout of the device? It will help our analysis.

Regards

Meghana

 

 


Hi @Meghana ,

Thank you for the reply. Somehow I must have bumped the "solution" button, and I cannot find a way to undo it.

The problem is definitely not solved.

I can't share the design publicly but can certainly make it available to Infineon. Being new to the forums, I wasn't sure whether everything posted is public. Let me know the best way to get the design to you (schematics and Gerbers).

> 1. Which switch in the half-bridge is failing, and is this consistent across all the failed boards?

It is the lower switch (Q5 in the schematic section posted).  We have not checked every failure in detail, but we are pretty sure it is always Q5.  We also believe the gate driver U5 always fails too.

> 2. How many devices have failed so far?

Around 10 out of 50 systems.  It's a new product and has not gone out to a lot of customers yet.

> 3. Is thermal measurement data available for the failed devices?

The AC/DC supply does measure temperature on the heatsinks in two different places, but only an "overtemp" signal goes from the board to the controller. We do not believe there is any long-term heating, as this is used in a benign environment. We do monitor the ambient temperature inside the box, and it settles at around 35 °C.

We do not have short term measurement data yet from an instrumented ACDC supply as so far we have not had one fail on the bench.

> 4. Please share the Vgs, Vds & Id waveforms through both the switches in the half bridge.

The designer is in the process of capturing those and will post as soon as available.

> Please share the operating conditions

In some failure cases (in the EU) the box was powered from 230 V/50 Hz, and in others (in the US) from 120 V/60 Hz.

It has happened after the system has been on for several hours.

Based on customer information we estimate RMS load on the supply was <100W and peak load < 400W.  In at least two failures the box was idle (i.e. no audio) or around 50W load.

The LLC does start pulse skipping at low power, and that is one of the things we are investigating: some sort of pathological switching between normal operation and skipping. A Microchip dsPIC manages the primary side and a second one manages the secondary.

Thanks,

Brewster

Meghana
Moderator

Hello @Brewster,

I've reverted the thread status to "Unsolved". And yes, all the data shared here will be public.

As @Len_CONSULTRON mentioned, a failure like this requires a very high junction temperature. During light-load operation, especially at high switching frequency, ZVS can be lost: the rail-to-rail transition does not complete within the dead time, which can result in shoot-through and, in turn, this kind of temperature rise. Kindly provide the measurements so we can assist further on this topic.
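
To make the dead-time argument concrete, here is a minimal sketch of the usual first-order ZVS check for a half-bridge LLC running near resonance. It assumes the magnetizing current is triangular with a peak of Vbus / (8 * Lm * fsw), and that this current stays roughly constant while it charges and discharges the two output capacitances. Only the 400 V bus comes from this thread; Lm, fsw, the time-related Coss, and the dead time are hypothetical placeholders, not values from this design or from any datasheet.

```python
# First-order ZVS check for a half-bridge LLC near resonance.
# Only V_BUS comes from the thread; every other number is a hypothetical placeholder.

V_BUS = 400.0     # V, HV bus stated in the thread
L_M = 500e-6      # H, magnetizing inductance (assumed)
F_SW = 100e3      # Hz, switching frequency (assumed)
C_OSS_TR = 1e-9   # F, time-related output capacitance per switch (assumed)
T_DEAD = 300e-9   # s, programmed dead time (assumed)


def magnetizing_peak_current(v_bus, l_m, f_sw):
    """Peak magnetizing current: Vbus/2 applied across Lm for a quarter period."""
    return v_bus / (8.0 * l_m * f_sw)


def min_dead_time(v_bus, c_oss_tr, i_m_pk):
    """Time for a constant current i_m_pk to move the charge 2*Coss*Vbus."""
    return 2.0 * c_oss_tr * v_bus / i_m_pk


def zvs_ok(v_bus, l_m, f_sw, c_oss_tr, t_dead):
    """True if the switch node can finish its swing within the dead time."""
    i_m_pk = magnetizing_peak_current(v_bus, l_m, f_sw)
    return t_dead >= min_dead_time(v_bus, c_oss_tr, i_m_pk)


if __name__ == "__main__":
    i_pk = magnetizing_peak_current(V_BUS, L_M, F_SW)
    t_min = min_dead_time(V_BUS, C_OSS_TR, i_pk)
    print(f"peak magnetizing current: {i_pk:.2f} A")
    print(f"minimum dead time for ZVS: {t_min * 1e9:.0f} ns")
    print(f"ZVS achieved: {zvs_ok(V_BUS, L_M, F_SW, C_OSS_TR, T_DEAD)}")
```

With these placeholder numbers the required dead time exceeds the programmed one, so the switch node would not finish its swing and the incoming FET would turn on hard. The check does not cover pulse skipping, where the first pulse after a pause may start with little or no magnetizing current; that case needs the measured waveforms.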

Regards

Meghana


Hi @Meghana ,

I owe everyone a mea culpa.

One of the downsides of being part of a team that's in 5 different physical locations is the difficulty of getting the same details to everyone. I had not taken apart the supplies I had.

In this case, the missing detail was which part was actually used on the power supply. While the prototypes and preproduction units used the Infineon parts, the contract manufacturer saw that they were not available (and they remain unavailable except on the grey market, which is an unacceptable risk on this project). The designer found and validated a part from another vendor, and the CM purchased and used those.

We believe all of the failed units are using the non-Infineon parts. A comparison of the circuit operation with the two parts has not shown a meaningful performance difference that might lead to failure. The correlation might be coincidental, as there are more non-Infineon production units in use than (older/prototype) Infineon units.

I'm not sure how to flag the status of this post. It's certainly an interesting failure mode regardless of vendor's parts. I'll post some sort of "this was it" when we get there on the off chance someone else finds themselves in this situation.

I certainly appreciate the thoughtful responses from everyone.

Regards,

Brewster

-------------------------

Current status:

More failed supplies have now been taken apart, and we're seeing that both MOSFETs are actually blown. Some have failed more traditionally, with the case rupturing and the smoke getting out that way. Replacing the MOSFETs and the gate driver restores operation of the AC/DC supply.


Brewster,

I can sympathize with you. It is difficult to get to the root cause of an issue if not all the correct information is available to you. If you had a picture of the front of the IC with the "crater" on the heat tab, you might have been able to check the IC markings against the Infineon markings.

Normally when I've dealt with remote teams, a failure requires a full report and sending the failed unit to us at design engineering.   This allows us to do a proper deep dive.

Also, in our organization, ANY ATTEMPT to source a non-specified part MUST be brought to the design team for their analysis and blessing. Usually, if a change is needed, the design team chooses a replacement part that EXCEEDS the intended requirements. Using a replacement part that underperforms is inviting disaster.

Note: Getting ICs from the 'gray' or 'black' market is a disaster waiting to happen. It is commonly known that such parts are often not to spec. It is not unusual for someone to "dumpster-dive" for failed parts from IC manufacturers. This is why labelling of the ICs usually occurs AFTER the part is tested and meets spec. It is also not unusual for people to rebrand 'good-ish' parts as similar parts with better and more desirable specifications than the part was designed to meet.

Len
"Engineering is an Art. The Art of Compromise."
Len_CONSULTRON
Level 9

Brewster,

Interesting failure. I've worked with power FETs driving 500 W to 1200 W loads.

For the metal to melt as shown, the temperature needs to exceed 1000 °C for a prolonged time. Given how thick the tab is, the time required gets even longer.

 

Is there a silicone pad insulator between the FET and the heatsink? I noticed that HVBUS is on the tab and the heatsink (ra-t2x-64e) is grounded to HVGND. According to you, HVBUS is 400 V.

If not, you might be getting some significant "punch-through" event from the Tab potential to the heatsink.

You might also need a sil-pad rated for 500V+.

Normal silicon FETs start failing when the junction gets to be about 170C.  They usually fail shorted.  However if they are shorted with significant power applied for a relatively short period of time, they then disintegrate causing them to fail in the "open" state.

For the metal to melt at 1000 °C+, the junction would need to reach that temperature. This is highly unlikely, since it would be a catastrophic failure causing the silicon to fail in the open state within a few milliseconds.

Suggestion for future consideration:

I realize you are trying to determine if there is a silicon-design issue or at worst a user-design implementation issue.  Hopefully Infineon can help with that.

You might want to consider placing an in-line fuse or circuit breaker on HVBUS. Proper selection of circuit protection might prevent IC damage.

You indicated that the failures seem to appear under light loads. I wonder if you're getting inductive kickback from the coupling transformer.

Len
"Engineering is an Art. The Art of Compromise."

Hi Len,

The thermal pad is rated for 9 kV. It does not look like a short develops there, i.e. the heatsink metal seems OK. I have seen a tab connect to the heatsink (ground) before, and it is usually a pretty destructive event. I would even expect one of the device leads to melt like a fuse link in that scenario.

I agree with your "fail open" comment: regardless of what shorted, a lot of current flows, the short vaporizes, and things are left open-circuited. Sometimes the energy source runs out of juice before the short totally vaporizes, and the device is left in a shorted state.

So far the only thing I can think of for a hole like this is that, after the device fails shorted and starts melting and going open circuit, an arc forms inside the package. That would produce high temperatures and would probably be high enough impedance that the HV supply (which has 1600 µF of capacitance) can sustain it for a while even with the AC mains fuses blown. The supply is actually designed to have good hold-up so it can shut down the amp cleanly (no clicks or pops) in a normal power-fail case.
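
A rough energy comparison suggests the bulk capacitance alone could supply enough energy to melt a hole of this kind. The sketch below uses the 1600 µF and 400 V figures from this thread; the hole diameter, the tab thickness, and the assumption that the tab is plain copper (with approximate textbook property values) are illustrative guesses, and it ignores how much of the energy actually ends up in the tab.

```python
import math

# Figures stated in the thread
C_BULK = 1600e-6   # F, HV bus capacitance
V_BUS = 400.0      # V, HV bus voltage

# Approximate textbook properties of copper (assumes the tab is plain copper)
RHO_CU = 8960.0        # kg/m^3, density
CP_CU = 385.0          # J/(kg*K), specific heat near room temperature
T_MELT_CU = 1085.0     # degC, melting point
L_FUSION_CU = 205e3    # J/kg, latent heat of fusion


def stored_energy(c_farads, v_volts):
    """Energy held in a capacitor: E = C * V^2 / 2."""
    return 0.5 * c_farads * v_volts ** 2


def melt_energy(hole_diameter_mm, tab_thickness_mm, t_start_c=25.0):
    """Energy to heat and melt a cylindrical plug of copper, with no losses."""
    radius_m = hole_diameter_mm * 1e-3 / 2.0
    volume_m3 = math.pi * radius_m ** 2 * tab_thickness_mm * 1e-3
    mass_kg = RHO_CU * volume_m3
    heating = CP_CU * (T_MELT_CU - t_start_c)   # J/kg to reach the melting point
    return mass_kg * (heating + L_FUSION_CU)


if __name__ == "__main__":
    e_cap = stored_energy(C_BULK, V_BUS)
    # Hole size and tab thickness are guesses, not measurements from the photo.
    e_melt = melt_energy(hole_diameter_mm=2.0, tab_thickness_mm=1.3)
    print(f"energy stored on the bus: {e_cap:.0f} J")
    print(f"energy to melt the plug:  {e_melt:.1f} J")
```

With these assumed dimensions the capacitor energy (128 J) is several times the melting energy, so the arc explanation is at least not ruled out on energy grounds.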

On the light loads: yes, that seems to be the best starting point, as it's possibly going into or out of pulse-skipping mode at those power levels.

We're still early on with this, but the hole pattern was so unique I figured a post about it might generate an "oh, I've seen that" response. And if not, well, now others will know it's possible...

Brewster

Meghana
Moderator

Hello @Brewster ,

It's unfortunate that you are facing these failures. But since it's a different vendor's product, we will not be able to offer any support here. Nevertheless, please feel free to reach out to us if you have any questions related to Infineon components.

Regards

Meghana
