Importance of reliability testing

During the new product introduction (NPI) process, it takes an enormous amount of time, energy, resources, and manpower to go from one pre-production build to the next, improving the product’s reliability through testing until the design, BOM, and so on can be frozen and mass production can begin.

A lot of work needs to be done before mass production can start if you want a product with very few quality issues and failures once it hits the market, perhaps more than many importers expect when starting out. In this article, we’ll describe the work required to reach that objective.

 

The 3 pre-production reliability testing phases that commonly occur during NPI

Going from one prototype build to the next, and performing reliability testing to uncover and fix problems each time, is a critical part of the NPI process.

During the NPI process, you’ll often go through EVT, DVT, and PVT*. These are the validation stages that get you increasingly closer to being ready for mass production and help assure good quality and reliable products. They are:

  1. Engineering validation (EVT)
  2. Design validation (DVT)
  3. Production validation (PVT)

As you can see, the three stages validate different aspects of your product and its production, in order, on numerous prototype samples. Once one stage has been completed and accepted, we move on to the next.

*Note: EVT, DVT, and PVT are very commonly followed when producing consumer electronics that aim at production volumes in the hundreds of thousands (or higher).

 

First, in EVT, we validate from an engineering point of view that the product works as intended and meets your expectations for performance, functions, and reliability. The prototypes made at this point also use final production processes and materials, and allow any issues with hardware or software to be ironed out.

 

Then, in DVT, we validate that the product meets your aesthetic expectations and functions as expected in different environments. The emphasis on environmental testing here means that a lot of stress testing takes place, and the product will be tested against relevant compliance requirements, such as RoHS for electronic components. Tests may include drop tests, HALT & HASS testing for product life testing, and more. Again, issues found can be corrected, taking the product one step closer to being ‘ready for production.’

 

Finally, in PVT, attention turns to the production processes to be used in mass production. This usually takes place just as production is starting: the production line performs one or more pilot runs to check its capacity and effectiveness, and to validate that your product can be made in the correct volumes and on time. Issues found may require adjustments to the processes or tooling before mass production begins in earnest. If everything is fine, however, the products made can be considered good for sale and mass production can continue.

 

Why do we go through the rigours of so much testing?

Every build has its own story. From a reliability point of view, we need to ‘expect the unexpected.’ Speed bumps affecting reliability can occur for numerous reasons, commonly mistakes made during design and validation, where staff miss something, or poor cross-communication.

Let’s illustrate what can go wrong with some examples…

During the first prototype build in EVT, all it takes for proper reliability testing to be missed is for the project manager to forget to speak to the reliability testing team, leaving them with too few samples to test. In this situation, thorough testing could be skipped, which leads to issues with the product down the line that may be carried over into mass production. Not good news.

Ideally, before validation and testing begin, the project manager needs to gather the teams together and check that critical questions have been answered. Has anything been forgotten? If anything is flagged at this point, reliability testing does not start. For instance, if someone in purchasing flags that not enough PCBs have been ordered, too few sample prototypes can be made, and the reliability testing would not be thorough enough.

 

Why samples can be problematic for companies developing a new product

A lot of companies that are launching new products, especially hardware startups, have cost constraints. The numerous samples tested during NPI, which could number in the hundreds, all need to be paid for, so the temptation is to reduce the number of samples made for reliability testing to shave costs.

This is a mistake.

It’s a sound investment to maximise the number of samples we make for testing early on, well before mass production starts, to find and fix issues that could be very damaging should they make it into production batches.
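
To see why more samples matter, one standard reliability engineering relation (the ‘success run’ formula, not something specific to our process) links the number of samples that pass with zero failures to the reliability you can claim at a given confidence level. The sketch below is a minimal illustration under that zero-failure assumption; the function name and the example reliability and confidence targets are our own choices for illustration.

```python
import math


def samples_for_zero_failure_demo(reliability: float, confidence: float) -> int:
    """Smallest number of samples n that must all pass, with zero failures,
    to demonstrate `reliability` at the given `confidence`.

    Uses the success-run relation: 1 - reliability**n >= confidence.
    """
    if not (0 < reliability < 1 and 0 < confidence < 1):
        raise ValueError("reliability and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(reliability))


if __name__ == "__main__":
    # Illustrative targets only: 90% confidence at three reliability levels.
    for target in (0.90, 0.95, 0.99):
        n = samples_for_zero_failure_demo(target, confidence=0.90)
        print(f"reliability {target:.0%} at 90% confidence: {n} failure-free samples")
```

At 90% confidence this gives 22, 45, and 230 failure-free samples for 90%, 95%, and 99% reliability respectively, which shows how quickly the required sample count grows as the target rises.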

The number of samples that you test during the validation phases actually increases as we go through builds. Why? Some samples will break and be scrapped, and by testing more we uncover more issues that need to be fixed. The end goal is a product with an almost 0% failure rate by the time we reach PVT.

Let’s say in our very first build during EVT we test samples and uncover the top 5 issues. That’s enough for build one.

In build 2, perhaps around 30 to 40% of the product’s issues have been found, so to find more issues to fix we need to test even more samples than in build one. So if build one was 25 samples, build 2 would be around 40 to 50, and in build 3 this figure would be tripled.
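
The arithmetic in this example can be sketched in a few lines. This is only an illustration of the numbers quoted above: it assumes that ‘this figure’ refers to the build 2 sample count, and the function names are hypothetical.

```python
def sample_plan(build_one=25, build_two=(40, 50), build_three_factor=3):
    """Return (low, high) sample counts per build for the example above.

    Assumption: build 3 triples the build 2 count.
    """
    low, high = build_two
    return {
        "build 1": (build_one, build_one),
        "build 2": (low, high),
        "build 3": (low * build_three_factor, high * build_three_factor),
    }


def total_samples(plan):
    """Sum the low and high ends of every build to get a budget range."""
    lows = sum(low for low, _ in plan.values())
    highs = sum(high for _, high in plan.values())
    return lows, highs


if __name__ == "__main__":
    plan = sample_plan()
    for build, (low, high) in plan.items():
        print(f"{build}: {low} to {high} samples")
    print("total:", total_samples(plan))
```

Under these assumptions, build 3 needs 120 to 150 samples and the three builds together need 185 to 225, which is consistent with sample counts that run into the hundreds.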

It’s clear, then, that numerous prototype builds will be made during each phase, and we discussed how many you’ll need to budget for in this post: How Many Product Samples Are Required For Reliability & Compliance Testing?

By testing so many sample prototypes, we find and fix problems through the validation phases, reducing them to the point where there are almost no possible failures in mass-produced products. We can expect approximately 70% failures during EVT, 20 to 30% in DVT, and then around 0% in PVT.
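
As a rough illustration, a build’s observed failure rate can be compared against these approximate figures. The sketch below reads the percentages as the share of tested samples that fail in each phase, which is our interpretation. The 1% allowance for PVT is an assumption standing in for ‘around 0%’, and all names are hypothetical.

```python
# Rough upper bounds on the share of tested samples that fail in each phase.
# EVT and DVT follow the approximate percentages quoted above; the 1% allowance
# for PVT is an assumption standing in for "around 0%".
EXPECTED_MAX_FAILURE_RATE = {"EVT": 0.70, "DVT": 0.30, "PVT": 0.01}


def failure_rate(tested: int, failed: int) -> float:
    """Share of tested samples that failed."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    if not 0 <= failed <= tested:
        raise ValueError("failed must be between 0 and tested")
    return failed / tested


def within_expectation(phase: str, tested: int, failed: int) -> bool:
    """True if the observed failure rate is at or below the rough bound for the phase."""
    return failure_rate(tested, failed) <= EXPECTED_MAX_FAILURE_RATE[phase]


if __name__ == "__main__":
    # Made-up build results, for illustration only.
    for phase, tested, failed in [("EVT", 25, 17), ("DVT", 50, 20), ("PVT", 150, 0)]:
        rate = failure_rate(tested, failed)
        status = "as expected" if within_expectation(phase, tested, failed) else "needs attention"
        print(f"{phase}: {rate:.0%} failures ({status})")
```

A build that exceeds the rough bound for its phase is a prompt to investigate before moving on, not a hard pass or fail rule.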

At this point the design and BOM are frozen and we’re ready for mass production.

 

What issues can occur in production after EVT, DVT, and PVT?

Hopefully, any failures found up to PVT won’t occur in the field and be found by consumers, because they’ve already been identified and fixed in a controlled environment that mimics real usage conditions during DVT. One example is a drop test of a mobile phone onto its corner: it will be rare in real life, but it is tested anyway as a potential failure mode.

In a PVT environment, the goal is to obtain ‘final hardware’ which has gone through proper reliability testing as just described to assure that mass-produced products won’t have reliability issues in the field.

If ‘Design For Reliability (DFR)’ has been followed, including a reliability growth plan, design development and testing, then the only remaining issues that can occur will be found during production.

These will be caused by the following:

  1. Production assembly (mistakes are made by operators)
  2. Production testing (bad items are passed as OK)
  3. QA during production
  4. Component quality

The first three issues can be caught during PVT and dealt with internally. Poor component quality, however, is something to be wary of, as components often come from external sub-suppliers.

 

Controlling component quality to assure reliable products reach consumers

Let’s assume that we’ve performed thorough reliability testing throughout EVT, DVT, and PVT and we’re now in mass production. 

Products are in use in the field and we start receiving reports of failures.

After an internal investigation, we confirm that production assembly, testing, and QA don’t seem to have issues, so perhaps the issue lies with components. However, the BOM has been frozen, so components should be sound, having been tested during the validation phases (which necessarily use production processes and components).

A Bill Of Materials will include two or three reserve component suppliers, all of whom provide components that will have been tested during the NPI validation phases and are guaranteed to be reliable. These second or third source suppliers provide redundancy to guard against supply disruption should our main supplier have supply issues. Deviating from these options, however, will render all reliability testing pointless and we’re then putting products onto the market that are untested, unreliable, and, potentially, unsafe!

How can incorrect components end up in products?

One potential cause of the failures here is that an overzealous buyer has found a cheaper version of ‘the same component’ and approved it, and the cheaper parts have subsequently found their way into production. These components are not in the frozen BOM. Maybe the person mistakenly approved the components and has since left, so the other departments assumed the parts were still from the BOM.

Fortunately, with our organization at Agilian, this is extremely unlikely to happen.

Another realistic scenario is when an outsourced Chinese supplier swaps out our approved component for a cheaper non-approved one in return for a kickback. In this way, they’ll make more margin on your order and potentially receive a lot of money from the component supplier (imagine if they purchase millions of a particular part, the kickback could be large).

How to prevent unapproved components from being used?

Here’s how to prevent rogue components from making it into production:

  • Freeze the BOM as soon as you go into production and do not adjust it except in emergencies.
  • Assure that there are several fully tested and approved component sources.
  • Keep control over the BOM and check that components are from an approved supplier every time a shipment is received.
  • If working with an outsourced supplier, audit and inspect them regularly and keep pressure on them to deter them from cutting corners or behaving badly.
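
The incoming shipment check in the list above can be sketched as a simple lookup against the frozen BOM. This is a minimal illustration only: the part numbers, supplier names, and data structures are hypothetical, and a real check would draw on your own BOM and receiving records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReceivedLot:
    part_number: str
    supplier: str


# Hypothetical frozen BOM: each part number maps to its approved sources.
APPROVED_SOURCES = {
    "PCB-MAIN-01": {"Supplier A", "Supplier B"},
    "BATTERY-3V7": {"Supplier C", "Supplier D", "Supplier E"},
}


def unapproved_lots(lots, approved=APPROVED_SOURCES):
    """Return lots whose supplier is not an approved source for that part.

    Parts that are missing from the BOM entirely are also returned.
    """
    return [
        lot for lot in lots
        if lot.supplier not in approved.get(lot.part_number, set())
    ]


if __name__ == "__main__":
    shipment = [
        ReceivedLot("PCB-MAIN-01", "Supplier A"),
        ReceivedLot("BATTERY-3V7", "Supplier X"),  # not an approved source
    ]
    for lot in unapproved_lots(shipment):
        print(f"HOLD: {lot.part_number} from {lot.supplier} is not in the frozen BOM")
```

Any lot flagged this way would be held until someone confirms where it came from, so that an unapproved part cannot quietly enter production.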

 

Conclusion – how to assure we get ‘great, reliable products’

Even if everything is done perfectly during the NPI process, it only takes forgetting a couple of key points to completely upset the applecart and end up with defective products hitting the market.

For example, too few samples may be reliability tested to find and fix all possible problems, second or third source component suppliers may not be sourced, tested, and approved, or an issue in production may be missed internally.

Although it’s a lot of work, a comprehensive EVT, DVT, and PVT process, frozen BOM and product design, and a close eye on production will go a long way to assuring your customers end up with reliable and good quality products. These elements together can be described as a ‘reliability growth program.’

Hardware startups in particular will benefit from not underfunding the reliability testing element of NPI, buying approved components in bulk, and performing boring and repetitive mass production (the hallmarks of good mass production).

 

And remember, if you are unsure about how to get your new product idea through validation and into mass production, we can help you. Just contact us to discuss your project and we’ll always try to offer some advice where possible.

About Andrew Amirnovin

Andrew Amirnovin is an electrical and electronics engineer and an ASQ-Certified Reliability Engineer. He is our customers’ go-to resource when it comes to building reliability into the products we help develop. He honed his craft over the decades at some of the world’s largest electronics companies. At Agilian, he works closely with customers and helps structure our processes.