It’s always a pleasure to receive an article from Muhammad, one of the senior members of this Electrical Engineering Community Blog. Let him and his 20 years of experience tell you about faulty systems, and read his tips on how to detect them.
If, just like him, you want to send us quality articles, we’ll be glad to publish them on the blog! It’s easy: just send us an email so that we can discuss it. For now, enjoy this great article.
Faulty System: Engineers are detectives
An engineer is always a bit of a detective when he or she is troubleshooting a faulty system. Different faults can have different levels of complexity. How difficult is a fault to trace? That depends on the qualifications and expertise of the engineer who is trying to diagnose and repair it, and on the nature of the fault.
When I was a young engineer, I often wondered how some (some!) seniors were able to name the source of a fault just as we started to report it to them. With decades of experience behind me, that is now quite understandable, and I am sure younger people must be having similar thoughts about their seniors today.
Faulty System: The Elusive Fault
Just as some elusive criminals are sometimes able to play hide and seek with police detectives, so do some electrical faults with electrical engineers.
Such faults can defy experienced and inexperienced engineers alike. These are the intermittent faults. They come and go. For example, a system shuts down automatically, or reports a parameter out of tolerance. When you restart the system, it continues to work fine. What can you do except wait for the next occurrence?
Faulty System: The Bad Ones
Occasionally there are some faults which need you to be something of a Sherlock Holmes. Prominent among these are faults which are intermittent. Intermittent faults are the biggest headache for a repair engineer. They come and go, playing hide and seek with the engineer. All your expertise and all your test equipment will be useless when the fault temporarily disappears. They are rather less of a problem if the offending assembly or system can be replaced.
But what if a replacement is not available? Worse still, what if the fault is in the backbone system itself? The engineer is helpless and loses face in front of the operations staff unless he or she is lucky enough to trace and rectify the fault.
Ignoring Them Can Be Dangerous!
Losing face is one thing. But some faults can be genuinely dangerous. How dangerous can a fault in your system be? If it is a power transmission or distribution service, the results of a fault can be very disruptive. And if it happens to be an intermittent fault, it can be really disastrous. It can cause arcing, repeated interruptions, revenue loss, fire, damage to equipment, and damage to customer property, in addition to loss of reputation.
Many household fires in the developing world, which routinely claim lives and damage property, are attributed to a ‘short circuit’ (actually, faulty wiring). As a massive example, a chemicals factory in Karachi, Pakistan, caught fire at night, resulting in the loss of over 250 lives and total destruction of the factory! Officially, the blame was placed on an electrical ‘short-circuit’. Intermittent faults must be eliminated as soon as possible. Unfortunately, you cannot catch a ghost unless it appears.
Intermittent Faults in Communication and Control
Intermittent faults in communications are not as serious, but can be very distracting. Imagine you are on the internet looking for something very important and the net starts playing hide-and-seek with you. How would you feel? Bugged? But no physical damage.
However, if you are remotely monitoring the security of your house, or the operating parameters of a nuclear power station, an intermittent net is the last thing you want to have.
Catching the Elusive Thief
Catching an elusive fault early depends a good deal on your good deeds. Modern systems monitor and report key parameters continuously, so a playback can lead you close to the subassembly causing the fault. You replace the assembly, and if it is not a plug-and-socket problem, you are OK. Then woe to the guy in the workshop! If monitoring is absent or insufficient, it is wise to try to recreate the fault, but only AFTER taking precautions. However, never try to recreate catastrophic occurrences on load.
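To make the idea of a "playback" concrete, here is a minimal sketch in Python. It assumes your monitoring system can export its log as time-stamped samples per channel; the `Sample` record, the channel names, and the tolerance limits are all hypothetical and only serve the illustration. The function walks the recording and reports each period during which a channel was out of tolerance.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    t: float        # seconds since the start of the recording
    channel: str    # hypothetical channel name, e.g. "bus_voltage"
    value: float


def find_excursions(samples, limits):
    """Return (channel, start_t, end_t) for each run of out-of-tolerance samples.

    `limits` maps a channel name to its (low, high) tolerance band.
    """
    open_runs = {}    # channel -> [start_t, last_bad_t] for runs still in progress
    excursions = []
    for s in sorted(samples, key=lambda s: s.t):
        low, high = limits[s.channel]
        if not (low <= s.value <= high):
            if s.channel in open_runs:
                open_runs[s.channel][1] = s.t      # the run continues
            else:
                open_runs[s.channel] = [s.t, s.t]  # a new run starts
        elif s.channel in open_runs:
            start, end = open_runs.pop(s.channel)  # back in tolerance: close the run
            excursions.append((s.channel, start, end))
    # Runs that were still out of tolerance when the recording ended.
    for channel, (start, end) in open_runs.items():
        excursions.append((channel, start, end))
    return sorted(excursions, key=lambda e: e[1])


# Made-up recording: the voltage dips for two samples, then the fan slows down.
log = [
    Sample(0.0, "bus_voltage", 230.0), Sample(0.0, "fan_rpm", 2000.0),
    Sample(1.0, "bus_voltage", 180.0), Sample(1.0, "fan_rpm", 2000.0),
    Sample(2.0, "bus_voltage", 175.0), Sample(2.0, "fan_rpm", 2000.0),
    Sample(3.0, "bus_voltage", 229.0), Sample(3.0, "fan_rpm", 500.0),
]
limits = {"bus_voltage": (215.0, 245.0), "fan_rpm": (1000.0, 3000.0)}
for channel, start, end in find_excursions(log, limits):
    print(f"{channel}: out of tolerance from t={start} to t={end}")
```

The channel that leaves tolerance first is often the best place to start looking, although it is a lead for the detective, not a verdict.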
Caution! By the time you and your staff realize that the fault is intermittent in nature, you have already gone through some diagnostic effort and, of course, found no fault. Note that there is no guarantee that the areas you investigated so far are OK. That is, unless you did that while the fault existed.
Your task is difficult! But here are some comments which may help.
- Intelligence- The first thing you need is intelligence. No! No! Don’t be offended in a hurry. In military and detective terminology, intelligence means gathering information related to the case or the enemy. If your system has, as many modern systems do, a monitoring and recording set-up, data recorded just prior to the failure may indicate the possible sources of the fault.
- Know Thy System- The most helpful thing in fault finding is a detailed knowledge of your system, its architecture, and its operation. This knowledge will help you make intelligent guesses as to possible faults.
- Past Fault History of Your System- Past history will point to probabilities. Combined with the nature of the fault, this data will help you take an intelligent approach to tracing it.
- General Trade Knowledge- To catch a ghost, you need to learn all you can about ghosts: their habits, hideouts, and modes of operation. Similarly, we should think of the most common causes of intermittent faults.
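The point that past history "will point to probabilities" can be sketched in a few lines of Python. This assumes you keep repair records as simple (symptom, root cause) pairs; the record format, the symptom names, and the causes below are invented for illustration. The function counts how often each root cause was found behind a given symptom and turns the counts into rough relative frequencies.

```python
from collections import Counter


def rank_suspects(history, symptom):
    """Rank past root causes of `symptom`, most frequent first.

    `history` is a list of (symptom, root_cause) pairs from past repair
    records. Returns a list of (root_cause, share_of_past_cases).
    """
    causes = Counter(cause for s, cause in history if s == symptom)
    total = sum(causes.values())
    return [(cause, count / total) for cause, count in causes.most_common()]


# Invented repair log, for illustration only.
history = [
    ("random shutdown", "connector J4"),
    ("random shutdown", "connector J4"),
    ("random shutdown", "cracked solder joint"),
    ("parameter drift", "sensor cable"),
    ("random shutdown", "connector J4"),
]
for cause, share in rank_suspects(history, "random shutdown"):
    print(f"{cause}: {share:.0%} of past cases")
```

A ranking like this only tells you where to look first. With a short history the shares are crude, and a fault that has never occurred before will not appear in the list at all.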
Where Do They Occur?
Intermittent faults normally occur at:
- Mating connectors
- Hidden breaks in cables, e.g., data and control cables
- Bad solder joints on PCBs
- Broken tracks on a PCB
What Causes an Intermittent Fault?
Some of the operating conditions which promote intermittent faults are:
- Wear- Mechanical things wear out when subjected to relative movement, e.g., plug-socket arrangements where equipment is frequently plugged and unplugged. This will also happen in slide switches.
- Vibration- Rotating machinery will generate vibrations which will tend to loosen contacts, and even cause internal breaks in wires. Vibrations are particularly severe in aviation, nautical, and road transport equipment.
- Environment- Ingress of dust particles will cause severe problems in switches and relays. Vapor ingress will cause rust, and the resulting uneven contact surfaces will cause similar problems.
- Temperature Cycling- Outdoor equipment, if not air-conditioned, will go through temperature cycles, which cause problems with the passage of time. This applies particularly to soldered components.
- The Technician- Untrained, irresponsible, overworked, or disgruntled technicians will do improper jobs, leaving connections and fittings loose.
- Bad Luck- The effects of all the above factors are probabilistic, i.e., they may occur randomly. If two or more factors strike together, that would be bad luck. I want to warn people of a special human tendency. When looking for a fault, most engineers presume that there is only one fault in the system under investigation. They trace one fault and start rejoicing, only to realize soon afterwards that the fault still exists or has been only partially removed. It would be really bad luck if one or both of the simultaneous faults were intermittent.
Note: Among technicians in some underdeveloped countries, a ‘short’ can mean anything from a loose connection to incorrect connections to an actual short-circuit.
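How unlucky is "bad luck"? As a rough worked example, suppose each factor above is active at a given moment with some small probability, and suppose (a strong simplifying assumption) that the factors are independent of each other. The probabilities used below are made-up numbers, not measurements. The chance that two or more factors strike together is then one minus the chance that none is active, minus the chance that exactly one is active.

```python
from math import prod


def p_two_or_more(p):
    """Probability that at least two of the factors are active at once.

    `p` is a list of per-factor probabilities. The factors are ASSUMED
    to be independent, which real equipment may not respect.
    """
    p_none = prod(1 - x for x in p)
    p_exactly_one = sum(
        x * prod(1 - y for j, y in enumerate(p) if j != i)
        for i, x in enumerate(p)
    )
    return 1 - p_none - p_exactly_one


# Made-up probabilities for wear, vibration, environment, temperature cycling.
factors = [0.05, 0.10, 0.02, 0.03]
print(f"Chance of two or more factors together: {p_two_or_more(factors):.4f}")
```

Even when each factor alone is unlikely, the combined figure is not zero, which is one more reason not to stop looking after the first fault is found.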
Note²: The engineering profession is not alone in this tendency. Most people who have to visit doctors frequently have a similar observation about the doctors. And I am one of them.